Effortless File Parsing in NestJS: Manage CSV and XLSX Uploads in Memory for Speed, Security, and Scalability
Handling file uploads in a web application is a common task, but dealing with different file types and ensuring they are processed correctly can be challenging. Often, developers need to parse uploaded files without saving them to the server, which is especially important for reducing server storage costs and ensuring that sensitive data is not unnecessarily retained. In this article, we’ll walk through the process of creating a custom NestJS module to handle file uploads specifically for CSV and XLS/XLSX files, and we’ll parse these files in memory using Node.js streams, so no static files are created on the server.
NestJS is a progressive Node.js framework that leverages TypeScript and provides an out-of-the-box application architecture that enables you to build highly testable, scalable, loosely coupled, and easily maintainable applications. By using NestJS, we can take advantage of its modular structure, powerful dependency injection system, and extensive ecosystem.
Before we dive into the code, let’s set up a new NestJS project. If you haven’t already, install the NestJS CLI:
npm install -g @nestjs/cli
Create a new NestJS project:
nest new your-super-name
Navigate into the project directory:
cd your-super-name
We’ll need to install some additional packages to handle file uploads and parsing:
npm install @nestjs/platform-express multer exceljs file-type
To customize the file upload process, we’ll create a custom Multer storage engine. This engine will ensure that only CSV and XLS/XLSX files are accepted, parse them in memory using Node.js streams, and return the parsed data without saving any files to disk.
Create a new file for our engine (the controller later imports it as shared/multer-engines/csv-xlsx/engine.js):
import { PassThrough } from 'stream';
import * as fileType from 'file-type';
import { BadRequestException } from '@nestjs/common';
import { Request } from 'express';
import type { Workbook } from 'exceljs';
import { createParserCsvOrXlsx } from './parser-factory.js';

// CSV types are parsed by the CSV reader; the remaining types go to the XLSX reader.
const CSV_MIME_TYPES = ['text/csv', 'text/comma-separated-values'];
const ALLOWED_MIME_TYPES = [
  ...CSV_MIME_TYPES,
  'application/vnd.ms-excel',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
];

export class CsvOrXlsxMulterEngine {
  private destKey: string;
  private maxFileSize: number;

  constructor(opts: { destKey: string; maxFileSize: number }) {
    this.destKey = opts.destKey;
    this.maxFileSize = opts.maxFileSize;
  }

  async _handleFile(req: Request, file: any, cb: any) {
    try {
      // Content-Length covers the whole request body and may be missing (NaN).
      const contentLength = Number(req.headers['content-length']);
      if (Number.isFinite(contentLength) && contentLength > this.maxFileSize) {
        throw new Error(`Max file size is ${this.maxFileSize} bytes.`);
      }

      // Sniff the type from the leading bytes; fall back to the declared type.
      const fileStream = await fileType.fileTypeStream(file.stream);
      const mime: string = fileStream.fileType?.mime ?? file.mimetype;
      if (!ALLOWED_MIME_TYPES.includes(mime)) {
        throw new BadRequestException('File must be *.csv or *.xlsx');
      }

      const replacementStream = new PassThrough();
      fileStream.pipe(replacementStream);

      const parser = createParserCsvOrXlsx(mime);
      const data = await parser.read(replacementStream);

      // The CSV reader yields a worksheet; the XLSX reader yields a workbook.
      cb(null, {
        [this.destKey]: CSV_MIME_TYPES.includes(mime)
          ? data
          : (data as Workbook).getWorksheet(),
      });
    } catch (error) {
      cb(error);
    }
  }

  // Nothing was written to disk, so there is nothing to clean up.
  _removeFile(req: Request, file: any, cb: any) {
    cb(null);
  }
}
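This custom storage engine checks the file’s MIME type and ensures it’s either a CSV or XLS/XLSX file. The type is detected from the file’s leading bytes where possible; plain-text formats such as CSV carry no signature, so for those the engine falls back to the MIME type declared by the client. It then processes the file entirely in memory using Node.js streams, so no temporary files are created on the server. This keeps sensitive data off the disk and avoids a separate cleanup step.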
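The type check at the heart of the engine can be isolated into small pure functions, which makes the fallback rule easy to test on its own. The sketch below is illustrative only: `resolveMime` and `isAllowedMime` are hypothetical helper names that are not part of the engine, and the allow-list simply mirrors the one used there.

```typescript
// Hypothetical helpers (not part of the engine above) that mirror its MIME logic.
const ALLOWED_MIME_TYPES: readonly string[] = [
  'text/csv',
  'text/comma-separated-values',
  'application/vnd.ms-excel',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
];

// Prefer the type sniffed from the file's leading bytes; fall back to the
// client-declared type when sniffing finds nothing (plain text such as CSV
// has no signature to detect).
function resolveMime(detected: string | undefined, declared: string): string {
  return detected ?? declared;
}

function isAllowedMime(mime: string): boolean {
  return ALLOWED_MIME_TYPES.includes(mime);
}

// A real CSV: nothing is sniffed, so the declared type is used.
const csvMime = resolveMime(undefined, 'text/csv');
// A PNG renamed to .csv: the sniffed type wins over the declared one.
const pngMime = resolveMime('image/png', 'text/csv');

console.log(isAllowedMime(csvMime), isAllowedMime(pngMime)); // true false
```

Because CSV uploads rely on the declared type, the parser itself is the last check for text files, so parse errors should be treated as a rejected upload.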
The parser factory is responsible for determining the appropriate parser based on the file type.
Create a new file for our parser, parser-factory.ts, next to the engine:
import excel from 'exceljs';

export function createParserCsvOrXlsx(mime: string) {
  const workbook = new excel.Workbook();

  return [
    'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    'application/vnd.ms-excel',
  ].includes(mime)
    ? workbook.xlsx
    : workbook.csv;
}
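This factory function checks the MIME type and returns the appropriate parser (either xlsx or csv). Note that 'application/vnd.ms-excel' is routed to the XLSX reader, which expects the zip-based .xlsx format, so legacy binary .xls files are unlikely to parse.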
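The engine treats the CSV result as a worksheet and calls getWorksheet() on the XLSX result, so the two readers resolve to different shapes. One option is to tag the result so callers never have to inspect the MIME type a second time. This is a sketch of that design, not the article’s code: `SheetLike`, `BookLike`, `pickParserKind`, and `firstSheet` are stand-in names, and the interfaces model only the parts needed here.

```typescript
// Stand-in shapes for illustration; they model only what this sketch needs.
interface SheetLike {
  name: string;
}
interface BookLike {
  getWorksheet(): SheetLike | undefined;
}

type ParserKind = 'csv' | 'xlsx';

// Tagged result: the `kind` field tells the caller which shape it holds.
type ParseResult =
  | { kind: 'csv'; sheet: SheetLike }
  | { kind: 'xlsx'; book: BookLike };

const XLSX_MIME_TYPES: readonly string[] = [
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  'application/vnd.ms-excel',
];

// Same rule as the factory: spreadsheet MIME types go to the XLSX reader,
// everything else to the CSV reader.
function pickParserKind(mime: string): ParserKind {
  return XLSX_MIME_TYPES.includes(mime) ? 'xlsx' : 'csv';
}

// Collapse either result into a single worksheet.
function firstSheet(result: ParseResult): SheetLike | undefined {
  return result.kind === 'csv' ? result.sheet : result.book.getWorksheet();
}

const fakeBook: BookLike = { getWorksheet: () => ({ name: 'Sheet1' }) };
console.log(pickParserKind('text/comma-separated-values')); // csv
console.log(firstSheet({ kind: 'xlsx', book: fakeBook })?.name); // Sheet1
```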
Next, let’s create a controller to handle file uploads using our custom storage engine.
Generate a new controller:
nest g controller files
In files.controller.ts, configure the file upload using Multer and the custom storage engine:
import {
  Controller,
  Post,
  UploadedFile,
  UseInterceptors,
} from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { Worksheet } from 'exceljs';
import { CsvOrXlsxMulterEngine } from '../../shared/multer-engines/csv-xlsx/engine.js';
import { FilesService } from './files.service.js';

// The engine compares this against Content-Length, so the unit is bytes (about 1 GB).
const MAX_FILE_SIZE_IN_BYTES = 1_000_000_000; // Only for test

@Controller('files')
export class FilesController {
  constructor(private readonly filesService: FilesService) {}

  @UseInterceptors(
    FileInterceptor('file', {
      storage: new CsvOrXlsxMulterEngine({
        maxFileSize: MAX_FILE_SIZE_IN_BYTES,
        destKey: 'worksheet',
      }),
    }),
  )
  @Post()
  create(@UploadedFile() data: { worksheet: Worksheet }) {
    // `data` is whatever the engine passed to its callback, keyed by destKey.
    return this.filesService.format(data.worksheet);
  }
}
This controller sets up an endpoint to handle file uploads. The uploaded file is processed by the CsvOrXlsxMulterEngine, and the parsed data is returned in the response without ever being saved to disk.
Finally, we need to set up a module to include our controller.
Generate a new module:
nest g module files
In files.module.ts, register the controller and its service:
import { Module } from '@nestjs/common';
import { FilesController } from './files.controller.js';
import { FilesService } from './files.service.js';

@Module({
  providers: [FilesService],
  controllers: [FilesController],
})
export class FilesModule {}
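Make sure to import this module into your AppModule. The complete AppModule, with FilesModule listed in its imports, appears in the static-file setup later in this article.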
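To test the file upload functionality, we can create a simple HTML page that allows users to upload CSV or XLS/XLSX files. This page will send the file to our /api/files endpoint, where it will be parsed and processed in memory. The /api prefix assumes a global prefix of api is configured for the application; without one, the controller shown earlier is reachable at /files.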
The test page is a basic HTML file titled “File Upload”, with the heading “Upload a File (CSV or XLSX)” above the upload form.
To render the HTML page for file uploads, we first need to install an additional NestJS module called @nestjs/serve-static. You can do this by running the following command:
npm install @nestjs/serve-static
After installing, we need to configure this module in AppModule:
import { Module } from '@nestjs/common';
import { join } from 'path';
import { fileURLToPath } from 'url';
import { ServeStaticModule } from '@nestjs/serve-static';
import { FilesModule } from './modules/files/files.module.js';

@Module({
  imports: [
    FilesModule,
    ServeStaticModule.forRoot({
      // fileURLToPath decodes the URL and yields a valid path on every OS,
      // which the raw URL pathname does not guarantee.
      rootPath: join(fileURLToPath(new URL('..', import.meta.url)), 'public'),
      serveRoot: '/',
    }),
  ],
})
export class AppModule {}
This setup allows us to serve static files from the public directory. We can now open the file upload page by navigating to http://localhost:3000 in the browser.
Upload Your File
To upload a file, choose a CSV or XLSX file on the page and submit the form.
Once the file is uploaded successfully, you should see a confirmation that the file has been uploaded and formatted.
Note: I haven’t included code for formatting the uploaded file, as this depends on the library you choose for processing CSV or XLS/XLSX files. You can view the complete implementation on GitHub.
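As one possible starting point for that formatting step, the sketch below turns parsed rows into objects keyed by the header row. It is an assumption about what `format` might do, not the author’s implementation: `rowsToRecords` is a hypothetical name, and the input is modeled as a plain array of rows so the example stays independent of any parsing library.

```typescript
type Cell = string | number | boolean | null;

// Treat the first row as headers and map each later row to an object.
function rowsToRecords(rows: Cell[][]): Record<string, Cell>[] {
  if (rows.length === 0) return [];
  const [headerRow, ...dataRows] = rows;

  // Blank header cells get a positional fallback name such as "column_3".
  const headers = headerRow.map((h, i) =>
    h === null || String(h).trim() === '' ? `column_${i + 1}` : String(h).trim(),
  );

  return dataRows.map((row) => {
    const record: Record<string, Cell> = {};
    headers.forEach((header, i) => {
      record[header] = row[i] ?? null; // short rows are padded with null
    });
    return record;
  });
}

const records = rowsToRecords([
  ['name', 'age'],
  ['Ada', 36],
  ['Linus'],
]);
console.log(JSON.stringify(records));
// [{"name":"Ada","age":36},{"name":"Linus","age":null}]
```

With a real worksheet you would first collect each row’s cell values into such an array; how that is done depends on the library you choose.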
Comparing Pros and Cons of In-Memory File Processing
When deciding whether to use in-memory file processing or saving files to disk, it’s important to understand the trade-offs.
Pros: no temporary files on disk, faster processing, and simplified cleanup.
Cons: memory usage, file size limitations, and complexity in error handling.
When to Use In-Memory Processing
Small to Medium Files: If your application deals with relatively small files, in-memory processing can offer speed and simplicity.
Security-Sensitive Applications: When handling sensitive data that shouldn’t be stored on disk, in-memory processing can reduce the risk of data breaches.
High-Performance Scenarios: Applications that require high throughput and minimal latency may benefit from the reduced overhead of in-memory processing.
When to Use Disk-Based Processing
Large Files: If your application needs to process very large files, disk-based processing may be necessary to avoid running out of memory.
Resource-Constrained Environments: In cases where server memory is limited, processing files on disk can prevent memory exhaustion and allow for better resource management.
Persistent Storage Needs: If you need to retain a copy of the uploaded file for auditing, backup, or later retrieval, saving files to disk is necessary.
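Whichever side of the trade-off you choose, a size limit is more reliable when enforced on the bytes actually received than on the Content-Length header, which describes the whole multipart request and can be absent. The following is a minimal sketch of such a guard built on Node’s Transform stream; `createByteLimiter` is a hypothetical helper and is not part of the engine shown earlier.

```typescript
import { Transform } from 'node:stream';

// Pass chunks through unchanged until the running total exceeds maxBytes,
// then fail the stream so downstream parsing stops early.
function createByteLimiter(maxBytes: number): Transform {
  let seen = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      seen += chunk.length;
      if (seen > maxBytes) {
        callback(new Error(`Max file size is ${maxBytes} bytes.`));
        return;
      }
      callback(null, chunk);
    },
  });
}

// Example: a 5-byte budget accepts the first 4 bytes and rejects the next 4.
const limiter = createByteLimiter(5);
limiter.on('error', (err) => console.log(err.message));
limiter.on('data', (chunk: Buffer) => console.log(`passed ${chunk.length} bytes`));
limiter.write(Buffer.alloc(4));
limiter.write(Buffer.alloc(4));
```

In an upload handler the limiter would sit between the incoming file stream and the parser. Since pipe() does not forward errors from one stream to the next, connecting the stages with pipeline from the stream module is the safer choice.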
Integration with External Storage Services: For large files, consider uploading them to external storage services like AWS S3, Google Cloud
Scalability: Cloud storage solutions can handle massive files and provide redundancy, ensuring that your data is safe and easily accessible from multiple geographic locations.
Cost Efficiency: Using cloud storage can be more cost-effective for handling large files, as it reduces the need for local server resources and provides pay-as-you-go pricing.
In this article, we’ve created a custom file upload module in NestJS that handles CSV and XLS/XLSX files, parses them in memory, and returns the parsed data without saving any files to disk. This approach leverages the power of Node.js streams, making it both efficient and secure, as no temporary files are left on the server.
We’ve also explored the pros and cons of in-memory file processing versus saving files to disk. While in-memory processing offers speed, security, and simplicity, it’s important to consider the memory usage and potential file size limitations before adopting this approach.
Whether you’re building an enterprise application or a small project, handling file uploads and parsing correctly is crucial. With this setup, you’re well on your way to mastering file uploads in NestJS without worrying about unnecessary server storage or data security issues.
Feel free to share your thoughts and improvements in the comments section below!
If you enjoyed this article or found these tools useful, make sure to follow me on Dev.to for more insights and tips on coding and development. I regularly share helpful content to make your coding journey smoother.