A high-performance CSV processing engine designed to handle multi-gigabyte files with a constant, low memory footprint. Leveraging Node.js Streams and backpressure, it prevents the “Out-of-Memory” (OOM) crashes common in naive implementations.
A common scenario in Node.js backends:

1. A user uploads a `products.csv` file (500MB, 2 million rows).
2. A naive implementation reads the entire file into memory (`fs.readFile`).
3. The process dies with `FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory`.
4. The server crashes, restarting all active connections.

This project solves this by treating data as a Flow, not a Block.

```mermaid
flowchart LR
    Client[Client Upload]
    filesys[Busboy Stream]
    parser[CSV Parser]
    batcher[Batch Processor]
    db[(MongoDB)]
    Client -->|MultiPart Stream| filesys
    filesys -->|Pipe| parser
    parser -->|Row by Row| batcher
    subgraph Memory Protection [Node.js Pipeline]
        direction TB
        batcher -- "Buffer Full (1k rows)" --> db
        db -. "Ack (Backpressure)" .-> batcher
        batcher -. "Pause Reading" .-> parser
    end
```
Under the hood:

- The upload is wired through `stream.pipeline` to ensure proper cleanup and error handling. If the request is aborted, the file stream closes immediately.
- Rows are grouped into `bulkWrite` operations, increasing throughput by ~50x.

Main dependencies:

- `@fastify/multipart` (Busboy wrapper)
- `csv-parse` (streamable parser)

Setup:

```bash
npm install

# Start MongoDB (if using Docker)
docker-compose up -d

# Start Application
npm run dev
```
Instead of manually searching for a large CSV, use the included helper scripts to generate and upload a test file.
A. Generate a 100MB CSV

```bash
npx ts-node scripts/generate-csv.ts
```

Creates `large_file.csv` in the project root (100MB).
B. Stream Upload to Server

```bash
npx ts-node scripts/test-upload.ts
```

This script streams the file (preventing client-side OOM) and logs the server response.
Observe the logs:

```
[Progress] Processed: 10000 | Failed: 0 | Heap: 42MB
[Progress] Processed: 20000 | Failed: 0 | Heap: 43MB  <-- Stable memory!
```
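The heap figure in these logs can be read from `process.memoryUsage()`; a minimal sketch of such a formatter (a hypothetical helper, not the project's actual logger):

```typescript
// Builds a progress line in the same shape as the logs above.
// heapUsed is reported in bytes, so convert to MB for readability.
export function formatProgress(processed: number, failed: number): string {
  const heapMB = Math.round(process.memoryUsage().heapUsed / 1024 / 1024);
  return `[Progress] Processed: ${processed} | Failed: ${failed} | Heap: ${heapMB}MB`;
}
```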
To visualize the memory efficiency in real-time without external tools:
- Open `test-client.html` in your browser and drag and drop the file, or
- Run `npm run upload`.

Known limitations:

| Limitation | Solution in v2.0 |
|---|---|
| Single Node | Works on one server. For distributed processing, we would need to stream the file to Amazon S3 first and trigger an SQS Worker. |
Gérson Resplandes, Backend Engineer focused on Performance & Stream Architectures.