Merge multiple CSV files in JavaScript (Node.js and browser)

January 12, 2026

Merging CSV files is a common task in data processing workflows. Whether you're consolidating monthly reports, combining data exports from different systems, or building ETL pipelines, JavaScript provides several approaches to handle this efficiently.

This tutorial covers merging CSV files in both Node.js and browser environments, with TypeScript support and memory-efficient streaming for large files.

Prerequisites

  • Node.js 18+ (for server-side examples)
  • npm or yarn
  • Basic familiarity with JavaScript/TypeScript

Library options

Before diving into code, here's a comparison of popular CSV libraries for JavaScript:

Library       Best For                               Environment   Streaming            Dependents
Papa Parse    Browser + Node, simple API             Both          Yes (Web Workers)    2,406
csv-parse     Node.js streams, TypeScript            Node.js       Yes (Transform)      2,667
csv-merger    Quick CLI merges                       Node.js       No                   4
data-forge    DataFrame operations, SQL-like joins   Node.js       No                   N/A

For most use cases, Papa Parse offers the best balance of simplicity and features. For Node.js applications processing large files, csv-parse provides better streaming support.

Step 1: Basic merge with Papa Parse (Node.js)

Install Papa Parse:

npm install papaparse

Here's a straightforward approach for merging small to medium CSV files:

const Papa = require('papaparse');
const fs = require('fs');

function mergeCSVFiles(filePaths, outputPath) {
  let allData = [];
  let headers = null;

  for (const filePath of filePaths) {
    const fileContent = fs.readFileSync(filePath, 'utf8');
    const result = Papa.parse(fileContent, { header: true, skipEmptyLines: true });

    if (!headers) {
      headers = result.meta.fields;
    }
    allData = allData.concat(result.data);
  }

  // Use the first file's column order for the output header row
  const csv = Papa.unparse(allData, { columns: headers });
  fs.writeFileSync(outputPath, csv);
  console.log(`Merged ${filePaths.length} files into ${outputPath}`);
  return csv;
}

// Usage
mergeCSVFiles(
  ['sales_jan.csv', 'sales_feb.csv', 'sales_mar.csv'],
  'sales_q1.csv'
);

This function reads each CSV file, parses it with headers enabled, concatenates all rows, and writes the merged result. The header: true option treats the first row as column names and returns data as an array of objects.
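
To see concretely what that option changes, here is a small standalone comparison using inline sample data:

import Papa from 'papaparse';

const sample = 'name,amount\nAlice,100\nBob,200';

// header: true turns each row into an object keyed by column name
const withHeader = Papa.parse(sample, { header: true });
console.log(withHeader.data);
// [ { name: 'Alice', amount: '100' }, { name: 'Bob', amount: '200' } ]

// Without it, rows come back as plain arrays and the header is just row one
const withoutHeader = Papa.parse(sample);
console.log(withoutHeader.data);
// [ [ 'name', 'amount' ], [ 'Alice', '100' ], [ 'Bob', '200' ] ]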

Step 2: TypeScript implementation with type safety

For TypeScript projects, add proper type definitions:

npm install papaparse @types/papaparse

import Papa from 'papaparse';
import * as fs from 'fs';

interface CSVRow {
  [key: string]: string;
}

interface MergeResult {
  data: CSVRow[];
  headers: string[];
  rowCount: number;
}

function mergeCSVFiles(filePaths: string[], outputPath: string): MergeResult {
  const allData: CSVRow[] = [];
  let headers: string[] = [];

  for (const filePath of filePaths) {
    const fileContent = fs.readFileSync(filePath, 'utf8');
    const result = Papa.parse<CSVRow>(fileContent, {
      header: true,
      skipEmptyLines: true,
    });

    if (headers.length === 0 && result.meta.fields) {
      headers = result.meta.fields;
    }

    allData.push(...result.data);
  }

  const csv = Papa.unparse(allData);
  fs.writeFileSync(outputPath, csv);

  return {
    data: allData,
    headers,
    rowCount: allData.length,
  };
}

// Usage
const result = mergeCSVFiles(
  ['data/users_2024.csv', 'data/users_2025.csv'],
  'data/users_merged.csv'
);
console.log(`Merged ${result.rowCount} rows with headers: ${result.headers.join(', ')}`);

The generic type parameter Papa.parse<CSVRow> ensures type safety when working with parsed data.
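
If your files have a known schema, you can replace the generic CSVRow with a more specific interface. A minimal sketch, assuming a hypothetical SalesRow shape (the field names are illustrative):

import Papa from 'papaparse';

// Hypothetical row shape; adjust the fields to match your actual columns
interface SalesRow {
  date: string;
  region: string;
  amount: string; // values stay strings unless dynamicTyping is enabled
}

const csvText = 'date,region,amount\n2025-01-05,EMEA,1200';
const parsed = Papa.parse<SalesRow>(csvText, { header: true, skipEmptyLines: true });

parsed.data.forEach((row) => {
  // row is typed as SalesRow, so a typo such as row.regoin fails to compile
  console.log(`${row.date} ${row.region}: ${row.amount}`);
});

Keep in mind the generic is only a compile-time assertion; Papa Parse does not validate parsed rows against it at runtime.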

Step 3: Memory-efficient streaming for large files

When working with large CSV files (hundreds of thousands of rows), loading everything into memory can exhaust the heap and crash the process. Use streams instead:

npm install csv-parse csv-stringify

import { createReadStream, createWriteStream } from 'fs';
import { parse } from 'csv-parse';
import { stringify } from 'csv-stringify';

interface StreamMergeOptions {
  inputFiles: string[];
  outputFile: string;
  onProgress?: (file: string, rowCount: number) => void;
}

async function streamMergeCSV(options: StreamMergeOptions): Promise<number> {
  const { inputFiles, outputFile, onProgress } = options;

  const outputStream = createWriteStream(outputFile);
  const stringifier = stringify({ header: true });
  stringifier.pipe(outputStream);

  let totalRows = 0;

  for (const inputFile of inputFiles) {
    let fileRows = 0;

    await new Promise<void>((resolve, reject) => {
      createReadStream(inputFile)
        .pipe(parse({ columns: true, skip_empty_lines: true }))
        .on('data', (row: Record<string, string>) => {
          stringifier.write(row);
          fileRows++;
          totalRows++;
        })
        .on('end', () => {
          onProgress?.(inputFile, fileRows);
          resolve();
        })
        .on('error', reject);
    });
  }

  stringifier.end();

  return new Promise((resolve, reject) => {
    outputStream.on('finish', () => resolve(totalRows));
    outputStream.on('error', reject);
  });
}

// Usage
streamMergeCSV({
  inputFiles: ['large_file_1.csv', 'large_file_2.csv', 'large_file_3.csv'],
  outputFile: 'merged_output.csv',
  onProgress: (file, count) => console.log(`Processed ${file}: ${count} rows`),
}).then((total) => {
  console.log(`Total rows merged: ${total}`);
});

This approach processes one row at a time, keeping memory usage constant regardless of file size. The columns: true option in csv-parse works like Papa Parse's header: true.
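
To make that equivalence concrete, here is what columns: true does in a small standalone snippet using csv-parse's sync API:

import { parse } from 'csv-parse/sync';

const sample = 'id,name\n1,Alice\n2,Bob\n';

// columns: true uses the first line as object keys, like Papa Parse's header: true
const objects = parse(sample, { columns: true, skip_empty_lines: true });
// [ { id: '1', name: 'Alice' }, { id: '2', name: 'Bob' } ]

// Without it, each record is a plain array of strings
const arrays = parse(sample, { skip_empty_lines: true });
// [ [ 'id', 'name' ], [ '1', 'Alice' ], [ '2', 'Bob' ] ]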

Step 4: Browser implementation

In the browser, use the File API with Papa Parse. Install Papa Parse via CDN or npm:

<script src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.5.3/papaparse.min.js"></script>

Or with a bundler:

npm install papaparse

import Papa from 'papaparse';

interface ParsedCSV {
  data: Record<string, string>[];
  meta: { fields?: string[] };
}

function handleFileUpload(files: FileList): void {
  const allData: Record<string, string>[] = [];
  let filesProcessed = 0;

  Array.from(files).forEach((file) => {
    Papa.parse(file, {
      header: true,
      skipEmptyLines: true,
      complete: (results: ParsedCSV) => {
        allData.push(...results.data);
        filesProcessed++;

        if (filesProcessed === files.length) {
          const mergedCSV = Papa.unparse(allData);
          downloadCSV(mergedCSV, 'merged.csv');
        }
      },
      error: (error: Error) => {
        console.error(`Error parsing ${file.name}:`, error.message);
      },
    });
  });
}

function downloadCSV(csv: string, filename: string): void {
  const blob = new Blob([csv], { type: 'text/csv;charset=utf-8;' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  link.click();
  URL.revokeObjectURL(link.href);
}

// HTML: <input type="file" multiple accept=".csv" onchange="handleFileUpload(this.files)" />

The Papa.parse(file, config) function accepts a File object directly, parsing it asynchronously.
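
Since the result only arrives in the complete callback, it can be convenient to wrap the call in a Promise and await files one at a time. A possible wrapper (parseCSVFile and mergeUploadedFiles are our own helper names):

import Papa from 'papaparse';

// Hypothetical helper: resolves with the parsed rows of a single File
function parseCSVFile(file: File): Promise<Record<string, string>[]> {
  return new Promise((resolve, reject) => {
    Papa.parse<Record<string, string>>(file, {
      header: true,
      skipEmptyLines: true,
      complete: (results) => resolve(results.data),
      error: (error) => reject(error),
    });
  });
}

// Await uploads sequentially instead of counting completion callbacks
async function mergeUploadedFiles(files: FileList): Promise<string> {
  const allData: Record<string, string>[] = [];
  for (const file of Array.from(files)) {
    allData.push(...(await parseCSVFile(file)));
  }
  return Papa.unparse(allData);
}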

Step 5: Web Workers for large browser files

Large files freeze the browser UI during processing. Papa Parse supports Web Workers to offload parsing to a background thread:

import Papa from 'papaparse';

interface RowData {
  data: Record<string, string>;
}

function processLargeFile(file: File, onRow: (row: Record<string, string>) => void): Promise<void> {
  return new Promise((resolve, reject) => {
    Papa.parse(file, {
      header: true,
      worker: true,
      step: (row: RowData) => {
        onRow(row.data);
      },
      complete: () => {
        resolve();
      },
      error: (error: Error) => {
        reject(error);
      },
    });
  });
}

async function mergeWithWorker(files: File[]): Promise<string> {
  const allData: Record<string, string>[] = [];

  for (const file of files) {
    await processLargeFile(file, (row) => {
      allData.push(row);
    });
    console.log(`Processed ${file.name}`);
  }

  return Papa.unparse(allData);
}

The worker: true option creates a Web Worker automatically. The step callback receives each row as it's parsed, allowing for progress indicators or early termination.
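
For example, the step callback can drive a progress counter and abort once enough rows have been collected. A sketch (previewLargeFile and the row limit are illustrative):

import Papa from 'papaparse';

// Hypothetical: read only the first maxRows of a large upload, off the main thread
function previewLargeFile(
  file: File,
  maxRows: number,
  onProgress: (rowsSoFar: number) => void
): Promise<Record<string, string>[]> {
  const rows: Record<string, string>[] = [];

  return new Promise((resolve, reject) => {
    Papa.parse<Record<string, string>>(file, {
      header: true,
      worker: true,
      step: (result, parser) => {
        rows.push(result.data);
        onProgress(rows.length);
        if (rows.length >= maxRows) {
          parser.abort(); // stop early; complete still fires afterwards
        }
      },
      complete: () => resolve(rows),
      error: (error) => reject(error),
    });
  });
}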

Step 6: Handling different column headers

CSV files from different sources often have mismatched columns. This function normalizes headers before merging:

interface CSVData {
  headers: string[];
  rows: Record<string, string>[];
}

function collectUniqueHeaders(datasets: CSVData[]): string[] {
  const headerSet = new Set<string>();

  for (const dataset of datasets) {
    for (const header of dataset.headers) {
      headerSet.add(header);
    }
  }

  return Array.from(headerSet);
}

function normalizeAndMerge(datasets: CSVData[]): Record<string, string>[] {
  const allHeaders = collectUniqueHeaders(datasets);
  const normalizedData: Record<string, string>[] = [];

  for (const dataset of datasets) {
    for (const row of dataset.rows) {
      const normalizedRow: Record<string, string> = {};

      for (const header of allHeaders) {
        normalizedRow[header] = row[header] ?? '';
      }

      normalizedData.push(normalizedRow);
    }
  }

  return normalizedData;
}

// Usage with Papa Parse
function parseAndMerge(filePaths: string[]): string {
  const datasets: CSVData[] = filePaths.map((filePath) => {
    const content = fs.readFileSync(filePath, 'utf8');
    const result = Papa.parse<Record<string, string>>(content, { header: true });

    return {
      headers: result.meta.fields ?? [],
      rows: result.data,
    };
  });

  const merged = normalizeAndMerge(datasets);
  return Papa.unparse(merged);
}

This approach collects all unique headers from every file, then fills in empty strings for any missing columns. The result is a consistent schema across all merged rows.
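
For example, merging an export that has an email column with one that has a phone column instead would look like this (illustrative data):

// Two illustrative datasets with partially overlapping columns
const crmExport: CSVData = {
  headers: ['name', 'email'],
  rows: [{ name: 'Alice', email: 'alice@example.com' }],
};
const legacyExport: CSVData = {
  headers: ['name', 'phone'],
  rows: [{ name: 'Bob', phone: '555-0100' }],
};

const merged = normalizeAndMerge([crmExport, legacyExport]);
// [
//   { name: 'Alice', email: 'alice@example.com', phone: '' },
//   { name: 'Bob', email: '', phone: '555-0100' },
// ]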

Complete example

Here's a production-ready Node.js script that handles all edge cases:

import Papa from 'papaparse';
import * as fs from 'fs';
import * as path from 'path';

interface MergeOptions {
  inputDir: string;
  outputFile: string;
  filePattern?: RegExp;
  normalizeHeaders?: boolean;
}

interface MergeStats {
  filesProcessed: number;
  totalRows: number;
  headers: string[];
  outputPath: string;
}

function mergeCSVDirectory(options: MergeOptions): MergeStats {
  const { inputDir, outputFile, filePattern = /\.csv$/i, normalizeHeaders = true } = options;

  // Find all CSV files
  const files = fs.readdirSync(inputDir)
    .filter((file) => filePattern.test(file))
    .map((file) => path.join(inputDir, file))
    .sort();

  if (files.length === 0) {
    throw new Error(`No CSV files found in ${inputDir}`);
  }

  // Parse all files
  const datasets = files.map((filePath) => {
    const content = fs.readFileSync(filePath, 'utf8');
    const result = Papa.parse<Record<string, string>>(content, {
      header: true,
      skipEmptyLines: true,
      transformHeader: (header) => header.trim(),
    });

    return {
      file: path.basename(filePath),
      headers: result.meta.fields ?? [],
      rows: result.data,
    };
  });

  // Collect headers
  let allHeaders: string[];
  if (normalizeHeaders) {
    const headerSet = new Set<string>();
    datasets.forEach((d) => d.headers.forEach((h) => headerSet.add(h)));
    allHeaders = Array.from(headerSet);
  } else {
    allHeaders = datasets[0].headers;
  }

  // Merge data
  const mergedData: Record<string, string>[] = [];
  for (const dataset of datasets) {
    for (const row of dataset.rows) {
      const normalizedRow: Record<string, string> = {};
      for (const header of allHeaders) {
        normalizedRow[header] = row[header] ?? '';
      }
      mergedData.push(normalizedRow);
    }
  }

  // Write output
  const csv = Papa.unparse(mergedData, { columns: allHeaders });
  fs.writeFileSync(outputFile, csv);

  return {
    filesProcessed: files.length,
    totalRows: mergedData.length,
    headers: allHeaders,
    outputPath: outputFile,
  };
}

// Usage
const stats = mergeCSVDirectory({
  inputDir: './data/monthly-reports',
  outputFile: './data/annual-report.csv',
  filePattern: /^report_\d{4}_\d{2}\.csv$/,
  normalizeHeaders: true,
});

console.log(`Merged ${stats.filesProcessed} files`);
console.log(`Total rows: ${stats.totalRows}`);
console.log(`Headers: ${stats.headers.join(', ')}`);
console.log(`Output: ${stats.outputPath}`);

Common pitfalls

Memory exhaustion with large files

Loading entire files into memory fails with large CSVs. Node.js crashes with "JavaScript heap out of memory" errors.

// Problem: loads everything into memory
const data = fs.readFileSync('huge.csv', 'utf8');
Papa.parse(data);

// Solution: use streaming
Papa.parse(fs.createReadStream('huge.csv'), {
  step: function(row) {
    processRow(row.data); // processRow is your own per-row handler
  }
});

For files over 100MB, always use streaming with csv-parse or Papa Parse's step callback.

Encoding issues with non-UTF-8 files

Files exported from older systems often use ISO-8859-1 or Windows-1252 encoding, causing garbled characters.

// Specify encoding explicitly
Papa.parse(file, {
  encoding: 'ISO-8859-1',
  complete: function(results) {
    console.log(results);
  }
});

// Or with Node.js streams
fs.createReadStream('file.csv', { encoding: 'latin1' })
  .pipe(parse({ columns: true }));

Fields containing commas or quotes

CSV fields with special characters break naive parsing. Both Papa Parse and csv-parse handle this automatically when using their parsers.

// Papa Parse handles this correctly
Papa.parse('"Name, Jr.",value,"He said ""Hello"""');
// .data is [['Name, Jr.', 'value', 'He said "Hello"']]

Never parse CSV files with simple split(',') logic.
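
For comparison, here is what naive splitting does to the same kind of field:

// split(',') cuts the quoted field apart and keeps the quote characters
'"Name, Jr.",value'.split(',');
// [ '"Name', ' Jr."', 'value' ]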

Browser UI freezing

Large file processing blocks the main thread, freezing the browser. Enable Web Workers:

Papa.parse(largeFile, {
  worker: true,  // Offload to Web Worker
  step: function(row, parser) {
    if (shouldStop) { // shouldStop: your own cancellation flag
      parser.abort();
    }
  }
});

Inconsistent line endings

Files may use Windows (CRLF), Unix (LF), or Mac (CR) line endings. Papa Parse and csv-parse normalize these automatically, but if you're using manual parsing, account for all three:

const lines = content.split(/\r\n|\n|\r/);

When you need a visual interface

Building CSV merge functionality into a user-facing application requires handling file uploads, validation, error states, and progress indicators. If you're building a web application that needs CSV import capabilities with column mapping and validation, ImportCSV provides a React component that handles these concerns out of the box, including automatic schema detection and support for large files.

Wrap-up

CSV imports shouldn't slow you down. ImportCSV is built to fit into your workflow, whether you're building data import flows, handling customer uploads, or processing large datasets.

If that sounds like the kind of tooling you want to use, try ImportCSV.