Handle CSV newlines inside fields (multiline cell values)

Your CSV parser just broke. A user uploaded a file where the "description" column contains line breaks, and now your code treats each line as a separate row. The data is mangled beyond recognition.
This happens because naive CSV parsing with `split('\n')` cannot distinguish between newlines that separate records and newlines that are part of a field's content. The solution lies in understanding how RFC 4180 handles quoted fields and using a proper parser.
Prerequisites
- Node.js 18+
- Basic familiarity with JavaScript/TypeScript
- A code editor
What you'll learn
By the end of this tutorial, you'll understand:
- Why simple string splitting breaks on multiline CSV fields
- How RFC 4180 defines quoted field handling
- How to parse CSVs with multiline content using Papa Parse (browser) and csv-parse (Node.js)
- How to write/export CSV files that preserve newlines correctly
- Common pitfalls and how to avoid them
Step 1: Understanding the problem
Consider this CSV data with a multiline description:
```
name,description
Alice,"Hello
World"
Bob,Simple
```
Alice's description contains a newline between "Hello" and "World". Here's what happens when you parse it naively:
```typescript
// This breaks on multiline fields
const csvData = `name,description
Alice,"Hello
World"
Bob,Simple`;

const rows = csvData.split('\n');
console.log(rows);
// Output:
// ['name,description', 'Alice,"Hello', 'World"', 'Bob,Simple']
```

The parser produced 4 rows instead of 3. Alice's description was torn apart, creating an orphan 'World"' row that will cause errors downstream. This is the exact problem developers encounter when users upload CSVs exported from Excel or other tools that allow multiline cell content.
Step 2: RFC 4180 quoting rules
The closest thing CSV has to an official specification, RFC 4180, defines exactly how to handle this scenario.
Rule 6: Fields containing line breaks (CRLF), double quotes, or commas must be enclosed in double quotes.
Rule 7: A double quote inside a quoted field must be escaped by preceding it with another double quote.
The ABNF grammar from the spec:
```
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
```
This explicitly allows CR (carriage return) and LF (line feed) characters within quoted fields. When a parser encounters an opening double quote, it must continue reading until it finds a closing double quote, treating any newlines in between as part of the field content rather than as record separators.
Key insight: A compliant CSV parser must track whether it's currently inside a quoted field. Only newlines outside quoted fields separate records.
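To make this concrete, here's a minimal sketch of that quote-tracking state machine. It only splits text into records (no field splitting or unescaping) and ignores many edge cases, so treat it as an illustration of the rule rather than a replacement for a real parser:

```typescript
// Minimal sketch: split CSV text into records while respecting quotes.
function splitRecords(csv: string): string[] {
  const records: string[] = [];
  let current = '';
  let insideQuotes = false;

  for (let i = 0; i < csv.length; i++) {
    const char = csv[i];
    if (char === '"') {
      // A doubled quote inside a quoted field is an escaped quote,
      // not a closing quote
      if (insideQuotes && csv[i + 1] === '"') {
        current += '""';
        i++; // skip the second quote
        continue;
      }
      insideQuotes = !insideQuotes;
      current += char;
    } else if ((char === '\n' || char === '\r') && !insideQuotes) {
      // Newlines outside quotes end a record; treat \r\n as one break
      if (char === '\r' && csv[i + 1] === '\n') i++;
      if (current.length > 0) records.push(current);
      current = '';
    } else {
      current += char;
    }
  }
  if (current.length > 0) records.push(current);
  return records;
}

console.log(splitRecords('name,description\nAlice,"Hello\nWorld"\nBob,Simple'));
// ['name,description', 'Alice,"Hello\nWorld"', 'Bob,Simple'] -- 3 records
```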
Step 3: Parsing multiline fields with Papa Parse (browser)
Papa Parse is the most popular CSV parser for browser-based JavaScript applications. It handles quoted fields with embedded newlines automatically.
```bash
npm install papaparse
npm install --save-dev @types/papaparse
```

```typescript
import Papa from 'papaparse';
interface Person {
name: string;
description: string;
}
const csvData = `name,description
Alice,"Hello
World"
Bob,Simple`;
const results = Papa.parse<Person>(csvData, {
header: true,
skipEmptyLines: true,
});
console.log(results.data);
// Output:
// [
// { name: 'Alice', description: 'Hello\nWorld' },
// { name: 'Bob', description: 'Simple' }
// ]
```

Papa Parse correctly identifies that the newline between "Hello" and "World" is inside double quotes, preserving it as part of Alice's description rather than splitting the record.
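Papa Parse doesn't throw on malformed input; instead it records problems (such as an unterminated quote) in the errors array on the result object, so it's worth checking before trusting the data:

```typescript
// `results` comes from the Papa.parse call above
if (results.errors.length > 0) {
  for (const err of results.errors) {
    // `row` may be undefined for file-level errors
    console.warn(`Row ${err.row ?? '?'}: ${err.message}`);
  }
}
```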
Configuration options
Papa Parse handles most cases out of the box. Here are the relevant options:
```typescript
Papa.parse(csvData, {
header: true, // First row contains column names
skipEmptyLines: true, // Ignore blank lines
quoteChar: '"', // Character used to quote fields (default: ")
escapeChar: '"', // Character used to escape quotes (default: ")
newline: '', // Auto-detect line endings
});
```

The `newline: ''` setting (empty string) tells Papa Parse to auto-detect whether the file uses \n, \r\n, or \r as record separators.
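As a quick sanity check, here's a sketch showing that auto-detection keeps a quoted \n intact when records are separated by \r\n (reusing the Person interface from above):

```typescript
// CRLF record separators, with a bare \n inside a quoted field
const crlfData = 'name,description\r\nAlice,"Hello\nWorld"\r\nBob,Simple';

const result = Papa.parse<Person>(crlfData, {
  header: true,
  newline: '', // auto-detect picks \r\n as the record separator
});

console.log(result.data.length); // 2 -- the embedded \n did not split a record
```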
Step 4: Parsing multiline fields with csv-parse (Node.js)
For server-side Node.js applications, csv-parse from the node-csv suite provides robust RFC 4180 compliance.
```bash
npm install csv-parse
```

```typescript
import { parse } from 'csv-parse/sync';
interface Person {
name: string;
description: string;
}
const csvData = `name,description
Alice,"Hello
World"
Bob,Simple`;
const records = parse(csvData, {
columns: true, // Use first row as headers
skip_empty_lines: true,
bom: true, // Handle UTF-8 BOM from Excel
}) as Person[];
console.log(records);
// Output:
// [
// { name: 'Alice', description: 'Hello\nWorld' },
// { name: 'Bob', description: 'Simple' }
// ]
```

Streaming large files
For large CSV files, use the streaming API to avoid loading everything into memory:
```typescript
import { parse } from 'csv-parse';
import { createReadStream } from 'fs';
interface Person {
name: string;
description: string;
}
const parser = createReadStream('./data.csv').pipe(
parse({
columns: true,
bom: true,
})
);
const records: Person[] = [];
for await (const record of parser) {
records.push(record as Person);
}
console.log(`Parsed ${records.length} records`);
```
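Unlike naive splitting, csv-parse refuses to guess on malformed input: if a quote is opened and never closed, it throws rather than silently mangling records. A minimal sketch of catching that with the sync API (the exact error message varies by version):

```typescript
import { parse } from 'csv-parse/sync';

// The quote opened before "Hello" is never closed
const malformed = 'name,description\nAlice,"Hello\nBob,Simple';

try {
  parse(malformed, { columns: true });
} catch (err) {
  // csv-parse raises a descriptive error instead of returning bad rows
  console.error('CSV parse failed:', (err as Error).message);
}
```

Step 5: Writing CSV with multiline fields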
When generating CSV output, fields containing newlines, commas, or double quotes must be properly quoted. Here's how to do it with different approaches.
Using Papa Parse (browser/Node.js)
```typescript
import Papa from 'papaparse';
interface Person {
name: string;
description: string;
}
const data: Person[] = [
{ name: 'Alice', description: 'Hello\nWorld' },
{ name: 'Bob', description: 'Simple' },
];
const csv = Papa.unparse(data);
console.log(csv);
// Output:
// name,description
// Alice,"Hello
// World"
// Bob,Simple
```

Papa Parse's unparse() function automatically detects that Alice's description contains a newline and wraps it in double quotes.
Using csv-stringify (Node.js)
```bash
npm install csv-stringify
```

```typescript
import { stringify } from 'csv-stringify/sync';
interface Person {
name: string;
description: string;
}
const data: Person[] = [
{ name: 'Alice', description: 'Hello\nWorld' },
{ name: 'Bob', description: 'Simple' },
];
const csv = stringify(data, {
header: true,
columns: ['name', 'description'],
});
console.log(csv);
// Output:
// name,description
// Alice,"Hello
// World"
// Bob,Simple
```

Manual escape function
If you need to generate CSV without a library, use this escape function:
```typescript
function escapeCSVValue(value: string | number | null | undefined): string {
if (value === null || value === undefined) return '';
const str = String(value);
// Check if the value contains characters that require quoting
if (str.includes(',') || str.includes('\n') || str.includes('\r') || str.includes('"')) {
// Escape double quotes by doubling them, then wrap in quotes
return `"${str.replace(/"/g, '""')}"`;
}
return str;
}
function toCSVRow(values: (string | number | null | undefined)[]): string {
return values.map(escapeCSVValue).join(',');
}
// Usage
const header = toCSVRow(['name', 'description']);
const row1 = toCSVRow(['Alice', 'Hello\nWorld']);
const row2 = toCSVRow(['Bob', 'Simple']);
const csv = [header, row1, row2].join('\r\n');
console.log(csv);
```

This function follows RFC 4180: if a field contains a comma, newline, carriage return, or double quote, it wraps the entire field in double quotes and escapes any internal double quotes by doubling them.
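A quick way to verify the escaping is correct is to round-trip the output through a real parser and check that the multiline value survives. A sketch using Papa Parse, continuing from the snippet above (`csv` is the string just built):

```typescript
import Papa from 'papaparse';

// Parse the manually generated CSV and confirm the multiline field survived
const parsed = Papa.parse<{ name: string; description: string }>(csv, {
  header: true,
});

console.log(parsed.data[0].description === 'Hello\nWorld'); // true
```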
Step 6: Handling mixed line endings
A common edge case occurs when files mix \r\n (Windows) and \n (Unix) line endings. This happens when Windows and Unix users collaborate on the same data, or when content is copy-pasted from different sources.
The problem
```
name,description\r\n
Alice,"Hello\nWorld"\r\n
Bob,Simple\r\n
```
Here, \r\n separates records, but \n appears inside Alice's quoted field. Some parsers may incorrectly detect \n as the record delimiter.
Solution: Normalize line endings
Before parsing, normalize all line endings to a consistent format:
```typescript
function normalizeLineEndings(content: string): string {
// Convert all line endings to \n
return content.replace(/\r\n/g, '\n').replace(/\r/g, '\n');
}
const normalized = normalizeLineEndings(rawCSV);
const results = Papa.parse(normalized, { header: true });
```

Note that this rewrites line endings inside quoted fields too, which is usually acceptable but does alter field content. Alternatively, explicitly set the record delimiter if you know what the file uses:
```typescript
Papa.parse(csvData, {
header: true,
newline: '\r\n', // Explicitly set to CRLF
});
```

Complete example
Here's a React component that demonstrates parsing and displaying CSV data with multiline fields:
```tsx
import { useState } from 'react';
import Papa from 'papaparse';
interface CSVRow {
[key: string]: string;
}
export function CSVViewer() {
const [data, setData] = useState<CSVRow[]>([]);
const [headers, setHeaders] = useState<string[]>([]);
function handleFileUpload(event: React.ChangeEvent<HTMLInputElement>) {
const file = event.target.files?.[0];
if (!file) return;
Papa.parse<CSVRow>(file, {
header: true,
skipEmptyLines: true,
complete: (results) => {
setData(results.data);
if (results.data.length > 0) {
setHeaders(Object.keys(results.data[0]));
}
},
error: (error) => {
console.error('Parse error:', error.message);
},
});
}
return (
<div>
<input type="file" accept=".csv" onChange={handleFileUpload} />
{data.length > 0 && (
<table>
<thead>
<tr>
{headers.map((header) => (
<th key={header}>{header}</th>
))}
</tr>
</thead>
<tbody>
{data.map((row, index) => (
<tr key={index}>
{headers.map((header) => (
<td key={header} style={{ whiteSpace: 'pre-wrap' }}>
{row[header]}
</td>
))}
</tr>
))}
</tbody>
</table>
)}
</div>
);
}
```

Note the `whiteSpace: 'pre-wrap'` style on table cells. This preserves newlines in the displayed content so multiline fields render correctly.
Common pitfalls
Using split() instead of a proper parser
Problem: Using `csvString.split('\n')` or even `csvString.split(/\r?\n|\r/)` breaks when a quoted field contains a newline.
Solution: Always use a library like Papa Parse or csv-parse that understands RFC 4180 quoting rules.
Forgetting to quote fields when exporting
Problem: Generating CSV by simple concatenation without checking for special characters.
```typescript
// Wrong: fields are concatenated without quoting or escaping
const csv = data.map(row => `${row.name},${row.description}`).join('\n');
```

Solution: Use a library's unparse/stringify function, or apply the escapeCSVValue function shown above to every field.
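For comparison, a sketch of the corrected export using the toCSVRow helper from Step 5:

```typescript
// Right: every field passes through the escaping helper before joining
const csv = data
  .map((row) => toCSVRow([row.name, row.description]))
  .join('\r\n');
```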
Excel UTF-8 BOM issues
Problem: CSV files exported from Excel in UTF-8 format include a Byte Order Mark (BOM) at the start. This can cause the first header to be misread as "\ufeffname" instead of "name".
Solution: Enable BOM handling in your parser:
```typescript
// csv-parse
parse(csvData, { bom: true });

// Papa Parse handles BOM automatically in most cases
```
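If your parser doesn't expose a BOM option, stripping the mark manually before parsing is a one-liner (a sketch):

```typescript
// Remove a leading UTF-8 BOM (U+FEFF) if present
function stripBOM(content: string): string {
  return content.charCodeAt(0) === 0xfeff ? content.slice(1) : content;
}
```

Mixed \r\n and \n in the same file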
Problem: When record separators use \r\n but field content uses \n, auto-detection may guess wrong.
Solution: Normalize line endings before parsing, or explicitly set the newline option if you know the expected format.
Not escaping double quotes inside fields
Problem: A field contains a double quote character, but it wasn't escaped.
Wrong CSV:

```
name,quote
Alice,"She said "Hello""
```

Correct CSV:

```
name,quote
Alice,"She said ""Hello"""
```
Solution: When writing CSV, always escape double quotes by doubling them (" becomes "").
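The escapeCSVValue function from Step 5 already does this; for example:

```typescript
console.log(escapeCSVValue('She said "Hello"'));
// => "She said ""Hello"""
```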
The easier way: ImportCSV
All of this complexity (RFC 4180 compliance, quoting rules, newline handling, BOM detection) can be handled automatically.
```tsx
import { CSVImporter } from '@importcsv/react';
function App() {
return (
<CSVImporter
onComplete={(data) => {
// Multiline fields are already parsed correctly
console.log(data);
}}
/>
);
}
```

ImportCSV handles multiline fields, mixed line endings, Excel BOM issues, and malformed quoting automatically. Users see a preview of their data with multiline content properly formatted before completing the import.
Wrap-up
CSV imports shouldn't slow you down. ImportCSV aims to fit into your workflow, whether you're building data import flows, handling customer uploads, or processing large datasets.
If that sounds like the kind of tooling you want to use, try ImportCSV.