February 2, 2026

BigQuery Upload CSV: Console, CLI & API Methods

Learn 4 ways to upload CSV files to BigQuery: Console GUI, bq CLI, Python API, and Node.js. Includes code examples, schema options, and quotas.

14 min read

How to Upload CSV Files to BigQuery

Loading CSV data into BigQuery is one of those tasks you'll do constantly as a data engineer. BigQuery gives you four different ways to do it, which is great for flexibility but can make it hard to know where to start.

I'll walk through the Console GUI, the bq command-line tool, and both Python and Node.js APIs. Each has its place. The Console is fastest for one-off uploads. The CLI is my go-to for anything scripted. The APIs give you full control when you're building pipelines.

Quick Method Comparison

Here's the cheat sheet:

| Method | Best For | Max File Size | Complexity |
|---|---|---|---|
| Console (GUI) | Quick uploads, small local files | 100 MB (local), unlimited (GCS) | Easy |
| bq CLI | Scripting, automation, larger files | 5 TB per load job | Medium |
| Python API | Custom pipelines, programmatic control | 5 TB per load job | Medium |
| Node.js API | JavaScript/TypeScript pipelines | 5 TB per load job | Medium |
| Cloud Storage Staging | Large files, parallel loading | 5 TB per load job | Medium |

If your file is under 100 MB and you're just exploring, use the Console. For anything larger or automated, grab the CLI or an API. All methods support files up to 5 TB when loading from Cloud Storage (only local Console uploads have that 100 MB cap).

Method 1: Console Upload (GUI)

The Console is the quickest way to get a CSV into BigQuery. No code, no setup. Just click through a form. I use this when I'm exploring a dataset or need to test something fast.

Step-by-Step Instructions

  1. Navigate to the BigQuery Console in your browser
  2. In the Explorer pane, expand your project and select the target dataset
  3. Click Create table in the Dataset info section
  4. Configure the Source section:
    • Create table from: Select Upload for local files or Google Cloud Storage for GCS
    • Select your CSV file or enter the GCS URI (e.g., gs://mybucket/myfile.csv)
    • File format: CSV
  5. Configure the Destination section:
    • Select your Project and Dataset
    • Enter a Table name
    • Table type: Native table
  6. Configure the Schema section:
    • Enable Auto detect for automatic schema inference, OR
    • Manually define each column with name and type
  7. Expand Advanced options to configure:
    • Write preference: Write if empty, Append to table, or Overwrite table
    • Number of errors allowed: How many bad rows to tolerate (default: 0)
    • Header rows to skip: Usually 1 if your CSV has headers
    • Field delimiter: Comma (default), Tab, Pipe, or Custom
    • Allow quoted newlines: Enable for multi-line field values
    • Allow jagged rows: Enable to allow rows with missing trailing columns
  8. Click Create Table to start the load job

Console Upload Limitations

The Console caps local uploads at 100 MB. That sounds small, but it's enough for most exploratory work. If you need more, you have options:

  • Upload to Cloud Storage first, then load from there
  • Switch to the bq CLI or API
  • Split the file (annoying, but it works)

Don't use the Console for production workflows. Anything that needs to run more than once should be scripted.
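That "split the file" option is easy to script. Here's a minimal sketch in Python that breaks a CSV into chunks under a byte budget while repeating the header in each chunk, so every piece loads cleanly on its own. The sizing is approximate (rows are never split, and the header isn't counted toward the budget):

```python
import csv
import io

def split_csv(path, max_bytes=100 * 1024 * 1024):
    """Split a CSV into chunk files under max_bytes, repeating the header.

    Returns the list of chunk file paths. Sizes are approximate: rows
    are never split, so chunks land slightly under the budget.
    """
    chunks = []
    with open(path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        out, writer, size, part = None, None, 0, 0
        for row in reader:
            # Measure the serialized row so we know when to roll over.
            line = io.StringIO()
            csv.writer(line).writerow(row)
            row_bytes = len(line.getvalue().encode("utf-8"))
            if out is None or size + row_bytes > max_bytes:
                if out:
                    out.close()
                part += 1
                chunk_path = f"{path}.part{part:03d}.csv"
                out = open(chunk_path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)  # every chunk gets the header
                size = 0
                chunks.append(chunk_path)
            writer.writerow(row)
            size += row_bytes
        if out:
            out.close()
    return chunks
```

Each chunk can then go through the Console (or CLI) as a separate load with --skip_leading_rows=1 / "Header rows to skip: 1" still applying.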

Method 2: bq Command-Line Tool

The bq CLI comes with the Google Cloud SDK. I think it's the best option for most use cases. You get full control without writing actual code, and it's easy to wrap in shell scripts or CI pipelines.

Basic Syntax

The core command structure for loading CSV files is:

bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  PROJECT_ID:DATASET.TABLE \
  PATH_TO_SOURCE \
  SCHEMA

Load from Local File with Auto-Detection

When you just want to get data in and let BigQuery figure out the types:

bq load \
  --autodetect \
  --source_format=CSV \
  mydataset.mytable \
  ./mydata.csv

The --autodetect flag makes BigQuery sample your data and guess the column types. Works surprisingly well most of the time.
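To get a feel for what sampling-based inference does, here's a rough local approximation: scan up to 500 rows and keep the narrowest type that fits every sampled value. This is a sketch only; BigQuery's real detector also handles BOOLEAN, DATE, delimiters, and quoted newlines, and its exact rules differ.

```python
import csv
from datetime import datetime

def guess_column_types(path, sample_rows=500):
    """Roughly mimic schema auto-detection: sample rows and pick the
    narrowest type (INTEGER -> FLOAT -> TIMESTAMP -> STRING) that fits
    every sampled value."""
    def fits(value, kind):
        try:
            if kind == "INTEGER":
                int(value)
            elif kind == "FLOAT":
                float(value)
            elif kind == "TIMESTAMP":
                datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
            return True
        except ValueError:
            return False

    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Start each column with every candidate; eliminate as values fail.
        candidates = [["INTEGER", "FLOAT", "TIMESTAMP"] for _ in header]
        for i, row in enumerate(reader):
            if i >= sample_rows:
                break
            for col, value in enumerate(row):
                candidates[col] = [k for k in candidates[col] if fits(value, k)]
    # STRING is the fallback when nothing narrower survives.
    return {name: (kinds[0] if kinds else "STRING")
            for name, kinds in zip(header, candidates)}
```

Running this before a load is a cheap way to predict whether auto-detect will give you the types you expect, or whether you should go explicit.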

Load from Cloud Storage with Explicit Schema

When you need to control exactly what type each column gets (production pipelines, usually):

bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  mydataset.mytable \
  gs://mybucket/mydata.csv \
  name:STRING,email:STRING,created_at:TIMESTAMP

Load Using a Schema File

If you have more than a few columns, putting the schema in a JSON file is easier to maintain:

bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  mydataset.mytable \
  gs://mybucket/mydata.csv \
  ./schema.json

Your schema.json file should contain an array of field definitions:

[
  {"name": "name", "type": "STRING", "mode": "REQUIRED"},
  {"name": "email", "type": "STRING", "mode": "NULLABLE"},
  {"name": "age", "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"}
]

Loading Multiple Files with Wildcards

To load all CSV files matching a pattern:

bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  mydataset.mytable \
  'gs://mybucket/data/*.csv' \
  schema.json

One gotcha: BigQuery doesn't support multiple asterisks in paths. So gs://bucket/*/data/*.csv won't work. Bit annoying if you have a complex folder structure.
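A quick local sanity check catches the bad pattern before you burn a load job on it. This sketch only validates the single-wildcard rule described above; BigQuery itself is the final authority:

```python
def validate_gcs_uri(uri):
    """Check a gs:// URI against BigQuery's wildcard rule: at most one
    '*', and only in the object path, never the bucket name."""
    if not uri.startswith("gs://"):
        return False, "URI must start with gs://"
    bucket, _, object_path = uri[len("gs://"):].partition("/")
    if "*" in bucket:
        return False, "wildcard not allowed in the bucket name"
    if object_path.count("*") > 1:
        return False, "only one wildcard is allowed per URI"
    return True, "ok"
```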

Common bq load Flags Reference

| Flag | Description | Default |
|---|---|---|
| --source_format=CSV | Specify CSV format | Required |
| --autodetect | Auto-detect schema from data | false |
| --skip_leading_rows=N | Skip N header rows | 0 |
| --field_delimiter=',' | Column delimiter character | comma |
| --allow_quoted_newlines | Allow newlines inside quoted fields | false |
| --allow_jagged_rows | Allow rows with missing trailing columns | false |
| --max_bad_records=N | Max errors before job fails | 0 |
| --replace | Overwrite existing table | false |
| --noreplace | Append to existing table | true |
| --encoding=UTF-8 | Character encoding | UTF-8 |

Method 3: Python API

The Python client library is what you want when you're building actual data pipelines. You get error handling, retries, and the ability to integrate with whatever else your pipeline does.

Installation

Install the BigQuery client library:

pip install google-cloud-bigquery

Load from Local CSV File

Uploading a local file with auto-detection:

from google.cloud import bigquery

client = bigquery.Client()

table_id = "your-project.your_dataset.your_table"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)

with open("mydata.csv", "rb") as source_file:
    job = client.load_table_from_file(source_file, table_id, job_config=job_config)

job.result()  # Wait for job to complete

table = client.get_table(table_id)
print(f"Loaded {table.num_rows} rows to {table_id}")

Load from Cloud Storage with Explicit Schema

For production, define your schema explicitly. Auto-detection will eventually bite you when some file has slightly different formatting:

from google.cloud import bigquery

client = bigquery.Client()

table_id = "your-project.your_dataset.your_table"

job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("name", "STRING"),
        bigquery.SchemaField("email", "STRING"),
        bigquery.SchemaField("created_at", "TIMESTAMP"),
    ],
    skip_leading_rows=1,
    source_format=bigquery.SourceFormat.CSV,
)

uri = "gs://mybucket/mydata.csv"

load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # Wait for completion

destination_table = client.get_table(table_id)
print(f"Loaded {destination_table.num_rows} rows.")

Controlling Write Behavior

The write_disposition setting determines what happens when the table already exists:

job_config = bigquery.LoadJobConfig(
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
)

Your options:

  • WRITE_TRUNCATE: Wipes the table and replaces it with your new data
  • WRITE_APPEND: Adds rows to whatever's already there (this is the default)
  • WRITE_EMPTY: Only writes if the table is empty, fails otherwise

Error Handling

Always wrap load operations in try-except. Loads fail for all sorts of reasons (schema mismatches, permission issues, malformed data):

from google.cloud import bigquery
from google.cloud.exceptions import GoogleCloudError

client = bigquery.Client()

try:
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    job.result()
    print(f"Loaded {client.get_table(table_id).num_rows} rows")
except GoogleCloudError as e:
    print(f"Load job failed: {e}")

Method 4: Node.js API

If you're in a JavaScript or TypeScript environment, the Node.js client works well. I've used it in Cloud Functions and Express apps where Python wasn't an option.

Installation

Install the BigQuery client library:

npm install @google-cloud/bigquery @google-cloud/storage

Load from Local CSV File

Local upload with auto-detection:

const { BigQuery } = require("@google-cloud/bigquery");

const bigquery = new BigQuery();

async function loadLocalCSV() {
  const datasetId = "my_dataset";
  const tableId = "my_table";
  const filename = "./mydata.csv";

  const metadata = {
    sourceFormat: "CSV",
    skipLeadingRows: 1,
    autodetect: true,
  };

  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .load(filename, metadata);

  console.log(`Job ${job.id} completed.`);

  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

loadLocalCSV();

Load from Cloud Storage with Explicit Schema

Loading from GCS with a defined schema:

const { BigQuery } = require("@google-cloud/bigquery");
const { Storage } = require("@google-cloud/storage");

const bigquery = new BigQuery();
const storage = new Storage();

async function loadCSVFromGCS() {
  const datasetId = "my_dataset";
  const tableId = "my_table";
  const bucketName = "my-bucket";
  const filename = "mydata.csv";

  const metadata = {
    sourceFormat: "CSV",
    skipLeadingRows: 1,
    schema: {
      fields: [
        { name: "name", type: "STRING" },
        { name: "email", type: "STRING" },
        { name: "created_at", type: "TIMESTAMP" },
      ],
    },
    location: "US",
  };

  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .load(storage.bucket(bucketName).file(filename), metadata);

  console.log(`Job ${job.id} completed.`);

  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

loadCSVFromGCS();

Controlling Write Behavior

Set the writeDisposition option to control how data is written:

const metadata = {
  sourceFormat: "CSV",
  skipLeadingRows: 1,
  writeDisposition: "WRITE_TRUNCATE", // Overwrite table
  // writeDisposition: "WRITE_APPEND",  // Append (default)
  // writeDisposition: "WRITE_EMPTY",   // Write only if empty
};

Async/Await Pattern with Error Handling

A fuller example with error handling (you'll want something like this in production):

const { BigQuery } = require("@google-cloud/bigquery");

const bigquery = new BigQuery();

async function loadCSVWithErrorHandling() {
  const datasetId = "my_dataset";
  const tableId = "my_table";
  const filename = "./mydata.csv";

  try {
    const [job] = await bigquery
      .dataset(datasetId)
      .table(tableId)
      .load(filename, {
        sourceFormat: "CSV",
        skipLeadingRows: 1,
        autodetect: true,
      });

    if (job.status.errors && job.status.errors.length > 0) {
      console.error("Errors encountered:", job.status.errors);
      return;
    }

    const [table] = await bigquery.dataset(datasetId).table(tableId).get();
    console.log(`Loaded ${table.metadata.numRows} rows`);
  } catch (error) {
    console.error("Load job failed:", error.message);
  }
}

loadCSVWithErrorHandling();

Cloud Storage Staging for Large Files

Once your files get bigger than 100 MB (or you want better performance), stage them in Cloud Storage first. It's an extra step, but worth it.

Why Bother with Cloud Storage?

A few reasons I reach for GCS staging:

  • Gets around the Console's 100 MB limit
  • BigQuery can read multiple GCS files in parallel, which speeds things up
  • Wildcard patterns let you load a bunch of files at once
  • If the load fails, your data is still sitting in the bucket for a retry

Staging Workflow

Pretty simple:

  1. Upload your CSV file(s) to a Cloud Storage bucket
  2. Load from the GCS URI into BigQuery using any method (Console, CLI, or API)

GCS URI Formats

BigQuery supports several URI patterns for loading from GCS:

| Pattern | Example | Description |
|---|---|---|
| Single file | gs://mybucket/data/myfile.csv | Load one specific file |
| Wildcard (all CSVs) | gs://mybucket/data/*.csv | Load all CSV files in a folder |
| Wildcard (prefix) | gs://mybucket/data/users*.csv | Load files matching a prefix |

Again, no multiple wildcards. gs://bucket/*/data/*.csv won't work.

Region Requirements

Your GCS bucket has to be in the same region as your BigQuery dataset. Different regions = failed load. This trips people up more often than you'd expect. Check your dataset location in the Console and make sure your bucket matches.

Schema: Auto-Detection vs Explicit

Auto-detection is convenient. Explicit schemas are reliable. You're always trading one against the other.

How Auto-Detection Works

BigQuery looks at the first 500 rows and guesses:

  • Column names (from the header row)
  • Data types for each column
  • The delimiter you're using
  • Whether there are quoted newlines

When Auto-Detection Works Well

Auto-detection does fine when your data is clean, you're using standard types (strings, numbers, ISO dates), and the first 500 rows look like the rest of the file.

When to Use Explicit Schema

Go explicit if:

  • Your columns are all strings (auto-detect can't distinguish the header row from data rows)
  • You want INTEGER for IDs instead of STRING
  • You have INTERVAL types (auto-detect doesn't support these)
  • You're running in production and can't afford surprises
  • Your files might have slightly different formatting

Schema Definition Formats

With the bq CLI, you can define schemas inline:

name:STRING,email:STRING,age:INTEGER,created_at:TIMESTAMP

Or use JSON for more columns and finer control:

[
  {"name": "name", "type": "STRING", "mode": "REQUIRED"},
  {"name": "email", "type": "STRING", "mode": "NULLABLE"},
  {"name": "age", "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"}
]
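The two formats express the same fields, so converting between them is mechanical. Here's a small helper that turns the inline style into the JSON field list a schema file expects. Note the inline format can't express modes, so every field gets a default (hypothetical helper, not part of the bq tool):

```python
import json

def inline_schema_to_json(inline, default_mode="NULLABLE"):
    """Convert a bq-style inline schema ("name:STRING,age:INTEGER")
    into the JSON field list a schema file expects."""
    fields = []
    for part in inline.split(","):
        name, _, field_type = part.strip().partition(":")
        # bq treats a field with no type as STRING.
        fields.append({"name": name,
                       "type": field_type or "STRING",
                       "mode": default_mode})
    return fields

# Write the result out as schema.json:
# with open("schema.json", "w") as f:
#     json.dump(inline_schema_to_json("name:STRING,age:INTEGER"), f, indent=2)
```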

Supported Data Types

BigQuery handles most common types in CSV:

  • Text: STRING, BYTES
  • Numeric: INTEGER, INT64, FLOAT, FLOAT64, NUMERIC, BIGNUMERIC
  • Boolean: BOOLEAN, BOOL
  • Date/Time: TIMESTAMP, DATE, TIME, DATETIME
  • Other: GEOGRAPHY, JSON

No ARRAY or STRUCT support in CSV, though. If you need nested data, you'll have to switch to JSON or Avro.
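If your nested values are already serialized as JSON strings inside CSV cells, the conversion is straightforward. A minimal sketch, assuming the columns you name hold valid JSON text (e.g. '["a","b"]'); everything else stays a string:

```python
import csv
import json

def csv_to_ndjson(csv_path, ndjson_path, nested_columns=()):
    """Convert a CSV to newline-delimited JSON so nested data can be
    loaded into BigQuery. Columns named in nested_columns are parsed
    from JSON text into real arrays/objects."""
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(ndjson_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            for col in nested_columns:
                if col in row:
                    row[col] = json.loads(row[col])
            dst.write(json.dumps(row) + "\n")
```

The resulting file loads with --source_format=NEWLINE_DELIMITED_JSON instead of CSV.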

Quotas and Limits

BigQuery has limits. You probably won't hit most of them, but a few can sneak up on you.

Load Job Limits

| Limit | Value |
|---|---|
| Load jobs per table per day | 1,500 (including failures) |
| Load jobs per project per day | 100,000 (including failures) |
| Maximum file size (uncompressed) | 5 TB |
| Maximum file size (gzip compressed) | 4 GB |
| Maximum row size | 100 MB |
| Maximum columns per table | 10,000 |
| Maximum cell size | 100 MB |

CSV-Specific Limitations

A few things to watch for:

  • No nested data: CSV just can't do it
  • BOMs cause problems: Strip byte order marks from your files
  • Compression trade-off: gzip saves space but BigQuery can't parallelize reads
  • Don't mix: All files in one load job need to be either compressed or not (no mixing)
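A couple of these are easy to catch locally before you submit a load job. This pre-flight sketch checks file size against the limits quoted above and flags a UTF-8 BOM; the limit constants mirror this post's table, so verify them against current quotas before relying on them:

```python
import os

GZIP_LIMIT = 4 * 1024 ** 3          # 4 GB for gzip-compressed CSV
UNCOMPRESSED_LIMIT = 5 * 1024 ** 4  # 5 TB uncompressed

def preflight_csv(path):
    """Return a list of problems found before submitting a load job
    (empty list means the file looks ready to load)."""
    problems = []
    size = os.path.getsize(path)
    is_gzip = path.endswith(".gz")
    limit = GZIP_LIMIT if is_gzip else UNCOMPRESSED_LIMIT
    if size > limit:
        problems.append(f"{path}: {size} bytes exceeds the {limit} byte limit")
    if not is_gzip:
        with open(path, "rb") as f:
            head = f.read(3)
        if head == b"\xef\xbb\xbf":
            problems.append(f"{path}: starts with a UTF-8 BOM; strip it first")
    return problems
```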

Date and Time Format Requirements

BigQuery expects specific formats for temporal data:

  • DATE: Must use YYYY-MM-DD format (e.g., 2026-02-15)
  • TIMESTAMP: Use YYYY-MM-DD HH:MM:SS or YYYY/MM/DD HH:MM:SS
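Scanning a column for format violations before loading is cheap insurance against parse errors. A quick local filter against the formats above; note that BigQuery also accepts fractional seconds and UTC offsets on timestamps, which this sketch ignores:

```python
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d",)
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S")

def is_valid_bq_value(value, kind):
    """Check a string against the DATE/TIMESTAMP formats above."""
    formats = DATE_FORMATS if kind == "DATE" else TIMESTAMP_FORMATS
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False
```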

Encoding Support

BigQuery handles UTF-8 (recommended), ISO-8859-1, UTF-16BE/LE, and UTF-32BE/LE. Stick with UTF-8 unless you have a specific reason not to.

Pricing Note

Good news: batch loading is free. You only pay for storage once the data lands.

Troubleshooting Common Errors

You'll hit these at some point. Here's what they mean and how to fix them.

"Input CSV files are not splittable"

This means your gzip file is over 4 GB, or you're using --allow_quoted_newlines on a big file.

Fix it by splitting into smaller chunks, keeping gzip files under 4 GB, or dropping --allow_quoted_newlines if you don't actually need it.

"Error detected while parsing row"

Almost always a quoting issue. Look for unescaped quotes inside fields, mismatched quote pairs, or a delimiter that doesn't match what you specified.

Schema Mismatch Errors

Your data doesn't match the schema. Check that column order is right, date formats follow BigQuery's requirements, and types align. While debugging, --max_bad_records=N lets you see how many rows are failing.

Permission Errors

You need the bigquery.dataEditor role on the dataset. For GCS loads, you also need storage.objectViewer on the bucket.

Alternative: ImportCSV for User-Facing Uploads

Everything above assumes you're the one uploading data. But what if you need end users to upload CSVs through your app?

That's a different problem. You need validation with friendly error messages, column mapping when headers don't match, data cleaning, progress indicators. Building all of that yourself takes a while.

ImportCSV is an embeddable component that handles the user-facing parts. Your users get a polished upload experience, you get clean data ready for BigQuery.

Summary

Four ways to get CSV data into BigQuery:

  • Console: Quick and easy, but limited to 100 MB for local files
  • bq CLI: My recommendation for anything scripted
  • Python/Node.js APIs: When you need programmatic control

For production, use explicit schemas and load from Cloud Storage. Auto-detection is fine for exploring data but will eventually cause problems if you rely on it.

Watch out for the 1,500 load jobs per table per day limit. And make sure your date formats match what BigQuery expects.

Wrap-up

CSV imports shouldn't slow you down. ImportCSV is built to fit into your workflow — whether you're building data import flows, handling customer uploads, or processing large datasets.

If that sounds like the kind of tooling you want to use, try ImportCSV.