Snowflake CSV Import: COPY INTO, Stages & Automation (2026 Guide)
Load CSV files into Snowflake using COPY INTO, internal/external stages, SnowSQL, Snowpipe, and Python. Includes syntax, error handling, and performance tips.

How to Import CSV Files into Snowflake
To import a CSV into Snowflake, use the COPY INTO command after staging your file with PUT (for local files) or by pointing to cloud storage. For quick one-off uploads under 250 MB, the Snowsight web UI works without any SQL at all.
Snowflake gives you five ways to load CSV data. Which one you pick depends on file size, frequency, and how much automation you need. Here's working syntax for all five that you can copy straight into your workflow.
Quick Method Comparison
Here is the decision matrix:
| Method | Best For | Max File Size | Complexity | Automation |
|---|---|---|---|---|
| Snowsight UI | Quick uploads, small files | 250 MB per file, 250 files max | Easy | None |
| PUT + COPY INTO | Scripted batch loads from local files | Unlimited (chunked) | Medium | Scriptable |
| External Stage + COPY INTO | Cloud storage integration (S3/GCS/Azure) | Unlimited | Medium | Scriptable |
| Snowpipe | Continuous automated loading | Recommended 100-250 MB per file | Higher | Fully automated |
| Python Connector (write_pandas) | DataFrame workflows | Limited by memory | Medium | Scriptable |
File under 250 MB? Use Snowsight. Anything recurring, scripted, or large? COPY INTO. Files landing continuously and you need them loaded within minutes? That's Snowpipe.
Prerequisites
Every method below assumes you already have a target table. Set that up first:
-- Create database and schema
CREATE DATABASE IF NOT EXISTS my_database;
CREATE SCHEMA IF NOT EXISTS my_database.my_schema;
USE DATABASE my_database;
USE SCHEMA my_schema;
-- Create target table
CREATE TABLE IF NOT EXISTS customers (
id INTEGER,
name VARCHAR(100),
email VARCHAR(255),
signup_date DATE,
balance DECIMAL(10,2)
);

You also need the right privileges on your role: USAGE on the database and schema, INSERT on the target table, and CREATE STAGE or CREATE FILE FORMAT if you plan to create those objects.
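Before staging anything, it can be worth sanity-checking the CSV locally against the target schema. Below is a minimal sketch using only the Python standard library; the column list and types mirror the `customers` table above, and `validate_csv` is a hypothetical helper name, not part of any Snowflake API:

```python
import csv
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Mirrors the customers table defined above; adjust for your own schema.
EXPECTED_COLUMNS = ["id", "name", "email", "signup_date", "balance"]

def validate_csv(path, max_errors=10):
    """Check the header and row-level types against the customers table."""
    errors = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        if [h.strip().lower() for h in header] != EXPECTED_COLUMNS:
            errors.append(f"header mismatch: {header}")
        for lineno, row in enumerate(reader, start=2):
            if len(row) != len(EXPECTED_COLUMNS):
                errors.append(
                    f"line {lineno}: expected {len(EXPECTED_COLUMNS)} columns, got {len(row)}"
                )
            else:
                try:
                    int(row[0])                                  # id INTEGER
                    datetime.strptime(row[3], "%Y-%m-%d")        # signup_date DATE
                    Decimal(row[4])                              # balance DECIMAL
                except (ValueError, InvalidOperation) as exc:
                    errors.append(f"line {lineno}: {exc}")
            if len(errors) >= max_errors:
                break
    return errors
```

Running this before a load surfaces type problems on your laptop instead of in COPY INTO error output.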
Method 1: Snowsight Web Interface
The Snowsight UI is the fastest path from CSV to table. No CLI, no SQL, no staging commands. Just drag your file in and go.
Best for: Quick one-off uploads, non-technical users, small datasets under 250 MB.
Creating a New Table from CSV
- Sign in to Snowsight
- Select Create > Table > From File
- Select or create your target database and schema
- Drag-and-drop your CSV file (or click Browse)
- Enter a name for the new table, click Next
- Review the auto-detected schema (Snowflake uses the INFER_SCHEMA function under the hood)
- Adjust column names and data types if needed
- Click Load
That's it. Seriously. Snowflake infers the schema from your CSV headers and sample data, creates the table, and loads the rows. The schema detection is surprisingly good, though I've seen it guess wrong on ambiguous date formats.
Loading into an Existing Table
- Navigate to Ingestion > Add Data > Load data into a Table
- Select your files (drag-and-drop, browse, or add from an existing stage)
- Select the target database, schema, and table
- Click Next to review the schema mapping
- Choose a file format (or create a new named format)
- Configure loading options:
- Loading method: Append (default) or Replace
- Match by column names: case insensitive by default
- ON_ERROR behavior (more on this below)
- Click Load
Snowsight Limitations
The UI caps you at 250 files and 250 MB per file. Fine for ad-hoc analysis or prototyping. Not for production. The real problem: you can't automate it. The moment you need to load data more than a handful of times, move to COPY INTO.
Method 2: PUT + COPY INTO (The Core Method)
This is the workhorse. Upload your file to an internal stage with PUT, then load it into your table with COPY INTO. Two steps. Every other method is just a variation on this pattern.
Step 1: Create a File Format
You can inline format options directly in your COPY INTO statement, but honestly, creating a named file format is worth the 30 seconds. It keeps your load commands clean and you won't accidentally use different settings across loads.
CREATE OR REPLACE FILE FORMAT my_csv_format
TYPE = 'CSV'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 1
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
TRIM_SPACE = TRUE
ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE
NULL_IF = ('NULL', 'null', '');

Here are the format options you will reach for most often:
| Option | Default | Description |
|---|---|---|
| FIELD_DELIMITER | , | Column separator (comma, tab \t, pipe \|) |
| SKIP_HEADER | 0 | Number of header rows to skip |
| FIELD_OPTIONALLY_ENCLOSED_BY | NONE | Quote character (" or ') |
| TRIM_SPACE | FALSE | Trim leading/trailing whitespace |
| ERROR_ON_COLUMN_COUNT_MISMATCH | TRUE | Error if CSV columns != table columns |
| NULL_IF | \\N | Strings to interpret as NULL |
| COMPRESSION | AUTO | AUTO, GZIP, BZ2, BROTLI, ZSTD, NONE |
| ENCODING | UTF8 | Character encoding |
| ESCAPE | NONE | Escape character for enclosed fields |
| ESCAPE_UNENCLOSED_FIELD | \\ | Escape character for unenclosed fields |
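If you control the CSV producer, it is easier to emit files that already match the format than to tune options afterward. A small sketch with Python's standard csv module that produces output compatible with the my_csv_format definition above (comma delimiter, one header row, quoting only where needed, the literal string NULL for missing values); the helper name and sample rows are illustrative:

```python
import csv

# Sample rows; a None value becomes the string 'NULL', which my_csv_format
# maps back to SQL NULL via its NULL_IF list.
rows = [
    {"id": 1, "name": "Acme, Inc.", "email": "ops@acme.test",
     "signup_date": "2024-01-05", "balance": "10.50"},
    {"id": 2, "name": "Bob", "email": None,
     "signup_date": "2024-02-11", "balance": "0.00"},
]

def write_compatible_csv(path, rows):
    """Emit a file matching my_csv_format: comma-delimited, one header row,
    fields quoted only when they contain the delimiter or quote character."""
    fieldnames = ["id", "name", "email", "signup_date", "balance"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, quoting=csv.QUOTE_MINIMAL)
        writer.writeheader()
        for row in rows:
            writer.writerow({k: ("NULL" if row[k] is None else row[k])
                             for k in fieldnames})
```

QUOTE_MINIMAL pairs naturally with FIELD_OPTIONALLY_ENCLOSED_BY = '"': only fields like "Acme, Inc." get quoted.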
Step 2: Upload Files with PUT
The PUT command uploads local files to an internal stage. You have three stage types to choose from:
| Stage Type | Reference | Scope | Use Case |
|---|---|---|---|
| User stage | @~ | Per-user, private | Single user staging files for multiple tables |
| Table stage | @%table_name | Per-table | Files destined for a single specific table |
| Named stage | @stage_name | Shared across users/tables | Most common for teams; flexible and reusable |
-- Upload to user stage
PUT file:///tmp/data/customers.csv @~;
-- Upload to table stage
PUT file:///tmp/data/customers.csv @%customers;
-- Upload to named internal stage
PUT file:///tmp/data/customers.csv @my_stage;
-- Upload multiple files with wildcard, parallel threads
PUT file:///tmp/data/customers_*.csv @my_stage
PARALLEL = 8
AUTO_COMPRESS = TRUE;

A few things to know about PUT (some of these will bite you):
- It only works with internal stages, not external ones.
- It cannot run from Snowsight worksheets. You'll get an error with zero explanation. Use SnowSQL, the Snowflake CLI, or a driver (Python, JDBC, etc.) instead.
- AUTO_COMPRESS is TRUE by default, so files get gzipped automatically.
- PARALLEL defaults to 4 upload threads. You can go up to 99.
- Aim for files in the 100-250 MB compressed range for best throughput.
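Since the 100-250 MB target applies to the compressed size, you can estimate how many chunks to split a raw CSV into by compressing a sample and extrapolating the ratio. A rough sketch (the helper name and 200 MB midpoint target are my own choices, not Snowflake settings):

```python
import gzip
import os

TARGET_CHUNK_BYTES = 200 * 1024 * 1024  # midpoint of the 100-250 MB guidance

def estimate_chunks(path, sample_bytes=1 << 20):
    """Estimate how many files to split a CSV into so each lands near
    ~200 MB after gzip, using a 1 MB sample to estimate the ratio."""
    raw_size = os.path.getsize(path)
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    if not sample:
        return 1
    ratio = len(gzip.compress(sample)) / len(sample)
    est_compressed = raw_size * ratio
    return max(1, round(est_compressed / TARGET_CHUNK_BYTES))
```

The estimate is rough (compression ratios drift across a file), but it beats guessing blind before a large PUT.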
Step 3: Load with COPY INTO
Once your files are staged, COPY INTO does the actual loading:
-- Basic load from a named stage
COPY INTO customers
FROM @my_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
ON_ERROR = 'CONTINUE';

You can also transform data during the load. Use positional column references ($1, $2, etc.) in a subquery:
-- Transform columns during load
COPY INTO customers (id, name, email, signup_date, balance)
FROM (
SELECT $1, $2, $3, TO_DATE($4, 'YYYY-MM-DD'), $5
FROM @my_stage/customers.csv.gz
)
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');

Or use inline format options if you prefer not to create a named format:
-- COPY INTO with inline format and file pattern
COPY INTO customers
FROM @~
FILE_FORMAT = (
TYPE = 'CSV'
FIELD_DELIMITER = ','
SKIP_HEADER = 1
)
PATTERN = '.*customers.*[.]csv'
ON_ERROR = 'SKIP_FILE';

To load specific files by name rather than a pattern:
COPY INTO customers
FROM @my_stage
FILES = ('customers_001.csv.gz', 'customers_002.csv.gz')
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');

Method 3: External Stage + COPY INTO
If your CSV files already live in cloud storage, skip PUT entirely. Create an external stage pointing to your S3 bucket, GCS path, or Azure Blob container, then run COPY INTO directly against it. Much cleaner.
Supported: Amazon S3, Google Cloud Storage, Microsoft Azure Blob Storage.
Create an External Stage
-- External stage with direct credentials (fine for testing)
CREATE OR REPLACE STAGE my_s3_stage
URL = 's3://my-bucket/csv-data/'
CREDENTIALS = (AWS_KEY_ID = '...' AWS_SECRET_KEY = '...')
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');
-- Storage integration (recommended for production)
CREATE OR REPLACE STAGE my_s3_stage
URL = 's3://my-bucket/csv-data/'
STORAGE_INTEGRATION = my_s3_integration
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');

Use a storage integration for production. Embedding credentials in stage definitions is a bad idea -- your Snowflake admin should manage access through IAM roles instead.
Load from External Stage
-- Verify files are visible
LIST @my_s3_stage;
-- Load all CSV files
COPY INTO customers
FROM @my_s3_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
ON_ERROR = 'CONTINUE';
-- Load from a specific path with pattern
COPY INTO customers
FROM @my_s3_stage/2024/01/
PATTERN = '.*[.]csv'
ON_ERROR = 'SKIP_FILE_5%';

Organize your files by path (e.g., s3://bucket/region/date/). This lets you run targeted COPY INTO commands against specific partitions instead of scanning everything in the stage. Trust me, it matters once you're loading hundreds of files. Scanning an entire unorganized stage is painfully slow.
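The PATTERN option is a regular expression matched against each staged file's path, so you can preview locally which files a given pattern would select before running the COPY. A small sketch using Python's re module against a list of paths (the helper name and sample paths are illustrative):

```python
import re

def matching_files(stage_paths, pattern):
    """Preview which staged file paths a COPY INTO ... PATTERN would select.
    Snowflake matches the regex against the full path, hence fullmatch."""
    return [p for p in stage_paths if re.fullmatch(pattern, p)]
```

For example, matching_files(paths, r".*customers.*[.]csv[.]gz") keeps only the gzipped customer files and skips everything else in the listing.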
Method 4: Snowpipe for Continuous Loading
Snowpipe automates the whole thing. Files land in a stage, a cloud event notification triggers Snowpipe, data appears in your table within minutes. No cron jobs. No manual triggers. No warehouse to babysit.
How Snowpipe Works
- CSV files land in a stage (internal or external)
- A cloud event notification (S3 SQS, GCS Pub/Sub, Azure Event Grid) or REST API call triggers Snowpipe
- Snowpipe loads the data using serverless compute managed by Snowflake
- No user-managed warehouse needed
Create a Pipe
CREATE OR REPLACE PIPE my_csv_pipe
AUTO_INGEST = TRUE
AS
COPY INTO customers
FROM @my_s3_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
ON_ERROR = 'SKIP_FILE';

Snowpipe vs Bulk Loading
| Aspect | Bulk (COPY INTO) | Snowpipe |
|---|---|---|
| Compute | User warehouse | Serverless (Snowflake-managed) |
| Billing | Warehouse runtime | Per-second compute + queue overhead |
| Latency | Manual trigger | Minutes after file arrival |
| Load history | 64 days in table metadata | 14 days in pipe metadata |
| Best for | Large batch loads | Continuous micro-batches |
On cost: Snowpipe charges include overhead for file queue management, and it adds up faster than you'd expect. Stage files at 100-250 MB and avoid triggering loads more than once per minute. If you're doing large nightly batch loads, a dedicated warehouse with COPY INTO is almost always cheaper.
Method 5: Python Connector with write_pandas()
If you're already in Python -- cleaning data with pandas, running transformations, whatever -- the Snowflake Python connector lets you push DataFrames directly into Snowflake tables. No SQL needed.
Install
pip install "snowflake-connector-python[pandas]"

Load CSV via write_pandas
import pandas as pd
from snowflake.connector import connect
from snowflake.connector.pandas_tools import write_pandas
# Read CSV into DataFrame
df = pd.read_csv('customers.csv')
# Connect to Snowflake
conn = connect(
account='myorg-myaccount',
user='my_user',
password='my_password',
database='my_database',
schema='my_schema',
warehouse='my_warehouse'
)
# Write DataFrame to Snowflake table
success, nchunks, nrows, _ = write_pandas(
conn,
df,
'CUSTOMERS',
database='MY_DATABASE',
schema='MY_SCHEMA'
)
print(f"Loaded {nrows} rows in {nchunks} chunks")
conn.close()

Under the hood, write_pandas does something clever: it converts your DataFrame to Parquet, uploads to a temp stage via PUT, runs COPY INTO, then cleans up after itself. You get bulk loading speed without writing any SQL. The downside? Error messages when something goes wrong are not great. I've spent more time than I'd like debugging opaque write_pandas failures that turned out to be simple type mismatches.
Key parameters:
- chunk_size -- rows per chunk (default: all at once)
- compression -- "gzip" (default, better compression) or "snappy" (faster)
- on_error -- same options as COPY INTO (default: "ABORT_STATEMENT")
- parallel -- upload threads (default: 4)
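For large DataFrames, the chunk_size parameter is the one worth tuning: it caps how many rows are serialized per Parquet chunk. One way to pick it is from a per-row memory estimate, as in this sketch (pick_chunk_size is a hypothetical helper and the 256 MB budget is an assumption, not a connector default):

```python
def pick_chunk_size(n_rows, approx_row_bytes, memory_budget_bytes=256 * 1024 * 1024):
    """Pick a write_pandas chunk_size so each chunk stays under a memory budget.
    approx_row_bytes can come from df.memory_usage(deep=True).sum() / len(df)."""
    rows_per_chunk = max(1, memory_budget_bytes // max(1, approx_row_bytes))
    return min(n_rows, rows_per_chunk)

# Usage sketch (requires a live connection):
# write_pandas(conn, df, "CUSTOMERS",
#              chunk_size=pick_chunk_size(len(df), 512),
#              on_error="CONTINUE")
```

Small frames load in a single chunk either way; the cap only matters once row counts climb into the millions.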
Error Handling: ON_ERROR Options
Getting ON_ERROR right saves you hours of debugging down the road. Here are all the options:
| Option | Behavior | When to Use |
|---|---|---|
| ABORT_STATEMENT (default) | Stops the entire load on the first error | Production loads where data integrity is non-negotiable |
| CONTINUE | Skips bad rows, loads everything else | Exploratory data or best-effort loads |
| SKIP_FILE | Skips the entire file if any row errors | Multi-file loads where each file should be all-or-nothing |
| SKIP_FILE_n | Skips the file when error count reaches n | Fixed error tolerance (e.g., SKIP_FILE_5) |
| SKIP_FILE_n% | Skips the file when error percentage reaches n% | Percentage-based tolerance (e.g., SKIP_FILE_5%) |
My recommendation: ABORT_STATEMENT for production, CONTINUE when you're poking at a new dataset. Loading many files in one COPY INTO? Use SKIP_FILE so one bad file doesn't kill the entire batch.
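If your loads are scripted, you can encode that recommendation once instead of re-deciding per load. A sketch that builds the COPY INTO statement with an ON_ERROR policy per environment; the environment-to-policy mapping is a team convention of mine, not a Snowflake rule, and copy_statement is a hypothetical helper:

```python
def copy_statement(table, stage, file_format, environment="prod"):
    """Build a COPY INTO statement with an ON_ERROR policy per environment."""
    on_error = {
        "prod": "ABORT_STATEMENT",  # fail fast, protect data integrity
        "dev": "CONTINUE",          # best-effort while exploring a dataset
        "batch": "SKIP_FILE",       # one bad file shouldn't kill the batch
    }[environment]
    return (
        f"COPY INTO {table} FROM @{stage} "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}') "
        f"ON_ERROR = '{on_error}'"
    )
```

Pass the resulting string to your connector's cursor.execute(); the point is that the policy lives in one place.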
Validate Before Loading
Always dry-run first. Catch problems before they become real:
-- Dry-run: returns errors without loading anything
COPY INTO customers
FROM @my_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
VALIDATION_MODE = 'RETURN_ERRORS';

Check Load History
After a load completes, inspect what went wrong:
-- View rejected rows from the last COPY INTO
SELECT * FROM TABLE(VALIDATE(customers, JOB_ID => '_last'));

Handling Different Delimiters
Not every CSV actually uses commas. (The name is a lie.) Snowflake handles all common delimiters through the FIELD_DELIMITER option:
-- Tab-separated values (TSV)
CREATE OR REPLACE FILE FORMAT my_tsv_format
TYPE = 'CSV'
FIELD_DELIMITER = '\t'
SKIP_HEADER = 1;
-- Pipe-delimited
CREATE OR REPLACE FILE FORMAT my_pipe_format
TYPE = 'CSV'
FIELD_DELIMITER = '|'
SKIP_HEADER = 1;
-- Semicolon-delimited (common in European locales)
CREATE OR REPLACE FILE FORMAT my_semicolon_format
TYPE = 'CSV'
FIELD_DELIMITER = ';'
SKIP_HEADER = 1;

Snowflake also auto-detects and decompresses GZIP (.gz), BZ2 (.bz2), Zstandard (.zst), and Deflate files. For Brotli (.br), you need to set SOURCE_COMPRESSION = BROTLI explicitly.
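When you are not sure which delimiter a file uses, Python's csv.Sniffer can usually guess it from a sample, which tells you which FIELD_DELIMITER to put in the file format. A minimal sketch (detect_delimiter is my helper name; the candidate set covers the four formats above):

```python
import csv

def detect_delimiter(path, candidates=",\t|;"):
    """Guess a file's delimiter with csv.Sniffer so you can pick the
    matching FIELD_DELIMITER before defining a Snowflake file format."""
    with open(path, newline="", encoding="utf-8") as f:
        sample = f.read(64 * 1024)  # a 64 KB sample is plenty for sniffing
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
```

Sniffer can be fooled by delimiter characters inside quoted fields, so treat the result as a suggestion and spot-check a few lines.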
Performance Tips for Large Files
Loading a 50 MB CSV and loading a 50 GB CSV are completely different problems. Here's what actually matters at scale:
- Split large files into 100-250 MB compressed chunks. Snowflake loads files in parallel across warehouse nodes. One giant file means one node does all the work. The rest sit idle. That's expensive waste.
- Use compression. PUT gzips by default. If you're staging externally, compress before upload. The reduced transfer time more than pays for the CPU.
- Scale the warehouse up, then back down. Spin up an X-LARGE for a big batch, then suspend immediately after. You only pay for the seconds it runs.
- Organize files by path. Partition your staged files (by date, region, etc.) so you can target subsets with COPY INTO instead of scanning everything.
- Use PATTERN or FILES. Load only what you need. Don't scan the whole stage.
- For large uncompressed CSVs over 128 MB: Ensure your files are RFC 4180-compliant (no embedded newlines within fields) and set COMPRESSION = NONE so Snowflake can split and scan a single file across multiple nodes in parallel. This only works when fields do not contain newline characters -- if your data has multi-line text fields, compress the file and rely on splitting into multiple files instead.
- Avoid single files over 100 GB. You'll likely hit the 24-hour query timeout. Split them first.
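The embedded-newline caveat can be checked mechanically: if the csv module parses fewer logical rows than the file has physical lines, some quoted field spans multiple lines. A small sketch with the standard library (has_embedded_newlines is my helper name):

```python
import csv

def has_embedded_newlines(path):
    """True if any quoted field spans multiple physical lines, which would
    prevent Snowflake from splitting an uncompressed file across nodes."""
    with open(path, newline="", encoding="utf-8") as f:
        physical_lines = sum(1 for _ in f)
    with open(path, newline="", encoding="utf-8") as f:
        logical_rows = sum(1 for _ in csv.reader(f))
    return logical_rows < physical_lines
```

If this returns True, gzip the file and split by row count instead of relying on single-file parallel scanning.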
Here's a quick way to split and compress on Linux or macOS:
# Split into ~100MB chunks by line count
split -l 500000 large_data.csv chunk_
# Compress each chunk
gzip chunk_*

Common Errors and Fixes
| Error | Cause | Fix |
|---|---|---|
| Number of columns in file does not match | CSV columns do not match table columns | Set ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE or fix the CSV |
| Field delimiter not found | Wrong delimiter specified | Check the actual delimiter, update your FILE_FORMAT |
| Invalid date format | Date string does not match expected format | Use DATE_FORMAT option or transform with TO_DATE() |
| File not found | Wrong stage path or file not uploaded | Run LIST @stage to verify files exist |
| Insufficient privileges | Missing INSERT or USAGE grant | Grant required privileges to the role |
| PUT not supported in worksheets | PUT cannot run in Snowsight SQL worksheets | Use SnowSQL CLI or the Python connector |
| Duplicate files skipped | Same filename was already loaded | Use FORCE = TRUE to reload, or rename files |
The duplicate files issue trips up everyone at least once. Snowflake tracks which files have been loaded and silently skips them on subsequent COPY INTO runs. Zero rows loaded, zero errors. Super confusing. This is actually a useful feature for idempotent pipelines, but during development when you're iterating on a file? Maddening. Set FORCE = TRUE to override it.
Pricing Considerations
- Bulk loading (COPY INTO): Billed by warehouse runtime. Credits accrue per second while the warehouse is active.
- Snowpipe: Per-second serverless compute plus file queue overhead. Usually cheaper for small, frequent loads -- but not always, so do the math for your workload.
- Data storage: Standard Snowflake storage rates apply to all loaded data.
- Data transfer: Cross-region or cross-cloud loads may hit you with data transfer charges. These are easy to overlook.
Tip: Use an X-SMALL warehouse for small files. Scale up for large batch loads, then suspend immediately after.
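A back-of-the-envelope cost check helps when choosing a warehouse size for a batch load. This sketch uses Snowflake's standard credits-per-hour ladder for warehouse sizes, but the per-credit price is entirely hypothetical -- check your contract -- and it ignores the 60-second minimum billing window for simplicity:

```python
# Standard warehouse credit rates (credits per hour of runtime).
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def load_cost_usd(warehouse_size, seconds, usd_per_credit=3.0):
    """Rough warehouse cost for a bulk load billed per second of runtime.
    usd_per_credit is a placeholder; substitute your contracted rate."""
    credits = CREDITS_PER_HOUR[warehouse_size] * seconds / 3600
    return credits * usd_per_credit
```

The arithmetic makes the scale-up-then-suspend advice concrete: an XLARGE that finishes in 15 minutes can cost the same as an XSMALL grinding for four hours.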
Skip the SQL: Use ImportCSV Instead
All the methods above assume you are the one loading data. But what if your end users need to upload CSVs through a web interface? That's a different problem entirely.
ImportCSV gives you an embeddable CSV import widget that handles validation, column mapping, and error correction before data hits your database. No file format config, no staging, no delimiter debugging. Your users get a clean upload experience. You get clean data. It's the fastest way to add CSV import to your app without building it yourself.
Wrap-up
CSV imports shouldn't slow you down. ImportCSV is built to fit into your workflow -- whether you're building data import flows, handling customer uploads, or processing large datasets.
If that sounds like the kind of tooling you want to use, try ImportCSV.