# How to Use Pandas to_sql to Write a DataFrame to a Database: A Complete Guide

> Learn to use pandas to_sql to write DataFrames to databases. Our guide covers automatic schema inference and insert strategies for efficient data management.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-13

---

**Use `DataFrame.to_sql(name, con, if_exists='fail', index=True)` to write a pandas DataFrame to a SQL database through a SQLAlchemy engine or connection (or a standard-library `sqlite3` connection), with automatic schema inference and configurable insert strategies.**

The `to_sql` method in pandas provides a high-level interface for persisting DataFrame data to relational databases. According to the pandas-dev/pandas source code, this functionality is implemented in [`pandas/io/sql.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/io/sql.py) and exposed as a DataFrame method through the shared `NDFrame` base class in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py). It supports backends including SQLite, PostgreSQL, MySQL, and Oracle through SQLAlchemy, and SQLite additionally through a raw `sqlite3` connection.

## How DataFrame.to_sql Works Under the Hood

Understanding the internal architecture helps optimize database writes and troubleshoot connection issues.

### Connection Handling and the SQLDatabase Wrapper

When you pass a connection object to `to_sql`, pandas wraps it in a backend-specific class defined in [`pandas/io/sql.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/io/sql.py). SQLAlchemy engines and connections are wrapped in `SQLDatabase`, which relies on SQLAlchemy for dialect-specific SQL generation and connection pooling. For raw DB-API connections, pandas officially supports only the standard-library `sqlite3` module, which it handles through a separate built-in SQLite code path that does not require SQLAlchemy. For any other database, pass a SQLAlchemy engine or connection.
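As a minimal sketch of the connection handling described here, the following writes to an in-memory SQLite database through a plain `sqlite3` connection and reads the rows back. The table name `people` and the sample data are hypothetical; a SQLAlchemy engine would be passed as `con` in the same way.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

# A plain sqlite3 connection is accepted directly; SQLAlchemy is not needed.
con = sqlite3.connect(":memory:")
df.to_sql("people", con, index=False)

# Read the rows back through the same connection to confirm the write.
round_trip = pd.read_sql_query("SELECT name, age FROM people ORDER BY age", con)
print(round_trip)
con.close()
```

An in-memory database keeps the sketch free of side effects; swap in a file path or an engine URL for real use.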

### Schema Inference and Dtype Mapping

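The method automatically infers SQL column types from pandas dtypes using the table-building classes in [`pandas/io/sql.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/io/sql.py). With SQLAlchemy, the standard mappings are roughly `int64 → BIGINT`, `float64 → double-precision FLOAT`, and `object → TEXT`; the exact type names emitted depend on the database dialect. You can override these defaults with the `dtype` parameter. With a SQLAlchemy connection, the values must be SQLAlchemy types such as `sqlalchemy.types.DECIMAL(10, 2)` or `sqlalchemy.types.VARCHAR(255)`; plain strings such as `"DECIMAL(10,2)"` are accepted only with a `sqlite3` connection.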
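To see inference and an override side by side, this sketch writes the same frame twice to an in-memory SQLite database and asks SQLite for the declared column types. It assumes a `sqlite3` connection, where pandas uses SQLite type names (`INTEGER`, `REAL`, `TEXT`) and accepts string overrides; with a SQLAlchemy engine the override would need to be a SQLAlchemy type object instead. The table names and the `column_types` helper are hypothetical.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({
    "product_id": [1, 2, 3],         # integer column
    "description": ["A", "B", "C"],  # string column
    "price": [9.99, 19.99, 29.99],   # float column
})

con = sqlite3.connect(":memory:")

# First table: every column type is inferred.
df.to_sql("catalog_inferred", con, index=False)
# Second table: override one column (string form is valid for sqlite3 only).
df.to_sql("catalog_typed", con, index=False, dtype={"price": "DECIMAL(10,2)"})


def column_types(table_name):
    """Return {column name: declared SQL type} as reported by SQLite."""
    rows = con.execute(f"PRAGMA table_info({table_name})").fetchall()
    return {row[1]: row[2] for row in rows}


inferred = column_types("catalog_inferred")
typed = column_types("catalog_typed")
print(inferred)
print(typed)
con.close()
```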

### Insert Strategies with if_exists

The `if_exists` parameter controls table creation behavior:

- **`if_exists='fail'`** (default): Raises a `ValueError` if the target table already exists.
- **`if_exists='replace'`**: Drops the existing table and creates a new schema based on the DataFrame structure.
- **`if_exists='append'`**: Inserts rows into the existing table without modifying the schema.
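The three modes can be compared in a short sketch against an in-memory SQLite database. The table name `demo` and the `row_count` helper are hypothetical.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
first = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
second = pd.DataFrame({"id": [3], "value": ["c"]})


def row_count():
    """Count the rows currently stored in the demo table."""
    return con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]


# The table does not exist yet, so the default mode creates it.
first.to_sql("demo", con, index=False)
count_after_create = row_count()

# Default if_exists="fail": writing to an existing table raises ValueError.
try:
    second.to_sql("demo", con, index=False)
    failed = False
except ValueError:
    failed = True

# "append" keeps the existing rows and adds the new one.
second.to_sql("demo", con, index=False, if_exists="append")
count_after_append = row_count()

# "replace" drops the table, so only the newly written row remains.
second.to_sql("demo", con, index=False, if_exists="replace")
count_after_replace = row_count()

print(failed, count_after_create, count_after_append, count_after_replace)
con.close()
```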

### Bulk Loading and Performance Optimization

For large datasets, `to_sql` supports batched inserts via the `chunksize` parameter, which writes the rows in batches of that size rather than all at once, limiting how much data is prepared and sent per round trip. The `method` parameter selects the insertion strategy: `None` (the default) uses a single-row `INSERT` statement, executed for each batch through the driver's `executemany` where available; `'multi'` packs multiple rows into one `INSERT` statement; and a callable with the signature `(pd_table, conn, keys, data_iter)` lets you implement driver-specific loading yourself.
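A small sketch of the two built-in tuning knobs, assuming an in-memory SQLite database and hypothetical table names. With `method="multi"`, each batch is sent as one statement, so the batch size times the number of columns has to stay below the database's limit on bound parameters; the chunk sizes here are illustrative, not recommendations.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
df_big = pd.DataFrame({"col1": range(1000), "col2": ["x"] * 1000})

# Default method: rows are written in batches of 250.
df_big.to_sql("batched", con, index=False, chunksize=250)

# method="multi": each batch of 100 rows becomes one multi-row INSERT
# (100 rows * 2 columns = 200 bound parameters per statement).
df_big.to_sql("multi_row", con, index=False, chunksize=100, method="multi")

batched_rows = con.execute("SELECT COUNT(*) FROM batched").fetchone()[0]
multi_rows = con.execute("SELECT COUNT(*) FROM multi_row").fetchone()[0]
print(batched_rows, multi_rows)
con.close()
```

Which setting is faster depends on the driver and database, so it is worth timing both on representative data.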

### Index and Transaction Management

By default, pandas writes the DataFrame index as a column, named after the index if it has a name and `index` otherwise. Set `index=False` to omit it, or use `index_label` to specify a custom column name. When using SQLAlchemy engines, the write runs inside a transaction that commits on success and rolls back on failure.
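The index options can be checked with a quick sketch that writes the same frame four ways and lists the resulting columns. Note that an index with its own name is written under that name rather than `index`. The table names and the `columns_of` helper are hypothetical.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
df = pd.DataFrame({"name": ["Alice", "Bob"]})


def columns_of(table_name):
    """Return the column names of a table, in order."""
    return list(pd.read_sql_query(f"SELECT * FROM {table_name}", con).columns)


df.to_sql("unnamed_index", con)                      # unnamed index -> "index"
df.rename_axis("emp_id").to_sql("named_index", con)  # index name is reused
df.to_sql("labelled", con, index_label="row_id")     # explicit column name
df.to_sql("no_index", con, index=False)              # index omitted

tables = ["unnamed_index", "named_index", "labelled", "no_index"]
cols = {t: columns_of(t) for t in tables}
print(cols)
con.close()
```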

## Practical Code Examples

### Writing to SQLite with SQLAlchemy Engine

Connect to an SQLite database and write a new table, replacing any existing data:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///example.db")
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age":  [25, 30, 35],
    "salary": [70000.0, 80000.0, 90000.0]
})

df.to_sql(name="employees", con=engine, if_exists="replace", index=False)

```

### Appending Data to PostgreSQL Tables

Add new rows to an existing PostgreSQL table without dropping the current schema:

```python
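import pandas as pd
from sqlalchemy import create_engine

# Placeholder credentials; requires the psycopg2 driver to be installed.
engine = create_engine(
    "postgresql+psycopg2://user:password@localhost:5432/mydb"
)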

new_rows = pd.DataFrame({
    "name": ["David", "Eva"],
    "age":  [28, 32],
    "salary": [75000.0, 82000.0]
})

new_rows.to_sql(name="employees", con=engine, if_exists="append", index=False)

```

### Custom SQL Type Mapping for MySQL

Force specific SQL column types when creating tables in MySQL:

```python
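import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import DECIMAL

# Placeholder credentials; requires the PyMySQL driver to be installed.
engine = create_engine("mysql+pymysql://user:pwd@localhost/test")
df = pd.DataFrame({
    "product_id": [1, 2, 3],
    "description": ["A", "B", "C"],
    "price": [9.99, 19.99, 29.99]
})

# With a SQLAlchemy engine, dtype values must be SQLAlchemy types;
# a plain string such as "DECIMAL(10,2)" raises a ValueError.
dtype_map = {"price": DECIMAL(10, 2)}
df.to_sql(
    name="catalog",
    con=engine,
    if_exists="replace",
    index=False,
    dtype=dtype_map
)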
```

### Optimized Bulk Inserts with Custom Methods

Implement a custom insertion method for maximum control over the bulk loading process:

```python
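import pandas as pd
from sqlalchemy import create_engine


def bulk_insert(table, conn, keys, data_iter):
    """Insert one chunk with the DB-API cursor's executemany.

    pandas calls this once per chunk with its SQLTable wrapper, a SQLAlchemy
    connection, the column names, and an iterator of row tuples.
    """
    cols = ", ".join(f'"{k}"' for k in keys)
    placeholders = ", ".join(["?"] * len(keys))  # "?" is sqlite3's paramstyle
    sql = f'INSERT INTO "{table.name}" ({cols}) VALUES ({placeholders})'
    cursor = conn.connection.cursor()  # raw DB-API cursor under SQLAlchemy
    try:
        cursor.executemany(sql, list(data_iter))
        return cursor.rowcount
    finally:
        cursor.close()


engine = create_engine("sqlite:///bulk.db")
df_big = pd.DataFrame(
    {"col1": range(10000), "col2": ["x"] * 10000}
)

df_big.to_sql(
    name="big_table",
    con=engine,
    if_exists="replace",
    index=False,
    method=bulk_insert,
    chunksize=2000
)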
```

## Summary

- **[`pandas/io/sql.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/io/sql.py)** contains the core implementation of `to_sql`, including the `SQLDatabase` wrapper and dtype conversion logic.
- **[`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py)** defines `to_sql` on the `NDFrame` base class, which makes it a public DataFrame method.
- **Connection flexibility**: Works with SQLAlchemy engines and connections, and with raw `sqlite3` connections.
- **Schema control**: Automatic type mapping can be overridden with the `dtype` parameter.
- **Data safety**: Use `if_exists='append'` to add data without destroying existing tables, or `'replace'` for fresh schema creation.
- **Performance tuning**: Leverage `chunksize` for memory management and custom `method` callables for driver-specific optimizations like `executemany`.

## Frequently Asked Questions

### What database backends are supported by pandas to_sql?

The method supports any database that has a SQLAlchemy dialect, including SQLite, PostgreSQL, MySQL, Microsoft SQL Server, Oracle, and cloud variants like Amazon Redshift or Google BigQuery through third-party dialect packages. Without SQLAlchemy, only SQLite is supported, through a standard-library `sqlite3` connection.

### How do I prevent pandas from writing the DataFrame index to SQL?

Set `index=False` in the `to_sql` call. If you want to keep the index but rename the column, use `index_label='custom_name'`; otherwise the column takes the index's own name, or `index` when the index is unnamed.

### What is the difference between if_exists='replace' and 'append'?

The `'replace'` option drops the existing table entirely and recreates it based on the current DataFrame schema, which destroys existing data and constraints. The `'append'` option inserts rows into the existing table structure without modifying the schema, preserving existing data and indexes.
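The effect on constraints can be illustrated with a sketch that starts from a hand-written SQLite table with a primary key, then writes to it with each mode. The `items` table and the `primary_key_columns` helper are hypothetical.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
# Hand-written table with a primary key constraint.
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
df = pd.DataFrame({"id": [1, 2], "label": ["a", "b"]})


def primary_key_columns():
    """Return the names of the columns SQLite reports as primary key."""
    rows = con.execute("PRAGMA table_info(items)").fetchall()
    return [row[1] for row in rows if row[5]]  # row[5] is the pk flag


# "append" inserts into the existing definition, so the key survives.
df.to_sql("items", con, index=False, if_exists="append")
pk_after_append = primary_key_columns()

# "replace" drops the table and recreates it from the DataFrame alone,
# so the hand-written primary key is gone.
df.to_sql("items", con, index=False, if_exists="replace")
pk_after_replace = primary_key_columns()

print(pk_after_append, pk_after_replace)
con.close()
```

If the constraints matter, one option is to create the table yourself and always write with `if_exists='append'`.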

### How can I improve performance when writing large DataFrames to SQL?

Use the `chunksize` parameter to process data in batches (e.g., `chunksize=10000`), and consider passing a custom `method` callable that utilizes your driver's `executemany` capability. For massive datasets, database-specific bulk loading tools like PostgreSQL's `COPY` or MySQL's `LOAD DATA INFILE` may outperform `to_sql`.