# How to Filter a Pandas DataFrame Using IN and NOT IN Like SQL WHERE

> Filter pandas DataFrames like SQL WHERE using isin and ~ for IN and NOT IN conditions. Learn efficient data selection techniques in pandas.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-14

---

**Use the `isin()` method to test for membership and the bitwise NOT operator `~` for negation, enabling SQL-style `WHERE col IN (...)` and `WHERE col NOT IN (...)` filtering in pandas.**

Filtering rows by whether a column's values belong to a given set is one of the most common SQL operations. In pandas (the `pandas-dev/pandas` repository), this is done with the `isin()` method, which builds a vectorized Boolean mask that stands in for SQL `IN` and, when negated, `NOT IN` in a `WHERE` clause.

## Understanding the `isin()` Method for SQL-Style Filtering

The pandas library implements SQL-equivalent `IN` operators through two primary entry points: `Series.isin()` for single-column checks and `DataFrame.isin()` for multi-column or element-wise comparisons.

### Series.isin() for Column-Wise Membership Testing

When you need to check if values in a single column exist within a specified set, `Series.isin()` returns a Boolean Series that serves as a filter mask. According to the `pandas-dev/pandas` source code, this method is implemented in **[`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py)** at line 6114.

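The method accepts various list-like collections (lists, sets, dictionaries, or even another pandas Series) as the `values` argument, making it flexible for different data workflows. A bare string is not list-like, so `isin("value")` raises `TypeError`; wrap a single value in a list instead.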
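As a minimal sketch of the accepted inputs (the `colors` Series and its values are made up for illustration), the following passes a list, a set, and another Series to `Series.isin()`, and checks that a bare string is rejected:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
colors = pd.Series(["red", "green", "blue", "green"])

# A list, a set, and another Series all produce the same Boolean mask.
from_list = colors.isin(["green", "blue"])
from_set = colors.isin({"green", "blue"})
from_series = colors.isin(pd.Series(["green", "blue"]))

# A bare string is not list-like, so pandas raises TypeError.
try:
    colors.isin("green")
    string_rejected = False
except TypeError:
    string_rejected = True

print(from_list.tolist())  # [False, True, True, True]
print(string_rejected)     # True
```

When the lookup values come from another Series, only its values are used; its index plays no role in the comparison.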

### DataFrame.isin() for Multi-Column Filtering

For checking membership across multiple columns simultaneously, `DataFrame.isin()` creates a Boolean DataFrame of the same shape, where each cell indicates whether that specific element exists in the provided values. This method is defined in **[`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py)** at line 18326.

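Unlike the Series version, `DataFrame.isin()` tests every cell independently, and the meaning of `values` depends on its type. A list checks every cell against the same values. A dict checks each column against its own list, with keys naming the columns. A Series or DataFrame is aligned first: a cell matches only when the index labels (and, for a DataFrame, the column labels) also match, so this form does not compare whole rows regardless of position.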
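The three forms of `values` can be sketched as follows. The `frame` and `other` objects are hypothetical; `other` holds the same rows as `frame` in reverse order, to show the label alignment that applies when a DataFrame is passed:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
frame = pd.DataFrame({"a": [1, 2, 3], "b": [1, 5, 6]})

# List: every cell is checked against the same values.
list_mask = frame.isin([1, 2])

# Dict: each column is checked against its own list of values.
dict_mask = frame.isin({"a": [1, 2], "b": [6]})

# DataFrame: a cell matches only where index AND column labels line up.
other = pd.DataFrame({"a": [3, 2, 1], "b": [6, 5, 1]})
frame_mask = frame.isin(other)

print(list_mask)
print(dict_mask)
print(frame_mask)
```

Only the middle row of `frame_mask` is `True`, because it is the only position where `frame` and `other` hold the same values under the same index label.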

### The Core Algorithm Behind the Scenes

Both `Series.isin()` and `DataFrame.isin()` delegate to the low-level, vectorized algorithm located in **[`pandas/core/algorithms.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/algorithms.py)** at line 493. This `isin(comps, values)` function performs the actual membership testing on underlying NumPy arrays or ExtensionArrays, ensuring consistent performance across data types including nullable integers, strings, and categoricals.

## How to Implement NOT IN in Pandas

SQL's `NOT IN` operator is expressed in pandas through logical negation of the Boolean mask generated by `isin()`. The bitwise NOT operator `~` inverts the True/False values, effectively converting an "in" check to a "not in" check.

```python
# SQL equivalent: WHERE column NOT IN ('value1', 'value2')
mask = ~df['column'].isin(['value1', 'value2'])
filtered_df = df[mask]
```

This pattern works identically for both Series and DataFrame objects, maintaining consistency across the pandas API.

## Practical Examples: SQL WHERE IN and NOT IN in Pandas

### Filtering Rows with IN Condition

To replicate `SELECT * FROM table WHERE city IN ('Paris', 'Berlin')`, use `Series.isin()` to generate a filter mask:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["New York", "Paris", "Tokyo", "Berlin"],
    "population": [8_400_000, 2_200_000, 9_300_000, 3_600_000]
})

# SQL: WHERE city IN ('Paris', 'Berlin')
mask = df["city"].isin(["Paris", "Berlin"])
result = df[mask]
print(result)
```

**Output:**

```
     city  population
1   Paris     2200000
3  Berlin     3600000
```

### Excluding Rows with NOT IN Condition

To exclude specific values using SQL's `NOT IN` logic, apply the `~` operator to invert the Boolean mask:

```python
# SQL: WHERE city NOT IN ('Tokyo')
mask = ~df["city"].isin(["Tokyo"])
result = df[mask]
print(result)
```

**Output:**

```
       city  population
0  New York     8400000
1     Paris     2200000
3    Berlin     3600000
```

### Combining Multiple Conditions

Complex SQL queries with multiple `IN` conditions and logical operators translate directly to pandas using `&` (AND) and `|` (OR):

```python
# SQL: WHERE city IN ('Paris', 'Berlin') AND population > 2500000
mask = df["city"].isin(["Paris", "Berlin"]) & (df["population"] > 2_500_000)
result = df[mask]
print(result)
```

**Output:**

```
     city  population
3  Berlin     3600000
```
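The `|` operator and a negated `isin()` combine in the same way. The following sketch rebuilds the same `df` so that it runs on its own; the thresholds are arbitrary values chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["New York", "Paris", "Tokyo", "Berlin"],
    "population": [8_400_000, 2_200_000, 9_300_000, 3_600_000],
})

# SQL: WHERE city IN ('Paris', 'Berlin') OR population > 9000000
either = df[df["city"].isin(["Paris", "Berlin"]) | (df["population"] > 9_000_000)]

# SQL: WHERE city NOT IN ('Paris', 'Berlin') AND population < 9000000
neither = df[~df["city"].isin(["Paris", "Berlin"]) & (df["population"] < 9_000_000)]

print(either["city"].tolist())   # ['Paris', 'Tokyo', 'Berlin']
print(neither["city"].tolist())  # ['New York']
```

The parentheses around each comparison are required: in Python, `&` and `|` bind more tightly than `>` and `<`, so an unparenthesized comparison would be grouped incorrectly.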

### Using a DataFrame as the Lookup Table

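Passing a DataFrame to `DataFrame.isin()` does not compare whole rows: it aligns on index and column labels, so a row matches only if the same values sit under the same index label in both frames. To filter on exact row matches against another DataFrame, similar to SQL's `WHERE (col1, col2) IN (SELECT ...)`, compare the rows as tuples, for example through `MultiIndex.isin()`:

```python
allowed = pd.DataFrame({
    "city": ["New York", "Tokyo"],
    "population": [8_400_000, 9_300_000]
})

# Keep rows that appear exactly in `allowed` (both columns must match).
# df.isin(allowed) would align on index labels and miss Tokyo, so compare
# (city, population) pairs instead.
keys = pd.MultiIndex.from_frame(df[["city", "population"]])
mask = keys.isin(pd.MultiIndex.from_frame(allowed))
result = df[mask]
print(result)
```

**Output:**

```
       city  population
0  New York     8400000
2     Tokyo     9300000
```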
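The negated, multi-column case (rows whose `(city, population)` pair does not appear in `allowed`) can also be sketched as an anti-join. This is one possible approach rather than the only one; it relies on the `indicator=True` option of `DataFrame.merge()`, which adds a `_merge` column, and it rebuilds both frames so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["New York", "Paris", "Tokyo", "Berlin"],
    "population": [8_400_000, 2_200_000, 9_300_000, 3_600_000],
})
allowed = pd.DataFrame({
    "city": ["New York", "Tokyo"],
    "population": [8_400_000, 9_300_000],
})

# A left merge with indicator=True adds a "_merge" column that records
# whether each row of df found a partner in allowed.
merged = df.merge(allowed, on=["city", "population"], how="left", indicator=True)

# "left_only" rows have no match in allowed: the multi-column NOT IN.
not_allowed = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

print(not_allowed["city"].tolist())  # ['Paris', 'Berlin']
```

Note that `merge()` returns a new DataFrame with a fresh integer index, so the original index labels of `df` are preserved only when `df` already uses the default index, as it does here.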

## Summary

- **Use `Series.isin()`** (implemented in [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py)) to test membership in a single column, returning a Boolean mask for filtering.
- **Use `DataFrame.isin()`** (implemented in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py)) to perform element-wise membership testing across multiple columns.
- **Apply the `~` operator** to invert `isin()` results, achieving SQL-style `NOT IN` functionality.
- **Rely on the shared core algorithm** in [`pandas/core/algorithms.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/algorithms.py), to which both methods delegate, for vectorized membership testing across pandas data types.
- **Combine masks** using `&` (AND) and `|` (OR) to replicate complex SQL `WHERE` clauses with multiple `IN` conditions.

## Frequently Asked Questions

### How do I filter a pandas DataFrame using a list of values like SQL IN?

Use the `isin()` method on a Series to create a Boolean mask, then pass that mask to the DataFrame indexer. For example: `df[df['column'].isin(['value1', 'value2'])]`. This pattern, implemented in [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py), is the direct equivalent of SQL's `WHERE column IN (...)`.

### What is the equivalent of SQL NOT IN in pandas?

The equivalent of SQL `NOT IN` is the logical negation of the `isin()` mask using the bitwise NOT operator `~`. The syntax is `df[~df['column'].isin(values)]`, which inverts the Boolean mask to exclude matching rows rather than include them.

### Can I use isin() with multiple columns in pandas?

Yes, `DataFrame.isin()` operates element-wise across all columns, returning a Boolean DataFrame of the same shape. To filter rows where all columns match values in a reference set, use `df[df.isin(values).all(axis=1)]`. For column-specific logic, combine individual `Series.isin()` calls with the `&` operator.

### How does pandas isin() handle null values compared to SQL?

In SQL, `NULL IN (NULL)` evaluates to unknown, so the row is filtered out. pandas behaves differently: `isin()` treats a missing value as matchable, so `s.isin([np.nan])` returns `True` for the `NaN` entries of `s`. When no missing value is in the lookup list, missing entries yield `False`. As a result, `~s.isin(values)` keeps rows with missing values, whereas SQL `NOT IN` drops rows where the column is `NULL`. To control missing values explicitly, combine the `isin()` mask with `isna()` or `notna()` using `|` or `&`.
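A short sketch of how missing values behave with `isin()` on a float Series (the Series `s` is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# NaN in the lookup values matches NaN in the Series.
nan_match = s.isin([np.nan])
print(nan_match.tolist())   # [False, True, False]

# Without NaN in the lookup values, the missing entry yields False...
plain_in = s.isin([1.0])
print(plain_in.tolist())    # [True, False, False]

# ...so plain negation keeps it, unlike SQL NOT IN on a NULL column value.
plain_not_in = ~s.isin([1.0])
print(plain_not_in.tolist())  # [False, True, True]

# To drop missing values as SQL NOT IN would, require notna() as well.
sql_like_not_in = ~s.isin([1.0]) & s.notna()
print(sql_like_not_in.tolist())  # [False, False, True]
```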