How to Keep Only the Date in pandas to_datetime: 4 Vectorized Methods
To extract only the date portion after parsing with pd.to_datetime(), use the .dt accessor methods such as .normalize() or .floor('D') to retain the datetime64[ns] dtype, or .date to obtain native Python date objects.
In pandas (the pandas-dev/pandas repository), converting string columns with pd.to_datetime() creates Timestamp values that always carry a time component, even when that component is midnight. If you need to keep only the date part from pandas to_datetime results, the library provides several Series-level approaches that spare you from writing Python loops by hand.
Method 1: Reset Time to Midnight with dt.normalize()
The .dt.normalize() method, exposed through the DatetimeProperties class in pandas/core/indexes/accessors.py, sets each timestamp's time component to 00:00:00 while preserving the datetime64[ns] dtype. This approach is the best fit when the result must remain a datetime for subsequent timezone conversions or arithmetic operations.
import pandas as pd
df = pd.DataFrame({"raw": ["2024-03-15 14:23:05", "2024-03-16 09:45:00"]})
df["parsed"] = pd.to_datetime(df["raw"])
# Normalize to midnight, keeping datetime64[ns]
df["date_only"] = df["parsed"].dt.normalize()
print(df["date_only"])
# 0   2024-03-15
# 1   2024-03-16
# Name: date_only, dtype: datetime64[ns]
normalize() rounds each value down to the start of its day and leaves the resolution of the underlying array unchanged. For timezone-aware data, midnight is taken in the Series' own timezone, and the timezone is preserved.
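A minimal sketch of how normalize() treats timezone-aware data. The sample timestamps and the choice of America/New_York are illustrative assumptions, not part of the original example; the point is that "midnight" depends on the timezone the Series is in when you call it.

```python
import pandas as pd

# Illustrative timestamps, localized to UTC and then converted to New York time
s = pd.to_datetime(pd.Series(["2024-03-15 23:30:00", "2024-03-16 01:15:00"]))
s_utc = s.dt.tz_localize("UTC")
s_ny = s_utc.dt.tz_convert("America/New_York")

# Midnight is computed in each Series' own timezone
utc_days = s_utc.dt.normalize()
ny_days = s_ny.dt.normalize()

print(utc_days.dt.strftime("%Y-%m-%d").tolist())  # ['2024-03-15', '2024-03-16']
print(ny_days.dt.strftime("%Y-%m-%d").tolist())   # ['2024-03-15', '2024-03-15']
```

If the calendar date matters for reporting, convert to the intended timezone before normalizing.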
Method 2: Floor to Daily Frequency with dt.floor('D')
For workflows that already employ frequency-based rounding, use .dt.floor('D') to floor each timestamp to the beginning of the day. This method, also reached through the .dt accessor, accepts any fixed frequency string, so the same call can later floor to hourly or minute boundaries. Non-fixed frequencies such as weekly ('W') are rejected with a ValueError.
# Equivalent to normalize() but uses frequency syntax
df["date_floored"] = df["parsed"].dt.floor("D")
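As a quick check of the equivalence claimed above, the following sketch compares the two methods on timezone-naive sample data and then reuses floor() with a different fixed frequency. The "15min" frequency is just an illustrative choice.

```python
import pandas as pd

s = pd.to_datetime(pd.Series(["2024-03-15 14:23:05", "2024-03-16 09:45:00"]))

by_floor = s.dt.floor("D")
by_normalize = s.dt.normalize()
print(by_floor.equals(by_normalize))  # True

# Same method, different fixed frequency: floor to 15-minute boundaries
quarter_hour = s.dt.floor("15min")
print(quarter_hour.dt.strftime("%H:%M:%S").tolist())  # ['14:15:00', '09:45:00']
```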
Method 3: Extract Python date Objects with dt.date
When you require native Python datetime.date objects, for example to compare against datetime.date values or to pass to standard-library code that expects them, access the .dt.date property. This converts the Series dtype from datetime64[ns] to object, which increases memory overhead but ensures compatibility with libraries expecting pure date instances.
df["py_date"] = df["parsed"].dt.date
print(df["py_date"].dtype)
# object
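A short sketch of what the object-dtype result contains and how to get back to a datetime dtype. It assumes the same kind of sample data as above; once the values are plain date objects, the .dt accessor is no longer available until you convert again.

```python
import datetime
import pandas as pd

s = pd.to_datetime(pd.Series(["2024-03-15 14:23:05", "2024-03-16 09:45:00"]))
py_dates = s.dt.date

first = py_dates.iloc[0]
print(type(first))                          # <class 'datetime.date'>
print(first == datetime.date(2024, 3, 15))  # True

# Convert back to a datetime dtype before using .dt again
restored = pd.to_datetime(py_dates)
print(restored.dt.day.tolist())             # [15, 16]
```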
Method 4: Format as Strings with dt.strftime()
If your pipeline requires string representations for display, logging, or CSV export without time components, use .dt.strftime('%Y-%m-%d'). This returns an object-dtype Series containing formatted strings.
df["date_str"] = df["parsed"].dt.strftime("%Y-%m-%d")
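For illustration, here is what the formatted output looks like with two format strings. The second format (%m/%d/%Y) is an assumed example of a display-oriented layout; note that only the ISO layout sorts chronologically when compared as text.

```python
import pandas as pd

s = pd.to_datetime(pd.Series(["2024-03-15 14:23:05", "2024-03-16 09:45:00"]))

iso_text = s.dt.strftime("%Y-%m-%d")
print(iso_text.tolist())  # ['2024-03-15', '2024-03-16']

# A display-oriented layout; the values are plain strings, not dates
us_text = s.dt.strftime("%m/%d/%Y")
print(us_text.tolist())   # ['03/15/2024', '03/16/2024']
```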
Source Code Architecture
The initial parsing logic for pandas.to_datetime() resides in pandas/core/tools/datetimes.py, which handles string parsing, timezone localization, and the construction of DatetimeIndex or Series objects. Once converted, the .dt accessor resolves to the DatetimeProperties class in pandas/core/indexes/accessors.py, which delegates each call to the underlying datetime array, so normalize() and floor() operate on the NumPy datetime64 data rather than looping in Python.
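One way to confirm this layering for yourself is to inspect the accessor and its backing array at runtime. This is only a sketch; the class and module names it prints are what recent pandas releases should report for a datetime64 Series, and they may differ for other dtypes (for example Arrow-backed columns).

```python
import pandas as pd

s = pd.to_datetime(pd.Series(["2024-03-15 14:23:05"]))

accessor = s.dt
# Where the accessor class lives and what it is called
print(type(accessor).__module__, type(accessor).__name__)

# The extension array that actually holds the datetime64 values
print(type(s.array).__name__)
```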
Summary
.dt.normalize(): Zeros the time to midnight while keeping the datetime64[ns] dtype; best for maintaining datetime functionality.
.dt.floor('D'): Floors to the start of the day using frequency syntax; ideal when consistency with other rounding operations matters.
.dt.date: Returns Python date objects; suitable when downstream code expects date instances, but converts the Series to object dtype.
.dt.strftime('%Y-%m-%d'): Generates string representations; use when the output must be text-based.
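All four methods operate on the entire Series in a single call, with no explicit Python loop in your code. normalize() and floor('D') work directly on the underlying datetime64 data, while .dt.date and .dt.strftime() must build one Python object per element and are therefore slower and more memory-hungry on large Series.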
Frequently Asked Questions
Does pd.to_datetime() have a parameter to parse dates without time?
No, the to_datetime function in pandas/core/tools/datetimes.py always parses complete Timestamp objects including time information. You must post-process the result using one of the .dt accessor methods to remove or isolate the time component.
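The sketch below illustrates the point with assumed sample strings: even when the input contains only dates, the parsed values are full timestamps at midnight, and input with times needs a post-processing step to match.

```python
import pandas as pd

# Date-only strings still become timestamps, with the time set to midnight
date_only = pd.to_datetime(pd.Series(["2024-03-15", "2024-03-16"]))
print(date_only.dt.hour.tolist())  # [0, 0]

# Strings with times need post-processing to drop the time component
with_time = pd.Series(["2024-03-15 14:23:05", "2024-03-16 09:45:00"])
dates = pd.to_datetime(with_time).dt.normalize()
print((dates == date_only).all())  # True
```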
What is the performance difference between normalize() and floor('D')?
Both methods are reached through the same .dt accessor and perform similar vectorized work on the underlying array, so performance is comparable for large datasets. Choose based on semantic clarity: normalize() explicitly indicates intent to zero out time, while floor('D') aligns with other frequency-based rounding operations in your codebase.
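If the difference matters for your workload, it is easy to measure. The following is a rough timing sketch on synthetic data (100,000 timestamps one minute apart, an arbitrary choice); absolute numbers depend on your machine and pandas version, so no expected timings are shown.

```python
import timeit
import pandas as pd

# Synthetic, timezone-naive data for the comparison
s = pd.Series(pd.date_range("2024-01-01", periods=100_000, freq="min"))

t_norm = timeit.timeit(lambda: s.dt.normalize(), number=20)
t_floor = timeit.timeit(lambda: s.dt.floor("D"), number=20)
print(f"normalize(): {t_norm:.4f}s   floor('D'): {t_floor:.4f}s")

# Both approaches should agree on the result
same = s.dt.normalize().equals(s.dt.floor("D"))
print(same)  # True
```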
Why does .dt.date change my Series to object dtype?
The date property returns individual Python datetime.date objects, which are not supported by NumPy's native datetime64 array structure. Consequently, pandas stores them in a generic object dtype Series, significantly increasing memory usage compared to the compact datetime64[ns] representation used by normalize() or floor().
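To see the memory effect on your own data, compare the two representations with memory_usage(deep=True). This is a sketch on synthetic data; the exact byte counts vary by platform and Python version, so only the comparison is meaningful.

```python
import pandas as pd

s = pd.Series(pd.date_range("2024-01-01", periods=10_000, freq="min"))

as_datetime64 = s.dt.normalize()  # stays a compact datetime64 column
as_objects = s.dt.date            # one Python date object per row

# deep=True also counts the Python objects behind an object-dtype column
dt_bytes = as_datetime64.memory_usage(index=False, deep=True)
obj_bytes = as_objects.memory_usage(index=False, deep=True)
print(f"datetime64: {dt_bytes} bytes, object: {obj_bytes} bytes")
```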
Can I keep only the date when parsing multiple columns simultaneously?
Yes, apply pd.to_datetime() to each column first, then use the .dt accessor on the resulting datetime columns. For DataFrames with multiple datetime columns, apply these operations column by column, for example with df.apply() and a function that calls the appropriate .dt method, so that each column is still processed as a whole rather than row by row.
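A minimal sketch of the column-wise approach. The column names created and updated are hypothetical; df.apply() calls the function once per column, so each call still handles an entire column at a time.

```python
import pandas as pd

# Hypothetical frame with two string columns holding timestamps
df = pd.DataFrame({
    "created": ["2024-03-15 14:23:05", "2024-03-16 09:45:00"],
    "updated": ["2024-03-17 08:00:00", "2024-03-18 23:59:59"],
})
cols = ["created", "updated"]

# Parse and normalize each column; the function receives one whole column per call
dates = df[cols].apply(lambda col: pd.to_datetime(col).dt.normalize())

print(dates["updated"].dt.strftime("%Y-%m-%d").tolist())  # ['2024-03-17', '2024-03-18']
```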