feat: Accept kwargs in `{DataFrame,Series}.to_pandas` #2879

dangotbanned · 2025-07-24T12:13:40Z

What type of PR is this? (check all applicable)

Related issues

Related issue enh: Passing arguments to pa.Table.to_pandas #2123
Originated from refactor: Simplify compliant Series.hist #2839 (comment)
Extra context (enh: Passing arguments to pa.Table.to_pandas #2123 (comment))

Description

Note

Changed scope a bit from the original PR, see (#2879 (review)) and the original description for details

Original description

While I was looking into (#2839 (comment)), I noticed that the default (and only) behavior of nw.DataFrame.to_pandas, when backed by either pa.Table or pl.DataFrame, isn't great for nested datatypes:

import pyarrow as pa

data = {"breakpoint": [2, 3], "count": [3, 2]}

>>> pa.table(data).to_struct_array().to_pandas()
0    {'breakpoint': 2, 'count': 3}
1    {'breakpoint': 3, 'count': 2}
dtype: object

We can do better than this, and currently this PR will instead give us something that works with pd.Series.struct:

import pyarrow as pa
import narwhals as nw

data = {"breakpoint": [2, 3], "count": [3, 2]}

>>> nw.from_native(pa.table(data).to_struct_array(), series_only=True).to_pandas()
0    {'breakpoint': 2, 'count': 3}
1    {'breakpoint': 3, 'count': 2}
Name: , dtype: struct<breakpoint: int64, count: int64>[pyarrow]

Questions

1. Would we want this for any other DTypes, e.g. List?
2. Would it be worthwhile to expose this as the to_pandas(use_pyarrow_extension_array=...) parameter from polars?
   i. Note that this defaults to False, which is equivalent to our current behavior and pyarrow's

Adds optional keyword-only arguments to both DataFrame.to_pandas and Series.to_pandas.

This aligns us with the equivalent methods in polars and pyarrow.

As mentioned in (#2123), this can reduce memory overhead if configured correctly.

But the main benefit I'm excited about is that we can preserve pyarrow data types (nulls, nested data), which were previously lost unconditionally; see (#2123 (comment)).

Example

import pyarrow as pa

import narwhals as nw

data = {"breakpoint": [2, 3], "count": [3, 2], "what": [None, 1]}
native = pa.table(data).to_struct_array()
series = nw.from_native(native, series_only=True)

Before

>>> series.to_pandas()
0    {'breakpoint': 2, 'count': 3, 'what': None}
1     {'breakpoint': 3, 'count': 2, 'what': 1.0}
Name: , dtype: object

After

>>> series.to_pandas(use_pyarrow_extension_array=True)
0    {'breakpoint': 2, 'count': 3, 'what': None}
1     {'breakpoint': 3, 'count': 2, 'what': 1.0}
Name: , dtype: struct<breakpoint: int64, count: int64, what: int64>[pyarrow]

Tasks


FBruzzesi

Thanks for spotting this @dangotbanned , it seems reasonable to me!
Yet if we do it, I would rather keep it consistent for all the dtypes that would only be supported with ArrowDtype. WDYT?


dangotbanned added 4 commits July 24, 2025 11:41

add to_pandas_types_mapper

2a880fc

feat: Preserve Struct dtype in pyarrow -> pandas

cb764d0

test: Add test_pyarrow_to_pandas_struct

0951477

test: Also test DataFrame

9e3d295

dangotbanned added the enhancement, pyarrow, and pandas-like labels Jul 24, 2025

dangotbanned added 3 commits July 24, 2025 12:16

ignore banned import

46dc0fe

https://results.pre-commit.ci/run/github/760058710/1753359223.9ca0gmbGR1a7G8L-0BQRug

Merge branch 'main' into arrow-to-pandas-dtypes

789967c

Merge branch 'main' into arrow-to-pandas-dtypes

ccdda2e

dangotbanned requested a review from FBruzzesi July 25, 2025 21:51

FBruzzesi reviewed Jul 25, 2025

narwhals/_arrow/utils.py (outdated)

dangotbanned mentioned this pull request Jul 25, 2025

enh: Passing arguments to pa.Table.to_pandas #2123

Open

dangotbanned added 3 commits July 27, 2025 17:09

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

33f892d

docs(typing): Add ToPandasArrowKwds

18e7415

- Lightly adapted from https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.to_pandas
- Most likely, will edit it down but there's too many options + ambiguous names to not have something

feat: Update compliant-level to_pandas signatures

9457086

dangotbanned changed the title ~~feat: Preserve Struct dtype in pyarrow -> pandas~~ feat: Accept kwargs in (DataFrame|Series).to_pandas Jul 27, 2025

dangotbanned linked an issue Jul 27, 2025 that may be closed by this pull request

enh: Passing arguments to pa.Table.to_pandas #2123

Open

Merge branch 'main' into arrow-to-pandas-dtypes

1c812be

dangotbanned added the eager-only label and removed the pyarrow and pandas-like labels Jul 27, 2025

dangotbanned added 8 commits July 27, 2025 21:34

Update ArrowDataFrame runtime

dcec444

feat: Update DataFrame.to_pandas

795bdaf

test: Update DataFrame.to_pandas

9e7e776

fix: Ignore **kwds for interchange

eaeed95

No intention of supporting it there

revert: Don't default to using types_mapper

b1c5d65

Was causing another test to fail that expected numpy dtypes

Merge branch 'main' into arrow-to-pandas-dtypes

21fc170

Merge branch 'main' into arrow-to-pandas-dtypes

c02d7ac

fix: Add to_struct_array compat for pyarrow<15

636b4ac

Fixes: https://github.com/narwhals-dev/narwhals/actions/runs/16556200639/job/46817960571?pr=2879
Related: apache/arrow#38520

dangotbanned added 7 commits August 15, 2025 11:51

Merge branch 'main' into arrow-to-pandas-dtypes

f72975e

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

40da5a8

Merge branch 'main' into arrow-to-pandas-dtypes

458f7d2

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

777a6d2

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

1033063

Merge branch 'main' into arrow-to-pandas-dtypes

1493d08

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

d7a0d91

dangotbanned requested a review from MarcoGorelli August 22, 2025 16:42

dangotbanned added 9 commits August 23, 2025 12:50

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

6a3632d

refactor: light shrinking

8420334

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

dc71c12

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

271ec7e

revert: ToPandasArrowKwds

b997fe7

#2879 (comment)

ci: remove annotation

a8566cf

https://results.pre-commit.ci/run/github/760058710/1756111460.u8gj5dBHRiWhTRmz_gWpdA

Merge branch 'main' into arrow-to-pandas-dtypes

77f2e66

Merge branch 'main' into arrow-to-pandas-dtypes

73b3362

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

2d94580

dangotbanned removed the request for review from MarcoGorelli August 28, 2025 20:04

dangotbanned added 7 commits August 28, 2025 20:04

Merge branch 'main' into arrow-to-pandas-dtypes

2c21602

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

9f74638

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

e41fd27

Merge branch 'main' into arrow-to-pandas-dtypes

05a7c03

Merge branch 'main' into arrow-to-pandas-dtypes

d49fa88

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

3c779a8

test: Adapt is_pandas to style of (#3098)

7a1da97

dangotbanned requested a review from MarcoGorelli September 7, 2025 15:34

dangotbanned added 4 commits September 7, 2025 15:53

Merge branch 'main' into arrow-to-pandas-dtypes

7912c18

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

36b3ca0

rMerge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

1b9bde7

Merge remote-tracking branch 'upstream/main' into arrow-to-pandas-dtypes

140d700
