@juleswg23 juleswg23 commented Aug 13, 2025

Summary

Work in progress on a new feature: grand summary rows.

This PR summary will be updated.

Related GitHub Issues and PRs

Checklist

@juleswg23 juleswg23 changed the title Feat: grand summary rows Feat: add grand_summary_rows() method Aug 13, 2025

codecov bot commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 97.48954% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.64%. Comparing base (654757a) to head (179a68f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| great_tables/_gt_data.py | 96.15% | 3 Missing ⚠️ |
| great_tables/_locations.py | 92.85% | 2 Missing ⚠️ |
| great_tables/_tbl_data.py | 95.65% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #765      +/-   ##
==========================================
+ Coverage   91.45%   91.64%   +0.19%     
==========================================
  Files          47       47              
  Lines        5558     5746     +188     
==========================================
+ Hits         5083     5266     +183     
- Misses        475      480       +5     


@juleswg23 juleswg23 marked this pull request as ready for review August 26, 2025 19:24
@machow machow self-assigned this Aug 28, 2025
row_stub_var = data._boxhead._get_stub_column()

stub_layout = data._stub._get_stub_layout(options=data._options)
has_summary_rows = bool(data._summary_rows or data._summary_rows_grand)

Contributor Author

@juleswg23 juleswg23 Aug 28, 2025

This expression appears several times in this PR. I'm not sure if there's a better approach that doesn't have multiple sources of truth.
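
One way to keep a single source of truth would be a derived property on the data object. This is a minimal sketch, assuming a simplified stand-in container; `TableData` and its field names are hypothetical and do not reflect the actual GTData definition:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableData:
    """Hypothetical stand-in for the table data container."""

    # Stand-ins for data._summary_rows and data._summary_rows_grand
    summary_rows: tuple = ()
    summary_rows_grand: tuple = ()

    @property
    def has_summary_rows(self) -> bool:
        # Every call site reads this property instead of repeating the expression
        return bool(self.summary_rows or self.summary_rows_grand)
```

Call sites would then read `data.has_summary_rows` rather than rebuilding the boolean each time.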

has_summary_rows = bool(data._summary_rows or data._summary_rows_grand)

summary_row_stub_var = ColInfo(
    "__summary_row__", ColInfoTypeEnum.stub, column_align="left"
)
column_vars = [summary_row_stub_var] + column_vars

Contributor Author

@juleswg23 juleswg23 Aug 28, 2025

We discussed this case today. I wonder whether it is worth adding a flag to the ColInfo object, so that we don't have to rely on the ColInfo.var attribute to identify the summary row column.
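
A rough sketch of what such a flag could look like. The `ColInfo` and `ColInfoTypeEnum` classes below are simplified stand-ins, and the `is_summary_stub` field is a hypothetical name, not something that exists in the library:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ColInfoTypeEnum(Enum):
    # Simplified stand-in; only the members needed for this sketch
    default = auto()
    stub = auto()


@dataclass(frozen=True)
class ColInfo:
    var: str
    type: ColInfoTypeEnum = ColInfoTypeEnum.default
    column_align: Optional[str] = None
    # Hypothetical flag marking the synthetic summary-row stub column
    is_summary_stub: bool = False


summary_row_stub_var = ColInfo(
    "__summary_row__", ColInfoTypeEnum.stub, column_align="left", is_summary_stub=True
)
```

Rendering code could then branch on `colinfo.is_summary_stub` instead of comparing `colinfo.var` to the `"__summary_row__"` string.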



def max_expr(df: pd.DataFrame):
    return df.max(numeric_only=True)

Contributor Author

These are helpers. Does it make sense to have them here?

Contributor Author

I am not very confident in my tests for the new single-dispatch function, eval_aggregate(). Open to feedback and changes.
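
For reference, a minimal sketch of how a single-dispatch function can be exercised one registered type at a time, plus the unregistered fallback. Plain Python containers stand in for the pandas and Polars frames here; `eval_aggregate_sketch` and its signature are assumptions, not the function in this PR:

```python
from functools import singledispatch
from typing import Any, Callable


@singledispatch
def eval_aggregate_sketch(data: Any, fn: Callable[[Any], Any]) -> dict:
    # Fallback for unregistered types, so unsupported inputs fail loudly
    raise NotImplementedError(f"Unsupported data type: {type(data).__name__}")


@eval_aggregate_sketch.register
def _(data: dict, fn: Callable[[Any], Any]) -> dict:
    # Column-oriented stand-in: apply fn to each column's list of values
    return {name: fn(values) for name, values in data.items()}


@eval_aggregate_sketch.register
def _(data: list, fn: Callable[[Any], Any]) -> dict:
    # Row-oriented stand-in: pivot rows to columns, then reuse the dict path
    columns = {key: [row[key] for row in data] for key in data[0]}
    return eval_aggregate_sketch(columns, fn)
```

A test suite for this shape would typically cover each registered type with the same expected result, and assert that an unregistered type raises.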

if columns is not None:
    raise NotImplementedError(
        "Currently, grand_summary_rows() does not support column selection."
    )

Contributor Author

Currently we raise an error, which is documented in the docstring. Flagging in a comment just in case there's a better way to stub this out.

def grand_summary_rows(
    self: GTSelf,
    fns: dict[str, PlExpr] | dict[str, Callable[[TblData], Any]],
    fmt: FormatFn | None = None,

Contributor Author

Having used this in a couple of cases, I am leaning towards accepting a function instead of a FormatFn, the same way that fmt() can take a custom-built fn. I haven't explored the implementation details of this switch.
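
To make the idea concrete, here is a rough sketch of a `fmt` argument that accepts any callable from a cell value to a string. The helper name `format_summary_values` is hypothetical, and the sketch says nothing about how FormatFn is handled internally:

```python
from typing import Any, Callable, Optional


def format_summary_values(
    values: dict,
    fmt: Optional[Callable[[Any], str]] = None,
) -> dict:
    # With no formatter supplied, fall back to str(); otherwise apply the
    # user's callable to every summary value
    formatter = fmt if fmt is not None else str
    return {col: formatter(val) for col, val in values.items()}
```

A caller could then pass something like `fmt=lambda x: f"{x:.2f}"` without building a formatter object first.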

# Replace with numeric values from new row
for key, new_value in summary_row.values.items():
    if isinstance(new_value, (int, float)):
        merged_values[key] = new_value

Contributor Author

I can tell this is not ideal, since we can't guarantee that summary values will be numeric.
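
One possible alternative is to merge on presence rather than on type. This sketch assumes a missing summary value is represented by None; that sentinel and the helper name `merge_summary_values` are both assumptions:

```python
from typing import Any


def merge_summary_values(existing: dict, new: dict) -> dict:
    # Copy first so the caller's dict is left untouched
    merged = dict(existing)
    for key, new_value in new.items():
        # Overwrite whenever a value is present, so strings, dates, and
        # other non-numeric summaries are kept too
        if new_value is not None:
            merged[key] = new_value
    return merged
```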

_heading=Heading(),
_stubhead=None,
_summary_rows=SummaryRows(),
_summary_rows_grand=SummaryRows(_is_grand_summary=True),

Collaborator

It looks like _is_grand_summary exists so that, in methods like __getitem__ and add_summary_rows(), some arguments that are required for regular summary rows can be optional for the grand summary.


@set_style.register
def _(loc: LocBody, data: GTData, style: list[CellStyle]) -> GTData:
# @set_style.register(LocSummary)

Collaborator

TODO: double check this



def grand_summary_rows(
    self: GTSelf,

Collaborator

TODO: let's make these keyword-only for now, so we can feel out the final signature.
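
A short sketch of what keyword-only would mean here: a bare `*` in the signature rejects positional use of everything after it. The function name, the simplified parameter types, and the `"bottom"` default for `side` are assumptions for illustration:

```python
from typing import Any, Callable, Optional


def grand_summary_rows_sketch(
    table: Any,
    *,  # everything after this marker must be passed by keyword
    fns: dict,
    fmt: Optional[Callable[[Any], str]] = None,
    side: str = "bottom",  # assumed default, for illustration only
) -> dict:
    # Returns the collected arguments so the sketch stays self-contained
    return {"fns": fns, "fmt": fmt, "side": side}
```

Because callers must name each argument, parameters can later be reordered or inserted without breaking existing calls.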

side=side,
)

self._summary_rows_grand.add_summary_row(summary_row_info)

Collaborator

Let's make sure we don't mutate here, and that summary row data objects are not able to mutate themselves.
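
A minimal sketch of the non-mutating pattern, assuming a frozen dataclass that stores its rows in a tuple. `SummaryRowsSketch` is a hypothetical stand-in and is much simpler than the SummaryRows class in this PR:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SummaryRowsSketch:
    # A tuple rather than a list, so stored rows cannot be edited in place
    rows: tuple = ()

    def add_summary_row(self, row: str) -> "SummaryRowsSketch":
        # Return a new object instead of mutating self
        return replace(self, rows=self.rows + (row,))
```

The calling method would then build a new table object holding the returned value, rather than calling `add_summary_row()` for its side effect.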

)
from ._spanners import spanners_print_matrix
- from ._tbl_data import _get_cell, cast_frame_to_string, replace_null_frame
+ from ._tbl_data import TblData, _get_cell, cast_frame_to_string, replace_null_frame

Collaborator

Let's put TblData in an `if TYPE_CHECKING` block, to flag that we are not instantiating any Tbly things.
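
For illustration, a sketch of the pattern. So that the snippet runs on its own, a stdlib class stands in for the real `from ._tbl_data import TblData` import, and `describe` is a hypothetical function:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by type checkers, never at runtime. In the PR this
    # would be `from ._tbl_data import TblData`; a stdlib class stands in.
    from fractions import Fraction as TblData


def describe(data: "TblData") -> str:
    # The quoted annotation is not evaluated at runtime, so the name
    # TblData does not need to exist when this module is imported
    return type(data).__name__
```

This signals to readers that the name is used for annotations only.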

else:
cell_content = summary_row.values.get(colinfo.var)
else:
if colinfo.var == "__summary_row__":

Collaborator

@machow machow Sep 9, 2025

TODO: double check this (used twice)

2 participants