Conversation
WalkthroughThis PR refactors RSSA from a path-based class to a dataclass with explicit public fields (parameters, tracks) and introduces IO methods (read_from_file, save_to_files, load_from_saved_files). FileParameters gains JSON serialization support. The rssa_reader is enhanced with streaming via scan_tracks, and parse_header/parse_tracks are refactored to accept file paths and return typed results. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip We've launched Issue Planner and it is currently in beta. Please try it out and share your feedback on Discord! Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/f4enix/output/rssa/rssa.py (1)
107-108:⚠️ Potential issue | 🟡 MinorMinor: trailing comma in summary output.
Line 108 produces
" photons: 0, "with a trailing comma and space. This looks like a formatting oversight.📝 Suggested fix
- summary += f" photons: {self.photon_tracks.shape[0]}, " + summary += f" photons: {self.photon_tracks.shape[0]}.\n"src/f4enix/output/rssa/rssa_reader.py (1)
200-231:⚠️ Potential issue | 🟡 MinorThe "b" column transformation loses semantic information for multi-digit photon indicators.
The formula
abs(b) / 10^floor(log10(abs(b)))extracts only the leading single digit. Photon tracks documented as having indicator16are transformed to1, not16. While the binary filtering logic (b == 8for neutrons,b != 8for photons) still works correctly, the transformedbcolumn no longer matches the documented particle-type encoding stated at line 37 ofrssa.py.This appears intentional for filtering efficiency, but the docstring is misleading: after transformation, photons become
1, not16. Either clarify the documentation or preserve the originalbvalue if the full encoded information is needed elsewhere.Additionally, guard against
b == 0to preventlog10(0)from producing-inf.
🤖 Fix all issues with AI agents
In `@src/f4enix/output/rssa/rssa.py`:
- Around line 49-60: The docstring example for RSSA shows a filename in the
get_summary output but the current get_summary() implementation (in class RSSA)
is path-agnostic and emits "RSSA file was recorded..." without the filename;
update the example in the docstring (or adjust RSSA.get_summary) so they match:
either remove the filename from the example output or change RSSA.get_summary to
include a filename attribute when available (consult RSSA.read_from_file and the
RSSA instance attributes to include the original filename). Ensure the docstring
example and the get_summary() behavior are consistent and reference
RSSA.read_from_file, the RSSA class, and get_summary() for locating the code to
change.
- Line 1: The module docstring is stale and mentions "parsing of D1S-UNED
meshinfo files" while this file handles RSSA parsing; update the top-level
docstring in rssa.py to accurately describe the module's purpose (e.g., parsing
RSSA files, expected inputs/outputs, and any relevant classes or functions like
parse_rssa, RSSAParser or similar symbols present in the file) so it matches the
actual functionality.
In `@tests/test_rssa.py`:
- Around line 71-83: The test_scan_tracks_file currently calls .with_columns()
and .filter() with no arguments which will raise a TypeError; update the test to
either remove these no-arg calls or replace them with concrete Polars
expressions (e.g., .with_columns([...]) or .filter(some_predicate)) so
scanned_tracks remains a valid LazyFrame before .collect(); look for the
test_scan_tracks_file function and modify the chain using scan_tracks,
scanned_tracks, and RSSA to eliminate the empty .with_columns()/.filter() calls.
🧹 Nitpick comments (4)
src/f4enix/output/rssa/rssa_helpers.py (1)
66-81: Consider using@classmethodinstead of@staticmethodfor the alternate constructor.
load_from_jsonis a factory that returns aFileParametersinstance — the idiomatic Python pattern for this is@classmethodso subclasses work correctly and the intent (alternate constructor) is clearer. Not blocking, but a nice improvement.♻️ Suggested change
- `@staticmethod` - def load_from_json(path: Path) -> "FileParameters": + `@classmethod` + def load_from_json(cls, path: Path) -> "FileParameters": """Loads the file parameters from a JSON file.""" with open(path) as infile: data = json.load(infile) data["surfaces"] = [ SurfaceParameters( id=surface["id"], info=surface["info"], type=surface["type"], num_parameters=surface["num_parameters"], parameters=surface["parameters"], ) for surface in data["surfaces"] ] - return FileParameters(**data) + return cls(**data)src/f4enix/output/rssa/rssa_reader.py (2)
200-210: Reshape at EOF may fail if the file has trailing bytes not aligned to 96.When
number_of_bytes_to_read > 0and the file has fewer bytes remaining than requested,np.fromfilereturns whatever is left. If that remainder isn't a multiple of 96,data.reshape(-1, 96)at line 209 raises aValueError. For well-formed RSSA files this shouldn't occur, but a brief guard would improve robustness.🛡️ Suggested defensive trim
data = np.fromfile(file, BYTE, number_of_bytes_to_read) + # Trim any trailing bytes that don't form a complete track record + remainder = len(data) % 96 + if remainder != 0: + data = data[:-remainder] + + if len(data) == 0: + return pl.DataFrame(schema=SCHEMA) + data = data.reshape(-1, 96)
45-51:parse_tracksre-parses the header just to skip it — minor inefficiency.The header is parsed and discarded. This is fine for correctness, but callers who already have the header (e.g., after calling
parse_header) pay the cost twice. Consider accepting an optional pre-parsed offset or combining both into a single call. Not blocking given the header is small.src/f4enix/output/rssa/rssa.py (1)
65-70:read_from_fileopens and parses the file header twice.
parse_header(path)opens the file and reads the header, thenparse_tracks(path)opens the file again and re-parses the header just to skip it. For small headers this is negligible, but it's easy to avoid by reading header + tracks in a single open:♻️ Suggested single-pass read
`@staticmethod` def read_from_file(path: Path | str) -> "RSSA": """Loads the RSSA file from the given path as generated by MCNP.""" - parameters = parse_header(path) - tracks = parse_tracks(path) + with open(path, "rb") as infile: + from f4enix.output.rssa.rssa_reader import _parse_header_binary, _parse_tracks_binary + parameters = _parse_header_binary(infile) + tracks = _parse_tracks_binary(infile) return RSSA(parameters, tracks)Alternatively, the reader module could expose a single
parse_filefunction that returns both in one pass.
| @@ -1,79 +1,95 @@ | |||
| """This module is related to the parsing of D1S-UNED meshinfo files.""" | |||
There was a problem hiding this comment.
Stale module docstring.
The docstring says "parsing of D1S-UNED meshinfo files" but this module handles RSSA files. Please update.
📝 Suggested fix
-"""This module is related to the parsing of D1S-UNED meshinfo files."""
+"""This module is related to the parsing and representation of RSSA files."""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| """This module is related to the parsing of D1S-UNED meshinfo files.""" | |
| """This module is related to the parsing and representation of RSSA files.""" |
🤖 Prompt for AI Agents
In `@src/f4enix/output/rssa/rssa.py` at line 1, The module docstring is stale and
mentions "parsing of D1S-UNED meshinfo files" while this file handles RSSA
parsing; update the top-level docstring in rssa.py to accurately describe the
module's purpose (e.g., parsing RSSA files, expected inputs/outputs, and any
relevant classes or functions like parse_rssa, RSSAParser or similar symbols
present in the file) so it matches the actual functionality.
| Examples | ||
| -------- | ||
| >>> from f4enix.output.rssa import RSSA | ||
| ... my_rssa = RSSA.read_from_file('small_cyl.w') | ||
| ... print(my_rssa) | ||
| RSSA file small_cyl.w was recorded using the following surfaces: | ||
| Surface ID: 1, type: 1 | ||
| The total number of tracks recorded is 72083. | ||
| Neutrons: 72083 photons: 0. | ||
| The simulation that produced this RSSA run 100000 histories. | ||
| The amount of independent histories that reached the RSSA surfaces was 70797. | ||
| """ |
There was a problem hiding this comment.
Docstring example output doesn't match get_summary() implementation.
The example on line 54 shows "RSSA file small_cyl.w was recorded..." but get_summary() at line 102 produces "RSSA file was recorded..." (no filename, since the class is now path-agnostic).
📝 Suggested fix for the example output
- RSSA file small_cyl.w was recorded using the following surfaces:
+ RSSA file was recorded using the following surfaces:🤖 Prompt for AI Agents
In `@src/f4enix/output/rssa/rssa.py` around lines 49 - 60, The docstring example
for RSSA shows a filename in the get_summary output but the current
get_summary() implementation (in class RSSA) is path-agnostic and emits "RSSA
file was recorded..." without the filename; update the example in the docstring
(or adjust RSSA.get_summary) so they match: either remove the filename from the
example output or change RSSA.get_summary to include a filename attribute when
available (consult RSSA.read_from_file and the RSSA instance attributes to
include the original filename). Ensure the docstring example and the
get_summary() behavior are consistent and reference RSSA.read_from_file, the
RSSA class, and get_summary() for locating the code to change.
| def test_scan_tracks_file(rssa): | ||
| path = Path(RESOURCES.joinpath("small_cyl.w")) # type: ignore | ||
| file_parameters = parse_header(path) | ||
| scanned_tracks = scan_tracks(path) | ||
| scanned_tracks = ( | ||
| scanned_tracks.head(3) | ||
| .with_columns() # We can filter by columns | ||
| .filter() # We can apply any predicate to the LazyFrame | ||
| ) | ||
| small_rssa = RSSA(file_parameters, scanned_tracks.collect()) | ||
|
|
||
| assert small_rssa.tracks.shape == (3, 11) | ||
| assert rssa.tracks.head(3).equals(small_rssa.tracks) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Polars LazyFrame filter method no arguments behavior TypeError
💡 Result:
In Polars, LazyFrame.filter is not a “no-arg”/no-op method.
- In older versions (e.g., docs for
0.18), the signature isfilter(predicate: IntoExpr), so callinglf.filter()raises Python’sTypeErrorbecause the requiredpredicateargument is missing. (docs.pola.rs) - In current stable/dev docs, the signature is
filter(*predicates, **constraints), but it’s still defined as “filter rows … based on a predicate expression”, so calling it with no predicates/constraints is meaningless and you should avoid it. (docs.pola.rs)
What to do instead
- If you meant “don’t filter”, just don’t call
filter:lf = lf if predicate is None else lf.filter(predicate)
- If you need a deliberate no-op filter, pass a trivial predicate:
lf.filter(True)(keeps all rows) (docs.pola.rs)
Citations:
- 1: https://docs.pola.rs/docs/python/version/0.18/reference/lazyframe/api/polars.LazyFrame.filter.html?utm_source=openai
- 2: https://docs.pola.rs/api/python/stable/reference/lazyframe/?utm_source=openai
- 3: https://docs.pola.rs/api/python/stable/reference/lazyframe/?utm_source=openai
🌐 Web query:
Polars LazyFrame with_columns no arguments
💡 Result:
In Python Polars, LazyFrame.with_columns is defined with only optional varargs/kwargs:
LazyFrame.with_columns(*exprs, **named_exprs) -> LazyFrameSo calling it with no arguments is valid and is effectively a no-op (it returns a new LazyFrame representing “add zero columns”). [1]
If you’re conditionally building expressions, the common pattern is:
exprs: list[pl.Expr] = [...]
lf = lf.with_columns(exprs) if exprs else lf(You can pass a list of expressions directly; Polars accepts that form too.) [1]
Sources
[1] Polars docs: polars.LazyFrame.with_columns signature and usage: (docs.pola.rs)
Citations:
🌐 Web query:
Polars documentation LazyFrame filter required arguments
💡 Result:
In current Polars Python docs, LazyFrame.filter is defined as:
LazyFrame.filter(
*predicates: IntoExprColumn | Iterable[IntoExprColumn] | bool | list[bool] | np.ndarray,
**constraints: Any,
) -> LazyFrameSo there are no strictly required named parameters in the signature; instead you pass:
- one or more positional
predicates(boolean expression(s)), and/or - zero or more keyword
constraintslikecolname=value(equivalent topl.col("colname") == value, AND-ed together). [1][2]
In older docs (e.g., 0.18) the signature was:
LazyFrame.filter(predicate: IntoExpr) -> Selfwhere the single predicate argument was required. [3]
Sources: Polars LazyFrame.filter (stable/current) [1], (py-polars reference) [2], and version 0.18 docs [3].
🏁 Script executed:
cd tests && find . -name "test_rssa.py" -type f | head -5Repository: Fusion4Energy/F4Enix
Length of output: 78
🏁 Script executed:
cat pyproject.toml | grep -A 5 -B 5 "polars"Repository: Fusion4Energy/F4Enix
Length of output: 237
🏁 Script executed:
grep -r "polars" . --include="*.py" | grep "import" | head -3Repository: Fusion4Energy/F4Enix
Length of output: 238
🏁 Script executed:
cat -n tests/test_rssa.py | sed -n '71,83p'Repository: Fusion4Energy/F4Enix
Length of output: 680
🏁 Script executed:
cat -n tests/test_rssa.py | sed -n '1,30p'Repository: Fusion4Energy/F4Enix
Length of output: 1124
🏁 Script executed:
grep -r "pytest" . --include="pyproject.toml" -A 5Repository: Fusion4Energy/F4Enix
Length of output: 412
🏁 Script executed:
grep -r "filter()" . --include="*.py" -B 2 -A 2Repository: Fusion4Energy/F4Enix
Length of output: 387
Remove the no-arg calls to .with_columns() and .filter().
.with_columns() with no arguments is a valid no-op in Polars, but .filter() without a predicate expression will raise a TypeError. If the intent is to demonstrate these methods are available, use actual expressions or remove the calls entirely.
♻️ Suggested fix
scanned_tracks = (
scanned_tracks.head(3)
- .with_columns() # We can filter by columns
- .filter() # We can apply any predicate to the LazyFrame
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_scan_tracks_file(rssa): | |
| path = Path(RESOURCES.joinpath("small_cyl.w")) # type: ignore | |
| file_parameters = parse_header(path) | |
| scanned_tracks = scan_tracks(path) | |
| scanned_tracks = ( | |
| scanned_tracks.head(3) | |
| .with_columns() # We can filter by columns | |
| .filter() # We can apply any predicate to the LazyFrame | |
| ) | |
| small_rssa = RSSA(file_parameters, scanned_tracks.collect()) | |
| assert small_rssa.tracks.shape == (3, 11) | |
| assert rssa.tracks.head(3).equals(small_rssa.tracks) | |
| def test_scan_tracks_file(rssa): | |
| path = Path(RESOURCES.joinpath("small_cyl.w")) # type: ignore | |
| file_parameters = parse_header(path) | |
| scanned_tracks = scan_tracks(path) | |
| scanned_tracks = ( | |
| scanned_tracks.head(3) | |
| ) | |
| small_rssa = RSSA(file_parameters, scanned_tracks.collect()) | |
| assert small_rssa.tracks.shape == (3, 11) | |
| assert rssa.tracks.head(3).equals(small_rssa.tracks) |
🤖 Prompt for AI Agents
In `@tests/test_rssa.py` around lines 71 - 83, The test_scan_tracks_file currently
calls .with_columns() and .filter() with no arguments which will raise a
TypeError; update the test to either remove these no-arg calls or replace them
with concrete Polars expressions (e.g., .with_columns([...]) or
.filter(some_predicate)) so scanned_tracks remains a valid LazyFrame before
.collect(); look for the test_scan_tracks_file function and modify the chain
using scan_tracks, scanned_tracks, and RSSA to eliminate the empty
.with_columns()/.filter() calls.
Adds functions to scan RSSA file instead of loading them into memory all at once. Enables working with very large files that do not fit into memory.
Summary by CodeRabbit
New Features
Refactor
Tests