Add PumpProbePulses.pumped_pulses_ratios() #317

philsmt · 2025-04-10T08:04:03Z

I recently needed to frequently distinguish regular trains with all or most pulses pumped from trains where, intentionally, only a few or no pulses are pumped for reference. This was way more work than I wanted it to be, mostly because of the necessary train alignment between these modes.

This PR adds a corresponding method, .pumped_pulses_ratios(), which automatically returns the ratio of pumped pulses for each train as a series.

@takluyver This could also be used to distinguish pump-probe patterns

codecov · 2025-04-10T08:06:24Z

Codecov Report

Attention: Patch coverage is 62.96296% with 10 lines in your changes missing coverage. Please review.

Project coverage is 57.40%. Comparing base (0c4c101) to head (54e8093).
Report is 3 commits behind head on master.

Files with missing lines	Patch %	Lines
src/extra/components/pulses.py	62.96%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #317      +/-   ##
==========================================
+ Coverage   57.36%   57.40%   +0.03%     
==========================================
  Files          30       30              
  Lines        4539     4566      +27     
==========================================
+ Hits         2604     2621      +17     
- Misses       1935     1945      +10

takluyver · 2025-04-14T14:59:44Z

src/extra/components/pulses.py

@@ -1507,6 +1507,59 @@ def pulse_mask(self, labelled=True, field=None):
        else:
            raise ValueError(f"{field=!r} parameter was not 'fel'/'ppl'/None")

+    def pumped_pulses_ratios(self, ppl_only_value=np.nan, labelled=True):


labelled is never used. I don't have a strong preference whether we implement it or remove it.

Well spotted. I realized at the end all other methods carry it, but forgot the implementation...

takluyver · 2025-04-14T15:18:17Z

tests/test_components_pulses.py

+    pulses._get_train_ids = lambda: [1000, 1001, 1002, 1003, 1004]
+    pulses._pulse_ids = pd.Series(
+        [300, 310, 300, 300, 300, 310],
+        index=pd.MultiIndex.from_tuples([
+            (1000, 0, True, True),
+            (1000, 0, True, False),
+            (1001, 0, True, False),
+            (1002, 0, False, True),
+            (1003, 0, True, True),
+            (1003, 0, True, True),
+        ], names=['trainId', 'pulseIndex', 'fel', 'ppl']))


In general I think mocking out bits of the innards of a class like this to test its public API is annoyingly brittle, and it's better to make some suitable input. I won't hold the PR up over it, though. Maybe we need better facilities for making mock data.

Generally I agree, though I found it acceptable for PulsePattern given the explicit caching mechanism.

As you guessed correctly, I wanted to avoid having to create different mock devices with the current interface to test a very particular scenario. Ideally we would continue to be able to test without Maxwell, and we also cannot rely on runs sticking around forever unless we copy them to a defined place. Maybe some serialized data in the repository that is unpacked into EXDF?

takluyver · 2025-04-14T15:28:55Z

I think doing this with a pandas series makes it more complex than using a 2D array. I haven't tested this, but as an idea:

fel_count = self.pulse_mask(field='fel').sum(axis=1)
ppl_count = self.pulse_mask(field='ppl').sum(axis=1)
ratio = ppl_count / fel_count
ratio[fel_count == 0] = fill_value

Might want to avoid division by zero; np.divide() takes a where= parameter.
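A minimal sketch of the mask-based idea with np.divide() and where=, untested against the real component. The function name ratio_from_masks and the two plain boolean arrays are hypothetical stand-ins for the results of pulse_mask(field='fel') and pulse_mask(field='ppl'), and it assumes a pulse counts as pumped when it is set in both masks.

```python
import numpy as np


def ratio_from_masks(fel_mask, ppl_mask, fill_value=np.nan):
    """Per-train ratio of pumped pulses to FEL pulses (hypothetical helper).

    Both arguments are 2D boolean arrays of shape (trains, pulses).
    """
    fel_mask = np.asarray(fel_mask, dtype=bool)
    ppl_mask = np.asarray(ppl_mask, dtype=bool)

    fel_count = fel_mask.sum(axis=1)
    # Assumption: "pumped" means the pulse is flagged in both masks.
    pumped_count = (fel_mask & ppl_mask).sum(axis=1)

    # Pre-fill the output so trains without FEL pulses keep fill_value;
    # np.divide() only writes where the condition holds, so no 0/0 occurs.
    ratio = np.full(fel_count.shape, fill_value, dtype=np.float64)
    np.divide(pumped_count, fel_count, out=ratio, where=fel_count > 0)
    return ratio


fel = np.array([[True, True], [True, False], [False, False]])
ppl = np.array([[True, False], [False, False], [False, True]])
ratios = ratio_from_masks(fel, ppl)
```

Pre-filling the output array is what makes where= safe here: entries that are skipped by the division keep whatever value was already in out.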

philsmt · 2025-04-15T07:10:20Z

> I think doing this with a pandas series makes it more complex than using a 2D array.

Assuming a constant number of pulses per train is no longer useful for me, in particular when pump-probe is used. That's why I based the PulsePattern family mostly on pandas types.

Going forward the more ubiquitous use of frame filters will likely make this even more common. I don't think we can get away with using linear pulse axes more often.

EDIT: Looks like I mixed up MRs here... this function actually reduces to a train dimension. Just ignore the part above please for that matter.

Concerning your comment: The downside of that code is that right now I only rely on the base interface (see implementation in DldPulses), while for that case I'd need pulse_mask, too. The code looks easier, provided .pulse_mask() actually always includes pulse-less trains, and I don't quite recall whether it does 🤔

takluyver · 2025-04-15T08:39:12Z

I'd maybe add .pulse_mask() - or rather the variant with the field= parameter - to DldPulses, to match PumpProbePulses. It looks to me like the implementation of that plus pumped_pulses_ratios() using the mask arrays is still simpler than the MultiIndex version, and the enhanced pulse_mask could be useful as well. Up to you, though.

fadybishara · 2025-04-23T14:56:50Z

src/extra/components/pulses.py

+            pumped_count = pd.Series([])
+
+        # Compute the ratio for trains with at least one pumped pulse.
+        ratios = pumped_count / fel_count.loc[pumped_count.index]


I'm not sure I follow the logic here. If pumped_count and fel_count have different trains, how do you know that fel_count will have more trains? If not, wouldn't this throw an error?

It's likely I misunderstood something but if not, perhaps a simple solution would be to do an explicit inner merge like

joint_counts = pd.concat([fel_count, pumped_count], keys=['fel', 'ppl'], axis=1, join='inner')
ratios = joint_counts.ppl.div(joint_counts.fel)

Oh, never mind, I get it -- all the trains have FEL pulses but not necessarily PPL pulses. Nevertheless, just because it should not happen doesn't mean it cannot happen, no?

(Also, a better way to do what I suggested is with np.intersect1d on the train IDs -- but probably none of what I suggested is necessary.)

Not quite - this line is not about FEL vs PPL pulses, but FEL vs FEL+PPL pulses. The set of trains with pumped pulses is a (not necessarily proper) subset of the set of trains with FEL pulses. It becomes clear when comparing line 1528 and line 1533: the latter indexes more strictly.

Yes, right, 1528 vs. 1533 is what I was referring to. [:, :, True, True] is necessarily a subset (not sure what you mean by "proper") of [:, :, True, :] so there is no problem here.
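To illustrate the subset argument, here is a small sketch using the index layout from the test in this PR. It is not the method's actual implementation: it selects rows with explicit boolean masks on the 'fel' and 'ppl' index levels instead of the positional slicing used in the PR, and the variable names are only for illustration.

```python
import pandas as pd

# Same layout as the mocked pulse IDs in tests/test_components_pulses.py.
pids = pd.Series(
    [300, 310, 300, 300, 300, 310],
    index=pd.MultiIndex.from_tuples([
        (1000, 0, True, True),
        (1000, 0, True, False),
        (1001, 0, True, False),
        (1002, 0, False, True),
        (1003, 0, True, True),
        (1003, 0, True, True),
    ], names=['trainId', 'pulseIndex', 'fel', 'ppl']))

fel = pids.index.get_level_values('fel').to_numpy(dtype=bool)
ppl = pids.index.get_level_values('ppl').to_numpy(dtype=bool)

# Pulses per train with FEL, and with both FEL and PPL (pumped).
fel_count = pids.loc[fel].groupby(level='trainId').count()
pumped_count = pids.loc[fel & ppl].groupby(level='trainId').count()

# Every train with a pumped pulse also has an FEL pulse, because the
# pumped selection only adds a condition to the FEL selection.
is_subset = pumped_count.index.isin(fel_count.index).all()

# Hence the lookup by the pumped trains cannot miss a label.
ratios = pumped_count / fel_count.loc[pumped_count.index]
```

Train 1002 has only a PPL pulse and so appears in neither count, while train 1001 has FEL pulses only and appears in fel_count alone.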

philsmt · 2025-04-24T06:47:46Z

> I'd maybe add .pulse_mask() - or rather the variant with the field= parameter - to DldPulses, to match PumpProbePulses. It looks to me like the implementation of that plus pumped_pulses_ratios() using the mask arrays is still simpler than the MultiIndex version, and the enhanced pulse_mask could be useful as well. Up to you, though.

I gave this some thought, but this implementation would run into the problem of selected trains vs trains with data. As with KeyData.data_counts(), the statistics methods of these components strive to return a result for all selected trains, whether there is data or not.
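One way to picture "a result for all selected trains" is to reindex the per-train counts onto the full list of selected train IDs before dividing. The following is only a sketch with made-up counts; the variable names and the choice of NaN for trains without FEL pulses are assumptions, not the behaviour of the method in this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: all selected trains, and counts only for trains
# that actually contain the respective pulses.
selected_train_ids = [1000, 1001, 1002, 1003, 1004]
fel_count = pd.Series({1000: 2, 1001: 1, 1003: 2})
pumped_count = pd.Series({1000: 1, 1003: 2})

# Extend both counts to every selected train; missing trains count as 0.
fel_all = fel_count.reindex(selected_train_ids, fill_value=0)
pumped_all = pumped_count.reindex(selected_train_ids, fill_value=0)

# Mask zero denominators as NaN so trains without FEL pulses yield NaN
# (an assumed fill value) instead of a division by zero.
ratios = pumped_all / fel_all.where(fel_all > 0)
```

The result has exactly one entry per selected train, in the order of selected_train_ids, regardless of which trains carry data.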

fadybishara · 2025-04-24T07:20:04Z

src/extra/components/pulses.py

+            # pd.SeriesGroupBy.count() is indeed faster than
+            # pd.SeriesGroupBy.groups, likely due to additional objects
+            # created by the latter.
+            try:


This seems more complicated than necessary. Why not do the following?

try:
    ppl_only_index = pids[:, :, False, True].groupby('trainId').count()
except KeyError:
    ppl_only_index = pd.Series([])

philsmt force-pushed the feat/pump-probe-ratios branch 3 times, most recently from 628d451 to 67e28db Compare April 10, 2025 08:48

takluyver reviewed Apr 14, 2025

philsmt added 2 commits April 15, 2025 09:17

Add PumpProbePulses.pumped_pulses_ratios()

98781c4

Add pumped_pulses_ratio() to DldPulses

54e8093

philsmt force-pushed the feat/pump-probe-ratios branch from 67e28db to 54e8093 Compare April 15, 2025 07:17

fadybishara reviewed Apr 23, 2025

fadybishara reviewed Apr 24, 2025
