Get list of bb start/end eas for loops in `extract.py` by kunalsz · Pull Request #1253 · mandiant/flare-floss

kunalsz · 2026-04-02T09:43:08Z

In reference to the TODO in extract.py:

        if len(comp) >= 2:
            # TODO get list of bb start/end eas
            yield Loop(comp)

Added loop BB range extraction for each SCC loop (len(comp) >= 2). It now builds sorted (start_ea, end_ea) pairs from the function's basic blocks and passes them to Loop.
Extended Loop in features.py to keep bb_ranges while preserving the existing comp behavior for compatibility.
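
For readers unfamiliar with the approach, here is a minimal, self-contained sketch of the idea described above: find strongly connected components (SCCs) of the control-flow graph, keep those with at least two blocks, and emit sorted (start, end) pairs. `BasicBlock`, `strongly_connected_components`, and `loop_bb_ranges` are hypothetical stand-ins written for this illustration; they are not the FLOSS or disassembler APIs, and the SCC routine is a plain recursive Tarjan rather than whatever the project actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class BasicBlock:
    """Hypothetical stand-in for a disassembler basic block."""
    va: int    # start address
    size: int  # length in bytes


def strongly_connected_components(
    nodes: Iterable[int], edges: Iterable[Tuple[int, int]]
) -> List[Set[int]]:
    """Recursive Tarjan SCC; fine for small graphs, not tuned for deep ones."""
    succs: Dict[int, List[int]] = {n: [] for n in nodes}
    for src, dst in edges:
        succs.setdefault(src, []).append(dst)
        succs.setdefault(dst, [])

    index: Dict[int, int] = {}
    lowlink: Dict[int, int] = {}
    stack: List[int] = []
    on_stack: Set[int] = set()
    comps: List[Set[int]] = []
    counter = 0

    def visit(n: int) -> None:
        nonlocal counter
        index[n] = lowlink[n] = counter
        counter += 1
        stack.append(n)
        on_stack.add(n)
        for m in succs[n]:
            if m not in index:
                visit(m)
                lowlink[n] = min(lowlink[n], lowlink[m])
            elif m in on_stack:
                lowlink[n] = min(lowlink[n], index[m])
        if lowlink[n] == index[n]:
            # n is the root of a component: pop it off the stack.
            comp: Set[int] = set()
            while True:
                m = stack.pop()
                on_stack.discard(m)
                comp.add(m)
                if m == n:
                    break
            comps.append(comp)

    for n in list(succs):
        if n not in index:
            visit(n)
    return comps


def loop_bb_ranges(
    blocks: Iterable[BasicBlock], edges: Iterable[Tuple[int, int]]
) -> List[List[Tuple[int, int]]]:
    """For each SCC with >= 2 blocks, return sorted (start, end) pairs."""
    bb_by_va = {bb.va: bb for bb in blocks}
    ranges = []
    for comp in strongly_connected_components(bb_by_va, edges):
        if len(comp) >= 2:
            # Skip addresses with no known block, as the PR's code does.
            ranges.append(
                sorted((va, va + bb_by_va[va].size) for va in comp if va in bb_by_va)
            )
    return ranges
```

Note that, as with the `len(comp) >= 2` check in the PR, a single block that branches to itself forms an SCC of size one and is not reported by this sketch.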

PS: main.py and tests/test_load.py were changed only by black formatting; they have nothing to do with this PR.

Signed-off-by: kunalsz <kunalavengers@gmail.com>

gemini-code-assist

Code Review

This pull request enhances the loop feature extraction by capturing basic block ranges and includes minor formatting adjustments to textwrap.dedent calls. The review feedback suggests optimizing the implementation by lazily initializing the basic block lookup dictionary only when a loop is identified, which prevents unnecessary computation in functions without loops.

gemini-code-assist · 2026-04-02T09:44:51Z

    edges = []
+    bb_by_va = {bb.va: bb for bb in f.basic_blocks}


The dictionary bb_by_va is created for every function, even those without loops. Since most functions do not contain loops (SCCs of size >= 2), this is an unnecessary overhead. It is more efficient to initialize this dictionary lazily only when a loop is detected.

Suggested change

    -    edges = []
    -    bb_by_va = {bb.va: bb for bb in f.basic_blocks}
    +    edges = []

gemini-code-assist · 2026-04-02T09:44:51Z

    for comp in comps:
        if len(comp) >= 2:
-            # TODO get list of bb start/end eas
-            yield Loop(comp)
+            loop_bb_ranges = []
+            for bb_va in sorted(comp):
+                bb = bb_by_va.get(bb_va)
+                if bb is None:
+                    continue
+
+                loop_bb_ranges.append((bb.va, bb.va + bb.size))
+
+            yield Loop(comp, bb_ranges=loop_bb_ranges)


Implementing lazy initialization for bb_by_va here ensures that the dictionary is only constructed when at least one loop is identified in the function, avoiding unnecessary computation for the majority of functions.

    bb_by_va = None
    for comp in comps:
        if len(comp) >= 2:
            if bb_by_va is None:
                bb_by_va = {bb.va: bb for bb in f.basic_blocks}
            loop_bb_ranges = []
            for bb_va in sorted(comp):
                bb = bb_by_va.get(bb_va)
                if bb is None:
                    continue
                loop_bb_ranges.append((bb.va, bb.va + bb.size))
            yield Loop(comp, bb_ranges=loop_bb_ranges)
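
The effect of the lazy initialization suggested above can be checked in isolation. The following is a rough sketch under stated assumptions: `BasicBlock`, `CountingBlocks`, and `iter_loop_ranges` are hypothetical names invented here, and `CountingBlocks` exists only to count how often the block list is walked. It does not use FLOSS's own `Loop` class.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Optional, Set, Tuple


class BasicBlock(NamedTuple):
    """Hypothetical stand-in for a basic block."""
    va: int
    size: int


class CountingBlocks:
    """Iterable wrapper that records how many times it is iterated."""

    def __init__(self, blocks: Iterable[BasicBlock]) -> None:
        self._blocks = list(blocks)
        self.iterations = 0

    def __iter__(self) -> Iterator[BasicBlock]:
        self.iterations += 1
        return iter(self._blocks)


def iter_loop_ranges(
    blocks: Iterable[BasicBlock], comps: Iterable[Set[int]]
) -> Iterator[List[Tuple[int, int]]]:
    """Yield sorted (start, end) pairs per loop, building the map on demand."""
    bb_by_va: Optional[Dict[int, BasicBlock]] = None
    for comp in comps:
        if len(comp) < 2:
            continue
        if bb_by_va is None:
            # Built at most once, and only after the first loop is seen.
            bb_by_va = {bb.va: bb for bb in blocks}
        yield [
            (va, va + bb_by_va[va].size)
            for va in sorted(comp)
            if va in bb_by_va
        ]
```

With this shape, a function whose components are all singletons never pays for the dictionary, and a function with several loops builds it once.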

williballenthin · 2026-04-02T10:06:52Z · floss/main.py

revert this please

williballenthin · 2026-04-02T10:07:00Z · tests/test_load.py

revert this

williballenthin · 2026-04-02T10:10:47Z

> Extended Loop in features.py to keep bb_ranges while preserving existing comp behavior for compatibility

This is not the right way to go about this. Features describe what people are looking for, so putting the addresses of found loops there doesn't make any sense.

When features are extracted, they're associated with a list of addresses where the feature was found. This is probably the right place to yield this information. Although to be honest, I'm not sure if it's worth the overhead of tracking this information because I can't really imagine many scenarios where people will care about the loop locations. That's why we haven't yet addressed this comment in the source code.
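
To make the distinction in this comment concrete, here is a small illustrative sketch of keeping the feature (what was found) separate from the addresses (where it was found). Everything in it is an assumption made for illustration: this `Loop` class and `extract_loops` function are hypothetical and are not FLOSS's actual feature classes or extractor signatures.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class Loop:
    """Hypothetical feature: describes the loop, carries no addresses."""
    block_count: int


def extract_loops(comps: Iterable[Set[int]]) -> Iterator[Tuple[Loop, List[int]]]:
    """Yield (feature, addresses) pairs for each SCC with >= 2 blocks."""
    for comp in comps:
        if len(comp) >= 2:
            # The location travels next to the feature, not inside it.
            yield Loop(block_count=len(comp)), sorted(comp)
```

One consequence of this design is that two loops of the same shape at different addresses produce equal features, so they can be matched or counted together while their locations remain available to callers that want them.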

williballenthin · 2026-04-02T10:12:07Z

I consider addressing this to-do issue as fairly low value. So I am not willing to go back and forth many times refining a solution. I'm willing to review perhaps one or maybe two more revisions of this PR. Otherwise we'll close it out and address the TODO sometime in the future if it becomes important.

kunalsz · 2026-04-02T10:47:45Z

> Extended Loop in features.py to keep bb_ranges while preserving existing comp behavior for compatibility
>
> This is not the right way to go about this. Features describe what people are looking for, so putting the addresses of found loops there doesn't make any sense.
>
> When features are extracted, they're associated with a list of addresses where the feature was found. This is probably the right place to yield this information. Although to be honest, I'm not sure if it's worth the overhead of tracking this information because I can't really imagine many scenarios where people will care about the loop locations. That's why we haven't yet addressed this comment in the source code

@williballenthin So should I close the PR? It won't be a meaningful contribution if we revert the Loop feature change and keep the code only in extract.py.

williballenthin · 2026-04-02T10:50:39Z

> So should I close the PR ? As it wont be a meaningful contribution if we

This isn't how I think about the project. I'm not looking for "meaningful contributions" but whether or not the project and its code is improved.

kunalsz · 2026-04-02T10:51:26Z

> I consider addressing this to-do issue as fairly low value. So I am not willing to go back and forth many times refining a solution. I'm willing to review perhaps one or maybe two more revisions of this PR. Otherwise we'll close it out and address the TODO sometime in the future if it becomes important.

I noticed quite a few other TODOs in the codebase. Since this one is low value, do you have any higher-value TODOs you’d prefer I work on instead? I’d be happy to pick one that is more useful for the project.

kunalsz · 2026-04-02T11:01:49Z

> So should I close the PR ? As it wont be a meaningful contribution if we
>
> This isn't how I think about the project. I'm not looking for "meaningful contributions" but whether or not the project and its code is improved.

Understood! What I meant was that I only added that change to support the loop location feature, so if that feature is not worth keeping, the changes in extract.py will not improve the codebase either (at least I can't think of how right now) 😅

Add loop BB range extraction

1268be9

Signed-off-by: kunalsz <kunalavengers@gmail.com>

williballenthin requested changes Apr 2, 2026

kunalsz wants to merge 1 commit into mandiant:master from kunalsz:bb-start-end-extract
