Added rudimentary feature extraction #74
Conversation
added feature extraction
converted input variable to self
another
working
Hi Matt. This looks good, and this functionality was on the todo list (#42), so thanks! This PR will need a test case, preferably one that tests against the command line. Also, just FYI, the license for the project is going to change from GPLv3 to Apache 2 (a more permissive license). I'll likely hold off on merging this PR until after the license change, so your code contribution will be under Apache 2.
for c in recording.chunks:
    if self.use_custom_classifier:
        pred = self.predict_with_custom_classifier(c)[0]
    elif self.fetch_embeddings == True:
@MJWeldy From looking at this, it appears that the custom classifier functionality and the embeddings functionality can't be used at the same time. Is that correct?
I haven't used embeddings myself, so not sure if that is even relevant.
Thanks. I'll try to get to adding a test case in the next few weeks.
The licensing is fine with me. At this point I need to get my current project (which uses the older 320-length embeddings) over the line before I switch to this library. The way I added the feature extraction does not allow for the custom classifier and embedding extraction in the same run, but it would be fairly trivial to allow it to return both. I can think of a few niche cases where someone might want both (like someone trying to visualize the embeddings for a few feature classes while using a custom classifier head), but I think it would be pretty rare.
I did not add the embeddings to the older tf.lite model. Also, it would be nice to adapt the multiprocessing functions to allow for embeddings; my current scripts embed 1000 audio files in ~5-6 seconds.
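A rough sketch of how the if/elif in the quoted loop could be relaxed to produce both outputs in one run; the extract_embeddings helper name here is hypothetical, not something from this PR:

from birdnetlib import Recording  # context: loop runs inside the Analyzer

for c in recording.chunks:
    # Independent checks instead of if/elif, so one pass over the
    # chunks can yield both embeddings and custom predictions.
    if self.fetch_embeddings:
        # extract_embeddings is a hypothetical helper standing in for
        # whatever method the PR uses to compute the chunk embedding.
        self.features_list.append(self.extract_embeddings(c))
    if self.use_custom_classifier:
        pred = self.predict_with_custom_classifier(c)[0]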
Hey Matt. I'm going to close this for now since the PR is out of date. I still have #42 as an open issue and I'll refer to your PR when I get around to implementing it.
Hi Joe,
Hey Matt. I'd love to get #42 implemented. We've had a few updates since August, so I'll need to get this PR up to date. It might not need any changes; it just needs to be checked. Would you mind confirming it still works as expected (against the current main branch)? Also, it needs a test case. I can probably do that in the next day or so. Thanks!
Hi Joe!
Hey Matt. I decided to add a test case myself, which led to me going ahead and fleshing out the method. Check it out here: #112 For test cases, I've been running the BirdNET-Analyzer command line script, then asserting birdnetlib's output directly against that, rather than using an expected fixture. Usually I'd use a test fixture (npz or text), but the live test helps me catch upstream changes in the main project. Thanks for checking out the implementation and for providing the original code (and the spark for revisiting it!). If everything looks good to you, I'll merge that PR and publish it to PyPI.
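For reference, a minimal sketch of that test pattern; the file paths are placeholders, and the analyze.py flags (--i, --o, --rtype, --min_conf) are an assumption based on the BirdNET-Analyzer versions I've seen:

import csv
import subprocess

from birdnetlib import Recording
from birdnetlib.analyzer import Analyzer

# Run the upstream BirdNET-Analyzer CLI to produce the expected output.
# Placeholder paths; flag names are an assumption about the CLI version.
subprocess.run(
    ["python", "analyze.py", "--i", "soundscape.wav",
     "--o", "expected.csv", "--rtype", "csv", "--min_conf", "0.25"],
    check=True,
)
with open("expected.csv") as f:
    expected = list(csv.DictReader(f))

# Analyze the same file with birdnetlib at the same confidence cutoff.
analyzer = Analyzer()
recording = Recording(analyzer, "soundscape.wav", min_conf=0.25)
recording.analyze()

# A real test would compare species labels and confidences row by row;
# this just checks that the detection counts line up.
assert len(recording.detections) == len(expected)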
#112 merged in.
Hi Joe,
My name is Matt Weldy. I am a PhD student working with bioacoustics in Oregon. I am currently working on a project using the previous 320-length BirdNET embeddings. However, I ran into this lib today and thought I would add a rudimentary implementation of the embedding extraction. When the Analyzer class is initialized, feature extraction is enabled with Analyzer(fetch_embeddings = 1). The processed embeddings are accessed through recording.features_list or recording.features.
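A minimal usage sketch of that interface; the fetch_embeddings argument and the features_list attribute are as this PR describes them, and the audio path is just a placeholder:

from birdnetlib import Recording
from birdnetlib.analyzer import Analyzer

# Enable feature extraction as described above (per this PR).
analyzer = Analyzer(fetch_embeddings=1)

# "soundscape.wav" is a placeholder audio file for illustration.
recording = Recording(analyzer, "soundscape.wav")
recording.analyze()

# After analysis, the per-chunk embeddings are available on the
# recording, per the description above.
print(recording.features_list)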
Feel free to keep it or toss it as you see fit :)