Commit f9e38d4: Better docs
mbsantiago committed Nov 27, 2023
1 parent f79f532 commit f9e38d4
Showing 9 changed files with 89 additions and 53 deletions.
31 changes: 16 additions & 15 deletions README.md
@@ -8,8 +8,7 @@
![Static Badge](https://img.shields.io/badge/formatting-black-black)
[![codecov](https://codecov.io/gh/mbsantiago/soundevent/branch/main/graph/badge.svg?token=42kVE87avA)](https://codecov.io/gh/mbsantiago/soundevent)

-> **Warning**
-> This package is under active development, use with caution.
+> **Warning** This package is under active development, use with caution.
`soundevent` is an open-source Python package that aims to support the
computational biocoustic community in developing well-tested, coherent, and
@@ -20,23 +19,23 @@ definition. The package comprises three key components:

## Main features

-### 1. Data Classes for Bioacoustic Analysis
+### 1. Data Schemas for Bioacoustic Analysis

-The `soundevent` package defines several [data classes](https://mbsantiago.github.io/soundevent/data/) that conceptualize and
-standardize different recurrent objects in bioacoustic analysis. These data
-classes establish the relationships between various concepts and specify the
-attributes each object possesses. They are designed to be flexible enough to
-cover a wide range of use cases in bioacoustic analysis. The package also
-includes data validation mechanisms to ensure that the information stored is
-valid and meaningful. Specifically, it defines objects related to sound events,
-such as user annotations and model predictions.
+The `soundevent` package introduces several
+[data schemas](https://mbsantiago.github.io/soundevent/data_schemas/) designed
+to conceptualize and standardize recurring objects in bioacoustic analysis.
+These data schemas establish relationships between various concepts and define
+the attributes each object possesses. They provide flexibility to cover a broad
+spectrum of use cases in bioacoustic analysis while incorporating data
+validation mechanisms to ensure stored information is both valid and meaningful.
+Notably, the package defines schemas related to sound events, including user
+annotations and model predictions.
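The validation behaviour this paragraph describes can be sketched with plain Python. The stand-in below uses a stdlib dataclass; the `Recording` fields and checks shown here are illustrative assumptions, not soundevent's actual schema definition.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class Recording:
    """Illustrative sketch of a validated data schema (not soundevent's API)."""

    path: str
    duration: float  # seconds
    samplerate: int  # Hz
    channels: int = 1
    uuid: UUID = field(default_factory=uuid4)

    def __post_init__(self) -> None:
        # Validation keeps the stored information meaningful.
        if self.duration <= 0:
            raise ValueError("duration must be positive")
        if self.samplerate <= 0:
            raise ValueError("samplerate must be positive")


rec = Recording(path="recording1.wav", duration=10.0, samplerate=44100)
print(rec.duration)  # → 10.0
```

Constructing an object with, say, a negative duration raises a `ValueError` immediately, rather than letting the bad value propagate through an analysis.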

### 2. Serialization, Storage, and Reading Functions

To promote standardized data formats for storing annotated datasets and other
information about sounds in recordings, the `soundevent` package provides
-[several
-functions](https://mbsantiago.github.io/soundevent/generated/gallery/1_saving_and_loading/)
+[several functions](https://mbsantiago.github.io/soundevent/generated/gallery/1_saving_and_loading/)
for serialization, storage, and reading of the different data classes offered.
These functions enable easy sharing of information about common objects in
bioacoustic research. By employing a consistent data format, researchers can
@@ -67,13 +66,15 @@ For detailed information on how to use the package, please refer to the

## Example Usage

-To see practical examples of how to use soundevent, you can explore the collection of examples provided in the
+To see practical examples of how to use soundevent, you can explore the
+collection of examples provided in the
[documentation's gallery](https://mbsantiago.github.io/soundevent/generated/gallery/).

## Contributing

We welcome contributions from the community to make `soundevent` even better. If
-you would like to contribute, please refer to the [contribution guidelines](CONTRIBUTING.md).
+you would like to contribute, please refer to the
+[contribution guidelines](CONTRIBUTING.md).

## License

46 changes: 23 additions & 23 deletions docs/data_schemas/index.md
@@ -9,70 +9,70 @@ the following sections:
`soundevent` equips you with tools to attach crucial information to diverse
objects encountered in bioacoustic analysis. These include:

-- [Users](descriptors/#users): Keeping reference of everyone's contribution.
-- [Tags](descriptors/#tags): Attaching semantic context to objects.
-- [Features](descriptors/#features): Numerical descriptors capturing
+- [Users](descriptors.md#users): Keeping reference of everyone's contribution.
+- [Tags](descriptors.md#tags): Attaching semantic context to objects.
+- [Features](descriptors.md#features): Numerical descriptors capturing
   continuously varying attributes.
-- [Notes](descriptors/#notes): User-written free-text annotations.
+- [Notes](descriptors.md#notes): User-written free-text annotations.

## Audio Content

Delving into the core of acoustic analysis, we have schemas for:

-- [Recordings](audio_content/#recordins): Complete audio files.
-- [Dataset](audio_content/#datasets): A collection of recordings from a common
+- [Recordings](audio_content.md#recordins): Complete audio files.
+- [Dataset](audio_content.md#datasets): A collection of recordings from a common
source.

## Acoustic Objects

Identifying distinctive sound elements within audio content, we have:

-- [Geometric Objects](acoustic_objects/#geometries): Defining Regions of
+- [Geometric Objects](acoustic_objects.md#geometries): Defining Regions of
   Interest (RoI) in the temporal-frequency plane.
-- [Sound Events](acoustic_objects/#sound_events): Individual sonic occurrences.
-- [Sequences](acoustic_objects/#sequences): Patterns of connected sound events.
-- [Clips](acoustic_objects/#clips): Fragments extracted from recordings.
+- [Sound Events](acoustic_objects.md#sound_events): Individual sonic occurrences.
+- [Sequences](acoustic_objects.md#sequences): Patterns of connected sound events.
+- [Clips](acoustic_objects.md#clips): Fragments extracted from recordings.

## Annotation

`soundevent` places emphasis on human annotation processes, covering:

-- [Sound Event Annotations](annotation/#sound_event_annotation): Expert-created
+- [Sound Event Annotations](annotation.md#sound_event_annotation): Expert-created
   markers for relevant sound events.
-- [Sequence Annotations](annotation/#sequence_annotation): User provided
+- [Sequence Annotations](annotation.md#sequence_annotation): User provided
   annotations of sequences of sound events.
-- [Clip Annotations](annotation/#clip_annotations): Annotations and notes at the
+- [Clip Annotations](annotation.md#clip_annotations): Annotations and notes at the
   clip level.
-- [Annotation Task](annotation/#annotation_task): Descriptions of tasks and the
+- [Annotation Task](annotation.md#annotation_task): Descriptions of tasks and the
   status of annotation.
-- [Annotation Project](annotation/#annotation_project): The collective
+- [Annotation Project](annotation.md#annotation_project): The collective
   description of tasks and annotations.

## Prediction

Automated processing methods also play a role, generating:

-- [Sound Event Predictions](prediction/#sound_event_predictions): Predictions
+- [Sound Event Predictions](prediction.md#sound_event_predictions): Predictions
   made during automated processing.
-- [Sequence Predictions](prediction/#sequence_predictions): Predictions of
+- [Sequence Predictions](prediction.md#sequence_predictions): Predictions of
   sequences of sound events.
-- [Clip Predictions](prediction/#clip_predictions): Collections of predictions
+- [Clip Predictions](prediction.md#clip_predictions): Collections of predictions
   and additional information at the clip level.
-- [Model Runs](prediction/#model_runs): Sets of clip predictions generated in a
+- [Model Runs](prediction.md#model_runs): Sets of clip predictions generated in a
   single run by a specific model.

## Evaluation

Assessing the accuracy of predictions is crucial, and `soundevent` provides
schemas for:

-- [Matches](evaluation/#matches): Predicted sound events overlapping with ground
+- [Matches](evaluation.md#matches): Predicted sound events overlapping with ground
   truth.
-- [Clip Evaluation](evaluation/#clip_evaluation): Information about matches and
+- [Clip Evaluation](evaluation.md#clip_evaluation): Information about matches and
   performance metrics at the clip level.
-- [Evaluation](evaluation/#evaluation_1): Comprehensive details on model
+- [Evaluation](evaluation.md#evaluation_1): Comprehensive details on model
   performance across the entire evaluation set.
-- [Evaluation Set](evaluation/#evaluation_set): Human annotations serving as
+- [Evaluation Set](evaluation.md#evaluation_set): Human annotations serving as
   ground truth.

Want to know more? Dive in for a closer look at each of these schemas.
14 changes: 9 additions & 5 deletions docs/introduction.md
@@ -14,8 +14,10 @@ Let's start with the basics. A schema is like the blueprint for your data. It's
a formal way of specifying how data is structured, allowing you to clearly
define what data objects hold and how they store it.

-> Note For a deeper dive into schemas, check out
-> [Understanding JSON Schema](https://json-schema.org/understanding-json-schema/about#what-is-a-schema).
+!!! info "More on Schemas"
+
+    For a deeper dive into schemas, check out [Understanding JSON
+    Schema](https://json-schema.org/understanding-json-schema/about#what-is-a-schema).

## Why should you care?

@@ -33,6 +35,8 @@ Now, let's discuss why these data schemas matter to us bioacousticians:
Using these hints makes your code more robust, acting like guardrails to
ensure that your data follows the rules.

-> Note For a quick introduction to what type hints are and how to use them,
-> check out this great explanation in the
-> [FastAPI documentation](https://fastapi.tiangolo.com/python-types/).
+!!! question "What are type hints?"
+
+    For a quick introduction to what type hints are and how to use them, check
+    out this great explanation in the [FastAPI
+    documentation](https://fastapi.tiangolo.com/python-types/).
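To make the admonition above concrete, here is a tiny, self-contained illustration of type hints acting as guardrails (plain Python, not part of soundevent):

```python
def clip_duration(start_time: float, end_time: float) -> float:
    """Length of an audio clip in seconds.

    The annotations document intent and let a static checker such as
    mypy flag a call like clip_duration("0.5", "2.0") before runtime.
    """
    return end_time - start_time


print(clip_duration(0.5, 2.0))  # → 1.5
```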
10 changes: 10 additions & 0 deletions docs/reference/data.md
@@ -10,6 +10,7 @@
- Feature
- Note
- Recording
+- RecordingSet
- Dataset
- SoundEvent
- Sequence
@@ -20,11 +21,13 @@
- AnnotationState
- StatusBadge
- AnnotationTask
+- AnnotationSet
- AnnotationProject
- PredictedTag
- SoundEventPrediction
- SequencePrediction
- ClipPrediction
+- PredictionSet
- ModelRun
- EvaluationSet
- Match
@@ -46,3 +49,10 @@
- MultiLineString
- MultiPolygon
- Geometry
+
+
+## Other
+
+::: soundevent.data.PathLike
+    options:
+      heading_level: 3
6 changes: 6 additions & 0 deletions docs/reference/io.md
@@ -1,3 +1,9 @@
# IO Module

::: soundevent.io
+    options:
+      group_by_category: false
+      members:
+        - DataCollections
+        - save
+        - load
17 changes: 9 additions & 8 deletions docs/user_guide/1_saving_and_loading.py
@@ -41,7 +41,7 @@
# ### Loading Datasets
# By using the loading functions provided by the `soundevent` package, you can
# directly load the data into Python and obtain a
-# [`Dataset`](../../data.md#datasets) object.
+# [`Dataset`](../../data_schemas/audio_content.md#datasets) object.

from soundevent import io

@@ -88,7 +88,8 @@
# ### Loading Annotation Projects
# The [`load`][soundevent.io.load]
# function can be used to load the annotations into Python and obtain an
-# [`AnnotationProject`](../../data.md#annotation_projects) object directly.
+# [`AnnotationProject`](../../data_schemas/annotation.md#annotation_projects) object
+# directly.

nips4b_sample = io.load(annotation_path, type="annotation_set")
print(repr(nips4b_sample))
@@ -97,15 +98,15 @@
# This object allows you to access and analyze the annotations, along with
# their associated objects.

-for task in nips4b_sample.clip_annotations:
-    clip = task.clip
+for clip_annotation in nips4b_sample.clip_annotations:
+    clip = clip_annotation.clip
     recording = clip.recording
     print(
         f"* Recording {recording.path} [from "
         f"{clip.start_time:.3f}s to {clip.end_time:.3f}s]"
     )
-    print(f"\t{len(task.annotations)} annotations found")
-    for annotation in task.annotations:
+    print(f"\t{len(clip_annotation.sound_events)} sound event annotations found")
+    for annotation in clip_annotation.sound_events:
         sound_event = annotation.sound_event
         start_time, end_time = sound_event.geometry.coordinates
         print(f"\t+ Sound event from {start_time:.3f}s to {end_time:.3f}s")
@@ -127,8 +128,8 @@
# [`save`][soundevent.io.save] and
# [`load`][soundevent.io.load] functions, respectively. The
# loading function reads the **AOEF** file and returns a
-# [`ModelRun`](../../data.md#model_run) object that can be used for further
-# analysis.
+# [`ModelRun`](../../data_schemas/prediction.md#model_run) object that can be used
+# for further analysis.
#
# By utilizing the saving and loading functions provided by soundevent, you can
# easily manage and exchange acoustic data objects in AOEF format, promoting
2 changes: 1 addition & 1 deletion docs/user_guide/example_dataset.json
@@ -1 +1 @@
{"version":"1.1.0","created_on":"2023-11-24T17:51:39.492335","data":{"collection_type":"dataset","uuid":"b1096756-eea2-4489-9e6a-b98b559647bb","created_on":"2023-11-21T13:43:14.742002","recordings":[{"id":0,"uuid":"89957d47-f67d-4bfe-8352-bf0fe5a8ce3e","path":"recording1.wav","duration":10.0,"channels":1,"samplerate":44100,"time_expansion":10.0,"hash":"1234567890abcdef","date":"2021-01-01","time":"21:34:56","latitude":12.345,"longitude":34.567,"tags":[0,1,2],"features":{"SNR":10.0,"ACI":0.5},"notes":[{"uuid":"2931b864-43e4-4fb1-aae1-a214dccca6e3","message":"This is a note.","created_by":0,"is_issue":false,"created_on":"2023-11-21T13:43:14.742073"}],"owners":[1]},{"id":1,"uuid":"bd30f886-3abb-475b-aacb-c7148a4d4420","path":"recording2.wav","duration":8.0,"channels":1,"samplerate":441000,"time_expansion":10.0,"hash":"234567890abcdef1","date":"2021-01-02","time":"19:34:56","latitude":13.345,"longitude":32.567,"tags":[3,4,5],"features":{"SNR":7.0,"ACI":0.3},"notes":[{"uuid":"713b6c15-0e3d-4cc5-acc6-3f1093209a40","message":"Unsure about the species.","created_by":0,"is_issue":false,"created_on":"2023-11-21T13:43:14.742147"}],"owners":[1]}],"tags":[{"id":0,"key":"species","value":"Myotis myotis"},{"id":1,"key":"sex","value":"female"},{"id":2,"key":"behaviour","value":"foraging"},{"id":3,"key":"species","value":"Eptesicus serotinus"},{"id":4,"key":"sex","value":"male"},{"id":5,"key":"behaviour","value":"social calls"}],"users":[{"id":0,"uuid":"04ef3927-3a3d-40df-9d6e-2cc5e21482a0","name":"John Doe"},{"id":1,"uuid":"d6eb0862-a619-4919-992c-eb3625692c13","email":"[email protected]","name":"Data Collector"}],"name":"test_dataset","description":"A test dataset"}}
{"version":"1.1.0","created_on":"2023-11-27T19:45:58.447521","data":{"uuid":"b1096756-eea2-4489-9e6a-b98b559647bb","collection_type":"dataset","created_on":"2023-11-21T13:43:14.742002","recordings":[{"uuid":"89957d47-f67d-4bfe-8352-bf0fe5a8ce3e","path":"recording1.wav","duration":10.0,"channels":1,"samplerate":44100,"time_expansion":10.0,"hash":"1234567890abcdef","date":"2021-01-01","time":"21:34:56","latitude":12.345,"longitude":34.567,"tags":[0,1,2],"features":{"SNR":10.0,"ACI":0.5},"notes":[{"uuid":"2931b864-43e4-4fb1-aae1-a214dccca6e3","message":"This is a note.","created_by":"04ef3927-3a3d-40df-9d6e-2cc5e21482a0","is_issue":false,"created_on":"2023-11-21T13:43:14.742073"}],"owners":["d6eb0862-a619-4919-992c-eb3625692c13"]},{"uuid":"bd30f886-3abb-475b-aacb-c7148a4d4420","path":"recording2.wav","duration":8.0,"channels":1,"samplerate":441000,"time_expansion":10.0,"hash":"234567890abcdef1","date":"2021-01-02","time":"19:34:56","latitude":13.345,"longitude":32.567,"tags":[3,4,5],"features":{"SNR":7.0,"ACI":0.3},"notes":[{"uuid":"713b6c15-0e3d-4cc5-acc6-3f1093209a40","message":"Unsure about the species.","created_by":"04ef3927-3a3d-40df-9d6e-2cc5e21482a0","is_issue":false,"created_on":"2023-11-21T13:43:14.742147"}],"owners":["d6eb0862-a619-4919-992c-eb3625692c13"]}],"tags":[{"id":0,"key":"species","value":"Myotis myotis"},{"id":1,"key":"sex","value":"female"},{"id":2,"key":"behaviour","value":"foraging"},{"id":3,"key":"species","value":"Eptesicus serotinus"},{"id":4,"key":"sex","value":"male"},{"id":5,"key":"behaviour","value":"social calls"}],"users":[{"uuid":"04ef3927-3a3d-40df-9d6e-2cc5e21482a0","name":"John Doe"},{"uuid":"d6eb0862-a619-4919-992c-eb3625692c13","email":"[email protected]","name":"Data Collector"}],"name":"test_dataset","description":"A test dataset"}}
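A notable change in this fixture: `created_by` and `owners` now hold user UUID strings instead of integer ids. A minimal stdlib-only sketch of resolving note authors under the new layout follows; the sample data is inlined and heavily abridged, not the full file.

```python
import json

# Abridged stand-in for the new dataset layout, where `created_by`
# references a user by UUID string rather than by integer id.
payload = {
    "data": {
        "users": [
            {"uuid": "04ef3927-3a3d-40df-9d6e-2cc5e21482a0", "name": "John Doe"},
        ],
        "recordings": [
            {
                "path": "recording1.wav",
                "notes": [
                    {
                        "message": "This is a note.",
                        "created_by": "04ef3927-3a3d-40df-9d6e-2cc5e21482a0",
                    }
                ],
            }
        ],
    }
}

# Round-trip through JSON to mimic reading the file from disk.
data = json.loads(json.dumps(payload))["data"]

# Build a UUID-to-name lookup, then resolve each note's author.
users = {user["uuid"]: user["name"] for user in data["users"]}
for recording in data["recordings"]:
    for note in recording["notes"]:
        print(f'{users[note["created_by"]]}: {note["message"]}')
# → John Doe: This is a note.
```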
2 changes: 1 addition & 1 deletion docs/user_guide/nips4b_plus_sample.json

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions mkdocs.yml
@@ -29,6 +29,19 @@ theme:
- navigation.indexes
- navigation.top
- toc.follow
+  palette:
+    # Palette toggle for light mode
+    - scheme: default
+      primary: blue grey
+      toggle:
+        icon: material/brightness-7
+        name: Switch to dark mode
+    # Palette toggle for dark mode
+    - scheme: slate
+      primary: blue grey
+      toggle:
+        icon: material/brightness-4
+        name: Switch to light mode
plugins:
- search
- gallery:
@@ -63,6 +76,7 @@ plugins:
show_root_heading: true
show_category_heading: true
show_symbol_type_heading: true
+show_if_no_docstring: true
docstring_style: "numpy"
docstring_section_style: "table"
summary: true
