I am experimenting with very large datasets (~1e6 to 1e7 points). It seems that storing the data as (threshold, label) tuples and then computing the measures and confusion matrices in pure Python is much, much slower than keeping the data in NumPy arrays (where available) and doing vectorized operations on the arrays. I don't know if there is interest in something like this.
I might attempt to implement something like that to be able to handle the large datasets.
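To illustrate what I have in mind, here is a minimal sketch (the function name and signature are just placeholders, not an existing API): it keeps scores and labels in NumPy arrays and computes confusion-matrix counts for many thresholds at once via broadcasting, instead of iterating over (threshold, label) tuples in Python.

```python
import numpy as np

def confusion_counts(scores, labels, thresholds):
    """Return TP, FP, FN, TN count arrays, one entry per threshold.

    scores:     float array of predicted scores, shape (n,)
    labels:     0/1 array of true labels, shape (n,)
    thresholds: float array of decision thresholds, shape (t,)
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels).astype(bool)
    # Broadcast (t, 1) thresholds against (n,) scores -> (t, n) boolean
    # prediction matrix; row i holds the predictions at thresholds[i].
    pred = scores[None, :] >= np.asarray(thresholds)[:, None]
    tp = (pred & labels).sum(axis=1)
    fp = (pred & ~labels).sum(axis=1)
    fn = (~pred & labels).sum(axis=1)
    tn = (~pred & ~labels).sum(axis=1)
    return tp, fp, fn, tn
```

Note that the (t, n) boolean matrix can get large; for 1e7 points with many thresholds it would likely be better to sort the scores once and derive the counts from cumulative sums, but the broadcast version already shows the kind of speedup vectorization gives over per-tuple Python loops.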