Structure Refactor and Duplicate Trait Support #75

Draft
wants to merge 3 commits into
base: main
from


theelderbeever
Collaborator

This PR has two main goals:

  1. Refactor the library structure with lessons learned to help with future extensibility and clarity
  2. Refactor the underlying data structures used to allow for duplicate attribute names in metadata

Additionally, this PR makes a breaking change: the library's entry point moves from a scorer interface to a collection.
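
To illustrate the second goal: a dict keyed by attribute name silently drops repeated names, whereas a list of (name, value) records keeps them. The sketch below is only an assumption about the general idea; `Attribute` and `count_attributes` are hypothetical names, not the types this PR introduces.

```python
from collections import Counter
from typing import NamedTuple


class Attribute(NamedTuple):
    # Hypothetical record type for illustration; not the PR's actual model.
    name: str
    value: str


def count_attributes(tokens: list[list[Attribute]]) -> Counter:
    """Count (name, value) pairs across tokens, keeping duplicate names."""
    counts: Counter = Counter()
    for attributes in tokens:
        counts.update(attributes)
    return counts


# A token with two "Accessory" traits.
token = [Attribute("Accessory", "Hat"), Attribute("Accessory", "Scarf")]

# Keyed by name, the second trait overwrites the first.
as_dict = {attribute.name: attribute.value for attribute in token}

# Stored as a list of records, both traits survive and are counted.
counts = count_attributes([token])
```

With the list-based layout, both `Accessory` traits contribute to the counts, which is the behavior a duplicate-trait collection needs.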

    hooks:
      - id: flake8
        additional_dependencies: [flake8-bugbear, pep8-naming]
  # - repo: https://gitlab.com/pycqa/flake8
Contributor

You need to fix this in the GitHub Actions workflow as well; otherwise the build fails.

) -> list[AttributeStatistic]:
    return [
        {
            **attr,
Contributor

Let's not use the ** syntax; I think it's hard to read.
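
One possible alternative, sketched under assumptions: copy the dict and assign the extra field explicitly. The helper name `with_probability` and the `count` and `probability` keys are hypothetical, since the fields merged in the diff are not shown in full here.

```python
def with_probability(attr: dict, total: int) -> dict:
    """Return a copy of attr with an added "probability" field."""
    # Hypothetical field names; the real AttributeStatistic keys may differ.
    result = dict(attr)  # shallow copy, so the caller's dict is not mutated
    result["probability"] = attr["count"] / total
    return result


original = {"name": "Hat", "count": 2}
enriched = with_probability(original, total=8)
```

This does the same job as `{**attr, "probability": ...}` but names each step.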

from open_rarity.models.collections import AttributeCounted, AttributeStatistic


def information_content(
Contributor

This function needs documentation.
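
As a rough idea of what a documented version could look like, here is a minimal sketch. The signature below is an assumption made for illustration (the PR's actual parameters are not shown in this hunk); it uses the standard definition of information content, the negative base-2 logarithm of a probability.

```python
import math


def information_content(count: int, total: int) -> float:
    """Return the information content, in bits, of an attribute value.

    Hypothetical signature for illustration only.

    Args:
        count: Number of tokens that have the attribute value.
        total: Number of tokens in the collection.

    Returns:
        -log2(count / total); rarer values yield larger scores.

    Raises:
        ValueError: If count is not in the range 1..total.
    """
    if not 0 < count <= total:
        raise ValueError("count must satisfy 0 < count <= total")
    return -math.log2(count / total)
```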

    """
    return list(
        chain(
            *[
Contributor

Let's not use * and **; I always find them unintuitive in code.
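
For flattening specifically, the standard library offers `itertools.chain.from_iterable`, which avoids the star unpacking; a nested comprehension works too. The `nested` data below is made up for illustration.

```python
from itertools import chain

# Illustrative data only: a list of per-token attribute lists.
nested = [["a", "b"], ["c"], []]

# Current style: unpack the outer list into chain's arguments.
flat_star = list(chain(*nested))

# Same result without unpacking.
flat_explicit = list(chain.from_iterable(nested))

# Or a plain nested comprehension.
flat_comprehension = [item for group in nested for item in group]
```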

    _description_
    """
    d = defaultdict(int)
    for key, count in groupapply(tokens, extract_token_name_key, "count").items():
Contributor

Not in favor of additional dependencies - can we have a list of dependencies and discuss what's needed and what's not?
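
If the `groupapply(..., "count")` call only counts tokens per key (an assumption; its exact semantics are not shown in this hunk), `collections.Counter` from the standard library could cover it. The `extract_token_name_key` body and the token shape below are hypothetical stand-ins.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable


def extract_token_name_key(token: dict) -> str:
    # Hypothetical stand-in for the PR's key function.
    return token["name"]


def count_by_key(tokens: Iterable[dict], key: Callable[[dict], Hashable]) -> Counter:
    """Count tokens per key using only the standard library."""
    return Counter(key(token) for token in tokens)


tokens = [{"name": "Hat"}, {"name": "Hat"}, {"name": "Scarf"}]
counts = count_by_key(tokens, extract_token_name_key)
```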

from open_rarity.models.tokens import TokenAttributeStatistic


def unique_trait_count(
Contributor

The name utc clashes with the UTC timestamp abbreviation; perhaps a clearer name would help?


  def __init__(self) -> None:
      # OpenRarity uses InformationContent as the scoring algorithm of choice.
-     self.handler = InformationContentScoringHandler()
+     self.handler = IC()
Contributor

I think we should be more explicit here; I prefer InformationContent.

2 participants