Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target property value #170

Closed
wants to merge 5 commits into from
Closed

Target property value #170

wants to merge 5 commits into from

Conversation

kspieks
Copy link
Collaborator

@kspieks kspieks commented Jan 6, 2024

As suggested in issue #168, this PR introduces support for an extrapolative sampler that splits the data by target property. A flag controls whether the testing set contains data points with the largest or smallest target y values.

To document a few general notes,

• This sampler sorts the indices by property value. Since no clusters are created, the TargetProperty class actually behaves as an interpolation sampler behind the scenes when it's called in main.py.
• The other consequence of sorting is that the results are always deterministic. This sampler does not use the random_state argument. If users want to obtain multiple splits, they can vary the amount of data that is assigned to the training vs. testing set

Let me know if I missed anything. Looking forward to your feedback!

@kspieks
Copy link
Collaborator Author

kspieks commented Feb 15, 2024

Closing since this code has been added in PR #172

@kspieks kspieks closed this Feb 15, 2024
kspieks added a commit that referenced this pull request Feb 15, 2024
This PR supersedes #165 and #170 and resolves #168.

#165 implemented a sampler which would divide molecules based on their
molecular weight (either ascending or descending), and #170 added
basically the same thing but for arbitrary target values. This PR just
refactors #170 as if #165 where already implemented, reducing code
duplication quite a bit.

There is also some small internal cleanup, namely
00eeca6,
98b3efd, and
3b8bca1 (as well as some long-overdue
CI updates).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant