Using dbpedia-entities-openai-1M as a dataset #1
Merged
Changes from 10 commits
14 commits
3cdfae7 Trying to use dbpedia-entities-openai-1M as a dataset (binarycleric)
a47ecce Trying forking again (binarycleric)
e168d9a Using huggingface to download the dataset (binarycleric)
cdb4351 Don't just call the CLI directly in the script (binarycleric)
eacfee7 Back to promises. (binarycleric)
035264e Code cleanup, dropping instance_eval on Database (binarycleric)
4975d51 Using the Parallel gem (binarycleric)
97f0a75 Fixing workmem_stress (binarycleric)
3a846f7 Cleanup (binarycleric)
51b47dc Fixing a typo (binarycleric)
ffb9a83 Fixing tests (binarycleric)
0627787 Using Open3 instead of backticks (binarycleric)
f3b0ea8 Using secure temp files (binarycleric)
e7c48f8 Using hugging face cache directory (binarycleric)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,43 +1,49 @@ |
| | | # frozen_string_literal: true |
| | | |
| | | option(:total_vectors, 5_000_000) |
| | | option(:vector_dimension, 1_536) |
| | | |
| | | helpers do |
| | | def random_vector(size: options.vector_dimension) |
| | | def random_vector(size: 1_536) |
| | | Array.new(size) { rand(-1.0..1.0) } |
| | | end |
| | | |
| | | def download_from_hugging_face(repo, local_dir="/tmp/#{repo}") |
| | | `hf download #{repo} --repo-type=dataset --local-dir #{local_dir}` |
binarycleric marked this conversation as resolved.
Outdated
| Suggested change |
|---|
| `hf download #{repo} --repo-type=dataset --local-dir #{local_dir}` |
| stdout, status = Open3.capture2("hf", "download", repo, "--repo-type=dataset", "--local-dir", local_dir) |
| stdout |
binarycleric marked this conversation as resolved.
Outdated
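The suggestion above replaces the backtick shell call with `Open3.capture2`, which passes each argument to the process directly instead of interpolating `repo` and `local_dir` into a shell string. As written, the suggested lines capture `status` but never inspect it. A minimal sketch of one way to surface a failed download follows. The helper name `run_command` is hypothetical and not part of this PR, and the example runs the current Ruby interpreter as a stand-in for the `hf` CLI so that it works without `hf` installed.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper (not from the PR): runs a command without a shell
# and raises if the process exits with a non-zero status.
def run_command(*argv)
  stdout, status = Open3.capture2(*argv)
  unless status.success?
    raise "command failed (exit #{status.exitstatus}): #{argv.join(' ')}"
  end
  stdout
end

# Stand-in for the `hf download ...` call: run the current Ruby interpreter.
puts run_command(RbConfig.ruby, "-e", "print 'ok'")
```

In the PR's helper, the same pattern would presumably wrap the `"hf", "download", repo, ...` argument list, assuming a failed download should abort the run rather than continue silently.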
The method calls `run_action(query).sql`, which assumes the result responds to `.sql`. This works when the query block returns a Sequel dataset, but it could fail when the block returns any other type. Consider adding error handling or type checking.
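One way to act on this comment is to check for `.sql` before calling it and to fail with a descriptive error otherwise. The sketch below is only an illustration. `sql_for` and `FakeDataset` are hypothetical names, `FakeDataset` stands in for a Sequel dataset so the example runs without the Sequel gem, and accepting a plain SQL string is an assumption rather than behavior confirmed by the PR.

```ruby
# Hypothetical stand-in for a Sequel dataset: any object that responds to #sql.
FakeDataset = Struct.new(:sql)

# Hypothetical helper (not from the PR): returns SQL text for a query result,
# or raises a descriptive error when the result cannot produce SQL.
def sql_for(result)
  return result.sql if result.respond_to?(:sql)
  # Assumption: a raw SQL string is also an acceptable result.
  return result if result.is_a?(String)

  raise TypeError, "expected a dataset or SQL string, got #{result.class}"
end

puts sql_for(FakeDataset.new("SELECT 1"))
puts sql_for("SELECT 2")
```

Using `respond_to?` rather than a class check keeps the guard duck-typed, so any dataset-like object that exposes `.sql` would pass.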