Add Prodigy OpenAI project #180
So that converter auto works
I want to include at least the test predictions from OpenAI so that others can reproduce the results.
@@ -0,0 +1,19 @@
From the text below, extract the following entities in the following format:
{# whitespace #}
Cell: <comma delimited list of strings>
Mainly asking out of curiosity: is there a reason you decided to write a custom prompt?
This one's more for convenience when I was testing different templates. I find it a bit of a hassle to type each label in the CLI. The current project.yml setup already does that, but for now the labels aren't being rendered into the template because it has no reference to labels. I'll update this later.
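For illustration, here is a minimal sketch of what rendering the labels into the prompt could look like, so that they don't have to be typed out by hand. It uses plain Python string formatting rather than the project's actual template engine. The function name `build_prompt`, the trailing `Text:` section, and every label other than `Cell` are assumptions made up for the example.

```python
def build_prompt(labels, text):
    """Build a zero-shot NER prompt with one output line per label.

    Hypothetical helper: only the header line and the
    "<label>: <comma delimited list of strings>" format come from the
    template in this PR; the "Text:" section is an assumption.
    """
    header = (
        "From the text below, extract the following entities "
        "in the following format:"
    )
    # One format line per label, e.g. "Cell: <comma delimited list of strings>"
    label_lines = [f"{label}: <comma delimited list of strings>" for label in labels]
    return "\n".join([header, *label_lines, "", "Text:", text])


if __name__ == "__main__":
    # "Virus" is a made-up label used only for demonstration.
    print(build_prompt(["Cell", "Virus"], "HeLa cells were infected."))
```

With something like this, the label list only needs to be defined once (for example in project.yml) and the prompt follows from it.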
outputs:
- corpus/anem-test_texts.jsonl

- name: "openai-predict"
This is another curiosity: did you happen to keep track of how expensive it was to run this query?
No, I haven't :/ Time-wise, though, I ran the query for an hour and a half, with intermittent stops because of connection errors (restarting each time with --resume). I haven't tracked how much money it cost.
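As a rough illustration of the resume behaviour described above, the sketch below skips inputs that already have a saved prediction, so a run interrupted by a connection error can be restarted without paying for the same queries twice. This is not how the --resume flag is implemented; the function names, the JSONL record layout, and the `predict` callable are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def example_key(text):
    """Stable identifier for an input text (hypothetical scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def predict_with_resume(texts, output_path, predict):
    """Append predictions to a JSONL file, skipping texts already done.

    `predict` is any callable taking a text and returning a
    JSON-serializable response; it stands in for the API call.
    """
    output_path = Path(output_path)
    done = set()
    if output_path.exists():
        # Collect the keys of examples saved by earlier runs.
        with output_path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    done.add(json.loads(line)["key"])
    with output_path.open("a", encoding="utf-8") as f:
        for text in texts:
            key = example_key(text)
            if key in done:
                continue  # answered in a previous run
            record = {"key": key, "text": text, "response": predict(text)}
            f.write(json.dumps(record) + "\n")
            f.flush()  # keep finished work on disk if the next call fails
            done.add(key)
```

If `predict` raises partway through, the records written so far stay in the file, and calling the function again with the same arguments continues from the first unanswered text.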
This is useful when we want to hydrate a Prodigy dataset with the data we need.
When downsampling, the labels may not always be equal.
This is extra material for our Prodigy OpenAI blog post, where we also benchmark zero-shot and supervised approaches for NER. I'll add more details in the description as we go along.