Add Prodigy OpenAI project #180

Closed
wants to merge 33 commits into from

Conversation

ljvmiranda921
Contributor

This is extra material for our Prodigy OpenAI blog post, where we also benchmark zero-shot and supervised approaches for NER. I'll add more details to the description as we go along.

Description

  • TODO

Types of change

  • New Project

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • I ran the update scripts in the .github folder, and all the configs and docs are up-to-date.

@ljvmiranda921 ljvmiranda921 self-assigned this Feb 21, 2023
@@ -0,0 +1,19 @@
From the text below, extract the following entities in the following format:
{# whitespace #}
Cell: <comma delimited list of strings>
Contributor


Mainly asking out of curiosity: is there a reason you decided to write a custom prompt?

Contributor Author


This one's mostly for convenience while I was testing different templates; I find typing each label in the CLI a bit of a hassle. The current project.yml setup already specifies the labels, but for now they aren't rendered into the template because the template has no reference to labels. I'll update this later.
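
For illustration, here is a minimal sketch of what rendering the labels into the prompt could look like, so that nothing is hard-coded. It uses plain Python string building rather than the template file in this PR. `build_prompt`, the `Tissue` label, and the trailing `Text:` section are assumptions made for the example; only the first line and the `Cell: ...` line come from the diff above.

```python
def build_prompt(labels, text):
    """Build a zero-shot NER prompt with one answer line per label.

    Hypothetical helper: the real template in this PR is a separate file,
    and everything after the label lines here is an assumed layout.
    """
    lines = [
        "From the text below, extract the following entities in the following format:",
        "",
    ]
    # One "<Label>: <comma delimited list of strings>" line per label,
    # instead of hard-coding each label in the template.
    lines += [f"{label}: <comma delimited list of strings>" for label in labels]
    # Assumed layout for the input text section.
    lines += ["", "Text:", '"""', text, '"""']
    return "\n".join(lines)


if __name__ == "__main__":
    # "Cell" appears in the diff; "Tissue" is a made-up second label.
    print(build_prompt(["Cell", "Tissue"], "Some sentence to annotate."))
```

With this shape, adding or removing a label only changes the list passed in, not the prompt text itself.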

outputs:
- corpus/anem-test_texts.jsonl

- name: "openai-predict"
Contributor


Another question out of curiosity: did you happen to keep track of how expensive it was to run this query?

Contributor Author


No, I haven't :/ Time-wise, though, the query ran for about an hour and a half, with intermittent stops because of connection errors (which is why I used --resume). I haven't tracked how much money it cost.
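
One way the cost could be tracked in a future run is to accumulate the token counts that come back with each API response and multiply by the per-token price. The sketch below is not part of this PR. It assumes each response carries a `usage` mapping with `prompt_tokens` and `completion_tokens`, and `UsageTracker` and `usd_per_1k_tokens` are hypothetical names, with the price left as a parameter because it depends on the model and the current pricing.

```python
class UsageTracker:
    """Accumulate token usage across API responses (hypothetical helper)."""

    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0

    def add(self, usage):
        # `usage` is assumed to look like:
        # {"prompt_tokens": 120, "completion_tokens": 30}
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens

    def cost(self, usd_per_1k_tokens):
        # The price is supplied by the caller; no real price is assumed here.
        return self.total_tokens / 1000 * usd_per_1k_tokens


if __name__ == "__main__":
    tracker = UsageTracker()
    tracker.add({"prompt_tokens": 100, "completion_tokens": 20})
    tracker.add({"prompt_tokens": 300, "completion_tokens": 80})
    # Placeholder price, for illustration only.
    print(tracker.total_tokens, tracker.cost(usd_per_1k_tokens=0.02))
```

Because the run was interrupted and resumed, the totals would need to be persisted between runs (for example, written next to the output file) to give a figure for the whole query.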

This is useful when we want to hydrate a Prodigy dataset
with the data we need.
@svlandeg svlandeg added the enhancement New feature or request label Feb 23, 2023
Labels
enhancement New feature or request
Projects
None yet
4 participants