Commit

Add initial description
ljvmiranda921 committed Feb 27, 2023
1 parent edb2591 commit 6127f89
Showing 2 changed files with 79 additions and 2 deletions.
41 changes: 40 additions & 1 deletion integrations/prodigy_openai/README.md
@@ -1,6 +1,45 @@
<!-- SPACY PROJECT: AUTO-GENERATED DOCS START (do not remove) -->

# 🪐 spaCy Project: Benchmarking OpenAI datasets
# 🪐 spaCy Project: Using Prodigy's OpenAI recipes for a bio NER task

This project showcases Prodigy's OpenAI recipe for named-entity recognition
using the [Anatomical Entity Mention (AnEM)
dataset](https://aclanthology.org/W12-4304/). The dataset contains 11
anatomical entity types (e.g., *organ*, *tissue*, *cellular component*)
based on the Common Anatomy Reference Ontology. The dataset statistics (and
some examples) are shown below:

<!-- TODO: insert dataset statistics -->

In this project, we trained a transformer-based NER model and compared it with the zero-shot
predictions of GPT-3. We wanted to test how large language models fare in a specialized domain and
suggest how they can be used to improve our annotations.

<!-- TODO: insert zero-shot and supervised learning diagrams -->
<!-- TODO: insert results -->

The transformer and zero-shot pipelines are defined by the `ner` and `gpt` workflows, respectively.
To run the `gpt` workflow, make sure to [install Prodigy](https://prodi.gy/docs/install) as well
as a few additional Python dependencies:

```
python -m pip install prodigy -f https://XXXX-XXXX-XXXX-XXXX@download.prodi.gy
python -m pip install -r requirements.txt
```

Replace `XXXX-XXXX-XXXX-XXXX` with your personal Prodigy license key.

Then, [create a new API key from
openai.com](https://platform.openai.com/account/api-keys) or fetch an existing
one. Record the secret key as well as the organization key and make sure these
are available as environment variables. For instance, set them in a `.env`
file in the root directory:

```
PRODIGY_OPENAI_ORG = "org-..."
PRODIGY_OPENAI_KEY = "sk-..."
```
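
Before running the `gpt` workflow, it can help to confirm that both variables are actually visible to Python. The snippet below is a minimal sketch, not part of the project: the helper name `missing_openai_vars` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment by whatever mechanism you use. It relies only on the standard library.

```python
import os

# Variable names taken from the `.env` example above.
REQUIRED_VARS = ("PRODIGY_OPENAI_ORG", "PRODIGY_OPENAI_KEY")


def missing_openai_vars(env=None):
    """Return the names of required variables that are unset or blank.

    `env` is any mapping of names to values; it defaults to os.environ.
    This is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_openai_vars()` with no arguments checks the current process environment; an empty list means both keys are set.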


## 📋 project.yml

40 changes: 39 additions & 1 deletion integrations/prodigy_openai/project.yml
@@ -1,4 +1,42 @@
title: "Benchmarking OpenAI datasets"
title: "Using Prodigy's OpenAI recipes for a bio NER task"
description: |
This project showcases Prodigy's OpenAI recipe for named-entity recognition
using the [Anatomical Entity Mention (AnEM)
dataset](https://aclanthology.org/W12-4304/). The dataset contains 11
anatomical entity types (e.g., *organ*, *tissue*, *cellular component*)
based on the Common Anatomy Reference Ontology. The dataset statistics (and
some examples) are shown below:
<!-- TODO: insert dataset statistics -->
In this project, we trained a transformer-based NER model and compared it with the zero-shot
predictions of GPT-3. We wanted to test how large language models fare in a specialized domain and
suggest how they can be used to improve our annotations.
<!-- TODO: insert zero-shot and supervised learning diagrams -->
<!-- TODO: insert results -->
The transformer and zero-shot pipelines are defined by the `ner` and `gpt` workflows, respectively.
To run the `gpt` workflow, make sure to [install Prodigy](https://prodi.gy/docs/install) as well
as a few additional Python dependencies:
```
python -m pip install prodigy -f https://XXXX-XXXX-XXXX-XXXX@download.prodi.gy
python -m pip install -r requirements.txt
```
Replace `XXXX-XXXX-XXXX-XXXX` with your personal Prodigy license key.
Then, [create a new API key from
openai.com](https://platform.openai.com/account/api-keys) or fetch an existing
one. Record the secret key as well as the organization key and make sure these
are available as environment variables. For instance, set them in a `.env`
file in the root directory:
```
PRODIGY_OPENAI_ORG = "org-..."
PRODIGY_OPENAI_KEY = "sk-..."
```
directories:
- "assets"