Commit

fix deps + imports
markus583 committed Jul 28, 2024
1 parent 16eda81 commit 371e31c
Showing 6 changed files with 13 additions and 10 deletions.
8 changes: 3 additions & 5 deletions README.md
@@ -5,7 +5,7 @@ This repository allows you to segment text into sentences or other semantic unit
- **SaT** — [Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation](https://arxiv.org/abs/2406.16678) by Markus Frohmann, Igor Sterner, Benjamin Minixhofer, Ivan Vulić and Markus Schedl (**state-of-the-art, encouraged**).
- **WtP** — [Where’s the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation](https://aclanthology.org/2023.acl-long.398/) by Benjamin Minixhofer, Jonas Pfeiffer and Ivan Vulić (*previous version, maintained for reproducibility*).

-The namesake WtP is maintained for reproducibility. Our new followup SaT provides robust, efficient and adaptable sentence segmentation across 85 languages at higher performance and less compute cost. Check out the **state-of-the-art** results in 8 distinct corpora and 85 languages demonstrated in our [Segment any Text paper](https://arxiv.org/abs/2406.16678).
+The namesake WtP is maintained for consistency. Our new followup SaT provides robust, efficient and adaptable sentence segmentation across 85 languages at higher performance and less compute cost. Check out the **state-of-the-art** results in 8 distinct corpora and 85 languages demonstrated in our [Segment any Text paper](https://arxiv.org/abs/2406.16678).

![System Figure](./configs/system-fig.png)

@@ -154,11 +154,9 @@ Clone the repository and install requirements:

```
git clone https://github.com/segment-any-text/wtpsplit
-cd segment-any-text
-pip install -e .
+cd wtpsplit
 pip install -r requirements.txt
-cd adapters
-pip install -e .
+pip install adapters==0.2.1 --no-dependencies
 cd ..
```

7 changes: 6 additions & 1 deletion requirements.txt
@@ -18,4 +18,9 @@ cohere
replicate
onnx
onnxruntime
-torchinfo
+torchinfo
+mosestokenizer
+cached_property
+tqdm
+skops
+pandas
2 changes: 1 addition & 1 deletion setup.py
@@ -17,7 +17,7 @@
"pandas>=1",
"cached_property", # for Py37
"mosestokenizer",
-"adapters==0.2.1"
+"adapters"
],
url="https://github.com/segment-any-text/wtpsplit",
package_data={"wtpsplit": ["data/*"]},
2 changes: 1 addition & 1 deletion wtpsplit/evaluation/intrinsic_pairwise.py
@@ -22,7 +22,7 @@
from wtpsplit.extract import PyTorchWrapper
from wtpsplit.extract_batched import extract_batched
from wtpsplit.utils import Constants, token_to_char_probs
-from wtpsplit.evaluation.intrinsic import compute_statistics
+from wtpsplit.evaluation.adapt import compute_statistics

logger = logging.getLogger()
logger.setLevel(logging.INFO)
2 changes: 1 addition & 1 deletion wtpsplit/evaluation/intrinsic_ted.py
@@ -18,7 +18,7 @@
import wtpsplit.models # noqa: F401
from wtpsplit.evaluation import get_labels, train_mixture
from wtpsplit.evaluation.evaluate_sepp_nlg_subtask1 import evaluate_subtask1
-from wtpsplit.evaluation.intrinsic import process_logits
+from wtpsplit.evaluation.adapt import process_logits
from wtpsplit.extract import PyTorchWrapper, extract
from wtpsplit.utils import Constants, sigmoid

2 changes: 1 addition & 1 deletion wtpsplit/evaluation/punct_annotation_wtp.py
@@ -9,7 +9,7 @@
import wtpsplit.models # noqa
from wtpsplit.extract import PyTorchWrapper
from wtpsplit.utils import Constants
-from wtpsplit.evaluation.intrinsic import process_logits
+from wtpsplit.evaluation.adapt import process_logits


@dataclass