Release notes for v0.37.0 (#2577)
lintool authored Aug 23, 2024
1 parent 210421d commit c2216f8
Showing 4 changed files with 111 additions and 14 deletions.
15 changes: 10 additions & 5 deletions README.md
@@ -22,13 +22,13 @@ Anserini is packaged in a self-contained fatjar, which also provides the simples
Assuming you've already got Java installed, fetch the fatjar:

```bash
wget https://repo1.maven.org/maven2/io/anserini/anserini/0.36.1/anserini-0.36.1-fatjar.jar
wget https://repo1.maven.org/maven2/io/anserini/anserini/0.37.0/anserini-0.37.0-fatjar.jar
```

The following commands will generate a SPLADE++ ED run with the dev queries (encoded using ONNX) on the MS MARCO passage corpus:

```bash
java -cp anserini-0.36.1-fatjar.jar io.anserini.search.SearchCollection \
java -cp anserini-0.37.0-fatjar.jar io.anserini.search.SearchCollection \
-index msmarco-v1-passage.splade-pp-ed \
-topics msmarco-v1-passage.dev \
-encoder SpladePlusPlusEnsembleDistil \
@@ -39,18 +39,22 @@ java -cp anserini-0.36.1-fatjar.jar io.anserini.search.SearchCollection \
To evaluate:

```bash
java -cp anserini-0.36.1-fatjar.jar trec_eval -c -M 10 -m recip_rank msmarco-passage.dev-subset run.msmarco-v1-passage-dev.splade-pp-ed-onnx.txt
java -cp anserini-0.37.0-fatjar.jar trec_eval -c -M 10 -m recip_rank msmarco-passage.dev-subset run.msmarco-v1-passage-dev.splade-pp-ed-onnx.txt
```
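
For intuition about what `-M 10 -m recip_rank` reports, here is a minimal sketch (not part of Anserini or `trec_eval`) that approximates MRR@10 with `awk`. The function name `mrr_at_10` is hypothetical. It assumes local TREC-format files: qrels lines of the form `qid 0 docid rel` and run lines of the form `qid Q0 docid rank score tag`, with 1-based ranks that can be trusted. It averages over queries with at least one relevant judgment, which may differ from `trec_eval` in edge cases.

```bash
# Sketch only: approximate MRR@10 from a qrels file and a run file.
# Assumption: the rank column (field 4 of the run file) is 1-based and correct.
mrr_at_10() {
  local qrels="$1" run="$2"
  awk '
    NR == FNR {                        # first file: relevance judgments
      if ($4 > 0) {
        rel[$1 SUBSEP $3] = 1          # remember relevant (query, doc) pairs
        topics[$1] = 1                 # queries with at least one relevant doc
      }
      next
    }
    $4 >= 1 && $4 <= 10 && (($1 SUBSEP $3) in rel) {   # second file: the run
      rr = 1 / $4
      if (rr > best[$1]) best[$1] = rr # keep the best reciprocal rank per query
    }
    END {
      for (q in topics) { n++; sum += best[q] }
      if (n > 0) printf "%.4f\n", sum / n
    }
  ' "$qrels" "$run"
}
```

Calling `mrr_at_10 qrels.txt run.txt` prints a single number. Note that `msmarco-passage.dev-subset` in the command above is a symbol resolved by the fatjar, so this sketch needs a qrels file on disk; use `trec_eval` for any reported results.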

See [detailed instructions](docs/fatjar-regressions/fatjar-regressions-v0.36.1.md) for the current fatjar release of Anserini (v0.36.1) to reproduce regression experiments on the MS MARCO V2.1 corpora for TREC 2024 RAG, on MS MARCO V1 Passage, and on BEIR, all directly from the fatjar!
See [detailed instructions](docs/fatjar-regressions/fatjar-regressions-v0.37.0.md) for the current fatjar release of Anserini (v0.37.0) to reproduce regression experiments on the MS MARCO V2.1 corpora for TREC 2024 RAG, on MS MARCO V1 Passage, and on BEIR, all directly from the fatjar!

Also, Anserini comes with a built-in webapp for interactive querying along with a REST API that can be used by other applications.
Check out our documentation [here](docs/rest-api.md).

<!--
We also have [forthcoming instructions](docs/fatjar-regressions/fatjar-regressions-v0.36.2-SNAPSHOT.md) for the next release (v0.36.2-SNAPSHOT) if you're interested.
We also have [forthcoming instructions](docs/fatjar-regressions/fatjar-regressions-v0.37.1-SNAPSHOT.md) for the next release (v0.37.1-SNAPSHOT) if you're interested.
-->

<details>
<summary>Older instructions</summary>

+ [Anserini v0.36.1](docs/fatjar-regressions/fatjar-regressions-v0.36.1.md)
+ [Anserini v0.36.0](docs/fatjar-regressions/fatjar-regressions-v0.36.0.md)
+ [Anserini v0.35.1](docs/fatjar-regressions/fatjar-regressions-v0.35.1.md)
+ [Anserini v0.35.0](docs/fatjar-regressions/fatjar-regressions-v0.35.0.md)
Expand Down Expand Up @@ -457,6 +461,7 @@ Beyond that, there are always [open issues](https://github.com/castorini/anserin

## 📜️ Release History

+ v0.37.0: August 22, 2024 [[Release Notes](docs/release-notes/release-notes-v0.37.0.md)]
+ v0.36.1: May 23, 2024 [[Release Notes](docs/release-notes/release-notes-v0.36.1.md)]
+ v0.36.0: April 28, 2024 [[Release Notes](docs/release-notes/release-notes-v0.36.0.md)]
+ v0.35.1: April 24, 2024 [[Release Notes](docs/release-notes/release-notes-v0.35.1.md)]
@@ -1,12 +1,9 @@
# Anserini Fatjar Regressions (v0.36.2-SNAPSHOT)

**This is a stub.**
# Anserini Fatjar Regressions (v0.37.0)

Fetch the fatjar:

```bash
# Update once artifact has been published
wget https://repo1.maven.org/maven2/io/anserini/anserini/0.36.0/anserini-0.36.0-fatjar.jar
wget https://repo1.maven.org/maven2/io/anserini/anserini/0.37.0/anserini-0.37.0-fatjar.jar
```

Note that prebuilt indexes will be downloaded to `~/.cache/pyserini/indexes/`.
@@ -16,8 +13,8 @@ If you want to change the download location, the current workaround is to use sy
Let's start out by setting the `ANSERINI_JAR` and the `OUTPUT_DIR`:

```bash
export ANSERINI_JAR=`ls target/*-fatjar.jar`
export OUTPUT_DIR="runs"
export ANSERINI_JAR="anserini-0.37.0-fatjar.jar"
export OUTPUT_DIR="."
```
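
Before launching long runs, it can help to verify these two variables. The following is a minimal sketch; `check_setup` is a hypothetical helper name, not something provided by Anserini.

```bash
# Sketch only: fail early if the fatjar is missing, and make sure the
# output directory exists. Reads ANSERINI_JAR and OUTPUT_DIR from the environment.
check_setup() {
  if [ ! -f "$ANSERINI_JAR" ]; then
    echo "fatjar not found: $ANSERINI_JAR" >&2
    return 1
  fi
  mkdir -p "$OUTPUT_DIR"
}
```

Running `check_setup` after the two `export` lines returns a non-zero status if the jar has not been downloaded to the current directory.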

## Webapp and REST API
@@ -34,12 +31,14 @@ And then navigate to [`http://localhost:8081/`](http://localhost:8081/) in your
Here's a specific example of using the REST API to issue the query "How does the process of digestion and metabolism of carbohydrates start" to `msmarco-v2.1-doc`:

```bash
curl -X GET "http://localhost:8081/api/collection/msmarco-v2.1-doc/search?query=How%20does%20the%20process%20of%20digestion%20and%20metabolism%20of%20carbohydrates%20start"
curl -X GET "http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc/search?query=How%20does%20the%20process%20of%20digestion%20and%20metabolism%20of%20carbohydrates%20start"
```

The JSON results are the same as the output of the `-outputRerankerRequests` option in `SearchCollection`, described below for TREC 2024 RAG.
Use the `hits` parameter to specify the number of hits to return, e.g., `hits=1000` to return the top 1000 hits.

Details of the built-in webapp and REST API can be found [here](../rest-api.md).
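
As a small illustration of combining the `query` and `hits` parameters, here is a sketch of a shell helper that assembles a search URL for the versioned `api/v1.0/indexes/.../search` route. The function name `build_search_url` is hypothetical, it assumes the server is listening on `localhost:8081` as in the example above, and it only percent-encodes spaces (other reserved characters would need a proper URL encoder).

```bash
# Sketch only: build a search URL from an index name, a query, and a hit count.
# Assumption: host and port match the webapp example (localhost:8081).
build_search_url() {
  local index="$1" query="$2" hits="$3"
  local encoded
  # Encode spaces only; this is not a general-purpose URL encoder.
  encoded=$(printf '%s' "$query" | sed 's/ /%20/g')
  printf 'http://localhost:8081/api/v1.0/indexes/%s/search?query=%s&hits=%s\n' \
    "$index" "$encoded" "$hits"
}
```

With the webapp running, the result can be passed straight to `curl`, e.g., `curl -X GET "$(build_search_url msmarco-v2.1-doc "carbohydrate digestion" 1000)"`.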

## TREC 2024 RAG

❗ Beware, you need lots of space to run these experiments.
93 changes: 93 additions & 0 deletions docs/release-notes/release-notes-v0.37.0.md
@@ -0,0 +1,93 @@
# Anserini Release Notes (v0.37.0)

+ **Release date:** August 22, 2024
+ **Lucene version:** Lucene 9.9.1

## Summary of Changes

+ Added support for indexing and searching flat (dense) vectors.
+ Added prebuilt flat indexes and repro bindings for the BGE embedding model.
+ Added bindings for Researchy Questions and the TREC 2024 RAG Track test set.
+ Added new regressions with prebuilt indexes.
+ Improved metadata for prebuilt indexes.
+ Improved documentation for ONNX models.
+ Improved webapp and REST API.
+ Created new versioned routes.
+ Refined UI components.
+ Upgraded `ai.djl` and fixed token length issue.

## Contributors (This Release)

Sorted by number of commits:

+ Jimmy Lin ([lintool](https://github.com/lintool))
+ Eric Zhang ([16BitNarwhal](https://github.com/16BitNarwhal))
+ Andre Slavescu ([AndreSlavescu](https://github.com/AndreSlavescu))
+ Vivek Alamuri ([valamuri2020](https://github.com/valamuri2020))
+ Alireza Taban ([alireza-taban](https://github.com/alireza-taban))
+ Chun-Wei ([bilet-13](https://github.com/bilet-13))
+ Daisy Ye ([daisyyedda](https://github.com/daisyyedda))
+ Emily Yu ([emily-emily](https://github.com/emily-emily))
+ Eric Wang ([IR3KT4FUNZ](https://github.com/IR3KT4FUNZ))
+ Faizan Faisal ([FaizanFaisal25](https://github.com/FaizanFaisal25))
+ Hosna Oyarhoseini ([hosnahoseini](https://github.com/hosnahoseini))
+ MariaPonomarenko38 ([MariaPonomarenko38](https://github.com/MariaPonomarenko38))
+ Mehrnaz Sadeghieh ([MehrnazSadeghieh](https://github.com/MehrnazSadeghieh))
+ Ronak Pradeep ([ronakice](https://github.com/ronakice))
+ Xiaoyan Song ([SeanSong25](https://github.com/SeanSong25))
+ Yidi Chen ([XKTZ](https://github.com/XKTZ))
+ Yiran Sun ([Feng-12138](https://github.com/Feng-12138))
+ Alireza Nasirian ([alireza-nasirian](https://github.com/alireza-nasirian))
+ Nathan Kuissi ([natek-1](https://github.com/natek-1))
+ npjd ([npjd](https://github.com/npjd))

## All Contributors

All contributors with five or more commits, sorted by number of commits, [according to GitHub](https://github.com/castorini/Anserini/graphs/contributors):

+ Jimmy Lin ([lintool](https://github.com/lintool))
+ Peilin Yang ([Peilin-Yang](https://github.com/Peilin-Yang))
+ Ogundepo Odunayo ([ToluClassics](https://github.com/ToluClassics))
+ Arthur Chen ([ArthurChen189](https://github.com/ArthurChen189))
+ Ahmet Arslan ([iorixxx](https://github.com/iorixxx))
+ Xueguang Ma ([MXueguang](https://github.com/MXueguang))
+ Tommaso Teofili ([tteofili](https://github.com/tteofili))
+ Edwin Zhang ([edwinzhng](https://github.com/edwinzhng))
+ Rodrigo Nogueira ([rodrigonogueira4](https://github.com/rodrigonogueira4))
+ Emily Wang ([emmileaf](https://github.com/emmileaf))
+ Royal Sequiera ([rosequ](https://github.com/rosequ))
+ Jheng-Hong Yang ([justram](https://github.com/justram))
+ Yuqi Liu ([yuki617](https://github.com/yuki617))
+ Eric Zhang ([16BitNarwhal](https://github.com/16BitNarwhal))
+ Victor Yang ([Victor0118](https://github.com/Victor0118))
+ Chris Kamphuis ([Chriskamphuis](https://github.com/Chriskamphuis))
+ Boris Lin ([borislin](https://github.com/borislin))
+ Nikhil Gupta ([nikhilro](https://github.com/nikhilro))
+ Jasper Xian ([jasper-xian](https://github.com/jasper-xian))
+ Ronak Pradeep ([ronakice](https://github.com/ronakice))
+ Stephanie Hu ([stephaniewhoo](https://github.com/stephaniewhoo))
+ Shane Ding ([shaneding](https://github.com/shaneding))
+ Yuhao Xie ([Kytabyte](https://github.com/Kytabyte))
+ Kuang Lu ([lukuang](https://github.com/lukuang))
+ Mofe Adeyemi ([Mofetoluwa](https://github.com/Mofetoluwa))
+ Xinyu (Crystina) Zhang ([crystina-z](https://github.com/crystina-z))
+ Adam Yang ([adamyy](https://github.com/adamyy))
+ Joel Mackenzie ([JMMackenzie](https://github.com/JMMackenzie))
+ Luchen Tan ([LuchenTan](https://github.com/LuchenTan))
+ Salman Mohammed ([salman1993](https://github.com/salman1993))
+ Manveer Tamber ([manveertamber](https://github.com/manveertamber))
+ Xinyu Mavis Liu ([x389liu](https://github.com/x389liu))
+ Johnson Han ([x65han](https://github.com/x65han))
+ Kelvin Jiang ([kelvin-jiang](https://github.com/kelvin-jiang))
+ Zhiying Jiang ([bazingagin](https://github.com/bazingagin))
+ Hang Cui ([HangCui0510](https://github.com/HangCui0510))
+ Akintunde Oladipo ([theyorubayesian](https://github.com/theyorubayesian))
+ Matt Yang ([d1shs0ap](https://github.com/d1shs0ap))
+ Dayang Shi ([dyshi](https://github.com/dyshi))
+ Aileen Lin ([AileenLin](https://github.com/AileenLin))
+ Michael Tu ([tuzhucheng](https://github.com/tuzhucheng))
+ Nandan Thakur ([thakur-nandan](https://github.com/thakur-nandan))
+ Yuqing Xie ([amyxie361](https://github.com/amyxie361))
+ Zeynep Akkalyoncu Yilmaz ([zeynepakkalyoncu](https://github.com/zeynepakkalyoncu))
+ Ryan Clancy ([ryan-clancy](https://github.com/ryan-clancy))
+ Peng Shi ([Impavidity](https://github.com/Impavidity))
2 changes: 1 addition & 1 deletion docs/rest-api.md
@@ -115,7 +115,7 @@ To access the content of a document in an index, the endpoint is `api/v1.0/index
Here's an example of getting the document of the top candidate from the above example:

```bash
curl -X GET "http://localhost:8080/api/v1.0/indexes/msmarco-v2.1-doc/documents/msmarco_v2.1_doc_15_390497775"
curl -X GET "http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc/documents/msmarco_v2.1_doc_15_390497775"
```

Output is an object of the same format as a candidate from search