Commit ab88e73
[docs] update dcvlr readme (#1748)
1 parent 2d0bdb7 commit ab88e73

1 file changed (+73, -1)

configs/projects/dcvlr/README.md
# DCVLR: Data Curation for Vision Language Reasoning

[![NeurIPS 2025](https://img.shields.io/badge/NeurIPS-2025-blue.svg)](https://neurips.cc/Conferences/2025)
[![Competition](https://img.shields.io/badge/Competition-Open-green.svg)](https://dcvlr.org)

---

<div align="center">

<h3>
🌐 <a href="https://dcvlr-neurips.github.io">Official webpage</a> •
🚀 <a href="https://oumi-ai.typeform.com/to/LnYoisi5">Sign up for updates</a> •
🎯 <a href="https://oumi-ai.typeform.com/to/OGPuRt6U">Apply for GPU credits (sponsored by Lambda Labs)</a>
</h3>

</div>

---

DCVLR is the first open-data, open-models, open-source competition for data curation in vision-language reasoning, hosted at NeurIPS 2025.

## 🎯 Challenge

Participants may leverage any source datasets to curate a high-quality instruction-tuning dataset of 1K or 10K examples, and are encouraged to explore diverse curation strategies, from synthetic data generation to subset selection. Submissions will be evaluated by fine-tuning an undisclosed open-source vision-language model on the curated data and measuring its performance across a wide variety of benchmarks.
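As an illustration only (not part of the official starter kit), the simplest subset-selection baseline is uniform random sampling down to the track size. The sketch below assumes a JSONL source file with one JSON object per line; the function name and schema are hypothetical:

```python
import json
import random


def curate_subset(source_path: str, out_path: str, k: int = 1000, seed: int = 0) -> int:
    """Naive curation baseline: write a uniform random subset of k examples.

    Assumes `source_path` is JSONL (one JSON object per line). A real
    curation strategy would replace the random draw with quality scoring.
    """
    with open(source_path) as f:
        examples = [json.loads(line) for line in f if line.strip()]
    rng = random.Random(seed)  # fixed seed keeps the submission reproducible
    subset = rng.sample(examples, min(k, len(examples)))
    with open(out_path, "w") as f:
        for example in subset:
            f.write(json.dumps(example) + "\n")
    return len(subset)
```

A stronger entry would swap the random draw for difficulty, diversity, or quality scoring, but the output stays the same: a JSONL file of 1K or 10K examples.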
## 🚀 Quick Start

Get started with training in minutes:

```bash
# Install oumi
uv pip install "oumi[gpu]"

# Train with Molmo-7B-O
oumi train -c molmo-o --dataset dataset.jsonl

# Train with Qwen2.5-VL-7B-Instruct
oumi train -c qwen2.5-vl-7b-instruct --dataset dataset.jsonl
```
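Before launching a full fine-tuning run, it can help to sanity-check that `dataset.jsonl` is well-formed. The schema oumi expects is not spelled out here, so this hypothetical checker only verifies what the commands above let us assume: every line parses as a JSON object, and the example count matches one of the two track sizes:

```python
import json


def check_dataset(path: str, allowed_sizes: tuple = (1000, 10000)) -> int:
    """Validate a JSONL dataset before submission (illustrative, not official).

    Returns the example count; raises ValueError on a malformed line or if
    the count is not one of the allowed track sizes (1K or 10K).
    """
    count = 0
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate trailing blank lines
            record = json.loads(line)  # JSONDecodeError subclasses ValueError
            if not isinstance(record, dict):
                raise ValueError(f"line {lineno}: expected a JSON object")
            count += 1
    if count not in allowed_sizes:
        raise ValueError(f"{count} examples; expected one of {allowed_sizes}")
    return count
```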

## 📅 Key Dates

| Date | Milestone |
|------|-----------|
| **June 11, 2025** | Release of Competition Materials |
| **July 1, 2025** | Submission Portal Opens |
| **October 1, 2025** | Final Submission Deadline |
| **November 1, 2025** | Results Announced |
| **December 2025** | NeurIPS 2025 Presentation |

## 📚 Competition Resources

| Resource | Description | Link |
|----------|-------------|------|
| 📊 **Starter Kit** | Comprehensive starter kit with example datasets, training scripts, and best practices | [Access Starter Kit](https://huggingface.co/datasets/oumi-ai/dcvlr-starter-kit) |
| 💻 **Training Scripts** | Starter scripts for fine-tuning multiple vision-language models | [View Scripts](https://github.com/oumi-ai/oumi/tree/main/configs/projects/dcvlr) |
| 🧪 **Evaluation Code** | Scripts to evaluate model outputs on diverse benchmark development sets | [Get Code](https://github.com/oumi-ai/oumi/tree/main/configs/projects/dcvlr) |
| ☁️ **Compute Resources** | GPU credits from Lambda Labs for participants | [Apply for Credits](https://oumi-ai.typeform.com/to/OGPuRt6U) |
| 📚 **Documentation** | Complete guides and tutorials | [View Documentation](https://oumi.ai/docs) |

## 🤝 Sponsors

- **Lambda Labs** - Compute Resources
- **Oumi.ai** - Competition Support

## 📞 Contact

Have questions? Get in touch with the DCVLR team:

- **Website**: [dcvlr.org](https://dcvlr.org)
- **Email**: [Contact Form](https://dcvlr.org/contact)
