docs: reorder datasets.csv and update datasets table
1 parent 799ee87, commit adae431
Showing 2 changed files with 29 additions and 10 deletions.
@@ -1,19 +1,19 @@
 nombre,tareas,idioma,página_web,github,paper,hf_dataset_name,hf_contributor_handle,dominio,pais
-BasCrawl,modelado del lenguaje,eu,https://doi.org/10.5281/zenodo.7313092,,,,,general,"España"
-Biomedical Spanish CBOW Word Embeddings in Floret,"modelado del lenguaje,CBOW (Continuous Bag Of Words)","es",https://doi.org/10.5281/zenodo.7314041,https://arxiv.org/abs/2109.07765,,,,clinico,España
+BasCrawl,modelado del lenguaje,eu,https://doi.org/10.5281/zenodo.7313092,,,,,general,España
+Biomedical Spanish CBOW Word Embeddings in Floret,"modelado del lenguaje,CBOW (Continuous Bag Of Words)",es,https://doi.org/10.5281/zenodo.7314041,https://arxiv.org/abs/2109.07765,,,,clinico,España
 CSIC Spanish Corpus,modelado del lenguaje,es,https://doi.org/10.5281/zenodo.7313126,,,,,academico,España
-Catalonia Independence Corpus,clasificación de sentimientos,"ca, es",,https://github.com/ixa-ehu/catalonia-independence-corpus,https://www.aclweb.org/anthology/2020.lrec-1.171/,catalonia_independence,lewtun,rrss,"España"
-HEAD-QA,preguntas de opción múltiple,es,https://aghie.github.io/head-qa/,https://github.com/aghie/head-qa,https://www.aclweb.org/anthology/P19-1092/,head_qa,mariagrandury,clinico,"España"
+Catalonia Independence Corpus,clasificación de sentimientos,"ca, es",,https://github.com/ixa-ehu/catalonia-independence-corpus,https://www.aclweb.org/anthology/2020.lrec-1.171/,catalonia_independence,lewtun,rrss,España
+HEAD-QA,preguntas de opción múltiple,es,https://aghie.github.io/head-qa/,https://github.com/aghie/head-qa,https://www.aclweb.org/anthology/P19-1092/,head_qa,mariagrandury,clinico,España
 InfoLibros Corpus,modelado del lenguaje,es,https://doi.org/10.5281/zenodo.7313105,,,,,literatura,Varios
 Large Spanish Corpus,"modelado del lenguaje,pre-entrenamiento",es,,https://github.com/josecannete/spanish-corpora,,large_spanish_corpus,lewtun,general,Varios
-Mucho Cine,clasificación de sentimientos,"es",http://www.lsi.us.es/~fermin/index.php/Datasets,,,muchocine,mapmeld,general,?
-Spanish Billion Words,"modelado del lenguaje,pre-entrenamiento","es",https://crscardellino.github.io/SBWCE/,,,spanish_billion_words,mariagrandury,general,Varios
-Spanish Biomedical Crawled Corpus,modelado del lenguaje,"es",https://doi.org/10.5281/zenodo.5513237,,https://arxiv.org/abs/2109.07765,,,clinico,España
-Spanish CBOW Word Embeddings in FastText,"modelado del lenguaje,FastText","es",https://doi.org/10.5281/zenodo.5044988,,,http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6405,,genera,España
+Mucho Cine,clasificación de sentimientos,es,http://www.lsi.us.es/~fermin/index.php/Datasets,,,muchocine,mapmeld,general,?
+Spanish Billion Words,"modelado del lenguaje,pre-entrenamiento",es,https://crscardellino.github.io/SBWCE/,,,spanish_billion_words,mariagrandury,general,Varios
+Spanish Biomedical Crawled Corpus,modelado del lenguaje,es,https://doi.org/10.5281/zenodo.5513237,,https://arxiv.org/abs/2109.07765,,,clinico,España
+Spanish CBOW Word Embeddings in FastText,"modelado del lenguaje,FastText",es,https://doi.org/10.5281/zenodo.5044988,,,http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6405,,genera,España
 Spanish CBOW Word Embeddings in Floret,"modelado del lenguaje,CBOW (Continuous Bag Of Words)",es,https://doi.org/10.5281/zenodo.7314098,,,,,general,España
 Spanish Legal Domain Corpora,modelado del lenguaje,es,https://doi.org/10.5281/zenodo.5495529,https://github.com/PlanTL-GOB-ES/lm-legal-es,https://arxiv.org/abs/2110.12201,,,legal,España
 Spanish Legal Domain Word & Sub-Word Embeddings,modelado del lenguaje,es,https://doi.org/10.5281/zenodo.5036147,https://github.com/PlanTL-GOB-ES/lm-legal-es,https://arxiv.org/abs/2110.12201,,,legal,España
-Spanish Skip-Gram Word Embeddings in FastText,"modelado del lenguaje,FastText","es",https://doi.org/10.5281/zenodo.5046525,,,http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6405,,general,España
+Spanish Skip-Gram Word Embeddings in FastText,"modelado del lenguaje,FastText",es,https://doi.org/10.5281/zenodo.5046525,,,http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6405,,general,España
 TDX Thesis Spanish Corpus,modelado del lenguaje,"ca, es",https://doi.org/10.5281/zenodo.7313149,,,,,academico,España
 WikiCorpus,"modelado del lenguaje,POS (Part of Speech)","ca, en, es",https://www.cs.upc.edu/~nlp/wikicorpus/,,https://www.cs.upc.edu/~nlp/papers/reese10.pdf,wikicorpus,albertvillanova,general,Varios
 eHealth-KD,NER (Named Entity Recognition),es,https://knowledge-learning.github.io/ehealthkd-2020/,https://github.com/knowledge-learning/ehealthkd-2020,http://ceur-ws.org/Vol-2664/eHealth-KD_overview.pdf,ehealth_kd,mariagrandury,clinico,España
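
The rows in this hunk differ only in quoting (`"es"` becomes `es`, `"España"` becomes `España`), and they appear in code-point order by `nombre`, with uppercase before lowercase (`CSIC` before `Catalonia`, `eHealth-KD` last). A minimal sketch of how such a normalization could be reproduced with Python's standard `csv` module follows. The function name `normalize_csv` and the three-column sample are assumptions for illustration; the commit does not say which tool, if any, produced the change.

```python
import csv
import io


def normalize_csv(text: str) -> str:
    """Sort data rows by the first column and rewrite with minimal quoting.

    Hypothetical helper: not taken from the repository.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, data = rows[0], rows[1:]
    # Plain str comparison orders by code point, so uppercase sorts
    # before lowercase ("CSIC" < "Catalonia" < "eHealth-KD").
    data.sort(key=lambda row: row[0])
    out = io.StringIO()
    # QUOTE_MINIMAL quotes a field only when it contains the delimiter,
    # the quote character, or a line break.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data)
    return out.getvalue()


# Reduced sample (assumed columns) mirroring the quoting seen in the diff.
sample = (
    'nombre,idioma,pais\n'
    'HEAD-QA,es,"España"\n'
    'Catalonia Independence Corpus,"ca, es","España"\n'
    'BasCrawl,eu,"España"\n'
)
result = normalize_csv(sample)
print(result)
```

Fields that contain a comma, such as `"ca, es"`, keep their quotes, which matches the unchanged quoting of the multi-valued `tareas` and `idioma` cells in the diff.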