parallel-corpora

Here are 17 public repositories matching this topic...

Nexdata-AI / 1990000-Groups-Chinese-Czech-Parallel-Corpus-Data

1990000-Groups-Chinese-Czech-Parallel-Corpus-Data

language-translation machine-translation lexical-analysis parallel-corpora

Updated Apr 9, 2024

techiaith / alinio

Code to simplify aligning texts with hunalign, plus documentation on how to use LFAligner

machine-translation alignment welsh cymraeg parallel-corpora

Updated Mar 6, 2016
Python

gederajeg / rob-steal-parallel-corpora

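Repository of R code and data for the analyses in the research project titled "A Translation Studies Model Based on an English-Indonesian Digital Translation Data Bank and Its Pedagogical Implications" (original title: MODEL KAJIAN TERJEMAHAN BERBASIS BANK DATA TERJEMAHAN DIGITAL INGGRIS-INDONESIA DAN IMPLIKASI PEDAGOGISNYA)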

corpus-linguistics construction-grammar parallel-corpora constructional-equivalence english-indonesian-translation subtitle-corpora opensubtitle rob-steal-synonyms english-indonesian-parallel-corpora udayana-university

Updated Jan 30, 2022
R

Sohyo / Using-Confidential-Data-for-NMT

nlp datasets parallel-corpora

Updated Jun 18, 2021

czcorpus / ictools

A program for calculating corpora alignments using a pivot language

translation corpus linguistics cmd corpora manatee-open parallel-corpora

Updated Mar 21, 2024
Go

npedrazzini / parallelbibles

Word-alignment models for Bible translations in 100+ historical and contemporary languages

bible-translations kriging word-alignment multidimensional-scaling parallel-corpora

Updated Sep 29, 2022
R

gederajeg / constructional-equivalence

Repository of supplementary materials and the RStudio project for the paper on a corpus-based approach to measuring constructional equivalence.

open-data open-science corpus-linguistics parallel-corpus r-programming construction-grammar quantitative-linguistics parallel-corpora translation-studies r-programming-projects open-subtitle open-code english-indonesian-translation udayana-university universitas-udayana verbal-near-synonyms constructionist-approach translation-equivalence

Updated Dec 8, 2022
TeX

shashwatup9k / bho-resources

monolingual-corpora parallel-corpora bhojpuri-textcorpus annotated-corpora bhojpuri english-bhojpuri

Updated Dec 5, 2023

rggdmonk / hadal

A simple and efficient tool for mining and aligning sentences with pre-trained models.

nlp machine-translation alignment nlp-library parallel-corpus sentence-alignment parallel-corpora parallel-sentence-mining

Updated May 17, 2024
Python

tsuruoka-lab / AMI-Meeting-Parallel-Corpus

AMI Meeting Parallel Corpus

japanese machine-translation corpus english parallel-corpus parallel-corpora annotated-corpora document-aligned

Updated Dec 11, 2020

Giuseppe-Della-Corte / IESTAC

A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models

machine-translation corpus web-scraping named-entity-recognition speech-recognition audio-data text-processing sql-database statistical-machine-translation speech-processing parallel-corpus forced-alignment sentence-similarity sentence-embeddings end-to-end-machine-learning parallel-corpora bitext speech-translation mfcc-features text-preprocessinig

Updated Jan 26, 2021

korenyoni / opus-api

OPUS (opus.nlpl.eu) Python3 API

python api machine-learning corpus corporate opus corpora language-model parallel-corpus parallel-corpora

Updated Jun 27, 2024
Python

timarkh / tsakorpus

Yet another search platform for linguistic corpora.

flask elasticsearch corpus linguistics corpus-linguistics corpus-tools linguistic-corpora language-documentation parallel-corpora media-aligned-corpora

Updated Jan 24, 2024
Python

Kartikaggarwal98 / Indian_ParallelCorpus

Curated list of publicly available parallel corpora for Indian languages

nlp corpus neural-machine-translation parallel-corpus indian-languages parallel-corpora low-resource-languages multilingual-translation machinetranslation low-resource-machine-translation

Updated Jul 15, 2021

tsuruoka-lab / BSD

The Business Scene Dialogue corpus

japanese machine-translation corpus english parallel-corpus parallel-corpora annotated-corpora document-aligned

Updated Nov 10, 2021

csebuetnlp / banglanmt

This repository contains the code and data for the paper "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation", published in the Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16-20, 2020.

machine-translation neural-machine-translation parallel-corpus parallel-corpora bangla-nlp low-resource-languages bangla-machine-translation bangla-dataset-machine-translation emnlp-2020 low-resource-nlp low-resource-machine-translation

Updated Jan 30, 2023
Python

bitextor / bitextor

Bitextor generates translation memories from multilingual websites

Updated Jun 18, 2024
Python
