machine-translation-data-processing

Here are 7 public repositories matching this topic...

facebookresearch / stopes

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

machine-learning translation machine-translation dataset dataset-generation nmt machine-translation-data-processing

Updated Oct 14, 2025
Python

lt3 / nfr

Neural Fuzzy Repair (NFR) is a data augmentation pipeline, which integrates fuzzy matches (i.e. similar translations) into neural machine translation.

nlp natural-language-processing machine-translation fuzzy-matching data-augmentation-strategies data-augmentation machine-translation-data-processing nfr fuzzy-repair neural-fuzzy-repair

Updated Aug 14, 2024
Python

alphadl / corpus_filter

Scripts for filtering machine translation parallel corpora

machine-translation machine-translation-data-processing machine-translation-metrics

Updated Jun 3, 2019
Python

geovedi / nmt-playground

Personal NMT Playground

machine-translation neural-machine-translation machine-translation-data-processing

Updated Jul 3, 2017
Python

moodser / splitter-transliteration

Python script to split the text generated by 'wikipedia parallel title extractor' into separate text files (separate file for each language)

machine-translation transliteration machine-translation-data-processing wikipedia-corpus machine-tranliteration

Updated Aug 16, 2018
Python

mrsumitbd / SOParallelCorpusReplication

Replication package for processing Stack Overflow (SO) data into bitext

stackoverflow alignment parallel-corpus machine-translation-data-processing

Updated Mar 4, 2019
Python

iamsiva11 / Seq2Seq-PyTorch

Extending a seq2seq encoder by passing it extra source tokens (PyTorch)

deep-learning machine-translation pytorch lstm seq2seq neural-machine-translation sequence-to-sequence nmt attention-mechanism lstm-neural-networks machine-translation-data-processing

Updated Jan 4, 2018
Python
