Machine Translation

Definition

Machine translation (MT) is the automated conversion of text from a source language to a target language. Neural machine translation (NMT) built on encoder-decoder transformers has largely replaced older statistical approaches and can approach human quality for high-resource language pairs such as English-French. The encoder turns the source sentence into contextual representations; the decoder generates the target sentence token by token, using cross-attention over those representations. Large multilingual models such as NLLB-200 cover 200 languages with a single model. Quality is measured with automatic metrics such as BLEU and with human evaluation.
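To make the cross-attention step concrete, here is a minimal sketch of one decoder position attending over the encoder outputs. It assumes a single attention head and reuses the encoder states as both keys and values; real transformers apply learned projections and several heads. The vectors and the function names (`softmax`, `cross_attention`) are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, encoder_states):
    """Attend from one decoder query vector to every encoder state.

    Simplification (assumption): encoder states act as both keys and values.
    """
    dim = len(query)
    # Scaled dot-product score between the query and each source position.
    scores = [
        sum(q * k for q, k in zip(query, state)) / math.sqrt(dim)
        for state in encoder_states
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

# Toy 2-dimensional encoder outputs for a three-token source sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = cross_attention(query, encoder_states)
```

The weights sum to one across source positions, so the decoder receives a blend of source information that leans toward the positions most similar to its current query.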

Why It Matters

Machine translation enables chatbots and support platforms to serve global markets without maintaining separate localized versions. A product built in English can automatically handle queries in Spanish, Portuguese, Japanese, or Arabic through translation layers. For knowledge bases, MT allows a single source-of-truth article library to be made accessible in dozens of languages with minimal manual effort. The economics of supporting 50+ languages would be prohibitive with human translation alone.

How It Works

Modern NMT uses transformer encoder-decoder architectures trained on massive parallel corpora (aligned sentence pairs in the source and target languages). The encoder generates contextualized source representations; the decoder attends to these via cross-attention while autoregressively generating target tokens. Multilingual models share parameters across language pairs, which enables zero-shot translation between pairs never seen together in training. Back-translation augments the training data by machine-translating monolingual target-language text into the source language, producing synthetic sentence pairs. Domain-specific fine-tuning improves quality on specialized vocabulary.
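Back-translation can be sketched in a few lines. In its usual form, monolingual text in the target language is translated back into the source language by a reverse model, and the resulting synthetic pairs are added to the training set. The function `back_translate` and the dictionary-based `toy_fr_to_en` are hypothetical stand-ins; a real setup would call a trained target-to-source model.

```python
from typing import Callable, List, Tuple

def back_translate(
    target_monolingual: List[str],
    target_to_source: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from target-language text.

    The target side is genuine human text; the source side is machine output.
    """
    pairs = []
    for target_sentence in target_monolingual:
        synthetic_source = target_to_source(target_sentence)
        if synthetic_source.strip():  # drop sentences the reverse model could not handle
            pairs.append((synthetic_source, target_sentence))
    return pairs

# Hypothetical stand-in for a trained French-to-English model.
TOY_FR_EN = {"Le chat dort.": "The cat sleeps.", "Bonjour.": "Hello."}

def toy_fr_to_en(sentence: str) -> str:
    return TOY_FR_EN.get(sentence, "")

synthetic_pairs = back_translate(
    ["Le chat dort.", "Bonjour.", "Inconnu."], toy_fr_to_en
)
```

Keeping the human-written sentence on the target side means the model learns to produce fluent output even when the synthetic source side is noisy.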

[Figure: Neural Machine Translation, EN → FR. The English source tokens "The cat sleeps quietly" pass through the Encoder (contextual source representations), Cross-Attention (align source and target positions), and the Decoder (generate target tokens step by step) to produce the French target tokens "chat dort tranquillement". Translation quality (BLEU score) shown in the illustration: 76 / 100.]
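For readers wondering what a BLEU score measures, the following is a simplified sketch: clipped n-gram precision up to 4-grams, combined by geometric mean and multiplied by a brevity penalty, scaled to 0-100. It assumes whitespace tokenization, a single reference, and no smoothing, so it will not match the numbers produced by standard evaluation tools, which score whole corpora. The function names are illustrative.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference, unsmoothed sentence BLEU on a 0-100 scale."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precision_sum += math.log(overlap / total)
    # Penalize hypotheses that are shorter than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity_penalty * math.exp(log_precision_sum / max_n)

reference = "le chat dort tranquillement sur le tapis"
exact = sentence_bleu(reference, reference)
near_miss = sentence_bleu("le chat dort tranquillement sur le lit", reference)
```

An exact match scores the maximum, while changing a single word lowers every n-gram precision at once, which is why BLEU is usually reported over many sentences and alongside human evaluation.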

Real-World Example

A global SaaS company routes non-English support messages through a machine translation layer before they reach its English-optimized NLP pipeline. The translated text feeds into intent classification and entity extraction, and the response is then translated back into the user's language. This architecture let the company serve 40+ languages with no language-specific NLP models, reducing ML infrastructure complexity by 90% compared with building separate models per language.
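The translate, process, translate-back flow described above might look like the sketch below. Everything in it is hypothetical: `SupportPipeline`, the three callables, and the toy phrase table stand in for a real MT service and NLP pipeline, and entity extraction is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SupportPipeline:
    translate: Callable[[str, str, str], str]  # (text, source_lang, target_lang) -> text
    classify_intent: Callable[[str], str]      # English text -> intent label
    respond: Callable[[str], str]              # intent label -> English reply
    pivot_lang: str = "en"

    def handle(self, message: str, user_lang: str) -> str:
        # Skip translation when the user already writes in the pivot language.
        if user_lang == self.pivot_lang:
            return self.respond(self.classify_intent(message))
        english = self.translate(message, user_lang, self.pivot_lang)
        reply = self.respond(self.classify_intent(english))
        return self.translate(reply, self.pivot_lang, user_lang)

# Toy stand-ins so the sketch runs without any external service.
PHRASES = {
    ("¿Dónde está mi factura?", "es", "en"): "Where is my invoice?",
    ("You can find invoices under Billing.", "en", "es"):
        "Puede encontrar las facturas en Facturación.",
}

def toy_translate(text: str, source_lang: str, target_lang: str) -> str:
    return PHRASES.get((text, source_lang, target_lang), text)

def toy_intent(text: str) -> str:
    return "billing" if "invoice" in text.lower() else "other"

def toy_respond(intent: str) -> str:
    if intent == "billing":
        return "You can find invoices under Billing."
    return "Could you tell me more?"

pipeline = SupportPipeline(toy_translate, toy_intent, toy_respond)
reply = pipeline.handle("¿Dónde está mi factura?", "es")
```

Because only the `translate` callable is language-aware, adding a language means extending the MT layer while the intent and response logic stay untouched.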

Common Mistakes

✕ Assuming translation quality is uniform across all language pairs: low-resource languages (Swahili, Tagalog) have significantly higher error rates
✕ Translating before sentiment or entity analysis without considering translation errors that corrupt downstream results
✕ Using MT for legal or medical content without human post-editing: errors in high-stakes domains carry serious risks
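One way to guard against these mistakes is an explicit routing policy applied before translated text is trusted. The sketch below is an assumption-laden illustration: the language and domain sets, the `RoutingDecision` fields, and `route_translation` are all hypothetical, and a real deployment would base such sets on measured quality per language pair.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed, illustrative sets taken from the examples in the list above.
LOW_RESOURCE_LANGS = {"sw", "tl"}           # ISO 639-1 codes for Swahili and Tagalog
HIGH_STAKES_DOMAINS = {"legal", "medical"}

@dataclass
class RoutingDecision:
    needs_human_review: bool   # send the translation to a human post-editor
    downstream_caution: bool   # treat sentiment/entity results as less reliable
    reasons: List[str] = field(default_factory=list)

def route_translation(lang: str, domain: str) -> RoutingDecision:
    """Decide how much to trust machine-translated text for a given request."""
    reasons = []
    high_stakes = domain in HIGH_STAKES_DOMAINS
    low_resource = lang in LOW_RESOURCE_LANGS
    if high_stakes:
        reasons.append(f"high-stakes domain: {domain}")
    if low_resource:
        reasons.append(f"low-resource language: {lang}")
    return RoutingDecision(
        needs_human_review=high_stakes,
        downstream_caution=low_resource,
        reasons=reasons,
    )

swahili_support = route_translation("sw", "support")
french_legal = route_translation("fr", "legal")
```

Recording the reasons alongside the decision makes it easier to audit why a given message was escalated or flagged.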

Related Terms

Multilingual NLP

Cross-Lingual Transfer

Natural Language Generation (NLG)

Natural Language Processing (NLP)

Language Detection
