Natural Language Processing (NLP)

Machine Translation

Definition

Machine translation (MT) is the automated conversion of text from a source language to a target language. Neural machine translation (NMT) using encoder-decoder transformers has largely replaced older statistical approaches, achieving near-human quality for high-resource language pairs like English-French. The encoder processes the source sentence into a contextual representation; the decoder generates the target sentence token-by-token using cross-attention to source representations. Large multilingual models like NLLB-200 support 200 languages with a single model. Quality is measured with BLEU scores and human evaluations.
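The token-by-token decoding described above can be sketched with a toy stand-in for a trained model. Everything in this example (the `encode`/`decode_step` functions and the lookup table) is invented for illustration; a real NMT system learns these mappings from parallel corpora rather than hard-coding them.

```python
# Toy sketch of encoder-decoder translation. A hand-built lookup table
# stands in for a trained transformer; the loop shows the autoregressive,
# token-by-token generation pattern the definition describes.

def encode(source_tokens):
    """Stand-in encoder: a real transformer returns contextual vectors;
    here we simply pass the tokens through."""
    return source_tokens

def decode_step(source_repr, target_so_far):
    """Stand-in decoder step: predicts the next target token given the
    source representation and the target prefix."""
    toy_table = {0: "Le", 1: "chat", 2: "dort", 3: "tranquillement"}
    return toy_table.get(len(target_so_far), "</s>")  # end-of-sequence token

def translate(source_sentence):
    source_repr = encode(source_sentence.split())
    target = []
    while True:
        token = decode_step(source_repr, target)
        if token == "</s>":          # decoder signals it is finished
            break
        target.append(token)
    return " ".join(target)

print(translate("The cat sleeps quietly"))  # Le chat dort tranquillement
```

The key structural point survives the simplification: the decoder sees the source representation plus everything it has generated so far, and emits one token per step until it produces an end-of-sequence marker.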

Why It Matters

Machine translation enables chatbots and support platforms to serve global markets without maintaining separate localized versions. A product built in English can automatically handle queries in Spanish, Portuguese, Japanese, or Arabic through translation layers. For knowledge bases, MT allows a single source-of-truth article library to be made accessible in dozens of languages with minimal manual effort. The economics of supporting 50+ languages would be prohibitive with human translation alone.

How It Works

Modern NMT uses transformer encoder-decoder architectures trained on massive parallel corpora (aligned sentence pairs in source and target languages). The encoder generates contextualized source representations; the decoder attends to these via cross-attention while autoregressively generating target tokens. Multilingual models share parameters across language pairs, enabling zero-shot translation between pairs not seen together in training. Techniques like back-translation augment training data by machine-translating target-language monolingual text back into the source language, creating synthetic parallel pairs. Domain-specific fine-tuning improves quality for specialized vocabulary.
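The cross-attention step can be made concrete with a few lines of NumPy. This is a minimal single-head sketch with random weights and illustrative shapes, not a full transformer layer: decoder states provide the queries, encoder outputs provide the keys and values, and softmax(QKᵀ/√d)V yields one source-weighted context vector per target position.

```python
import numpy as np

# Minimal single-head cross-attention sketch (random weights, toy shapes).
rng = np.random.default_rng(0)
d = 8          # representation dimension (illustrative)
src_len = 4    # e.g. "The cat sleeps quietly"
tgt_len = 3    # target tokens generated so far

encoder_out = rng.normal(size=(src_len, d))    # contextual source representations
decoder_state = rng.normal(size=(tgt_len, d))  # decoder hidden states

# Projection matrices (learned in a real model; random here)
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

Q = decoder_state @ W_q   # queries come from the decoder
K = encoder_out @ W_k     # keys and values come from the encoder
V = encoder_out @ W_v

scores = Q @ K.T / np.sqrt(d)                    # (tgt_len, src_len)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over source positions
context = weights @ V                            # (tgt_len, d), fed onward in the decoder

print(weights.shape, context.shape)  # (3, 4) (3, 8)
```

Each row of `weights` is a probability distribution over source positions, which is exactly the "alignment" between source and target that older statistical MT modeled explicitly.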

Neural Machine Translation — EN → FR

[Figure: the English source "The cat sleeps quietly" (The/DT, cat/NN, sleeps/VBZ, quietly/RB) is processed by the encoder into contextual source representations; cross-attention aligns source and target positions while the decoder generates the French target "Le chat dort tranquillement" (Le/DT, chat/NN, dort/VBZ, tranquillement/RB) step by step. Reported translation quality: BLEU 76/100.]
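A BLEU score like the 76/100 shown above is derived from n-gram overlap between a candidate translation and one or more references. The sketch below is a simplified sentence-level BLEU (single reference, uniform weights over 1- to 4-grams, no smoothing); production evaluations use corpus-level BLEU via tools such as sacreBLEU, which can differ in tokenization and smoothing.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: clipped n-gram precision
    times a brevity penalty. Returns a value in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = max(sum(cand_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed BLEU is zero if any precision is zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages overly short translations
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("Le chat dort tranquillement", "Le chat dort tranquillement"))  # 1.0
```

Note that BLEU rewards surface n-gram overlap, not meaning, which is why the definition above pairs it with human evaluation.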

Real-World Example

A global SaaS company routes non-English support messages through a machine translation layer before processing by their English-optimized NLP pipeline. The translated text feeds into intent classification and entity extraction; the response is then translated back to the user's language. This architecture allowed them to serve 40+ languages with no language-specific NLP models, reducing ML infrastructure complexity by 90% compared to building separate models per language.
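The translate-in / process / translate-out architecture described above can be sketched as follows. The translator and intent classifier here are stubs (an invented phrase table and a keyword rule), standing in for a real MT model and a trained English-language classifier; only the routing structure reflects the pattern in the example.

```python
# Sketch of the translate-in / English-only NLP / translate-out pattern.
# TOY_MT is an invented phrase table standing in for an MT service.

TOY_MT = {
    ("es", "en"): {"¿Dónde está mi pedido?": "Where is my order?"},
    ("en", "es"): {"Your order ships tomorrow.": "Su pedido se envía mañana."},
}

def translate(text, src, tgt):
    return TOY_MT[(src, tgt)].get(text, text)

def classify_intent(english_text):
    # Stub classifier: a keyword rule in place of a trained English NLP model
    return "order_status" if "order" in english_text.lower() else "other"

def handle_message(text, user_lang):
    english = translate(text, user_lang, "en")    # 1. translate in
    intent = classify_intent(english)             # 2. English-only NLP pipeline
    reply_en = {"order_status": "Your order ships tomorrow."}.get(
        intent, "Could you rephrase that?")
    return translate(reply_en, "en", user_lang)   # 3. translate back out

print(handle_message("¿Dónde está mi pedido?", "es"))  # Su pedido se envía mañana.
```

The design trade-off is the one the example highlights: one set of English NLP models covers every language, at the cost of compounding any translation errors through the downstream pipeline.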

Common Mistakes

  • Assuming translation quality is uniform across all language pairs—low-resource languages (Swahili, Tagalog) have significantly higher error rates
  • Running sentiment or entity analysis on machine-translated text without accounting for translation errors that corrupt downstream results
  • Using MT for legal or medical content without human post-editing—errors in high-stakes domains carry serious risks
