Natural Language Processing (NLP)

Sequence Labeling

Definition

Sequence labeling is the NLP task of assigning a categorical label to each element in a sequence of tokens. Unlike text classification, which assigns one label per document, sequence labeling produces one label per token. Applications include named entity recognition (labeling each token as B-PER, I-ORG, O, and so on), part-of-speech tagging (labeling each token with its grammatical role), chunking (labeling sequences of tokens that form phrases), and slot filling (labeling tokens that contribute to dialogue slot values). Modern sequence labelers use transformer encoders with token-level classification heads, often combined with a CRF (Conditional Random Field) decoder to enforce label sequence consistency.
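To make the per-token paradigm concrete, here is a minimal sketch in plain Python. The sentence, the document-level label, and the BIO tags are made up for illustration:

```python
tokens = ["Marie", "Curie", "worked", "at", "CNRS"]

# Text classification: a single label for the whole input.
doc_label = "science"  # hypothetical document-level label

# Sequence labeling: one BIO tag per token, same length as the input.
token_labels = ["B-PER", "I-PER", "O", "O", "B-ORG"]
assert len(token_labels) == len(tokens)

for tok, lab in zip(tokens, token_labels):
    print(f"{tok}\t{lab}")
```

The key invariant is the length check: a sequence labeler always emits exactly as many labels as there are input tokens.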

Why It Matters

Sequence labeling is the technical foundation for named entity recognition, slot filling, relation extraction, and syntactic analysis. Any NLP application that needs to locate specific information within text—not just classify the whole text—relies on sequence labeling. For chatbot slot filling, sequence labeling identifies exactly which words in 'book a flight to Paris on Friday' contribute to the destination ('Paris') and date ('Friday') slots. Understanding sequence labeling as a paradigm helps practitioners design better NLP pipelines and interpret model architectures.

How It Works

Transformer-based sequence labelers use a BERT-style encoder to produce contextual token embeddings, followed by a linear classification layer that predicts label probabilities for each token. CRF decoding adds a transition score matrix that captures valid label transitions (I-PER cannot follow B-ORG in BIO tagging), enforcing globally consistent label sequences via the Viterbi algorithm. Training minimizes cross-entropy loss over all token labels. WordPiece tokenization creates subword tokens that must be aligned back to original word boundaries for final predictions—typically using the first subword token's label for each word.
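The CRF-style decoding step can be sketched with a small, self-contained Viterbi routine. The label set, emission scores, and transition scores below are invented for illustration; in a real system the encoder produces the emission scores and the transition matrix is learned during training:

```python
NEG_INF = float("-inf")
LABELS = ["O", "B-PER", "I-PER"]

# Hypothetical transition scores: unlisted pairs score 0.0. Forbidding
# START -> I-PER and O -> I-PER rules out invalid BIO sequences.
FORBIDDEN = {("START", "I-PER"), ("O", "I-PER")}

def trans(prev, cur):
    return NEG_INF if (prev, cur) in FORBIDDEN else 0.0

def viterbi(emissions):
    """emissions: one {label: score} dict per token; returns the best label path."""
    # best[l] = (score, path) of the best sequence ending in label l
    best = {l: (trans("START", l) + emissions[0][l], [l]) for l in LABELS}
    for em in emissions[1:]:
        best = {
            l: max(
                (best[p][0] + trans(p, l) + em[l], best[p][1] + [l])
                for p in LABELS
            )
            for l in LABELS
        }
    return max(best.values())[1]

# Independent per-token argmax would pick the invalid sequence
# ["I-PER", "I-PER"] here; Viterbi recovers a valid one instead.
emissions = [
    {"O": 0.1, "B-PER": 1.0, "I-PER": 1.2},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.5},
]
print(viterbi(emissions))  # ['B-PER', 'I-PER']
```

Setting forbidden transitions to negative infinity is what makes the constraint hard: any path through an invalid transition scores negative infinity and can never win the final max.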

Sequence Labeling — BIO Tagging

Input Tokens

Token     Tag
Marie     B-PER
Curie     I-PER
worked    O
at        O
CNRS      B-ORG
in        O
Paris     B-LOC

Tag Legend

B-PER: Begin Person
I-PER: Inside Person
B-ORG: Begin Organization
B-LOC: Begin Location
O: Outside (no entity)
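Reading BIO tags back into entity spans is a simple deterministic step. A minimal sketch in plain Python, using the example sentence and tag set above:

```python
def bio_to_spans(tokens, tags):
    """Convert parallel token/BIO-tag lists into (entity_text, type) spans."""
    spans, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):              # a new entity begins
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and etype == tag[2:]:
            current.append(tok)               # the current entity continues
        else:                                 # O (or inconsistent I-) ends it
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        spans.append((" ".join(current), etype))
    return spans

tokens = ["Marie", "Curie", "worked", "at", "CNRS", "in", "Paris"]
tags   = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# [('Marie Curie', 'PER'), ('CNRS', 'ORG'), ('Paris', 'LOC')]
```

This is the step that turns per-token labels into the multi-word entities ("Marie Curie") that downstream systems actually consume.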

Real-World Example

An accounts payable automation system uses sequence labeling to extract invoice fields from OCR-processed purchase orders. Trained on 5,000 labeled invoices, the model labels each token as B-vendor, I-vendor, B-amount, I-amount, B-date, I-date, or O. For an invoice reading '...Amazon Web Services TOTAL: $4,821.50 due 2026-04-01...', the sequence labeler correctly identifies entity spans that the accounts payable system then normalizes and enters into the ERP system—processing invoices that previously required 15 minutes of manual data entry in under 2 seconds.

Common Mistakes

  • Ignoring subword-to-word alignment when using WordPiece tokenizers—models must map subword labels back to original word boundaries
  • Skipping CRF decoding for fine-grained tasks—independent per-token softmax produces invalid BIO label sequences (I-PER without preceding B-PER)
  • Treating sequence labeling as classification—the sequential nature and label dependencies require different architectures and evaluation metrics
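The first mistake, subword-to-word alignment, can be illustrated with a short sketch. The subword pieces below follow WordPiece's "##" continuation convention but are hand-written for illustration, not produced by a real tokenizer; the alignment rule is the common first-subword strategy mentioned above:

```python
subwords       = ["Cur", "##ie", "worked", "at", "CN", "##RS"]
subword_labels = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG"]

words, word_labels = [], []
for piece, label in zip(subwords, subword_labels):
    if piece.startswith("##"):
        words[-1] += piece[2:]      # glue continuation piece onto current word
    else:
        words.append(piece)         # new word: keep its first piece's label
        word_labels.append(label)

print(list(zip(words, word_labels)))
# [('Curie', 'B-PER'), ('worked', 'O'), ('at', 'O'), ('CNRS', 'B-ORG')]
```

Note how the labels predicted for "##ie" and "##RS" are discarded: each word's label comes from its first subword only, so the output has one label per original word.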
