
💬 Natural Language Processing 40 guides · updated 2026

From tokenisation and embeddings to transformer-based language understanding — the NLP fundamentals that underpin every modern LLM.

Flair NLP

Flair is a Python NLP framework, originally developed at Zalando Research, known for its contextual string embeddings: character-level language model embeddings that capture context and handle rare words, misspellings, and subword morphology. When they were introduced, these embeddings achieved state-of-the-art results on NER and other sequence labeling benchmarks.


Installation

pip install flair

The Key Innovation: Contextual String Embeddings

Traditional word embeddings give “bank” the same vector in every context. Flair’s embeddings instead come from a character-level language model: the vector for a word is computed from the model’s internal state after it has read the surrounding characters of that specific sentence. This makes them sensitive to capitalization, context, and morphology.
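To make the “bank” example concrete, the sketch below embeds the word in two different sentences and compares the resulting vectors. The two sentences and the helper names `cosine`, `bank_vector`, and `compare_bank_vectors` are illustrative assumptions, not part of Flair; calling `compare_bank_vectors()` requires Flair and downloads the `news-forward` model on first use.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bank_vector(text: str, embedding) -> List[float]:
    """Embed `text` and return the vector of its first "bank" token."""
    from flair.data import Sentence  # lazy import: needs `pip install flair`

    sentence = Sentence(text)
    embedding.embed(sentence)
    token = next(t for t in sentence if t.text == "bank")
    return token.embedding.tolist()


def compare_bank_vectors() -> float:
    """Similarity of "bank" in a financial context and a river context."""
    from flair.embeddings import FlairEmbeddings

    embedding = FlairEmbeddings("news-forward")
    money = bank_vector("She deposited the cheque at the bank .", embedding)
    river = bank_vector("They had a picnic on the river bank .", embedding)
    return cosine(money, river)
```

A static embedding stores one vector per word, so the same comparison would give a similarity of 1.0. With contextual string embeddings the value should be lower, although the exact number depends on the model version.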


Named Entity Recognition with Flair

from flair.data import Sentence
from flair.models import SequenceTagger
# Load pre-trained NER model (downloads automatically on first run)
tagger = SequenceTagger.load("ner") # English NER (CoNLL-2003)
text = "Anthropic, founded by Dario Amodei and Daniela Amodei in San Francisco, released Claude 3 in March 2024."
sentence = Sentence(text)
tagger.predict(sentence)
print("Named Entities:")
for entity in sentence.get_spans("ner"):
    print(f" {entity.text:<30} [{entity.tag}] score: {entity.score:.4f}")
# Example output (exact scores depend on the model version):
# Anthropic                      [ORG] score: 0.9994
# Dario Amodei                   [PER] score: 0.9986
# Daniela Amodei                 [PER] score: 0.9981
# San Francisco                  [LOC] score: 0.9997
# Claude 3                       [MISC] score: 0.9842
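Downstream code usually wants the entities grouped by type instead of printed one by one. The following is a minimal sketch of that post-processing step; `group_by_tag` and `entity_pairs` are hypothetical helper names, and the sample pairs are copied from the example output above so that the grouping logic runs without Flair installed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_tag(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (entity text, tag) pairs into {tag: [texts]} in first-seen order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, tag in pairs:
        if text not in grouped[tag]:  # skip repeated mentions of an entity
            grouped[tag].append(text)
    return dict(grouped)


def entity_pairs(sentence) -> List[Tuple[str, str]]:
    """Extract (text, tag) pairs from a Flair Sentence that was already tagged."""
    return [
        (span.text, span.get_label("ner").value)
        for span in sentence.get_spans("ner")
    ]


# Pairs taken from the article's example output (no model needed to run this).
example = [
    ("Anthropic", "ORG"),
    ("Dario Amodei", "PER"),
    ("Daniela Amodei", "PER"),
    ("San Francisco", "LOC"),
    ("Claude 3", "MISC"),
]
print(group_by_tag(example))
```

With a real tagged sentence, `group_by_tag(entity_pairs(sentence))` should produce the same kind of dictionary.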

Available NER Models

from flair.models import SequenceTagger
# Standard English NER (CoNLL-2003: PER, ORG, LOC, MISC)
tagger_standard = SequenceTagger.load("ner")
# Large model (higher accuracy)
tagger_large = SequenceTagger.load("ner-large")
# Fast model (lower latency)
tagger_fast = SequenceTagger.load("ner-fast")
# Multilingual NER (trained on the English, German, Dutch, and Spanish CoNLL data)
tagger_multi = SequenceTagger.load("ner-multi")
# Fine-grained NER (18 entity types — dates, events, products, etc.)
tagger_ontonotes = SequenceTagger.load("ner-ontonotes-large")

POS Tagging

from flair.data import Sentence
from flair.models import SequenceTagger
tagger = SequenceTagger.load("pos")
sentence = Sentence("Large language models generate contextually appropriate text responses.")
tagger.predict(sentence)
for token in sentence.tokens:
    print(f"{token.text:<25} {token.get_label('pos').value}")
# Example output (Penn Treebank tags):
# Large                     JJ
# language                  NN
# models                    NNS
# generate                  VBP
# contextually              RB
# appropriate               JJ
# text                      NN
# responses                 NNS
# .                         .

Stacked Embeddings

Flair’s superpower is the ability to stack multiple embedding types to combine their strengths:

from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.data import Sentence
# Combine static GloVe vectors with forward and backward Flair embeddings
stacked_embeddings = StackedEmbeddings([
    WordEmbeddings('glove'),           # Static GloVe word vectors
    FlairEmbeddings('news-forward'),   # Forward character language model
    FlairEmbeddings('news-backward'),  # Backward character language model
])
sentence = Sentence("The NLP model achieved remarkable benchmark performance.")
stacked_embeddings.embed(sentence)
for token in sentence:
    print(f"{token.text:<20} embedding dim: {token.embedding.shape}")
# Each token prints torch.Size([4196]): GloVe (100) + Flair forward (2048) + Flair backward (2048)
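Stacking concatenates the component vectors, so the final dimension is the sum of the component lengths. The sketch below shows that arithmetic. `EmbeddingInfo` is a hypothetical stand-in for a real embedding object, and the lengths in it are assumptions for illustration that should be confirmed against the installed models.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class EmbeddingInfo:
    """Hypothetical stand-in exposing the two attributes read below."""
    name: str
    embedding_length: int


def stack_summary(embeddings: Iterable) -> Tuple[List[Tuple[str, int]], int]:
    """Return per-component (name, length) pairs and the concatenated length."""
    parts = [(e.name, e.embedding_length) for e in embeddings]
    return parts, sum(length for _, length in parts)


# Assumed lengths, for illustration only; query the real objects to confirm.
components = [
    EmbeddingInfo("glove", 100),
    EmbeddingInfo("news-forward", 2048),
    EmbeddingInfo("news-backward", 2048),
]
parts, total = stack_summary(components)
print(parts, total)
```

With Flair installed, passing `stacked_embeddings.embeddings` to `stack_summary` should give a similar breakdown, and the total should match `stacked_embeddings.embedding_length`.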

Sentence-Level Embeddings

from flair.embeddings import SentenceTransformerDocumentEmbeddings
from flair.data import Sentence
# Use sentence-transformers through Flair (requires the sentence-transformers package)
sentence_embedder = SentenceTransformerDocumentEmbeddings('all-MiniLM-L6-v2')
sentences = [
    Sentence("Flair achieves excellent results on NER benchmarks."),
    Sentence("Named entity recognition identifies persons and organizations."),
    Sentence("Pizza dough needs to ferment overnight in the refrigerator."),
]
for sent in sentences:
    sentence_embedder.embed(sent)
    print(f"Embedding shape: {sent.embedding.shape}")  # torch.Size([384])

Text Classification with Flair

from flair.models import TextClassifier
from flair.data import Sentence
# Load sentiment classifier
classifier = TextClassifier.load("sentiment")
texts = [
    "This NLP library is incredibly powerful and easy to use!",
    "The documentation is incomplete and the examples are confusing.",
    "The performance is adequate for most basic NLP tasks.",
]
for text in texts:
    sentence = Sentence(text)
    classifier.predict(sentence)
    label = sentence.labels[0].value
    score = sentence.labels[0].score
    print(f"[{label} {score:.3f}] {text[:55]}")
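The pre-trained sentiment model appears to be binary (POSITIVE or NEGATIVE), so a lukewarm text such as the third example is forced into one of the two classes. One possible workaround is to treat low-confidence predictions as neutral, sketched below. This is an assumption layered on top of Flair, not a Flair feature: the `NEUTRAL` label, the `with_neutral_band` helper, and the 0.9 threshold are all hypothetical and would need tuning on real data.

```python
from typing import List, Tuple


def with_neutral_band(label: str, score: float, threshold: float = 0.9) -> str:
    """Map a low-confidence binary prediction to a hypothetical NEUTRAL label."""
    return label if score >= threshold else "NEUTRAL"


def classify(texts: List[str], threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Run Flair's sentiment model on `texts` and apply the neutral band."""
    from flair.data import Sentence  # lazy import: needs `pip install flair`
    from flair.models import TextClassifier

    classifier = TextClassifier.load("sentiment")
    sentences = [Sentence(text) for text in texts]
    classifier.predict(sentences)  # predict() also accepts a list of sentences
    return [
        (with_neutral_band(s.labels[0].value, s.labels[0].score, threshold), text)
        for s, text in zip(sentences, texts)
    ]
```

Classifier confidence scores are not calibrated probabilities, so a fixed threshold is only a rough heuristic.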

Training a Custom NER Model

from flair.data import Corpus
from flair.datasets import CONLL_03
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
# Load training corpus (CoNLL-2003; the data is not downloaded automatically
# and must already be present in Flair's dataset folder)
corpus: Corpus = CONLL_03()
# Define tag type
tag_type = 'ner'
tag_dictionary = corpus.make_label_dictionary(label_type=tag_type)
# Define embeddings
embeddings = StackedEmbeddings([
    WordEmbeddings('glove'),
    FlairEmbeddings('news-forward-fast'),
    FlairEmbeddings('news-backward-fast'),
])
# Create sequence tagger
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type=tag_type,
    use_crf=True,  # Conditional Random Field output layer
)
# Train
trainer = ModelTrainer(tagger, corpus)
trainer.train(
    base_path='./ner-model',
    learning_rate=0.1,
    mini_batch_size=32,
    max_epochs=10,
)
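Once training finishes, the saved model can be loaded back for prediction. The sketch below assumes the trainer wrote `final-model.pt` (and, when a dev set was evaluated, `best-model.pt`) into `base_path`, which is Flair's usual behaviour but worth checking in the output folder. `checkpoint_path` and `tag_with_trained_model` are hypothetical helper names.

```python
from pathlib import Path
from typing import List, Tuple


def checkpoint_path(base_path: str, prefer_best: bool = True) -> Path:
    """Return best-model.pt if it exists and is preferred, else final-model.pt."""
    base = Path(base_path)
    best = base / "best-model.pt"
    if prefer_best and best.exists():
        return best
    return base / "final-model.pt"


def tag_with_trained_model(
    text: str, base_path: str = "./ner-model"
) -> List[Tuple[str, str]]:
    """Load the trained tagger and return (entity text, tag) pairs for `text`."""
    from flair.data import Sentence  # lazy import: needs `pip install flair`
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(str(checkpoint_path(base_path)))
    sentence = Sentence(text)
    tagger.predict(sentence)
    return [
        (span.text, span.get_label("ner").value)
        for span in sentence.get_spans("ner")
    ]
```

For example, `tag_with_trained_model("George Washington went to Washington.")` would return the spans found by the newly trained model.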

Flair vs spaCy vs Hugging Face

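| Aspect | Flair | spaCy | Hugging Face |
| --- | --- | --- | --- |
| NER accuracy | Excellent | High | Highest |
| Contextual embeddings | Yes (char-level) | Via trf pipelines | Yes (BERT-based) |
| Speed | Slower | Fast | Moderate |
| Stacking | Yes | No | No |
| Ease of use | Good | Excellent | Good |
| Model hub | Yes | Yes | Very large |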

Flair is a good choice when you need strong NER accuracy without writing custom model code, or when you want to experiment with stacked embeddings. For speed-critical production systems, spaCy is a good alternative; its transformer pipelines trade some of that speed for higher accuracy.