Flair NLP
Flair is a Python NLP framework from Zalando Research, best known for its contextual string embeddings: character-level language model embeddings that capture context and cope with rare words, misspellings, and subword morphology. Models built on these embeddings reported state-of-the-art results on NER and other sequence labeling benchmarks when they were introduced.
Installation
pip install flair

The Key Innovation: Contextual String Embeddings
Traditional word embeddings give “bank” the same vector in every context. Flair’s embeddings are character-level language model representations — the vector for a word depends on the characters surrounding it in the specific sentence. This makes them sensitive to capitalization, context, and morphology.
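To make the contrast concrete, here is a toy sketch in plain Python. It is not Flair's language model: a rolling integer state stands in for the hidden state of a character-level model, and `run_states`, `contextual_repr`, and the `STATIC` table are hypothetical names invented for this illustration. The only point is that a representation read off at the word's boundaries depends on the surrounding characters, while a lookup table does not.

```python
# Toy illustration only: a rolling integer state stands in for the hidden
# state of a character-level language model. This is NOT Flair's model.

BASE = 257  # larger than any ASCII code point, so distinct ASCII strings give distinct states


def run_states(chars):
    """Return the running state after each character in `chars`."""
    state, states = 0, []
    for ch in chars:
        state = state * BASE + ord(ch)
        states.append(state)
    return states


def contextual_repr(sentence, word):
    """Toy contextual representation of the first occurrence of `word`."""
    start = sentence.index(word)
    end = start + len(word)
    # Forward pass reads left to right; take the state after the word's last character.
    forward = run_states(sentence)[end - 1]
    # Backward pass reads right to left; take the state once it has read
    # back to the word's first character.
    backward = run_states(sentence[::-1])[len(sentence) - 1 - start]
    return forward, backward


# Hypothetical static embedding table: one fixed vector per word.
STATIC = {"bank": (0.12, -0.40)}

s1 = "She sat on the river bank."
s2 = "He works at the bank downtown."

static_1, static_2 = STATIC["bank"], STATIC["bank"]
ctx_1 = contextual_repr(s1, "bank")
ctx_2 = contextual_repr(s2, "bank")

print(static_1 == static_2)  # True: the lookup ignores context
print(ctx_1 == ctx_2)        # False: the toy states depend on the sentence
```

A real character language model produces dense learned vectors rather than integers, but the extraction idea is the same: the word's representation is taken from model states that have already seen the surrounding text.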
Named Entity Recognition with Flair
from flair.data import Sentence
from flair.models import SequenceTagger

# Load pre-trained NER model (downloads automatically on first run)
tagger = SequenceTagger.load("ner")  # English NER (CoNLL-2003)

text = "Anthropic, founded by Dario Amodei and Daniela Amodei in San Francisco, released Claude 3 in March 2024."
sentence = Sentence(text)

# Run the tagger; predictions are attached to the sentence in place
tagger.predict(sentence)

print("Named Entities:")
for entity in sentence.get_spans("ner"):
    print(f"  {entity.text:<30} [{entity.tag}] score: {entity.score:.4f}")

# Anthropic                      [ORG] score: 0.9994
# Dario Amodei                   [PER] score: 0.9986
# Daniela Amodei                 [PER] score: 0.9981
# San Francisco                  [LOC] score: 0.9997
# Claude 3                       [MISC] score: 0.9842

Available NER Models
from flair.models import SequenceTagger

# Standard English NER (CoNLL-2003: PER, ORG, LOC, MISC)
tagger_standard = SequenceTagger.load("ner")

# Large model (higher accuracy)
tagger_large = SequenceTagger.load("ner-large")

# Fast model (lower latency)
tagger_fast = SequenceTagger.load("ner-fast")

# Multilingual NER (trained on the CoNLL-03 languages: English, German, Dutch, Spanish)
tagger_multi = SequenceTagger.load("ner-multi")

# Fine-grained NER (18 entity types: dates, events, products, etc.)
tagger_ontonotes = SequenceTagger.load("ner-ontonotes-large")

POS Tagging
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("pos")

sentence = Sentence("Large language models generate contextually appropriate text responses.")
tagger.predict(sentence)

# Print each token next to its predicted part-of-speech tag
for token in sentence.tokens:
    print(f"{token.text:<25} {token.get_label('pos').value}")

# Large                     JJ
# language                  NN
# models                    NNS
# generate                  VBP
# contextually              RB
# appropriate               JJ
# text                      NN
# responses                 NNS

Stacked Embeddings
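Flair’s superpower is the ability to stack multiple embedding types. Each token’s vectors from the different sources are concatenated into one longer vector, so a downstream model sees static word-level and contextual character-level information together.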
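As a rough mental model, stacking is per-token concatenation. The sketch below uses plain Python lists and made-up numbers. The `stack` helper and both lookup tables are hypothetical, and keying the "contextual" vectors by token string is a simplification, since real contextual vectors depend on the sentence.

```python
# Hypothetical per-token vectors from two embedding sources (made-up numbers).
# Real contextual vectors vary by sentence; a dict keyed by token is a simplification.
static_vectors = {"Flair": [0.1, 0.2], "works": [0.3, 0.4]}
contextual_vectors = {"Flair": [0.5, 0.6, 0.7], "works": [0.8, 0.9, 1.0]}


def stack(token, *sources):
    """Concatenate the token's vector from every source, in order."""
    stacked = []
    for source in sources:
        stacked.extend(source[token])
    return stacked


for token in ["Flair", "works"]:
    vector = stack(token, static_vectors, contextual_vectors)
    # The stacked length is the sum of the source lengths (2 + 3 here).
    print(token, len(vector), vector)
```

In Flair itself, the same idea is expressed with StackedEmbeddings: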
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.data import Sentence

# Combine GloVe with forward and backward Flair contextual embeddings
stacked_embeddings = StackedEmbeddings([
    WordEmbeddings('glove'),          # GloVe static word vectors
    FlairEmbeddings('news-forward'),  # Forward language model
    FlairEmbeddings('news-backward'), # Backward language model
])

sentence = Sentence("The NLP model achieved remarkable benchmark performance.")
stacked_embeddings.embed(sentence)

for token in sentence:
    print(f"{token.text:<20} embedding dim: {token.embedding.shape}")

# Each vector is the concatenation of the GloVe vector (100 dimensions) and the
# forward and backward Flair language model states, so its length is the sum of
# the component sizes (also available as stacked_embeddings.embedding_length).

Sentence-Level Embeddings
from flair.embeddings import SentenceTransformerDocumentEmbeddings
from flair.data import Sentence

# Use sentence-transformers through Flair
sentence_embedder = SentenceTransformerDocumentEmbeddings('all-MiniLM-L6-v2')

sentences = [
    Sentence("Flair achieves excellent results on NER benchmarks."),
    Sentence("Named entity recognition identifies persons and organizations."),
    Sentence("Pizza dough needs to ferment overnight in the refrigerator."),
]

# Embed each sentence as a single document-level vector
for sent in sentences:
    sentence_embedder.embed(sent)
    print(f"Embedding shape: {sent.embedding.shape}")  # 384 dimensions

Text Classification with Flair
from flair.models import TextClassifier
from flair.data import Sentence

# Load sentiment classifier
classifier = TextClassifier.load("sentiment")

texts = [
    "This NLP library is incredibly powerful and easy to use!",
    "The documentation is incomplete and the examples are confusing.",
    "The performance is adequate for most basic NLP tasks.",
]

for text in texts:
    sentence = Sentence(text)
    classifier.predict(sentence)
    # The top prediction carries the label and its confidence score
    label = sentence.labels[0].value
    score = sentence.labels[0].score
    print(f"[{label} {score:.3f}] {text[:55]}")

Training a Custom NER Model
from flair.data import Corpus
from flair.datasets import CONLL_03
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Load training corpus (CoNLL-2003 format)
# Note: Flair does not download CoNLL-2003; obtain the files yourself and
# point the base_path argument at their folder.
corpus: Corpus = CONLL_03()

# Define tag type
tag_type = 'ner'
tag_dictionary = corpus.make_label_dictionary(label_type=tag_type)

# Define embeddings
embeddings = StackedEmbeddings([
    WordEmbeddings('glove'),
    FlairEmbeddings('news-forward-fast'),
    FlairEmbeddings('news-backward-fast'),
])

# Create sequence tagger
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type=tag_type,
    use_crf=True,  # Conditional Random Field output layer
)

# Train
trainer = ModelTrainer(tagger, corpus)
trainer.train(
    base_path='./ner-model',
    learning_rate=0.1,
    mini_batch_size=32,
    max_epochs=10,
)

Flair vs spaCy vs Hugging Face
| Aspect | Flair | spaCy | Hugging Face |
|---|---|---|---|
| NER accuracy | Excellent | High | Highest |
| Contextual embeddings | Yes (char-level) | Via transformer (`trf`) pipelines | Yes (BERT-based) |
| Speed | Slower | Fast | Moderate |
| Stacking | Yes | No | No |
| Ease of use | Good | Excellent | Good |
| Model hub | Yes | Yes | Very large |
Flair is the right choice when you need excellent NER accuracy without writing custom model code, or when you want to experiment with stacked embeddings. For speed-critical production systems, spaCy, which also offers transformer-based pipelines, is a good alternative.
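The speed ratings above are qualitative, so it is worth timing each candidate on text that resembles your own workload. The following is a minimal sketch under stated assumptions: `time_predictor` is a hypothetical helper, and `dummy_predict` is a stand-in so the snippet runs without any NLP library installed.

```python
import statistics
import time


def time_predictor(predict, texts, repeats=3):
    """Return the median wall-clock seconds `predict` takes per text."""
    per_text = []
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            predict(text)
        elapsed = time.perf_counter() - start
        per_text.append(elapsed / len(texts))
    return statistics.median(per_text)


def dummy_predict(text):
    """Stand-in for a real tagger call; it only splits on whitespace."""
    return text.split()


sample_texts = ["Flair tags named entities.", "spaCy is built for throughput."]
latency = time_predictor(dummy_predict, sample_texts)
print(f"median seconds per text: {latency:.6f}")
```

To compare real libraries, replace `dummy_predict` with a small wrapper around each one, for example a function that builds a `Sentence` and calls `tagger.predict` for Flair. Use a larger, representative sample, and load models outside the timed function so that startup cost is not counted.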