Dependency Parsing in NLP

Dependency parsing maps the grammatical relationships between the words of a sentence. Every word except the root attaches to exactly one head word through a labeled relation such as subject, object, or modifier.

How Dependency Parsing Works

In a dependency parse, the sentence has a single root (usually the main verb), and every other word attaches to exactly one head:

"The researcher published a new paper on transformers."

              published (ROOT)
             /     |      \
   researcher    paper      .
    (nsubj)      (dobj)   (punct)
       |        /   |   \
      The      a   new    on
     (det)  (det) (amod) (prep)
                           |
                      transformers
                         (pobj)

Each edge in the tree is a dependency relation: nsubj (nominal subject), dobj (direct object), det (determiner), prep (prepositional modifier), pobj (object of preposition), amod (adjectival modifier), punct (punctuation).
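One way to make this structure concrete is to store the parse as plain data, with one (word, head index, label) row per token. The sketch below hand-encodes the example sentence and checks the two properties described above: a single root, and a chain of heads from every word up to that root. The rows are written by hand for illustration rather than produced by a parser, and `PARSE`, `children_of`, and `is_valid_tree` are hypothetical names.

```python
# Hand-written parse of the example sentence (illustrative, not parser output).
# Each row is (word, index of its head, dependency label); the root's head is -1.
PARSE = [
    ("The", 1, "det"),            # 0
    ("researcher", 2, "nsubj"),   # 1
    ("published", -1, "ROOT"),    # 2
    ("a", 5, "det"),              # 3
    ("new", 5, "amod"),           # 4
    ("paper", 2, "dobj"),         # 5
    ("on", 5, "prep"),            # 6
    ("transformers", 6, "pobj"),  # 7
    (".", 2, "punct"),            # 8
]


def children_of(parse, head_index):
    """Return the words whose head is the token at head_index."""
    return [word for word, head, _ in parse if head == head_index]


def is_valid_tree(parse):
    """Check for exactly one root and that every token reaches it without a cycle."""
    roots = [i for i, (_, head, _) in enumerate(parse) if head == -1]
    if len(roots) != 1:
        return False
    for start in range(len(parse)):
        seen = set()
        current = start
        while current != -1:
            if current in seen:  # revisiting a token means the heads form a cycle
                return False
            seen.add(current)
            current = parse[current][1]
    return True


print(children_of(PARSE, 2))  # direct dependents of "published"
print(is_valid_tree(PARSE))
```

Storing only head indices is enough to rebuild the whole tree, which is why most parsers expose the parse in roughly this shape.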

Core Dependency Labels

Label	Meaning	Example
`nsubj`	Nominal subject	“Alice runs”
`nsubjpass`	Passive subject	“The paper was written”
`dobj`	Direct object	“She read the book”
`iobj`	Indirect object	“He gave her a gift”
`prep`	Prepositional modifier	“She works at Google”
`pobj`	Object of preposition	“at Google”
`amod`	Adjectival modifier	“a new model”
`advmod`	Adverbial modifier	“runs quickly”
`det`	Determiner	“the model”
`compound`	Compound noun	“language model”
`conj`	Conjunct (coordinated element)	“Apple and Google”
`ROOT`	Root of the sentence, usually the main verb	“She published…”

Dependency Parsing with spaCy

import spacy
nlp = spacy.load("en_core_web_sm")

text = "OpenAI released GPT-5 which significantly outperformed previous language models."
doc = nlp(text)

for token in doc:
    print(f"{token.text:<20} dep: {token.dep_:<12} head: {token.head.text}")

# Typical output (labels can differ slightly between model versions):
# OpenAI               dep: nsubj        head: released
# released             dep: ROOT         head: released
# GPT-5                dep: dobj         head: released
# which                dep: nsubj        head: outperformed
# significantly        dep: advmod       head: outperformed
# outperformed         dep: relcl        head: GPT-5
# previous             dep: amod         head: models
# language             dep: compound     head: models
# models               dep: dobj         head: outperformed
# .                    dep: punct        head: released

Visualizing Dependency Trees

from spacy import displacy

doc = nlp("The model efficiently handles long-context reasoning tasks.")
displacy.render(doc, style="dep", jupyter=True, options={"distance": 120})
# For a standalone script:
# displacy.serve(doc, style="dep")
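Where neither a notebook nor a browser is available, for example when logging parses in a terminal, an indented text rendering can stand in for the diagram. The following is a rough sketch in plain Python and is not part of displaCy. It assumes the parse is available as (word, head index, label) rows with -1 marking the root; the rows here are hand-written for a shortened sentence, and `format_tree` is a hypothetical helper.

```python
def format_tree(rows, head=-1, depth=0):
    """Return indented lines for every token under `head`, depth first."""
    lines = []
    for index, (word, head_index, label) in enumerate(rows):
        if head_index == head:
            lines.append("  " * depth + f"{word} ({label})")
            # Recurse into this token's own dependents, one level deeper.
            lines.extend(format_tree(rows, index, depth + 1))
    return lines


# Hand-written parse of "The model handles reasoning tasks." (illustrative only).
TREE_ROWS = [
    ("The", 1, "det"),
    ("model", 2, "nsubj"),
    ("handles", -1, "ROOT"),
    ("reasoning", 4, "compound"),
    ("tasks", 2, "dobj"),
    (".", 2, "punct"),
]

print("\n".join(format_tree(TREE_ROWS)))
# handles (ROOT)
#   model (nsubj)
#     The (det)
#   tasks (dobj)
#     reasoning (compound)
#   . (punct)
```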

Extracting Subject-Verb-Object Triples

import spacy
nlp = spacy.load("en_core_web_sm")

def get_svo_triples(text):
    doc = nlp(text)
    triples = []

    for token in doc:
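        # Copular "be" is tagged AUX in current spaCy English models, and `attr` attaches to it.
        if token.pos_ in ("VERB", "AUX"):
            subjects = [w for w in token.lefts if w.dep_ in ("nsubj", "nsubjpass")]
            # `pobj` is a child of a preposition, never of the verb, so it is not listed here.
            objects  = [w for w in token.rights if w.dep_ in ("dobj", "attr")]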

            for subj in subjects:
                for obj in objects:
                    triples.append({
                        "subject": subj.text,
                        "verb": token.lemma_,
                        "object": obj.text
                    })

    return triples

texts = [
    "Google acquired YouTube in 2006 for $1.65 billion.",
    "Anthropic trained Claude using constitutional AI methods.",
    "Researchers published findings that challenged existing benchmarks."
]

for t in texts:
    print(get_svo_triples(t))
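Prepositional objects need one extra hop, because a `pobj` token is a child of its preposition rather than of the verb. The sketch below illustrates that hop on a hand-written parse of a shortened version of the first sentence, stored as plain tuples so that it runs without a model. The rows and the helper name `prepositional_objects` are assumptions made for illustration, and a real parser may attach the phrases differently.

```python
# Hand-written parse of "Google acquired YouTube in 2006." (illustrative only).
# Each row is (word, index of its head, label); the root's head is -1.
SENTENCE_ROWS = [
    ("Google", 1, "nsubj"),
    ("acquired", -1, "ROOT"),
    ("YouTube", 1, "dobj"),
    ("in", 1, "prep"),
    ("2006", 3, "pobj"),
    (".", 1, "punct"),
]


def prepositional_objects(rows, verb_index):
    """Return (preposition, object) pairs reached via verb -> prep -> pobj."""
    pairs = []
    for prep_index, (prep_word, prep_head, prep_label) in enumerate(rows):
        if prep_head == verb_index and prep_label == "prep":
            # Second hop: look for pobj children of this preposition.
            for word, head, label in rows:
                if head == prep_index and label == "pobj":
                    pairs.append((prep_word, word))
    return pairs


print(prepositional_objects(SENTENCE_ROWS, 1))  # [('in', '2006')]
```

Keeping the preposition alongside its object preserves the difference between, say, "in 2006" and "for $1.65 billion", which a bare object list would lose.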

Navigating the Dependency Tree

Each spaCy token exposes attributes for traversing the parse tree, including `subtree`, `lefts`, and `rights`:

doc = nlp("The startup's innovative NLP platform attracted significant investor attention.")

for token in doc:
    if token.dep_ == "ROOT":
        root = token
        print(f"Root verb: {root.text}")
        print(f"Subtree: {[t.text for t in root.subtree]}")
        print(f"Left children: {[t.text for t in root.lefts]}")
        print(f"Right children: {[t.text for t in root.rights]}")
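Head pointers are also enough to recover the dependency path between any two tokens, which is a common building block for relation-extraction features. In spaCy the upward walk corresponds to following `token.head` repeatedly; the sketch below shows the same idea on a bare list of head indices so that the logic is visible. The head list is hand-written for the example sentence (a parser may differ), and the function names are hypothetical.

```python
WORDS = ["The", "startup", "'s", "innovative", "NLP", "platform",
         "attracted", "significant", "investor", "attention", "."]
# Hand-written head index for each word; -1 marks the root ("attracted").
HEADS = [1, 5, 1, 5, 5, 6, -1, 9, 9, 6, 6]


def path_to_root(heads, index):
    """Return the token indices from `index` up to the root, inclusive."""
    path = [index]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_path(heads, a, b):
    """Return the indices on the tree path from a to b via their lowest common ancestor."""
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    on_path_a = set(up_a)
    # First node on b's upward path that also lies on a's upward path.
    meet = next(i for i in up_b if i in on_path_a)
    return up_a[:up_a.index(meet) + 1] + up_b[:up_b.index(meet)][::-1]


path = dependency_path(HEADS, 1, 8)
print(" -> ".join(WORDS[i] for i in path))
# startup -> platform -> attracted -> attention -> investor
```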

Multilingual Dependency Parsing with Stanza

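Stanza provides pretrained Universal Dependencies (UD) pipelines for 70+ languages. Its labels follow the UD scheme, so a direct object is labeled `obj` rather than `dobj` as in spaCy's English models: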

import stanza
stanza.download('en')

nlp_stanza = stanza.Pipeline('en')
doc = nlp_stanza("She quickly analyzed the complex dataset.")

for sent in doc.sentences:
    for word in sent.words:
        head = sent.words[word.head - 1].text if word.head > 0 else "ROOT"
        print(f"{word.text:<15} deprel: {word.deprel:<10} head: {head}")
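The `word.head - 1` arithmetic is needed because Stanza follows the CoNLL-U convention: word ids start at 1 and a head of 0 marks the root. spaCy instead uses 0-based token indices and makes the root its own head. A small sketch of converting between the two conventions, using hand-written head values for the example sentence rather than real pipeline output (the function name is hypothetical):

```python
def conllu_heads_to_zero_based(heads):
    """Map 1-based heads (0 = root) to 0-based heads where the root points to itself."""
    return [i if h == 0 else h - 1 for i, h in enumerate(heads)]


SENT_WORDS = ["She", "quickly", "analyzed", "the", "complex", "dataset", "."]
# Hand-written CoNLL-U style heads: "analyzed" (id 3) is the root, so its head is 0.
STANZA_STYLE_HEADS = [3, 3, 0, 6, 6, 3, 3]

zero_based = conllu_heads_to_zero_based(STANZA_STYLE_HEADS)
for word, head in zip(SENT_WORDS, zero_based):
    print(f"{word:<10} head: {SENT_WORDS[head]}")
```

Normalizing to one convention early makes it easier to reuse the same tree-walking code on output from either library.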

Real-World Use Cases

Knowledge graph construction: extract entity relationships at scale from news or scientific articles using SVO triples.

Coreference resolution: dependency paths and grammatical roles serve as features for deciding which noun a pronoun refers to.

Semantic role labeling: build on dependency parses to identify “who did what to whom, when, and where.”

RAG preprocessing: annotating documents with dependency-derived facts can support structured retrieval in knowledge-intensive QA systems.

Document-level relation extraction: the attention in large language models implicitly captures dependency-like relationships, but explicit parses still help with interpretability and structured pipelines.
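For the knowledge-graph use case, SVO triples map naturally onto labeled edges. A minimal sketch, assuming triples shaped like the dictionaries returned by `get_svo_triples` earlier in this article; the sample triples are written by hand rather than produced by a parser, and `build_graph` is a hypothetical helper:

```python
from collections import defaultdict


def build_graph(triples):
    """Group triples into {subject: [(verb, object), ...]} adjacency lists."""
    graph = defaultdict(list)
    for triple in triples:
        edge = (triple["verb"], triple["object"])
        if edge not in graph[triple["subject"]]:  # skip exact duplicates
            graph[triple["subject"]].append(edge)
    return dict(graph)


# Hand-written sample input, including one repeated fact.
SAMPLE_TRIPLES = [
    {"subject": "Google", "verb": "acquire", "object": "YouTube"},
    {"subject": "Anthropic", "verb": "train", "object": "Claude"},
    {"subject": "Google", "verb": "acquire", "object": "YouTube"},
]

graph = build_graph(SAMPLE_TRIPLES)
print(graph)
# {'Google': [('acquire', 'YouTube')], 'Anthropic': [('train', 'Claude')]}
```

A production pipeline would normally also resolve entity names to canonical identifiers before merging edges; that step is left out here.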