Parsing
Giuseppe Attardi
Dipartimento di Informatica
Università di Pisa
Question Answering at TREC
Consists of answering a set of 500 fact-based questions, e.g. "When was Mozart born?"
Systems were allowed to return 5 ranked answer snippets to each question.
Mean Reciprocal Rank (MRR) scoring: 1, 0.5, 0.33, 0.25, 0.2, 0 for a correct answer at rank 1, 2, 3, 4, 5, 6+
Mainly Named Entity answers (person, place, date, …)
From 2002 systems are only allowed to return a single exact answer.
TREC 2000 Results (long)
[Results chart; label: Watson Team]
Falcon
The Falcon system from SMU was by far the best performing system at TREC 2000.
It used NLP and performed deep semantic processing.
Question parse
Who was the first Russian astronaut to walk in space
WP  VBD  DT  JJ  NNP  NN  TO  VB  IN  NN
[Constituent tree: NP "Who"; VP "was … space" containing the NP "the first Russian astronaut" and the embedded infinitive S "to walk in space" (VP with PP "in space")]
Question semantic form
[Semantic graph: astronaut modified by first and Russian; walk relates the astronaut to space; answer type: PERSON]
Question logic form:
first(x) ∧ astronaut(x) ∧ Russian(x) ∧ space(z) ∧ walk(y, z, x) ∧ PERSON(x)
Answer type: PERSON
Parsing in QA
Top systems in TREC 2005 perform parsing of queries and answer paragraphs.
Some use specially built parsers.
Parsers are slow: ~1 min/sentence.
Practical Uses of Parsing
Google Knowledge Graph is enriched with relations extracted from dependency trees.
Google's index parses all documents.
Google Translator applies dependency parsing to sentences.
Sentiment analysis improves with dependency parsing.
Statistical Methods in NLP
Some NLP problems:
• Information extraction: named entities, relationships between entities, etc.
• Finding linguistic structure: part-of-speech tagging, chunking, parsing
These can be cast as learning a mapping:
• Strings to hidden state sequences: NE extraction, POS tagging
• Strings to strings: machine translation
• Strings to trees: parsing
• Strings to relational data structures: information extraction
Techniques
• Log-linear (Maximum Entropy) taggers
• Probabilistic context-free grammars (PCFGs)
• Discriminative methods: conditional MRFs, perceptron, kernel methods
Learning mapping
• Strings to hidden state sequences: NE extraction, POS tagging
• Strings to strings: machine translation
• Strings to trees: parsing
• Strings to relational data structures: information extraction
POS as Tagging
INPUT:
Profits soared at Boeing Co., easily topping forecasts on Wall Street.
OUTPUT:
Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V forecasts/N on/P Wall/N Street/N ./.
NE as Tagging
INPUT:
Profits soared at Boeing Co., easily topping forecasts on Wall Street.
OUTPUT:
Profits/O soared/O at/O Boeing/BC Co./IC ,/O easily/O topping/O forecasts/O on/NA Wall/BL Street/IL ./O
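The tags encode entity spans positionally: BC/IC appear to mark begin/inside of a Company and BL/IL of a Location, with O for outside. A hedged sketch of decoding such tags back into spans:

def tags_to_spans(tokens, tags):
    # Decode begin/inside/outside tags into (entity-type, token-span) pairs.
    # Assumes tags of the form "B<type>"/"I<type>" plus "O"; a sketch only,
    # the slide's exact tag set (e.g. "NA") may differ.
    spans, current = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("B"):
            if current:
                spans.append(current)
            current = [tag[1:], i, i + 1]        # [type, start, end)
        elif tag.startswith("I") and current and tag[1:] == current[0]:
            current[2] = i + 1                   # extend the open span
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, tokens[start:end]) for etype, start, end in spans]

# e.g. tags_to_spans(["Boeing", "Co.", ","], ["BC", "IC", "O"])
#      -> [("C", ["Boeing", "Co."])]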
Parsing Technology
Constituent Parsing
Requires a Phrase Structure Grammar: CFG, PCFG, Unification Grammar
Produces a phrase structure parse tree
Rolls-Royce Inc. said it expects its sales to remain steady
[Constituent tree: S with NP "Rolls-Royce Inc." and VP "said …"; the embedded S "it expects its sales to remain steady" contains further NP, VP and ADJP nodes]
Statistical Parsers
Probabilistic generative models of language which include parse structure (e.g. Collins 1997)
Learning consists in estimating the parameters of the model with simple likelihood-based techniques
Conditional parsing models (Charniak 2000; McDonald 2005)
Results
Method Accuracy
PCFGs (Charniak 97) 73.0%
Conditional Models – Decision Trees (Magerman 95) 84.2%
Lexical Dependencies (Collins 96) 85.5%
Conditional Models – Logistic (Ratnaparkhi 97) 86.9%
Generative Lexicalized Model (Charniak 97) 86.7%
Generative Lexicalized Model (Collins 97) 88.2%
Logistic-inspired Model (Charniak 99) 89.6%
Boosting (Collins 2000) 89.8%
Linear Models for Parsing and Tagging
Three components:
GEN is a function from a string to a set of candidates
Φ maps a candidate to a feature vector
W is a parameter vector
Component 1: GEN
GEN enumerates a set of candidates for a sentence:
She announced a program to promote safety in trucks and vans
[GEN maps the sentence to a set of candidate parse trees]
Examples of GEN
• A context-free grammar
• A finite-state machine
• Top N most probable analyses from a probabilistic grammar
Component 2: Φ
• Φ maps a candidate to a feature vector ∈ ℝd
• Φ defines the representation of a candidate, e.g. Φ(x) = <1, 0, 2, 0, 0, 15, 5>
Feature
A "feature" is a function on a structure, e.g.
h(x) = number of times the rule A → B C is seen in x
Feature vector:
A set of functions h1…hd define a feature vector
Φ(x) = <h1(x), h2(x), …, hd(x)>
Component 3: W
W is a parameter vector ∈ ℝd
• Φ · W maps a candidate to a real-valued score
Putting it all together
X is the set of sentences, Y is the set of possible outputs (e.g. trees)
We need to learn a function F : X → Y
GEN, Φ, W define
F(x) = argmax y∈GEN(x) Φ(y) · W
Choose the highest scoring tree as the most plausible structure.
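As code, the decision rule is a one-liner once GEN and Φ are given (a sketch with hypothetical GEN and phi arguments; any candidate enumerator and feature map fit):

import numpy as np

def predict(x, GEN, phi, W):
    # F(x) = argmax over y in GEN(x) of phi(y) . W
    return max(GEN(x), key=lambda y: np.dot(phi(y), W))

Training then amounts to choosing W so that the correct candidate scores highest, e.g. with a perceptron update.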
Dependency Parsing
Produces dependency trees
Word-word dependency relations
Easier to understand and to annotate than constituent trees
Rolls-Royce Inc. said it expects its sales to remain steady
[Dependency graph: word-word arcs labeled SUBJ, OBJ, MOD, TO]
Data-Driven Dependency Parsing
Graph based: consider the possible dependency graphs, define a score and select the graph with the highest score
Transition based: define a transition system that leads to a parse tree while analyzing a sentence one word at a time
Transition-based Shift-Reduce Parsing
He/PP saw/VVD a/DT girl/NN with/IN a/DT telescope/NN ./SENT
[Diagram: "top" marks the top of the stack and "next" the next input token; the available actions are Shift, Left, Right]
Shift/Reduce Dependency Parser
Traditional statistical parsers are trained directly on the task of tagging a sentence.
Instead an SR parser is trained on, and learns, the sequence of parse actions required to build the parse tree.
Grammar Not Required
A traditional parser requires a grammar for generating candidate trees.
An inductive parser needs no grammar.
Parsing as Classification
Inductive dependency parsing
Parsing based on Shift/Reduce actions
Learn from an annotated corpus which action to perform at each step
Dependency Graph
Let R = {r1, …, rm} be the set of permissible dependency types.
A dependency graph for a string of words W = w1 … wn is a labeled directed graph D = (W, A), where:
(a) W is the set of nodes, i.e. word tokens in the input string,
(b) A is a set of labeled arcs (wi, wj, r), wi, wj ∈ W, r ∈ R,
(c) for every wj ∈ W, there is at most one arc (wi, wj, r) ∈ A.
Parser State
The parser state is a quadruple ⟨S, I, T, A⟩, where
S is a stack of partially processed tokens
I is a list of (remaining) input tokens
T is a stack of temporary tokens
A is the arc relation for the dependency graph
(h, d, r) ∈ A represents an arc h → d, tagged with dependency r
Parser Actions
Shift: ⟨S, n|I, T, A⟩ → ⟨n|S, I, T, A⟩
Right(r): ⟨s|S, n|I, T, A⟩ → ⟨S, n|I, T, A ∪ {(n, s, r)}⟩
Left(r): ⟨s|S, n|I, T, A⟩ → ⟨S, s|I, T, A ∪ {(s, n, r)}⟩
In each arc the token removed from the state is the dependent: Right attaches the top of the stack under the next input token; Left attaches the next input token under the top of the stack, which is pushed back onto the input.
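A minimal Python sketch of the three transitions (hypothetical helper code, not DeSR's actual implementation). A state is the quadruple (S, I, T, A), with arcs stored as (head, dependent, label):

def shift(state):
    # <S, n|I, T, A>  =>  <n|S, I, T, A>
    S, I, T, A = state
    return ([I[0]] + S, I[1:], T, A)

def right(state, r):
    # <s|S, n|I, T, A>  =>  <S, n|I, T, A u {(n, s, r)}>
    # s is popped and becomes a dependent of the next input token n
    S, I, T, A = state
    return (S[1:], I, T, A | {(I[0], S[0], r)})

def left(state, r):
    # <s|S, n|I, T, A>  =>  <S, s|I, T, A u {(s, n, r)}>
    # n is consumed as a dependent of s; s moves back to the input
    S, I, T, A = state
    return (S[1:], [S[0]] + I[1:], T, A | {(S[0], I[0], r)})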
Parser Algorithm
The parsing algorithm is fully deterministic and works as follows:

Input Sentence: (w1, p1), (w2, p2), … , (wn, pn)
S = <>
I = <(w1, p1), (w2, p2), … , (wn, pn)>
T = <>
while I != <> do begin
  x = getContext(S, I, T);
  y = estimateAction(model, x);
  performAction(y, S, I, T);
end
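The same loop in runnable form, using the transitions sketched above (getContext and estimateAction are stubbed by a hypothetical extract_features function and model object):

def parse(tokens, model, extract_features):
    # Deterministic one-pass parsing: one classifier call per step.
    state = ([], list(tokens), [], set())     # S, I, T, A
    while state[1]:                           # while I != <>
        x = extract_features(state)           # getContext(S, I, T)
        action, label = model.predict(x)      # estimateAction(model, x)
        if action == "shift":                 # performAction(y, S, I, T)
            state = shift(state)
        elif action == "right":
            state = right(state, label)
        else:
            state = left(state, label)
    return state[3]                           # the arc set A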
Projectivity
An arc wi → wk is projective iff ∀j, i < j < k or i > j > k, wi →* wj.
A dependency tree is projective iff every arc is projective.
Intuitively: arcs can be drawn on a plane without intersections.
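The definition translates directly into a check over head indices (a sketch; heads maps each token position to the position of its head, 0 for the root, and assumes the arcs form a tree):

def is_projective(heads):
    # heads: dict {dependent: head}, positions 1..n, head 0 = root.
    # An arc h -> d is projective iff h dominates (->*) every token
    # strictly between h and d.
    def dominated_by(h, j):
        while j != 0:
            if j == h:
                return True
            j = heads[j]
        return False
    for d, h in heads.items():
        if h == 0:
            continue
        for j in range(min(h, d) + 1, max(h, d)):
            if not dominated_by(h, j):
                return False
    return True

# "I saw a girl yesterday wearing a ring": the arc girl -> wearing spans
# yesterday, which is headed by saw, so the tree is non-projective.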
Non-Projectivity
I saw a girl yesterday wearing a ring
Non-Projectivity
Většinu těchto přístrojů lze take používat nejen jako fax , ale
Addressed by special actions:
Right2, Left2
Right3, Left3
Actions for non-projective arcs
Right2(r): ⟨s1|s2|S, n|I, T, A⟩ → ⟨s1|S, n|I, T, A ∪ {(n, s2, r)}⟩
Left2(r): ⟨s1|s2|S, n|I, T, A⟩ → ⟨s2|S, s1|I, T, A ∪ {(s2, n, r)}⟩
Right3(r): ⟨s1|s2|s3|S, n|I, T, A⟩ → ⟨s1|s2|S, n|I, T, A ∪ {(n, s3, r)}⟩
Left3(r): ⟨s1|s2|s3|S, n|I, T, A⟩ → ⟨s2|s3|S, s1|I, T, A ∪ {(s3, n, r)}⟩
Extract: ⟨s1|s2|S, n|I, T, A⟩ → ⟨n|s1|S, I, s2|T, A⟩
Insert: ⟨S, I, s1|T, A⟩ → ⟨s1|S, I, T, A⟩
Example
Right2 (nejen ← ale), Left3 (fax ← Většinu)
Většinu těchto přístrojů lze take používat nejen jako fax , ale
Examples
zou gemaakt moeten worden in
zou moeten worden gemaakt in
Extract followed by Insert reorders the tokens, turning the first sequence into the second.
Example
Většinu těchto přístrojů lze take používat nejen jako fax , ale
[Step 1, Right2: with nejen two deep in the stack and ale as next input token, the arc ale → nejen is created]
[Step 2, Left3: with Většinu three deep in the stack and fax as next input token, the arc Většinu → fax is created]
[Final tree: both non-projective arcs are in place]
Effectiveness for Non-Projectivity
Training data for Czech contains 28081 non-projective relations
26346 (93%) can be handled by Left2/Right2
1683 (6%) by Left3/Right3
Non-Projective Accuracy

Language  Total  DeSR  MaltParser
Czech  104  77  79
Slovene  88  34  21
Portuguese  54  26  24
Danish  35  10  9
Learning Phase
Features

Feature ID  Value
F  form of token
L  lemma of token
P  part of speech (POS) tag
M  morphology
/F  form of the leftmost child node
/L  lemma of the leftmost child node
/P  POS tag of the leftmost child node, if present
/M  morphology of the leftmost child node
F\  form of the rightmost child node
L\  lemma of the rightmost child node
P\  POS tag of the rightmost child node, if present
M\  morphology of the rightmost child node
Learning Event
[Parser configuration over the Italian fragment "Sosteneva che le leggi anti Serbia che erano discusse …": left context che/PRO, leggi/NOM (with leftmost child le/DET), anti/ADV; target node Serbia/NOM; right context che/PRO (with rightmost child erano/VER), ,/PON]
Features extracted:
(-3, F, che), (-3, P, PRO), (-2, F, leggi), (-2, P, NOM), (-2, M, P), (-2, /F, le), (-2, /P, DET), (-2, /M, P), (-1, F, anti), (-1, P, ADV), (0, F, Serbia), (0, P, NOM), (0, M, S), (+1, F, che), (+1, P, PRO), (+1, F\, erano), (+1, P\, VER), (+1, M\, P), (+2, F, ,), (+2, P, PON)
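A sketch of how such features might be collected around the current state (the token attributes and the stack/input indexing are assumptions for illustration, not DeSR's actual code):

def context_features(state, window=(-3, -2, -1, 0, 1, 2)):
    # Emit (position, code, value) triples as in the event above.
    # Negative positions index into the stack S (-1 = top),
    # non-negative positions into the input I (0 = next token).
    S, I, T, A = state
    feats = []
    for pos in window:
        tok = None
        if pos < 0 and -pos - 1 < len(S):
            tok = S[-pos - 1]
        elif 0 <= pos < len(I):
            tok = I[pos]
        if tok is None:
            continue
        feats.append((pos, "F", tok.form))      # form
        feats.append((pos, "P", tok.pos))       # POS tag
        if tok.morph:
            feats.append((pos, "M", tok.morph))
        if tok.children:                        # /F and F\ child features
            feats.append((pos, "/F", tok.children[0].form))
            feats.append((pos, "F\\", tok.children[-1].form))
    return feats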
DeSR (Dependency Shift Reduce)
Multilanguage statistical transition-based dependency parser
Linear algorithm
Capable of handling non-projectivity
Trained on 28 languages
Available from: http://desr.sourceforge.net/
Parser Architecture
Modular learners architecture: MLP, Maximum Entropy, SVM, Perceptron
Features can be configured
Available Classifiers
Maximum Entropy: fast, not very accurate
SVM: slow, very accurate
Multilayer Perceptron: fast, very accurate
Deep Learning: word embeddings as features
The simplest ANN: Perceptron
[Diagram: an input layer fully connected to an output layer of destination units D0, D1, D2; weights are updated on errors]
Activation function: y = 1 if w · x > θ, 0 otherwise
Learning: w ← w + η (t - y) x
Slide by Geoffrey Hinton
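The update rule above, as a few lines of Python (a sketch for a binary perceptron with threshold 0 absorbed into a bias term):

import numpy as np

def train_perceptron(X, t, epochs=10, eta=0.1):
    # y = 1 if w.x + b > 0 else 0;   w <- w + eta * (t - y) * x
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x_i, t_i in zip(X, t):
            y = 1.0 if x_i @ w + b > 0 else 0.0
            w += eta * (t_i - y) * x_i
            b += eta * (t_i - y)
    return w, b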
Multilayer Perceptron
[Diagram: input vector → hidden layers → outputs.
Compare outputs with the correct answer to get an error signal;
back-propagate the error signal to get derivatives for learning]
Slide by Geoffrey Hinton
BP algorithm
[Diagram: activations flow forward through the network; errors flow backward to update the weights]
The error: E = ½ Σj (tj - yj)²
Update weights: Δwij = -η ∂E/∂wij
[Plot: sigmoid activation rising from 0 to 1 over the range -5 to 5, with value 0.5 at x = 0]
Feature Model
LEMMA: -2 -1 0 1 2 3 prev(0) leftChild(-1) leftChild(0) rightChild(-1) rightChild(0)
POSTAG: -2 -1 0 1 2 3 next(-1) leftChild(-1) leftChild(0) rightChild(-1) rightChild(0)
CPOSTAG: -1 0 1
FEATS: -1 0 1
DEPREL: leftChild(-1) leftChild(0) rightChild(-1)
2nd, 3rd Order Features
2nd: CPOSTAG(-1) CPOSTAG(0)
2nd: CPOSTAG(0) CPOSTAG(1)
2nd: LEMMA(0) POSTAG(leftChild(0))
3rd: POSTAG(leftChild(0)) LEMMA(0) POSTAG(rightChild(0))
CoNLL-X Shared Task
To assign labeled dependency structures for a range of languages by means of a fully automatic dependency parser
Input: tokenized and tagged sentences
Tags: token, lemma, POS, morpho features, ref. to head, dependency label
For each token, the parser must output its head and the corresponding dependency relation
CoNLL-X: Data Format

N  WORD  LEMMA  CPOS  POS  FEATS  HEAD  DEPREL  PHEAD  PDEPREL
1  A  o  art  art  <artd>|F|S  2  >N  _  _
2  direcção  direcção  n  n  F|S  4  SUBJ  _  _
3  já  já  adv  adv  _  4  ADVL  _  _
4  mostrou  mostrar  v  v-fin  PS|3S|IND  0  STA  _  _
5  boa_vontade  boa_vontade  n  n  F|S  4  ACC  _  _
6  ,  ,  punc  punc  _  4  PUNC  _  _
7  mas  mas  conj  conj-c  <co-vfin>|<co-fmc>  4  CO  _  _
8  a  o  art  art  <artd>|F|S  9  >N  _  _
9  greve  greve  n  n  F|S  10  SUBJ  _  _
10  prossegue  prosseguir  v  v-fin  PR|3S|IND  4  CJT  _  _
11  em  em  prp  prp  _  10  ADVL  _  _
12  todas_as  todo_o  pron  pron-det  <quant>|F|P  13  >N  _  _
13  delegações  delegaçõo  n  n  F|P  11  P<  _  _
14  de  de  prp  prp  <sam->  13  N<  _  _
15  o  o  art  art  <-sam>|<artd>|M|S  16  >N  _  _
16  país  país  n  n  M|S  14  P<  _  _
17  .  .  punc  punc  _  4  PUNC  _  _
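A minimal reader for this format (a sketch: ten tab-separated columns, blank lines separating sentences):

def read_conll(path):
    # Yield one sentence at a time as a list of column dicts.
    cols = ["id", "form", "lemma", "cpos", "pos",
            "feats", "head", "deprel", "phead", "pdeprel"]
    sent = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:              # blank line: sentence boundary
                if sent:
                    yield sent
                sent = []
            else:
                sent.append(dict(zip(cols, line.split("\t"))))
    if sent:
        yield sent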
CoNLL: Evaluation Metrics
Labeled Attachment Score (LAS): proportion of "scoring" tokens that are assigned both the correct head and the correct dependency relation label
Unlabeled Attachment Score (UAS): proportion of "scoring" tokens that are assigned the correct head
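Both metrics are token-level proportions; a sketch over aligned gold and predicted (head, deprel) pairs:

def attachment_scores(gold, pred):
    # gold, pred: one (head, deprel) pair per scoring token.
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, las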
Well-formed Parse Tree
A graph D = (W, A) is well-formed iff it is acyclic, projective and connected.
Examples
Il governo garantirà sussidi a coloro che cercheranno lavoro
He designs and develops programs
[Dependency diagrams for the two sentences]
Solution
Il governo garantirà sussidi a coloro che cercheranno lavoro
He designs and develops programs
[Dependency diagrams; the added arcs are labeled PRED, REL, SUBJ, OBJ]
Error Correction: Tree Revision
Learn from its own mistakes
A second stage fixes the errors of the first
Stacked Shift/Reduce Parser
Train:
TreeBank → train → LR Parser
LR Parser → Parsed TreeBank → Hints → TreeBank with Hints
TreeBank with Hints → train → RL Parser
Parse:
Test → LR Parser → Parsed Test → Hints → Test with Hints
Test with Hints → RL Parser → Parsed Test with Hints
The first-stage (LR) parser can use a less accurate classifier.
Tree Revision Combination
Linear parser (left-to-right) with hints from another linear parser (right-to-left)
Approximate linear combination algorithm
Overall linear complexity
CoNLL 2007 Results

Language  LR  RL  Rev2  Comb  CoNLL Best
Czech  77.12  78.20  79.95  80.57  80.19
English  86.94  87.44  88.34  89.00  89.61
Italian  81.40  82.89  83.52  84.56  84.40
Evalita 2009 Results
Evaluation of Italian linguistic tools for parsing, POS tagging, NER tagging

Corpus  DeSR  Best
Turin TreeBank  88.67  88.73
ISST  83.38  83.38
Evalita 2014 Results

Metric  LAS  UAS
Parser accuracy  87.89  90.16

Metric  Prec.  Rec.  F1
Relation accuracy  81.89  90.45  85.95
Dependencies encode relational structure
Relation Extraction with Stanford Dependencies
Dependency paths identify relations like protein interaction [Erkan et al. EMNLP 07, Fundel et al. 2007]
KaiC ←nsubj← interacts →prep_with→ SasA
KaiC ←nsubj← interacts →prep_with→ SasA →conj_and→ KaiA
KaiC ←nsubj← interacts →prep_with→ SasA →conj_and→ KaiB
[Dependency graph of "The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB": demonstrated has nsubj results (det The) and ccomp interacts (compl that); interacts has nsubj KaiC, advmod rhythmically and prep_with SasA; SasA has conj_and KaiA and conj_and KaiB]
slide by C. Manning
Universal Dependencies
[de Marneffe et al. LREC 2006]
• The basic dependency representation is projective
• It can be generated by postprocessing headed phrase structure parses (Penn Treebank syntax)
• It can also be generated directly by dependency parsers
[Dependency tree of "the little boy jumped over the fence": jumped has nsubj boy (det the, amod little) and prep over, with pobj fence (det the)]
slide by C. Manning
Graph modification to facilitate semantic analysis
Bell, based in LA, makes and distributes electronic and computer products.
[Basic dependency graph: makes has nsubj Bell, dobj products, cc and, conj distributes; Bell has partmod based, with prep in and pobj LA; products has amod electronic; electronic has cc and, conj computer]
slide by C. Manning
Triple notation
nsubj(makes-8, Bell-1)
nsubj(distributes-10, Bell-1)
partmod(Bell-1, based-3)
nn(Angeles-6, Los-5)
prep_in(based-3, Angeles-6)
conj_and(makes-8, distributes-10)
amod(products-16, electronic-11)
conj_and(electronic-11, computer-13)
amod(products-16, computer-13)
conj_and(electronic-11, building-15)
amod(products-16, building-15)
dobj(makes-8, products-16)
Graph modification to facilitate semantic analysis
Bell, based in LA, makes and distributes electronic and computer products.
[Collapsed dependency graph: makes has nsubj Bell, dobj products and conj_and distributes; distributes also has nsubj Bell; Bell has partmod based, with prep_in LA; products has amod electronic and amod computer; electronic has conj_and computer]
slide by C. Manning
BioNLP 2009/2011
Relation extraction shared tasks [Björne et al. 2009]
slide by C. Manning
Input: Universal Dependencies
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocyte by gp41 …
[Dependency graph: involvement has prep_of activation and prep_in up-regulation; activation has nn p70(S6)-kinase; up-regulation has nn IL-10, prep_in monocyte and prep_by gp41; monocyte has nn human]
slide by H. Poon
Joint Predictions
[Same dependency graph; for every candidate node the model jointly predicts: trigger word? event type?]
slide by H. Poon
References
G. Attardi. Experiments with a Multilanguage Non-Projective Dependency Parser. Proc. of the Tenth Conference on Natural Language Learning, New York, NY, 2006.
G. Attardi, F. Dell'Orletta. Reverse Revision and Linear Tree Combination for Dependency Parsing. Proc. of NAACL HLT 2009, 2009.
G. Attardi, F. Dell'Orletta, M. Simi, J. Turian. Accurate Dependency Parsing with a Stacked Multilayer Perceptron. Proc. of Workshop Evalita 2009, ISBN 978-88-903581-1-1, 2009.
H. Yamada, Y. Matsumoto. Statistical Dependency Analysis with Support Vector Machines. Proc. of IWPT, 2003.
M. T. Kromann. Optimality parsing and local cost functions in discontinuous grammars. Proc. of FG-MOL, 2001.
D. Cer, M. de Marneffe, D. Jurafsky, C. Manning. Parsing to Stanford Dependencies: Trade-offs between speed and accuracy. Proc. of LREC-10, 2010.