Overview of the SPMRL 2013 Shared Task: Cross-Framework Evaluation of Parsing Morphologically Rich Languages

Central topic

Provide standard datasets for morphologically rich languages in different representations and parsing scenarios.
Standardize the evaluation protocol for morphologically ambiguous input.
Raise community awareness of how difficult it is to parse morphologically rich languages.

Methodology

Datasets

Include data in both constituency and dependency annotation.
Two data setups: the full data and a small setup (5,000 sentences).
Three parsing scenarios:
- Gold: gold segmentation, POS tags, and morphological features are provided.
- Predicted: segmentation, POS tags, and morphological features are predicted automatically.
- Raw: a lattice of multiple possible morphological analyses is provided, and the parser jointly disambiguates the morphological analysis and the syntactic structure.
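
To make the lattice scenario concrete, here is a minimal Python sketch of what a morphological lattice for a single ambiguous token might look like, and how the candidate analyses can be enumerated. The edge format, the forms, and the tags are hypothetical placeholders for illustration; they are not the shared task's actual file format or data.

```python
from collections import defaultdict

# Hypothetical toy lattice for one ambiguous token. Each edge is
# (start_state, end_state, form, pos_tag). Forms and tags are
# illustrative placeholders, not taken from the shared-task data.
EDGES = [
    (0, 1, "b", "PREP"),
    (1, 2, "cl", "NOUN"),
    (2, 3, "m", "PRON"),
    (1, 3, "clm", "NOUN"),
    (0, 3, "bclm", "PROPN"),
]


def enumerate_paths(edges, start, end):
    """Return every segmentation/tagging path from start to end."""
    outgoing = defaultdict(list)
    for src, dst, form, tag in edges:
        outgoing[src].append((dst, form, tag))

    paths = []

    def walk(state, prefix):
        # A path is complete once it reaches the final lattice state.
        if state == end:
            paths.append(prefix)
            return
        for dst, form, tag in outgoing[state]:
            walk(dst, prefix + [(form, tag)])

    walk(start, [])
    return paths


paths = enumerate_paths(EDGES, 0, 3)
for path in paths:
    print(" + ".join(f"{form}/{tag}" for form, tag in path))
```

In the gold and predicted scenarios the parser receives exactly one such path per token. In the lattice scenario it receives all of them and has to choose a path while building the syntactic structure, which is why segmentation errors and parsing errors can no longer be evaluated separately.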

Findings

Previous research

The first statistical parsing models were generative and based on treebank grammars.
Phrase-based treebank grammar techniques are sensitive to language and annotation properties, so these models are not easily portable across languages and annotation schemes.

Notable quotes

While progress on parsing English -- the main language of focus for the ACL community -- has inspired some advances on other languages, it has not, by itself, yielded high-quality parsing for other languages and domains. This holds in particular for morphologically rich languages... where important information concerning the predicate-argument structure of sentences is expressed through word formation, rather than constituent-order patterns as is the case in English and other configurational languages. p. 146

recently, advances in PCFG-LA parsing (Petrov et al. 2006) and language-agnostic data-driven dependency parsing (McDonald et al. 2005; Nivre et al. 2007b) have made it possible to reach high accuracy with classical feature engineering techniques in addition to, or instead of, language specific knowledge. p. 147
