Lexicalized Stochastic Modeling of Constraint-Based Grammars using Log-Linear Measures and EM Training

Detlef Prescher; Jonas Kuhn; Mark Johnson; Stefan Riezler

arXiv: cs/0008034 · v1 · submitted 2000-08-30 · cs.CL

keywords: training, ambiguity, constraint-based, gain, grammar, grammars, log-linear, match

We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparison to training from a parsebank shows a 10% gain from EM training. Also, a new class-based grammar lexicalization is presented, showing a 10% gain over unlexicalized models.
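To make the modeling idea concrete, the following is a minimal sketch of a log-linear distribution over the candidate parses of a single sentence, together with the expected feature values that an EM-style estimator would need from unannotated data. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy feature vectors, and the weights are all hypothetical, and the abstract does not specify the feature set or the optimization procedure.

```python
import math

def parse_distribution(feature_vectors, weights):
    """Return p(x | sentence) for each candidate parse x.

    Assumes a log-linear form: p(x) is proportional to exp(sum_i w_i * f_i(x)),
    normalized over the parses the grammar licenses for this sentence.
    """
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def expected_feature_values(feature_vectors, weights):
    """Expected value of each feature under p(x | sentence).

    With unannotated data the correct parse is unknown, so an E-step
    would use these expectations in place of observed feature counts.
    """
    probs = parse_distribution(feature_vectors, weights)
    return [
        sum(p * fv[i] for p, fv in zip(probs, feature_vectors))
        for i in range(len(weights))
    ]

# Hypothetical toy input: three candidate parses, two features each.
parses = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, -0.5]

probs = parse_distribution(parses, weights)
expected = expected_feature_values(parses, weights)
print(probs)
print(expected)
```

In this toy setting the first parse receives the highest probability because its only active feature carries a positive weight. A full training loop would alternate between computing such expectations and re-fitting the weights.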
