Incremental Learning for Fully Unsupervised Word Segmentation Using Penalized Likelihood and Model Selection

Ruey-Cheng Chen

arxiv: 1607.05822 · v2 · pith:PILXBIZWnew · submitted 2016-07-20 · 💻 cs.CL

keywords: word, model, segmentation, selection, unsupervised, approach, fully, incremental

We present a novel incremental learning approach for unsupervised word segmentation that combines features from probabilistic modeling and model selection. This includes super-additive penalties for addressing the cognitive burden imposed by long word formation, and new model selection criteria based on higher-order generative assumptions. Our approach is fully unsupervised; it relies on a small number of parameters that permits flexible modeling and a mechanism that automatically learns parameters from the data. Through experimentation, we show that this intricate design has led to top-tier performance in both phonemic and orthographic word segmentation.
