Slearp (structured learning and prediction) is a structured learning and prediction toolkit for tasks such as grapheme-to-phoneme (g2p) conversion, based on discriminative learning. Currently, slearp is implemented to make learning a g2p conversion model convenient. However, with some adaptation, slearp can also be applied to other structured learning problems such as morphological analysis.
Given a training dataset on which cosegmentation (string alignment) has already been performed, slearp learns a model (e.g. a g2p conversion model). Then, given a source sequence (e.g. a grapheme sequence), slearp uses the learned model to predict an appropriate target sequence (e.g. a phoneme sequence) together with the segmentation of the source and target sequences.
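The learn-then-predict workflow above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the data format (lists of aligned chunk pairs), the function names learn_chunk_scores and predict, and the count-based scoring are all hypothetical and are not slearp's actual file format, API, or learning algorithm. It shows only how a prediction can choose a segmentation of the source and a target chunk for each segment at the same time.

```python
from collections import Counter

# Hypothetical cosegmented training data: each example is a list of aligned
# (grapheme chunk, phoneme chunk) pairs. Not slearp's real input format.
train = [
    [("ph", "F"), ("o", "OW"), ("ne", "N")],  # "phone"
    [("t", "T"), ("o", "OW"), ("ne", "N")],   # "tone"
    [("p", "P"), ("o", "AA"), ("t", "T")],    # "pot"
]

def learn_chunk_scores(examples):
    """Count aligned chunk pairs (a crude stand-in for discriminative training)."""
    scores = Counter()
    for example in examples:
        for pair in example:
            scores[pair] += 1
    return scores

def predict(source, scores, max_len=2):
    """Jointly choose a segmentation of `source` and one target chunk per segment.

    Dynamic programming over source positions: best[i] holds the highest
    total score and the chunk pairs that cover source[:i].
    """
    best = [None] * (len(source) + 1)
    best[0] = (0, [])
    for i in range(len(source)):
        if best[i] is None:
            continue  # no known segmentation reaches position i
        for j in range(i + 1, min(i + max_len, len(source)) + 1):
            chunk = source[i:j]
            for (g, p), s in scores.items():
                if g != chunk:
                    continue
                cand = (best[i][0] + s, best[i][1] + [(g, p)])
                if best[j] is None or cand[0] > best[j][0]:
                    best[j] = cand
    return None if best[-1] is None else best[-1][1]

scores = learn_chunk_scores(train)
print(predict("phone", scores))
```

Because this toy scores each chunk pair independently of its neighbours, it cannot use context to disambiguate; a real discriminative model would score richer features of the whole sequence.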
So far, slearp supports the following discriminative learning methods: conditional random fields (CRF) with L2-norm regularization, Structured Adaptive Regularization of Weight Vectors (Structured AROW), Structured Narrow Adaptive Regularization of Weights (Structured NAROW), and Structured Soft Margin Confidence Weighted Learning (Structured SMCW). Structured SMCW achieves higher performance than the other approaches. slearp is licensed under the GNU GPL.
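To give a feel for the AROW family named above, here is a minimal sketch of a single update of standard binary AROW with a diagonal covariance approximation. It is not slearp's Structured AROW and not its code: the function name arow_update, the parameter r, and the diagonal simplification are assumptions made for illustration. The structured variants apply the same idea to whole output sequences rather than to binary labels.

```python
def arow_update(w, sigma, x, y, r=1.0):
    """One binary AROW step for example (x, y) with y in {-1, +1}.

    w     : list of weights (the mean vector)
    sigma : list of per-weight variances (diagonal covariance, an approximation)
    r     : regularization parameter (hypothetical default)
    Returns the updated (w, sigma); inputs are left unmodified.
    """
    margin = sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - y * margin)  # hinge loss
    if loss == 0.0:
        return list(w), list(sigma)    # margin already satisfied, no update

    # Confidence of the prediction: x^T Sigma x for a diagonal Sigma.
    v = sum(si * xi * xi for si, xi in zip(sigma, x))
    beta = 1.0 / (v + r)
    alpha = loss * beta

    # Move weights more where variance (uncertainty) is high.
    new_w = [wi + alpha * y * si * xi for wi, si, xi in zip(w, sigma, x)]
    # Shrink the variance of the features that were just observed.
    new_sigma = [si - beta * (si * xi) ** 2 for si, xi in zip(sigma, x)]
    return new_w, new_sigma

w, sigma = arow_update([0.0, 0.0], [1.0, 1.0], x=[1.0, 0.0], y=1)
print(w, sigma)
```

After an update, the variance of each active feature shrinks, so later updates change frequently seen features less than rarely seen ones.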
slearp is developed by:
NAIST (Nara Institute of Science and Technology)
Graduate School of Information Science
Augmented Human Communication Laboratory
Doctoral Program
Keigo Kubo