Master's Thesis: Extending the Conditional Neural Field Based on Mixture of Experts for Labeling of Sequence Data, 2017
Sequence labeling is an important problem in pattern recognition. It involves assigning a label to each member of a sequence, such as a sequence of characters, images or speech frames. One approach to sequence labeling is to model the input-output structure as a graphical model, on which further analysis can then be performed. Graphical models are powerful tools for modeling probability distributions over a large number of variables. In some problems, such as handwriting recognition, the relation between input and output features can be nonlinear and highly complicated. Initially, generative models such as the Hidden Markov Model (HMM) were commonly used for sequence labeling. Later, the Conditional Random Field (CRF), a discriminative probabilistic model, was introduced and quickly became popular because it resolves some of the issues of the earlier generative models. Experiments in recent years show that combining CRF with other models improves performance. In this thesis, the combination of the Conditional Random Field model and the concept of mixture of experts is investigated. A mixture of experts model increases learning accuracy by partitioning the input space and dedicating a focused expert network to each partition. It has been shown that using a mixture of experts when learning a model can improve its performance. In this research, a number of expert networks, each a type of neural network, are placed between the input and output layers of a CRF model, so that higher-level features are extracted from the observation sequences and used for training the model. A clustering algorithm is used to assign input sequences to experts. Because the observation sequences differ in length, clustering is first performed on the elements of each sequence.
After this, there are two choices for assigning experts to input data. In the first choice, the cluster of a sequence is determined by voting among the clusters of its elements, and all elements of that sequence are used for learning the corresponding expert and the model parameters. In the second choice, each element is handled according to its own cluster, so training is performed on all experts assigned to the clusters of the input elements. These two choices yield two models. Experimental results on handwriting recognition demonstrate that the proposed models considerably improve recognition accuracy in comparison to previous models. In this research, the comparison is performed against neural networks, the conditional random field and the conditional neural field. The results indicate that the first and second proposed models improve recognition accuracy by up to 7% and 7.5%, respectively.
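The two assignment choices described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the clustering algorithm, so nearest-centroid assignment against fixed, made-up centroids stands in for it here, and the function names (`element_clusters`, `route_by_voting`, `route_per_element`) are hypothetical. Training of the experts and of the CRF parameters is omitted entirely.

```python
import numpy as np


def element_clusters(sequence, centroids):
    """Assign each element (row) of a sequence to its nearest centroid.

    Assumption: nearest-centroid assignment replaces the unspecified
    clustering algorithm. sequence has shape (T, d), centroids (K, d).
    """
    dists = np.linalg.norm(sequence[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def route_by_voting(sequence, centroids):
    """First choice: the whole sequence goes to the expert of the majority cluster."""
    labels = element_clusters(sequence, centroids)
    counts = np.bincount(labels, minlength=len(centroids))
    # Ties are broken toward the lowest cluster index (an arbitrary choice).
    return int(counts.argmax())


def route_per_element(sequence, centroids):
    """Second choice: each element goes to the expert of its own cluster."""
    return element_clusters(sequence, centroids).tolist()


# Hypothetical data: two cluster centroids and one three-element sequence.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
seq = np.array([[0.5, 0.2], [9.0, 11.0], [1.0, -1.0]])

print(route_by_voting(seq, centroids))    # 0
print(route_per_element(seq, centroids))  # [0, 1, 0]
```

Because clustering operates on individual elements, sequences of different lengths need no padding or truncation: the first choice returns a single expert index per sequence, while the second returns one index per element.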
Keywords: Sequence Labeling, Log-Linear Model, Discriminative Model, Conditional Random Field, Mixture of Experts