Lafferty, J., A. McCallum, and F. Pereira. 2001.
Conditional random fields: Probabilistic models for segmenting and
labeling sequence data.
In Proceedings of the 18th International Conference on Machine
Learning, Williamstown, Mass., pp. 282–89.
The original paper on conditional random fields. Contains a good discussion of the label-bias problem. Notoriously difficult to understand; see other references for friendlier introductions to the topic.

McCallum, A., D. Freitag, and F. Pereira. 2000.
Maximum entropy Markov models for information extraction and
segmentation.
In Proceedings of the 17th International Conference on Machine
Learning, Stanford, Calif., pp. 591–98.
Original paper on maximum entropy Markov models (MEMMs). Motivates the use of discriminative models for language. Approachable introduction to the subject with impressive results.

Murphy, K. 1998.
A brief introduction to graphical models and Bayesian networks.
Online tutorial.
http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html.
Good tutorial on Bayesian belief networks, including a discussion of the human reasoning paradigms they are meant to model and the mistaken assumption that Bayesian networks imply a use of Bayesian statistics. Contains the excellent diagram of the generative model family by S. Roweis and Z. Ghahramani. Details common and less common members of the HMM family and includes many other diagrams and links to external sources.

Murphy, K. 2002.
A tutorial on dynamic Bayesian networks.
Presentation to the MIT AI lab.
http://www.cs.ubc.ca/~murphyk/Papers/dbntalk.pdf.
Another tutorial on dynamic Bayesian networks that discusses the problems with factorial and coupled HMMs in more detail. Covers more ground than the author's 1998 tutorial at a higher level, including a short discussion of the Rao-Blackwellised particle filter. Erroneously attributes MAP estimation to frequentist statistics.

Rabiner, L. R. 1989.
A tutorial on hidden Markov models and selected applications in
speech recognition.
Proceedings of the IEEE 77 (2): 257–87.
The classic tutorial on hidden Markov models (HMMs). Heavy emphasis on the extension to HMMs commonly used in speech recognition (and the Aruspix project). Difficult to ascertain the target audience; only the first few sections will be helpful to newcomers.

Raphael, C. 1999.
Automatic segmentation of acoustic musical signals using hidden
Markov models.
IEEE Transactions on Pattern Analysis and Machine
Intelligence 21 (4): 360–70.
Details the foundation of the automatic accompaniment system that Raphael built later. Another good example of hidden Markov models applied to music.

Raphael, C. 2001.
A probabilistic expert system for automatic musical accompaniment.
Journal of Computational and Graphical Statistics 10 (3):
487–512.
One of the most impressive uses of an HMM variant in the domain of music information retrieval. Details an elaborate Bayesian network for modelling the accompaniment process. The system is implemented and successfully adjusts its tempo to the performer in real time.

Roweis, S., and Z. Ghahramani. 1999.
A unifying review of linear Gaussian models.
Neural Computation 11 (2): 205–45.
Excellent introduction to the hidden Markov model and its relatives in the generative family. Includes pseudocode implementations of the most significant algorithms. Does not include the authors' famous diagram of the family.

Sutton, C., and A. McCallum. 2006.
An introduction to conditional random fields for relational learning.
In L. Getoor and B. Taskar (Eds.), Introduction to Statistical
Relational Learning, [n.p.]. MIT Press.
A more accessible introduction to conditional random fields (CRFs) than the original paper. Good discussion of the link between CRFs and HMMs and the distinction between generative and discriminative models. Concludes with a new model for time dependencies in the CRF, the skip-chain CRF.

Welch, G., and G. Bishop. 1995.
An introduction to the Kalman filter.
Technical Report 95-041, Univ. of N.C. at Chapel Hill, Dept. of
Computer Science.
Classic introduction to the Kalman filter and the extended Kalman filter. Careful and clear description of the mathematics involved. Has more of a feel of electrical engineering than computer science.