Document Analysis

Pattern Recognition Review Papers

Math formula recognition



B. Albadr and S.A. Mahmoud, "Survey and bibliography of Arabic optical text recognition," Signal Processing, vol. 41, no. 1,
1995, 49-77.

G. Butler, P. Grogono, R. Shinghal, I.A. Tjandra. 1995. Knowledge and the Recognition and Understanding of Software Documents.

S.K. Kim, D.W. Kim, and H.J. Kim. A recognition of vehicle license plate using a genetic
 algorithm based segmentation. In Proceedings of ICIP, pages 661--664, 1996.

G. Nagy, S. Seth, and M. Viswanathan. A prototype document image analysis system for
 technical journals. IEEE computer, 25(7):10--22, 1992.

Document Layout Analysis

geometric layout == page segmentation
logical layout == block classification

Connected-components analysis, projection profiles, texture analysis, background analysis, smearing, morphology,
and block classifications.

L. O'Gorman, R. Kasturi, Document Image Analysis, IEEE Computer Society Press, Los Alamos,
 California, 1995.

Cattoni, R., T. Coianiz, S. Messelodi, and C. M. Modena. 1998. Geometric Layout Analysis Techniques for Document Image Understanding: a Review. ITC-IRST Technical Report #9703-09.

Haralick, R. M. 1994. Document image understanding: Geometric and logical layout. Proceedings of the Conference on Computer Vision and Pattern Recognition. 385-90.

TrueViz is an annotation tool for visualizing and editing
groundtruth/metadata. It is written in Java and works on Windows and Unix.

Pink Panther is an environment for creating segmentation groundtruth files and for page segmentation benchmarking.

Pink Panther (Yanikoglu and Vincent 1998) is a tool for creating geometric metadata for scanned document images.

Yanikoglu, B. A., and L. Vincent. 1998. Pink Panther: A complete environment for ground-truthing and
 benchmarking document page segmentation. Pattern Recognition 31(9): 1191-204.

Cattoni et al. (1998) review 23 papers grouped into the following categories: projection profiles, Hough transform, connected-component (nearest-neighbor) clustering, correlation between lines, gradient analysis, Fourier spectrum, morphological transforms, and subspace line detection. Principal component analysis (Steinherz et al. 1999) has also been used for deskewing.

Projection profiles (Baird 1987; Postl 1986)
Fourier transforms (Hase and Hoshino 1985)
Morphological transforms (Chen and Haralick 1994)
Hough-inspired techniques (Hinds et al. 1990; Nakano et al. 1990)
Nearest-neighbor clustering (Hashizume et al. 1986; O'Gorman 1993)
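
As a rough illustration of the projection-profile family listed above, the sketch below tries a range of candidate angles and keeps the one whose horizontal projection profile is most sharply peaked. It is not the algorithm of any paper cited here. The function names, the sum-of-squares sharpness score, the angle range and step, and the sign convention (a text line skewed by angle a follows y = y0 + x*tan(a)) are all assumptions made for the example.

```python
import math


def profile_sharpness(points, angle_deg):
    """Score a candidate skew angle by the peakedness of the projection profile.

    `points` is a list of (x, y) foreground pixel coordinates. Each pixel is
    projected onto the row it would occupy if a skew of `angle_deg` were
    undone; the score is the sum of squared row counts (an assumed criterion).
    """
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    counts = {}
    for x, y in points:
        row = round(y * cos_t - x * sin_t)
        counts[row] = counts.get(row, 0) + 1
    return sum(c * c for c in counts.values())


def estimate_skew(points, max_angle=5.0, step=0.5):
    """Exhaustively search candidate angles in [-max_angle, +max_angle]."""
    n_steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(n_steps + 1)]
    return max(candidates, key=lambda a: profile_sharpness(points, a))


def synthetic_lines(angle_deg, baselines=(20, 40, 60, 80), width=200):
    """Build hypothetical test data: straight text baselines skewed by angle_deg."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, y0 + x * slope) for y0 in baselines for x in range(width)]
```

Calling estimate_skew(synthetic_lines(3.0)) searches the eleven-degree window in half-degree steps. On real scans a coarse-to-fine search is a common way to reduce the number of candidate angles.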

Manhattan layout analysis (Baird 1994).

A. Bagdanov and J. Kanai, Projection profile based skew estimation algorithm for JBIG compressed images. In:
Proceedings of the Fourth International Conference on Document Analysis and Recognition, Aug 18-20, 1997, Ulm, Germany.

A.  Bagdanov and J. Kanai, Evaluation of document image skew estimation techniques.  In: Document Recognition III, Proc.
SPIE Vol. 2660, p. 343-353, 1996.

Baird, H. S. 1994. Background structure in document images. International Journal of Pattern Recognition and Artificial Intelligence 8(5):1013-30.

H. S. Baird, "The Skew Angle of Printed Documents," Proc., 1987 Conf. of the Society of
 Photographic Scientists and Engineers, Rochester, New York, May 20--21, 1987.

Chen S., and Haralick R. M., An automatic algorithm for text skew estimation in document
 images using recursive morphological transforms. ICIP-94, Austin, TX, pp. 139-143, Nov. 1994.

Su Chen, Robert M. Haralick, and Ihsin T. Phillips. Automatic text skew estimation in document
 images. In Proceedings of the Third International Conference on Document Analysis and Recognition, Montreal,
 Canada, pages 1153--1156. IEEE Computer Society Press, August 1995.

O'Gorman, L. 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis
 and Machine Intelligence 15:1162-73.

M. Hase and Y. Hoshino, "Segmentation Method of Document Images by Two-Dimensional
 Fourier Transformation," Systems and Computers in Japan, Vol. 16, No. 3, 1985.

Hashizume, A.,  P. S. Yeh, and A. Rosenfeld. 1986. A method of detecting the orientation of aligned
 components. Pattern Recognition Letters 4: 125-32.

S.C. Hinds, J.L. Fisher, D.P. d'Amato, A document skew detection method using run-length
 encoding and the Hough transform, Proc. 10th Int. Conf. on Pattern Recognition 1990, pp. 464--468.

Y. Ishitani. Document Skew Detection Based on Local Region Complexity. In Proc. of the 2nd
 International Conference on Document Analysis and Recognition, pages 49--52, Tsukuba, Japan, October 1993.
 IEEE Computer Society.

J.  Kanai and A. Bagdanov, Projection profile based skew estimation algorithm.  International Journal on Document Analysis and Information Retrieval, Vol. 1, 1, 1998.

J.  Kanai and A. Bagdanov, Concurrent Processing of JBIG Decompression and Skew Estimation, In: Proceedings of the 1997 International Conference on Parallel and Distributed Processing Techniques and Applications, 1997.

Nakano, Y., Y. Shima, H. Fujisawa, J. Higashino, and M. Fujinawa. 1990. An algorithm for the skew normalization of document images. Proceedings of the 10th Annual Pattern Recognition Conference, 8-13.

W. Postl, "Detection of Linear Oblique Structures and Skew Scan in Digitized Documents," Proc.
 8th Int. Conf. Pattern Recognition, Paris, France, pp. 687-689, Oct. 1986.

R. Smith (1995). A simple and efficient skew detection algorithm via text row accumulation.
 Proc. of the 3rd Int. Conf. on Document Analysis and Recognition, pp. 1145-1148, IEEE Computer Society: Los
 Alamitos, CA.

Steinherz, T., N. Intrator, and E. Rivlin. 1999. Skew detection via principal components analysis. Proceedings of the International Conference on Document Analysis and Recognition. 153-6.

C.L. Yu, Y.Y. Tang, and C.Y. Suen 1995. Document skew detection based on the fractal and
 least squares method, Proc. of the Third Int. Conf. on Document Analysis and Recognition, pp. 1149-1152, IEEE Computer Society: Los Alamitos, CA.

USC: 22.2.8 Correction of Document Skew


USC database: Binarization -- Threshold selection for documents


Projection analysis
Connected-component analysis
Hidden Markov models
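
To make the connected-component entry above concrete, here is a minimal sketch that labels 8-connected foreground blobs in a binary image and returns their bounding boxes, the usual starting point for grouping pixels into candidate characters. The function name, the list-of-lists image representation (1 = ink), and the choice of 8-connectivity are assumptions for illustration, not taken from the surveys listed below.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected blobs.

    `image` is a list of equal-length rows holding 0 (background) or 1 (ink).
    Boxes are reported in raster-scan order of each blob's first pixel.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < height and 0 <= nc < width
                                and image[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes
```

Touching characters end up in one box and broken characters in several, which is why the surveys in this section treat component analysis as one ingredient of segmentation and not a complete method.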


Casey, R. G., and E. Lecolinet. 1996. A survey of methods and strategies in character segmentation.
 IEEE Transactions on Pattern Analysis and Machine Intelligence 18(7): 690-706.

Dunn, C. E., and P. S. P. Wang. 1992. Character segmentation techniques for handwritten text - a survey. 11th IAPR International Conference on Pattern Recognition. 577-580.

Elliman, D. G., and I. T. Lancaster, "A Review of Segmentation and Contextual Analysis
 Techniques for Text Recognition," Pattern Recognition, Vol. 23, pp. 337-346, 1990.

Fu, K. S., and J. K. Mui. 1981. A survey on image segmentation. Pattern Recognition 13(1): 3-16.

Fujisawa, H., Nakano, Y., and Kurino, K. 1992. Segmentation methods for character recognition: from segmentation to document structure analysis. Proceedings of the IEEE 80(7): 1082-7.

Haralick, R. M., and L. G. Shapiro. 1985. Survey: Image segmentation techniques. Computer Vision, Graphics, and Image Processing, 29:100-132.

(Duda, Hart and Stork 2001)
Statistical methods
Decision trees
Neural nets
support-vector machines (Vapnik 1995)
Combination of classifiers (Xu, et al. 1992;  Kuncheva, et al. 2001)

Bagging (Breiman 1996), and Boosting (Freund and Schapire 1995)
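
The idea behind bagging can be sketched in a few lines: train several copies of a weak classifier on bootstrap resamples of the training set and combine them by majority vote. The example below is a toy illustration only. The one-dimensional threshold "stump" base learner, the function names, the 0/1 labels, and the defaults (25 estimators, seed 0) are assumptions chosen for brevity and do not come from the cited papers.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a 1-D threshold rule on (x, label) pairs with labels 0 or 1.

    The rule predicts `above` when x > t and `1 - above` otherwise; t = -inf
    gives a constant predictor. The rule with the fewest training errors wins.
    """
    values = sorted({x for x, _ in samples})
    thresholds = [float("-inf")] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for above in (0, 1):
            errors = sum((above if x > t else 1 - above) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x > t else 1 - above


def bagging(samples, n_estimators=25, seed=0):
    """Train stumps on bootstrap resamples and return a majority-vote predictor."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        # Bootstrap: draw len(samples) items with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        stumps.append(train_stump(bootstrap))

    def predict(x):
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict
```

An odd number of estimators avoids tied votes for two-class problems. Boosting differs in that each new learner is trained on a reweighted sample that emphasizes earlier mistakes; that step is not shown here.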

Breiman, L. 1996. Bagging predictors. Machine Learning 24: 123-40.

Duda, R. O., P. E. Hart, and D. G. Stork. 2001. Pattern classification. 2nd ed. New York: Wiley.

John T. Favata and Geetha Srikantan, A Multiple Feature/Resolution Approach to Handprinted Digit and Character Recognition, International Journal of Imaging Systems and Technology, Vol. 7, pp. 304-311, Winter 1996. [multiple classifiers?]

Freund, Y., and  R. E. Schapire. 1996. Experiments with a new boosting algorithm.

Freund, Y., and  R. E. Schapire.  1995. A decision-theoretic generalization of on-line learning and
    an application to boosting. Proceedings of the Second European Conference on Computational Learning Theory.  23-37.

Kuncheva, L. I., J. C. Bezdek, and R. P.W. Duin. 2001. Decision templates for multiple classifier fusion: An experimental comparison. Pattern Recognition. 299-314.

Vapnik, V. N. 1995. The nature of statistical learning theory. New York: Springer-Verlag.

Wolpert, D. H. 1992. Stacked generalization. Neural Networks 5: 241-59.

Xu, L., A. Krzyzak, and C. Y. Suen. 1992. Methods of combining multiple classifiers and their
 applications to handwriting recognition. IEEE Transactions on Systems, Man and Cybernetics, 22(3): 418-35.

Feature extraction

Trier et al. (1996) review nearly 100 papers on feature extraction methods, including template matching, graph description, projections, unitary image transforms, contour profiles, zoning, moments, spline curve approximation, and Fourier descriptors.
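
Of the methods named above, zoning is the simplest to show in code: the character image is divided into a grid of zones and the ink density of each zone becomes one feature. The sketch below assumes a binary glyph stored as a list of rows (1 = ink) and a hypothetical function name and default 2x2 grid. It illustrates the general idea, not a specific variant from the survey.

```python
def zoning_features(glyph, zones_y=2, zones_x=2):
    """Return the ink density of each zone, listed row by row.

    `glyph` is a list of equal-length rows holding 0 or 1. Zone boundaries
    use integer division, so zones may differ by one pixel in size when the
    image dimensions are not multiples of the grid size.
    """
    height, width = len(glyph), len(glyph[0])
    features = []
    for zy in range(zones_y):
        r0, r1 = zy * height // zones_y, (zy + 1) * height // zones_y
        for zx in range(zones_x):
            c0, c1 = zx * width // zones_x, (zx + 1) * width // zones_x
            area = (r1 - r0) * (c1 - c0)
            ink = sum(glyph[r][c] for r in range(r0, r1) for c in range(c0, c1))
            features.append(ink / area if area else 0.0)
    return features
```

For a 4x4 glyph whose top-left quadrant is fully inked and whose bottom-right quadrant has three of four pixels set, the 2x2 feature vector is [1.0, 0.0, 0.0, 0.75]. Finer grids trade robustness to small shifts for more discriminative detail.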

Trier, D., A. K. Jain, and T. Taxt. 1996. Feature extraction methods for character recognition - a survey. Pattern Recognition 29(4): 641-62.


Lønnestad, T., and O. Milvang. 1994. XITE: X-based Image Processing Tools and Environment. In Experimental Environments for Computer Vision and Image Processing, eds. H. I. Christensen and J. L. Crowley, 63-88. ISBN 981-02-1510-X.

Cracknell, C., and A. C. Downton. 1999. A handwriting understanding environment (HUE) for rapid prototyping in handwriting and document analysis research.  Proceedings of the International Conference on Document Analysis and Recognition, 362-5.





Illuminator: a toolset for developing OCR and Image Understanding

Also, CMU Illuminator Plug-In Extensions: This package implements dynamic
library Plug-Ins for the RAF Illuminator. It allows externally compiled
code to be loaded into the Illuminator during run-time. Developed by Jason
McMullan, Carnegie Mellon University under the direction of Robert H.

Fruchterman, T. 1995. DAFS: A standard for document and image understanding. Proceedings of
 the Symposium on Document Image Understanding Technology, 94-100.

Text Finder

 L.A. Fletcher, R. Kasturi, A Robust algorithm for text string separation from mixed
 text/graphics images, IEEE Trans. Pattern Anal. Machine Intell. 10 (1988) 910--918.

A. K. Jain and S. Bhattacharjee. Text segmentation using Gabor filters for automatic
 document processing. Machine Vision and Applications, 5:169--184, 1992.

A.K. Jain and B. Yu, "Automatic text location in images and video frames," in Proceedings of
 ICPR, 1998, pp. 1497--1499.

C. Strouthopoulos and N. Papamarkos. 1998. Text identification for document image analysis using a neural network. Image and Vision Computing 16(12): 879--96.

F.M. Wahl, K.Y. Wong, R.G. Casey, Block segmentation and text extraction in mixed
 text/image documents, Computer Graphics and Image Processing 20 (1982) 375--390.

V. Wu and E. Riseman. 1999. Textfinder: An automatic system to detect and recognize text in images. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(11):

V. Wu, R. Manmatha, and E. M. Riseman. Finding text in images. In Proc. 2nd ACM Int. Conf.
 on Digital Libraries, pages 3--12. ACM Press, July 23--26 1997.

Y. Zhong, K. Karu, and A.K. Jain, "Locating text in complex color images," Pattern
 Recognition, vol.28, no.10, pp.1523--1535, 1995.

Video OCR

H. Li and D. Doermann, "Automatic identification of text in digital video key frames," in Proc.
 of the IEEE International Conference on Pattern Recognition, pp. 129--132, 1998.

 Huiping Li, David Doermann, and Omid Kia. Automatic text detection and tracking in digital
 videos. IEEE Transactions on Image Processing, 9(1):147--156, January 2000.

 R. Lienhart and F. Stuber, "Automatic text recognition in digital videos," in Proceedings of
 ACM Multimedia, 1996, pp. 11--20.

T. Sato, T. Kanade, E. Hughes, and M. Smith. Video OCR for digital news archives. In
 Proceedings of IEEE Workshop on Content-Based Access to Image and Video Databases, 1998.

2D Shape

Marshall S. Review of shape coding techniques. Image and Vision Computing, 7(3):281-294, November 1989.

Pavlidis T. Algorithms for shape analysis of contours and waveforms. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2(4):301-312, July 1980.

Pavlidis, T., A Review of Algorithms for Shape Analysis, CGIP(7), No. 2, April 1978, pp. 243-258.