Problem: Comprehensive (almost exhaustive) search on leave-one-out and cross-validation.
Date: 2005/05/11 (important for continual updates)

- Start with Web of Science to find seminal (but recent) papers.
- Find recent papers on the subject, to identify often-cited papers.
- http://www.gems-system.org (available for non-commercial use; contact: alexander.statnikov@vanderbilt.edu)

===============
INTRODUCTION

There is literature raising doubts about the generalization ability of classifiers produced by major studies in the field (Schwarzer and Vach, 2000; Reunanen, 2003; Guyon et al., 2003, http://www.clopinet.com/isabelle/Papers/RFE-erratum.html).

In a meta-analytic assessment of 84 published microarray cancer outcome prediction studies (Ntzani and Ioannidis, 2003), it was found that 74% of the studies did not perform independent validation or cross-validation of the proposed findings, 13% applied cross-validation in an incomplete fashion, and only 13% performed cross-validation correctly.

Excerpt: "In a paper published in 1998, 'What Size Test Set Gives Good Error Rate Estimates?', we address the difficult problem of finding an optimum split of the data into training set and test set. My present research addresses the difficult problem of input variable selection." See also "Comparison of Classifier Methods: a Case Study in Handwritten Digit Recognition" and http://www.clopinet.com/isabelle/Projects/SVM/applist.html.

Statistical tests for comparing learning algorithms are discussed by Dietterich (1998) and Nadeau and Bengio (2001). Excerpt: "If there are sufficiently many examples, it may not be necessary to split the training data: comparisons of training errors with statistical tests can be used (see Rivals and Personnaz, 2003, in this issue). Cross-validation can be extended to time-series data and, while i.i.d. assumptions do not hold anymore, it is still possible to estimate generalization error confidence intervals (see Bengio and Chapados, 2003, in this issue). Choosing what fraction of the data should be used for training and for validation is an open problem. Many authors resort to using the leave-one-out cross-validation procedure, even though it is known to be a high-variance estimator of generalization error (Vapnik, 1982)."

References:
T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
C. Nadeau and Y. Bengio. Inference for the generalization error. Machine Learning (to appear), 2001.
Y. Bengio and D. Schuurmans. Special issue on new methods for model selection and model combination. Machine Learning, 48(1), 2002.

===============
META LEARNING

Kibler and Langley (1988). Machine learning as an experimental science. (context)

=================
FEATURE SELECTION

A. Blum and P. Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2):245–271, December 1997.
R. Kohavi and G. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2):273–324, December 1997.
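===============
CODE SKETCHES

One common way cross-validation ends up applied in an "incomplete fashion" (cf. the Ntzani and Ioannidis figures above) is to select features on the full dataset and only then cross-validate the classifier: information from the held-out samples leaks into the selection step, so the error estimate is optimistically biased. Below is a minimal sketch contrasting that protocol with selection done inside every fold; the tools (scikit-learn's SelectKBest, Pipeline, SVC, make_classification) and all dataset and parameter choices are illustrative assumptions, not anything taken from the cited studies.

# Sketch: biased vs. unbiased cross-validation when feature selection is involved.
# Library and parameter choices are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# High-dimensional, low-sample synthetic data with weak signal,
# roughly mimicking a microarray setting.
X, y = make_classification(n_samples=60, n_features=2000, n_informative=5,
                           random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Incomplete protocol: features are chosen once, on ALL samples, and only the
# classifier is cross-validated; the resulting estimate is optimistically biased.
X_reduced = SelectKBest(f_classif, k=20).fit_transform(X, y)
biased_acc = cross_val_score(SVC(), X_reduced, y, cv=cv).mean()

# Correct protocol: selection is part of the pipeline, so it is refit on the
# training portion of each fold and never sees the held-out samples.
pipe = Pipeline([("select", SelectKBest(f_classif, k=20)), ("clf", SVC())])
unbiased_acc = cross_val_score(pipe, X, y, cv=cv).mean()

print(f"feature selection outside CV (biased):  {biased_acc:.3f}")
print(f"feature selection inside CV (unbiased): {unbiased_acc:.3f}")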
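The excerpts above also note that the train/validation split fraction is an open problem and that leave-one-out, while popular, is a high-variance estimator of generalization error (Vapnik, 1982). A second sketch, under the same scikit-learn assumptions, compares a single hold-out estimate with 10-fold and leave-one-out cross-validation on one synthetic dataset; to actually see the variance, one would repeat this over many resamplings of the data.

# Sketch: hold-out vs. 10-fold CV vs. leave-one-out error estimates.
# Dataset, classifier, and split fraction are arbitrary illustrative choices.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (KFold, LeaveOneOut, cross_val_score,
                                     train_test_split)

X, y = make_classification(n_samples=100, n_features=20, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Hold-out estimate: depends on the (arbitrary) split fraction and on the split itself.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_err = 1.0 - clf.fit(X_tr, y_tr).score(X_te, y_te)

# 10-fold cross-validation estimate: averages the error over ten held-out folds.
kfold_err = 1.0 - cross_val_score(
    clf, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0)).mean()

# Leave-one-out estimate: nearly unbiased, but known to have high variance.
loo_err = 1.0 - cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()

print(f"hold-out error:      {holdout_err:.3f}")
print(f"10-fold CV error:    {kfold_err:.3f}")
print(f"leave-one-out error: {loo_err:.3f}")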