All Authors / Contributors: Sylvain Arlot; Pascal Massart; Université de Paris-Sud. Faculté des Sciences d'Orsay (Essonne).
Notes: Thesis written in English; only the introduction is in French.
Description: 1 vol. (299 p.) : ill. ; 30 cm.
Responsibility: Sylvain Arlot; under the direction of Pascal Massart.
This thesis is situated within the theories of non-parametric statistics and statistical learning. Its goal is to provide an accurate understanding of several resampling and model selection methods from the non-asymptotic viewpoint. The main advance of this thesis is the accurate calibration of model selection procedures, in order to make them optimal in practice for prediction. We study V-fold cross-validation (very commonly used, but poorly understood theoretically, in particular regarding the choice of V) and several penalization procedures. We propose methods for accurately calibrating penalties, both their general shape and their multiplicative constants. The use of resampling makes it possible to solve hard problems, in particular regression with a variable noise level. We prove non-asymptotic theoretical results on these methods, such as oracle inequalities and adaptivity properties; these results rely in particular on concentration inequalities. We also consider the problem of confidence regions and multiple testing when the data are high-dimensional, with general and unknown correlations. Using resampling methods, we can avoid the curse of dimensionality and "learn" these correlations. We mainly propose two procedures, and prove a non-asymptotic control of the level of each.
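To make the abstract's central notion concrete, here is a minimal, hypothetical sketch of V-fold cross-validation used as a model selection criterion; it is illustrative only and not code from the thesis. All function names (`v_fold_cv_risk`, `fit_constant`, `fit_line`) and the synthetic data are assumptions for the example: the risk of each candidate model is estimated by averaging its validation error over V train/validation splits, and the model with the smallest estimated risk is selected.

```python
import random

def v_fold_cv_risk(data, fit, V=5):
    """Estimate the prediction risk of one model by V-fold cross-validation.

    data: list of (x, y) pairs; fit: training set -> predictor function.
    """
    folds = [data[i::V] for i in range(V)]  # V roughly equal interleaved blocks
    total = 0.0
    for j in range(V):
        # train on all folds except fold j, validate on fold j
        train = [p for i, f in enumerate(folds) if i != j for p in f]
        predictor = fit(train)
        # mean squared error on the held-out fold
        total += sum((predictor(x) - y) ** 2 for x, y in folds[j]) / len(folds[j])
    return total / V

def fit_constant(train):
    """Candidate model 1: always predict the training mean."""
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def fit_line(train):
    """Candidate model 2: ordinary least-squares line."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

# Synthetic regression data with a linear trend plus noise (an assumption
# made for this example, not data from the thesis).
random.seed(0)
data = [(x / 10, 2 * (x / 10) + random.gauss(0, 0.1)) for x in range(50)]

risk_constant = v_fold_cv_risk(data, fit_constant)
risk_line = v_fold_cv_risk(data, fit_line)
# Model selection: keep the candidate with the smaller estimated risk;
# on this data, cross-validation prefers the linear model.
selected = "line" if risk_line < risk_constant else "constant"
```

The choice of V trades off bias (small V trains on less data) against variance and computation (large V), which is one of the questions the thesis studies non-asymptotically.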
- Resampling (statistics) -- Theses and academic writings.
- Mathematical models -- Theses and academic writings.
- Non-parametric statistics
- Non-parametric regression
- Statistical learning
- V-fold cross-validation
- Confidence regions