WorldCat Identities

Vert, Jean-Philippe

Works: 63 works in 86 publications in 2 languages and 577 library holdings
Genres: Methods (Music) 
Roles: Other, Contributor, Thesis advisor, Opponent, Author, Editor
Classifications: QH324.2, 570.285
Most widely held works by Jean-Philippe Vert
Kernel methods in computational biology by Bernhard Schölkopf( Book )

12 editions published in 2004 in English and held by 438 WorldCat member libraries worldwide

This book provides a detailed overview of current research in kernel methods and their applications to computational biology. Following three introductory chapters -- an introduction to molecular and computational biology, a short review of kernel methods that focuses on intuitive concepts rather than technical details, and a detailed survey of recent applications of kernel methods in computational biology -- the book is divided into three sections that reflect three general trends in current research. The first part presents different ideas for the design of kernel functions specifically adapted to various biological data; the second part covers different approaches to learning from heterogeneous data; and the third part offers examples of successful applications of support vector machine methods
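As a minimal illustration of the kernel-methods workflow the book surveys (not taken from the book itself), a classic string kernel for biological sequences is the k-spectrum kernel, the inner product of k-mer count vectors; the sequences and the choice k=3 below are invented for the example:

```python
from collections import Counter

def spectrum_kernel(x, y, k=3):
    """k-spectrum kernel: inner product of the k-mer count vectors of x and y."""
    cx = Counter(x[i:i + k] for i in range(len(x) - k + 1))
    cy = Counter(y[i:i + k] for i in range(len(y) - k + 1))
    # Only k-mers shared by both sequences contribute to the sum.
    return sum(cx[s] * cy[s] for s in cx)

# Gram matrix over a toy set of DNA fragments; such a matrix can be fed
# to any kernel machine (e.g. an SVM with a precomputed kernel).
seqs = ["ATGCGATG", "ATGCGTTG", "GGGTACCC"]
K = [[spectrum_kernel(a, b) for b in seqs] for a in seqs]
```

The resulting Gram matrix is symmetric and positive semidefinite, which is the property that lets SVMs and other kernel machines operate on sequences without an explicit vector representation.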
Computational systems biology of cancer by Emmanuel Barillot( )

6 editions published between 2011 and 2013 in English and held by 44 WorldCat member libraries worldwide

Méthodes statistiques pour la modélisation du langage naturel by Jean-Philippe Vert( Book )

2 editions published in 2001 in French and English and held by 4 WorldCat member libraries worldwide

Etude de noyaux de semigroupe sur objets structurés dans le cadre de l'apprentissage statistique by Marco Cuturi( Book )

2 editions published in 2005 in French and held by 3 WorldCat member libraries worldwide

Kernel methods are a family of data analysis tools which may be used in standard learning settings such as classification or regression. Such tools are grounded in an a priori similarity measure between the objects to be handled, known as a kernel in the functional analysis literature. Selecting the right kernel for a task is known to be tricky, notably when the objects have complex structures. In this work we propose various families of generic kernels for composite objects such as strings, graphs, or images, based on a theoretical framework that blends elements of reproducing kernel Hilbert space theory, information geometry, and harmonic analysis on semigroups. These kernels are also tested on datasets from the fields of bioinformatics and image analysis
Introduction de la connaissance à priori dans l'étude des puces à l'ADN by Franck Rapaport( Book )

2 editions published in 2008 in English and held by 3 WorldCat member libraries worldwide

In this thesis, we propose three new methods for microarray analysis that incorporate our prior knowledge of correlations underlying the problem. The first methodology uses metabolic network data for the analysis of gene expression profiles. It is based on the spectral decomposition of the network via the Laplacian matrix of the associated graph. The second approach is a new supervised classification method for arrayCGH microarray data. It is based on the usual L1-SVM problem, modified to incorporate a second regularization constraint reflecting the high likelihood that two successive probes belong to the same alteration region. The last method is another way of introducing the information contained in a network into the supervised classification of expression profiles. To this end, we add to the L1-SVM problem a regularization term reflecting our intent to assign similar weights in the decision function to two connected genes
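A minimal sketch (not the author's code) of the kind of objective described for the arrayCGH method: a hinge loss with an L1 sparsity penalty plus a fused penalty tying the weights of successive probes; the function name, toy data, and regularization values are all hypothetical:

```python
def fused_l1_objective(w, X, y, lam1=1.0, lam2=1.0):
    """Hinge loss + L1 sparsity penalty + fused penalty encouraging
    successive probes to share the same weight (same alteration region)."""
    hinge = sum(max(0.0, 1.0 - yi * sum(wj * xj for wj, xj in zip(w, xi)))
                for xi, yi in zip(X, y))
    sparsity = lam1 * sum(abs(wj) for wj in w)
    fusion = lam2 * sum(abs(w[j] - w[j - 1]) for j in range(1, len(w)))
    return hinge + sparsity + fusion

# A piecewise-constant weight vector pays a smaller fused penalty than a
# weight vector of the same L1 norm that jumps between adjacent probes.
X = [[1.0, 1.0, 0.0], [-1.0, -1.0, 0.0]]
y = [1, -1]
smooth = fused_l1_objective([0.5, 0.5, 0.0], X, y)
rough = fused_l1_objective([1.0, 0.0, 0.0], X, y)
```

Minimizing such an objective favors weight vectors that are both sparse and piecewise constant along the chromosome, matching the biology of contiguous copy-number alterations.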
Approches statistiques en segmentation : application à la ré-annotation de génome by Alice Cleynen( )

1 edition published in 2013 in English and held by 2 WorldCat member libraries worldwide

We propose to model data from transcriptome sequencing technologies (RNA-Seq) with the negative binomial distribution, and we build segmentation models suited to their analysis at different biological scales, in a context where these technologies have become a valuable tool for genome annotation, gene expression analysis, and the detection of new transcripts. We develop a fast segmentation algorithm to analyze series at the chromosome scale, and we propose two methods for estimating the number of segments, which is directly related to the number of genes expressed in the cell, whether previously annotated or detected on this same occasion. The goal of precise gene annotation, and in particular of comparing transcription start and end sites between individuals, naturally leads us to the comparison of breakpoint locations in independent series. Within a Bayesian segmentation framework, we build tools to answer these questions, for which we are able to provide uncertainty measures. We illustrate our models, all implemented in R packages, on RNA-Seq data from experiments on yeast, and show for example that intron boundaries are conserved between conditions while transcription starts and ends are subject to differential splicing
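A toy sketch of the modelling idea (not the authors' R packages): score a single changepoint in a count series under a negative binomial likelihood with a fixed, invented dispersion, setting each segment's mean to its sample average; the series below is fabricated:

```python
import math

def nb_segment_cost(x, r=5.0):
    """Negative log-likelihood of a count segment under a negative binomial
    with fixed dispersion r and mean set to the segment average."""
    mu = max(sum(x) / len(x), 1e-8)
    p_log = r * math.log(r / (r + mu))   # log of (r / (r + mu))**r
    q_log = math.log(mu / (r + mu))      # log of the per-count factor
    ll = 0.0
    for xi in x:
        ll += (math.lgamma(xi + r) - math.lgamma(r) - math.lgamma(xi + 1)
               + p_log + xi * q_log)
    return -ll

def best_two_segments(x):
    """Exhaustive search for the single changepoint minimising total cost."""
    costs = [nb_segment_cost(x[:t]) + nb_segment_cost(x[t:])
             for t in range(1, len(x))]
    return min(range(len(costs)), key=costs.__getitem__) + 1

# Low-count region followed by a highly expressed one: the best split
# should fall at the boundary between the two regimes.
split = best_two_segments([2, 3, 2, 2, 50, 48, 52, 49])
```

Real segmenters extend this idea with dynamic programming over many segments and model selection for the segment count, but the per-segment cost is the same ingredient.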
An accurate and interpretable model for siRNA efficacy prediction by Jean-Philippe Vert( )

1 edition published in 2006 in English and held by 2 WorldCat member libraries worldwide

Effective normalization for copy number variation in Hi-C data by Nicolas Servant( )

1 edition published in 2018 in English and held by 2 WorldCat member libraries worldwide

A priori structurés pour l'apprentissage supervisé en biologie computationnelle by Laurent Jacob( Book )

2 editions published in 2009 in English and held by 2 WorldCat member libraries worldwide

Supervised learning methods are used to build functions which accurately predict the behavior of new objects from observed data. They are therefore extremely useful in several computational biology problems, where they can exploit the increasing amount of empirical data generated by high-throughput technologies, or the accumulation of experimental knowledge in public databases. In several cases, however, the amount of training data is not sufficient to deal with the complexity of the learning problem. Fortunately, this type of ill-posed problem is not new in statistics and statistical machine learning. It is classically addressed using regularization approaches, or equivalently using a prior on what the function should be like. In this thesis, we build on this principle and propose new regularization methods based on biological prior knowledge for each problem. In the context of in silico vaccine and drug design, we show how, using the knowledge that similar targets bind similar ligands, one can dramatically improve prediction accuracy for targets with few known ligands, and even make predictions for targets with no known ligand. We also design a convex regularization function which takes into account the fact that some groups of targets, unknown beforehand, tend to have the same binding behavior. Finally, in the context of outcome prediction from molecular data, we propose a regularization function which leads to sparse vectors whose support is typically a union of potentially overlapping groups of genes defined a priori, e.g., pathways, or a set of genes which tend to be connected to each other in a given graph reflecting biological information
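As a hypothetical sketch of the last regularizer described, a group-lasso-type penalty over possibly overlapping gene groups sums the Euclidean norms of the weight vector restricted to each group; the groups and weights below are made up for illustration:

```python
import math

def overlapping_group_penalty(w, groups):
    """Sum of Euclidean norms over (possibly overlapping) index groups;
    minimising it drives whole groups of weights to zero at once,
    yielding supports that are unions of the given groups."""
    return sum(math.sqrt(sum(w[i] ** 2 for i in g)) for g in groups)

# Gene 1 belongs to both toy "pathways", so the groups overlap.
w = [3.0, 4.0, 0.0]
penalty = overlapping_group_penalty(w, [[0, 1], [1, 2]])
```

The non-smooth group norms are what produce group-level sparsity, much as the absolute value in the plain lasso produces coordinate-level sparsity.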
Méthodes à noyau pour l'annotation automatique et la prédiction d'interaction de structures de protéine = Kernel methods for automatic annotation and interaction prediction of protein 3D structures by Martial Hue( Book )

2 editions published in 2011 in English and held by 2 WorldCat member libraries worldwide

As large quantities of protein 3D structures are now routinely solved, there is a need for computational tools to automatically annotate protein structures. In this thesis, we investigate several machine learning approaches for this purpose, based on the popular support vector machine (SVM) algorithm. Indeed, the SVM offers several possibilities to overcome the complexity of protein structures and their interactions. We propose to address both issues by investigating new positive definite kernels. First, a kernel function for the annotation of protein structures is devised. The kernel is based on a similarity measure called MAMMOTH. Classification tasks corresponding to Enzyme Classification (EC), Structural Classification of Proteins (SCOP), and Gene Ontology (GO) annotation show that the MAMMOTH kernel significantly outperforms other choices of kernels for protein structures and classifiers. Second, we design a kernel in the context of binary supervised prediction of objects with a specific structure, namely pairs of general objects. The problem of inferring missing edges in a protein-protein interaction network can be cast in this context. Our results on three benchmarks of interaction between protein structures suggest that the Metric Learning Pairwise Kernel (MLPK), in combination with the MAMMOTH kernel, yields the best performance. Lastly, we introduce a new and efficient learning method for the supervised prediction of protein interaction. A pairwise kernel method is motivated by two previous methods, the Tensor Product Pairwise Kernel (TPPK) and the local model. The connection between the approaches is made explicit and the two methods are formulated in a new common framework, which yields a natural generalization by interpolation
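The Tensor Product Pairwise Kernel mentioned here has a simple closed form over a base kernel; a hedged sketch with a dot-product base kernel on toy vectors (all data invented, standing in for protein representations):

```python
def tppk(k, pair1, pair2):
    """Tensor Product Pairwise Kernel: symmetrised product of the base
    kernel k over the two ways of matching the pairs' elements."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)

def dot(u, v):
    """Toy base kernel standing in for a structure kernel such as MAMMOTH."""
    return sum(ui * vi for ui, vi in zip(u, v))

p1 = ((1.0, 0.0), (0.0, 1.0))
p2 = ((1.0, 1.0), (2.0, 0.0))
value = tppk(dot, p1, p2)
```

The symmetrisation makes the kernel invariant to the order within each pair, which is what lets an SVM treat undirected protein-protein interactions as unordered pairs.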
A Bayesian active learning strategy for sequential experimental design in systems biology by Edouard Pauwels( )

1 edition published in 2014 in English and held by 2 WorldCat member libraries worldwide

Fonctions noyaux pour molécules et leur application au criblage virtuel par machines à vecteurs de support by Pierre Mahe( Book )

2 editions published in 2006 in English and held by 2 WorldCat member libraries worldwide

Therapeutic research increasingly relies on modeling techniques, known as virtual screening, which aim to correlate the structure of a molecule with its biological properties. In particular, predictive models that quantify a molecule's toxicity or its activity against a therapeutic target can considerably reduce the time and cost required to develop new drugs. We propose to address this problem within the framework of kernel methods, which make it possible to build such models efficiently once a kernel function measuring the similarity of the objects under consideration is available. More specifically, the purpose of this thesis is to define such kernel functions between two- and three-dimensional structures of molecules. From a methodological standpoint, this problem amounts to comparing graphs representing the covalent bonds of molecules, or sets of atoms in space, respectively. Several approaches are considered, based on extracting and comparing various structural motifs that encode the functional groups of molecules at different levels of resolution. Experimental validations suggest that this methodology is a promising alternative to classical approaches in virtual screening
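A hypothetical miniature of the graph-based (2D) side of this idea: compare two molecules by counting common labeled walks in their covalent-bond graphs, in the spirit of walk kernels for molecules; the toy "ethanol" and "methanol" graphs and all names are illustrative only:

```python
from collections import Counter

def labeled_walks(atoms, bonds, length):
    """Count the label sequences of all walks of a given length in a
    molecular graph (atoms: index -> element, bonds: adjacency list)."""
    walks = [[i] for i in atoms]
    for _ in range(length):
        walks = [w + [j] for w in walks for j in bonds[w[-1]]]
    return Counter(tuple(atoms[i] for i in w) for w in walks)

def walk_kernel(mol1, mol2, length=2):
    """Inner product of labeled-walk count vectors of two molecules."""
    c1 = labeled_walks(*mol1, length)
    c2 = labeled_walks(*mol2, length)
    return sum(c1[s] * c2[s] for s in c1)

# Heavy-atom graphs: atoms maps index -> element, bonds is an adjacency list.
ethanol = ({0: "C", 1: "C", 2: "O"}, {0: [1], 1: [0, 2], 2: [1]})
methanol = ({0: "C", 1: "O"}, {0: [1], 1: [0]})
similarity = walk_kernel(ethanol, methanol, length=1)
```

Practical molecular kernels refine this with richer atom labels, walk down-weighting, or 3D pharmacophore motifs, but the count-and-compare structure is the same.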
Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications by Koen Van den Berge( )

1 edition published in 2018 in English and held by 2 WorldCat member libraries worldwide

A convex formulation for joint RNA isoform detection and quantification from multiple RNA-seq samples by Elsa Bernard( )

1 edition published in 2015 in English and held by 2 WorldCat member libraries worldwide

ProDiGe: Prioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples by Fantine Mordelet( )

1 edition published in 2011 in English and held by 2 WorldCat member libraries worldwide

Méthodes statistiques pour la mise en correspondance de descripteurs by Olivier Collier( )

1 edition published in 2013 in English and held by 2 WorldCat member libraries worldwide

Many applications, as in computer vision or medicine, aim at identifying the similarities between several images or signals. Thereafter, it is possible to detect objects, to follow them, or to overlay different pictures. In every case, the algorithmic procedures that treat the images use a selection of key points that they try to match by pairs. The most popular algorithm nowadays is SIFT, which performs key point selection and descriptor calculation, and provides a criterion for global descriptor matching. In the first part, we aim at improving this procedure by changing the original descriptor, which requires finding the argmax of a histogram, a computation that is statistically unstable. We therefore also have to change the criterion used to match two descriptors. This yields a nonparametric hypothesis testing problem, in which both the null and the alternative hypotheses are composite, even nonparametric. We use the generalized likelihood ratio test to get consistent testing procedures, and carry out a minimax study. In the second part, we are interested in the optimality of the global matching procedure. We give a statistical model in which some descriptors are present in a given order in a first image, and in another order in a second image. Descriptor matching is equivalent in this case to the estimation of a permutation. We give an optimality criterion for the estimators in the minimax sense. In particular, we use the likelihood to find several consistent estimators, which are even optimal under some conditions. Finally, we tackle some practical aspects and show that our estimators are computable in reasonable time, illustrating their hierarchy with simulations
Learning smoothing models of copy number profiles using breakpoint annotations by Toby Dylan Hocking( )

1 edition published in 2013 in English and held by 2 WorldCat member libraries worldwide

Changes in genome organization of parasite-specific gene families during the Plasmodium transmission stages by Evelien M Bunnik( )

1 edition published in 2018 in English and held by 2 WorldCat member libraries worldwide

Integrative DNA methylation and gene expression analysis to assess the universality of the CpG island methylator phenotype by Matahi Moarii( )

1 edition published in 2015 in English and held by 2 WorldCat member libraries worldwide

Méthodes à noyaux pour les réseaux convolutionnels profonds by Alberto Bietti( )

1 edition published in 2019 in English and held by 2 WorldCat member libraries worldwide

The increased availability of large amounts of data, from images in social networks, speech waveforms from mobile devices, and large text corpora, to genomic and medical data, has led to a surge of machine learning techniques. Such methods exploit statistical patterns in these large datasets for making accurate predictions on new data. In recent years, deep learning systems have emerged as a remarkably successful class of machine learning algorithms, which rely on gradient-based methods for training multi-layer models that process data in a hierarchical manner. These methods have been particularly successful in tasks where the data consists of natural signals such as images or audio; this includes visual recognition, object detection or segmentation, and speech recognition. For such tasks, deep learning methods often yield the best known empirical performance; yet, the high dimensionality of the data and the large number of parameters of these models make them challenging to understand theoretically. Their success is often attributed in part to their ability to exploit useful structure in natural signals, such as local stationarity or invariance, for instance through choices of network architectures with convolution and pooling operations. However, such properties are still poorly understood from a theoretical standpoint, leading to a growing gap between the theory and practice of machine learning. This thesis aims to bridge this gap by studying spaces of functions which arise from given network architectures, with a focus on the convolutional case. Our study relies on kernel methods, by considering reproducing kernel Hilbert spaces (RKHSs) associated to certain kernels that are constructed hierarchically based on a given architecture. This allows us to precisely study smoothness, invariance, stability to deformations, and approximation properties of functions in the RKHS.
These representation properties are also linked with optimization questions when training deep networks with gradient methods in some over-parameterized regimes where such kernels arise. They also suggest new practical regularization strategies for obtaining better generalization performance on small datasets, and state-of-the-art performance for adversarial robustness on image tasks
Audience Level
Audience level: 0.62 (from 0.59 for Méthodes ... to 0.99 for Méthodes ...)

Alternative Names
Jean-Philippe Vert researcher

Jean-Philippe Vert wetenschapper

English (39)

French (3)