Seminars of the academic year 2008-2009

  • 16 September 2008. Irene Gijbels, Department of Mathematics & Leuven Statistics Research Center, Katholieke Universiteit Leuven, Belgium

This talk concerns regression analysis for data that may show overdispersion or underdispersion. The starting point for modelling is the class of generalized linear models, in which we no longer impose a linear form on the mean regression function but allow it to be any smooth function of the covariate(s). To analyze overdispersed or underdispersed data, we additionally bring in an unknown dispersion function. The mean regression function and the dispersion function are then estimated using P-splines with a difference-type penalty to prevent overfitting. The choice of smoothing parameters and implementation issues are discussed. The performance of the estimation method is investigated via some simulations, and its use is illustrated on several data sets, including continuous data, counts and proportions.
This is based on joint work with Ilaria Prosdocimi and Gerda Claeskens.
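
To make the "difference-type penalty" concrete, here is a minimal sketch in Python. It assumes the usual P-spline form, where a penalty lam * ||D beta||^2 on the spline coefficients beta is added to the fitting criterion and D is a finite-difference matrix. The function names, the order (2) and the numbers are illustrative assumptions, not taken from the talk.

```python
def difference_matrix(n_coef, order=2):
    """Return the order-th difference matrix D as a list of rows, so that
    row i of D applied to beta gives the order-th difference starting at i."""
    # Start from the identity and difference its rows `order` times.
    rows = [[1.0 if i == j else 0.0 for j in range(n_coef)]
            for i in range(n_coef)]
    for _ in range(order):
        rows = [[b - a for a, b in zip(rows[i], rows[i + 1])]
                for i in range(len(rows) - 1)]
    return rows


def difference_penalty(beta, lam, order=2):
    """Penalty lam * ||D beta||^2 on a vector of spline coefficients."""
    D = difference_matrix(len(beta), order)
    diffs = [sum(d * b for d, b in zip(row, beta)) for row in D]
    return lam * sum(v * v for v in diffs)


linear = [0.0, 1.0, 2.0, 3.0, 4.0]   # coefficients lying on a straight line
wiggly = [0.0, 2.0, 0.0, 2.0, 0.0]   # rapidly oscillating coefficients
print(difference_penalty(linear, lam=10.0))  # 0.0
print(difference_penalty(wiggly, lam=10.0))  # 480.0
```

A second-order penalty leaves linear coefficient sequences unpenalized and charges heavily for oscillation, which is how a larger smoothing parameter lam guards against overfitting.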

  • 7 October 2008. Anouar el Ghouch, Department of Econometrics, University of Geneva, Switzerland

We propose a new approach to conditional quantile function estimation that combines parametric and nonparametric techniques. At each design point, a global, possibly incorrect, pilot parametric model is locally adjusted through a kernel smoothing fit. The resulting quantile regression estimator behaves like a parametric one when the parametric model is correct and converges to the nonparametric solution as the parametric start deviates from the true underlying model. We give a Bahadur-type representation of the proposed estimator, from which consistency and asymptotic normality are derived under a strong mixing assumption. We also discuss numerical implementation and investigate the performance of the estimator via simulations. Finally, we propose and numerically study a practical bandwidth selector based on the plug-in principle, and we illustrate the methodology on a real data example.
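
As a rough illustration of the nonparametric ingredient only, the sketch below computes a local-constant, kernel-weighted quantile by minimising the check loss. It is not the speaker's guided estimator: the parametric pilot step is omitted, and the Gaussian kernel, the function names and the toy data are assumptions made for this example.

```python
import math


def check_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def gaussian_kernel(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def local_quantile(x0, xs, ys, tau, bandwidth):
    """Local-constant kernel estimate of the tau-th conditional quantile at x0.

    Minimises sum_i K((x_i - x0) / h) * rho_tau(y_i - q) over q. A minimiser
    is always attained at one of the observed y_i, so we search over those.
    """
    weights = [gaussian_kernel((x - x0) / bandwidth) for x in xs]

    def objective(q):
        return sum(w * check_loss(y - q, tau) for w, y in zip(weights, ys))

    return min(sorted(set(ys)), key=objective)


# Toy data: two well-separated clusters of design points.
xs = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
ys = [1.0, 2.0, 3.0, 101.0, 102.0, 103.0]
print(local_quantile(0.0, xs, ys, tau=0.5, bandwidth=1.0))   # 2.0
print(local_quantile(10.0, xs, ys, tau=0.5, bandwidth=1.0))  # 102.0
```

With a small bandwidth the estimate at each point is driven by nearby observations only, which is the behaviour the kernel adjustment contributes when the pilot model is wrong.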

  • 14 October 2008. Victor Panaretos, Institut de Mathématiques, EPFL

What can be said about an unknown density function on $\mathbb{R}^n$ given a finite collection of $(n-1)$-dimensional marginals at random and unknown orientations? This question arises in single particle electron microscopy, a powerful method that biophysicists employ to learn about the structure of biological macromolecules. The method images unconstrained particles, as opposed to particles fixed on a lattice (crystallography), and poses a variety of statistical problems. We formulate and study statistically one such problem, namely the estimation of a structural model for a biological particle given random projections of its Coulomb potential density, observed through the electron microscope. Although unidentifiable (ill-posed), this problem can be seen to be amenable to a statistical solution once parametric assumptions are imposed. It can also be seen to present challenges both from a data analysis point of view (e.g. uncertainty estimation and presentation) and computationally.

  • 25 November 2008. Marloes Maathuis, ETH Zurich.

We present a new method for determining variable importance, using intervention calculus. We assume that we have observational data, generated from an unknown underlying directed acyclic graph (DAG) model. A DAG is not identifiable from observational data, but it is possible to consistently estimate an equivalence class of DAGs. Moreover, for any given DAG, causal effects can be estimated using intervention calculus. In this talk, we combine these two parts. For each DAG in the estimated equivalence class, we use intervention calculus to estimate the causal effects of the covariates on the response. This yields a collection of estimated causal effects for each covariate. We show that the distinct values in this set can be consistently estimated by an algorithm that uses only local information of the graph. This local approach is computationally fast and feasible in high-dimensional problems. We propose to use summary measures of the set of possible causal effects to determine variable importance. In particular, we use the minimum absolute value of this set, since that is a conservative bound on the size of the causal effect.

 (Based on joint work with Markus Kalisch and Peter Buehlmann) 
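
The summary measure described in this abstract is simple to sketch. The code below covers only the last step: it assumes the sets of possible causal effects have already been estimated, and the function name and numbers are made up for illustration.

```python
def rank_by_min_abs_effect(possible_effects):
    """Rank covariates by the minimum absolute value of their possible effects.

    `possible_effects` maps a covariate name to the list of causal effects
    estimated over the DAGs in the equivalence class (hypothetical input).
    """
    bounds = {name: min(abs(e) for e in effects)
              for name, effects in possible_effects.items()}
    # Largest lower bound first: these covariates have a sizeable effect
    # whichever DAG in the equivalence class is the true one.
    return sorted(bounds.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers purely for illustration.
effects = {
    "x1": [0.8, 0.9, 1.1],
    "x2": [0.0, 2.5],      # zero in one DAG, so the lower bound is 0
    "x3": [-0.4, -0.3],
}
print(rank_by_min_abs_effect(effects))
# [('x1', 0.8), ('x3', 0.3), ('x2', 0.0)]
```

Note how x2 ranks last despite its large effect in one DAG: the minimum absolute value is conservative by construction.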

  • 10 March 2009, Paul Doukhan, UFR Sciences Techniques, Université Cergy Pontoise

After recalling the questions raised by relaxing the assumption of stochastic independence, we review the usual solutions as well as a notion of weak dependence introduced with Sana Louhichi (1999). This notion, developed in a book that appeared in 2007 (published with her, Jérome Dedecker, Gabriel Lan, José R. Leon, and Clémentine Prieur), gives rise to a probabilistic theory that extends the one built on stochastic independence and from which numerous statistical applications can be derived.
The time series models of classical statistics are first generalized. Then, rather than presenting the probabilistic tools that go with this technique, the aim of the talk is to indicate some uses of these techniques for questions of estimation and resampling.

  • 31 March 2009, Sylvain Sardy, Section Mathématiques, Université Genève

Abstract: We consider the problem of estimating the volatility of a financial asset from a time series record. We believe the underlying volatility process is smooth, with potential abrupt changes due to market news, and possibly stationary.
By drawing parallels between time series and regression models, in particular between stochastic volatility models and Markov random field smoothers, we propose a semiparametric estimator of volatility that we apply to real financial data.
Joint work with David Neto (UG) and Paul Tseng (UW)

  • 21 April 2009. Isabel Molina, Departamento de Estadistica, Universidad Carlos III de Madrid

When a sample is drawn from a large finite population but estimates are required for a large number of population subgroups (domains), direct estimators for some of these domains may be unreliable due to lack of domain sample data. This is the usual setup of small domain estimation, in which the aim is to obtain estimators with better properties than direct estimators by using the data from other domains (indirect estimation) and extra information provided by auxiliary variables (through some kind of model). This presentation will review the standard model-based small domain estimation methods and some recent developments in this field. Some applications will be described.

  • 5 May 2009, Anthea Monod, Institut de Statistique, Université de Neuchâtel

With the recent phenomena of globalization and the rapid advancement in technology, the collection of data over vast surface areas and over several time periods (that is, space-time data) has become increasingly accessible, revealing complex data structures that inspire a demand for new analysis and modeling techniques. The applications of such techniques are wide-reaching and becoming increasingly prominent in various scientific disciplines, including physical, environmental, and biological sciences, hydrology and fluid dynamics. 
A pivotal concern for statistical analysis aimed at optimal spatial-temporal interpolation and prediction is the modeling and estimation of the covariance structure.  In this talk, I will present an overview of existing models for covariance structures for spatial data, including a discussion on notions associated with, and desirable properties of, spatial covariance functions; I will also present an explicit construction of the celebrated Matérn class of spatial covariances.  I will then discuss difficulties of introducing a time component into existing models for spatial covariance, and give an overview of existing spatial-temporal covariance models and discuss their advantages and limitations.  I will close with a discussion on future directions for subsequent development in the construction of covariance functions for space-time stochastic processes.
This work is conducted under the supervision of Stephan Morgenthaler (EPFL).
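
For readers unfamiliar with the Matérn class mentioned in this abstract, here is a minimal sketch of the covariance as a function of distance. It assumes one common parametrisation (variance sigma2, range rho, smoothness nu) and codes only the three half-integer cases that have simple closed forms. The general case needs a modified Bessel function and is left out, and this is not the explicit construction presented in the talk.

```python
import math


def matern(h, sigma2=1.0, rho=1.0, nu=1.5):
    """Matern covariance at distance h >= 0 for nu in {0.5, 1.5, 2.5}."""
    if h < 0:
        raise ValueError("distance h must be non-negative")
    if nu == 0.5:
        # Exponential covariance.
        return sigma2 * math.exp(-h / rho)
    if nu == 1.5:
        s = math.sqrt(3.0) * h / rho
        return sigma2 * (1.0 + s) * math.exp(-s)
    if nu == 2.5:
        s = math.sqrt(5.0) * h / rho
        return sigma2 * (1.0 + s + s * s / 3.0) * math.exp(-s)
    raise ValueError("only nu = 0.5, 1.5, 2.5 have the closed forms coded here")


for nu in (0.5, 1.5, 2.5):
    print(nu, [round(matern(h, nu=nu), 4) for h in (0.0, 0.5, 1.0, 2.0)])
```

The smoothness parameter nu controls how quickly the covariance falls away near distance zero, and hence how rough the resulting random field is.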

  • 19 May 2009, Guillaume Chauvet, Crest-Ensai, Rennes

The Cube method proposed by Deville and Tillé (2004) makes it possible to select balanced samples, that is, samples that allow exact estimation of the totals of auxiliary variables. Since exactly balanced sampling is generally impossible to achieve, the Cube method usually proceeds in two steps, called the flight phase and the landing phase. Several approximations of the variance associated with the flight phase are proposed by Deville and Tillé (2005), but the landing phase can add a non-negligible variance. We propose here a method for obtaining an approximation of the variance-covariance matrix, leading to a nearly unbiased variance estimator. This method is evaluated through some simulations. This is work in progress, in collaboration with Jay Breidt of the University of Colorado.
Deville, J-C., and Tillé, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91, pages 893-912.
Deville, J-C., and Tillé, Y. (2005). Variance approximation under balanced sampling. Journal of Statistical Planning and Inference, 128, pages 569-591.
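
The balancing condition in the abstract above can be sketched as a simple check: a sample is balanced when the Horvitz-Thompson estimates of the auxiliary totals equal the true population totals. The code below only tests this condition on a toy population. It does not implement the Cube method's flight or landing phases, and all names and data are assumptions made for illustration.

```python
def horvitz_thompson_total(values, incl_probs, sample):
    """Horvitz-Thompson estimate of a total from the sampled unit indices."""
    return sum(values[k] / incl_probs[k] for k in sample)


def is_balanced(aux, incl_probs, sample, tol=1e-9):
    """True if the sample reproduces the population total of every auxiliary
    variable: sum over the sample of x_k / pi_k equals sum over all units of x_k."""
    return all(
        abs(horvitz_thompson_total(x, incl_probs, sample) - sum(x)) <= tol
        for x in aux
    )


# Toy population of 4 units with equal inclusion probabilities (made-up data).
pi = [0.5, 0.5, 0.5, 0.5]
aux = [
    [1.0, 1.0, 1.0, 1.0],   # constant: balancing on it fixes the sample size
    [1.0, 2.0, 3.0, 4.0],   # a size variable
]
print(is_balanced(aux, pi, sample=[0, 3]))  # True
print(is_balanced(aux, pi, sample=[0, 1]))  # False
```

In realistic populations no sample satisfies the condition exactly, which is why an approximately balanced sample, and the extra variance it brings, has to be accounted for.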