Research and development

Completed or ongoing research

1/ Variance estimation via resampling methods in survey methodology, Erika Antal, supervised by Yves Tillé

Resampling methods such as the bootstrap and the jackknife are now widely used for variance estimation in nearly all statistical problems. In survey sampling, however, the variance depends on the sampling design and can take a complex form when the design is sophisticated. New resampling methods for variance estimation are proposed. They consist of selecting a subsample directly from the initial sample in such a way that the exact variance estimator is reproduced in the linear case.
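To make "reproducing the exact estimator in the linear case" concrete, here is a minimal sketch using the classical delete-one jackknife under simple random sampling without replacement. It is not the new method proposed in this project; the function name, the data and the population size are illustrative assumptions.

```python
import numpy as np

def jackknife_variance(y, estimator, N=None):
    """Delete-one jackknife variance of `estimator` (hypothetical helper).

    If the population size N is given, the finite population
    correction (1 - n/N) is applied.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Recompute the estimator with each unit left out in turn.
    replicates = np.array([estimator(np.delete(y, i)) for i in range(n)])
    v = (n - 1) / n * np.sum((replicates - replicates.mean()) ** 2)
    if N is not None:
        v *= 1 - n / N
    return v

# Illustrative sample of size 6 from an assumed population of size 60.
y = np.array([12.0, 15.0, 9.0, 20.0, 14.0, 11.0])
N = 60

v_jack = jackknife_variance(y, np.mean, N=N)
# Classical variance estimator of the sample mean under this design.
v_classic = (1 - y.size / N) * y.var(ddof=1) / y.size
print(v_jack, v_classic)
```

For the sample mean, which is linear in the observations, the two values coincide. For complex designs this agreement is no longer automatic, which is the difficulty the project addresses.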

2/ Measures of Inequalities in Sampling from Finite Population, Yves Tillé - Matti Langel, supervised by Yves Tillé

The main goal is to evaluate, from a statistical point of view, the adequacy of the chosen inequality indicators and, where appropriate, to propose alternatives. One of the essential questions that research on inequality should aim to answer is straightforward: are inequalities increasing? To provide credible statistical answers to this question, variance estimation and the construction of confidence intervals are imperative. Variance estimation will therefore be a major focus of this study. This research is supported by grant no. 200021-121604 of the Swiss National Science Foundation.
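As an illustration of the kind of indicator under study, the sketch below computes a Gini index from survey data with sampling weights. The function name and the toy data are assumptions; the estimator uses a common weighted form based on cumulated weights, and the variance estimation that the project focuses on is not shown.

```python
import numpy as np

def weighted_gini(y, w):
    """Gini index estimated from incomes y and survey weights w (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Sort units by increasing income and cumulate their weights.
    order = np.argsort(y)
    y, w = y[order], w[order]
    cum_w = np.cumsum(w)
    n_hat = cum_w[-1]        # estimated population size
    y_hat = np.sum(w * y)    # estimated income total
    return 2.0 * np.sum(w * (cum_w - w / 2.0) * y) / (n_hat * y_hat) - 1.0

# Equal incomes give no inequality; one unit holding everything gives a high index.
g_equal = weighted_gini([1, 1, 1, 1], [1, 1, 1, 1])
g_concentrated = weighted_gini([0, 0, 0, 1], [1, 1, 1, 1])
print(g_equal, g_concentrated)
```

Because the index is a nonlinear function of estimated totals and ranks, its variance cannot be read off a standard formula, hence the need for linearization or resampling.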

3/ Estimating the precision of inequality indices in the SILC survey

Since September 2009, Anne Massiani has been working on estimating the precision of inequality indices in the complex setting of the SILC survey. This work relies on linearization techniques and draws in particular on recent developments proposed by Langel and Tillé (see the previous paragraph). As a first step, the computations were carried out under the simplifying assumptions used by Eurostat (which do not account for the effects of imputation).

4/ Non-linear, Non-parametric, Dynamic Methods in Hedge Fund Research, Donatien Tafin, supervised by Catalin Starica

This research explores the world of alternative investment strategies. We aim to model correlations and nonlinear statistical relations among hedge fund investment strategies, and between hedge fund categories and broad asset classes, in order to improve our understanding of the highly dynamic, complex investment profile of this vehicle and its systematic exposure to the market. The research will combine a set of dynamic, adaptive methods and non-parametric approaches from the field of statistical learning.

In this period of intense debate over the structure of the financial industry and how its architecture should be redesigned to safeguard it against future crises, applying advanced statistical techniques to model and better understand hedge-fund-like returns will highlight a set of implications for asset allocation, performance analysis and risk management practices.

5/ Development of the 'sampling' module for the R language

The 'sampling' package was created as a teaching tool for the course "Advanced methods of survey sampling", organized by the METH unit of the OFS (Swiss Federal Statistical Office) in April 2005 as part of the "European Statistical Training Programme" and funded by the European Free Trade Association (EFTA). The course was intended for statisticians from the statistical institutes of European countries. The package, now available on the R language website, is used by many survey practitioners (see http://cran.r-project.org/src/contrib/Descriptions/sampling.html ). It includes procedures implementing the modern sampling algorithms described in Yves Tillé (2006), Sampling Algorithms, Springer, New York, as well as the usual sampling algorithms. It also contains functions for calibrating survey data on census results, for handling nonresponse, and a few procedures for variance estimation. Alina Matei is responsible for developing and maintaining the 'sampling' package. Priority is given to calibration methods, variance estimation and the treatment of nonresponse. In this context, a new procedure for applying generalized calibration has been implemented. The latest version of the 'sampling' package is version 2.4 (December 2010); it is an official package of the R software.
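The idea behind calibration can be sketched in a few lines. The example below is a standalone Python illustration of linear calibration, not the interface of the R 'sampling' package; the function name, design weights, auxiliary variables and population totals are all assumptions made for the example.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Adjust design weights d so that the weighted totals of X match `totals`.

    Hypothetical helper using the linear calibration method:
    w_k = d_k * (1 + x_k' lambda).
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    gap = totals - X.T @ d            # distance between known and estimated totals
    M = X.T @ (d[:, None] * X)        # sum over the sample of d_k * x_k x_k'
    lam = np.linalg.solve(M, gap)
    return d * (1.0 + X @ lam)

# Five sampled units, each with design weight 20 (assumed).
d = np.full(5, 20.0)
# Auxiliary variables: a constant and an indicator of some category (assumed).
X = np.array([[1.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
# Totals assumed known from a census: population size 100, 45 units in the category.
totals = np.array([100.0, 45.0])

w = linear_calibration(d, X, totals)
print(w, X.T @ w)
```

The calibrated weights reproduce the known totals exactly. The linear method can produce negative weights in unfavourable cases, which is one reason other distance functions are used in practice.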

6/ Do Corporate Social Responsibility scores explain and predict firm profitability? A case study on the publishers of the Dow Jones Sustainability Indexes
Research conducted by C. Starica and C. Manescu (ECB).

This study conducts an in-depth analysis of the association between a unique fourteen-dimensional set of corporate social responsibility (CSR) scores and firm profitability, measured by Return on Assets (ROA). We find that non-linear (semi-parametric or non-parametric) regression methods bring important improvements in explaining profitability relative to a classical linear approach. While a number of CSR variables, such as corporate governance, talent attraction and codes of conduct, might have some explanatory power, the CSR scores do not improve on the standard variables known to be associated with ROA. The results of this project will soon be submitted for publication.

7/ The cost of sustainability on optimal portfolio choices.
Research conducted by C. Starica together with S. Herzel (Università di Roma - Tor Vergata) and M. Nicolosi (Università di Perugia).

We examine the impact of sustainability criteria, as measured by the KLD scores, on optimal portfolio selection performed on an investment universe containing the equities in the S&P500 index and covering the period between 1993 and 2008. The optimizations follow the Markowitz mean-variance approach, while sustainability constraints are introduced by eliminating from the investment pool those assets that do not comply with different social responsibility criteria (screening). We compare the two efficient frontiers, i.e. the one without and the one with screening. A spanning test is performed to determine whether the differences between the two efficient frontiers are significant. We introduce a measure of how much an investor has to pay (through loss of return or through additional risk) in order to satisfy given sustainability criteria. The analysis is carried out separately on the three main dimensions of sustainability, namely Environmental, Social and Governance. The first part of the research carried out within this project will soon be published.
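The effect of screening on risk can be illustrated on a toy universe. The sketch below compares only the global minimum-variance point of the two frontiers, with short sales allowed; the covariance matrix, the screening flags and the function name are invented for the example and are unrelated to the KLD or S&P500 data.

```python
import numpy as np

def min_variance_portfolio(cov):
    """Fully invested minimum-variance weights and variance (hypothetical helper)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)     # proportional to the optimal weights
    weights = x / (ones @ x)
    variance = 1.0 / (ones @ x)
    return weights, variance

# Assumed covariance matrix of annual returns for four assets.
cov = np.array([
    [0.040, 0.006, 0.010, 0.004],
    [0.006, 0.090, 0.012, 0.008],
    [0.010, 0.012, 0.060, 0.005],
    [0.004, 0.008, 0.005, 0.030],
])
# Assumed screening outcome: the third asset fails the sustainability criterion.
passes_screen = np.array([True, True, False, True])

w_full, var_full = min_variance_portfolio(cov)
cov_screened = cov[np.ix_(passes_screen, passes_screen)]
w_scr, var_scr = min_variance_portfolio(cov_screened)

# Additional volatility borne by the investor who applies the screen.
cost_in_risk = np.sqrt(var_scr) - np.sqrt(var_full)
print(var_full, var_scr, cost_in_risk)
```

Since the screened universe is a subset of the full one, its minimum variance can never be lower. Whether the gap is statistically significant is the question the spanning test addresses.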

8/ The missing link between the returns and the CSR scores
Research conducted by C. Starica and E. Stanghellini - University of Perugia

In a quantitative analysis of the relationship between CSR performance criteria and measures of a company's business success, we should take into account that CSR performance is a multidimensional construct involving different aspects, stemming from social and corporate governance requirements as well as from environmental responsibility. These three concepts are difficult to measure and are imperfectly captured by the CSR scores. We can therefore formalize the multidimensional nature of sustainable investment by means of latent traits that are measured with error by the CSR scores. The latent traits, in turn, affect the performance criteria.

In Figure 1 the postulated model is presented. We assume that the underlying latent constructs, here represented by three nodes ("Corporate", "Environment" and "Social") are imperfectly measured by the CSR scores. The three latent variables have a direct influence on the financial measures of the company. Each arrow in the graph corresponds to a parameter of the model. The arrows pointing from the latent nodes to the CSR scores refer to the measurement process, and we here assume that the CSR scores are imperfect measures of the latent traits in the model. The arrows pointing from the latent nodes to the financial measures of the company refer to the structural parameters, and they express the impact of the underlying dimensions on the performance indicators. The interest is in investigating the significance of these parameters, as well as the sign and the magnitude. A ranking of the companies according to each latent dimension of the model will also be provided.

Figure 1. The SEM for the relationships between the CSR scores and the performance indicators
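A model of this type is commonly written as a measurement equation plus a structural equation. The notation below is an assumption introduced for illustration, since the source describes the model only through the path diagram: x is the vector of CSR scores, y the vector of financial measures, and ξ the vector of the three latent traits (Corporate, Environment, Social).

```latex
\begin{aligned}
x &= \Lambda\,\xi + \delta
  && \text{(measurement part: latent traits} \to \text{CSR scores)} \\
y &= \Gamma\,\xi + \varepsilon
  && \text{(structural part: latent traits} \to \text{financial measures)}
\end{aligned}
```

Each nonzero entry of Λ or Γ corresponds to one arrow of the graph, and δ and ε are error terms. The parameters whose significance, sign and magnitude are of interest are the entries of Γ.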

The model in Figure 1 refers to a cross-sectional analysis in which the observed variables are measured in a given year. The KLD dataset provides, for each company, measurements of the observable variables repeated over consecutive years. It therefore also makes it possible to explore the dynamics underlying the process. This could be done by allowing the latent variables at time t to depend not only on the observable measurements but also on their own value at time t-1. Their impact on the performance indicators could also be investigated by allowing delayed effects on the financial measures. This research is at an early stage.

9/ Optimal sample allocation for stratified or Poisson designs, integration of minimum size constraints, and study of the risk arising from the randomness of net sample sizes - L. Qualité

Taking proper account of the random nature of the sample sizes obtained after nonresponse, together with the generalization at the OFS of the coordinated sampling system developed at Istat, calls for a revision of the sample allocation method. A report on the optimal solution to this problem is being written.
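For orientation, here is a minimal sketch of a textbook approach to allocation under minimum size constraints: Neyman allocation in which strata falling below their minimum are fixed at that minimum and the remaining budget is reallocated. It is not the solution developed in the report, it ignores nonresponse and upper bounds, and it returns real-valued sizes; all names and figures are assumptions.

```python
import numpy as np

def neyman_with_minimum(N_h, S_h, n_total, n_min):
    """Neyman allocation with per-stratum minimum sizes (hypothetical helper)."""
    N_h = np.asarray(N_h, dtype=float)
    S_h = np.asarray(S_h, dtype=float)
    n_min = np.asarray(n_min, dtype=float)
    if n_min.sum() > n_total:
        raise ValueError("minimum sizes exceed the total sample size")
    alloc = n_min.copy()
    free = np.ones(N_h.size, dtype=bool)   # strata not yet fixed at their minimum
    while free.any():
        # Share the remaining budget proportionally to N_h * S_h.
        budget = n_total - alloc[~free].sum()
        share = N_h[free] * S_h[free]
        proposal = budget * share / share.sum()
        too_small = proposal < n_min[free]
        if not too_small.any():
            alloc[free] = proposal
            break
        # Fix the violating strata at their minimum and iterate.
        idx = np.flatnonzero(free)[too_small]
        alloc[idx] = n_min[idx]
        free[idx] = False
    return alloc

# Three strata with assumed sizes, standard deviations and minimum sample sizes.
alloc = neyman_with_minimum(
    N_h=[1000, 500, 50], S_h=[10.0, 20.0, 5.0], n_total=200, n_min=[10, 10, 15]
)
print(alloc)
```

In this example the small third stratum is raised to its minimum and the two large strata share the rest. The sizes would still need rounding, and the realized net sizes after nonresponse remain random, which is the risk studied in this project.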

10/ Latent variable models in survey sampling theory - A. Matei

Alina Matei is studying the application of latent variable models in survey sampling theory. These models postulate the existence of unobservable variables that are computed from manifest variables. The covariation among the manifest variables is explained by the dependence of each observed variable on the latent variables. In addition, the observed variables are assumed to be independent conditionally on the latent variables. These models can be used to handle nonresponse, the aim being to reduce nonresponse bias.