**Abstract 2010**

wp2010_01 (Econometrics, Statistics)

**The Method of Simulated Scores for Estimating Multinormal Regression Models with Missing Values**

**Giorgio Calzolari, Laura Neri**

Given a set of continuous variables with missing data, we prove in this paper that the iterative application of a simple "least-squares estimation/multivariate normal simulation" procedure produces an efficient parameter estimator. There are two main assumptions behind our proof: (1) the missing data mechanism is ignorable; (2) the data generating process is a multivariate normal linear regression. Disentangling the iterative procedure and its convergence conditions, we show that the estimator is a "method of simulated scores" (a particular case of McFadden's "method of simulated moments"), and is thus equivalent to maximum likelihood if the number of replications is suitably large. We thus provide a non-Bayesian re-interpretation of the estimation/simulation problem. The computational procedure is obtained by introducing a simple modification into existing algorithms. Its software implementation is straightforward (a few simple statements in any programming language) and easily applicable to datasets with a large number of variables.
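The estimation/simulation loop described in this abstract can be illustrated with a toy case. The following is a minimal sketch under strong simplifying assumptions, not the authors' algorithm: a single regression of `y` on a fully observed `x`, values of `y` missing at random, and a fixed set of standard normal draws reused at every iteration so that the loop can settle on a fixed point. The function name `iterate_ls_simulation` and all parameter names are hypothetical.

```python
import numpy as np

def iterate_ls_simulation(x, y, n_rep=20, n_iter=200, seed=0):
    """Toy least-squares / normal-simulation loop (hypothetical helper).

    y contains np.nan where values are missing; x is fully observed.
    """
    rng = np.random.default_rng(seed)
    miss = np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])
    X_obs, y_obs, X_mis = X[~miss], y[~miss], X[miss]

    # Assumption: the same normal draws are reused at every iteration.
    e = rng.standard_normal((n_rep, int(miss.sum())))

    # Start from complete-case least squares.
    beta, *_ = np.linalg.lstsq(X_obs, y_obs, rcond=None)
    sigma = np.std(y_obs - X_obs @ beta)

    # Design matrix of the n_rep stacked completed datasets.
    X_stack = np.vstack([np.vstack([X_obs, X_mis])] * n_rep)

    for _ in range(n_iter):
        # Simulation step: draw the missing y from the fitted normal regression.
        y_sim = X_mis @ beta + sigma * e              # shape (n_rep, n_mis)
        y_stack = np.concatenate(
            [np.tile(y_obs, (n_rep, 1)), y_sim], axis=1
        ).ravel()
        # Estimation step: least squares on the stacked completed data.
        beta, *_ = np.linalg.lstsq(X_stack, y_stack, rcond=None)
        sigma = np.sqrt(np.mean((y_stack - X_stack @ beta) ** 2))
    return beta, sigma

# Usage on simulated data (true intercept 1, slope 2, error s.d. 1).
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500)
y[rng.random(500) < 0.3] = np.nan
beta_hat, sigma_hat = iterate_ls_simulation(x, y)
```

In this one-equation case the fixed point stays close to complete-case least squares, which is the maximum likelihood estimator when only the dependent variable has missing values. The multivariate setting treated in the paper is more general than this sketch.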

**Probabilistic classification of age by third molar development: the use of soft-evidence**

**Vilma Pinchi, Fabio Corradi, Iljà Barsanti, Stefano Garatti**

The aim of the paper is to classify individuals according to age through the dental development of their third molars. The teeth were classified on Demirjian’s 8-stage dental maturity scale, but we introduced a new and relevant variation: the odontologist is allowed to express uncertainty about the stage attributed to a tooth by means of soft evidence, which is included in the parametric learning. We used a modified Naïve Bayes classifier to classify 559 Italian youths (307 males and 252 females) aged between 16 and 23, according to dichotomous and trichotomous classifications. Results show the importance of the expert’s skill in reading the OPG and of the ability to express their beliefs about the dental maturity stage.

**A Time-varying Mixing Multiplicative Error Model for Realized Volatility**

**Giovanni De Luca, Giampiero M. Gallo**

In this paper we model the dynamics of realized volatility as a Multiplicative Error Model with a mixture of distributions for the innovation term, with time-varying mixing weights driven by the past behavior of volatility. The mixture treats innovations as a source of time-varying volatility of volatility and is able to capture the right-tail behavior of the distribution of volatility. The empirical results show that there is no substantial difference in the one-step-ahead conditional expectations obtained under the various mixing schemes, but that fixed mixing weights may be a binding constraint when deriving accurate quantiles of the predicted distribution.

**Augmented Designs to Assess Principal Strata Direct Effects**

**Alessandra Mattei, Fabrizia Mealli**

Many research questions involving causal inference are concerned with understanding the causal pathways by which a treatment affects an outcome. Thus, the concept of 'direct' versus 'indirect' effects comes into play. Disentangling direct and indirect effects may be a difficult task, because the intermediate outcome is generally not under experimental control. We tackle this problem by investigating new augmented experimental designs, where the treatment is randomized, and the mediating variable is not forced, but only randomly encouraged. There are two key features of our framework: we adopt a principal stratification approach, and we mainly focus on principal strata effects, avoiding the use of a priori counterfactual outcomes. Using nonparametric identification strategies, we provide a set of assumptions that allow us to partially identify the causal estimands of interest, the Principal Strata Direct Effects. Large sample bounds for various Principal Strata average Direct Effects are provided, and a simple hypothetical example is used to show how our augmented design can be implemented and how the bounds can be calculated. Finally, our augmented design is compared and contrasted with a standard randomized design.

Published in *Journal of the Royal Statistical Society: Series B (Statistical Methodology)*, forthcoming.

wp2010_05 (Social Statistics, Demography)

**Socioeconomic and territorial inequalities in health: findings for Italian elderly**

**Elena Pirani, Silvana Salvini**

Italy displays several territorial differences: as is well known, the South shows less favourable conditions than the North with respect to economic, social, and environmental aspects, and also in the field of health. The aim of our analysis is twofold. Our interest is, first, in the socioeconomic differences associated with inequalities in the self-rated health of elderly people in Italy, once demographic variables and actual health status are controlled for. In particular, we also intend to verify the role of the social network in older people's self-perception of health, a component usually not considered in the literature. Secondly, bearing in mind that in Italy responsibility for health care is delegated to a sub-national level, we explore the presence of a contextual effect among Italian areas, net of individual characteristics. Furthermore, we address the following question: is the regional breakdown sufficient to examine and explain differences in health performance, or do we need a more detailed territorial level of analysis? The study makes use of a representative cross-sectional survey on health conditions carried out by the Italian National Statistical Office (ISTAT) in 2004-2005. The large sample size and the sampling design allow us to analyse health characteristics at a sub-national level. Focusing on elderly people (65 years and over), the analysis refers both to the regional and to a sub-regional level (large areas). In order to describe the relationships among the health status of individuals, their demographic and socio-economic characteristics and the area of residence, a multilevel framework is adopted. The first result of this study is that Italy still presents health differences depending both on gender and on individual socio-economic status. A second result is that the residential context appears to be associated with the perception of individual health status. Individual characteristics, although they are the most important correlates of health, do not completely explain intra-regional heterogeneity, confirming the existence of a contextual effect. Thirdly, we found that territorial differences were present among Regions but also among large areas. However, these intra-regional differences are not particularly relevant or critical. The large-area level of detail does not add further insight into territorial heterogeneity, so the Regions seem to represent, in the Italian health context, a good territorial breakdown for approximating the residential environment of individuals.

Published in *Population Research and Policy Review*, Volume 31, Issue 1, pp. 97-117, 2012.

**Disentangling Systematic and Idiosyncratic Risk for Large Panels of Assets**

**Matteo Barigozzi, Christian T. Brownlees, Giampiero M. Gallo, David Veredas**

When observed over a large panel, measures of risk (such as realized volatilities) usually exhibit a secular trend around which individual risks cluster. In this article we propose a vector Multiplicative Error Model achieving a decomposition of each risk measure into a common systematic and an idiosyncratic component, while allowing for contemporaneous dependence in the innovation process. As a consequence, we can assess how much of the current asset risk is due to a system-wide component, and measure the persistence of the deviation of an asset-specific risk from that common level. We develop an estimation technique, based on a combination of seminonparametric methods and copula theory, that is suitable for large-dimensional panels. The model is applied to two panels of daily realized volatilities between 2001 and 2008: the SPDR Sectoral Indices of the S&P500 and the constituents of the S&P100. Similar results are obtained on the two sets in terms of the behavior of the common nonstationary component and of the idiosyncratic dynamics, which revert to it with a variable speed that appears to be sector dependent.

**Fertility dynamics in France and Italy. Who are the couples that do not give birth to the intended child?**

**Daniele Vignoli, Arnaud Régnier-Loilier**

France and Italy lie at the two extremes as regards fertility levels in Europe. Although previous findings showed that desired fertility is very similar in France and Italy, an examination of intentions to have a child in the following three years points to a country-specific difference. Namely, in France reproductive intentions are higher than in Italy for all parities. Moreover, since the actual fertility levels are so different, there could be some sort of constraint that limits fertility more strongly in Italy than in France. Taking advantage of the first two rounds of the French and Italian Gender and Generation Surveys, in this paper we aim to highlight the profiles of those couples who do not realize their intended fertility projects in the two countries considered. This line of reasoning may provide important input to policy makers wishing to lift the constraints on fertility realization.

Published in *Population*, Volume 66, Issue 2, pp. 401-431, 2011.

**Home Tenure among the Old Europeans: a Gender Analysis**

**Maria Letizia Tanturri, Daniele Vignoli**

Home ownership is the most important asset among old people in Europe, but so far very little is known about gender differences. This paper explores the link between gender, family type, monetary poverty and home tenure among the older European population. The analysis is carried out on a SHARE 2.0.1 sub-sample of about 28,000 individuals, aged 50 or over, living in Austria, Belgium, Denmark, France, Germany, Greece, Italy, the Netherlands, Spain, and Sweden. A multinomial regression model is used to delineate the profile of old people excluded from home ownership, distinguishing between tenants and rent-free occupants and controlling for a range of covariates. Results reveal a more complex picture than simple predictions would suggest. Other things being equal, women are more likely to be excluded from home ownership than men, and they more often belong to the rent-free category. Women are more protected when they live in couples. Overall, poor people appear more likely to be excluded from home ownership than the non-poor, especially if they are women living in enlarged families or men and women living alone.

Published in *Rivista Italiana di Economia, Demografia e Statistica*, Volume LXIII, Issue 3/4, pp. 211-218, 2009.

wp2010_09 (Demography, Econometrics)

**Two of a kind? Short-term shocks and the demographic transition in the European demographic history**

**Giambattista Salinari, Gustavo De Santis, Massimo Livi Bacci**

Two types of interaction between mortality and fertility have thus far been identified: short-term (e.g. after a mortality crisis) and long-term (in the demographic transition). This paper suggests that the underlying connection between the two phenomena is instead unique, and that the differences between the various cases (countries and centuries) lie, rather, in the evolution of the independent variable (death rates). Modern statistical tools (analysis of time series; identification of structural breaks) applied to historical aggregate data (Chesnais 1992) for twelve European countries help shed new light on the demographic mechanisms that guide the dynamics of human populations.

wp2010_10 (Economic Statistics, Demography)

**Comparing like with like: cluster-specific equivalence scales**

**Mauro Maltagliati, Gustavo De Santis**

On the basis of a few well-behaved indicators of economic well-being, we create clusters of households with different structure (number of members) but similar "economic profile", in terms of both standard of living and "style" (i.e. way of spending money, for any given standard of living). Since, by assumption, households are comparable within clusters, the ratios between their total monthly outlays produce cluster-specific equivalence scales. By properly averaging over clusters, we derive the general equivalence scales for Italy for the years 2003-2008. With the same logic, and a few adjustments, we also obtain measures of inflation and, separately, of purchasing power parities (PPP) for different regions within Italy.
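The ratio-and-average logic of this abstract can be sketched on toy data. This is only an illustration under stated assumptions: the clusters are taken as given, the reference household has one member, and the "proper" averaging over clusters (which the abstract does not specify) is replaced by a simple average weighted by the number of households in each cluster. The function name `equivalence_scales` and the record layout are hypothetical.

```python
from collections import defaultdict

def equivalence_scales(households, reference_size=1):
    """households: iterable of (cluster_id, n_members, monthly_outlay)."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for cluster, size, outlay in households:
        totals[cluster][size] += outlay
        counts[cluster][size] += 1

    # Cluster-specific scale: mean outlay of each household size divided by
    # the mean outlay of the reference size within the same cluster.
    cluster_scales = {}
    for cluster in totals:
        if reference_size not in totals[cluster]:
            continue  # no reference household in this cluster
        ref_mean = totals[cluster][reference_size] / counts[cluster][reference_size]
        cluster_scales[cluster] = {
            size: (totals[cluster][size] / counts[cluster][size]) / ref_mean
            for size in totals[cluster]
        }

    # General scale. Assumption: weight each cluster by its household count.
    weighted = defaultdict(float)
    weights = defaultdict(float)
    for cluster, scales in cluster_scales.items():
        w = sum(counts[cluster].values())
        for size, s in scales.items():
            weighted[size] += w * s
            weights[size] += w
    general = {size: weighted[size] / weights[size] for size in weighted}
    return cluster_scales, general

# Toy data: two clusters, households of one or two members.
data = [
    ("A", 1, 1000.0), ("A", 2, 1500.0),
    ("B", 1, 2000.0), ("B", 1, 2000.0), ("B", 2, 3400.0),
]
cluster_scales, general = equivalence_scales(data)
```

Here a two-member household needs 1.5 times the outlay of a single person in cluster A and 1.7 times in cluster B; the weighted general scale lies between the two.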

wp2010_11 (Statistics, Demography)

**Spatial data mining for clustering: from the literature review to an application using RedCap**

**Federico Benassi, Chiara Bocci, Alessandra Petrucci**

The aim of the paper is both to review the scientific literature on spatial data mining methods, in particular the spatial clustering methods developed in recent years, and to present an original application of the recently proposed RedCap method of spatial clustering and regionalization to the Florentine Metropolitan Area (FMA). Demographic indicators computed on official data provided by the Italian Institute of Statistics (Istat) are the input of a spatial clustering and regionalization model that classifies the FMA municipalities into a number of zones that are homogeneous (with respect to demographic structure) and spatially contiguous.

In view of a progressive decentralization of governance activities, we believe that the FMA represents a very interesting case study. This is because the identification of new spatial areas, built by considering both the demographic characteristics of the resident population and the spatial dimension of the territory where this population lives, could become a useful tool for local governance.

Published in *Classification and Data Mining*, A. Giusti, G. Ritter, M. Vichi (Eds.), Springer, pp. 157-164, 2013, ISBN 978-3-642-288.

wp2010_12 (Statistics, Medical Statistics)

**Shared component models in joint disease mapping: a comparison via a simulation experiment**

**Emanuela Dreassi**

Two models for jointly analysing the spatial variation of the incidence of three (or more) diseases, with shared and non-shared risk factors, are compared via a simulation experiment. In both models, the linear predictor can be decomposed into shared and disease-specific spatial variability components (named shared clustering and specific clustering respectively). The two models are the shared component model in its original formulation, which uses exchangeable Poisson distributions for the multivariate response, and a shared component model that uses a Multinomial distribution instead. The simulation study shows that the models behave similarly. However, the Multinomial shared component model performs better for the disease-specific spatial clustering terms but worse for the shared one.

wp2010_13 (Statistics, Statistics)

**Small area estimation in presence of nonresponse**

**Caterina Giusti, Emilia Rocco**

In standard survey estimation, the problem of nonresponse is well known and a variety of methods exist to adjust for this phenomenon. Less well understood are the effects of nonresponse on small area estimation. In this paper we propose a probability weighted estimation procedure that adjusts for the effect of an informative nonresponse mechanism on the small area mean predictor when a small area model at unit level is adopted. We follow the approach suggested by Pfeffermann et al. (1998) to compensate for the effect of unequal sample selection probabilities in multilevel models. However, since the survey sampler, unlike in sample selection, has no control over the response mechanism, our situation is further complicated by the fact that the response probabilities are unknown and need to be estimated. To analyse the performance of the suggested weighted estimation procedure, various Monte Carlo experiments have been implemented under different scenarios. Results show that it is effective if the response probabilities are “properly” estimated and, above all, that the nonresponse and small area estimation problems, if both present in a survey, need to be addressed simultaneously. An approximation of the estimator of the mean squared error (MSE) of the weighted small area mean predictor is also considered.

**Geoadditive Small Area Model for the Estimation of Consumption Expenditure in Albania**

**Chiara Bocci**

In the last few years the demand for spatially detailed statistical data has increased considerably, partly owing to the development of statistical methods for small areas. In the past, the high degree of spatial detail of such information was not so useful for practical purposes, as firms and local authorities were interested in information aggregated at some pre-specified level. However, the definition of the areas and the assignment of the data to appropriate areas can pose problems in the estimation process. In small area estimation this matter is particularly important because some parameters of the model can be related to the between-area relationships. Geoadditive models can address this problem by directly analyzing the spatial distribution of the study variable while accounting for possible covariate effects. This paper presents the application of a geoadditive model to small area estimation. The geoadditive SAE model is applied to estimate the district-level mean of the household log per-capita consumption expenditure for the Republic of Albania.

Last updated 20 November 2012.