**Abstract 2006**

**Reducing Conservatism of Exact Small-Sample Methods of Inference for Discrete Data**

**Alan Agresti, Anna Gottard**

Exact small-sample methods for discrete data use probability distributions that do not depend on unknown parameters. However, they are conservative inferentially: the actual error probabilities for tests and confidence intervals are bounded above by the nominal level. This article discusses ways of reducing the conservatism. Fuzzy inference is a recent innovation that enables one to achieve the error probability exactly. We present a simple way of conducting fuzzy inference for discrete one-parameter exponential family distributions. In practice, most scientists would find this approach unsuitable yet might be disappointed by the conservatism of ordinary exact methods. Thus, to use exact small-sample distributions, we recommend inferences based on the mid-P value. This approach can be motivated by fuzzy inference; it is less conservative than standard exact methods, yet it usually does well in terms of achieving desired error probabilities. We illustrate this and other small-sample methods for the case of inferences about the binomial parameter.
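As an illustration that is not taken from the paper, the following minimal sketch compares the ordinary exact one-sided P-value with the mid-P value for a binomial test, using the usual definition in which the mid-P value counts only half the probability of the observed outcome. The function names and the example data (8 successes in 10 trials, null value 0.5) are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_p_values(x, n, p0):
    """Return (exact, mid-P) one-sided P-values for H0: p = p0 vs H1: p > p0.

    exact = P(X >= x); mid-P = P(X > x) + 0.5 * P(X = x), both under p0.
    """
    prob_observed = binom_pmf(x, n, p0)
    prob_above = sum(binom_pmf(k, n, p0) for k in range(x + 1, n + 1))
    exact = prob_above + prob_observed
    mid_p = prob_above + 0.5 * prob_observed
    return exact, mid_p


# Hypothetical data: 8 successes in 10 trials, testing p0 = 0.5.
exact, mid_p = upper_tail_p_values(x=8, n=10, p0=0.5)
print(f"exact P-value: {exact:.4f}, mid-P value: {mid_p:.4f}")
```

For these hypothetical data the exact P-value is about 0.0547 and the mid-P value about 0.0327, which shows how the mid-P adjustment reduces the conservatism of the exact test.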

**Population, resources and sustainable development: what relationship?**

**Silvana Salvini**

This paper describes recent trends in population and in fertility, the basic cause of high demographic growth in developing countries. It then turns to the main theoretical models linking population, environment and sustainable development. Finally, the relationship with conflicts is analysed through some examples. Case studies are presented to assess the impact of demographic trends on resources; in particular, the link between water scarcity and food availability is analysed.

**Financial Econometric Analysis at Ultra–High Frequency: Data Handling Concerns**

**Christian T. Brownlees, Giampiero M. Gallo**

The financial econometrics literature on Ultra High-Frequency Data (UHFD) has been growing steadily in recent years. However, it is not always straightforward to construct time series of interest from the raw data and the consequences of data handling procedures on the subsequent statistical analysis are not fully understood. Some results could be sample or asset specific and in this paper we address some of these issues focussing on the data produced by the New York Stock Exchange, summarizing the structure of their TAQ ultra high-frequency dataset. We review and present a number of methods for the handling of UHFD, and explain the rationale and implications of using such algorithms. We then propose procedures to construct the time series of interest from the raw data. Finally, we examine the impact of data handling on statistical modeling within the context of financial durations ACD models.

Published as *Computational Statistics and Data Analysis*, Volume 51, Issue 4, pp. 2232-2245, 2006; link, published.

**Volatility Transmission Across Markets: A Multi-Chain Markov Switching Model**

**Giampiero M. Gallo, Edoardo Otranto**

The integration of financial markets across countries has modified the way prices react to news. Innovations originating in one market diffuse to other markets following patterns which usually stress the presence of interdependence. In some cases, though, covariances across markets have an asymmetric component which reflects the dominance of one over the others. The volatility transmission mechanisms in such events may be more complex than what can be modelled as a multivariate GARCH model. In this paper we adopt a new Markov Switching approach and we suppose that periods of high volatility and periods of low volatility represent the states of an ergodic Markov Chain where the transition probability is made dependent on the state of the "dominant" series. We provide some theoretical background and illustrate the model on Asian markets data showing support for the idea of dominant market and the good prediction performance of the model on a multi-period horizon.

Published as *Applied Financial Economics*, Volume 17, Issue 8, pp. 659-670, 2007; link, published.

**Polytomous disease mapping to detect uncommon risk factors for related diseases**

**Emanuela Dreassi**

This paper introduces a statistical model for jointly analysing the spatial variation of incidences of three (or more) diseases with common and uncommon risk factors. We have considered the mortality data (from 1990 to 1994) relative to oral cavity, larynx and lung cancers in 13 age groups of males, in the 287 municipalities of Region of Tuscany (Italy). All these pathologies share smoking as a common risk factor; furthermore, two of them (oral cavity and larynx cancer) share alcohol consumption as a risk factor. All studies suggest that smoking and alcohol consumption are the major known risk factors for oral cavity and larynx cancers; nevertheless, in this paper we investigate the possibility of there being other different risk factors for these diseases or even a different interaction between smoking and alcohol risk factors. A logit model for multinomial responses (multinomial logit or polytomous logit model) was used to model deaths for the different diseases. For each municipality and age-class we estimated the probabilities of death for each cause (the response probabilities). Lung cancer acts as the baseline category. The log odds are decomposed additively into shared (common to oral cavity and larynx diseases) and specific structured spatial variability terms, unstructured unshared spatial terms and an age-group term to adjust the crude observed data for effects that can be attributed to age. We estimated disease specific spatially structured effects; these are considered as latent variables denoting disease-specific risk factors. Results show that oral cavity and larynx cancer have different spatial patterns for residual risk factors which are not the typical ones already considered such as smoking habits and alcohol consumption. But, probably, these patterns are due to different spatial interactions between smoking habits and alcohol consumption for the first and the second disease.

Published as *Biometrical Journal*, Volume 49, Issue 4, pp. 520-529, 2007, published.

wp2006_06 (Statistics for experimental and technological research)

**On the impact of contaminations in graphical Gaussian models**

**Anna Gottard, Simona Pacillo**

wp2006_07 (Econometrics, Economic Statistics)

**Indirect estimation of alpha-stable stochastic volatility models**

**Marco J. Lombardi, Giorgio Calzolari**

The alpha-stable family of distributions constitutes a generalization of the Gaussian distribution, allowing for asymmetry and thicker tails. Its many useful properties, including a central limit theorem, are especially appreciated in the financial field. However, estimation difficulties have up to now hindered its diffusion among practitioners. In this paper we propose an indirect estimation approach to stochastic volatility models with alpha-stable innovations that exploits, as auxiliary model, a GARCH(1,1) with t-distributed innovations. We consider both cases of heavy-tailed noise in the returns or in the volatility. The approach is illustrated by means of a detailed simulation study and an application to currency crises.

**Robust ANalysis Of VAriance: an approach based on the Forward Search**

**Bruno Bertaccini, Roberta Varriale**

We present a simple robust method for the detection of atypical observations and the analysis of their effect in the ANOVA framework. We propose to use a forward search procedure that orders the observations by their closeness to the hypothesized model. The procedure can be applied following two different strategies: one that adds units maintaining the relative group dimension and the other that adds only one new unit at each step of the search. The assessment of the goodness of the method is carried out through a simulation study. The method is then applied to a dataset collected by the Italian National University Evaluation Committee for the evaluation of the effectiveness of the degree program reform applied during the academic year 2001/02. Results are always presented through easy to interpret plots which are powerful in revealing the structure of the data.

**Marginal Distributions of Maximum-likelihood estimator in non-standard conditions**

**Marco Barnabani**

When the true parameter lies on the boundary of the parameter space, the asymptotic distribution of the maximum likelihood estimator is difficult to calculate. In some relatively simple cases it is a mixture of truncated normal distributions. In this paper we are concerned with the marginal distributions of the estimator when one or two components of the true parameter are zero and lie on the boundary of the parameter space. We find that these distributions are (mixtures of) normal or truncated normal distributions multiplied by "skew functions" which distort the symmetry of the normal distribution.

**Maximum likelihood estimator and singularity of the information matrix**

**Marco Barnabani**

When the model is identified but the information matrix is singular, the classic asymptotic properties of the maximum likelihood estimator are not clear and an inferential procedure based on it is not viable. In this paper a solution of a suitably penalized log-likelihood equation is shown to be consistent and asymptotically normally distributed, with variance-covariance matrix approximated by the Moore-Penrose pseudoinverse of the information matrix. These properties allow one to obtain a quadratic function based on a standard Chi-square distribution for hypothesis testing. A simulation applied to a simplified Engle's model is presented to support the theoretical results.

wp2006_11 (Economic Statistics)

**Organizing Administrative Data for Statistical Purposes: a Case Study**

**Lucia Buzzigoli, Cristina Martelli**

In Italy the public administration is currently undergoing a very deep process of improvement and innovation: in this new context local bodies are required by law to build statistical information systems oriented to monitoring, auditing, management control and government decision support. The paper presents the project that the Department of Statistics “G. Parenti” of the University of Florence is carrying out together with the Municipality of Florence, aimed at organizing an efficient information system inside the administrative structure.

**Time-Varying Mixing Weights in Mixture Autoregressive Conditional Duration Models**

**Giovanni De Luca, Giampiero M. Gallo**

Financial market price formation and exchange activity can be investigated by means of ultra-high frequency data. In this paper we investigate an extension of the Autoregressive Conditional Duration (ACD) model of Engle and Russell (1998) by adopting a mixture-of-distributions approach with time-varying weights. Empirical estimation of the Mixture ACD model shows that the limitations of the standard base model, and its inadequacy in modelling the behavior in the tail of the distribution, are suitably solved by our model. When the weights are made dependent on some market activity data, the model lends itself to some structural interpretation related to price formation and information diffusion in the market.

Published as *Econometric Reviews*, Volume 28, Issue 1, pp. 102-120, 2009; link, published.

wp2006_13 (Statistics, Econometrics, Demography)

**Assessing the Causal Effect of Childbearing on Economic Wellbeing in Albania**

**Francesca Francavilla, Alessandra Mattei**

In this paper we analyze to what extent births may lead to changes in economic wellbeing. In contrast to most previous studies on this issue, we apply appropriate econometric techniques based on longitudinal micro data in order to identify the causal effects of childbearing events on income. We perform our analysis on longitudinal data from the Albanian Living Standard Measurement Survey. We take a quasi-experimental approach, that is, we consider the experience of a childbearing event as the treatment variable, and our measure of wellbeing as the outcome variable. In order to deal with the confounding due to the presence of systematic differences in background characteristics between the treatment groups, we first fit a multiple linear regression model that includes relevant background characteristics as well as an indicator variable for the treatment (i.e., childbearing). This estimation is then compared and contrasted with a matching approach, based on the bias-corrected matching estimator introduced by Abadie and Imbens (2002). Our analysis suggests that there is some evidence that childbearing events can in fact increase household wellbeing in Albania. In addition, the treatment effect is highly heterogeneous with respect to observable characteristics such as the woman's working status and the woman's parity. All the results appear to be robust with respect to the estimated equivalence scale: changing the equivalence scale leaves the childbearing effect on income positive and non-significant.

Published as *Causal Analysis in Population Studies. Concepts, Methods, Applications*, H. Engelhardt, H.-P. Kohler, A. Fürnkranz-Prskawetz (Editors), Springer, pp. 201-231, 2009, ISBN 978-1-4020-99.

wp2006_14 (Medical Statistics)

**Inequality in Health among Social Classes: a Longitudinal Study on Tuscany**

**Ngindu Kalala, Marco Marchi**

This paper has two objectives. The first is to review the main socio-economic classifications, with a view to applying these socio-health categories and comparing them through an index of inequality. The second is, by examining the Longitudinal Study on Tuscany, to evaluate and to find the best solution according to the analysis. In other words, our research tries to analyse the relationships between health and the socio-economic conditions of households and individuals, and consequently the relative risks based on different classifications of different types of jobs. The study of inequality in health among social classes is motivated by: a) the need to monitor social phenomena and diseases for the new social classes; b) the need for an effective health system, which also makes it possible to control public expenditure on health; and c) the need to build citizens' confidence in the publicly funded health system at regional and national level by demonstrating lower public expenditure accompanied by a distribution of the available resources and more equity in health services. To allow an international comparison, we examined Florence and Livorno during two periods (1991-1997 for Florence, 1981-1987 and 1991-1997 for Livorno) and some cities and countries of the European Union (1970-1980), among which we selected Copenhagen, Helsinki, Rotterdam, cities in France, cities in Sweden, and the United Kingdom (Wales). We calculate the index of inequality using a Poisson regression model based on the value of socio-economic status (SES). The results show that the European Union cities, with the exception of those selected for France, have a better index of inequality than Florence and Livorno in the last period. The index of inequality in health is related to complex factors, which means that further studies are needed in this direction.

wp2006_15 (Econometrics, Statistics)

**Vector Multiplicative Error Models: Representation and Inference**

**Fabrizio Cipollini, Robert Engle, Giampiero M. Gallo**

The Multiplicative Error Model introduced by Engle (2002) for positive valued processes is specified as the product of a (conditionally autoregressive) scale factor and an innovation process with positive support. In this paper we propose a multivariate extension of such a model, by taking into consideration the possibility that the vector innovation process be contemporaneously correlated. The estimation procedure is hindered by the lack of probability density functions for multivariate positive valued random variables. We suggest the use of copula functions and of estimating equations to jointly estimate the parameters of the scale factors and of the correlations of the innovation processes. Empirical applications on volatility indicators are used to illustrate the gains over the equation by equation procedure.
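To make the structure described above concrete, here is a minimal univariate sketch, not the multivariate model proposed in the paper. It simulates a positive-valued series as the product of a conditionally autoregressive scale factor and a unit-mean innovation. The first-order recursion, the exponential innovations, the function name and all parameter values are assumptions made for illustration.

```python
import random


def simulate_mem(n, omega, alpha, beta, seed=0):
    """Simulate x_t = mu_t * eps_t with mu_t = omega + alpha*x_{t-1} + beta*mu_{t-1}.

    eps_t is drawn from a unit-mean exponential distribution (an assumption),
    so every x_t is non-negative. Requires alpha + beta < 1.
    """
    rng = random.Random(seed)
    mu = omega / (1 - alpha - beta)  # start at the unconditional mean
    x_prev = mu
    xs, mus = [], []
    for _ in range(n):
        mu = omega + alpha * x_prev + beta * mu  # scale factor
        x_prev = mu * rng.expovariate(1.0)       # scale times positive innovation
        mus.append(mu)
        xs.append(x_prev)
    return xs, mus


# Hypothetical parameter values chosen only for the example.
series, scales = simulate_mem(500, omega=0.1, alpha=0.1, beta=0.8, seed=1)
print(min(series), sum(series) / len(series))
```

In a vector version, each component would have its own scale factor and the innovations would be allowed to be contemporaneously correlated, which is where the copula or estimating-equation approach mentioned in the abstract comes in.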

wp2006_16 (Demography, Statistics)

**Invecchiamento e mobilità nell’area metropolitana fiorentina**

**Alessandra Petrucci, Nicola Salvati, Silvana Salvini, Daniele Vignoli**

The “demographic profile” of Florence, like that of many other urban areas, is characterized by marked ageing, by the spread of one-person households, by a growth rate close to zero and, lastly, by increasing immigration and mobility. The municipal level, although useful for developing targeted local interventions, no longer allows adequate interpretation of social and economic trends and of the most significant processes of territorial transformation and functional specialization. Recent forms of social and economic organization over the territory highlight an increasing degree of spatial interconnection and interdependence. In this paper we therefore focus on ageing within the Florence metropolitan area (comprising the provinces of Firenze, Pistoia and Prato) in the period 1981-2001 by means of a cartographic analysis, which allows the description and interpretation of demographic phenomena from a geographic perspective. We then analyze mobility by means of a spatial interaction model aimed at interpreting the mobility process in relation to the demographic structure. Finally, we conclude with some considerations on the sustainability of ageing.

Published as *Rivista di Economia e Statistica del Territorio*, Volume 2008, Issue 2, pp. 81-103, 2008, published.

wp2006_17 (Econometrics, Economic Statistics)

**Exchange Market Pressure: Some Caveats in Empirical Applications**

**Giampiero M. Gallo, Simone Bertoli, Giorgio Ricchiuti**

The Exchange Market Pressure (EMP) Index, developed by Eichengreen et al. [1994], is widely used to study currency crises as a tool to signal whether pressures on a currency are softened or warded off through monetary authorities’ interventions or whether a currency crisis has originated. In this paper we show how the index is sensitive to some assumptions behind the aggregation of the information available (exchange rates, interest rates and reserves), especially when emerging countries are involved. Specifically, we address the way exchange rate variations are computed and the impact of different definitions of the reserves, and we question the constancy of the weights adopted. These issues compound with the choice of a fixed threshold when crisis episodes are identified through EMP. As a result, the dichotomous crisis variable thus derived when adopted as a dependent variable may lead to varied results in subsequent econometric analysis.

Published as *Applied Economics*, 2010; link, forthcoming.
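The sensitivity discussed in this abstract can be seen in a small sketch of one common way such an index is built. This is an assumed textbook-style variant, not the authors' exact specification: each component is weighted by the inverse of its standard deviation, reserve changes enter with a negative sign, and a crisis is flagged when the index exceeds its mean by `k` standard deviations. The function names, the value `k = 1.5` and the toy data are all hypothetical.

```python
from statistics import mean, pstdev


def emp_index(d_exch, d_rate, d_res):
    """Weighted sum of exchange-rate and interest-rate changes minus reserve changes.

    Weights are the inverse standard deviations of each series (an assumption;
    other weighting schemes give different indices).
    """
    w_e, w_i, w_r = (1 / pstdev(s) for s in (d_exch, d_rate, d_res))
    return [w_e * e + w_i * i - w_r * r for e, i, r in zip(d_exch, d_rate, d_res)]


def crisis_flags(emp, k=1.5):
    """Flag periods where the index exceeds mean + k standard deviations."""
    threshold = mean(emp) + k * pstdev(emp)
    return [value > threshold for value in emp]


# Made-up monthly changes (percent); the last period mimics a speculative attack.
d_exch = [0.5, -0.2, 0.1, 0.3, -0.4, 6.0]
d_rate = [0.1, 0.0, -0.1, 0.2, 0.0, 3.0]
d_res = [1.0, -0.5, 0.5, 0.0, 0.8, -9.0]

emp = emp_index(d_exch, d_rate, d_res)
flags = crisis_flags(emp)
print(flags)
```

Because both the weights and the threshold are computed from the full sample, changing the sample, the definition of reserves, or the value of `k` can change which periods are classified as crises.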

wp2006_18 (Economic Statistics)

**Misure di competitività: aspetti empirici**

**Leonardo Ghezzi, Alessandro Viviani**

An analysis of the competitiveness of Tuscany, with a comparison at the European level.

Last updated 30 June 2011.