Overview

The scientific sessions consist of invited lectures (one hour each), contributed talks (25 minutes each) and posters. All sessions will be hosted on Zoom. Invited lectures and contributed sessions will take place in the main room; there will be no parallel sessions. For the poster sessions we will use Zoom breakout rooms, and all participants will be able to move freely from one breakout room to another. Poster presenters can decide whether to present a ‘traditional’ poster or to use their breakout room differently, e.g. by presenting slides. Further information on how to use the breakout rooms in Zoom is available here.

The programme can be downloaded here as a PDF.

Programme

9:30 – 10:30  Invited Lecture

Siem Jan Koopman (VU Amsterdam)

Analysis of High-Dimensional Unobserved Components Time Series Models using Pre-Filtering Methods

10:30 – 10:45 Break

10:45 – 12:00  Sparsity

Robert Adamek (Maastricht University), with Stephan Smeekes and Ines Wilms

Lasso Inference for High-Dimensional Time Series

Abstract

In this paper we develop valid inference for high-dimensional time series. We extend the desparsified lasso to a time series setting under Near-Epoch Dependence (NED) assumptions allowing for non-Gaussian, serially correlated and heteroskedastic processes, where the number of regressors can possibly grow faster than the time dimension. We first derive an error bound for the (regular) lasso, relaxing the commonly made exact sparsity assumption to a weaker alternative, which permits many small but non-zero parameters. The weak sparsity coupled with the NED assumption means this inequality can also be applied to the (inherently misspecified) nodewise regressions performed in the desparsified lasso. This allows us to establish the uniform asymptotic normality of the desparsified lasso under general conditions. Additionally, we show consistency of a long-run variance estimator, thus providing a complete set of tools for performing inference in high-dimensional linear time series models. Finally, we perform a simulation exercise to demonstrate the small sample properties of the desparsified lasso in common time series settings.
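For readers unfamiliar with the construction, the following minimal sketch (not the authors' code) illustrates the generic desparsified lasso: an initial lasso fit, nodewise lasso regressions to build a relaxed inverse of the Gram matrix, and a one-step bias correction. It uses iid Gaussian toy data and a fixed penalty, and omits the paper's NED setting, weak-sparsity analysis and long-run variance estimation.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T, p = 200, 50                      # time dimension and number of regressors
X = rng.standard_normal((T, p))
beta = np.zeros(p); beta[:3] = [1.0, -0.5, 0.25]
y = X @ beta + rng.standard_normal(T)

lam = 0.1
beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Nodewise lasso regressions: regress each X_j on the remaining columns
# to build an approximate (relaxed) inverse of X'X/T.
Theta = np.zeros((p, p))
for j in range(p):
    others = np.delete(np.arange(p), j)
    gamma_j = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, j]).coef_
    resid_j = X[:, j] - X[:, others] @ gamma_j
    tau2_j = resid_j @ X[:, j] / T          # normalisation as in van de Geer et al. (2014)
    row = np.zeros(p); row[j] = 1.0; row[others] = -gamma_j
    Theta[j] = row / tau2_j

# Desparsified (debiased) lasso: one-step bias correction of the initial lasso fit.
b_desparsified = beta_hat + Theta @ X.T @ (y - X @ beta_hat) / T
print(b_desparsified[:5])
```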

Jonas Krampe (University of Mannheim), with Luca Margaritella

Dynamic Factor Models with Sparse VAR Idiosyncratic Components

Abstract

We reconcile high-dimensional sparse and dense techniques within the framework of a Dynamic Factor Model, assuming the idiosyncratic term follows a sparse vector autoregressive model (VAR). The different diverging behavior of the eigenvalues of the covariance matrix allows us to disentangle the two sources of dependence. The estimation is articulated in two steps: first, the factors and their loadings are estimated via principal component analysis; second, the sparse VAR is estimated by regularized regression on the estimated idiosyncratic components. We prove consistency of the proposed estimation approach as the time and cross-sectional dimensions diverge. We complement our procedure with a joint information criterion for the VAR lag length and the number of factors. The finite sample performance of our procedure is illustrated by means of a thorough simulation exercise.
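A minimal sketch of the two-step estimation described above, on simulated data: principal components deliver estimated factors, loadings and idiosyncratic components, after which a lasso-penalized VAR(1) is fitted equation by equation to the idiosyncratic part. The lag length and penalty level are placeholders rather than the paper's joint information criterion.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
T, N, r = 300, 40, 2                          # sample size, cross-section, number of factors

F = rng.standard_normal((T, r))               # toy factors
L = rng.standard_normal((N, r))               # loadings
E = rng.standard_normal((T, N))               # idiosyncratic term (iid here for simplicity)
X = F @ L.T + E
X = X - X.mean(0)

# Step 1: principal components -- factors from the r leading eigenvectors of X'X/T.
eigval, eigvec = np.linalg.eigh(X.T @ X / T)
L_hat = eigvec[:, ::-1][:, :r] * np.sqrt(N)   # normalisation L'L/N = I
F_hat = X @ L_hat / N
E_hat = X - F_hat @ L_hat.T                   # estimated idiosyncratic component

# Step 2: sparse VAR(1) on the idiosyncratic component, estimated equation by equation.
Y, Z = E_hat[1:], E_hat[:-1]
A_hat = np.vstack([Lasso(alpha=0.05, fit_intercept=False).fit(Z, Y[:, i]).coef_
                   for i in range(N)])
print("share of nonzero VAR coefficients:", (A_hat != 0).mean())
```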

Etienne Wijler (Maastricht University), with Hanno Reuvers

Sparse Generalized Yule–Walker Estimation for Large Spatio-temporal Autoregressions with an Application to Satellite Data

Abstract

In this paper, we consider a high-dimensional model in which variables are observed over time and on a spatial grid. The model takes the form of a spatio-temporal regression containing time lags and a spatial lag of the dependent variable. Unlike classical spatial autoregressive models, we do not rely on a predetermined spatial interaction matrix but infer all spatial interactions from the data. That is, assuming sparsity, we estimate the spatial and temporal dependence in a fully data-driven way by penalizing a set of Yule-Walker equations. This regularization can be left unstructured, but we also propose more customized shrinkage procedures that follow intuitively when observations originate from spatial grids (e.g. satellite images). Finite sample error bounds are derived and estimation consistency is established in an asymptotic framework wherein the sample size and the number of spatial units diverge jointly. A simulation exercise shows strong finite sample performance compared to competing procedures. As an empirical application, we model satellite-measured NO2 concentrations in London. Our approach delivers forecast improvements over a competitive benchmark and we discover evidence for strong spatial interactions.
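The following toy sketch illustrates the idea of penalizing Yule-Walker equations: for a VAR(1)-type model, Γ(1) = A Γ(0), so each row of A can be estimated by an l1-penalized regression of the corresponding row of the sample Γ(1) on the sample Γ(0). The spatial-lag structure and the grid-specific (structured) penalties of the paper are not reproduced, and the penalty level is arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
T, N = 400, 30
A_true = np.diag(np.full(N, 0.5)) + np.diag(np.full(N - 1, 0.2), k=1)  # sparse dependence

Y = np.zeros((T, N))
for t in range(1, T):
    Y[t] = Y[t - 1] @ A_true.T + rng.standard_normal(N)
Y = Y - Y.mean(0)

# Sample autocovariances Gamma(0) and Gamma(1).
Gamma0 = Y[:-1].T @ Y[:-1] / (T - 1)
Gamma1 = Y[1:].T @ Y[:-1] / (T - 1)

# Yule-Walker: Gamma(1) = A Gamma(0).  Estimate each row of A by a lasso regression
# of the corresponding row of Gamma(1) on Gamma(0) (unstructured penalty).
A_hat = np.vstack([Lasso(alpha=0.02, fit_intercept=False).fit(Gamma0, Gamma1[i]).coef_
                   for i in range(N)])
print("max abs estimation error:", np.abs(A_hat - A_true).max())
```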

12:00 – 12:45 Lunch

12:45 – 14:00  Forecasting & Finance

Rogier Quaedvlieg (Erasmus University Rotterdam), with Jia Li and Zhipeng Liao

Conditional Superior Predictive Ability

Abstract

This paper proposes a test for the conditional superior predictive ability (CSPA) of a family of forecasting methods with respect to a benchmark. The test is functional in nature: under the null hypothesis, the benchmark’s conditional expected loss is no more than those of the competitors, uniformly across all conditioning states. By inverting the CSPA tests for a set of benchmarks, we obtain confidence sets for the uniformly most superior method. The econometric inference pertains to testing conditional moment inequalities for time series data with general serial dependence, and we justify its asymptotic validity using a uniform nonparametric inference method based on a new strong approximation theory for mixingales. The usefulness of the method is demonstrated in empirical applications on volatility and inflation forecasting.

Guanglin Huang (Vrije Universiteit Brussel), with Wanbo Lu and Kris Boudt

Weak factor analysis with higher-order multi-cumulants

Abstract

We estimate the latent factors driving the non-normal dependence in high dimensional panel data using Higher-order multi-cumulant Factor Analysis (HFA). This new approach consists of an eigenvalue ratio test to select the number of non-Gaussian factors, and uses alternating regressions to estimate both the Gaussian and non-Gaussian factors. Simulation results confirm that the HFA estimators improve the accuracy of factor selection and factor estimation as compared to approaches using principal or independent component analysis. The empirical usefulness of the HFA approach is shown in an application to forecast the S&P 500 equity premium.
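The paper's selection test is based on higher-order multi-cumulants; as a simpler, generic illustration, the sketch below applies a standard eigenvalue-ratio criterion (in the spirit of Ahn and Horenstein, 2013) to the sample covariance of a simulated panel in order to pick the number of strong factors.

```python
import numpy as np

rng = np.random.default_rng(13)
T, N, r_true = 300, 50, 3

F = rng.standard_normal((T, r_true))             # latent factors
L = 2.0 * rng.standard_normal((N, r_true))       # strong loadings
X = F @ L.T + rng.standard_normal((T, N))

# Eigenvalue-ratio criterion: choose k maximizing the ratio of successive eigenvalues.
eigvals = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]
ratios = eigvals[:-1] / eigvals[1:]
k_max = 10
r_hat = int(np.argmax(ratios[:k_max])) + 1
print("selected number of factors:", r_hat)      # should recover r_true = 3 here
```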

Fabian Krüger (Karlsruhe Institute of Technology), with Laura Reh and Roman Liesenfeld

Predicting the global minimum variance portfolio

Abstract

We propose a novel dynamic approach to forecast the weights of the global minimum variance portfolio (GMVP). The GMVP weights are the population coefficients of a linear regression of a benchmark return on a vector of return differences. This representation enables us to derive a consistent loss function from which we can infer the optimal GMVP weights without imposing any distributional assumptions on the returns. In order to capture time variation in the returns’ conditional covariance structure, we model the portfolio weights through a recursive least squares (RLS) scheme as well as by generalized autoregressive score (GAS) type dynamics. Sparse parameterizations combined with targeting towards nonlinear shrinkage estimates of the long-run GMVP weights ensure scalability with respect to the number of assets. An empirical analysis of daily and monthly financial returns shows that the proposed models perform well in- and out-of-sample in comparison to existing approaches.
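A small numerical illustration (not the authors' code) of the regression representation mentioned in the abstract: the population GMVP weights Σ⁻¹ι/(ι′Σ⁻¹ι) coincide with the coefficients of a regression of a benchmark return on the vector of return differences. The dynamic RLS/GAS modelling and shrinkage steps of the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 5, 5000

# Simulate returns with an arbitrary covariance matrix.
C = rng.standard_normal((N, N))
Sigma = C @ C.T + np.eye(N)
R = rng.multivariate_normal(np.zeros(N), Sigma, size=T)

# Closed-form GMVP weights: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1).
ones = np.ones(N)
w_closed = np.linalg.solve(Sigma, ones)
w_closed /= ones @ w_closed

# Regression representation: regress the benchmark return r_N on the return
# differences (r_N - r_j), j = 1..N-1; the fitted coefficients are the GMVP
# weights of assets 1..N-1 and the benchmark weight is 1 minus their sum.
y = R[:, -1]
X = R[:, [-1]] - R[:, :-1]            # columns r_N - r_j
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(T), X]), y, rcond=None)
w_reg = np.append(coef[1:], 1 - coef[1:].sum())

print(np.round(w_closed, 3))
print(np.round(w_reg, 3))             # close to the closed-form weights in large samples
```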

14:00 – 14:15 Break

14:15 – 15:15  Invited Lecture

Matteo Barigozzi (University of Bologna), with Matteo Luciani

Quasi Maximum Likelihood Estimation and Inference of Large Approximate Dynamic Factor Models via the EM algorithm

Abstract

This paper studies Quasi Maximum Likelihood estimation of approximate Dynamic Factor Models for large panels of time series. Specifically, we consider the case in which the autocorrelation of the factors is explicitly accounted for and therefore the model has a state-space form. Estimation of the factors and of their loadings is implemented by means of the Expectation Maximization (EM) algorithm, jointly with the Kalman smoother. We prove that, as both the dimension of the panel n and the sample size T diverge to infinity: (i) the estimated loadings are √T-consistent and asymptotically normal if √T/n→0; (ii) the estimated factors are √n-consistent and asymptotically normal if √n/T→0; (iii) the estimated common component is min(√T,√n)-consistent and asymptotically normal regardless of the relative rate of divergence of n and T. Although the model is estimated as if the  idiosyncratic terms were cross-sectionally and serially uncorrelated, and normally distributed, we show that these mis-specifications do not affect consistency. Moreover, the estimated loadings are asymptotically as efficient as those obtained with the Principal Components estimator, while the estimated factors are more efficient if the idiosyncratic covariance is sparse enough. We then propose robust estimators of the asymptotic covariances, which can be used to conduct inference on the loadings and to compute confidence intervals for the factors and common components. Finally, we study the performance of our estimators and we compare them with the traditional Principal Components approach by means of Monte Carlo simulations and an analysis of US macroeconomic data.
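As an illustration of the estimation strategy described above (EM algorithm combined with the Kalman smoother for a factor model in state-space form), the sketch below uses statsmodels' DynamicFactorMQ on simulated data. This is a generic off-the-shelf implementation, not the authors' code, and the argument names assume a recent statsmodels release.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.dynamic_factor_mq import DynamicFactorMQ

rng = np.random.default_rng(4)
T, N = 200, 20

# Toy panel driven by one autocorrelated factor plus idiosyncratic noise.
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.7 * f[t - 1] + rng.standard_normal()
loadings = rng.standard_normal(N)
X = pd.DataFrame(np.outer(f, loadings) + rng.standard_normal((T, N)))

# EM estimation of loadings, factor VAR and idiosyncratic AR(1), with Kalman smoothing.
model = DynamicFactorMQ(X, factors=1, factor_orders=1,
                        idiosyncratic_ar1=True, standardize=True)
result = model.fit(disp=False)

smoothed_factor = result.factors.smoothed        # smoothed factor estimates
print(result.summary())
```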

15:15 – 15:30  Poster Pitches 1

The poster presenters of Poster Session 1 give a short pitch to introduce their posters.

15:30 – 15:45 Break

15:45 – 17:15  Poster Session 1

Filippo Arigoni (Bank of Slovenia)

World shocks and commodity price fluctuations: evidence from resource-rich economies

Abstract

We identify world shocks driving up real commodity prices in a Bayesian dynamic factor model setting, using a minimum set of sign restrictions complemented with constrained short-run responses. We find that a world demand shock and a world supply shock explain the lion’s share of commodity price and commodity currency fluctuations, besides shaping the real business cycle in resource-rich economies. However, reactions to global disturbances differ across countries depending on their level of economic development and the intensity of their trade activities. We also show that the shortage of energy products in exports explains the negligible effects of world commodity shocks on the domestic economy. Finally, our findings suggest that the non-tradable sector benefits from resource price boosts, in line with the Dutch disease theory associated with this type of economy.

Carlos Castro-Iragorri (Universidad del Rosario), with Julian Ramirez

Forecasting Dynamic Term Structure Models with Autoencoders

Abstract

Principal components analysis (PCA) is a statistical approach to build factor models in finance. PCA is also a particular case of a type of neural network known as an autoencoder. Recently, autoencoders have been successfully applied to financial factor models (Gu et al., 2020; Heaton and Polson, 2017). We study the relationship between autoencoders and dynamic term structure models, and we propose different approaches for forecasting. We compare the forecasting accuracy of dynamic factor models based on autoencoders, of the classical term structure models proposed in Diebold and Li (2006), and of neural network-based approaches for time series forecasting. Empirically, we test the forecasting performance of autoencoders using U.S. yield curve data from the last 35 years. Preliminary results indicate that a hybrid approach using autoencoders and vector autoregressions, framed as a dynamic term structure model, provides an accurate forecast that is consistent throughout the sample. This hybrid approach overcomes in-sample overfitting and structural changes in the data.
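A compact sketch of the hybrid idea on simulated data: an autoencoder with a low-dimensional bottleneck compresses the yield curve, a VAR is fitted to the encoded factors, and the VAR forecast is decoded back into a curve forecast. It uses Keras and statsmodels with arbitrary architecture and tuning choices, not the authors' specification.

```python
import numpy as np
import tensorflow as tf
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(5)
T, n_maturities, k = 400, 10, 3

# Toy 'yield curve' panel driven by three latent factors (level/slope/curvature style).
F = np.cumsum(rng.standard_normal((T, k)) * 0.1, axis=0)
load = rng.standard_normal((k, n_maturities))
Y = F @ load + 0.05 * rng.standard_normal((T, n_maturities))

# Autoencoder with a k-dimensional bottleneck; the linear encoder plays the role of PCA.
inputs = tf.keras.Input(shape=(n_maturities,))
code = tf.keras.layers.Dense(k, activation="linear", name="code")(inputs)
outputs = tf.keras.layers.Dense(n_maturities, activation="linear")(code)
autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(Y, Y, epochs=200, batch_size=32, verbose=0)

encoder = tf.keras.Model(inputs, code)
factors = encoder.predict(Y, verbose=0)

# Forecast the encoded factors with a VAR, then decode back to a yield curve forecast.
var_res = VAR(factors).fit(maxlags=1)
f_next = var_res.forecast(factors[-var_res.k_ar:], steps=1)
curve_forecast = autoencoder.layers[-1](tf.constant(f_next, dtype=tf.float32)).numpy()
print(curve_forecast.shape)
```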

Samriddha Lahiry (Cornell University), with Kara Karpman, Sumanta Basu and Diganta Mukherjee

Exploring Financial Networks Using Quantile Regression and Granger Causality

Abstract

In the post-crisis era, financial regulators and policymakers are increasingly interested in data-driven tools to measure systemic risk and to identify systemically important firms. We propose a statistical method that measures connectivity in the financial sector using time series of firm stock returns. Our method is based on system-wide lower tail analysis, whereby we estimate linkages between firms that occur when those firms are distressed and that exist conditional on the financial information of all other firms in the sample. This is achieved using a Granger causality analysis based on Lasso-penalized quantile regression. By considering centrality measures of these financial networks, we can assess the build-up of systemic risk and identify risk propagation channels. We apply our method to the monthly stock returns of large U.S. firms and demonstrate that we are able to detect many of the most recent systemic events, in addition to identifying key players in the 2007-2009 U.S. financial crisis. We perform a similar analysis on Indian banks with promising results. We also provide some non-asymptotic theory for Lasso-penalized quantile regression and derive sufficient conditions for its consistency in a high-dimensional regime.
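A simplified sketch of the tail-based network construction using scikit-learn's L1-penalized QuantileRegressor: each firm's lower-tail return quantile is regressed on one lag of all firms' returns, and nonzero coefficients define directed edges. The quantile level, penalty and lag structure are placeholders, and the centrality analysis of the paper is omitted.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(6)
T, n_firms, tau = 300, 15, 0.05          # lower-tail quantile level

R = rng.standard_normal((T, n_firms))
R[:, 0] += 0.5 * np.roll(R[:, 1], 1)     # firm 1's past returns affect firm 0

Y, X = R[1:], R[:-1]                     # one lag of all firms' returns

# Lasso-penalized quantile regression, one equation per firm.
adjacency = np.zeros((n_firms, n_firms), dtype=bool)
for i in range(n_firms):
    qr = QuantileRegressor(quantile=tau, alpha=0.05, fit_intercept=True,
                           solver="highs").fit(X, Y[:, i])
    adjacency[:, i] = qr.coef_ != 0      # edge j -> i if lagged return of firm j is selected

print("detected edges (j, i):", np.argwhere(adjacency))
```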

Luca Margaritella (Maastricht University), with Marina Friedrich and Stephan Smeekes

High-Dimensional Causality for Climatic Attribution

Abstract

We test for causality in high-dimensional vector autoregressive models (VARs) to disentangle and interpret the complex causal chains linking radiative forcings and global temperatures. We consider both direct predictive causality in the sense of Granger and direct-indirect causality in the sense of Sims, developing a framework of impulse response analysis in high dimensions via local projections. By allowing for high dimensionality in the model, we can enrich the information set with all relevant natural and anthropogenic forcing variables to obtain reliable causal relations. These variables have mostly been investigated in aggregated form or in separate models in the previous literature. Additionally, our framework allows us to disregard the order of integration of the variables and to estimate the VAR directly in levels, thus avoiding biases accumulating from unit-root and cointegration tests. This is of particular appeal for climate time series, which are well known to contain stochastic trends as well as to exhibit long memory. We are thus able to display the causal networks linking radiative forcings to global and hemispheric temperatures, but also to causally connect radiative forcings among themselves, allowing for a careful reconstruction of a timeline of causal effects among forcings. The robustness of our proposed procedure makes it an important tool for policy evaluation in tackling global climate change.
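As a reminder of the building block, the sketch below runs plain local projections on simulated data: the impulse response at horizon h is the coefficient on the shock in a regression of y_{t+h} on the shock and a lagged control. The paper's high-dimensional, penalized version and the climate variables are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(7)
T, H = 500, 12

shock = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + 0.8 * shock[t] + 0.3 * rng.standard_normal()

irf = np.zeros(H + 1)
for h in range(H + 1):
    # Regress y_{t+h} on the shock at t, a constant, and one lag of y as a control.
    lhs = y[h + 1:]
    Z = np.column_stack([np.ones(T - h - 1), shock[1:T - h], y[:T - h - 1]])
    coef, *_ = np.linalg.lstsq(Z, lhs, rcond=None)
    irf[h] = coef[1]

print(np.round(irf, 2))   # should decay roughly like 0.8 * 0.6**h
```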

Peter Pedroni (Williams College), with Stephan Smeekes

Robust Estimation of Long Run Functional Relationships

Abstract

We investigate a new technique for estimating specialized long run functional relationships. The technique uses polynomial approximations to estimate functions of unknown form and exploits the structure of unit root panels by decomposing the estimating equations into a set of static linear time series regressions for each unit followed by a set of cross sectional polynomial regressions for each historical period of observation. We establish asymptotic normality and fast rates of convergence that allow for considerable robustness with respect to temporal and cross sectional dependency including common unit root factors. We also investigate the attractive finite sample properties of the technique by Monte Carlo simulation and offer two illustrations of empirical application, one from the growth literature and one pertaining to the environmental Kuznets curve.

Marie Ternes (Maastricht University), with Alain Hecq and Ines Wilms

Hierarchical Regularizers for Mixed-Frequency Vector Autoregressions

Abstract

Mixed-frequency Vector AutoRegressions (MF-VAR) model the dynamics between variables recorded at different frequencies. However, as the number of series and high-frequency observations per low-frequency period grow, MF-VARs suffer from the “curse of dimensionality”. We curb this curse through a regularizer that permits various hierarchical sparsity patterns by prioritizing the inclusion of coefficients according to the recency of the information they contain. Additionally, we investigate the presence of nowcasting relations by sparsely estimating the MF-VAR error covariance matrix. We study predictive Granger causality relations in an MF-VAR for the U.S. economy and construct a coincident indicator of GDP growth.

09:30 – 10:45  Factor Models

Takashi Yamagata (University of York), with Yoshimasa Uematsu

Inference in Weak Factor Models

Abstract

In this paper, we consider statistical inference for high-dimensional approximate factor models. We posit a weak factor structure, in which the factor loading matrix can be sparse and the signal eigenvalues may diverge more slowly than the cross-sectional dimension, N. We propose a novel inferential procedure to decide whether each component of the factor loadings is zero or not, and prove that this controls the false discovery rate (FDR) below a pre-assigned level, while the power tends to unity. This “factor selection” procedure is primarily based on a de-sparsified (or debiased) version of the WF-SOFAR estimator of Uematsu and Yamagata (2020), but is also applicable to the principal component (PC) estimator. After the factor selection, the re-sparsified WF-SOFAR and sparsified PC estimators are proposed and their consistency is established. Finite sample evidence supports the theoretical results. We apply our procedure to the FRED-MD macroeconomic and financial data, consisting of 128 series from June 1999 to May 2019. The results strongly suggest the existence of sparse factor loadings and exhibit a clear association of each of the extracted factors with a group of macroeconomic variables. In particular, we find a price factor, housing factor, output and income factor, and a money, credit and stock market factor.

Daniele Massacci (King’s Business School), with Mirco Rubin and Dario Ruzzi

Systematic Comovement in Threshold Group-Factor Models

Abstract

We study regime-specific systematic comovement between two large panels of variables that exhibit an approximate factor structure. Within each panel we identify threshold-type regimes through shifts in the factor loadings. For the resulting regimes, and with regard to the relation between any two variables in different panels, we define as “systematic” the comovement that is generated by the common components of the variables. In our setup, changes in comovement are identified by regime shifts in the loadings. After constructing measures of systematic comovement between the two panels, we propose estimators for these measures and derive their asymptotic properties. We develop inferential procedures to formally test for changes in systematic comovement between regimes. The empirical analysis of two large panels of U.S. and international equity returns shows that their systematic comovement increases when U.S. macroeconomic uncertainty is sufficiently high.

Gianluca Cubadda (University of Rome Tor-Vergata), with Alain Hecq

Dimension Reduction for High Dimensional Vector Autoregressive Models

Abstract

This paper aims to decompose a large dimensional vector autoregressive (VAR) model into two components, the first one being generated by a small-scale VAR and the second one being a white noise sequence. Hence, a reduced number of common factors generates the entire dynamics of the large system through a VAR structure. This modelling extends the common feature approach to high dimensional systems, and it differs from the dynamic factor model in which the idiosyncratic component can also embed a dynamic pattern. We show the conditions under which this decomposition exists. We provide statistical tools to detect its presence in the data and to estimate the parameters of the underlying small-scale VAR model. We evaluate the practical value of the proposed methodology by simulations as well as by an empirical application to a large set of US economic variables.

10:45 – 11:00 Break

11:00 – 12:00  Invited Lecture

Anders Bredahl Kock (Oxford University), with David Preinerstorfer

Consistency of p-norm based tests in high dimensions: characterization, monotonicity, domination

Abstract

Many commonly used test statistics are based on a norm measuring the evidence against the null hypothesis. To understand how the choice of a norm affects power properties of tests in high dimensions, we study the consistency sets of p-norm based tests in the prototypical framework of sequence models with unrestricted parameter spaces, the null hypothesis being that all observations have zero mean. The consistency set of a test is here defined as the set of all arrays of alternatives the test is consistent against as the dimension of the parameter space diverges. We characterize the consistency sets of p-norm based tests and find, in particular, that the consistency against an array of alternatives cannot be determined solely in terms of the p-norm of the alternative. Our characterization also reveals an unexpected monotonicity result: namely that the consistency set is strictly increasing in p ∈ (0, ∞), such that tests based on higher p strictly dominate those based on lower p in terms of consistency. This monotonicity allows us to construct novel tests that dominate, with respect to their consistency behavior, all p-norm based tests without sacrificing size.
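A small simulation illustrating the comparison discussed above: in a Gaussian sequence model with a sparse alternative, a sup-norm based test (the large-p limit) typically rejects far more often than a 2-norm based test of the same size. Critical values are simulated under the null; the dimension, effect size and norm choices are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n, reps, alpha = 1000, 2000, 0.05

# Null critical values for the 2-norm and sup-norm statistics of N(0, I_n) data.
null_draws = rng.standard_normal((reps, n))
crit2 = np.quantile(np.linalg.norm(null_draws, ord=2, axis=1), 1 - alpha)
crit_inf = np.quantile(np.abs(null_draws).max(axis=1), 1 - alpha)

# Sparse alternative: a single large nonzero mean coordinate.
mu = np.zeros(n); mu[0] = 4.0
alt_draws = rng.standard_normal((reps, n)) + mu

power2 = np.mean(np.linalg.norm(alt_draws, ord=2, axis=1) > crit2)
power_inf = np.mean(np.abs(alt_draws).max(axis=1) > crit_inf)
print(f"2-norm test power: {power2:.2f}, sup-norm test power: {power_inf:.2f}")
```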

12:00 – 12:15  Poster Pitches 2

The poster presenters of Poster Session 2 give a short pitch to introduce their posters.

12:15 – 13:00 Lunch

13:00 – 14:30  Poster Session 2

Igor Custodio João (VU Amsterdam), with Andre Lucas and Julia Schaumburg

Clustering Dynamics and Persistence for Financial Multivariate Panel Data

Abstract

We introduce a new method for dynamic clustering of panel data with dynamics for cluster location and shape, cluster composition, and for the number of clusters. Whereas current techniques typically result in (economically) too many switches, our method results in economically more meaningful dynamic clustering patterns. It does so by extending standard cross-sectional clustering techniques using shrinkage towards previous cluster means. In this way, the different cross-sections in the panel are tied together, substantially reducing short-lived switches of units between clusters (flickering) and the birth and death of incidental, economically less meaningful clusters. In a Monte Carlo simulation, we study how to set the penalty parameter in a data-driven way. A systemic risk surveillance example for business model classification in the global insurance industry illustrates how the new method works empirically.

Deniz Erdemlioglu (IESEG School of Management), with Massimiliano Caporin and Stefano Nasini

Estimating Financial Networks by Realized Interdependencies: A Restricted Autoregressive Approach

Abstract

We develop a network-based vector autoregressive approach to uncover the interactions among financial assets by integrating multiple realized measures based on high-frequency data. Under a restricted parameter structure, our approach allows us to capture the cross-sectional and time dependencies embedded in a large panel of assets through the decomposition of these two blocks of dependencies. We propose a block coordinate descent procedure for least squares estimation and investigate its theoretical properties. By integrating realized returns, realized volume and realized volatilities of 1095 individual U.S. stocks over fifteen years, we show that our approach identifies a large array of interdependencies with limited computational effort. As a direct consequence of the estimated model, we provide a new ranking of systemically important financial institutions (SIFIs) and carry out an impulse response analysis to quantify the effects of adverse shocks on the financial system.

Adam Jassem (Maastricht University), with Lenard Lieb, Rui Jorge Almeida, Nalan Basturk and Stephan Smeekes

Min(d)ing the President: A text analytic approach to measuring tax news​

Abstract

We propose a novel text-analytic approach for incorporating textual information into structural economic models and apply it to study the effects of tax news. We first develop a novel semi-supervised two-step topic model that automatically extracts specific information regarding future tax policy changes from text. We also propose an approach for transforming such textual information into an economically meaningful time series, to be included in a structural econometric model as a variable of interest or as an instrument. We apply our method to study the effects of fiscal foresight, in particular the informational content of speeches of the U.S. president about future tax reforms. We find that our semi-supervised topic model can successfully extract information about the direction of tax changes. The extracted information predicts (exogenous) future tax changes and contains signals that are not present in previously considered (narrative) measures of (exogenous) tax changes. We find that tax news triggers a significant yet delayed response in output.

Luke Mosley (Lancaster University), with Idris Eckley and Alex Gibberd

High-Dimensional Temporal Disaggregation

Abstract

Temporal disaggregation is a method commonly used in official statistics to enable high-frequency estimates of key economic indicators, such as GDP. Traditionally, such methods have relied on only a couple of high-frequency indicator series to produce estimates. However, the prevalence of large, and increasing, volumes of administrative and alternative data sources motivates the need for such methods to be adapted to high-dimensional settings. In this work, we propose a novel sparse temporal-disaggregation procedure and contrast this with the classical Chow-Lin method. We demonstrate the performance of our proposed method through a simulation study, highlighting the various advantages realised. We also explore its application to the disaggregation of UK gross domestic product data, demonstrating the method’s ability to operate when the number of potential indicators is greater than the number of low-frequency observations.
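A stylized regression-based disaggregation sketch in the spirit of Chow-Lin: low-frequency totals are regressed on temporally aggregated indicators (here with a lasso, since the number of indicators may be large), high-frequency fitted values are formed, and each low-frequency residual is spread evenly across its sub-periods. The paper's GLS weighting and specific sparse procedure are not reproduced.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(9)
n_years, freq, p = 30, 4, 20                  # 30 annual totals, quarterly target, 20 indicators

X_hf = rng.standard_normal((n_years * freq, p))     # high-frequency indicator series
beta = np.zeros(p); beta[:2] = [1.0, 0.5]           # only two indicators actually matter
y_hf_true = X_hf @ beta + 0.2 * rng.standard_normal(n_years * freq)

# Aggregation matrix C: each row sums the quarters of one year.
C = np.kron(np.eye(n_years), np.ones((1, freq)))
y_lf = C @ y_hf_true                                # the observed low-frequency series

# Regress low-frequency totals on aggregated indicators; the lasso handles many indicators.
X_lf = C @ X_hf
fit = Lasso(alpha=0.1, fit_intercept=False).fit(X_lf, y_lf)

# High-frequency fitted values plus each year's residual spread evenly over its quarters,
# so the disaggregated series adds up exactly to the observed annual totals.
resid_lf = y_lf - X_lf @ fit.coef_
y_hf_hat = X_hf @ fit.coef_ + np.repeat(resid_lf / freq, freq)

print("aggregation check:", np.allclose(C @ y_hf_hat, y_lf))
print("RMSE of disaggregated series:", np.sqrt(np.mean((y_hf_hat - y_hf_true) ** 2)))
```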

Livia Paranhos (University of Warwick)

Predicting Inflation with Neural Networks

Abstract

This paper applies neural network models to forecast inflation. The use of a particular recurrent neural network, the long short-term memory (LSTM) model, that summarizes macroeconomic information into common components is a major contribution of the paper. Results from an exercise with US data indicate that the estimated neural nets usually present better forecasting performance than standard benchmarks, especially at long horizons. The LSTM in particular is found to outperform the traditional feed-forward network at long horizons, suggesting an advantage of the recurrent model in capturing the long-term trend of inflation. This finding can be rationalized by the so-called long memory of the LSTM, which incorporates relatively old information in the forecast as long as accuracy is improved, while economizing on the number of estimated parameters. Interestingly, the neural nets containing macroeconomic information capture well the features of inflation during and after the Great Recession, possibly indicating a role for nonlinearities and macro information in this episode. The estimated common components used in the forecast seem able to capture the business cycle dynamics, as well as information on prices.

Rosnel Sessinou (Aix-Marseille University), with Luca Margaritella

Precision Least Squares: Estimation and Inference in High-Dimensional Linear Regression Model

Abstract

We show that, since the precision matrix can be employed to obtain the usual least squares solution, a regularized estimate of the precision matrix can be used directly to obtain the OLS solution even in a high-dimensional regression problem where the number of covariates is strictly larger than the sample size. As biases can arise from different choices of the precision matrix estimate, we show how to construct a (nearly) unbiased estimator for both sparse and non-sparse data generating processes. We call this the Precision Least Squares estimator and show that, under a framework of physical dependence, it is asymptotically Gaussian and automatically free of the usual post-selection bias; hence it does not suffer from converging only point-wise in the parameter space but is uniformly valid and delivers honest inference. We employ the Precision Least Squares estimator to estimate the predictive connectedness among daily asset returns of 88 global banks.
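An illustration (not the authors' estimator) of the identity underlying the approach: the population regression coefficients of y on x equal -Θ_yx/Θ_yy, where Θ is the precision matrix of (y, x), so a regularized precision estimate such as the graphical lasso yields regression coefficients even when the number of covariates exceeds the sample size. The bias correction developed in the paper is not reproduced, so the plug-in estimate below is shrunk towards zero.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(10)
n, p = 80, 120                                   # more covariates than observations

X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:3] = [1.0, -1.0, 0.5]
y = X @ beta + 0.5 * rng.standard_normal(n)

# Regularized estimate of the joint precision matrix of (y, x) via the graphical lasso.
Z = np.column_stack([y, X])
Theta = GraphicalLasso(alpha=0.3, max_iter=200).fit(Z).precision_

# Regression coefficients from the precision matrix: beta = -Theta_yx / Theta_yy.
# (The regularization shrinks these coefficients; the paper's correction is omitted.)
beta_pls = -Theta[0, 1:] / Theta[0, 0]
print("first five coefficients:", np.round(beta_pls[:5], 2))
```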

14:30 – 14:45 Break

14:45 – 16:00  Nonstationarity & Long Memory

Guillaume Chevillon (Essec Business School), with Luc Bauwens and Sébastien Laurent

We modeled long memory with just one lag!

Abstract

A large dimensional network or system can generate long memory in its components, as shown by Chevillon, Hecq and Laurent (2018, CHL) and Schennach (2018). These authors derive conditions under which the variables generated by an infinite dimensional vector autoregressive model of order 1, a VAR(1), exhibit long memory. We show how these asymptotic results can be put into practice for finite sample modeling and inference regarding series with long range dependence that belong to a network or a large system. We propose to use a VAR(1), or an AR(1)-X when the VAR(1) model is estimated equation by equation, whose parameters we shrink towards generic conditions matching those of CHL and Schennach. Our proposal significantly outperforms ARFIMA and HAR models when forecasting a non-parametric estimate of the log of the integrated variance (i.e., log(MedRV)) of 250 assets, or the annual productivity growth recorded in 100 industrial sectors in the U.S.

Tomás del Barrio Castro (University of the Balearic Islands)

Testing for the cointegration rank between Periodically Integrated processes

Abstract

Cointegration between Periodically Integrated (PI) processes has been analyzed, among others, by Birchenhall, Bladen-Hovell, Chui, Osborn, and Smith (1989), Boswijk and Franses (1995), Franses and Paap (2004), Kleibergen and Franses (1999) and del Barrio Castro and Osborn (2008). However, so far no method published in an academic journal allows us to determine the cointegration rank between PI processes. This paper fills that gap: we propose a method to determine the cointegration rank between a set of PI processes based on the idea of pseudo-demodulation, proposed in the context of Seasonal Cointegration by del Barrio Castro, Cubadda and Osborn (2020). Once a pseudo-demodulated time series is obtained, the Johansen (1995) procedure can be applied to determine the cointegration rank. A Monte Carlo experiment shows that the proposed approach works satisfactorily for small samples.
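The pseudo-demodulation step is specific to the paper; as a generic illustration of the final step, the sketch below applies the Johansen trace test, as implemented in statsmodels, to a simulated bivariate system with one common stochastic trend (cointegration rank one).

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(12)
T = 400

# Two I(1) series sharing one common stochastic trend -> cointegration rank 1.
trend = np.cumsum(rng.standard_normal(T))
Y = np.column_stack([trend + rng.standard_normal(T),
                     0.5 * trend + rng.standard_normal(T)])

res = coint_johansen(Y, det_order=0, k_ar_diff=1)
print("trace statistics:", np.round(res.lr1, 2))
print("95% critical values:", np.round(res.cvt[:, 1], 2))
```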

Aramayis Dallakyan (Texas A&M University), with Mohsen Pourahmadi

Fused-Lasso Regularized Cholesky Factors of Large Nonstationary Covariance Matrices of Replicated Time Series

Abstract

The smoothness of the subdiagonals of the Cholesky factor of large covariance matrices is closely related to the degree of nonstationarity of autoregressive models for time series data. Heuristically, for a nearly stationary covariance matrix one expects the entries in each subdiagonal of the Cholesky factor of its inverse to be approximately the same, in the sense that the sum of the absolute values of successive differences is small. Statistically, such smoothness is achieved by regularizing each subdiagonal using fused-type lasso penalties. We rely on the standard Cholesky factor as the new parameter within a regularized normal likelihood setup, which guarantees: (1) joint convexity of the likelihood function, (2) strict convexity of the likelihood function restricted to each subdiagonal even when n < p, and (3) positive-definiteness of the estimated covariance matrix. A block coordinate descent algorithm, where each block is a subdiagonal, is proposed, and its convergence is established under mild conditions. The lack of decoupling of the penalized likelihood function into a sum of functions involving individual subdiagonals gives rise to some computational challenges and advantages relative to two recent algorithms for sparse estimation of the Cholesky factor, which decouple row-wise. Simulation results and real data analysis show the scope and good performance of the proposed methodology.

16:00 – 16:15 Break

16:15 – 17:15  Invited Lecture

Marcelo C. Medeiros (Pontifical Catholic University of Rio de Janeiro), with Jianqing Fan and Ricardo Masini

Bridging Factor and Sparse Models

Abstract

Factor and sparse models are two widely used methods to impose a low-dimensional structure in high dimensions, and they are seemingly mutually exclusive. In this paper, we propose a simple lifting method that combines the merits of the two models in a supervised learning methodology that allows us to efficiently explore all the information in high-dimensional datasets. The method is based on a flexible model for panel data, called the factor-augmented regression model, with observable and latent common factors as well as idiosyncratic components as high-dimensional covariates. This model not only includes both factor regression and sparse regression as specific cases but also significantly weakens the cross-sectional dependence and hence facilitates model selection and interpretability. The methodology consists of three steps. At each step, the remaining cross-sectional dependence can be inferred by a novel test for covariance structure in high dimensions. We develop asymptotic theory for the factor-augmented sparse regression model and demonstrate the validity of the multiplier bootstrap for testing high-dimensional covariance structure. This is further extended to testing high-dimensional partial covariance structures. The theory and methods are further supported by an extensive simulation study and by applications to the construction of a partial covariance network of financial returns and to a prediction exercise for a large panel of macroeconomic time series from the FRED-MD database.
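A minimal sketch of the factor-augmented sparse regression idea on simulated data: principal components extract the common factors from the covariates, and the target is then regressed by lasso on the estimated factors together with the idiosyncratic components. The paper's covariance-structure tests, multiplier bootstrap and tuning are not reproduced (and in practice the factor block would typically be left unpenalized).

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(11)
T, N, r = 250, 100, 2

F = rng.standard_normal((T, r))                    # latent common factors
L = rng.standard_normal((N, r))
U = rng.standard_normal((T, N))                    # idiosyncratic components
X = F @ L.T + U
y = F @ np.array([1.0, -1.0]) + 0.8 * U[:, 0] + 0.5 * rng.standard_normal(T)

# Step 1: principal components of X give estimated factors and idiosyncratic parts.
Xc = X - X.mean(0)
eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / T)
L_hat = eigvec[:, ::-1][:, :r] * np.sqrt(N)
F_hat = Xc @ L_hat / N
U_hat = Xc - F_hat @ L_hat.T

# Step 2: lasso of y on the estimated factors plus the idiosyncratic covariates.
W = np.column_stack([F_hat, U_hat])
fit = Lasso(alpha=0.05).fit(W, y)
print("factor coefficients:", np.round(fit.coef_[:r], 2))
print("selected idiosyncratic covariates:", np.flatnonzero(fit.coef_[r:]))
```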