John Einmahl & Sebastian Engelke

2022/12/08

At Erasmus University Rotterdam

John Einmahl (Tilburg University)

Title: Extreme Value Inference for General Heterogeneous Data

Abstract: We extend extreme value statistics to independent data with possibly very different distributions. In particular, we present novel asymptotic normality results for the Hill estimator, which now estimates the extreme value index of the average distribution. Due to the heterogeneity, the asymptotic variance can be substantially smaller than that in the i.i.d. case. As a special case, we consider a heterogeneous scales model where the asymptotic variance can be calculated explicitly. The primary tool for the proofs is the functional central limit theorem for a weighted tail empirical process. A simulation study shows that our limit theorems provide good finite-sample approximations. We present an application to assess the tail heaviness of earthquake energies.

This is joint work with Yi He (University of Amsterdam).
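For context, recall the classical i.i.d. form of the Hill estimator that the talk generalizes (a standard textbook formula, not part of the abstract): based on the order statistics $X_{(1)} \le \dots \le X_{(n)}$ and the $k$ largest observations,

$$\hat{\gamma}_H = \frac{1}{k} \sum_{i=1}^{k} \log \frac{X_{(n-i+1)}}{X_{(n-k)}},$$

and under standard second-order conditions $\sqrt{k}\,(\hat{\gamma}_H - \gamma) \xrightarrow{d} N(0, \gamma^2)$. This $\gamma^2$ is the i.i.d. benchmark variance that, per the abstract, the heterogeneous setting can improve upon.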

Sebastian Engelke (University of Geneva)

Title: Extremal Graphical Models

Abstract: Engelke and Hitz (2020, JRSSB) introduce a new notion of conditional independence and graphical models for the most extreme observations of a multivariate sample. This enables the analysis of complex extreme events on network structures (e.g., floods) or large-scale spatial data (e.g., heat waves). Recent results show that this notion of extremal conditional independence arises as a special case of a much more general theory for limits of sums and maxima of independent random vectors. We first discuss the implications of this theory for other fields, and then focus on statistical inference for extremal graphical models. This includes the estimation of model parameters on general graph structures through matrix completion problems, and data-driven structure learning algorithms that estimate graphs through $L_1$ penalization. Theoretical guarantees based on concentration inequalities are given even in high-dimensional settings where the dimension $d$ is much larger than the sample size $n$. This is of particular interest in extremes, since the effective sample size $k$ is much smaller than $n$.
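To give a flavor of the inference side, the sketch below computes the empirical extremal variogram, a rank-based quantity from which parameter estimation and structure learning for Hüsler–Reiss graphical models typically start. This is a minimal illustration under assumed conventions (the function name, the rank standardization, and the toy data are not from the talk):

```python
import numpy as np

def extremal_variogram(X, m, k):
    """Empirical extremal variogram rooted at component m (illustrative sketch).

    Margins are rank-transformed to approximate standard Pareto scale, the k
    samples in which component m is most extreme are kept, and entry (i, j)
    is the empirical variance of log Y_i - log Y_j over those tail samples.
    """
    n, d = X.shape
    # Rank-transform each margin: rank r in 1..n maps to 1 / (1 - r / (n + 1))
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1
    Y = 1.0 / (1.0 - ranks / (n + 1.0))
    # Condition on component m being large: keep its k largest observations
    idx = np.argsort(Y[:, m])[-k:]
    L = np.log(Y[idx, :])
    # Entry (i, j): variance of the log-difference of margins i and j
    diff = L[:, :, None] - L[:, None, :]  # shape (k, d, d)
    return np.var(diff, axis=0)

# Toy usage: a common heavy-tailed factor induces tail dependence
rng = np.random.default_rng(0)
Z = rng.pareto(2.0, size=(2000, 1))
X = rng.pareto(2.0, size=(2000, 3)) + Z
print(extremal_variogram(X, m=0, k=50))
```

In the Hüsler–Reiss family, missing edges of the extremal graph correspond to zeros of a precision matrix constructed from such a variogram, which is the target of the $L_1$-penalized structure learning mentioned in the abstract.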