Danish Society for Theoretical Statistics (DSTS)

Dansk Selskab for Teoretisk Statistik


Welcome to the 2-day DSTS meeting organized by Epidemiology, Biostatistics and Biodemography at SDU. Please find the program and practical details below.

We wish everyone an enjoyable two days and look forward to seeing you in Odense.

Date and Time

9 Apr 2019 13:00 - 10 Apr 2019 13:00


University of Southern Denmark
Store Auditorium - ground floor
J.B. Winsløws Vej 15
5000 Odense C
Map: J.B. Winsløws Vej 15 - location 15-0.55


To sign up for the meeting online, go to: Registration

Deadline: April 1st, 2019


April 9th

Young Statisticians Denmark - Lunch
Sign up for YSD Lunch
Jacob v.B. Hjelmborg 
13:10 Statistical Analysis on Manifold-valued Data
Line Kühnel
University of Copenhagen (chair: Ulrich Halekoh)
  ABSTRACT: The constant improvement of data collection techniques increases the complexity of observed data objects. Cameras and scanning technologies make it possible to retrieve detailed images of everything from microscopic structures of a cell to 3D images of anatomical objects. No matter if data are curve outlines of a shape, medical images, or a collection of landmark points for an object, such complex data structures lack vector space properties and will hence challenge the well-known statistical theory for data in Euclidean space. All things considered, new generalised statistical methods have to be developed for analysing non-linear data samples. In this talk, I give a brief introduction to some of the challenges and methods for analysing complex data objects assumed to form a Riemannian manifold.
13:50 Estimating survival benefit in a clinical trial
Michael Væth
Department of Public Health, Aarhus University (chair: Ulrich)
  ABSTRACT: In a general population, a proportional change of mortality results in a change in life expectancy which, to a close approximation, is proportional to the logarithm of the change in mortality. Using censored follow-up data, this relationship may be used to predict the difference in average remaining lifetime between two groups of individuals with approximately proportional mortality.
The usefulness of the methodology in a clinical trial setting is explored using follow-up data from two clinical trials of breast cancer patients. Three methods are considered. The first applies standardized mortality ratios (SMR) computed relative to the normal population; the second relies on a hazard ratio estimated by Cox regression analysis. Advantages and disadvantages of these approaches are discussed. The third uses the SMR approach to modify the difference in restricted mean survival. The results are not discouraging, and the methodology seems potentially useful in a cost-effectiveness analysis of a new treatment option.
14:30 Break
15:00 Structured Additive Regression Models for Functional Data
Fabian Scheipl
LMU München (chair: Ulrich Halekoh)
  ABSTRACT: Researchers are increasingly interested in regression models for functional data to relate functional observations to other variables of interest. We will discuss a comprehensive framework for additive (mixed) models for functional responses and/or functional covariates. The guiding principle is to reframe functional regression in terms of corresponding models for scalar data, allowing the adaptation of a large body of existing methods for these novel tasks. The framework encompasses many existing as well as new models. It includes regression for "generalized" functional data, mean regression, quantile regression as well as generalized additive models for location, shape and scale (GAMLSS) for functional data. It admits many flexible linear, smooth or interaction terms of scalar and functional covariates as well as (functional) random effects and allows flexible choices of bases -- in particular splines and functional principal components -- and corresponding penalties for each term. It covers functional data observed on common (dense) or curve-specific (sparse) grids.
Penalized-likelihood-based and gradient-boosting-based inference for these models is implemented in the R packages refund and FDboost, respectively. We also discuss identifiability and computational complexity for the functional regression models covered.
15:40 The Trendiness of Trends
Andreas Kryger Jensen
Biostat, University of Copenhagen (chair: Ulrich) 
  ABSTRACT: What exactly is a trend and when do we believe in the direction of a trend?
These are two fundamental questions in applied statistics and certainly also imperative in the decision-making processes influencing public health on a national scale.
A statement often seen in the public news is that a trend has changed or is starting to change. This has most recently been exemplified in Politiken (January 2nd, 2019), in which a headline stated that 2018 was the first year in two decades in which the proportion of smokers in Denmark had significantly increased.
In this talk I will propose a statistical method for elucidating such questions using techniques from Functional Data Analysis. Under the assumption that our reality evolves in continuous time and that a statistical statement must be conditional on all available information, I propose an easily interpretable probabilistic Trend Direction Index. This index can be estimated from data and updated over time as new information becomes available.
16:20 Break
16:50 Local robust estimation of the Pickands dependence function
Yuri Goegebeur
IMADA, University of Southern Denmark (chair: Jacob v. B. Hjelmborg)
  ABSTRACT: We consider the robust estimation of the Pickands dependence function in the random covariate framework. Our estimator is based on local estimation with the minimum density power divergence criterion. We provide the main asymptotic properties, in particular the convergence of the stochastic process, correctly normalized, towards a tight centered Gaussian process. The finite sample performance of our estimator is evaluated with a simulation study involving both uncontaminated and contaminated samples. The method is illustrated on a dataset of air pollution measurements.
17:30 Experimenting in Symbolic Dynamics
Wojciech Szymanski
IMADA, University of Southern Denmark (chair: Jacob) 
  ABSTRACT: Many abstract discrete-time dynamical systems arise from very simple initial data (automata), and yet result in very complicated dynamics. Such systems may be studied by both discrete (combinatorial) and analytic methods. Nevertheless, many fundamental questions about the long-time behavior remain intractable, and thus less rigorous, experimental approaches have been proposed. In this talk, we would like to raise the question of whether probabilistic methods could be useful in this line of research.
DINNER at J.B. Winsløws Vej 19 - Map

April 10th

9:00 Regression on imperfect class labels derived by unsupervised clustering
Martin Bøgsted
Department of Haematology, Aalborg University Hospital (chair: Birgit Debrabant)
  ABSTRACT: Regressing an outcome on class labels identified by unsupervised clustering is customary in many applications. However, it is common to ignore the misclassification of class labels caused by the learning algorithm, which potentially leads to serious bias of the estimated effect parameters. Because of its generality, we suggest redressing the situation by use of the simulation and extrapolation method. Performance is illustrated by simulated data from Gaussian mixture models. Finally, we apply our method to a study which regressed overall survival on class labels derived from unsupervised clustering of gene expression data from bone marrow samples of multiple myeloma patients.
9:40 Assessment of a treatment effect for recurrent event data in the presence of a terminal event
Philip Hougaard
Lundbeck and University of Southern Denmark (chair: Birgit) 
  ABSTRACT: The paper considers clinical trials where multiple occurrences of the same event are recorded over time. The frame is events that can occur at most a few times during a trial, such as hospitalizations or heart failures. I will first consider the case without terminal events (meaning not considering the possibility of death), focusing on the Poisson and the frailty-Poisson model, with a proportional hazards assumption. Multi-state models will also be mentioned but are less convenient for this purpose. The frailty-Poisson model shows the same treatment effect unconditionally as conditionally on the frailty. From this we will learn that studying the first event only is insufficient because it suffers from a selection effect implying that the unconditional effect is smaller than the conditional. Terminal events, such as death, make the case more complicated because events cannot occur after death, and therefore we need to be concerned whether any suggested analysis technique could make a treatment with high mortality appear as successful in reducing the number of events. The suggestion is to consider the integrated hazard of events. This will be discussed in perspective of the estimand definition. The estimand is a relatively new concept that has entered guidelines for the statistics in the pharmaceutical industry. Basically, the concept is a formalization of the handling of missing data.
10:20 Closed tests for multiple comparisons of areas under the ROC curve
Paul Blanche
Biostat, University of Copenhagen (chair: Birgit)  
  ABSTRACT: Comparing areas under the ROC curve is a common approach to compare prognostic biomarkers. In this talk, we present an efficient method to control the family-wise error rate when multiple comparisons are performed. More specifically, we suggest combining max-test and closed testing procedures. We build on previous work on asymptotic results for ROC curves and on general multiple testing methods to take into account both the correlations between the test statistics and the logical constraints between the null hypotheses. The proposed method results in a uniformly more powerful procedure than both the single-step max-test procedure and popular stepwise extensions of the Bonferroni procedure, such as Bonferroni-Holm. As demonstrated in this talk, the method can be applied in most usual contexts, including the time-dependent context with right censored data. We show how the method works in practice through a motivating example where we compare several psychometric scores to predict the t-year risk of Alzheimer's disease. The example illustrates several multiple testing settings and demonstrates the advantage of using the proposed methods over common alternatives.
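For readers unfamiliar with the Bonferroni-Holm comparator named in the abstract above, a minimal sketch of the Holm step-down rule follows. It is not the closed max-test procedure proposed in the talk; the function name `holm_reject` and the p-values are hypothetical and chosen only for illustration.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's step-down rule rejects."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Step-down: once one hypothesis is retained, all remaining ones are retained.
            break
    return reject


# Hypothetical p-values from four pairwise comparisons (illustration only).
example_p = [0.010, 0.040, 0.030, 0.005]
decisions = holm_reject(example_p)
print(decisions)
```

Holm's rule uses only the ordered p-values, so it ignores the correlations between test statistics and the logical constraints between hypotheses that the abstract says the proposed method exploits.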
11:30 Estimation of cohort-specific population sizes from register data - a regression approach
Birgit Debrabant
Biostat, University of Southern Denmark (chair: Ulrich Halekoh)
  ABSTRACT:  Prevalence estimates of infectious diseases are important in many contexts. However, it is often impossible to simply count infected individuals when incubation times are long or diagnosis is complicated or lengthy. Consequently, there is a need for methods that can estimate the size N of a corresponding hidden population.

Methods differ largely in the type of observations used and their underlying sampling scheme. In this talk, we consider consecutive registration frequencies of newly identified disease carriers.  

In the simplest case, the underlying population is closed and the (unknown) registration probabilities are constant (p). This case is linked to binomial removal sampling, where different types of well-known estimators exist, see e.g. Zippin (1956). Its present-day relevance is seen in Ledberg and Wennberg (2014), who present a corresponding method based on registration frequencies using an ML-estimator.

Realistic underlying populations are, however, often subject to new infections and death, and registration probabilities are non-constant. Estimators derived from corresponding, more complex probability models tend to be far less stable. A further difficulty can arise when registration frequencies range over k different age-cohorts. Requiring estimates for age-cohort specific population sizes N1, N2, ..., Nk (assuming a universal p) leads to an increased dimension of the parameter space, and the numerical optimization of the log-likelihood becomes very tedious.

In this talk, we consider the simultaneous estimation of age-cohort specific population sizes for several age groups (with universal unknown p), allowing individuals to exit the populations. We extend a regression approach from Zippin (1956) in order to construct estimators for N1, N2, ..., Nk and p.
A. Ledberg and P. Wennberg. "Estimating the size of hidden populations from register data". BMC Medical Research Methodology 14.1 (2014)
C. Zippin. "An Evaluation of the Removal Method of Estimating Animal Populations". Biometrics 12.2 (1956)
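As an illustration of the simplest case in the abstract above (closed population, constant registration probability p), the following sketch uses a textbook regression form of removal sampling: the expected number of new registrations in a period equals p times (N minus the cumulative number registered before that period), so a straight-line fit of counts on cumulative prior counts has slope -p and intercept pN. This is an assumption made for illustration and may differ from the estimator extended in the talk; the function name `removal_regression` and the numbers are hypothetical.

```python
def removal_regression(counts):
    """Estimate (N, p) by least squares of period counts on cumulative prior counts."""
    if len(counts) < 2:
        raise ValueError("at least two registration periods are needed")
    # Cumulative number registered before each period.
    prior = []
    total = 0.0
    for c in counts:
        prior.append(total)
        total += c
    n = len(counts)
    mean_x = sum(prior) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in prior)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prior, counts))
    if sxx == 0:
        raise ValueError("cumulative counts do not vary; the line cannot be fitted")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    p_hat = -slope                 # slope of the fitted line is -p
    if p_hat <= 0:
        raise ValueError("counts do not decline; the removal model does not fit")
    n_hat = intercept / p_hat      # intercept of the fitted line is p * N
    return n_hat, p_hat


# Noise-free expected counts for a hypothetical population (N = 1000, p = 0.2).
true_n, true_p = 1000.0, 0.2
expected_counts = [true_n * true_p * (1 - true_p) ** i for i in range(5)]
n_hat, p_hat = removal_regression(expected_counts)
print(n_hat, p_hat)
```

With real, noisy registration counts the fitted line will not be exact, and the sketch does not handle the age-cohort or open-population extensions discussed in the abstract.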
11:50 Describing the shape of dying in Denmark
Anne Vinkel Hansen
Danmarks Statistik (chair: Jacob v. B. Hjelmborg)
  ABSTRACT: Later generations die later, and have later onset of diseased, frail old age. Conversely, more aggressive diagnosing practices and better treatment of chronic diseases (resulting in longer life with them) pull in the direction of people spending more years of life with disease. Between the opposite directions of these forces, what shape do the last years of life actually take?
We use latent class trajectory modelling to describe the use of health services in the last five years of life among Danes aged 65 and above. Latent class trajectory modelling, an extended mixed model with latent classes, separates a population into a number of groups defined by the shape of their trajectories across the measured time interval. Having determined the main shapes that late-life use of healthcare services takes, we explore how these are distributed across age and year of death.
12:10 Analyzing relative survival using parametric cure models
Lasse Hjort Jakobsen
Aalborg University (chair: Jacob)
  ABSTRACT: Cure models are used in time-to-event analysis when the survival of the considered individuals reaches the same level as the general population, which corresponds to a plateau in the relative survival function. The main parameter of interest is the level of the plateau. In this presentation, we will focus on the interpretation of this parameter and describe two general classes of cure models, namely explicit and latent cure models, which differ in the inclusion of a specific parameter for the plateauing level. We will discuss these two model classes in terms of model formulation, estimation, and identifiability. Instances from both model classes will be compared in a simulation study.
12:30 Multivariate Generalized Linear Models for Twin Data
Wagner Hugo Bonat
Paraná Federal University, Brazil (chair: Jacob)
  ABSTRACT: Multivariate twin and family studies are among the most important tools to assess disease inheritance as well as to study their genetic and environmental interrelationship. In this talk, I am going to present a flexible statistical modelling framework for analyzing multivariate Gaussian and non-Gaussian twin and family data. The non-normality is taken into account by modelling the mean and variance relationship, while the covariance structure is modeled by means of a linear covariance model including the option to model the dispersion components as functions of known covariates in a regression model fashion. The proposed model class can deal with binomial, continuous bounded, count, symmetric and asymmetric continuous and semi-continuous phenotypes as well as combinations of them in a unified framework.
End of the day + Sandwich & soft drinks