Specialise in Data Science and Artificial Intelligence

Data Science & Artificial Intelligence includes fields like machine learning, data mining, deep learning, artificial intelligence, optimisation, visualisation, and statistics, and it relates to terms such as data analytics and big data.

In many areas, incorporating insights from data analysis makes a major difference. Everyday examples of data science in action include product recommendations in online stores and personal assistant systems on smartphones. Many companies want to use data science techniques to optimise their businesses. In industry, machine learning, optimisation and artificial intelligence are applied in rapidly developing technologies such as robotics, drones, and self-driving cars.

In the Data Science group at SDU, statisticians and computer scientists teach together, so our educational programmes combine expertise in many aspects of the field with a coherent overall picture. Through close cooperation with other faculties, we are also able to offer courses that connect to the emerging field of Personalised Medicine, a field which is relevant to everyone and which relies heavily on Data Science.

We are engaged in data science projects with various companies, from small and medium-sized local companies to big players, which is why we can offer hands-on experience in student projects as well as theoretical research at the forefront of this field.

For example, we are working with the City of Odense and the municipalities of Kolding, Nyborg and Svendborg on improving traffic systems, the planning of public transportation and the allocation of posts for building maintenance in the yearly budgets. In the industrial sector, we have current and past projects on data analysis and optimisation with companies such as Danfoss, Ørsted, Energinet, Lego, and Aviation Cloud.

Courses

In the academic year 2021/2022, we will be offering the following courses within the area of Data Science and Artificial Intelligence:

This course provides an introduction to the science of Discrete Optimisation and focuses on two of its solution paradigms: Constraint Programming and Optimisation Heuristics and Metaheuristics.

Constraint Programming tries to solve problems by modelling them in a declarative programming language and then using standard deduction rules, similar to logical reasoning, to reduce the space in which solutions are searched. Optimisation Heuristics and Metaheuristics are general principles for finding near-optimal solutions. They are the last resort when a problem turns out to be computationally too difficult to solve exactly. They are often inspired by nature. For example, local search techniques are based on the principle of trial and error, which is one way in which humans solve problems.

To be successful, the general principles must be adapted to the specific problem exploiting its structure and efficient implementations. Hence, the course includes first-hand experience through programming assignments.
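As a rough illustration of the trial-and-error principle behind local search, the sketch below splits a list of numbers into two groups with sums as close as possible. The problem, the function names and the single-flip neighbourhood are assumptions chosen for brevity; they are not taken from the course material.

```python
import random


def imbalance(numbers, assignment):
    """Absolute difference between the sums of group 0 and group 1."""
    left = sum(n for n, side in zip(numbers, assignment) if side == 0)
    right = sum(n for n, side in zip(numbers, assignment) if side == 1)
    return abs(left - right)


def local_search(numbers, seed=0, max_steps=1000):
    """Hill climbing: try a random single change, keep it only if it improves."""
    rng = random.Random(seed)
    # Start from a random assignment of each number to group 0 or 1.
    current = [rng.randint(0, 1) for _ in numbers]
    current_cost = imbalance(numbers, current)
    for _ in range(max_steps):
        # Trial: move one randomly chosen number to the other group.
        i = rng.randrange(len(numbers))
        candidate = current.copy()
        candidate[i] = 1 - candidate[i]
        candidate_cost = imbalance(numbers, candidate)
        # Error check: accept the move only if the imbalance shrinks.
        if candidate_cost < current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost


numbers = [8, 7, 6, 5, 4]
assignment, cost = local_search(numbers)
print(assignment, cost)
```

Plain hill climbing like this can get stuck in a local optimum, which is one reason metaheuristics add mechanisms such as restarts or occasionally accepting worse moves.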

Responsible teacher: Marco Chiarandini

This course offers a broader perspective on logic and how it is used in Computer Science. In a time when computers are increasingly taking charge of critical tasks, such as routing airplanes or administering medicine to patients, it is essential to understand how we can formalise the safety requirements we expect of such programs, and verify that the programs conform to them.

Responsible teacher: Luís Cruz-Filipe


We will start with a concrete biological and/or medical question, transform it into a computational problem formulation, design a mathematical model, solve it, and finally derive and evaluate real-world answers from within the model.

You will be introduced to different computer science models and methods and their application within the area of Personalised Medicine, covering topics such as molecular biology, central aspects of gene regulation, epigenetic DNA modifications, and the specifics of bacterial and phage genetics.

Responsible teacher: Konrad Krawczyk

This course covers advanced unsupervised data mining methods, such as ensemble methods for clustering and outlier detection, and methods dedicated to high-dimensional data (e.g., subspace clustering, outlier detection in high-dimensional data). These methods are important for handling complex, difficult, and high-dimensional data in various applications.

Ensemble methods are established for supervised learning, but come with additional challenges for unsupervised methods. High-dimensional data come with special challenges known under the name “curse of dimensionality”. We will survey such challenges in a more general way and discuss how different general algorithmic strategies and specific solutions try to tackle those challenges.
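One facet of the curse of dimensionality is that distances tend to concentrate: the farthest and the nearest point become almost equally far away. The sketch below measures this on uniformly random data; the function name, the sample size and the choice of uniform data are illustrative assumptions, not part of the course description.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    # Query point and data points are drawn uniformly from the unit hypercube.
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


# The contrast shrinks as the number of dimensions grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast approaches zero, notions such as "nearest neighbour" or "outlier" that rely on distances in the full space lose discriminative power, which motivates approaches such as subspace methods.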

Responsible teacher: Arthur Zimek

Data Mining and Machine Learning techniques enable computational systems to identify meaningful patterns in the data and to adaptively improve their performance with experience accumulated from the observed data.

This course introduces the most common techniques for performing basic data mining and machine learning tasks, and covers the basic theory, algorithms, and applications. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. Computational learning methods are introduced at a general level, with their basic ideas and intuition.

Moreover, the students have the opportunity to experiment and apply data mining and machine learning techniques to selected problems.

Responsible teacher: Arthur Zimek


The main focus of linear and integer programming is on resource constrained optimisation problems that can be described by means of linear inequalities and a linear objective function. These problems may arise in all contexts of decision making, such as manufacturing, logistics, health care, education, finance, energy supply and many others.

In this course, you will learn the basics of linear and integer programming and duality theory, as well as the main solution techniques, such as the simplex method, branch and bound, and cutting planes. The course also aims to provide hands-on experience with mathematical modelling and the solution of these models using software systems.
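To make the idea of a linear objective with linear inequality constraints concrete, here is a minimal sketch for a problem with two variables. It is not the simplex method: it simply enumerates the corner points of the feasible region, relying on the fact that a bounded, feasible linear programme attains its optimum at a vertex. The example data and the function name are assumptions made for illustration.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints, eps=1e-9):
    """Maximise objective[0]*x + objective[1]*y subject to a*x + b*y <= c.

    Brute force over vertices; assumes the feasible region is non-empty
    and bounded, and that there are exactly two variables.
    """
    best_point, best_value = None, float("-inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel constraint lines do not intersect
        # Intersection of the two constraint lines (Cramer's rule).
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        # Keep the intersection only if it satisfies every constraint.
        if all(a * x + b * y <= c + eps for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


# Maximise 3x + 2y subject to x + y <= 4, x + 3y <= 6, x <= 3, x >= 0, y >= 0.
constraints = [(1, 1, 4), (1, 3, 6), (1, 0, 3), (-1, 0, 0), (0, -1, 0)]
point, value = solve_2d_lp((3, 2), constraints)
print(point, value)
```

Enumerating vertices becomes hopeless as the number of variables and constraints grows, which is why methods such as the simplex method visit only a small, well-chosen sequence of vertices.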

Responsible teacher: Marco Chiarandini

The course focuses on advanced solution techniques for solving challenging mathematical optimisation problems. We start from a few concrete applications taken from scheduling and vehicle routing and model them in terms of mixed integer linear programming (MILP) problems.

Due to the size of the instances of these problems in practical applications, basic solution techniques for MILP problems are insufficient and advanced solution techniques are required.

These are:

  • Lagrangian relaxation
  • Dantzig-Wolfe decomposition
  • Column generation
  • Benders decomposition

We study the theory and practise with implementations.

Students taking the course are expected to have knowledge of linear programming, for example from either DM545 or DM871.

Responsible teacher: Marco Chiarandini


Machine learning has become a part of our everyday lives, from simple product recommendations to personal electronic assistants to self-driving cars. Deep learning in particular has attracted a lot of media interest and has demonstrated impressive results.

This intensive course will introduce you to the exciting world of deep learning. We will learn about the theoretical background and concepts driving deep learning and highlight and discuss the most noteworthy applications of deep learning but also their limitations.

Furthermore, all content will be put into practice immediately by suitable exercises and programming tasks.

Responsible teacher: Richard Röttger

Visualisations are important tools for experts in various domains (e.g., social sciences, bioinformatics, digital humanities, sports) to get an overview of data distributions and insights into existing data patterns in an understandable, intuitive, visual form.

The goal of the course is to enable you to develop appropriate visual interfaces for (domain-specific) user tasks. This is important because many Computer Science graduates will be working in fields that may require visual solutions for data exploration.

The course covers important theoretical concepts related to visualisation design, including data abstraction, task abstraction, and visual encoding. The course will cover all of these aspects in a practical group project that involves developing an interactive visualisation for a data set of personal interest.

Responsible teacher: Stefan Jänicke

This course provides a broad introduction to Artificial Intelligence.

We discuss the goals and methods of AI, starting with the concept of intelligent agents, and the classical topics of AI: search, optimisation, knowledge representation, planning, and reasoning with uncertainty.

Responsible teacher: Luís Cruz-Filipe

The aim of the course is to provide an introduction to Text Mining of unstructured text in natural languages.

Increasing amounts of digitised text call for the development of formal frameworks to process such data, extract information from it, and draw statistical conclusions based on its content. The course is designed to provide a sound theoretical basis in processing unstructured text and to present example applications of these methods.

We will start working with simple examples of unstructured text demonstrating the abilities of current Text Mining methods to highlight their advantages and shortcomings. We will then move to applications of such methods on more realistic datasets sourced from online news media and scientific publications.

The content of this course is designed to show computer science and data science methods in an applied context, handling real-world data.

Responsible teacher: Konrad Krawczyk


This course focuses on multivariate statistical methods used for dimensionality reduction, analysis of mean vectors, and discrimination and classification. Such methods are relevant to a wide range of practical applications: quality control of industrial machinery, epidemiology and clinical problems in population health care, and questions in biological conservation and environmental monitoring.

Responsible teacher: Jing Qin

The aim of the course is to enable you to gain insight into the mathematical structure of linear and generalised linear models, including experience in recognising such models from a given statistical problem.

Responsible teacher: Fernando Colchero

The aim of the course is to enable you to use modern computer-intensive statistical methods as tools to investigate stochastic phenomena and statistical procedures, and to perform statistical inference. These skills are important for conducting statistical analysis based on computation and simulation.

Responsible teacher: Yuri Goegebeur

 

Course timetable

Semester      10 ECTS courses                      5 ECTS courses
Autumn 2021   DM841, DM847, DM873                  DM864, DM878
Spring 2022   DM846, DM870, DM879, ST813, ST816    DM871, DM872, DM882, ST811

Most of these courses require programming abilities and a basic understanding of linear algebra. As long as you meet the academic prerequisites, you can freely combine the offered courses. However, we encourage you to contact potential supervisors early to learn about a recommended course of study for certain topics.

Master Thesis projects

The following are examples of previous Master Thesis topics in the area of Data Science and Artificial Intelligence:

  • Optimisation of demand-responsive personal transportation
  • Synchronisation, enrichment and visualisation of football data
  • Bus line optimisation on Funen
  • Simulation of traffic flow in a real urban network
  • Optimisation of coordinated traffic signal intersections
  • Flight planning in free route airspaces
  • Artificial intelligence in an action real-time strategy game
  • Optimising heat and power production using column generation

Who teaches Data Science and Artificial Intelligence?

Marco Chiarandini is fascinated by the fantastic journey that optimisers undertake to solve timetabling, scheduling and routing problems. The journey moves forward in the abstract world through problem communication, mathematical representation, algorithm design, implementation and experimental analysis. Finally, it returns to the real world with numbers that correspond to practicable decisions that yield systemic improvements and can ultimately improve our lives.

Fernando Colchero is a statistician with particular interest in developing and applying inference methods to understanding population dynamics and demographic trends across the tree of life. He uses Bayesian inference as the statistical framework to explore hidden processes that affect and drive natural populations.

Luís Cruz-Filipe has always enjoyed solving logical and mathematical puzzles. He combines this passion with his research by approaching problems in theoretical computer science in a creative manner, often using ideas from logic.

Lene Favrholdt likes to get to the core of an algorithmic problem. She strives to understand the essence of what results can be obtained, looking for precise and intuitive explanations of how and why they can (or cannot) be obtained. She finds communicating this understanding to colleagues and students in a clear and intuitive way very satisfying.

Yuri Goegebeur focuses on theoretical statistics, with main interests in extreme value theory, order statistics and asymptotic results. Although his work is largely theoretical, he has also applied extreme value methods to finance and insurance (claim size modelling, reinsurance), environmental science (ozone pollution and temperature), health science (modelling longevity) and geostatistics (earthquake magnitudes, diamond valuation).

Stefan Jänicke focuses on the development of novel visualisation solutions for a variety of application domains. He has gained experience in developing information visualisation and visual analytics techniques in numerous interdisciplinary research projects addressing current research questions in (digital) humanities, linguistics, social sciences, biology, and sports. His research interests relate to information visualisation and visual analytics with a focus on text-, time- and geovisualisation.

Konrad Krawczyk studies data science applications in health. His scientific focus spans the analysis of individual molecules and large-scale data mining in the context of public health. Within the field of immunoinformatics, he focuses on how bioinformatics methods can be harnessed to understand the biology of antibodies in their immunological context. He also carries out research in text mining of scientific publications and news articles to draw conclusions about public health information.

Hans Christian Petersen is a biological anthropologist working in applied statistics, especially focusing on methods and applications related to human, primate and general evolution. Research topics are morphology, multivariate statistics and ways of dealing with datasets with missing values.

Jing Qin is a mathematician whose research areas cover combinatorics, graph algorithms and their applications in RNA computational biology. Most recently, she has started working in extreme value statistics and its applications. She enjoys applying mathematics and statistics to solving real-life problems, although it sometimes means struggling between theoretically neat results and not-so-neat practical applications. Well, all struggles, eventually, lead to better scientific discoveries.

Richard Röttger is a computer scientist specialising in various fields of Bioinformatics. In his research he focuses on the analysis of biological networks and large-scale biomedical datasets with the aim of utilising existing information as efficiently as possible and extracting knowledge from the plethora of available biological data. To that end, he and his group employ state-of-the-art machine learning techniques like deep learning in order to research genes and proteins not in isolation but as a complex choreography of interactions leading to a deeper understanding of organisms and diseases.

Peter Schneider-Kamp is a computer scientist interested in all things AI. In his research, he applies machine learning and other AI methods to problems from a wide range of application domains including, but not limited to, software verification, hardware synthesis, image analysis, natural language processing, and autonomous systems such as drones.

Arthur Zimek is not only a computer scientist, but has also studied philosophy, where one of the fundamental questions is: “what can we know?” In data science, this question translates into the quest to understand the intuition on which machine learning methods are based, what assumptions these methods require, and how we can trust and interpret their results.


Student counsellors Faculty of Science University of Southern Denmark

  • Campusvej 55
  • Odense M - DK-5230
  • Phone: +45 6550 4387

Last Updated 15.03.2021