DONG Energy: Validation of measured data

DONG Energy collects a large number of time series (more than 15,000) covering pressure measurements, fuel feed, electricity production, heat production, steam flow, water flow and much more. The measurements are used for billing and tax calculations. Previously the measurements were taken once a month, but new calculation and billing methods have created a need for more frequent measurements, at intervals of one hour or as little as 5 minutes. DONG Energy would like to be able to validate these data and ensure their quality. Simple rules are possible, such as upper and lower limits for the observations, or the assumption that repeated identical observations are caused by errors. However, it would be interesting to have more complex validations across the data.
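The two simple rules mentioned above (limits on the observations and runs of repeated identical values) could be sketched roughly as follows. This is only an illustration: the function names, the limits and the minimum run length are assumptions, and the readings are made-up values, not DONG Energy data.

```python
from itertools import groupby


def out_of_range(values, lower, upper):
    """Return the indices of observations outside [lower, upper]."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def stuck_runs(values, min_run=3):
    """Return (start_index, run_length) for each run of identical
    consecutive values that is at least min_run observations long."""
    runs = []
    start = 0
    for _, group in groupby(values):
        length = len(list(group))
        if length >= min_run:
            runs.append((start, length))
        start += length
    return runs


# Made-up hourly readings for illustration only.
readings = [10.2, 10.4, 10.4, 10.4, 10.4, 250.0, 10.1, -3.0]

print(out_of_range(readings, lower=0.0, upper=100.0))  # [5, 7]
print(stuck_runs(readings, min_run=4))                 # [(1, 4)]
```

Checks like these treat each series in isolation; a validation across series would additionally compare related measurements with each other.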

The data comes from an internal database. It is highly confidential and must be treated as such.

DONG Energy: Searchable catalogue about market situations

In connection with the daily production planning for the DONG Energy power plants, it would be interesting to be able to search for previous dates that resemble the current situation, based on parameters such as weather forecasts, cable capacity, total production capacity, hydro balance, etc.
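One possible way to find previous dates that resemble the current situation is a nearest-neighbour search over standardised parameters. The sketch below assumes that each date is described by a small set of numeric features; the feature names, dates and values are invented for illustration, and Euclidean distance is just one of several reasonable similarity measures.

```python
import math
from statistics import mean, pstdev


def most_similar_days(history, today, k=3):
    """Rank historical dates by distance to today's feature values.

    history: dict mapping a date string to a dict of numeric features.
    today:   dict with the same feature names.
    Features are standardised so that no single unit dominates.
    """
    names = sorted(today)
    scale = {}
    for name in names:
        column = [features[name] for features in history.values()]
        spread = pstdev(column)
        scale[name] = (mean(column), spread if spread > 0 else 1.0)

    def standardise(features):
        return [(features[n] - scale[n][0]) / scale[n][1] for n in names]

    target = standardise(today)
    ranked = sorted(
        history,
        key=lambda day: math.dist(standardise(history[day]), target),
    )
    return ranked[:k]


# Invented example values, not real market data.
history = {
    "2015-01-10": {"wind_forecast_mw": 3000, "cable_capacity_mw": 1500, "hydro_balance_twh": -5},
    "2015-02-03": {"wind_forecast_mw": 500, "cable_capacity_mw": 1700, "hydro_balance_twh": 10},
    "2015-03-21": {"wind_forecast_mw": 2800, "cable_capacity_mw": 1450, "hydro_balance_twh": -4},
}
today = {"wind_forecast_mw": 2900, "cable_capacity_mw": 1500, "hydro_balance_twh": -4}

print(most_similar_days(history, today, k=2))  # ['2015-01-10', '2015-03-21']
```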

Most data will be publicly available and can be supplemented with relevant internal data.


Svendborg Municipality: Tool for prioritising building maintenance

Need: a tool that, without heavy use of resources, can quickly suggest a re-prioritisation when new tasks need to be fitted into the existing budget.

The department’s purpose is to maintain the municipality’s buildings. Within a defined budget, the department must plan maintenance on selected buildings, taking into account parameters such as where the need is most urgent and where the scope of the project will provide the greatest value for the building and for the money spent.

The maintenance needs are determined from a status registration for each building, which is updated every third year. The prioritisation of which buildings should be maintained during the current period is based on a predetermined categorisation of the building together with the status of the different parts of the building, some of which can have a higher priority than others. (For example, a damaged roof can have a higher priority than other parts of the building, since it can cause damage to the rest of the building if it is not repaired.)

The priority is determined by a wide range of factors and assessments (budgets, resources, regulations, etc.), all of which must be incorporated into the new tool, if possible.
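As a very rough sketch of how such a tool could suggest a re-prioritisation, each task could be given a score from the building category, the weight of the building part and its registered condition, and tasks could then be selected in score order until the budget is used. The scoring formula, the field names and all numbers below are assumptions made for illustration; a real tool would also have to reflect resources and regulations.

```python
def plan_maintenance(tasks, budget):
    """Greedy plan: take tasks in descending score order while they fit.

    Returns the names of the selected tasks and the unspent budget.
    """
    def score(task):
        # Assumed scoring: higher weights and worse condition rank first.
        return task["category_weight"] * task["part_weight"] * task["condition"]

    chosen, remaining = [], budget
    for task in sorted(tasks, key=score, reverse=True):
        if task["cost"] <= remaining:
            chosen.append(task["name"])
            remaining -= task["cost"]
    return chosen, remaining


# Hypothetical tasks; costs in thousands, condition 1 (good) to 4 (bad).
tasks = [
    {"name": "School roof", "category_weight": 3, "part_weight": 3, "condition": 4, "cost": 400},
    {"name": "Town hall windows", "category_weight": 2, "part_weight": 1, "condition": 3, "cost": 250},
    {"name": "Library heating", "category_weight": 2, "part_weight": 2, "condition": 4, "cost": 300},
    {"name": "Depot facade", "category_weight": 1, "part_weight": 1, "condition": 2, "cost": 100},
]

print(plan_maintenance(tasks, budget=800))
# (['School roof', 'Library heating', 'Depot facade'], 0)
```

When a new task arrives, it can be appended to the list and the plan recomputed, which gives a fast first suggestion. A greedy selection is not guaranteed to be optimal; it is used here only because it is simple and quick.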


The National Archives (Rigsarkivet): When did the industrial revolution begin in Denmark?

The National Archives wishes to investigate whether it is possible to determine the precise date and location of the beginning of the industrial revolution, based on national migration.

Taking the censuses from 1845 to 1885 as a starting point, the National Archives wants an analysis of citizens’ mobility, with an emphasis on questions such as: Who moved? From where to where? Did more people move from one part of the country than from another? Did the migration change over the period, measured by intensity, population groups or occupational groups?

Collecting and organising these data will also contribute knowledge to the social sciences (geo-social mobility among families), the health sciences (family relations and epigenetics), and genealogy.
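Following individuals from one census to the next requires recognising the same person even when the name is recorded differently. A minimal sketch of one possible approach, using the string similarity from Python's standard `difflib` module, is shown here. The function name, the token-wise comparison and any threshold for accepting a match are assumptions; a real linkage would also use birth year, parish, household members and similar fields.

```python
from difflib import SequenceMatcher


def name_similarity(name_a, name_b):
    """Score two full names between 0 and 1.

    Each word of the shorter name is compared with every word of the
    longer name, and the best matches are averaged. An extra first name
    in one census therefore does not lower the score.
    """
    tokens_a = name_a.lower().split()
    tokens_b = name_b.lower().split()
    if len(tokens_a) > len(tokens_b):
        tokens_a, tokens_b = tokens_b, tokens_a
    if not tokens_a:
        return 0.0
    best = [
        max(SequenceMatcher(None, a, b).ratio() for b in tokens_b)
        for a in tokens_a
    ]
    return sum(best) / len(best)


same_person = name_similarity("Kristine Jensdatter", "Anne Christine Jensdatter")
other_person = name_similarity("Kristine Jensdatter", "Hans Pedersen")
print(same_person > other_person)  # True
```

A change of last name, for example to a husband's name, cannot be detected by spelling similarity alone and needs additional evidence.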

Problem: In older censuses, a person can be recorded with different spellings of their name. For example, Kristine Jensdatter can be listed as Anne Christine Jensdatter in the next census, or may even have changed her last name from Jensdatter to Jensen or to her husband’s last name, as became common in the middle of the 19th century.


Intelligent Banker: Scaling of advertisements

Intelligent Banker makes extensive use of online advertisements to attract visitors/traffic to its more than 50 web portals. The company applies sophisticated machine learning techniques to improve the efficiency of its advertisement strategy, but is facing scalability issues. The issues stem from the fact that there are more than 30,000 different ads, which Intelligent Banker needs to cluster by various features such as text, country, and type. Intelligent Banker has examined various solutions to the problem, but since clustering at this scale is highly difficult, the expectation is that a parallel solution running on ABACUS will be able to provide better results.

The goal of this project is a sophisticated solution that can beat Intelligent Banker’s current state-of-the-art implementation (using Affinity Propagation). Intelligent Banker sees the project in three parts: define, implement, and evaluate.
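To illustrate where the scalability problem comes from: Affinity Propagation works on similarities between all pairs of items, so the work grows quadratically with the number of ads. The sketch below shows a hypothetical similarity over the three features mentioned above and counts the pairs involved. The field names, the equal weighting and the example ads are assumptions and do not describe Intelligent Banker's actual implementation.

```python
def ad_similarity(a, b, text_weight=0.5):
    """Combine word overlap (Jaccard) with matches on country and type."""
    words_a = set(a["text"].lower().split())
    words_b = set(b["text"].lower().split())
    union = words_a | words_b
    jaccard = len(words_a & words_b) / len(union) if union else 1.0
    categorical = ((a["country"] == b["country"]) + (a["type"] == b["type"])) / 2
    return text_weight * jaccard + (1 - text_weight) * categorical


def pair_count(n):
    """Number of distinct unordered pairs among n ads."""
    return n * (n - 1) // 2


# Invented example ads.
ad_1 = {"text": "Compare bank loans", "country": "DK", "type": "banner"}
ad_2 = {"text": "Compare car loans", "country": "DK", "type": "text"}

print(ad_similarity(ad_1, ad_2))  # 0.5
print(pair_count(30_000))         # 449985000
```

Because each pair can be scored independently, the similarity computation itself is a natural candidate for parallel execution.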


IMADA: The Lotto problem

The Lotto problem is about creating a system that, every time you play it, guarantees you a minimum of 4 correct numbers.

Theoretically, it should be possible to construct a system of 62 rows that ensures this result. Such a system has not yet been constructed; the best known system has 250 rows. How can we construct a system of only 62 rows?

The paradox: Theoretically, we should be able to secure 4 correct numbers with 62 rows, but every time we do the calculations, we end up with 250 rows.
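Whatever construction is tried, a candidate system can be checked by brute force: go through every possible draw and confirm that at least one row has the required number of correct numbers. The sketch below does this for a tiny, made-up game. The function name and parameters are assumptions, and the real Lotto pool and draw sizes are not stated here, so they are left as arguments.

```python
from itertools import combinations


def guarantees(rows, pool_size, draw_size, min_hits):
    """Return True if, for every possible draw, at least one row shares
    min_hits or more numbers with the draw."""
    rows = [frozenset(row) for row in rows]
    for draw in combinations(range(1, pool_size + 1), draw_size):
        drawn = set(draw)
        if not any(len(row & drawn) >= min_hits for row in rows):
            return False
    return True


# Toy game: numbers 1..4, two numbers drawn, at least one hit required.
print(guarantees([{1, 2}, {3, 4}], pool_size=4, draw_size=2, min_hits=1))  # True
print(guarantees([{1, 2}], pool_size=4, draw_size=2, min_hits=1))          # False
```

The number of draws to check grows combinatorially with the pool size, so a full-size game needs a more careful implementation, but the check itself stays the same.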