Principles and tools for valid quantitative data – defend your hypothesis not your data

The course was last scheduled in 2021.

Course description

Quantitative empirical research must be based on valid data. Validity rests on securing proper metadata, on composing variables appropriately for the research topic under study (the conceptual model), and on the methods applied from data definition through to analysis. The course mixes discussions with practical exercises in securing quality-assured and validated data. The exercises use EpiData, Inkscape and Stata software. The principles are general to quality assurance of quantitative empirical data and may be applied regardless of which specific software and database you use.

Elements contained in the course are

Considerations in the creation of data structures from a conceptual model
How to create data structures and documentation at project and variable level
How to clean raw data from scratch for analysis with appropriate documentation, including principles for determining the number of observations with sufficient information and the level of missing data
How to combine official and informal help found by searching the internet, in particular for “community software” and Stata

After the course it is expected that the participants

Understand principles and standards for handling empirical data from scratch to analysis.
Are introduced to general programming principles applied in reproducible data management.
Can apply principles of data validation (double entry, visual verification, completeness and conformity to data definitions) with EpiData software.
Can create a conceptual model of their own study with Inkscape.
Have gained basic experience in preparing a documented, analysis-ready dataset from scratch in terms of appropriate metadata, number of observations and handling of missing data.
Are able to install open-source software (EpiData and Inkscape).
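
As an illustration of the double-entry principle named above, the following is a minimal Python sketch, not EpiData's own procedure. It assumes each entry pass is available as a mapping from record ID to field values; the function name, record IDs, field names and values are all hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def compare_double_entry(
    first: Dict[str, Record], second: Dict[str, Record]
) -> List[Tuple[str, str, str, str]]:
    """Return (record_id, field, first_value, second_value) for each disagreement."""
    discrepancies = []
    for record_id in sorted(set(first) | set(second)):
        a = first.get(record_id)
        b = second.get(record_id)
        # A record entered in only one pass is reported as a whole.
        if a is None or b is None:
            discrepancies.append((
                record_id,
                "<record>",
                "present" if a is not None else "missing",
                "present" if b is not None else "missing",
            ))
            continue
        # Otherwise compare field by field; an absent field counts as empty.
        for field in sorted(set(a) | set(b)):
            value_a = a.get(field, "")
            value_b = b.get(field, "")
            if value_a != value_b:
                discrepancies.append((record_id, field, value_a, value_b))
    return discrepancies


# Hypothetical data: the same paper forms keyed in twice.
entry_1 = {"001": {"age": "34", "sex": "F"}, "002": {"age": "51", "sex": "M"}}
entry_2 = {
    "001": {"age": "43", "sex": "F"},
    "002": {"age": "51", "sex": "M"},
    "003": {"age": "29", "sex": "F"},
}
result = compare_double_entry(entry_1, entry_2)
for row in result:
    print(row)
```

Each reported discrepancy would then be resolved against the original source document rather than by choosing one of the two entries.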

Participants are expected to spend time between course days reading scientific papers documenting data quality and completing exercises and solutions. Bring your own computer (Mac/Windows/Linux) and make sure you have administrator rights to install software (or know who to contact for installation).

For course approval the participants must

Create a conceptual model for their own project (in SVG format).
Create a publication-ready graph (vector graphic) for a given scientific journal, based on a “raw” analytic graph.
Document key metadata elements from their own study in the form of elements from the Dublin Core Standard (typical bibliographic descriptors).
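
As a rough illustration of the metadata requirement, the sketch below lists the fifteen elements of the Dublin Core Metadata Element Set and reports which of them a study record leaves empty. It is only one possible way to keep track of the descriptors. The study values and the helper name are hypothetical, and the course does not prescribe this format.

```python
from typing import Dict, List

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def missing_elements(record: Dict[str, str]) -> List[str]:
    """Return the Dublin Core elements that are absent or blank in the record."""
    return [name for name in DC_ELEMENTS if not record.get(name, "").strip()]


# Hypothetical study record with only a few descriptors filled in.
study = {
    "title": "Example cohort study (hypothetical)",
    "creator": "A. Researcher",
    "date": "2021-01-01",
    "language": "en",
    "format": "text/csv",
}
missing = missing_elements(study)
print("Still to document:", ", ".join(missing))
```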

Course Literature (examples)

Danish Code of Conduct for Research Integrity (chapter 2).
Paulsen A, Overgaard S, Lauritsen JM. Quality of data entry using single entry, double entry and automated forms processing--an example based on a study of patient-reported outcomes. PLoS One. 2012;7(4):e35087.
Ohmann C, Canham S, Demotes J, Chêne G, Lauritsen J, et al. Raising standards in clinical research: The impact of the ECRIN data centre certification programme, 2011–2016. Contemp Clin Trials Commun. 2017;5:153-159.
Rieder HL, Lauritsen JM. Quality assurance of data: ensuring that numbers reflect operational definitions and contain real measurements. Int J Tuberc Lung Dis. 2011 Mar;15(3):296-304.

Number of participants

12 PhD students

Course fee

The course is free of charge for PhD students enrolled at universities that have joined the "Open market agreement".

Graduate Programme

Public Health

Course director

Jens Lauritsen

ECTS credits

2 ECTS
