Updated information regarding what data is used can be found in the changelog page.
30-day time window
Events for a given individual and a given phenocode will be merged if they are less than or equal to 30 days apart. For example, if an individual has K11_APPENDACUT events at the following dates: 2000-01-01, 2000-01-20, 2000-02-10, 2000-02-28, then all these events will become a single event dated 2000-01-01.
This is done as an attempt to remove events that are follow-ups rather than initial diagnoses.
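The merging rule can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the function name `merge_events` and its `window_days` parameter are hypothetical. It assumes each event is compared with the event immediately before it, which is the reading consistent with the example above (2000-02-28 is more than 30 days after 2000-01-01, yet it is still merged).

```python
from datetime import date, timedelta


def merge_events(dates, window_days=30):
    """Collapse chains of events that are at most `window_days` apart.

    Assumption: each event is compared with the previous event, not with
    the first event of the chain. Returns the first date of each chain.
    """
    merged = []
    previous = None
    for current in sorted(dates):
        # Start a new chain when there is no previous event or the gap is too large.
        if previous is None or (current - previous) > timedelta(days=window_days):
            merged.append(current)
        previous = current
    return merged


events = [date(2000, 1, 1), date(2000, 1, 20), date(2000, 2, 10), date(2000, 2, 28)]
print(merge_events(events))  # [datetime.date(2000, 1, 1)]
```

The consecutive gaps here are 19, 21 and 18 days, so the four events collapse into one.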
Number of individuals having at least one event for a given phenocode, divided by the total number of individuals in the FinnGen study. No adjustment is made to account for the difference between the age distribution of the FinnGen cohort and that of the Finnish population.
Recurrence within 6 months
Number of individuals having two events for the given phenocode less than 6 months apart, divided by the number of individuals having at least one event for the given phenocode.
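As a rough sketch of this ratio, the snippet below counts individuals with two consecutive events less than 6 months apart. The names (`recurrence_within`, `SIX_MONTHS`) are hypothetical, and the text does not say how "6 months" is measured, so the 182-day threshold is an assumption made purely for illustration.

```python
from datetime import date, timedelta

# Assumption: "6 months" approximated as 182 days; the source does not define it.
SIX_MONTHS = timedelta(days=182)


def recurrence_within(events_by_individual, max_gap=SIX_MONTHS):
    """Share of individuals with >= 1 event who have two events less than `max_gap` apart."""
    with_event = [dates for dates in events_by_individual.values() if dates]
    if not with_event:
        return None  # ratio undefined when nobody has an event
    recurrent = 0
    for dates in with_event:
        ordered = sorted(dates)
        # Checking consecutive pairs is enough: the closest pair is always adjacent.
        if any(later - earlier < max_gap for earlier, later in zip(ordered, ordered[1:])):
            recurrent += 1
    return recurrent / len(with_event)


example = {
    "A": [date(2000, 1, 1), date(2000, 3, 1)],   # 60 days apart: counted
    "B": [date(2000, 1, 1)],                     # single event
    "C": [date(2000, 1, 1), date(2001, 1, 1)],   # one year apart
    "D": [],                                     # no event: excluded from the denominator
}
print(recurrence_within(example))
```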
Case fatality at 5 years
Number of individuals that died less than 5 years after the first event for the given phenocode, divided by the number of individuals having at least one event for the given phenocode.
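The same kind of sketch works for case fatality. The function name and input layout are hypothetical, and measuring 5 years as 5 × 365.25 days is an assumption; the text does not specify how the interval is computed.

```python
from datetime import date, timedelta

# Assumption: 5 years approximated as 5 * 365.25 days.
FIVE_YEARS = timedelta(days=5 * 365.25)


def case_fatality(first_event_and_death, horizon=FIVE_YEARS):
    """Share of individuals who died less than `horizon` after their first event.

    `first_event_and_death` maps an individual to (first_event_date, death_date),
    where death_date is None for individuals who have not died.
    """
    if not first_event_and_death:
        return None  # ratio undefined without any cases
    deaths = sum(
        1
        for first_event, death in first_event_and_death.values()
        if death is not None and death - first_event < horizon
    )
    return deaths / len(first_event_and_death)


cohort = {
    "A": (date(2000, 1, 1), date(2003, 6, 1)),    # died within 5 years
    "B": (date(2000, 1, 1), None),                # alive
    "C": (date(2000, 1, 1), date(2010, 1, 1)),    # died after 10 years
    "D": (date(2000, 1, 1), date(2004, 12, 31)),  # died just under 5 years
}
print(case_fatality(cohort))  # 0.5
```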
Most of the study follows the NB-COMO study.
- Start of study: 1998-01-01
- End of study: 2017-12-31
- Prevalent cases removed from the study.
- Ignore time before start of study for individuals having the prior-phenocode before the study starts.
- Split time into unexposed and exposed periods.
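The splitting of follow-up time described in the list above could look roughly like the following. This is a minimal sketch under stated assumptions, not the pipeline's implementation: the function `split_follow_up`, its parameters, and the tuple layout are all hypothetical. Removal of prevalent cases is assumed to happen before this step.

```python
from datetime import date

STUDY_START = date(1998, 1, 1)
STUDY_END = date(2017, 12, 31)


def split_follow_up(entry_date, exit_date, prior_date=None,
                    start=STUDY_START, end=STUDY_END):
    """Return (interval_start, interval_end, prior) tuples for one individual.

    `prior` is 0 for unexposed time and 1 for time after the prior-phenocode event.
    """
    begin = max(entry_date, start)  # follow-up cannot begin before the study starts
    stop = min(exit_date, end)      # ...nor continue past the end of the study
    if stop <= begin:
        return []                   # no follow-up time inside the study window
    if prior_date is None or prior_date >= stop:
        return [(begin, stop, 0)]   # never exposed during follow-up
    if prior_date <= begin:
        # Prior event before follow-up begins: earlier time is ignored,
        # so the whole follow-up counts as exposed.
        return [(begin, stop, 1)]
    return [(begin, prior_date, 0), (prior_date, stop, 1)]


print(split_follow_up(date(1970, 5, 1), date(2010, 1, 1), date(2005, 6, 1)))
```

Each individual then contributes one or two rows, which is the usual long format for a regression with a time-varying exposure.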
The model used is: y ~ prior + birth_year + sex
The regressions are done using the lifelines library.
Due to the sensitive nature of the data, the age when entering and leaving the study has an accuracy of 1 year.
Available on GitHub for both the data processing pipeline and the website.