SOIGNONS: Societal Games to Nudge People into Attending Cervical Cancer Screening

Project Leader: Mari Nygård

PostDoc: Tomás Ruiz-López

SOIGNONS is an eScience initiative to virally communicate health information about cervical cancer through gamification in mobile games. Games with a purpose will present information as thought-provoking puzzles and incentives to a large number of smartphone users. These games will raise awareness among individuals, including sub-optimally screened groups, through puzzles built on popularized scientific evidence about cervical cancer prevention (and eventually other cancers). Incentives based on an individual’s performance in the game will empower players to share electronic invitations to cancer screening via social platforms such as Facebook and Google+, and even basic SMS, for maximum outreach. We expect gamification to create an evidence-based medium through which mobile users can nudge women in their families and circles of friends, and to serve as a general societal empowerment tool for reaching people who have not been screened. Social nudging lies in the very nature of human behavior: we are actively involved in the lives of our friends and relatives and discuss lifestyle choices with them. SOIGNONS will boost nudging based on accurate scientific evidence, which is expected to improve overall societal health, increase attendance at necessary doctor visits, and encourage better lifestyle choices. About 50% of cervical cancers in Norway arise among the 20% of the population that is sub-optimally screened [4]. Games developed in SOIGNONS will present users with evidence and incentives to target this population. Multilingual support for English, Icelandic, and Norwegian will help the games reach the largest possible number of users, and the games can be downloaded for free from the app store. SOIGNONS has the following specific aims:
Specific Aim 1: Societal game design and content creation: Health-related scientific evidence will be mapped onto puzzles and incentives in a game suite called Solve Cervical Cancer (SCC). SCC will have variants based on familiar classic games, such as trivia, hangman, and detective games, that are played widely on mobile phones by people of different ages. Scientific experts in public health will develop content for the game puzzles from scientific evidence, using an online web application. Users will play games to gain specific knowledge and will be empowered with invites to nudge their own social circle to attend screening (including by self-sampling).
Specific Aim 2: Incentives and dissemination of game for social nudging: The game will be distributed through the app stores for both iPhone and Android and will spread virally as users invite others. Incentives in the game will be designed to motivate players to encourage screening among their own family and friends, and especially, via SMS, among those who do not use modern mobile phones. Those in the target group for cervical cancer screening will be able to access their own data in the screening program database (login through BankID).
Specific Aim 3: Evaluation: Attitude, knowledge and user friendliness will be evaluated by analyzing the large amount of user data generated and stored on a cloud server. The pattern by which the app spreads in the population will be evaluated using information about how many invites were sent out and accepted, and we will collect user feedback on whether users have been able to nudge a friend or relative to go to screening. Using the screening registry, we will evaluate how many users queried their own screening history. Subsequent screening attendance will be observed via registry linkage, allowing us to assess the effect of the game on sub-optimally screened groups.
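The invite-based part of the evaluation in Specific Aim 3 could be summarized along the following lines. This is only a minimal sketch: the `Invite` record, its field names, and the channel labels are hypothetical and are not taken from the SOIGNONS design. It computes the share of sent invites that were accepted, per dissemination channel.

```python
from dataclasses import dataclass


@dataclass
class Invite:
    """Hypothetical record of one screening invite sent from the game."""
    sender_id: str
    channel: str   # assumed labels, e.g. "sms" or "facebook"
    accepted: bool


def acceptance_rate_by_channel(invites):
    """Return {channel: accepted invites / sent invites} for each channel seen."""
    sent = {}
    accepted = {}
    for inv in invites:
        sent[inv.channel] = sent.get(inv.channel, 0) + 1
        if inv.accepted:
            accepted[inv.channel] = accepted.get(inv.channel, 0) + 1
    return {ch: accepted.get(ch, 0) / n for ch, n in sent.items()}


# Made-up example data, for illustration only.
example = [
    Invite("u1", "sms", True),
    Invite("u1", "sms", False),
    Invite("u2", "facebook", True),
    Invite("u3", "sms", False),
    Invite("u3", "sms", True),
]
rates = acceptance_rate_by_channel(example)
print(rates)
```

Comparing such rates across channels would be one simple way to see where viral spreading works best; linking acceptance to actual screening attendance would still require the registry linkage described above.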

Nordic Biobank Registers

Project Leader: Mads Melbye

PostDoc: Xueping Liu

We will focus on the existing and fully operational Danish Biobank Register system. This system, available online since 2012, supports register-based health research with flexible and fast search functionality for viewing biological materials stored in various biobanks in Denmark and Greenland and in the Genetic Biobank in the Faroe Islands. For example, the Biobank Register integrates data from the Danish Patient Registry (including the Danish Cancer Registry), the Danish Pathology Registry, and the Danish Civil Registration Registry and automatically matches them with biological material from individuals, including blood, tissue, and other sample types. At present, the Biobank Register allows searches among 15.5 million samples from over 5.1 million Danish individuals. For example, the register points to a biological specimen for 568 000 persons who have been diagnosed with cancer. A large collection of genotyped samples is also available here.
The Janus Serum Bank is a population-based biobank reserved for cancer research. The specimens were collected between 1972 and 2004 and are stored at −25 °C. The samples originate from 317 000 persons in Norway who participated in health studies, and from blood donors in and around Oslo. Today, samples are collected only from earlier Janus Serum Bank donors who have developed cancer. The Bank is internationally unique in its size and number of cancer cases. Annual linkage to the Cancer Registry shows that 61 000 donors had been diagnosed with cancer as of December 31, 2011.
HUNT Biobank is the biobank for the comprehensive, longitudinal HUNT study as well as a national biobank for Cohorts of Norway (CONOR), with DNA samples from 250 000 participants in the large Norwegian health surveys gathered at one physical site. In total, more than 107 000 unique participants have contributed biosamples, many with multiple samples from different time points stored in the biobank. As of 2010, approximately 8000 HUNT participants and 15 000 participants from the CONOR studies had developed cancer.
In HUNT, an interactive single-nucleotide polymorphism (SNP) database has recently been established in which researchers can look up specific SNPs available across the different genotyping efforts based on sample collections (studies) at the HUNT biobank. This solution dynamically connects all aspects of the genotype data, including study characteristics, genotyping technologies, and minor allele frequencies of relevant SNPs.
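As an illustration of one quantity such a database reports, the minor allele frequency (MAF) of a biallelic SNP can be derived from genotype counts. The sketch below is not the HUNT implementation; the function name, the study names, and the counts are hypothetical.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """MAF of a biallelic SNP from counts of the three genotype classes."""
    n_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if n_alleles == 0:
        raise ValueError("no genotyped samples")
    # Each heterozygote carries one alternate allele, each alt homozygote two.
    alt_freq = (2 * n_alt_hom + n_het) / n_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


# Made-up genotype counts (ref/ref, ref/alt, alt/alt) for one SNP in two studies.
studies = {"study_A": (810, 180, 10), "study_B": (20, 160, 320)}
maf_by_study = {name: minor_allele_frequency(*counts)
                for name, counts in studies.items()}
print(maf_by_study)
```

In the second hypothetical study the alternate allele is the common one, so the MAF is the frequency of the reference allele; this is why the same SNP can be compared across studies regardless of which allele each genotyping platform treats as the reference.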
The existing systems on which we would like to focus are the Danish Biobank Register, the HUNT Study, the Norwegian Cancer Registry, and the Janus Biobank. However, during the study period we will also seek to start including other Nordic biobanks in the programme.

Developing an efficient imputation pipeline to construct near-complete genome variant information in GWAS datasets

Project Leader: Aarno Palotie

PostDoc: Priit Palta

The project aims to use population-specific whole-genome and whole-exome sequence data as a backbone for imputing low-frequency variants in Estonian and Finnish population cohort GWAS data, and to use the imputed data to study register-based diagnostic outcomes such as cancers and comorbidities.
Over the past eight years, genome-wide variant data have been accumulated from large sample collections at all NIASC sites. Currently the two performance sites, Tartu and Helsinki, have accumulated GWAS data from 70 000 Finnish and 20 000 Estonian individuals and whole-exome or whole-genome sequence data from 16 000 Finnish and 1800 Estonian individuals. These large datasets provide a substantial resource for association studies. To use all genome-wide variant data efficiently, we would also like to include low-frequency variants in the outcome association analysis. To achieve this, we have to impute the non-genotyped variants in the GWAS datasets. Although HapMap and 1000 Genomes data provide a foundation and a standardized imputation backbone, there is increasing evidence that these panels are not sufficient for low-frequency variants. Population-specific sequence data substantially improve imputation accuracy for variants with a population frequency under 5%. Estonia and Finland are historically and linguistically closely related. Comparing low-frequency variant association data between these two populations is thus especially interesting and potentially beneficial. As replication is challenging for low-frequency variants, we hypothesize that similarly imputed datasets from two closely related populations would be helpful, because haplotypes are more likely to be shared.
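One common way to quantify imputation accuracy for a single variant is the squared Pearson correlation between imputed allele dosages and genotypes observed by sequencing in the same individuals. The sketch below assumes this measure; the text above does not state which accuracy metric the project uses, and the function name and example values are hypothetical.

```python
def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0, 1 or 2 alternate
    alleles) and imputed dosages (expected allele counts between 0 and 2)."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Correlation is undefined when either vector is constant.
        raise ValueError("variant is monomorphic in this sample")
    return cov * cov / (var_t * var_d)


# Made-up example: sequenced genotypes versus dosages from an imputation run.
truth = [0, 0, 1, 2, 0, 1]
imputed = [0.1, 0.0, 0.9, 1.8, 0.2, 0.7]
r2 = dosage_r2(truth, imputed)
print(round(r2, 3))
```

For low-frequency variants most individuals carry no copies of the minor allele, so a handful of mis-imputed carriers can lower this value sharply, which is one reason a well-matched reference panel matters most in that frequency range.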
The Haplotype Reference Consortium, led by Goncalo Abecasis, Jonathan Marchini and Richard Durbin, is currently constructing a haplotype catalogue based on available whole-genome data. This will further improve our imputation accuracy. However, as most of the haplotype project uses low-coverage sequence data (2-6X), the calling accuracy for rare variants will remain limited. This is especially challenging for variants that are rare in the general European population but are enriched through bottleneck effects in either the Finnish or the Estonian population. As is well documented, the Finnish bottleneck effects are strong, resulting in enrichment of some low-frequency variants that are very rare elsewhere. Some of these variants contribute to disease phenotypes but are so rare in most populations that they are beyond the reach of disease association studies. However, when enriched in an isolate like Finland, their frequency might be boosted to 0.5-5%, as demonstrated in Figure 2 below, so that they become analyzable targets for disease association. Of special interest is that within the 0.5-5% population frequency range in Finland there is an excess of loss-of-function (LoF) variants. LoF variants are of special interest in association studies as they represent human knockouts. In our recent study by Lim et al. (PLoS Genetics 2014 Jul 31;10(7):e1004494), we analyzed 83 LoF variants enriched in Finland and linked them to National Health Record data. We identified several disease associations, including a LoF variant in the LPA gene that is protective against coronary heart disease. Protective LoF variants are interesting potential drug targets and thus of special value.
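The selection of enriched low-frequency variants described above can be sketched as a simple filter on allele frequencies. The 0.5-5% window comes from the text; the minimum fold-enrichment threshold, the function name, and the example frequencies are assumptions made only for illustration.

```python
def enriched_low_frequency(variants, low=0.005, high=0.05, min_fold=5.0):
    """Return names of variants whose isolate frequency lies in [low, high]
    and is at least min_fold times the frequency in the comparison population.

    variants: iterable of (name, isolate_af, comparison_af) tuples.
    min_fold is an assumed cut-off, not a value taken from the project.
    """
    selected = []
    for name, isolate_af, comparison_af in variants:
        in_window = low <= isolate_af <= high
        # A variant not observed at all elsewhere counts as enriched.
        enriched = comparison_af == 0 or isolate_af / comparison_af >= min_fold
        if in_window and enriched:
            selected.append(name)
    return selected


# Made-up allele frequencies: (variant, frequency in isolate, frequency elsewhere).
candidates = [
    ("var1", 0.020, 0.001),   # in window and 20-fold enriched
    ("var2", 0.030, 0.020),   # in window but only 1.5-fold enriched
    ("var3", 0.001, 0.000),   # too rare even in the isolate
    ("var4", 0.100, 0.010),   # enriched but above the 5% window
]
print(enriched_low_frequency(candidates))
```

In practice the comparison frequencies would come from a reference panel of non-Finnish Europeans, and the filter would be followed by functional annotation to pick out the LoF variants.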

Computational methods for genetic cancer susceptibility analysis

Project Leader: Lauri Aaltonen

PostDoc: Kimmo Palin

The aims of the project are to develop methods (i) for computational annotation and visualisation of DNA variants found in Whole-Genome Sequencing (WGS) studies, (ii) for genetic association and linkage studies in structured populations, (iii) for subclassification of cancer patients based on their constitutional genetic and environmental attributes and the attributes of their tumor and (iv) for detection of gene-environment and gene-gene interactions for rare cancer risk variants. These aims are tightly aligned with the methodological needs of the two host groups.
Both host laboratories are currently undertaking large-scale sequencing and genotyping projects. The Helsinki group focuses on detailed genome sequencing of ~250 individual Finnish colorectal cancer patients and their tumors, whereas the Trondheim group is sequencing several thousand Norwegian individuals from the HUNT cohort in less detail but with wider representation of the general population. In addition, both groups are genotyping significantly larger sets of individuals from the same cohorts. These large data production projects have already required substantial methods development (e.g. RikuRator, SLRP: Systematic Long Range Phasing). The host groups are well prepared to provide mentoring for leading-edge methods development. The UH group employs three computer science PhDs and maintains close collaboration with the UH Computer Science department, which is also part of the Finnish Centre of Excellence in Cancer Genetic Research. The project has access to a substantial high-performance computing and data storage environment provided by CSC – IT Center for Science Ltd, enabling the use of very large datasets and resource-intensive computation. The current setup includes 1277 CPU cores and 405 terabytes of storage. The combination of the two genetic and epidemiological data sources from separate but closely related populations provides great potential for detecting cancer-relevant genetic variants, but requires novel methods development to be leveraged fully. The differential structure between the populations makes it possible to tease apart causative variants from bystanders, while the close relationship makes it more likely that the same causative variant is present in both populations. The detailed clinical information available for the UH samples provides an opportunity to discover subclasses of patients with potentially altered disease etiology, and the practical relevance of the discoveries can be rapidly tested in the NTNU-HUNT set.