One of the major misconceptions of our time is that cardiovascular disease (CVD) is an isolated problem of the Western world. Instead, CVD will, in 2012, surpass malaria as the leading cause of death worldwide (WHO). CVDs, mainly in the form of heart attacks and strokes, already kill some 17 million people per year, nearly one-third of annual deaths globally. Within a decade, heart attacks and strokes will become the leading causes of both death and disability worldwide, with the number of fatalities projected to increase to over 20 million by 2020 and 24 million by 2030. As CVD continues its rapid spread across the globe, the effects will be similar to those of the deadliest pandemics in our history. Large numbers of young adults will suffer from premature CVD due to type 2 diabetes (no longer "the disease of the old"). As in previous pandemics, low- and middle-income families in developing countries will suffer disproportionately. Today's CVD care, which focuses on treating clinically manifest CVD, will not be able to meet a CVD threat of pandemic proportions: reactive care is too expensive and in most instances lifelong, and is therefore simply not available to the majority of people living in emerging economies.
Our Solution and Goal
Our goal is to provide the scientific community with a network-based understanding of the molecular pathology underlying CAD and atherosclerosis in general. By understanding the locations and architectures of CAD networks, we can i) target central regulatory hubs, which are experimentally proven drug targets that, unlike peripheral network targets, can largely prevent disease network activity; ii) use network activity as a predictor of atherosclerosis burden and mirror this activity with easily obtainable footprints in blood; and iii) from this, create a data-driven computational tool to calculate early CAD/MI (myocardial infarction) risk in any given individual. This tool can also be used to monitor the effects of individualized treatments. In sum, we foresee that predictive, preventive and personalized CAD care will evolve from understanding disease network biology. Data-driven segmentation and treatment of patients will enable us to take on, and eventually win, the battle against the rapid spread of CVDs across the globe.
We sample patients suffering from CVD (i.e. coronary and carotid artery disease, CAD). In addition to careful clinical characterization and DNA, we gather tissue biopsies for RNA isolation, both from atherosclerotic plaques and from metabolic tissues (see Figure to the right). Today we have close to 1000 CAD patients, with 6-10 RNA samples from each of them. According to our calculations, this number will be sufficient to dismantle most CAD networks. This assumption holds if, unlike the many risk factors leading to CAD, the molecular events driving CAD (as represented in CAD networks) are largely similar across all CAD patients, a notion supported by our own and others' historical global gene expression data as well as by immunohistopathological imaging.
Genetically Modified Mouse and Cell Models
We use mouse models to understand how CAD networks defined in humans evolve over time and to evaluate the effects of network "hubs" on atherosclerosis development. We use the apoB100/100ldlr–/–Mttpflox/flox-Mx-1Cre mouse model, which carries 4 genetic modifications that, besides rendering the mouse atherosclerosis-prone (apoB100/100ldlr–/–), also allow LDL cholesterol to be lowered at any given time point during the lifespan (Mttpflox/flox-Mx-1Cre) in order to induce and study atherosclerosis regression. These mice develop premature atherosclerosis due to a plasma lipoprotein profile dominated by LDL cholesterol, similar to that of familial hypercholesterolemia in humans. We also use atherosclerosis cell models to reveal the exact cellular architecture of identified CAD networks. By combining systemic perturbations (acetylated LDL) with specific gene perturbations (siRNA), it is possible to dissect the details of cellular CAD networks.
We avoid measuring the activity of a single gene or a limited number of genes and instead try, whenever possible, to use genomic screens of DNA and RNA samples gathered from the patients and model systems. In this fashion, we generate several hundred thousand measurements per patient or model system. We like to believe that in this way we are monitoring all activities associated with a given disease situation or perturbation. This approach generates huge amounts of data (Exabyte scale), which we store in a database that we refer to as the "AtheroCode". It is named AtheroCode because we believe that this database will eventually hold the data and information necessary to decode the molecular basis of atherosclerosis. Besides raw data, we store all our computations and clinical information in this database. We also try to link AtheroCode to other databases holding existing knowledge about CAD/atherosclerosis and to publicly available -omic databases.
Given the amount of data generated, traditional statistics and calculations are not feasible. Instead, we use computer-supported algorithms to infer gene networks. Previously, we used supercomputers for our computations, but we are currently setting up cloud computing solutions provided by Amazon Inc. We find this solution more flexible in terms of capacity and user access, which makes it easier to share data with our collaborators. To infer gene networks, we use a combination of co-expression and Bayesian network algorithms. We also integrate other types of -omic data into our analysis to improve network resolution and thereby also improve network-based prediction of disease. We try to avoid developing new algorithms and instead use state-of-the-art, well-tested algorithms from published papers as well as those provided to us by our collaborators in Europe and the US. Once we have lists of network genes, a first but highly valuable validation is to investigate the inherited CAD/MI-risk enrichment of expression SNPs (eSNPs) using several genome-wide association (GWA) databases on CAD/MI available to the group.
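To make the idea of a co-expression network concrete, the following is a minimal sketch in Python, not the algorithms actually used by the group. It assumes the simplest possible approach: two genes are linked when the absolute Pearson correlation of their expression profiles reaches a threshold, and candidate "hubs" are simply the genes with the most links. The gene names, expression values, threshold and function names are all made up for illustration, and the Bayesian network step is not covered.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_network(expression, threshold=0.8):
    """Return {gene: set of neighbours}, linking genes with |r| >= threshold."""
    network = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            network[g1].add(g2)
            network[g2].add(g1)
    return network


def rank_hubs(network):
    """Genes sorted by number of neighbours (degree), highest first."""
    return sorted(network, key=lambda g: len(network[g]), reverse=True)


# Toy data (hypothetical): expression of four invented genes in five samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [5.0, 4.1, 2.9, 2.0, 1.1],
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}

network = coexpression_network(toy_expression, threshold=0.9)
hubs = rank_hubs(network)
```

In this toy example GENE_A, GENE_B and GENE_C end up mutually connected, while GENE_D, whose profile does not track the others, remains isolated. With real data the pairwise comparison grows quadratically with the number of genes, which is one reason large-scale computing resources are needed.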