Data content

Several layers of information have been integrated in AlzBase. Gene expression information covers Alzheimer’s disease (AD), non-dementia, related diseases and aging. Information on correlation with disease severity covers cortical atrophy, Braak stage, MMSE and NFT scores. Other annotation information comes from the Allen Brain Atlas, the GWAS catalogue, eQTL studies and the CTD drug database. In addition, gene-gene correlation can also be retrieved. A summary of AlzBase statistics is shown in Table 1.


Table 1 Data content of AlzBase as of August 30, 2014.


Data Integration

As shown in Figure 1, three categories of information have been integrated into AlzBase: differential gene expression, other gene annotations and gene-gene correlation. A comprehensive summary of the top genes from AlzBase and AD genetics is also provided.



Figure 1 Data processing and data content of AlzBase.


Core datasets on Alzheimer's disease

The core datasets include transcriptome data from both brain and blood. The brain datasets cover multiple stages of disease development, including aging, non-dementia, early AD and late AD, and span several brain regions and sub-regions. The blood datasets also cover several stages, including aging, MCI, mild/moderate AD and severe AD. For details please refer to Tables 2a & 2b. Most of the datasets were acquired from the Gene Expression Omnibus (GEO) at NCBI. The RankProd algorithm was used for differential gene expression analysis.
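To illustrate the rank-product idea that RankProd is built on, here is a minimal sketch. It is not the AlzBase pipeline: it assumes one log fold change per gene per case-control comparison, looks only at up-regulation, and leaves out the permutation-based significance estimates that the RankProd package provides. The function names and the gene labels are hypothetical.

```python
import math


def rank_descending(values):
    """Return 1-based ranks; rank 1 goes to the largest value (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_product(log_ratios):
    """Geometric mean of each gene's rank across comparisons.

    log_ratios: dict mapping gene -> list of log fold changes,
                one value per comparison (all lists the same length).
    A small rank product means the gene is consistently near the top,
    i.e. consistently up-regulated in this sketch.
    """
    genes = list(log_ratios)
    n_comparisons = len(log_ratios[genes[0]])
    log_rank_sums = {gene: 0.0 for gene in genes}
    for k in range(n_comparisons):
        column = [log_ratios[gene][k] for gene in genes]
        for gene, rank in zip(genes, rank_descending(column)):
            # Sum logs instead of multiplying ranks to avoid overflow.
            log_rank_sums[gene] += math.log(rank)
    return {gene: math.exp(total / n_comparisons)
            for gene, total in log_rank_sums.items()}


# Hypothetical toy data: three genes, three comparisons.
example = {
    "GENE_A": [2.0, 1.5, 1.8],
    "GENE_B": [0.1, -0.2, 0.3],
    "GENE_C": [-1.0, 0.5, -0.7],
}
scores = rank_product(example)
for gene in sorted(scores, key=scores.get):
    print(gene, round(scores[gene], 3))
```

In this toy input GENE_A ranks first in every comparison, so it receives the smallest rank product. Ranking in ascending order instead would highlight consistently down-regulated genes.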

Table 2a Detailed data source part_A.

* Mixed brain tissues: Control: temporal cortex (73%), frontal cortex (21%), parietal cortex (2%), cerebellar cortex (3%); Case: temporal cortex (60%), frontal cortex (18%), parietal cortex (10%), cerebellar cortex (13%).
# UC Davis & BIG: These datasets are currently private. Sample collection and the microarray experiments were supported by BIG and UC Davis.

Table 2b Complementary source of other datasets studied on AD.

Datasets on related diseases and aging

Publicly available brain transcriptome datasets for other neurological disorders have been integrated into AlzBase. These include datasets for Parkinson’s disease, Huntington’s disease, schizophrenia, bipolar disorder, autism and some other diseases. The details are shown in Table 3. As with the core datasets, the RankProd algorithm was used for differential gene expression analysis.


Table 3 Detailed data source part_B.

For in-depth analysis of brain aging, we collected several brain transcriptome datasets covering a wide age span. The datasets are listed in Table 4. Because aging studies follow a time-series design rather than a case-control design, a linear correlation algorithm was used to select critical aging genes.
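As a rough illustration of correlation-based selection of aging genes, the sketch below scores each gene by the Pearson correlation between donor age and expression level, then orders genes by the absolute value of that correlation. Pearson correlation is an assumption here, since the text only says "linear correlation". The function names, gene labels and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant series has no linear trend; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_aging_genes(ages, expression):
    """Return (gene, r) pairs sorted by |r|, strongest age association first.

    ages:       list of donor ages, one per sample
    expression: dict mapping gene -> list of expression values,
                in the same sample order as `ages`
    """
    scores = {gene: pearson_r(ages, values)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: four donors, three genes.
ages = [20, 40, 60, 80]
expression = {
    "GENE_UP": [1.0, 2.0, 3.0, 4.0],
    "GENE_DOWN": [8.0, 6.0, 4.0, 2.0],
    "GENE_FLAT": [5.0, 5.0, 5.0, 5.0],
}
for gene, r in rank_aging_genes(ages, expression):
    print(gene, round(r, 3))
```

A positive coefficient indicates expression that rises with age and a negative one expression that falls with age. A real analysis would also need a significance threshold and multiple-testing correction, which this sketch omits.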


Table 4 Detailed data source part_C.

Some previous studies have reported that people with type 2 diabetes (T2D) have a higher risk of developing AD. Therefore, we collected several T2D transcriptome datasets covering blood, muscle, liver and islet. The datasets are listed in Table 5. For consistency, the RankProd algorithm was used for differential gene expression analysis.


Table 5 Detailed data source part_D.

Annotations of the genes in AlzBase

The annotations come from the following sources:
1) Correlation with AD severity, including Braak stage, cortical atrophy, MMSE score and NFT score. The information was retrieved from three published studies.
2) Annotations from the Allen Brain Atlas (http://human.brain-map.org/). These include brain expression, molecular function, pathways and biological processes.
3) Phenotypic information from the GWAS catalogue, curated from large GWAS studies (https://www.genome.gov/26525384#download).
4) Regulatory SNPs from brain eQTL studies.
5) Drug-gene interactions from the CTD drug database (http://ctdbase.org/).

Gene-gene correlation

Besides information on gene dysregulation and annotations, we also analyzed the relationships among the genes in AlzBase.

These include:
1) brain gene co-expression extracted from the brain co-expression network,
2) composite correlation patterns measured by normalized mutual information (NMI) using the data in Tables 2a and 2b.
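For readers unfamiliar with NMI, the sketch below computes it for two genes whose behaviour across datasets has been reduced to discrete labels. Several details are assumptions made for illustration and are not taken from AlzBase: the "up"/"down"/"none" labels, the normalization by the geometric mean of the two entropies (other normalizations exist), and all function names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((count / n) * math.log(count / n)
                for count in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information (natural log) between two equal-length label sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in count_xy.items():
        p_ab = count / n
        p_a = count_x[a] / n
        p_b = count_y[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def normalized_mutual_information(x, y):
    """NMI = I(X;Y) / sqrt(H(X) * H(Y)); defined as 0 if either entropy is 0."""
    h_x = entropy(x)
    h_y = entropy(y)
    if h_x == 0 or h_y == 0:
        return 0.0
    return mutual_information(x, y) / math.sqrt(h_x * h_y)


# Hypothetical patterns: one label per dataset for each gene.
gene_1 = ["up", "up", "down", "down", "none", "none"]
gene_2 = ["down", "down", "up", "up", "none", "none"]
print(round(normalized_mutual_information(gene_1, gene_2), 3))
```

Unlike a linear correlation coefficient, NMI is high whenever one pattern predicts the other, so two genes that are consistently regulated in opposite directions score as strongly related.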


Footnote: For references please refer to our AlzBase paper (in submission) and previous works (listed in "About us").