R/data.R
zhang2014.Rd
Data Preprocessing: In the accompanying paper, the CEL files from three mouse liver Affymetrix microarray time-series expression sets, Hogenesch 2009 (GSE11923; Hughes et al., 2009), Hughes 2012 (GSE30411; Hughes et al., 2012), and Zhang 2014 (GSE54650; Zhang et al., 2014), were downloaded from the Gene Expression Omnibus (GEO) database. In each experiment, wild-type C57BL/6J mice were entrained to a 12-h light, 12-h dark environment before being released into constant darkness. Mouse age, length of entrainment, time of sampling, and sampling resolution vary by experiment. The data were subsequently normalized by robust multi-array average (RMA) using the affy R package (Gautier et al., 2004) and checked for quality using the oligo R package (Carvalho and Irizarry, 2010), following each package’s vignette. Because each GEO data set used a different microarray platform (affy_mouse430_2, affy_moex_1_0_st_v1, affy_mogene_1_0_st_v1), each had a different set of probes, so a common set of features had to be identified to compare across microarrays. Probes for each data set were mapped to genes based on prealigned databases specific to each microarray (mouse4302.db, moex10sttranscriptcluster.db, mogene10sttranscriptcluster.db). Multiple probes corresponding to a single gene were aggregated by taking the mean expression. A final common set of 12,868 genes across all three microarray platforms was used for subsequent analysis. See the supplement for code.
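The probe-to-gene aggregation and common-gene steps described above can be illustrated with a minimal base R sketch. This is not the code from the supplement. The expression values, probe IDs, gene names, column names, and the helper aggregate_by_gene() are all hypothetical, and a named character vector stands in for the mapping that would in practice come from the platform annotation databases.

```r
# Toy probe-level matrices for two hypothetical platforms (made-up values).
# Rows are probes, columns are time points. matrix() fills column by column.
expr_a <- matrix(c(1, 3, 5, 7,
                   2, 4, 6, 8),
                 nrow = 4,
                 dimnames = list(c("p1", "p2", "p3", "p4"), c("t1", "t2")))
expr_b <- matrix(c(10, 20, 40, 50,
                   11, 21, 41, 51),
                 nrow = 4,
                 dimnames = list(c("q1", "q2", "q3", "q4"), c("t1", "t2")))

# Hypothetical probe-to-gene lookups (stand-ins for annotation database output).
map_a <- c(p1 = "GeneA", p2 = "GeneA", p3 = "GeneB", p4 = "GeneC")
map_b <- c(q1 = "GeneB", q2 = "GeneC", q3 = "GeneC", q4 = "GeneD")

# Average all probes that map to the same gene; drop unmapped probes.
aggregate_by_gene <- function(expr, probe_to_gene) {
  genes <- unname(probe_to_gene[rownames(expr)])
  keep <- !is.na(genes)
  sums <- rowsum(expr[keep, , drop = FALSE], group = genes[keep])
  counts <- as.vector(table(genes[keep])[rownames(sums)])
  sums / counts  # each row is divided by its own probe count
}

gene_a <- aggregate_by_gene(expr_a, map_a)
gene_b <- aggregate_by_gene(expr_b, map_b)

# Keep only the genes measured on every platform.
common_genes <- Reduce(intersect, list(rownames(gene_a), rownames(gene_b)))
gene_a_common <- gene_a[common_genes, , drop = FALSE]
gene_b_common <- gene_b[common_genes, , drop = FALSE]
```

With three platforms, the same Reduce(intersect, ...) call takes a list of three sets of row names. Averaging is only one way to combine probes; the sketch uses it because it is the aggregation the description names.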
zhang2014
A data frame with 12868 rows (genes) and 24 variables (ZT time points)
Zhang, R, Lahens, NF, Ballance, HI, Hughes, ME, Hogenesch, JB (2014) A circadian gene expression atlas in mammals: implications for biology and medicine. Proc Natl Acad Sci USA 111:16219-16224. DOI: 10.1073/pnas.1408886111