Lab Data
Epigenome data analysis for: Mouse_tissue_differentiation_erythroblasts_v3
Introduction
This website summarizes the data and results of epigenome profiling for the following project: Mouse_tissue_differentiation_erythroblasts_v3.
This project was published in: Shearstone, J.R., Pop, R., Bock, C., Boyle, P., Meissner, A., Socolovsky, M. "Global DNA Demethylation During Mouse Erythropoiesis in vivo." Science Vol. 334 no. 6057 pp. 799-802. PMID: 22076376 (2011)
We recommend the following approach for inspecting the results and generating hypotheses about relevant biology:
(i) Open the clustered sample heatmaps for gene promoters and CpG islands to see which samples group together.
(ii) Inspect the scatterplots to assess the magnitude of change and any evidence of global biases. Several slightly different types of scatterplots are provided, and observations are considered reliable only if they are present in all of these alternative plots.
(iii) Check the coverage pie charts to see whether some samples exhibit significantly higher or lower coverage than others. If so, take extra care to distinguish true biological effects from potential biases in sequencing coverage.
(iv) Check the boxplot diagrams for strong differences in total DNA methylation. Interpret such differences very carefully, because they are often the result of coverage biases rather than true biological differences.
(v) If all of the above checks suggest that the data are reliable and do not exhibit strong biases, the significance barcharts and significance tables provide statistically sound data on the number of differential regions.
(vi) Look at the epigenome profiles themselves. All tracks can be loaded into the UCSC Genome Browser by following the links in the Data section. This is also a good time to look for changes in known regions of interest.
(vii) For a more systematic look at epigenetic differences between pairs of samples, download the corresponding tables in the Supplementary Tables section and open them in Excel. Sort by the most appropriate p-value or qvalClass column, then copy-paste the chromosomal position (first three columns) of each top hit into the UCSC Genome Browser and inspect the results.
(viii) It is also possible to load lists of differential regions into the UCSC Genome Browser using the links in the Differential Regions section.
(ix) If an analysis contains multiple replicates per sample group, an additional "groupwise comparison" table in the Supplementary Tables section helps identify differences that are shared among replicates.
(x) We are always happy to help you interpret your data. Please contact us with any questions that you may have.
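For step (vii), the copy-paste of chromosomal positions can also be scripted. The following is a minimal sketch, assuming the first three columns of a table row are chromosome, start, and end, as stated above. The function names and the example rows are hypothetical and do not come from the project tables. Whether the start coordinate is 0-based or 1-based depends on the table, so no offset is applied here.

```python
def ucsc_position(chrom, start, end):
    """Format one region as a 'chrom:start-end' string for the UCSC position box."""
    return f"{chrom}:{int(start)}-{int(end)}"


def positions_from_rows(rows):
    """Build position strings from the first three columns of each row."""
    return [ucsc_position(*row[:3]) for row in rows]


# Hypothetical rows, assumed to be already sorted by p-value; not real results.
example_rows = [
    ["chr6", "52100000", "52102000", "0.0001"],
    ["chr11", "32180000", "32181500", "0.0040"],
]

positions = positions_from_rows(example_rows)
for position in positions:
    print(position)
```

Each printed string can be pasted into the position field of the UCSC Genome Browser.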
Data
Each of the links below opens epigenome profile tracks in the UCSC Genome Browser. For RRBS, "CpG methylation" reports DNA methylation data per CpG, summarizing over all observed reads. "RRBS reads" shows DNA methylation data for each read separately. "RRBS fragments" reports DNA methylation and total sequencing coverage for all DNA fragments that were sequenced. Finally, "HOXA cluster (QC)" opens the "CpG methylation" track focused on the HOXA cluster, which provides a quick visual quality check. A more detailed explanation of these browser tracks is available in the RRBS track documentation sheet. For ChIP-seq, MeDIP, and MethylCap, all tracks plot the (absolute or normalized) read density across the genome. Reads were extended downstream to a default length of 300 bp before read frequencies were calculated.
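The read extension described for ChIP-seq, MeDIP, and MethylCap can be illustrated as follows. This is a minimal sketch and not the pipeline's actual code: it assumes 0-based, half-open coordinates, treats "downstream" as the read's own 5' to 3' direction, leaves reads that are already 300 bp or longer unchanged, and ignores chromosome ends other than position 0. All function names are hypothetical.

```python
from collections import Counter

FRAGMENT_LENGTH = 300  # default extension length stated in the text


def extend_read(start, end, strand, length=FRAGMENT_LENGTH):
    """Extend a read downstream to `length` bp (0-based, half-open coordinates)."""
    if end - start >= length:
        return start, end
    if strand == "+":
        # Plus strand: downstream means toward larger coordinates.
        return start, start + length
    # Minus strand: downstream means toward smaller coordinates.
    return max(0, end - length), end


def read_density(reads, length=FRAGMENT_LENGTH):
    """Count how many extended reads cover each base position."""
    depth = Counter()
    for start, end, strand in reads:
        ext_start, ext_end = extend_read(start, end, strand, length)
        for pos in range(ext_start, ext_end):
            depth[pos] += 1
    return depth


# Two hypothetical 36 bp reads on opposite strands.
reads = [(1000, 1036, "+"), (1200, 1236, "-")]
depth = read_density(reads)
print(extend_read(1000, 1036, "+"))
print(extend_read(1200, 1236, "-"))
print(depth[1100])
```

A per-base counter is used only for readability. A genome-wide computation would normally use interval-based coverage instead.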
Analysis
The following diagrams summarize the statistical analysis of all included epigenome data. A detailed description of each diagram type may be added in the future. For the time being, please contact us if any diagram is not self-explanatory.
Genomic Region Tracks
Each of the links below opens genomic region tracks in the UCSC Genome Browser. The first few tracks report the epigenetic state of each region in each sample, whereas the remaining tracks depict genomic regions that exhibit epigenetic differences between samples.
Supplementary Tables
The following tables contain the raw data behind all analyses reported above, and a wealth of candidate regions for biological interpretation and experimental follow-up. While the amount of information may be overwhelming, these tables can reasonably be analyzed with Excel. Once a table is loaded into Excel, we recommend hiding or deleting all columns that are not directly relevant to the comparison of interest, sorting by the p-value or qvalClass column of interest, and inspecting the top hits using the UCSC Genome Browser (with all epigenome tracks from the Data section loaded).
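The same workflow (drop irrelevant columns, sort, inspect the top hits) can be done outside Excel. The sketch below assumes a tab-separated table with a header row and a numeric p-value column. The column names and values in the example are made up for illustration and do not reflect the actual table layout. Sorting by a qvalClass column is not shown because its format is not described here.

```python
import csv
import io


def top_hits(table_text, keep_columns, sort_column, n=10):
    """Sort rows ascending by a numeric column, keep selected columns, return the first n."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = sorted(reader, key=lambda row: float(row[sort_column]))
    return [{col: row[col] for col in keep_columns} for row in rows[:n]]


# Hypothetical tab-separated table; column names and values are invented.
example = (
    "chrom\tstart\tend\tmeth_A\tmeth_B\tpvalue_A_vs_B\tother\n"
    "chr1\t100\t200\t0.90\t0.20\t0.0005\tx\n"
    "chr2\t300\t400\t0.50\t0.45\t0.4000\ty\n"
    "chr3\t500\t600\t0.80\t0.10\t0.0001\tz\n"
)

hits = top_hits(
    example,
    keep_columns=["chrom", "start", "end", "pvalue_A_vs_B"],
    sort_column="pvalue_A_vs_B",
    n=2,
)
for hit in hits:
    print(hit)
```

For a downloaded file, the text passed to `top_hits` would come from reading that file. The smallest p-values appear first, matching an ascending sort in Excel.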