Dr Chunxiao Song

Research Area: Genetics and Genomics
Technology Exchange: Bioinformatics, Protein interaction and Transcript profiling
Scientific Themes: Genetics & Genomics and Cancer Biology
Keywords: Epigenetics, Chemical Biology, DNA modifications, Cancer

The goal of our research is to develop and apply novel tools to probe epigenetic modifications and thereby understand their functions in human health and disease. To do so, we combine chemical biology, biophysics and genomic approaches to analyze the epigenome. We also use the epigenetic information in body fluids for non-invasive disease diagnostics, including early detection of cancer. In addition, we are investigating the epigenetic heterogeneity of tumors to understand the contribution of epigenetics to cancer.

Dynamic interplay between epigenetic marks

Although much has been learnt about how dynamic changes in epigenetic modifications regulate transcription and cellular differentiation, much remains unknown and technological improvements are required to enable more sensitive and specific analyses. We develop single-molecule tools to study the dynamic spatial and temporal interplay among various DNA, histone and RNA epigenetic marks to discover novel epigenetic regulators and mechanisms.

Epigenetic-based diagnostics

The epigenetic state is tightly regulated, and misregulation of epigenetic patterns is a hallmark of many human diseases, including multiple types of cancer. We develop sensitive tools to explore the wealth of epigenetic information in body fluids, such as circulating cell-free DNA, to broaden the use of new epigenetic biomarkers for non-invasive disease diagnostics, including early cancer detection and treatment monitoring.

Epigenetic heterogeneity of tumors

Epigenetic factors are known to play pivotal roles in tumor development and maintenance. It is also becoming clear that tumor tissue is highly heterogeneous in its genome, transcriptome and epigenome. We aim to develop novel single-cell technologies to study the epigenetic heterogeneity of tumors. Defining the cell lineage within a tumor based on epigenetic profiling may help to identify critical events in tumor development and to guide cancer prevention and therapeutics.

Name | Department | Institution | Country
Professor Skirmantas Kriaucionis | Oxford Ludwig Institute | Oxford University, Old Road Campus Research Building | United Kingdom
Professor Xin Lu | Oxford Ludwig Institute | Oxford University, Old Road Campus Research Building | United Kingdom
Professor Jianjun Chen | Department of Cancer Biology | University of Cincinnati | United States
Dr Benjamin Schuster-Böckler | Oxford Ludwig Institute | Oxford University, Old Road Campus Research Building | United Kingdom

Pan W, Ngo TTM, Camunas-Soler J, Song C-X, Kowarsky M, Blumenfeld YJ, Wong RJ, Shaw GM, Stevenson DK, Quake SR. 2017. Simultaneously Monitoring Immune Response and Microbial Infections during Pregnancy through Plasma cfRNA Sequencing. Clin Chem, 63 (11), pp. 1695-1704.

BACKGROUND: Plasma cell-free RNA (cfRNA) encompasses a broad spectrum of RNA species that can be derived from both human cells and microbes. Because cfRNA is fragmented and of low concentration, it has been challenging to profile its transcriptome using standard RNA-seq methods. METHODS: We assessed several recently developed RNA-seq methods on cfRNA samples. We then analyzed the dynamic changes of both the human transcriptome and the microbiome of plasma during pregnancy from 60 women. RESULTS: cfRNA reflects a well-orchestrated immune modulation during pregnancy: an up-regulation of antiinflammatory genes and an increased abundance of antimicrobial genes. We observed that the plasma microbiome remained relatively stable during pregnancy. The bacteria Ureaplasma shows an increased prevalence and increased abundance at postpartum, which is likely to be associated with postpartum infection. We demonstrated that cfRNA-seq can be used to monitor viral infections. We detected a number of human pathogens in our patients, including an undiagnosed patient with a high load of human parvovirus B19 virus (B19V), which is known to be a potential cause of complications in pregnancy. CONCLUSIONS: Plasma cfRNA-seq demonstrates the potential to simultaneously monitor immune response and microbial infections during pregnancy.

Song C-X, Yin S, Ma L, Wheeler A, Chen Y, Zhang Y, Liu B, Xiong J, Zhang W, Hu J et al. 2017. 5-Hydroxymethylcytosine signatures in cell-free DNA provide information about tumor types and stages. Cell Res, 27 (10), pp. 1231-1242.

5-Hydroxymethylcytosine (5hmC) is an important mammalian DNA epigenetic modification that has been linked to gene regulation and cancer pathogenesis. Here we explored the diagnostic potential of 5hmC in circulating cell-free DNA (cfDNA) using a sensitive chemical labeling-based low-input shotgun sequencing approach. We sequenced cell-free 5hmC from 49 patients of seven different cancer types and found distinct features that could be used to predict cancer types and stages with high accuracy. Specifically, we discovered that lung cancer leads to a progressive global loss of 5hmC in cfDNA, whereas hepatocellular carcinoma and pancreatic cancer lead to disease-specific changes in the cell-free hydroxymethylome. Our proof-of-principle results suggest that cell-free 5hmC signatures may potentially be used not only to identify cancer types but also to track tumor stage in some cancers.

Qing Y, Tian Z, Bi Y, Wang Y, Long J, Song C-X, Diao J. 2017. Quantitation and mapping of the epigenetic marker 5-hydroxymethylcytosine. Bioessays, 39 (5), pp. 1700010-1700010.

We here review primary methods used in quantifying and mapping 5-hydroxymethylcytosine (5hmC), including global quantification, restriction enzyme-based detection, and methods involving DNA-enrichment strategies and the genome-wide sequencing of 5hmC. As discovered in the mammalian genome in 2009, 5hmC, oxidized from 5-methylcytosine (5mC) by ten-eleven translocation (TET) dioxygenases, is increasingly being recognized as a biomarker in biological processes from development to pathogenesis, as its various detection methods have shown. We focus in particular on an ultrasensitive single-molecule imaging technique that can detect and quantify 5hmC from trace samples and thus offer information regarding the distance-based relationship between 5hmC and 5mC when used in combination with fluorescence resonance energy transfer.

Yang YA, Zhao JC, Fong K-W, Kim J, Li S, Song C, Song B, Zheng B, He C, Yu J. 2016. FOXA1 potentiates lineage-specific enhancer activation through modulating TET1 expression and function. Nucleic Acids Res, 44 (17), pp. 8153-8164.

Forkhead box A1 (FOXA1) is an FKHD family protein that plays pioneering roles in lineage-specific enhancer activation and gene transcription. Through genome-wide location analyses, here we show that FOXA1 expression and occupancy are, in turn, required for the maintenance of these epigenetic signatures, namely DNA hypomethylation and histone 3 lysine 4 methylation. Mechanistically, this involves TET1, a 5-methylcytosine dioxygenase. We found that FOXA1 induces TET1 expression via direct binding to its cis-regulatory elements. Further, FOXA1 physically interacts with the TET1 protein through its CXXC domain. TET1 thus co-occupies FOXA1-dependent enhancers and mediates local DNA demethylation and concomitant histone 3 lysine 4 methylation, further potentiating FOXA1 recruitment. Consequently, FOXA1 binding events are markedly reduced following TET1 depletion. Together, our results suggest that FOXA1 is not only able to recognize but also remodel the epigenetic signatures at lineage-specific enhancers, which is mediated, at least in part, by a feed-forward regulatory loop between FOXA1 and TET1.

Song C-X, Diao J, Brunger AT, Quake SR. 2016. Simultaneous single-molecule epigenetic imaging of DNA methylation and hydroxymethylation. Proc Natl Acad Sci U S A, 113 (16), pp. 4338-4343.

The modifications 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) are the two major DNA epigenetic modifications in mammalian genomes and play crucial roles in development and pathogenesis. Little is known about the colocalization or potential correlation of these two modifications. Here we present an ultrasensitive single-molecule imaging technology capable of detecting and quantifying 5hmC and 5mC from trace amounts of DNA. We used this approach to perform single-molecule fluorescence resonance energy transfer (smFRET) experiments which measure the proximity between 5mC and 5hmC in the same DNA molecule. Our results reveal high levels of adjacent and opposing methylated and hydroxymethylated CpG sites (5hmC/5mCpGs) in mouse genomic DNA across multiple tissues. This identifies the previously undetectable and unappreciated 5hmC/5mCpGs as one of the major states for 5hmC in the mammalian genome and suggests that they could function in promoting gene expression.

Huang H, Jiang X, Wang J, Li Y, Song C-X, Chen P, Li S, Gurbuxani S, Arnovitz S, Wang Y et al. 2016. Identification of MLL-fusion/MYC⊣miR-26⊣TET1 signaling circuit in MLL-rearranged leukemia. Cancer Lett, 372 (2), pp. 157-165.

Expression of functionally important genes is often tightly regulated at both transcriptional and post-transcriptional levels. We reported previously that TET1, the founding member of the TET methylcytosine dioxygenase family, plays an essential oncogenic role in MLL-rearranged acute myeloid leukemia (AML), where it is overexpressed owing to MLL-fusion-mediated direct up-regulation at the transcriptional level. Here we show that the overexpression of TET1 in MLL-rearranged AML also relies on the down-regulation of miR-26a, which directly negatively regulates TET1 expression at the post-transcriptional level. Through inhibiting expression of TET1 and its downstream targets, forced expression of miR-26a significantly suppresses the growth/viability of human MLL-rearranged AML cells, and substantially inhibits MLL-fusion-mediated mouse hematopoietic cell transformation and leukemogenesis. Moreover, c-Myc, an oncogenic transcription factor up-regulated in MLL-rearranged AML, mediates the suppression of miR-26a expression at the transcriptional level. Collectively, our data reveal a previously unappreciated signaling pathway involving the MLL-fusion/MYC⊣miR-26a⊣TET1 signaling circuit, in which miR-26a functions as an essential tumor-suppressor mediator and its transcriptional repression is required for the overexpression and oncogenic function of TET1 in MLL-rearranged AML. Thus, restoration of miR-26a expression/function holds therapeutic potential to treat MLL-rearranged AML.

Yu M, Song C-X, He C. 2015. Detection of mismatched 5-hydroxymethyluracil in DNA by selective chemical labeling. Methods, 72 (C), pp. 16-20.

How DNA demethylation is achieved in mammals is still under extensive investigation. One proposed mechanism is deamination of 5-hydroxymethylcytosine to form 5-hydroxymethyluracil (5hmU), followed by base excision repair to replace the mismatched 5hmU with cytosine. In this process, 5hmU:G mispair serves as a key intermediate and its localization and distribution in mammalian genome could be important information to investigate the proposed pathway. Here we describe a selective labeling method to map mismatched 5hmU. After converting other cytosine modifications to 5-carboxylcytosines, a biotin tag is installed onto mismatched 5hmU through β-glucosyltransferase-catalyzed glucosylation and click chemistry. The enriched 5hmU-containing DNA fragments can be subject to subsequent sequencing to reveal the distribution of 5hmU:G mispair with base-resolution information acquired.

Lu X, Han D, Zhao BS, Song C-X, Zhang L-S, Doré LC, He C. 2015. Base-resolution maps of 5-formylcytosine and 5-carboxylcytosine reveal genome-wide DNA demethylation dynamics. Cell Res, 25 (3), pp. 386-389.

Yang A, Zhao J, Wu L, Kim JA, Fong W, Jin H, Song C, He C, Yu J. 2014. TET1-Mediated DNA Demethylation and Epigenomic Regulation in Prostate Cancer. Endocrine Reviews, 35 (3).

Madzo J, Liu H, Rodriguez A, Vasanthakumar A, Sundaravel S, Caces DBD, Looney TJ, Zhang L, Lepore JB, Macrae T et al. 2014. Hydroxymethylation at gene regulatory regions directs stem/early progenitor cell commitment during erythropoiesis. Cell Rep, 6 (1), pp. 231-244.

Hematopoietic stem cell differentiation involves the silencing of self-renewal genes and induction of a specific transcriptional program. Identification of multiple covalent cytosine modifications raises the question of how these derivatized bases influence stem cell commitment. Using a replicative primary human hematopoietic stem/progenitor cell differentiation system, we demonstrate dynamic changes of 5-hydroxymethylcytosine (5-hmC) during stem cell commitment and differentiation to the erythroid lineage. Genomic loci that maintain or gain 5-hmC density throughout erythroid differentiation contain binding sites for erythroid transcription factors and several factors not previously recognized as erythroid-specific factors. The functional importance of 5-hmC was demonstrated by impaired erythroid differentiation, with augmentation of myeloid potential, and disrupted 5-hmC patterning in leukemia patient-derived CD34+ stem/early progenitor cells with TET methylcytosine dioxygenase 2 (TET2) mutations. Thus, chemical conjugation and affinity purification of 5-hmC-enriched sequences followed by sequencing serve as resources for deciphering functional implications for gene expression during stem cell commitment and differentiation along a particular lineage.

Wen L, Li X, Yan L, Tan Y, Li R, Zhao Y, Wang Y, Xie J, Zhang Y, Song C et al. 2014. Whole-genome analysis of 5-hydroxymethylcytosine and 5-methylcytosine at base resolution in the human brain. Genome Biol, 15 (3), pp. R49.

BACKGROUND: 5-methylcytosine (mC) can be oxidized by the tet methylcytosine dioxygenase (Tet) family of enzymes to 5-hydroxymethylcytosine (hmC), which is an intermediate of mC demethylation and may also be a stable epigenetic modification that influences chromatin structure. hmC is particularly abundant in mammalian brains but its function is currently unknown. A high-resolution hydroxymethylome map is required to fully understand the function of hmC in the human brain. RESULTS: We present genome-wide and single-base resolution maps of hmC and mC in the human brain by combined application of Tet-assisted bisulfite sequencing and bisulfite sequencing. We demonstrate that hmCs increase markedly from the fetal to the adult stage, and in the adult brain, 13% of all CpGs are highly hydroxymethylated with strong enrichment at genic regions and distal regulatory elements. Notably, hmC peaks are identified at the 5' splicing sites at the exon-intron boundary, suggesting a mechanistic link between hmC and splicing. We report a surprising transcription-correlated hmC bias toward the sense strand and an mC bias toward the antisense strand of gene bodies. Furthermore, hmC is negatively correlated with H3K27me3-marked and H3K9me3-marked repressive genomic regions, and is more enriched at poised enhancers than active enhancers. CONCLUSIONS: We provide single-base resolution hmC and mC maps in the human brain and our data imply novel roles of hmC in regulating splicing and gene expression. Hydroxymethylation is the main modification status for a large portion of CpGs situated at poised enhancers and actively transcribed regions, suggesting its roles in epigenetic tuning at these regions.

Hon GC, Song C-X, Du T, Jin F, Selvaraj S, Lee AY, Yen C-A, Ye Z, Mao S-Q, Wang B-A et al. 2014. 5mC oxidation by Tet2 modulates enhancer activity and timing of transcriptome reprogramming during differentiation. Mol Cell, 56 (2), pp. 286-297.

In mammals, cytosine methylation (5mC) is widely distributed throughout the genome but is notably depleted from active promoters and enhancers. While the role of DNA methylation in promoter silencing has been well documented, the function of this epigenetic mark at enhancers remains unclear. Recent experiments have demonstrated that enhancers are enriched for 5-hydroxymethylcytosine (5hmC), an oxidization product of the Tet family of 5mC dioxygenases and an intermediate of DNA demethylation. These results support the involvement of Tet proteins in the regulation of dynamic DNA methylation at enhancers. By mapping DNA methylation and hydroxymethylation at base resolution, we find that deletion of Tet2 causes extensive loss of 5hmC at enhancers, accompanied by enhancer hypermethylation, reduction of enhancer activity, and delayed gene induction in the early steps of differentiation. Our results reveal that DNA demethylation modulates enhancer activity, and its disruption influences the timing of transcriptome reprogramming during cellular differentiation.

Chen C-C, Xiao S, Xie D, Cao X, Song C-X, Wang T, He C, Zhong S. 2013. Understanding variation in transcription factor binding by modeling transcription factor genome-epigenome interactions. PLoS Comput Biol, 9 (12), pp. e1003367.

Despite explosive growth in genomic datasets, the methods for studying epigenomic mechanisms of gene regulation remain primitive. Here we present a model-based approach to systematically analyze the epigenomic functions in modulating transcription factor-DNA binding. Based on the first principles of statistical mechanics, this model considers the interactions between epigenomic modifications and a cis-regulatory module, which contains multiple binding sites arranged in any configurations. We compiled a comprehensive epigenomic dataset in mouse embryonic stem (mES) cells, including DNA methylation (MeDIP-seq and MRE-seq), DNA hydroxymethylation (5-hmC-seq), and histone modifications (ChIP-seq). We discovered correlations of transcription factors (TFs) for specific combinations of epigenomic modifications, which we term epigenomic motifs. Epigenomic motifs explained why some TFs appeared to have different DNA binding motifs derived from in vivo (ChIP-seq) and in vitro experiments. Theoretical analyses suggested that the epigenome can modulate transcriptional noise and boost the cooperativity of weak TF binding sites. ChIP-seq data suggested that epigenomic boost of binding affinities in weak TF binding sites can function in mES cells. We showed in theory that the epigenome should suppress the TF binding differences on SNP-containing binding sites in two people. Using personal data, we identified strong associations between H3K4me2/H3K9ac and the degree of personal differences in NFκB binding in SNP-containing binding sites, which may explain why some SNPs introduce much smaller personal variations on TF binding than other SNPs. In summary, this model presents a powerful approach to analyze the functions of epigenomic modifications. This model was implemented into an open source program APEG (Affinity Prediction by Epigenome and Genome, http://systemsbio.ucsd.edu/apeg).

Song C-X, He C. 2013. Potential functional roles of DNA demethylation intermediates. Trends Biochem Sci, 38 (10), pp. 480-484.

DNA methylation in the form of 5-methylcytosine (5mC) is a key epigenetic regulator in mammals, and the dynamic balance between methylation and demethylation impacts various processes from development to disease. The recent discovery of the enzymatic generation and removal of the oxidized derivatives of 5mC, namely 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) in mammalian cells has led to a paradigm shift in our understanding of the demethylation process. Interestingly, emerging evidence indicates that these DNA demethylation intermediates are dynamic and could themselves carry regulatory functions. Here, we discuss 5hmC, 5fC, and 5caC as new epigenetic DNA modifications that could have distinct regulatory functions in conjunction with potential protein partners.

Huang H, Jiang X, Li Z, Li Y, Song C-X, He C, Sun M, Chen P, Gurbuxani S, Wang J et al. 2013. TET1 plays an essential oncogenic role in MLL-rearranged leukemia. Proc Natl Acad Sci U S A, 110 (29), pp. 11994-11999.

The ten-eleven translocation 1 (TET1) gene is the founding member of the TET family of enzymes (TET1/2/3) that convert 5-methylcytosine to 5-hydroxymethylcytosine. Although TET1 was first identified as a fusion partner of the mixed lineage leukemia (MLL) gene in acute myeloid leukemia carrying t(10,11), its definitive role in leukemia is unclear. In contrast to the frequent down-regulation (or loss-of-function mutations) and critical tumor-suppressor roles of the three TET genes observed in various types of cancers, here we show that TET1 is a direct target of MLL-fusion proteins and is significantly up-regulated in MLL-rearranged leukemia, leading to a global increase of 5-hydroxymethylcytosine level. Furthermore, our both in vitro and in vivo functional studies demonstrate that Tet1 plays an indispensable oncogenic role in the development of MLL-rearranged leukemia, through coordination with MLL-fusion proteins in regulating their critical cotargets, including homeobox A9 (Hoxa9)/myeloid ecotropic viral integration 1 (Meis1)/pre-B-cell leukemia homeobox 3 (Pbx3) genes. Collectively, our data delineate an MLL-fusion/Tet1/Hoxa9/Meis1/Pbx3 signaling axis in MLL-rearranged leukemia and highlight TET1 as a potential therapeutic target in treating this presently therapy-resistant disease.

Gan H, Wen L, Liao S, Lin X, Ma T, Liu J, Song C-X, Wang M, He C, Han C, Tang F. 2013. Dynamics of 5-hydroxymethylcytosine during mouse spermatogenesis. Nat Commun, 4 (1), pp. 1995.

Little is known about how patterns of DNA methylation change during mammalian spermatogenesis. 5hmC has been recognized as a stable intermediate of DNA demethylation with potential regulatory functions in the mammalian genome. However, its global pattern in germ cells has yet to be addressed. Here, we first conducted absolute quantification of 5hmC in eight consecutive types of mouse spermatogenic cells using liquid chromatography-tandem mass spectrometry, and then mapped its distributions in various genomic regions using our chemical labeling and enrichment method coupled with deep sequencing. We found that 5hmC mapped differentially to and changed dynamically in genomic regions related to expression regulation of protein-coding genes, piRNA precursor genes and repetitive elements. Moreover, 5hmC content correlated with the levels of various transcripts quantified by RNA-seq. These results suggest that the highly ordered alterations of 5hmC in the mouse genome are potentially crucial for the differentiation of spermatogenic cells.

Lu X, Song C-X, Szulwach K, Wang Z, Weidenbacher P, Jin P, He C. 2013. Chemical modification-assisted bisulfite sequencing (CAB-Seq) for 5-carboxylcytosine detection in DNA. J Am Chem Soc, 135 (25), pp. 9315-9317.

5-Methylcytosine (5mC) in DNA can be oxidized stepwise to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) by the TET family proteins. Thymine DNA glycosylase can further remove 5fC and 5caC, connecting 5mC oxidation with active DNA demethylation. Here, we present a chemical modification-assisted bisulfite sequencing (CAB-Seq) that can detect 5caC with single-base resolution in DNA. We optimized 1-ethyl-3-[3-dimethylaminopropyl]carbodiimide hydrochloride (EDC)-catalyzed amide bond formation between the carboxyl group of 5caC and a primary amine group. We found that the modified 5caC can survive the bisulfite treatment without deamination. Therefore, this chemical labeling coupled with bisulfite treatment provides a base-resolution detection and sequencing method for 5caC.

Sun M, Song C-X, Huang H, Frankenberger CA, Sankarasharma D, Gomes S, Chen P, Chen J, Chada KK, He C, Rosner MR. 2013. HMGA2/TET1/HOXA9 signaling pathway regulates breast cancer growth and metastasis. Proc Natl Acad Sci U S A, 110 (24), pp. 9920-9925.

The ten-eleven translocation (TET) family of methylcytosine dioxygenases initiates demethylation of DNA and is associated with tumorigenesis in many cancers; however, the mechanism is mostly unknown. Here we identify upstream activators and downstream effectors of TET1 in breast cancer using human breast cancer cells and a genetically engineered mouse model. We show that depleting the architectural transcription factor high mobility group AT-hook 2 (HMGA2) induces TET1. TET1 binds and demethylates its own promoter and the promoter of homeobox A (HOXA) genes, enhancing its own expression and stimulating expression of HOXA genes including HOXA7 and HOXA9. Both TET1 and HOXA9 suppress breast tumor growth and metastasis in mouse xenografts. The genes comprising the HMGA2-TET1-HOXA9 pathway are coordinately regulated in breast cancer and together encompass a prognostic signature for patient survival. These results implicate the HMGA2-TET1-HOX signaling pathway in the epigenetic regulation of human breast cancer and highlight the importance of targeting methylation in specific subpopulations as a potential therapeutic strategy.

Wang T, Wu H, Li Y, Szulwach KE, Lin L, Li X, Chen I-P, Goldlust IS, Chamberlain SJ, Dodd A et al. 2013. Subtelomeric hotspots of aberrant 5-hydroxymethylcytosine-mediated epigenetic modifications during reprogramming to pluripotency. Nat Cell Biol, 15 (6), pp. 700-711.

Mammalian somatic cells can be directly reprogrammed into induced pluripotent stem cells (iPSCs) by introducing defined sets of transcription factors. Somatic cell reprogramming involves epigenomic reconfiguration, conferring iPSCs with characteristics similar to embryonic stem cells (ESCs). Human ESCs (hESCs) contain 5-hydroxymethylcytosine (5hmC), which is generated through the oxidation of 5-methylcytosine by the TET enzyme family. Here we show that 5hmC levels increase significantly during reprogramming to human iPSCs mainly owing to TET1 activation, and this hydroxymethylation change is critical for optimal epigenetic reprogramming, but does not compromise primed pluripotency. Compared with hESCs, we find that iPSCs tend to form large-scale (100 kb-1.3 Mb) aberrant reprogramming hotspots in subtelomeric regions, most of which exhibit incomplete hydroxymethylation on CG sites. Strikingly, these 5hmC aberrant hotspots largely coincide (~80%) with aberrant iPSC-ESC non-CG methylation regions. Our results suggest that TET1-mediated 5hmC modification could contribute to the epigenetic variation of iPSCs and iPSC-hESC differences.

Song C-X, Szulwach KE, Dai Q, Fu Y, Mao S-Q, Lin L, Street C, Li Y, Poidevin M, Wu H et al. 2013. Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell, 153 (3), pp. 678-691.

TET proteins oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). 5fC and 5caC are excised by mammalian DNA glycosylase TDG, implicating 5mC oxidation in DNA demethylation. Here, we show that the genomic locations of 5fC can be determined by coupling chemical reduction with biotin tagging. Genome-wide mapping of 5fC in mouse embryonic stem cells (mESCs) reveals that 5fC preferentially occurs at poised enhancers among other gene regulatory elements. Application to Tdg null mESCs further suggests that 5fC production coordinates with p300 in remodeling epigenetic states of enhancers. This process, which is not influenced by 5hmC, appears to be associated with further oxidation of 5hmC and commitment to demethylation through 5fC. Finally, we resolved 5fC at base resolution by hydroxylamine-based protection from bisulfite-mediated deamination, thereby confirming sites of 5fC accumulation. Our results reveal roles of active 5mC/5hmC oxidation and TDG-mediated demethylation in epigenetic tuning at regulatory elements.

Yu P, Xiao S, Xin X, Song C-X, Huang W, McDee D, Tanaka T, Wang T, He C, Zhong S. 2013. Spatiotemporal clustering of the epigenome reveals rules of dynamic gene regulation. Genome Res, 23 (2), pp. 352-364.

Spatial organization of different epigenomic marks was used to infer functions of the epigenome. It remains unclear what can be learned from the temporal changes of the epigenome. Here, we developed a probabilistic model to cluster genomic sequences based on the similarity of temporal changes of multiple epigenomic marks during a cellular differentiation process. We differentiated mouse embryonic stem (ES) cells into mesendoderm cells. At three time points during this differentiation process, we used high-throughput sequencing to measure seven histone modifications and variants--H3K4me1/2/3, H3K27ac, H3K27me3, H3K36me3, and H2A.Z; two DNA modifications--5-mC and 5-hmC; and transcribed mRNAs and noncoding RNAs (ncRNAs). Genomic sequences were clustered based on the spatiotemporal epigenomic information. These clusters not only clearly distinguished gene bodies, promoters, and enhancers, but also were predictive of bidirectional promoters, miRNA promoters, and piRNAs. This suggests specific epigenomic patterns exist on piRNA genes much earlier than germ cell development. Temporal changes of H3K4me2, unmethylated CpG, and H2A.Z were predictive of 5-hmC changes, suggesting unmethylated CpG and H3K4me2 as potential upstream signals guiding TETs to specific sequences. Several rules on combinatorial epigenomic changes and their effects on mRNA expression and ncRNA expression were derived, including a simple rule governing the relationship between 5-hmC and gene expression levels. A Sox17 enhancer containing a FOXA2 binding site and a Foxa2 enhancer containing a SOX17 binding site were identified, suggesting a positive feedback loop between the two mesendoderm transcription factors. These data illustrate the power of using epigenome dynamics to investigate regulatory functions.

Zhang L, Szulwach KE, Hon GC, Song C-X, Park B, Yu M, Lu X, Dai Q, Wang X, Street CR et al. 2013. Tet-mediated covalent labelling of 5-methylcytosine for its genome-wide detection and sequencing. Nat Commun, 4 (1), pp. 1517.

5-methylcytosine is an epigenetic mark that affects a broad range of biological functions in mammals. The chemically inert methyl group prevents direct labelling for subsequent affinity purification and detection. Therefore, most current approaches for the analysis of 5-methylcytosine still have limitations of being either density-biased, lacking in robustness and consistency, or incapable of analysing 5-methylcytosine specifically. Here we present an approach, TAmC-Seq, which selectively tags 5-methylcytosine with an azide functionality that can be further labelled with a biotin for affinity purification, detection and genome-wide mapping. Using this covalent labelling approach, we demonstrate high sensitivity and specificity for known methylated loci, as well as increased CpG dinucleotide coverage at lower sequencing depth as compared with antibody-based enrichment, providing an improved efficiency in the 5-methylcytosine enrichment and genome-wide profiling.

Wang T, Pan Q, Lin L, Szulwach KE, Song C-X, He C, Wu H, Warren ST, Jin P, Duan R, Li X. 2012. Genome-wide DNA hydroxymethylation changes are associated with neurodevelopmental genes in the developing human cerebellum. Hum Mol Genet, 21 (26), pp. 5500-5510.

5-Hydroxymethylcytosine (5-hmC) is a newly discovered modified form of cytosine that has been suspected to be an important epigenetic modification in neurodevelopment. While DNA methylation dynamics have already been implicated during neurodevelopment, little is known about hydroxymethylation in this process. Here, we report DNA hydroxymethylation dynamics during cerebellum development in the human brain. Overall, we find a positive correlation between 5-hmC levels and cerebellum development. Genome-wide profiling reveals that 5-hmC is highly enriched on specific gene regions including exons and especially the untranslated regions (UTRs), but it is depleted on introns and intergenic regions. Furthermore, we have identified fetus-specific and adult-specific differentially hydroxymethylated regions (DhMRs), most of which overlap with genes and CpG island shores. Surprisingly, during development, DhMRs are highly enriched in genes encoding mRNAs that can be regulated by fragile X mental retardation protein (FMRP), some of which are disrupted in autism, as well as in many known autism genes. Our results suggest that 5-hmC-mediated epigenetic regulation may broadly impact the development of the human brain, and its dysregulation could contribute to the molecular pathogenesis of neurodevelopmental disorders. Accession number: Sequencing data have been deposited to GEO with accession number GSE40539.

Yu M, Hon GC, Szulwach KE, Song C-X, Jin P, Ren B, He C. 2012. Tet-assisted bisulfite sequencing of 5-hydroxymethylcytosine. Nat Protoc, 7 (12), pp. 2159-2170.

A complete understanding of the potential function of 5-hydroxymethylcytosine (5-hmC), a DNA cytosine modification in mammalian cells, requires an accurate single-base resolution sequencing method. Here we describe a modified bisulfite-sequencing method, Tet-assisted bisulfite sequencing (TAB-seq), which can identify 5-hmC at single-base resolution, as well as determine its abundance at each modification site. This protocol involves β-glucosyltransferase (β-GT)-mediated protection of 5-hmC (glucosylation) and recombinant mouse Tet1(mTet1)-mediated oxidation of 5-methylcytosine (5-mC) to 5-carboxylcytosine (5-caC). After the subsequent bisulfite treatment and PCR amplification, both cytosine and 5-caC (derived from 5-mC) are converted to thymine (T), whereas 5-hmC reads as C. The treated genomic DNA is suitable for both whole-genome and locus-specific sequencing. The entire procedure (which does not include data analysis) can be completed in 14 d for whole-genome sequencing or 7 d for locus-specific sequencing.
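The readout logic described in this abstract can be summarised as a small lookup, shown here as a minimal Python sketch. The table and function names are hypothetical and chosen only for illustration. The sketch assumes ideal (100%) conversion and protection, and it assumes both readouts are available for the same position, which in practice holds only as an aggregate over many reads from separately treated aliquots.

```python
# Illustrative sketch only: names are hypothetical and not part of the protocol.
# Base read after PCR for each cytosine state, assuming ideal chemistry.
BISULFITE_READOUT = {"C": "T", "5mC": "C", "5hmC": "C"}  # conventional bisulfite
TAB_SEQ_READOUT = {"C": "T", "5mC": "T", "5hmC": "C"}    # Tet-assisted bisulfite


def classify(bs_base, tab_base):
    """Return the cytosine state consistent with both readouts, or None."""
    for state, bs_expected in BISULFITE_READOUT.items():
        if bs_expected == bs_base and TAB_SEQ_READOUT[state] == tab_base:
            return state
    return None  # combination not expected under ideal conversion


if __name__ == "__main__":
    for pair in [("T", "T"), ("C", "T"), ("C", "C"), ("T", "C")]:
        print(pair, "->", classify(*pair))
```

Conventional bisulfite sequencing alone cannot separate 5-mC from 5-hmC because both read as C; only the TAB-seq column distinguishes them.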

Song C-X, Yi C, He C. 2012. Mapping recently identified nucleotide variants in the genome and transcriptome. Nat Biotechnol, 30 (11), pp. 1107-1116.

Nucleotide variants, especially those related to epigenetic functions, provide critical regulatory information beyond simple genomic sequence, and they define cell status in higher organisms. 5-Methylcytosine, which is found in DNA, was until recently the only nucleotide variant studied in terms of epigenetics in eukaryotes. However, 5-methylcytosine has turned out to be just one component of a dynamic DNA epigenetic regulatory network that also includes 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxylcytosine. Recently, reversible methylation of N6-methyladenosine in RNA has also been demonstrated. The discovery of these new nucleotide variants triggered an explosion of new information in the epigenetics field. This rapid research progress has benefited significantly from timely developments of new technologies that specifically recognize, enrich and sequence nucleotide modifications, as evidenced by the wide application of the bisulfite sequencing of 5-methylcytosine and very recent modifications of bisulfite sequencing to resolve 5-hydroxymethylcytosine from 5-methylcytosine with base-resolution information.

Li Y, Song C-X, He C, Jin P. 2012. Selective capture of 5-hydroxymethylcytosine from genomic DNA. J Vis Exp, (68).

5-methylcytosine (5-mC) constitutes ~2-8% of the total cytosines in human genomic DNA and impacts a broad range of biological functions, including gene expression, maintenance of genome integrity, parental imprinting, X-chromosome inactivation, regulation of development, aging, and cancer(1). Recently, the presence of an oxidized 5-mC, 5-hydroxymethylcytosine (5-hmC), was discovered in mammalian cells, in particular in embryonic stem (ES) cells and neuronal cells(2-4). 5-hmC is generated by oxidation of 5-mC catalyzed by TET family iron(II)/α-ketoglutarate-dependent dioxygenases(2, 3). 5-hmC is proposed to be involved in the maintenance of embryonic stem (mES) cells, normal hematopoiesis and malignancies, and zygote development(2, 5-10). To better understand the function of 5-hmC, a reliable and straightforward sequencing system is essential. Traditional bisulfite sequencing cannot distinguish 5-hmC from 5-mC(11). To unravel the biology of 5-hmC, we have developed a highly efficient and selective chemical approach to label and capture 5-hmC, taking advantage of a bacteriophage enzyme that adds a glucose moiety to 5-hmC specifically(12). Here we describe a straightforward two-step procedure for selective chemical labeling of 5-hmC. In the first labeling step, 5-hmC in genomic DNA is labeled with a 6-azide-glucose catalyzed by β-GT, a glucosyltransferase from T4 bacteriophage, in a way that transfers the 6-azide-glucose to 5-hmC from the modified cofactor, UDP-6-N3-Glc (6-N3UDPG). In the second step, biotinylation, a disulfide biotin linker is attached to the azide group by click chemistry. Both steps are highly specific and efficient, leading to complete labeling regardless of the abundance of 5-hmC in genomic regions and giving extremely low background. Following biotinylation of 5-hmC, the 5-hmC-containing DNA fragments are then selectively captured using streptavidin beads in a density-independent manner. The resulting 5-hmC-enriched DNA fragments can be used for downstream analyses, including next-generation sequencing. Our selective labeling and capture protocol confers high sensitivity, applicable to any source of genomic DNA with variable/diverse 5-hmC abundances. Although the main purpose of this protocol is its downstream application (i.e., next-generation sequencing to map out the 5-hmC distribution in the genome), it is also compatible with single-molecule, real-time (SMRT) DNA sequencing, which is capable of delivering single-base resolution sequencing of 5-hmC.

Szulwach KE, Song C-X, He C, Jin P. 2012. 5-Hydroxymethylcytosine (5-hmC) Specific Enrichment. Bio Protoc, 2 (15).

5-Hydroxymethylcytosine (5-hmC) is a newly discovered DNA modification in mammalian genomes. This protocol describes a highly efficient and selective chemical approach to label and capture 5-hmC, taking advantage of a bacteriophage enzyme that adds a glucose moiety to 5-hmC specifically. The captured DNA can in turn be used for high-throughput mapping via next-generation sequencing.

Kellinger MW, Song C-X, Chong J, Lu X-Y, He C, Wang D. 2012. 5-formylcytosine and 5-carboxylcytosine reduce the rate and substrate specificity of RNA polymerase II transcription. Nat Struct Mol Biol, 19 (8), pp. 831-833.

Although the roles of 5-methylcytosine and 5-hydroxymethylcytosine in epigenetic regulation of gene expression are well established, the functional effects of 5-formylcytosine and 5-carboxylcytosine on the process of transcription are not clear. Here we report a systematic study of the effects of five different forms of cytosine in DNA on mammalian and yeast RNA polymerase II transcription, providing new insights into potential functional interplay between cytosine methylation status and transcription.

Yu M, Hon GC, Szulwach KE, Song C-X, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B et al. 2012. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell, 149 (6), pp. 1368-1380.

The study of 5-hydroxymethylcytosine (5hmC) has been hampered by the lack of a method to map it at single-base resolution on a genome-wide scale. Affinity purification-based methods cannot precisely locate 5hmC nor accurately determine its relative abundance at each modified site. We here present a genome-wide approach, Tet-assisted bisulfite sequencing (TAB-Seq), that when combined with traditional bisulfite sequencing can be used for mapping 5hmC at base resolution and quantifying the relative abundance of 5hmC as well as 5mC. Application of this method to embryonic stem cells not only confirms widespread distribution of 5hmC in the mammalian genome but also reveals sequence bias and strand asymmetry at 5hmC sites. We observe high levels of 5hmC and reciprocally low levels of 5mC near but not on transcription factor-binding sites. Additionally, the relative abundance of 5hmC varies significantly among distinct functional sequence elements, suggesting different mechanisms for 5hmC deposition and maintenance.
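As a rough illustration of how combining the two assays yields both abundances, the following Python sketch estimates per-site levels from read counts. It is a simplified assumption-laden example, not the authors' analysis pipeline: the function name and arguments are hypothetical, and it ignores incomplete conversion or protection, sequencing error, and sampling noise, all of which a real analysis would need to model.

```python
# Illustrative sketch only: a naive point estimate, not the published method.
def modification_levels(bs_c_reads, bs_total_reads, tab_c_reads, tab_total_reads):
    """Estimate 5hmC and 5mC fractions at one cytosine position.

    Conventional bisulfite sequencing reports 5mC + 5hmC as C, while
    TAB-Seq reports only 5hmC as C, so 5mC is taken as the difference.
    """
    if bs_total_reads <= 0 or tab_total_reads <= 0:
        raise ValueError("read totals must be positive")
    modified = bs_c_reads / bs_total_reads   # fraction 5mC + 5hmC
    hmc = tab_c_reads / tab_total_reads      # fraction 5hmC
    mc = max(modified - hmc, 0.0)            # clamp negative values caused by noise
    return {"5hmC": hmc, "5mC": mc}


if __name__ == "__main__":
    # Hypothetical counts: 80 of 100 bisulfite reads and 20 of 100 TAB-Seq reads call C.
    print(modification_levels(80, 100, 20, 100))
```

Clamping the difference at zero is a convenience for this sketch; a statistical treatment of low-coverage sites would be preferable in practice.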

Yao Q, Song C-X, He C, Kumaran D, Dunn JJ. 2012. Heterologous expression and purification of Arabidopsis thaliana VIM1 protein: in vitro evidence for its inability to recognize hydroxymethylcytosine, a rare base in Arabidopsis DNA. Protein Expr Purif, 83 (1), pp. 104-111.

The discovery of 5-hydroxymethylcytosine (5hmC) in mammalian cells prompted us to look for this base in the DNA of Arabidopsis thaliana (thale cress), and to ask how well the Arabidopsis Variant in Methylation 1 (VIM1) protein, an essential factor in maintaining 5-cytosine methylation (5mC) homeostasis and epigenetic silencing in this plant, recognizes this novel base. We found that the DNA of Arabidopsis leaves and flowers contains low levels of 5hmC. We also cloned and expressed in Escherichia coli full-length VIM1 protein, the archetypal member of the five-gene Arabidopsis VIM family. Using in vitro binding assays, we observed that full-length VIM1 binds preferentially to hemi-methylated DNA with a single modified 5mCpG site; this result is consistent with its known role in preserving DNA methylation in vivo following DNA replication. However, when 5hmC replaces one or both cytosine residues at a palindromic CpG site, VIM1 binds with at least 10-fold lower affinity. These results suggest that 5hmC may contribute to VIM-mediated passive loss of cytosine methylation in vivo during Arabidopsis DNA replication.

Zhao L-Y, Song C-X, Yu D, Liu X-L, Guo J-Q, Wang C, Ding Y-W, Zhou H-X, Ma S-M, Liu X-D, Liu X. 2012. [Effects of extremely low frequency electromagnetic radiation on cardiovascular system of workers]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi, 30 (3), pp. 194-195.

OBJECTIVE: To observe the exposure levels of extremely low frequency electromagnetic fields in workplaces and to analyze the effects of extremely low frequency electromagnetic radiation on the cardiovascular system of occupationally exposed people. METHOD: The intensity of electromagnetic fields in two workplaces (control and exposure groups) was measured with an EFA-300 electromagnetic field strength tester, and noise intensity was measured with an AWA5610D integrating sound level meter. Health and physical examination data were collected for 188 controls and 642 occupationally exposed workers. Data were analyzed with SPSS 17.0 statistical software. RESULTS: The intensity of the electric and magnetic fields in the exposure group was significantly higher than that in the control group (P < 0.05), but there was no significant difference in noise between the two workplaces (P > 0.05). Physical examination showed that the abnormal rates of HCY, ALT, AST, GGT and ECG in the exposure group were significantly higher than those in the control group (P < 0.05). There were no differences in sex, age, height or weight between the two groups (P > 0.05). CONCLUSION: Exposure to extremely low frequency electromagnetic radiation may have some effects on the cardiovascular system of workers.

Song C-X, He C. 2012. Balance of DNA methylation and demethylation in cancer development. Genome Biol, 13 (10), pp. 173.

Genome-wide 5-hydroxymethylome analysis of a rodent hepatocarcinogen model reveals that 5-hydroxymethylcytosine-dependent active DNA demethylation may be functionally important in the early stages of carcinogenesis.

Song C-X, Clark TA, Lu X-Y, Kislyuk A, Dai Q, Turner SW, He C, Korlach J. 2011. Sensitive and specific single-molecule sequencing of 5-hydroxymethylcytosine. Nat Methods, 9 (1), pp. 75-77.

We describe strand-specific, base-resolution detection of 5-hydroxymethylcytosine (5-hmC) in genomic DNA with single-molecule sensitivity, combining a bioorthogonal, selective chemical labeling method of 5-hmC with single-molecule, real-time (SMRT) DNA sequencing. The chemical labeling not only allows affinity enrichment of 5-hmC-containing DNA fragments but also enhances the kinetic signal of 5-hmC during SMRT sequencing. We applied the approach to sequence 5-hmC in a genomic DNA sample with high confidence.

Szulwach KE, Li X, Li Y, Song C-X, Wu H, Dai Q, Irier H, Upadhyay AK, Gearing M, Levey AI et al. 2011. 5-hmC-mediated epigenetic dynamics during postnatal neurodevelopment and aging. Nat Neurosci, 14 (12), pp. 1607-1616.

DNA methylation dynamics influence brain function and are altered in neurological disorders. 5-hydroxymethylcytosine (5-hmC), a DNA base that is derived from 5-methylcytosine, accounts for ∼40% of modified cytosine in the brain and has been implicated in DNA methylation-related plasticity. We mapped 5-hmC genome-wide in mouse hippocampus and cerebellum at three different ages, which allowed us to assess its stability and dynamic regulation during postnatal neurodevelopment through adulthood. We found developmentally programmed acquisition of 5-hmC in neuronal cells. Epigenomic localization of 5-hmC-regulated regions revealed stable and dynamically modified loci during neurodevelopment and aging. By profiling 5-hmC in human cerebellum, we found conserved genomic features of 5-hmC. Finally, we found that 5-hmC levels were inversely correlated with methyl-CpG-binding protein 2 dosage, a protein encoded by a gene in which mutations cause Rett syndrome. These data suggest that 5-hmC-mediated epigenetic modification is critical in neurodevelopment and diseases.

Song C-X, He C. 2011. The hunt for 5-hydroxymethylcytosine: the sixth base. Epigenomics, 3 (5), pp. 521-523.

Song C-X, He C. 2011. Bioorthogonal labeling of 5-hydroxymethylcytosine in genomic DNA and diazirine-based DNA photo-cross-linking probes. Acc Chem Res, 44 (9), pp. 709-717.

DNA is not merely a combination of four genetic codes, namely A, T, C, and G. It also contains minor modifications that play crucial roles throughout biology. For example, the fifth DNA base, 5-methylcytosine (5-mC), which accounts for ∼1% of all the nucleotides in mammalian genomic DNA, is a vital epigenetic mark. It impacts a broad range of biological functions, from development to cancer. Recently, an oxidized form of 5-methylcytosine, 5-hydroxymethylcytosine (5-hmC), was found to constitute the sixth base in the mammalian genome; it was believed to be another crucial epigenetic mark. Unfortunately, further study of this newly discovered DNA base modification has been hampered by inadequate detection and sequencing methods, because current techniques fail to differentiate 5-hmC from 5-mC. The immediate challenge, therefore, is to develop robust methods for ascertaining the positions of 5-hmC within the mammalian genome. In this Account, we describe our development of the first bioorthogonal, selective labeling of 5-hmC to specifically address this challenge. We utilize β-glucosyltransferase (βGT) to transfer an azide-modified glucose onto 5-hmC in genomic DNA. The azide moiety enables further bioorthogonal click chemistry to install a biotin group, which allows for detection, affinity enrichment, and, most importantly, deep sequencing of the 5-hmC-containing DNA. With this highly effective and selective method, we revealed the first genome-wide distribution of 5-hmC in the mouse genome and began to shed further light on the biology of 5-hmC. The strategy lays the foundation for developing high-throughput, single-base-resolution sequencing methods for 5-hmC in mammalian genomes in the future. DNA and RNA are not static inside cells. They interact with protein and other DNA and RNA in fundamental biological processes such as replication, transcription, translation, and DNA and RNA modification and repair. 
The ability to investigate these interactions will also be enhanced by developing and utilizing bioorthogonal probes. We have chosen the photoreactive diazirine photophore as a bioorthogonal moiety to develop nucleic acid probes. The small size and unique photo-cross-linking activity of diazirine enabled us to develop a series of novel cross-linking probes to streamline the study of protein-nucleic acid and nucleic acid-nucleic acid interactions. In the second half of this Account, we highlight a few examples of these probes.

He Y-F, Li B-Z, Li Z, Liu P, Wang Y, Tang Q, Ding J, Jia Y, Chen Z, Li L et al. 2011. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science, 333 (6047), pp. 1303-1307.

The prevalent DNA modification in higher organisms is the methylation of cytosine to 5-methylcytosine (5mC), which is partially converted to 5-hydroxymethylcytosine (5hmC) by the Tet (ten eleven translocation) family of dioxygenases. Despite their importance in epigenetic regulation, it is unclear how these cytosine modifications are reversed. Here, we demonstrate that 5mC and 5hmC in DNA are oxidized to 5-carboxylcytosine (5caC) by Tet dioxygenases in vitro and in cultured cells. 5caC is specifically recognized and excised by thymine-DNA glycosylase (TDG). Depletion of TDG in mouse embryonic stem cells leads to accumulation of 5caC to a readily detectable level. These data suggest that oxidation of 5mC by Tet proteins followed by TDG-mediated base excision of 5caC constitutes a pathway for active DNA demethylation.

Song C-X, Yu M, Dai Q, He C. 2011. Detection of 5-hydroxymethylcytosine in a combined glycosylation restriction analysis (CGRA) using restriction enzyme Taq(α)I. Bioorg Med Chem Lett, 21 (17), pp. 5075-5077.

5-Hydroxymethylcytosine (5-hmC) is a newly discovered DNA base in mammalian cells that is believed to be another important epigenetic modification. Here we report the use of a methylation-insensitive restriction enzyme Taq(α)I coupled with selective chemical labeling of 5-hmC in a combined glycosylation restriction analysis (CGRA) to detect 5-hmC in TCGA sequences. This method, which differentiates fully from hemi-hydroxymethylated cytosine in the CpG dinucleotide, adds a new tool to facilitate biological studies of 5-hmC.

Sun F, Zhou L, Zhao B-C, Deng X, Cho H, Yi C, Jian X, Song C-X, Luan C-H, Bae T et al. 2011. Targeting MgrA-mediated virulence regulation in Staphylococcus aureus. Chem Biol, 18 (8), pp. 1032-1041.

Increasing antibiotic resistance in human pathogens necessitates the development of new approaches against infections. Targeting virulence regulation at the transcriptional level represents a promising strategy yet to be explored. A global transcriptional regulator, MgrA in Staphylococcus aureus, was identified previously as a key virulence determinant. We have performed a fluorescence anisotropy (FA)-based high-throughput screen that identified 5,5'-methylenedisalicylic acid (MDSA), which blocks the DNA binding of MgrA. MDSA represses the expression of α-toxin that is up-regulated by MgrA and activates the transcription of protein A, a gene down-regulated by MgrA. MDSA alters bacterial antibiotic susceptibilities via an MgrA-dependent pathway. A mouse model of infection indicated that MDSA could attenuate S. aureus virulence. This work is a rare demonstration of utilizing small molecules to block protein-DNA interaction, thus tuning important biological regulation at the transcriptional level.

Song C-X, Sun Y, Dai Q, Lu X-Y, Yu M, Yang C-G, He C. 2011. Detection of 5-hydroxymethylcytosine in DNA by transferring a keto-glucose by using T4 phage β-glucosyltransferase. Chembiochem, 12 (11), pp. 1682-1685.

Capture with ketone: 5-Hydroxymethylcytosine (5-hmC) in DNA can be selectively labeled with a keto-glucose and subsequently linked to biotin via a ketoxime linker for enrichment and detection/sequencing purposes. Keto-glucose can be more efficiently transferred than azide-glucose by β-glucosyltransferase, and can provide single-base resolution detection/sequencing of 5-hmC.

Moran-Crusio K, Reavie L, Shih A, Abdel-Wahab O, Ndiaye-Lobry D, Lobry C, Figueroa ME, Vasanthakumar A, Patel J, Zhao X et al. 2011. Tet2 loss leads to increased hematopoietic stem cell self-renewal and myeloid transformation. Cancer Cell, 20 (1), pp. 11-24.

Somatic loss-of-function mutations in the ten-eleven translocation 2 (TET2) gene occur in a significant proportion of patients with myeloid malignancies. Although there are extensive genetic data implicating TET2 mutations in myeloid transformation, the consequences of Tet2 loss in hematopoietic development have not been delineated. We report here an animal model of conditional Tet2 loss in the hematopoietic compartment that leads to increased stem cell self-renewal in vivo as assessed by competitive transplant assays. Tet2 loss leads to a progressive enlargement of the hematopoietic stem cell compartment and eventual myeloproliferation in vivo, including splenomegaly, monocytosis, and extramedullary hematopoiesis. In addition, Tet2(+/-) mice also displayed increased stem cell self-renewal and extramedullary hematopoiesis, suggesting that Tet2 haploinsufficiency contributes to hematopoietic transformation in vivo.

Szulwach KE, Li X, Li Y, Song C-X, Han JW, Kim S, Namburi S, Hermetz K, Kim JJ, Rudd MK et al. 2011. Integrating 5-hydroxymethylcytosine into the epigenomic landscape of human embryonic stem cells. PLoS Genet, 7 (6), pp. e1002154.

Covalent modification of DNA distinguishes cellular identities and is crucial for regulating the pluripotency and differentiation of embryonic stem (ES) cells. The recent demonstration that 5-methylcytosine (5-mC) may be further modified to 5-hydroxymethylcytosine (5-hmC) in ES cells has revealed a novel regulatory paradigm to modulate the epigenetic landscape of pluripotency. To understand the role of 5-hmC in the epigenomic landscape of pluripotent cells, here we profile the genome-wide 5-hmC distribution and correlate it with the genomic profiles of 11 diverse histone modifications and six transcription factors in human ES cells. By integrating genomic 5-hmC signals with maps of histone enrichment, we link particular pluripotency-associated chromatin contexts with 5-hmC. Intriguingly, through additional correlations with defined chromatin signatures at promoter and enhancer subtypes, we show distinct enrichment of 5-hmC at enhancers marked with H3K4me1 and H3K27ac. These results suggest potential role(s) for 5-hmC in the regulation of specific promoters and enhancers. In addition, our results provide a detailed epigenomic map of 5-hmC from which to pursue future functional studies on the diverse regulatory roles associated with 5-hmC.

Dai Q, Song C-X, Pan T, He C. 2011. Syntheses of two 5-hydroxymethyl-2'-deoxycytidine phosphoramidites with TBDMS as the 5-hydroxymethyl protecting group and their incorporation into DNA. J Org Chem, 76 (10), pp. 4182-4188.

5-Hydroxymethylcytosine (5-hmC) is a newly discovered DNA base modification in mammalian genomic DNA that is proposed to be a major epigenetic mark. We report here the syntheses of two new versions of phosphoramidites III and IV from 5-iodo-2'-deoxyuridine in 18% and 32% overall yields, respectively, with TBDMS as the 5-hydroxyl protecting group. Phosphoramidites III and IV allow efficient incorporation of 5-hmC into DNA and a "one-step" deprotection procedure to cleanly remove all the protecting groups. A "two-step" deprotection strategy is compatible with ultramild DNA synthesis, which enables the synthesis of 5hmC-containing DNA with additional modifications.

Song C-X, Szulwach KE, Fu Y, Dai Q, Yi C, Li X, Li Y, Chen C-H, Zhang W, Jian X et al. 2011. Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat Biotechnol, 29 (1), pp. 68-72.

In contrast to 5-methylcytosine (5-mC), which has been studied extensively, little is known about 5-hydroxymethylcytosine (5-hmC), a recently identified epigenetic modification present in substantial amounts in certain mammalian cell types. Here we present a method for determining the genome-wide distribution of 5-hmC. We use the T4 bacteriophage β-glucosyltransferase to transfer an engineered glucose moiety containing an azide group onto the hydroxyl group of 5-hmC. The azide group can be chemically modified with biotin for detection, affinity enrichment and sequencing of 5-hmC-containing DNA fragments in mammalian genomes. Using this method, we demonstrate that 5-hmC is present in human cell lines beyond those previously recognized. We also find a gene expression level-dependent enrichment of intragenic 5-hmC in mouse cerebellum and an age-dependent acquisition of this modification in specific gene bodies linked to neurodegenerative disorders.

Song C-X, Cai G-X, Farrell TR, Jiang Z-P, Li H, Gan L-B, Shi Z-J. 2009. Direct functionalization of benzylic C-Hs with vinyl acetates via Fe-catalysis. Chem Commun (Camb), (40), pp. 6002-6004.

Direct cross-coupling to construct sp3 C-sp3 C bonds via Fe-catalyzed benzylic C-H activation with 1-aryl vinyl acetate was developed.

Lin S, Song C-X, Cai G-X, Wang W-H, Shi Z-J. 2008. Intra/intermolecular direct allylic alkylation via Pd(II)-catalyzed allylic C-H activation. J Am Chem Soc, 130 (39), pp. 12901-12903.

The first catalytic direct alkylation of allylic C-H bonds via Pd(II)-catalysis is described in the absence of base. Polysubstituted cyclic compounds can also be constructed by the intramolecular direct allylic alkylation.

Song C-X, Yin S, Ma L, Wheeler A, Chen Y, Zhang Y, Liu B, Xiong J, Zhang W, Hu J et al. 2017. 5-Hydroxymethylcytosine signatures in cell-free DNA provide information about tumor types and stages. Cell Res, 27 (10), pp. 1231-1242.

5-Hydroxymethylcytosine (5hmC) is an important mammalian DNA epigenetic modification that has been linked to gene regulation and cancer pathogenesis. Here we explored the diagnostic potential of 5hmC in circulating cell-free DNA (cfDNA) using a sensitive chemical labeling-based low-input shotgun sequencing approach. We sequenced cell-free 5hmC from 49 patients of seven different cancer types and found distinct features that could be used to predict cancer types and stages with high accuracy. Specifically, we discovered that lung cancer leads to a progressive global loss of 5hmC in cfDNA, whereas hepatocellular carcinoma and pancreatic cancer lead to disease-specific changes in the cell-free hydroxymethylome. Our proof-of-principle results suggest that cell-free 5hmC signatures may potentially be used not only to identify cancer types but also to track tumor stage in some cancers.

Song C-X, Diao J, Brunger AT, Quake SR. 2016. Simultaneous single-molecule epigenetic imaging of DNA methylation and hydroxymethylation. Proc Natl Acad Sci U S A, 113 (16), pp. 4338-4343.

The modifications 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) are the two major DNA epigenetic modifications in mammalian genomes and play crucial roles in development and pathogenesis. Little is known about the colocalization or potential correlation of these two modifications. Here we present an ultrasensitive single-molecule imaging technology capable of detecting and quantifying 5hmC and 5mC from trace amounts of DNA. We used this approach to perform single-molecule fluorescence resonance energy transfer (smFRET) experiments which measure the proximity between 5mC and 5hmC in the same DNA molecule. Our results reveal high levels of adjacent and opposing methylated and hydroxymethylated CpG sites (5hmC/5mCpGs) in mouse genomic DNA across multiple tissues. This identifies the previously undetectable and unappreciated 5hmC/5mCpGs as one of the major states for 5hmC in the mammalian genome and suggests that they could function in promoting gene expression.

Song C-X, He C. 2013. Potential functional roles of DNA demethylation intermediates. Trends Biochem Sci, 38 (10), pp. 480-484.

DNA methylation in the form of 5-methylcytosine (5mC) is a key epigenetic regulator in mammals, and the dynamic balance between methylation and demethylation impacts various processes from development to disease. The recent discovery of the enzymatic generation and removal of the oxidized derivatives of 5mC, namely 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) in mammalian cells has led to a paradigm shift in our understanding of the demethylation process. Interestingly, emerging evidence indicates that these DNA demethylation intermediates are dynamic and could themselves carry regulatory functions. Here, we discuss 5hmC, 5fC, and 5caC as new epigenetic DNA modifications that could have distinct regulatory functions in conjunction with potential protein partners.

Song C-X, Szulwach KE, Dai Q, Fu Y, Mao S-Q, Lin L, Street C, Li Y, Poidevin M, Wu H et al. 2013. Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell, 153 (3), pp. 678-691.

TET proteins oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). 5fC and 5caC are excised by mammalian DNA glycosylase TDG, implicating 5mC oxidation in DNA demethylation. Here, we show that the genomic locations of 5fC can be determined by coupling chemical reduction with biotin tagging. Genome-wide mapping of 5fC in mouse embryonic stem cells (mESCs) reveals that 5fC preferentially occurs at poised enhancers among other gene regulatory elements. Application to Tdg null mESCs further suggests that 5fC production coordinates with p300 in remodeling epigenetic states of enhancers. This process, which is not influenced by 5hmC, appears to be associated with further oxidation of 5hmC and commitment to demethylation through 5fC. Finally, we resolved 5fC at base resolution by hydroxylamine-based protection from bisulfite-mediated deamination, thereby confirming sites of 5fC accumulation. Our results reveal roles of active 5mC/5hmC oxidation and TDG-mediated demethylation in epigenetic tuning at regulatory elements.

Song C-X, Yi C, He C. 2012. Mapping recently identified nucleotide variants in the genome and transcriptome. Nat Biotechnol, 30 (11), pp. 1107-1116. | Show Abstract | Read more

Nucleotide variants, especially those related to epigenetic functions, provide critical regulatory information beyond simple genomic sequence, and they define cell status in higher organisms. 5-Methylcytosine, which is found in DNA, was until recently the only nucleotide variant studied in terms of epigenetics in eukaryotes. However, 5-methylcytosine has turned out to be just one component of a dynamic DNA epigenetic regulatory network that also includes 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxylcytosine. Recently, reversible methylation of N6-methyladenosine in RNA has also been demonstrated. The discovery of these new nucleotide variants triggered an explosion of new information in the epigenetics field. This rapid research progress has benefited significantly from timely developments of new technologies that specifically recognize, enrich and sequence nucleotide modifications, as evidenced by the wide application of the bisulfite sequencing of 5-methylcytosine and very recent modifications of bisulfite sequencing to resolve 5-hydroxymethylcytosine from 5-methylcytosine with base-resolution information.

Yu M, Hon GC, Szulwach KE, Song C-X, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B et al. 2012. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell, 149 (6), pp. 1368-1380.

The study of 5-hydroxymethylcytosine (5hmC) has been hampered by the lack of a method to map it at single-base resolution on a genome-wide scale. Affinity purification-based methods can neither precisely locate 5hmC nor accurately determine its relative abundance at each modified site. Here we present a genome-wide approach, Tet-assisted bisulfite sequencing (TAB-Seq), that, when combined with traditional bisulfite sequencing, can be used for mapping 5hmC at base resolution and quantifying the relative abundance of 5hmC as well as 5mC. Application of this method to embryonic stem cells not only confirms widespread distribution of 5hmC in the mammalian genome but also reveals sequence bias and strand asymmetry at 5hmC sites. We observe high levels of 5hmC and reciprocally low levels of 5mC near but not on transcription factor-binding sites. Additionally, the relative abundance of 5hmC varies significantly among distinct functional sequence elements, suggesting different mechanisms for 5hmC deposition and maintenance.
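As a rough illustration of how the two assays combine at a single cytosine: traditional bisulfite sequencing reads both 5mC and 5hmC as unconverted C, whereas TAB-Seq reads only 5hmC as C, so the 5mC level can be estimated by subtraction. The sketch below is a simplified assumption, not the published analysis pipeline. The function name `estimate_levels` and its read-count parameters are hypothetical, and the calculation ignores the corrections for non-conversion and protection efficiency that a real analysis would need.

```python
def estimate_levels(bs_unconverted, bs_total, tab_unconverted, tab_total):
    """Estimate 5hmC and 5mC fractions at one cytosine (illustrative only).

    bs_unconverted / bs_total   : reads called C / all reads, bisulfite sequencing
    tab_unconverted / tab_total : reads called C / all reads, TAB-Seq
    """
    if bs_total <= 0 or tab_total <= 0:
        raise ValueError("read depth must be positive in both assays")

    # Bisulfite sequencing cannot distinguish 5mC from 5hmC: both read as C.
    bs_fraction = bs_unconverted / bs_total

    # In TAB-Seq only 5hmC is protected, so this fraction estimates 5hmC alone.
    hmc_fraction = tab_unconverted / tab_total

    # 5mC is what remains; clamp at zero because sampling noise can make
    # the difference slightly negative at low coverage.
    mc_fraction = max(bs_fraction - hmc_fraction, 0.0)

    return {"5hmC": hmc_fraction, "5mC": mc_fraction}


if __name__ == "__main__":
    # Hypothetical read counts for a single site.
    print(estimate_levels(30, 40, 10, 40))
```

With these made-up counts, 75% of bisulfite reads and 25% of TAB-Seq reads remain C, giving an estimated 25% 5hmC and 50% 5mC at the site.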

Song C-X, Clark TA, Lu X-Y, Kislyuk A, Dai Q, Turner SW, He C, Korlach J. 2011. Sensitive and specific single-molecule sequencing of 5-hydroxymethylcytosine. Nat Methods, 9 (1), pp. 75-77.

We describe strand-specific, base-resolution detection of 5-hydroxymethylcytosine (5-hmC) in genomic DNA with single-molecule sensitivity, combining a bioorthogonal, selective chemical labeling method of 5-hmC with single-molecule, real-time (SMRT) DNA sequencing. The chemical labeling not only allows affinity enrichment of 5-hmC-containing DNA fragments but also enhances the kinetic signal of 5-hmC during SMRT sequencing. We applied the approach to sequence 5-hmC in a genomic DNA sample with high confidence.

Song C-X, Szulwach KE, Fu Y, Dai Q, Yi C, Li X, Li Y, Chen C-H, Zhang W, Jian X et al. 2011. Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat Biotechnol, 29 (1), pp. 68-72.

In contrast to 5-methylcytosine (5-mC), which has been studied extensively, little is known about 5-hydroxymethylcytosine (5-hmC), a recently identified epigenetic modification present in substantial amounts in certain mammalian cell types. Here we present a method for determining the genome-wide distribution of 5-hmC. We use the T4 bacteriophage β-glucosyltransferase to transfer an engineered glucose moiety containing an azide group onto the hydroxyl group of 5-hmC. The azide group can be chemically modified with biotin for detection, affinity enrichment and sequencing of 5-hmC-containing DNA fragments in mammalian genomes. Using this method, we demonstrate that 5-hmC is present in human cell lines beyond those previously recognized. We also find a gene expression level-dependent enrichment of intragenic 5-hmC in mouse cerebellum and an age-dependent acquisition of this modification in specific gene bodies linked to neurodegenerative disorders.

A comprehensive computational platform to improve liquid biopsies for cancer detection

Background: Although recent advances in cancer research offer new ways to treat cancer, early detection still represents the best opportunity for curing it. Earlier-stage treatment not only greatly improves patient survival but also costs considerably less. Therefore, a non-invasive, low-cost and reliable cancer diagnostic assay could greatly benefit cancer patients and the public. In this regard, circulating cell-free DNA (cfDNA) holds tremendous potential for developing such a diagnostic assay. ...

Comprehensive epigenetic profiling of circulating cell-free DNA for noninvasive diagnostics

Background: 5-Methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) are the two major epigenetic modifications found in the mammalian genome, and they play important roles in a broad range of biological processes, from gene regulation to the initiation and progression of many human diseases. Therefore, epigenetic modifications are valuable biomarkers for diagnostics. Circulating cell-free DNA (cfDNA) is the DNA found in our bloodstream, which provides a noninvasive window for disease diagnosis. ...
