In that paper, Thurman studies the links between DNase I hypersensitive sites (DHSs) and markers of regulation. These DHSs are areas between histones where the DNA is free of histone proteins. There are remarkable images around showing histones as beads on a string, and the distances in nucleotides between histones are in fact not that large. In fact, a histone, despite being a large complex sterically hindering 50% of the DNA access, does not stop transcription; the transcription complexes apparently have no trouble passing the histones, as described by Felsenfeld et al. Quite amazing!
Now, those histones are chemically modified with acetyl, methyl, phosphate, and other groups, at well-described residues, and each easily regulates modification at other residues. And everything regulates gene expression. Oh, and as we saw yesterday, all that is regulated by metabolites, which in turn... Lovely. Try modeling that mathematically :) Here's what Abcam has to say about it:
Acetylation is generally linked to gene activation. Acetylation on Lys-10 (H3K9ac) impairs methylation at Arg-9 (H3R8me2s). Acetylation on Lys-19 (H3K18ac) and Lys-24 (H3K24ac) favors methylation at Arg-18 (H3R17me). Citrullination at Arg-9 (H3R8ci) and/or Arg-18 (H3R17ci) by PADI4 impairs methylation and represses transcription. Asymmetric dimethylation at Arg-18 (H3R17me2a) by CARM1 is linked to gene activation. Symmetric dimethylation at Arg-9 (H3R8me2s) by PRMT5 is linked to gene repression. Asymmetric dimethylation at Arg-3 (H3R2me2a) by PRMT6 is linked to gene repression and is mutually exclusive with H3 Lys-5 methylation (H3K4me2 and H3K4me3). H3R2me2a is present at the 3' of genes regardless of their transcription state and is enriched on inactive promoters, while it is absent on active promoters. Methylation at Lys-5 (H3K4me), Lys-37 (H3K36me) and Lys-80 (H3K79me) are linked to gene activation. Methylation at Lys-5 (H3K4me) facilitates subsequent acetylation of H3 and H4. ... ...
And that goes on for a while. Ambitiously, I started converting things I read into a WikiPathways pathway:
I think that will keep me busy for a while. I won't even attempt to complete it further tonight. I gave up on that about an hour ago. In fact, I returned to the paper by Thurman, as I still have to figure out how their experimental methods work. In fact, how does one even detect the chemical modification of a histone, and to which DNA sequence on any of the chromosomes it belongs?? I mean, that's not AFM or STM, I say...
No, it's ChIP. ChIP on a chip, in fact. They have antibodies that stick specifically to histones with one particular modification. That is how I actually ended up on that Abcam web page in the first place. Check out this nice western blot. With a huge antibody detecting whether there is an acetyl modification. Wicked!
Well, earlier I learned that proteins detect methylated CpG bases not because of the methyl group (which amazed me already), but by a distorted hydration in the major groove due to MeCP2 binding. Seriously! Eat that, organic chemist friends!
So, Thurman and friends find distal DHSs and relate these to cis-regulatory elements. To some extent, that is puzzling, because the above tells us that a lot of regulatory work is happening outside those DHSs. But then again, I did read today about DNA methylation triggering histone modifications. It seems there are so many interactions going on that it resembles a melting pot. Oh wait, that makes sense; it's one big one-pot synthesis anyway.
The paper discusses an enormous amount of experimental work, and I cannot seem to make sense of it all. There are striking aspects to it, which I will touch upon momentarily. But I cannot help but mention that I am not sure they could either. Their Discussion section leaves something to be desired, like an actual discussion. Instead, they just summarize the paper.
They used ChIP with Cell Signaling's 9751 antibody recognizing H3K4me3, with formaldehyde-induced crosslinking. It actually turns out that the peaks for this modification are right on top of the DNA part from which the transcript is made, in line with Felsenfeld's observation. Upstream of that, where the promoters are expected, that is where DNase I signals are found. That is, I think this means that the DHS upstream of the histone where transcription starts is where the promoter regulation happens. With transcription factors (TFs), of course. And in those DHS regions, that is where DNA methylation happens, and Thurman finds DNA methylation in those regions, inhibiting TF binding, because the already mentioned MeCP2 already takes that place.
Now, then they make a jump from this low-level chemistry to a genome-wide landscape. Well, they actually start with that, but as a chemist, I am more of a bottom-up guy (that is an IT method). They report that most DHSs are found in introns and at distal locations. The first is striking: the ratio between intron/exon is >99. Does that imply that exons basically are always DNA wrapped around histones?? Does that actually then tell me that transcription actually sort of requires steric hindrance of the histone?? Ha, those diagrams biologists draw would be even more misleading than they have been to me (don't ask me how long it took me to learn that there are some 10-40 mitochondria per cell! and I still do not know if all copies in the cell have the same DNA, or if they are more like a population like your microbiome).
Now, distal DHSs are the second largest group, and capture some 40-45% of all DHSs. Distal typically means more than 2.5 kb away from the TSS (transcription start site). Most of them are somewhere between 10 and 50 kb away. Now, isn't that something? That is distant indeed!
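To make that "distal" cutoff concrete, here is a minimal sketch of how one could label a DHS as distal or not. Only the 2.5 kb threshold comes from the text above; the function names and the TSS coordinates are made up for illustration, and the paper's actual annotation pipeline may well work differently.

```python
from bisect import bisect_left

DISTAL_CUTOFF_BP = 2_500  # "more than 2.5 kb away from the TSS"


def distance_to_nearest_tss(dhs_midpoint, sorted_tss):
    """Distance in bp from a DHS midpoint to the closest TSS.

    sorted_tss must be a non-empty, ascending list of TSS positions
    on the same chromosome (hypothetical input format).
    """
    i = bisect_left(sorted_tss, dhs_midpoint)
    # Only the TSS just left and just right of the midpoint can be nearest.
    neighbours = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(dhs_midpoint - tss) for tss in neighbours)


def is_distal(dhs_midpoint, sorted_tss, cutoff=DISTAL_CUTOFF_BP):
    """True if the nearest TSS is further away than the cutoff."""
    return distance_to_nearest_tss(dhs_midpoint, sorted_tss) > cutoff


# Made-up TSS positions on one hypothetical chromosome.
tss_positions = [10_000, 120_000, 300_000]

for midpoint in (11_000, 45_000, 299_000):
    d = distance_to_nearest_tss(midpoint, tss_positions)
    label = "distal" if is_distal(midpoint, tss_positions) else "proximal"
    print(f"DHS at {midpoint}: {d} bp from nearest TSS -> {label}")
```

With these toy numbers, only the DHS at 45,000 is distal: it sits 35 kb from the nearest TSS, right in that 10 to 50 kb range.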
What? Still with me? Let's do some math. It's hard, and I hope to get it right. The human genome has about 3 billion base pairs (I'll take the Wikipedia count). The paper finds almost 3 million DHSs. That means that the average distance between DHSs is about 1 kb. Compare that to their diagram 1b, outlined in the previous paragraph. That means that the DHSs must be very densely placed around the transcribed genes. Indeed, they report ratios of up to and above a 100 fold increase. It must be like that, because otherwise, you cannot get those distances for distal DHSs.
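For those who want to check the arithmetic, a back-of-the-envelope sketch. Both inputs are the rounded numbers from this paragraph (the "almost 3 million" is rounded up to 3 million), so the result is an order-of-magnitude estimate and nothing more.

```python
GENOME_BP = 3_000_000_000  # ~3 billion base pairs, rounded
N_DHS = 3_000_000          # "almost 3 million" DHSs, rounded up

# Mean spacing if the DHSs were spread evenly over the genome.
mean_spacing_bp = GENOME_BP / N_DHS
print(f"mean spacing: {mean_spacing_bp:.0f} bp (~{mean_spacing_bp / 1000:.0f} kb)")

# Compare with the typical distal distances of 10 to 50 kb.
for distal_kb in (10, 50):
    ratio = distal_kb * 1000 / mean_spacing_bp
    print(f"{distal_kb} kb is {ratio:.0f}x the mean spacing")
```

So a typical distal DHS sits 10 to 50 times further from its TSS than the genome-wide average spacing between DHSs.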
Now, another interesting aspect of the paper is that they find different DHSs for different cell types. That, in fact, increases the average distance between DHSs: those 3 million they find are for 125 cell lines, and most DHSs are found in fewer than 20 cell lines. Only promoter-related DHSs seem to be more persistent between cell lines. This implies that different cell lines have different genes unfolded in nucleosome/DHS rich areas (defining the chromatin accessibility), triggering different gene expression. That all makes sense, and is rather exciting too. As such, it seems to me that this map effectively gives a predictive model, indicating which genes are expressed in which cell types.
A further question they ask is whether DNA (not histone) methylation is the cause or the result of DHSs. They confirm the earlier found correlation between DNA methylation and gene silencing. They basically question whether things like MeCP2 binding happen because no transcription factor is in the way, or whether TFs cannot bind because MeCP2 is there. Chemically, these are perhaps equivalent: they have competing binding affinities. Except that the methylation must happen at some point too. They suggest that this may be due to DNA getting randomly methylated, perhaps not unlike passive demethylation. Chemically, that does not make sense to me. I would guess there are many chemical species in the cell that would get more easily methylated... They believe they have found evidence for passive deposition, but also find a positive correlation between methylation and gene expression. I would say, the answer is still out there.
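To illustrate what I mean with "chemically equivalent", here is a sketch of the textbook equilibrium model for two species competing for one binding site. This is my own toy model, not something from the paper: it assumes a single site, equilibrium, and free concentrations that are not depleted by binding. The concentrations and dissociation constants below are invented.

```python
def occupancy(conc_a, kd_a, conc_b, kd_b):
    """Equilibrium fractions of one site bound by A, bound by B, or free.

    Concentrations and dissociation constants must share the same unit.
    """
    a = conc_a / kd_a
    b = conc_b / kd_b
    z = 1.0 + a + b  # sum over the three states: free, A-bound, B-bound
    return a / z, b / z, 1.0 / z


# Hypothetical numbers (nM): a TF and MeCP2 competing for the same site.
tf_bound, mecp2_bound, free = occupancy(conc_a=10.0, kd_a=10.0,
                                        conc_b=10.0, kd_b=10.0)
print(f"TF: {tf_bound:.2f}, MeCP2: {mecp2_bound:.2f}, free: {free:.2f}")

# Tenfold tighter MeCP2 binding (e.g. on methylated DNA) shifts the balance.
tf_bound, mecp2_bound, free = occupancy(conc_a=10.0, kd_a=10.0,
                                        conc_b=10.0, kd_b=1.0)
print(f"TF: {tf_bound:.2f}, MeCP2: {mecp2_bound:.2f}, free: {free:.2f}")
```

In this model the occupancies depend only on concentrations and affinities, not on who got there first, which is why the cause-or-result question cannot be settled by the equilibrium picture alone.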
OK, that's about how far I got now. The last two pages I have to read again, and see what papers I need to read to make sense of that. And I will try to see what others have been saying about this paper. One hooray for #altmetrics!
Thurman, R., Rynes, E., Humbert, R., Vierstra, J., Maurano, M., Haugen, E., Sheffield, N., Stergachis, A., Wang, H., Vernot, B., Garg, K., John, S., Sandstrom, R., Bates, D., Boatman, L., Canfield, T., Diegel, M., Dunn, D., Ebersol, A., Frum, T., Giste, E., Johnson, A., Johnson, E., Kutyavin, T., Lajoie, B., Lee, B., Lee, K., London, D., Lotakis, D., Neph, S., Neri, F., Nguyen, E., Qu, H., Reynolds, A., Roach, V., Safi, A., Sanchez, M., Sanyal, A., Shafer, A., Simon, J., Song, L., Vong, S., Weaver, M., Yan, Y., Zhang, Z., Zhang, Z., Lenhard, B., Tewari, M., Dorschner, M., Hansen, R., Navas, P., Stamatoyannopoulos, G., Iyer, V., Lieb, J., Sunyaev, S., Akey, J., Sabo, P., Kaul, R., Furey, T., Dekker, J., Crawford, G., & Stamatoyannopoulos, J. (2012). The accessible chromatin landscape of the human genome. Nature, 489 (7414), 75-82. DOI: 10.1038/nature11232
Felsenfeld G, Boyes J, Chung J, Clark D, & Studitsky V (1996). Chromatin structure and gene expression. Proceedings of the National Academy of Sciences of the United States of America, 93 (18), 9384-8 PMID: 8790338