References

Aird, Daniel, Michael G Ross, Wei-Sheng Chen, Maxwell Danielsson, Timothy Fennell, Carsten Russ, David B Jaffe, Chad Nusbaum, and Andreas Gnirke. 2011. “Analyzing and Minimizing PCR Amplification Bias in Illumina Sequencing Libraries.” Genome Biol 12 (2): R18. https://doi.org/10.1186/gb-2011-12-2-r18.

Akalin, Altuna, Vedran Franke, Kristian Vlahoviček, Christopher E Mason, and Dirk Schübeler. 2015. “genomation: A Toolkit to Summarize, Annotate and Visualize Genomic Intervals.” Bioinformatics 31 (7): 1127–9.

Akalin, Altuna, Matthias Kormaksson, Sheng Li, Francine E Garrett-Bakelman, Maria E Figueroa, Ari Melnick, and Christopher E Mason. 2012. “methylKit: A Comprehensive R Package for the Analysis of Genome-Wide DNA Methylation Profiles.” Genome Biol. 13 (10): R87.

Alberts, B., D. Bray, J. Lewis, M. Raff, K. Roberts, and J.D. Watson. 2002. Molecular Biology of the Cell. 4th ed. Garland.

Allhoff, Manuel, Kristin Seré, Heike Chauvistré, Qiong Lin, Martin Zenke, and Ivan G Costa. 2014. “Detecting Differential Peaks in ChIP-Seq Signals with ODIN.” Bioinformatics 30 (24): 3467–75. https://doi.org/10.1093/bioinformatics/btu722.

Allhoff, Manuel, Kristin Seré, Juliana F Pires, Martin Zenke, and Ivan G Costa. 2016. “Differential Peak Calling of ChIP-Seq Signals with Replicates with THOR.” Nucleic Acids Res 44 (20): e153. https://doi.org/10.1093/nar/gkw680.

Anders, Simon, Alejandro Reyes, and Wolfgang Huber. 2012. “Detecting Differential Usage of Exons from RNA-Seq Data.” Genome Research 22 (10): 2008–17. https://doi.org/10.1101/gr.133744.111.

Angelini, Claudia, Ruth Heller, Rita Volkinshtein, and Daniel Yekutieli. 2015. “Is This the Right Normalization? A Diagnostic Tool for ChIP-Seq Normalization.” BMC Bioinformatics 16 (May): 150. https://doi.org/10.1186/s12859-015-0579-z.

Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, et al. 2000. “Gene Ontology: Tool for the Unification of Biology.” Nat. Genet. 25 (1): 25–29.

“Babraham Bioinformatics - FastQC A Quality Control Tool for High Throughput Sequence Data.” n.d. Accessed July 16, 2018. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.

“Babraham Bioinformatics - Trim Galore!” n.d. Accessed July 16, 2018. https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/.

Backman, Tyler W. H., and Thomas Girke. 2016. “systemPipeR: NGS Workflow and Report Generation Environment.” BMC Bioinformatics 17 (1). https://doi.org/10.1186/s12859-016-1241-0.

Barr, Cory, Thomas Wu, and Michael Lawrence. 2019. gmapR: An R Interface to the GMAP/GSNAP/GSTRUCT Suite.

Beck, Dominik, Miriam B Brandl, Lies Boelen, Ashwin Unnikrishnan, John E Pimanda, and Jason W H Wong. 2012. “Signal Analysis for Genome-Wide Maps of Histone Modifications Measured by ChIP-Seq.” Bioinformatics 28 (8): 1062–9. https://doi.org/10.1093/bioinformatics/bts085.

Benjamini, Yuval, and Terence P Speed. 2012. “Summarizing and Correcting the GC Content Bias in High-Throughput Sequencing.” Nucleic Acids Res 40 (10): e72. https://doi.org/10.1093/nar/gks001.

Biecek, Przemyslaw. 2018. “DALEX: Explainers for Complex Predictive Models in R.” Journal of Machine Learning Research 19 (84): 1–5. http://jmlr.org/papers/v19/18-416.html.

Bock, Christoph, Isabel Beerman, Wen-Hui Lien, Zachary D Smith, Hongcang Gu, Patrick Boyle, Andreas Gnirke, Elaine Fuchs, Derrick J Rossi, and Alexander Meissner. 2012. “DNA Methylation Dynamics During in Vivo Differentiation of Blood and Skin Stem Cells.” Mol. Cell 47 (4): 633–47.

Bolger, Anthony M., Marc Lohse, and Bjoern Usadel. 2014. “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data.” Bioinformatics 30 (15): 2114–20. https://doi.org/10.1093/bioinformatics/btu170.

Bonhoure, Nicolas, Gergana Bounova, David Bernasconi, Viviane Praz, Fabienne Lammers, Donatella Canella, Ian M Willis, et al. 2014. “Quantifying ChIP-Seq Data: A Spiking Method Providing an Internal Reference for Sample-to-Sample Normalization.” Genome Res 24 (7): 1157–68. https://doi.org/10.1101/gr.168260.113.

Boser, Bernhard E, Isabelle M Guyon, and Vladimir N Vapnik. 1992. “A Training Algorithm for Optimal Margin Classifiers.” In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 144–52. ACM.

Bray, Nicolas L., Harold Pimentel, Páll Melsted, and Lior Pachter. 2016. “Near-Optimal Probabilistic RNA-Seq Quantification.” Nature Biotechnology 34 (5): 525–27. https://doi.org/10.1038/nbt.3519.

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32.

Chawla, Nitesh V, Kevin W Bowyer, Lawrence O Hall, and W Philip Kegelmeyer. 2002. “SMOTE: Synthetic Minority over-Sampling Technique.” Journal of Artificial Intelligence Research 16: 321–57.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. ACM.

Chen, Yiwen, Nicolas Negre, Qunhua Li, Joanna O Mieczkowska, Matthew Slattery, Tao Liu, Yong Zhang, et al. 2012. “Systematic Evaluation of Factors Influencing ChIP-Seq Fidelity.” Nat Methods 9 (6): 609–14. https://doi.org/10.1038/nmeth.1985.

Chung, Dongjun, Pei Fen Kuan, Bo Li, Rajendran Sanalkumar, Kun Liang, Emery H Bresnick, Colin Dewey, and Sündüz Keleş. 2011. “Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data.” PLoS Comput Biol 7 (7): e1002111. https://doi.org/10.1371/journal.pcbi.1002111.

Clark, Tyson A, Kristi E Spittle, Stephen W Turner, and Jonas Korlach. 2011. “Direct Detection and Sequencing of Damaged DNA Bases.” Genome Integr. 2 (December): 10.

Conesa, Ana, Pedro Madrigal, Sonia Tarazona, David Gomez-Cabrero, Alejandra Cervera, Andrew McPherson, Michał Wojciech Szcześniak, et al. 2016. “A Survey of Best Practices for RNA-Seq Data Analysis.” Genome Biology 17 (January): 13. https://doi.org/10.1186/s13059-016-0881-8.

Consortium, The Gene Ontology. 2017. “Expansion of the Gene Ontology knowledgebase and resources.” Nucleic Acids Res. 45 (D1): D331–D338.

Cox, T.F., and M.A.A. Cox. 2000. Multidimensional Scaling, Second Edition. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. CRC Press.

Deaton, A M, and A Bird. 2011. “CpG Islands and the Regulation of Transcription.” Genes Dev. 25 (10): 1010–22.

de Souza, Welliton, Benilton S Carvalho, and Iscia Lopes-Cendes. 2018. “Rqc: A Bioconductor Package for Quality Control of High-Throughput Sequencing Data.” Journal of Statistical Software, Code Snippets 87 (2): 1–14. https://doi.org/10.18637/jss.v087.c02.

Dobin, Alexander, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, and Thomas R. Gingeras. 2013. “STAR: Ultrafast Universal RNA-Seq Aligner.” Bioinformatics 29 (1): 15–21. https://doi.org/10.1093/bioinformatics/bts635.

Dodt, Matthias, Johannes T Roehr, Rina Ahmed, and Christoph Dieterich. 2012. “FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms.” Biology (Basel) 1 (3): 895–905. https://doi.org/10.3390/biology1030895.

Dong, X., M. C. Greven, A. Kundaje, S. Djebali, J. B. Brown, C. Cheng, T. R. Gingeras, et al. 2012. “Modeling gene expression using chromatin features in various cellular contexts.” Genome Biol. 13 (9): R53.

Ehrlich, Melanie. 2002. “DNA Methylation in Cancer: Too Much, but Also Too Little.” Oncogene 21 (35): 5400–5413.

Eilbeck, Karen, Suzanna E Lewis, Christopher J Mungall, Mark Yandell, Lincoln Stein, Richard Durbin, and Michael Ashburner. 2005. “The Sequence Ontology: A Tool for the Unification of Genome Annotations.” Genome Biology 6 (5): R44.

Elith, Jane, John R Leathwick, and Trevor Hastie. 2008. “A Working Guide to Boosted Regression Trees.” Journal of Animal Ecology 77 (4): 802–13.

Ernst, Jason, and Manolis Kellis. 2012. “ChromHMM: Automating Chromatin-State Discovery and Characterization.” Nat Methods 9 (3): 215–16. https://doi.org/10.1038/nmeth.1906.

Fabregat, Antonio, Steven Jupe, Lisa Matthews, Konstantinos Sidiropoulos, Marc Gillespie, Phani Garapati, Robin Haw, et al. 2018. “The Reactome Pathway Knowledgebase.” Nucleic Acids Research 46 (D1): D649–D655. https://doi.org/10.1093/nar/gkx1132.

Felsani, Armando, Bjarki Gudmundsson, Simona Nanni, Elena Brini, Anna Moles, Hans Guttormur Thormar, Peter Estibeiro, et al. 2015. “Impact of Different ChIP-Seq Protocols on DNA Integrity and Quality of Bioinformatics Analysis Results.” Brief Funct Genomics 14 (2): 156–62. https://doi.org/10.1093/bfgp/elu001.

Feng, Hao, Karen N Conneely, and Hao Wu. 2014. “A Bayesian Hierarchical Model to Detect Differentially Methylated Loci from Single Nucleotide Resolution Sequencing Data.” Nucleic Acids Res. 42 (8): e69.

Fernandez, M., and D. Miranda-Saavedra. 2012. “Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines.” Nucleic Acids Res. 40 (10): e77.

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2018. “All Models Are Wrong but Many Are Useful: Variable Importance for Black-Box, Proprietary, or Misspecified Prediction Models, Using Model Class Reliance.” arXiv Preprint arXiv:1801.01489.

Friedman, Jerome H. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of Statistics, 1189–1232.

Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer Series in Statistics. New York: Springer.

Friedman, Jerome H, and Jacqueline J Meulman. 2003. “Multiple Additive Regression Trees with Application in Epidemiology.” Statistics in Medicine 22 (9): 1365–81.

Gaidatzis, Dimos, Anita Lerch, Florian Hahne, and Michael B. Stadler. 2015. “QuasR: Quantification and Annotation of Short Reads in R.” Bioinformatics 31 (7): 1130–2. https://doi.org/10.1093/bioinformatics/btu781.

Gandolfo, Luke C., and Terence P. Speed. 2018. “RLE Plots: Visualizing Unwanted Variation in High Dimensional Data.” PloS One 13 (2): e0191629. https://doi.org/10.1371/journal.pone.0191629.

Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016a. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics (Oxford, England) 32 (18): 2847–9. https://doi.org/10.1093/bioinformatics/btw313.

———. 2016b. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics.

Guinney, J., R. Dienstmann, X. Wang, A. de Reynies, A. Schlicker, C. Soneson, L. Marisa, et al. 2015. “The consensus molecular subtypes of colorectal cancer.” Nat. Med. 21 (11): 1350–6.

Haas, Brian J., Alexie Papanicolaou, Moran Yassour, Manfred Grabherr, Philip D. Blood, Joshua Bowden, Matthew Brian Couger, et al. 2013. “De Novo Transcript Sequence Reconstruction from RNA-Seq: Reference Generation and Analysis with Trinity.” Nature Protocols 8 (8). https://doi.org/10.1038/nprot.2013.084.

Hager, Gordon L, James G McNally, and Tom Misteli. 2009. “Transcription Dynamics.” Molecular Cell 35 (6): 741–53.

Han, Zhi, Lu Tian, Thierry Pécot, Tim Huang, Raghu Machiraju, and Kun Huang. 2012. “A Signal Processing Approach for Enriched Region Detection in RNA Polymerase II ChIP-Seq Data.” BMC Bioinformatics 13 Suppl 2 (March): S2. https://doi.org/10.1186/1471-2105-13-S2-S2.

Hartigan, John A, and Manchek A Wong. 1979. “Algorithm AS 136: A K-Means Clustering Algorithm.” Journal of the Royal Statistical Society. Series C (Applied Statistics) 28 (1): 100–108.

He, Yu-Fei, Bin-Zhong Li, Zheng Li, Peng Liu, Yang Wang, Qingyu Tang, Jianping Ding, et al. 2011. “Tet-Mediated Formation of 5-Carboxylcytosine and Its Excision by TDG in Mammalian DNA.” Science 333 (6047): 1303–7.

Helmuth, Johannes, Na Li, Laura Arrigoni, Kathrin Gianmoena, Cristina Cadenas, Gilles Gasparoni, Anupam Sinha, et al. 2016. “normR: Regime Enrichment Calling for ChIP-Seq Data.” bioRxiv.

Henikoff, Steven. 2008. “Nucleosome Destabilization in the Epigenetic Regulation of Gene Expression.” Nature Reviews Genetics 9 (1): 15–26.

Hoerl, Arthur E, and Robert W Kennard. 1970. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics 12 (1): 55–67.

Hoffman, Michael M, Orion J Buske, Jie Wang, Zhiping Weng, Jeff A Bilmes, and William Stafford Noble. 2012. “Unsupervised Pattern Discovery in Human Chromatin Structure Through Genomic Segmentation.” Nat Methods 9 (5): 473–76. https://doi.org/10.1038/nmeth.1937.

Horvath, Steve. 2013. “DNA Methylation Age of Human Tissues and Cell Types.” Genome Biology 14 (10): 3156.

Hsu, Chih-Wei, Chih-Chung Chang, Chih-Jen Lin, et al. 2003. “A Practical Guide to Support Vector Classification.”

Hyvärinen, Aapo. 2013. “Independent Component Analysis: Recent Advances.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371 (1984): 20110534.

Jiang, Lichun, Felix Schlesinger, Carrie A. Davis, Yu Zhang, Renhua Li, Marc Salit, Thomas R. Gingeras, and Brian Oliver. 2011. “Synthetic Spike-in Standards for RNA-Seq Experiments.” Genome Research 21 (9): 1543–51. https://doi.org/10.1101/gr.121095.111.

Jung, Youngsook L, Lovelace J Luquette, Joshua W K Ho, Francesco Ferrari, Michael Tolstorukov, Aki Minoda, Robbyn Issner, et al. 2014. “Impact of Sequencing Depth in ChIP-Seq Experiments.” Nucleic Acids Res 42 (9): e74. https://doi.org/10.1093/nar/gku178.

Kanehisa, M., M. Furumichi, M. Tanabe, Y. Sato, and K. Morishima. 2017. “KEGG: new perspectives on genomes, pathways, diseases and drugs.” Nucleic Acids Res. 45 (D1): D353–D361.

Kanehisa, Minoru, Yoko Sato, Masayuki Kawashima, Miho Furumichi, and Mao Tanabe. 2016. “KEGG as a Reference Resource for Gene and Protein Annotation.” Nucleic Acids Research 44 (Database issue): D457–D462. https://doi.org/10.1093/nar/gkv1070.

Khan, Aziz, Oriol Fornes, Arnaud Stigliani, Marius Gheorghe, Jaime A Castro-Mondragon, Robin van der Lee, Adrien Bessy, et al. 2018. “JASPAR 2018: Update of the Open-Access Database of Transcription Factor Binding Profiles and Its Web Framework.” Nucleic Acids Res 46 (D1): D260–D266. https://doi.org/10.1093/nar/gkx1126.

Kharchenko, Peter V, Michael Y Tolstorukov, and Peter J Park. 2008. “Design and Analysis of ChIP-Seq Experiments for DNA-Binding Proteins.” Nat Biotechnol 26 (12): 1351–9. https://doi.org/10.1038/nbt.1508.

Kidder, Benjamin L, Gangqing Hu, and Keji Zhao. 2011. “ChIP-Seq: Technical Considerations for Obtaining High-Quality Data.” Nat Immunol 12 (10): 918–22. https://doi.org/10.1038/ni.2117.

Kim, Daehwan, Ben Langmead, and Steven L Salzberg. 2015. “HISAT: A Fast Spliced Aligner with Low Memory Requirements.” Nature Methods 12 (4): 357–60. https://doi.org/10.1038/nmeth.3317.

Kim, Daehwan, Geo Pertea, Cole Trapnell, Harold Pimentel, Ryan Kelley, and Steven L. Salzberg. 2013. “TopHat2: Accurate Alignment of Transcriptomes in the Presence of Insertions, Deletions and Gene Fusions.” Genome Biology 14 (4): R36. https://doi.org/10.1186/gb-2013-14-4-r36.

Kolde, Raivo. 2019. pheatmap: Pretty Heatmaps. https://CRAN.R-project.org/package=pheatmap.

Kourou, K., T. P. Exarchos, K. P. Exarchos, M. V. Karamouzis, and D. I. Fotiadis. 2015. “Machine learning applications in cancer prognosis and prediction.” Comput Struct Biotechnol J 13: 8–17.

Krebs, Wolfgang, Susanne V Schmidt, Alon Goren, Dominic De Nardo, Larisa Labzin, Anton Bovier, Thomas Ulas, et al. 2014. “Optimization of Transcription Factor Binding Map Accuracy Utilizing Knockout-Mouse Models.” Nucleic Acids Res 42 (21): 13051–60. https://doi.org/10.1093/nar/gku1078.

Laajala, Teemu D, Sunil Raghav, Soile Tuomela, Riitta Lahesmaa, Tero Aittokallio, and Laura L Elo. 2009. “A Practical Comparison of Methods for Detecting Transcription Factor Binding Sites in ChIP-Seq Experiments.” BMC Genomics 10 (December): 618. https://doi.org/10.1186/1471-2164-10-618.

Landt, Stephen G, Georgi K Marinov, Anshul Kundaje, Pouya Kheradpour, Florencia Pauli, Serafim Batzoglou, Bradley E Bernstein, et al. 2012. “ChIP-Seq Guidelines and Practices of the ENCODE and modENCODE Consortia.” Genome Res 22 (9): 1813–31. https://doi.org/10.1101/gr.136184.111.

Langmead, Ben, and Steven L Salzberg. 2012a. “Fast Gapped-Read Alignment with Bowtie 2.” Nature Methods 9 (4): 357.

———. 2012b. “Fast Gapped-Read Alignment with Bowtie 2.” Nat Methods 9 (4): 357–59. https://doi.org/10.1038/nmeth.1923.

Langmead, Ben, Cole Trapnell, Mihai Pop, and Steven L Salzberg. 2009. “Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the Human Genome.” Genome Biol 10 (3): R25. https://doi.org/10.1186/gb-2009-10-3-r25.

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436.

Lee, Daniel D, and H Sebastian Seung. 2001. “Algorithms for Non-Negative Matrix Factorization.” In Advances in Neural Information Processing Systems, 556–62.

Leek, Jeffrey T., W. Evan Johnson, Hilary S. Parker, Andrew E. Jaffe, and John D. Storey. 2012. “The sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034.

Li, Heng. 2011. “Tabix: Fast Retrieval of Sequence Features from Generic TAB-delimited Files.” Bioinformatics 27 (5): 718–19.

Li, Heng, and Richard Durbin. 2009a. “Fast and Accurate Short Read Alignment with Burrows–Wheeler Transform.” Bioinformatics 25 (14): 1754–60.

———. 2009b. “Fast and Accurate Short Read Alignment with Burrows-Wheeler Transform.” Bioinformatics 25 (14): 1754–60. https://doi.org/10.1093/bioinformatics/btp324.

Li, Ruiqiang, Chang Yu, Yingrui Li, Tak-Wah Lam, Siu-Ming Yiu, Karsten Kristiansen, and Jun Wang. 2009. “SOAP2: An Improved Ultrafast Tool for Short Read Alignment.” Bioinformatics 25 (15): 1966–7.

Li, Wentian, and Jan Freudenberg. 2014. “Mappability and Read Length.” Front Genet 5 (November): 381. https://doi.org/10.3389/fgene.2014.00381.

Liang, Kun, and Sündüz Keleş. 2012. “Normalization of ChIP-Seq Data with Control.” BMC Bioinformatics 13 (August): 199. https://doi.org/10.1186/1471-2105-13-199.

Liao, Yang, Gordon K. Smyth, and Wei Shi. 2013. “The Subread Aligner: Fast, Accurate and Scalable Read Mapping by Seed-and-Vote.” Nucleic Acids Research 41 (10): e108. https://doi.org/10.1093/nar/gkt214.

Libbrecht, M. W., and W. S. Noble. 2015. “Machine learning applications in genetics and genomics.” Nat. Rev. Genet. 16 (6): 321–32.

Lister, R, E A Mukamel, J R Nery, M Urich, C A Puddifoot, N D Johnson, J Lucero, et al. 2013. “Global Epigenomic Reconfiguration During Mammalian Brain Development.” Science 341 (6146): 1237905.

Lister, Ryan, Mattia Pelizzola, Robert H Dowen, R David Hawkins, Gary Hon, Julian Tonti-Filippini, Joseph R Nery, et al. 2009. “Human DNA Methylomes at Base Resolution Show Widespread Epigenomic Differences.” Nature 462 (7271): 315–22.

Love, Michael I, Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2.” Genome Biology 15 (12). https://doi.org/10.1186/s13059-014-0550-8.

Lövkvist, Cecilia, Ian B Dodd, Kim Sneppen, and Jan O Haerter. 2016. “DNA Methylation in Human Epigenomes Depends on Local Topology of CpG Sites.” Nucleic Acids Res. 44 (11): 5123–32.

Lun, Aaron T L, and Gordon K Smyth. 2014. “De Novo Detection of Differentially Bound Regions for ChIP-Seq Data Using Peaks and Windows: Controlling Error Rates Correctly.” Nucleic Acids Res 42 (11): e95. https://doi.org/10.1093/nar/gku351.

Luo, Weijun, Michael S Friedman, Kerby Shedden, Kurt D Hankenson, and Peter J Woolf. 2009. “GAGE: Generally Applicable Gene Set Enrichment for Pathway Analysis.” BMC Bioinformatics 10 (1): 161. https://doi.org/10.1186/1471-2105-10-161.

Maaten, Laurens van der, and Geoffrey Hinton. 2008. “Visualizing Data Using t-SNE.” Journal of Machine Learning Research 9 (Nov): 2579–2605.

Mathe, C., M. F. Sagot, T. Schiex, and P. Rouze. 2002. “Current methods of gene prediction, their strengths and weaknesses.” Nucleic Acids Res. 30 (19): 4103–17.

Maza, Elie, Pierre Frasse, Pavel Senin, Mondher Bouzayen, and Mohamed Zouine. 2013. “Comparison of Normalization Methods for Differential Gene Expression Analysis in RNA-Seq Experiments: A Matter of Relative Size of Studied Transcriptomes.” Communicative & Integrative Biology 6 (6): e25849. https://doi.org/10.4161/cib.25849.

McKenna, Aaron, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran Garimella, et al. 2010. “The Genome Analysis Toolkit: A MapReduce Framework for Analyzing Next-Generation DNA Sequencing Data.” Genome Research 20 (9): 1297–1303. https://doi.org/10.1101/gr.107524.110.

McPherson, Andrew, Fereydoun Hormozdiari, Abdalnasser Zayed, Ryan Giuliany, Gavin Ha, Mark G. F. Sun, Malachi Griffith, et al. 2011. “deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data.” PLOS Computational Biology 7 (5): e1001138. https://doi.org/10.1371/journal.pcbi.1001138.

Micsinai, Mariann, Fabio Parisi, Francesco Strino, Patrik Asp, Brian D Dynlacht, and Yuval Kluger. 2012. “Picking ChIP-Seq Peak Detectors for Analyzing Chromatin Modification Experiments.” Nucleic Acids Res 40 (9): e70. https://doi.org/10.1093/nar/gks048.

Morgan, Martin, Simon Anders, Michael Lawrence, Patrick Aboyoun, Hervé Pagès, and Robert Gentleman. 2009. “ShortRead: A Bioconductor Package for Input, Quality Assessment and Exploration of High-Throughput Sequence Data.” Bioinformatics 25 (19): 2607–8. https://doi.org/10.1093/bioinformatics/btp450.

Mortazavi, Ali, Shirley Pepke, Camden Jansen, Georgi K Marinov, Jason Ernst, Manolis Kellis, Ross C Hardison, Richard M Myers, and Barbara J Wold. 2013. “Integrating and Mining the Chromatin Landscape of Cell-Type Specificity Using Self-Organizing Maps.” Genome Res 23 (12): 2136–48. https://doi.org/10.1101/gr.158261.113.

Mortazavi, Ali, Brian A. Williams, Kenneth McCue, Lorian Schaeffer, and Barbara Wold. 2008. “Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq.” Nature Methods 5 (7): 621–28. https://doi.org/10.1038/nmeth.1226.

Noushmehr, H., D. J. Weisenberger, K. Diefes, H. S. Phillips, K. Pujara, B. P. Berman, F. Pan, et al. 2010. “Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma.” Cancer Cell 17 (5): 510–22.

Numata, Shusuke, Tianzhang Ye, Thomas M Hyde, Xavier Guitart-Navarro, Ran Tao, Michael Wininger, Carlo Colantuoni, Daniel R Weinberger, Joel E Kleinman, and Barbara K Lipska. 2012. “DNA Methylation Signatures in Development and Aging of the Human Prefrontal Cortex.” The American Journal of Human Genetics 90 (2): 260–72.

Patro, Rob, Geet Duggal, Michael I Love, Rafael A Irizarry, and Carl Kingsford. 2017. “Salmon: Fast and Bias-Aware Quantification of Transcript Expression Using Dual-Phase Inference.” Nature Methods 14 (4): 417–19. https://doi.org/10.1038/nmeth.4197.

Patro, Rob, Stephen M. Mount, and Carl Kingsford. 2014. “Sailfish Enables Alignment-Free Isoform Quantification from RNA-Seq Reads Using Lightweight Algorithms.” Nature Biotechnology 32 (5): 462–64. https://doi.org/10.1038/nbt.2862.

Phillips, Jennifer E, and Victor G Corces. 2009. “CTCF: Master Weaver of the Genome.” Cell 137 (7): 1194–1211.

Poplin, R., P. C. Chang, D. Alexander, S. Schwartz, T. Colthurst, A. Ku, D. Newburger, et al. 2018. “A universal SNP and small-indel variant caller using deep neural networks.” Nat. Biotechnol. 36 (10): 983–87.

Rashid, Naim U, Paul G Giresi, Joseph G Ibrahim, Wei Sun, and Jason D Lieb. 2011. “ZINBA Integrates Local Covariates with DNA-Seq Data to Identify Broad and Narrow Regions of Enrichment, Even Within Amplified Genomic Regions.” Genome Biol 12 (7): R67. https://doi.org/10.1186/gb-2011-12-7-r67.

Reynolds, Alan P, Graeme Richards, Beatriz de la Iglesia, and Victor J Rayward-Smith. 2006. “Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms.” Journal of Mathematical Modelling and Algorithms 5 (4): 475–504.

Risso, Davide, John Ngai, Terence P. Speed, and Sandrine Dudoit. 2014. “Normalization of RNA-Seq Data Using Factor Analysis of Control Genes or Samples.” Nature Biotechnology 32 (9): 896–902. https://doi.org/10.1038/nbt.2931.

Risso, Davide, Katja Schwartz, Gavin Sherlock, and Sandrine Dudoit. 2011. “GC-Content Normalization for RNA-Seq Data.” BMC Bioinformatics 12 (December): 480. https://doi.org/10.1186/1471-2105-12-480.

Robertson, Gordon, Jacqueline Schein, Readman Chiu, Richard Corbett, Matthew Field, Shaun D. Jackman, Karen Mungall, et al. 2010. “De Novo Assembly and Analysis of RNA-Seq Data.” Nature Methods 7 (11): 909–12. https://doi.org/10.1038/nmeth.1517.

Robinson, Mark D., Davis J. McCarthy, and Gordon K. Smyth. 2010. “edgeR: A Bioconductor Package for Differential Expression Analysis of Digital Gene Expression Data.” Bioinformatics (Oxford, England) 26 (1): 139–40. https://doi.org/10.1093/bioinformatics/btp616.

Rousseeuw, Peter J. 1987. “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis.” Journal of Computational and Applied Mathematics 20: 53–65.

Ruffalo, Matthew, Thomas LaFramboise, and Mehmet Koyutürk. 2011. “Comparative Analysis of Algorithms for Next-Generation Sequencing Read Alignment.” Bioinformatics 27 (20): 2790–6. https://doi.org/10.1093/bioinformatics/btr477.

Schübeler, Dirk. 2015. “Function and Information Content of DNA Methylation.” Nature 517 (7534): 321–26.

Schwartz, Yuri B, and Vincenzo Pirrotta. 2007. “Polycomb Silencing Mechanisms and the Management of Genomic Programmes.” Nature Reviews Genetics 8 (1): 9–22.

Shao, Zhen, Yijing Zhang, Guo-Cheng Yuan, Stuart H Orkin, and David J Waxman. 2012. “MAnorm: A Robust Model for Quantitative Comparison of ChIP-Seq Data Sets.” Genome Biol 13 (3): R16. https://doi.org/10.1186/gb-2012-13-3-r16.

Smith, Zachary D, and Alexander Meissner. 2013. “DNA Methylation: Roles in Mammalian Development.” Nat. Rev. Genet. 14 (3): 204–20.

Song, Qiang, and Andrew D Smith. 2011. “Identifying Dispersed Epigenomic Domains from ChIP-Seq Data.” Bioinformatics 27 (6): 870–71. https://doi.org/10.1093/bioinformatics/btr030.

Sood, Ankur Jai, Coby Viner, and Michael M Hoffman. 2019. “DNAmod: The DNA Modification Database.” Journal of Cheminformatics 11 (1): 30.

Stadler, Michael B, Rabih Murr, Lukas Burger, Robert Ivanek, Florian Lienert, Anne Schöler, Erik van Nimwegen, et al. 2011. “DNA-binding Factors Shape the Mouse Methylome at Distal Regulatory Regions.” Nature 480 (7378): 490–95.

Stanke, Mario, and Burkhard Morgenstern. 2005. “AUGUSTUS: A Web Server for Gene Prediction in Eukaryotes That Allows User-Defined Constraints.” Nucleic Acids Research 33 (Web Server issue): W465–W467. https://doi.org/10.1093/nar/gki458.

Subramanian, Aravind, Pablo Tamayo, Vamsi K. Mootha, Sayan Mukherjee, Benjamin L. Ebert, Michael A. Gillette, Amanda Paulovich, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proceedings of the National Academy of Sciences 102 (43): 15545–50. https://doi.org/10.1073/pnas.0506580102.

Tahiliani, Mamta, Kian Peng Koh, Yinghua Shen, William A Pastor, Hozefa Bandukwala, Yevgeny Brudno, Suneet Agarwal, et al. 2009. “Conversion of 5-Methylcytosine to 5-Hydroxymethylcytosine in Mammalian DNA by MLL Partner TET1.” Science 324 (5929): 930–35.

Tan, Ge, and Boris Lenhard. 2016. “TFBSTools: An R/Bioconductor Package for Transcription Factor Binding Site Analysis.” Bioinformatics 32 (10): 1555–6. https://doi.org/10.1093/bioinformatics/btw024.

Teng, Mingxiang, and Rafael A Irizarry. 2016. “Accounting for GC-Content Bias Reduces Systematic Errors and Batch Effects in ChIP-Seq Peak Callers.” bioRxiv, January. http://biorxiv.org/content/early/2016/11/30/090704.

Teng, Mingxiang, and Rafael A. Irizarry. 2017. “Accounting for GC-Content Bias Reduces Systematic Errors and Batch Effects in ChIP-Seq Data.” Genome Research. https://doi.org/10.1101/gr.220673.117.

Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B (Methodological) 58 (1): 267–88.

Tibshirani, Robert, Guenther Walther, and Trevor Hastie. 2001. “Estimating the Number of Clusters in a Data Set via the Gap Statistic.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2): 411–23.

Trapnell, Cole, Brian A. Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J. van Baren, Steven L. Salzberg, Barbara J. Wold, and Lior Pachter. 2010. “Transcript Assembly and Quantification by RNA-Seq Reveals Unannotated Transcripts and Isoform Switching During Cell Differentiation.” Nature Biotechnology 28 (5): 511–15. https://doi.org/10.1038/nbt.1621.

Wang, L., H. L. McLeod, and R. M. Weinshilboum. 2011. “Genomics and drug response.” N. Engl. J. Med. 364 (12): 1144–53.

Wang, Zhong, Mark Gerstein, and Michael Snyder. 2009. “RNA-Seq: A Revolutionary Tool for Transcriptomics.” Nature Reviews Genetics 10 (1): 57–63. https://doi.org/10.1038/nrg2484.

Wardle, FC, and H Tan. 2015. “A ChIP on the Shoulder? Chromatin Immunoprecipitation and Validation Strategies for ChIP Antibodies [Version 1; Referees: 2 Approved].” F1000Research 4 (235). https://doi.org/10.12688/f1000research.6719.1.

Weinstein, J. N., E. A. Collisson, G. B. Mills, K. R. Shaw, B. A. Ozenberger, K. Ellrott, I. Shmulevich, et al. 2013. “The Cancer Genome Atlas Pan-Cancer analysis project.” Nat. Genet. 45 (10): 1113–20.

Wilbanks, Elizabeth G, and Marc T Facciotti. 2010. “Evaluation of Algorithm Performance in ChIP-Seq Peak Detection.” PLoS ONE 5 (7): e11471. https://doi.org/10.1371/journal.pone.0011471.

Wu, Thomas D., Jens Reeder, Michael Lawrence, Gabe Becker, and Matthew J. Brauer. 2016. “GMAP and GSNAP for Genomic Sequence Alignment: Enhancements to Speed, Accuracy, and Functionality.” Methods in Molecular Biology (Clifton, N.J.) 1418: 283–334. https://doi.org/10.1007/978-1-4939-3578-9_15.

Xing, Haipeng, Yifan Mo, Will Liao, and Michael Q Zhang. 2012. “Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-Seq Data.” PLoS Comput Biol 8 (7): e1002613. https://doi.org/10.1371/journal.pcbi.1002613.

Xu, Han, Lusy Handoko, Xueliang Wei, Chaopeng Ye, Jianpeng Sheng, Chia-Lin Wei, Feng Lin, and Wing-Kin Sung. 2010. “A Signal-Noise Model for Significance Analysis of ChIP-Seq with Negative Control.” Bioinformatics 26 (9): 1199–1204. https://doi.org/10.1093/bioinformatics/btq128.

Yao, Zizhen. 2012. motifRG: A Package for Discriminative Motif Discovery, Designed for High Throughput Sequencing Dataset.

Zang, Chongzhi, Dustin E Schones, Chen Zeng, Kairong Cui, Keji Zhao, and Weiqun Peng. 2009. “A Clustering Approach for Identification of Enriched Domains from Histone Modification ChIP-Seq Data.” Bioinformatics 25 (15): 1952–8. https://doi.org/10.1093/bioinformatics/btp340.

Zhang, Yanxiao, Yu-Hsuan Lin, Timothy D Johnson, Laura S Rozek, and Maureen A Sartor. 2014. “PePr: A Peak-Calling Prioritization Pipeline to Identify Consistent or Differential Peaks from Replicated ChIP-Seq Data.” Bioinformatics 30 (18): 2568–75. https://doi.org/10.1093/bioinformatics/btu372.

Zhang, Yong, Tao Liu, Clifford A Meyer, Jérôme Eeckhoute, David S Johnson, Bradley E Bernstein, Chad Nusbaum, et al. 2008. “Model-Based Analysis of ChIP-Seq (MACS).” Genome Biol 9 (9): R137. https://doi.org/10.1186/gb-2008-9-9-r137.

Zhou, J., and O. G. Troyanskaya. 2015. “Predicting effects of noncoding variants with deep learning-based sequence model.” Nat. Methods 12 (10): 931–34.

Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20.