References

Aird, Ross, Chen, Danielsson, Fennell, Russ, Jaffe, Nusbaum, and Gnirke. 2011. “Analyzing and Minimizing PCR Amplification Bias in Illumina Sequencing Libraries.” Genome Biol 12 (2): R18. https://doi.org/10.1186/gb-2011-12-2-r18.

Akalin, Franke, Vlahoviček, Mason, and Schübeler. 2015. “genomation: A Toolkit to Summarize, Annotate and Visualize Genomic Intervals.” Bioinformatics 31 (7): 1127–9.

Akalin, Kormaksson, Li, Garrett-Bakelman, Figueroa, Melnick, and Mason. 2012. “methylKit: A Comprehensive R Package for the Analysis of Genome-Wide DNA Methylation Profiles.” Genome Biol. 13 (10): R87.

Alberts, Bray, Lewis, Raff, Roberts, and Watson. 2002. Molecular Biology of the Cell. 4th ed. Garland.

Allhoff, Seré, Chauvistré, Lin, Zenke, and Costa. 2014. “Detecting Differential Peaks in ChIP-Seq Signals with ODIN.” Bioinformatics 30 (24): 3467–75. https://doi.org/10.1093/bioinformatics/btu722.

Allhoff, Seré, Pires, Zenke, and Costa. 2016. “Differential Peak Calling of ChIP-Seq Signals with Replicates with THOR.” Nucleic Acids Res 44 (20): e153. https://doi.org/10.1093/nar/gkw680.

Anders, Reyes, and Huber. 2012. “Detecting Differential Usage of Exons from RNA-Seq Data.” Genome Research 22 (10): 2008–17. https://doi.org/10.1101/gr.133744.111.

Andrews. 2010. “FastQC: A Quality Control Tool for High Throughput Sequence Data.” Babraham Bioinformatics. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.

Angelini, Heller, Volkinshtein, and Yekutieli. 2015. “Is This the Right Normalization? A Diagnostic Tool for ChIP-Seq Normalization.” BMC Bioinformatics 16 (May): 150. https://doi.org/10.1186/s12859-015-0579-z.

Ashburner, Ball, Blake, et al. 2000. “Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.” Nat. Genet. 25 (1): 25–29.

Backman, and Girke. 2016. “systemPipeR: NGS Workflow and Report Generation Environment.” BMC Bioinformatics 17 (1). https://doi.org/10.1186/s12859-016-1241-0.

Barr, Wu, and Lawrence. 2019. gmapR: An R Interface to the GMAP/GSNAP/GSTRUCT Suite.

Bartel. 2004. “MicroRNAs: Genomics, Biogenesis, Mechanism, and Function.” Cell 116 (2): 281–97.

Beck, Brandl, Boelen, Unnikrishnan, Pimanda, and Wong. 2012. “Signal Analysis for Genome-Wide Maps of Histone Modifications Measured by ChIP-Seq.” Bioinformatics 28 (8): 1062–9. https://doi.org/10.1093/bioinformatics/bts085.

Benjamini, and Speed. 2012. “Summarizing and Correcting the GC Content Bias in High-Throughput Sequencing.” Nucleic Acids Res 40 (10): e72. https://doi.org/10.1093/nar/gks001.

Biecek. 2018. “DALEX: Explainers for Complex Predictive Models in R.” Journal of Machine Learning Research 19 (84): 1–5. http://jmlr.org/papers/v19/18-416.html.

Bock, Beerman, Lien, et al. 2012. “DNA Methylation Dynamics During in Vivo Differentiation of Blood and Skin Stem Cells.” Mol. Cell 47 (4): 633–47.

Bolger, Lohse, and Usadel. 2014. “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data.” Bioinformatics 30 (15): 2114–20. https://doi.org/10.1093/bioinformatics/btu170.

Bonhoure, Bounova, Bernasconi, et al. 2014. “Quantifying ChIP-Seq Data: A Spiking Method Providing an Internal Reference for Sample-to-Sample Normalization.” Genome Res 24 (7): 1157–68. https://doi.org/10.1101/gr.168260.113.

Boser, Guyon, and Vapnik. 1992. “A Training Algorithm for Optimal Margin Classifiers.” In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 144–52. ACM.

Bray, Pimentel, Melsted, and Pachter. 2016. “Near-Optimal Probabilistic RNA-Seq Quantification.” Nature Biotechnology 34 (5): 525–27. https://doi.org/10.1038/nbt.3519.

Breiman. 2001. “Random Forests.” Machine Learning 45 (1): 5–32.

Chawla, Bowyer, Hall, and Kegelmeyer. 2002. “SMOTE: Synthetic Minority Over-Sampling Technique.” Journal of Artificial Intelligence Research 16: 321–57.

Chen, and Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. ACM.

Chen, Negre, Li, et al. 2012. “Systematic Evaluation of Factors Influencing ChIP-Seq Fidelity.” Nat Methods 9 (6): 609–14. https://doi.org/10.1038/nmeth.1985.

Chung, Kuan, Li, Sanalkumar, Liang, Bresnick, Dewey, and Keleş. 2011. “Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data.” PLoS Comput Biol 7 (7): e1002111. https://doi.org/10.1371/journal.pcbi.1002111.

Clark, Spittle, Turner, and Korlach. 2011. “Direct Detection and Sequencing of Damaged DNA Bases.” Genome Integr. 2 (December): 10.

Conesa, Madrigal, Tarazona, et al. 2016. “A Survey of Best Practices for RNA-Seq Data Analysis.” Genome Biology 17 (January): 13. https://doi.org/10.1186/s13059-016-0881-8.

Gene Ontology Consortium. 2017. “Expansion of the Gene Ontology Knowledgebase and Resources.” Nucleic Acids Res. 45 (D1): D331–D338.

Cox, and Cox. 2000. Multidimensional Scaling, Second Edition. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. CRC Press.

Crawley. 2012. The R Book. Wiley. https://books.google.de/books?id=XYDl0mlH-moC.

Deaton, and Bird. 2011. “CpG Islands and the Regulation of Transcription.” Genes Dev. 25 (10): 1010–22.

De Hertogh, De Meulder, Berger, Pierre, Bareke, Gaigneaux, and Depiereux. 2010. “A Benchmark for Statistical Microarray Data Analysis That Preserves Actual Biological and Technical Variance.” BMC Bioinformatics 11 (1): 17.

de Souza, Carvalho, and Lopes-Cendes. 2018. “Rqc: A Bioconductor Package for Quality Control of High-Throughput Sequencing Data.” Journal of Statistical Software, Code Snippets 87 (2): 1–14. https://doi.org/10.18637/jss.v087.c02.

Diez, Barr, and Çetinkaya-Rundel. 2015. OpenIntro Statistics. OpenIntro, Incorporated. https://books.google.de/books?id=wfcPswEACAAJ.

Dobin, Davis, Schlesinger, Drenkow, Zaleski, Jha, Batut, Chaisson, and Gingeras. 2013. “STAR: Ultrafast Universal RNA-Seq Aligner.” Bioinformatics 29 (1): 15–21. https://doi.org/10.1093/bioinformatics/bts635.

Dong, Greven, Kundaje, et al. 2012. “Modeling gene expression using chromatin features in various cellular contexts.” Genome Biol. 13 (9): R53.

Ehrlich. 2002. “DNA Methylation in Cancer: Too Much, but Also Too Little.” Oncogene 21 (35): 5400–5413.

Eilbeck, Lewis, Mungall, Yandell, Stein, Durbin, and Ashburner. 2005. “The Sequence Ontology: A Tool for the Unification of Genome Annotations.” Genome Biology 6 (5): R44.

Elith, Leathwick, and Hastie. 2008. “A Working Guide to Boosted Regression Trees.” Journal of Animal Ecology 77 (4): 802–13.

ENCODE Project Consortium. 2012. “An Integrated Encyclopedia of DNA Elements in the Human Genome.” Nature 489 (7414): 57–74.

Ernst, and Kellis. 2012. “ChromHMM: Automating Chromatin-State Discovery and Characterization.” Nat Methods 9 (3): 215–16. https://doi.org/10.1038/nmeth.1906.

Fabregat, Jupe, Matthews, et al. 2018. “The Reactome Pathway Knowledgebase.” Nucleic Acids Research 46 (D1): D649–D655. https://doi.org/10.1093/nar/gkx1132.

Felsani, Gudmundsson, Nanni, et al. 2015. “Impact of Different ChIP-Seq Protocols on DNA Integrity and Quality of Bioinformatics Analysis Results.” Brief Funct Genomics 14 (2): 156–62. https://doi.org/10.1093/bfgp/elu001.

Feng, Conneely, and Wu. 2014. “A Bayesian Hierarchical Model to Detect Differentially Methylated Loci from Single Nucleotide Resolution Sequencing Data.” Nucleic Acids Res. 42 (8): e69.

Fernandez, and Miranda-Saavedra. 2012. “Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines.” Nucleic Acids Res. 40 (10): e77.

Fisher, Rudin, and Dominici. 2018. “All Models Are Wrong but Many Are Useful: Variable Importance for Black-Box, Proprietary, or Misspecified Prediction Models, Using Model Class Reliance.” arXiv Preprint arXiv:1801.01489.

Friedman. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of Statistics, 1189–1232.

Friedman, Hastie, and Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer Series in Statistics. New York: Springer.

Friedman, Hastie, and Tibshirani. 2010. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software 33 (1): 1.

Friedman, and Meulman. 2003. “Multiple Additive Regression Trees with Application in Epidemiology.” Statistics in Medicine 22 (9): 1365–81.

Gaidatzis, Lerch, Hahne, and Stadler. 2015. “QuasR: Quantification and Annotation of Short Reads in R.” Bioinformatics 31 (7): 1130–2. https://doi.org/10.1093/bioinformatics/btu781.

Gandolfo, and Speed. 2018. “RLE Plots: Visualizing Unwanted Variation in High Dimensional Data.” PloS One 13 (2): e0191629. https://doi.org/10.1371/journal.pone.0191629.

Gonick, and Smith. 2005. The Cartoon Guide to Statistics. Collins Reference. https://books.google.de/books?id=-U7vygAACAAJ.

Gu, Eils, and Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics (Oxford, England) 32 (18): 2847–9. https://doi.org/10.1093/bioinformatics/btw313.

Guinney, Dienstmann, Wang, et al. 2015. “The consensus molecular subtypes of colorectal cancer.” Nat. Med. 21 (11): 1350–6.

Haas, Papanicolaou, Yassour, et al. 2013. “De Novo Transcript Sequence Reconstruction from RNA-Seq: Reference Generation and Analysis with Trinity.” Nature Protocols 8 (8). https://doi.org/10.1038/nprot.2013.084.

Hager, McNally, and Misteli. 2009. “Transcription Dynamics.” Molecular Cell 35 (6): 741–53.

Han, Tian, Pécot, Huang, Machiraju, and Huang. 2012. “A Signal Processing Approach for Enriched Region Detection in RNA Polymerase II ChIP-Seq Data.” BMC Bioinformatics 13 Suppl 2 (March): S2. https://doi.org/10.1186/1471-2105-13-S2-S2.

Hartigan, and Wong. 1979. “Algorithm AS 136: A K-Means Clustering Algorithm.” Journal of the Royal Statistical Society. Series C (Applied Statistics) 28 (1): 100–108.

He, Li, Li, et al. 2011. “Tet-Mediated Formation of 5-Carboxylcytosine and Its Excision by TDG in Mammalian DNA.” Science 333 (6047): 1303–7.

Helmuth, Li, Arrigoni, et al. 2016. “normR: Regime Enrichment Calling for ChIP-Seq Data.” bioRxiv.

Henikoff. 2008. “Nucleosome Destabilization in the Epigenetic Regulation of Gene Expression.” Nature Reviews Genetics 9 (1): 15–26.

Hoerl, and Kennard. 1970. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics 12 (1): 55–67.

Hoffman, Buske, Wang, Weng, Bilmes, and Noble. 2012. “Unsupervised Pattern Discovery in Human Chromatin Structure Through Genomic Segmentation.” Nat Methods 9 (5): 473–76. https://doi.org/10.1038/nmeth.1937.

Horvath. 2013. “DNA Methylation Age of Human Tissues and Cell Types.” Genome Biology 14 (10): 3156.

Hsu, Chang, and Lin. 2003. “A Practical Guide to Support Vector Classification.”

Hyvärinen. 2013. “Independent Component Analysis: Recent Advances.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371 (1984): 20110534.

James, Witten, Hastie, and Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer New York. https://books.google.de/books?id=qcI_AAAAQBAJ.

Jiang, Schlesinger, Davis, Zhang, Li, Salit, Gingeras, and Oliver. 2011. “Synthetic Spike-in Standards for RNA-Seq Experiments.” Genome Research 21 (9): 1543–51. https://doi.org/10.1101/gr.121095.111.

Jung, Luquette, Ho, et al. 2014. “Impact of Sequencing Depth in ChIP-Seq Experiments.” Nucleic Acids Res 42 (9): e74. https://doi.org/10.1093/nar/gku178.

Kanehisa, Furumichi, Tanabe, Sato, and Morishima. 2017. “KEGG: new perspectives on genomes, pathways, diseases and drugs.” Nucleic Acids Res. 45 (D1): D353–D361.

Kanehisa, Sato, Kawashima, Furumichi, and Tanabe. 2016. “KEGG as a Reference Resource for Gene and Protein Annotation.” Nucleic Acids Research 44 (Database issue): D457–D462. https://doi.org/10.1093/nar/gkv1070.

Khan, Fornes, Stigliani, et al. 2018. “JASPAR 2018: Update of the Open-Access Database of Transcription Factor Binding Profiles and Its Web Framework.” Nucleic Acids Res 46 (D1): D260–D266. https://doi.org/10.1093/nar/gkx1126.

Kharchenko, Tolstorukov, and Park. 2008. “Design and Analysis of ChIP-Seq Experiments for DNA-Binding Proteins.” Nat Biotechnol 26 (12): 1351–9. https://doi.org/10.1038/nbt.1508.

Kidder, Hu, and Zhao. 2011. “ChIP-Seq: Technical Considerations for Obtaining High-Quality Data.” Nat Immunol 12 (10): 918–22. https://doi.org/10.1038/ni.2117.

Kim, Langmead, and Salzberg. 2015. “HISAT: A Fast Spliced Aligner with Low Memory Requirements.” Nature Methods 12 (4): 357–60. https://doi.org/10.1038/nmeth.3317.

Kim, Pertea, Trapnell, Pimentel, Kelley, and Salzberg. 2013. “TopHat2: Accurate Alignment of Transcriptomes in the Presence of Insertions, Deletions and Gene Fusions.” Genome Biology 14 (4): R36. https://doi.org/10.1186/gb-2013-14-4-r36.

Kolde. 2019. pheatmap: Pretty Heatmaps. https://CRAN.R-project.org/package=pheatmap.

Kourou, Exarchos, Exarchos, Karamouzis, and Fotiadis. 2015. “Machine learning applications in cancer prognosis and prediction.” Comput Struct Biotechnol J 13: 8–17.

Krebs, Schmidt, Goren, et al. 2014. “Optimization of Transcription Factor Binding Map Accuracy Utilizing Knockout-Mouse Models.” Nucleic Acids Res 42 (21): 13051–60. https://doi.org/10.1093/nar/gku1078.

Krueger, and Andrews. 2011. “Bismark: A Flexible Aligner and Methylation Caller for Bisulfite-Seq Applications.” Bioinformatics 27 (11): 1571–2.

Kutner, Nachtsheim, and Neter. 2003. Applied Linear Regression Models. The McGraw-Hill/Irwin Series Operations and Decision Sciences. McGraw-Hill Higher Education. https://books.google.de/books?id=0nAMAAAACAAJ.

Laajala, Raghav, Tuomela, Lahesmaa, Aittokallio, and Elo. 2009. “A Practical Comparison of Methods for Detecting Transcription Factor Binding Sites in ChIP-Seq Experiments.” BMC Genomics 10 (December): 618. https://doi.org/10.1186/1471-2164-10-618.

Landt, Marinov, Kundaje, et al. 2012. “ChIP-Seq Guidelines and Practices of the ENCODE and modENCODE Consortia.” Genome Res 22 (9): 1813–31. https://doi.org/10.1101/gr.136184.111.

Langmead, and Salzberg. 2012. “Fast Gapped-Read Alignment with Bowtie 2.” Nat Methods 9 (4): 357–59. https://doi.org/10.1038/nmeth.1923.

Langmead, Trapnell, Pop, and Salzberg. 2009. “Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the Human Genome.” Genome Biol 10 (3): R25. https://doi.org/10.1186/gb-2009-10-3-r25.

LeCun, Bengio, and Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436.

Leek, Johnson, Parker, Jaffe, and Storey. 2012. “The sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034.

Lee, and Seung. 2001. “Algorithms for Non-Negative Matrix Factorization.” In Advances in Neural Information Processing Systems, 556–62.

Li. 2011. “Tabix: Fast Retrieval of Sequence Features from Generic TAB-delimited Files.” Bioinformatics 27 (5): 718–19.

Liang, and Keleş. 2012. “Normalization of ChIP-Seq Data with Control.” BMC Bioinformatics 13 (August): 199. https://doi.org/10.1186/1471-2105-13-199.

Liao, Smyth, and Shi. 2013. “The Subread Aligner: Fast, Accurate and Scalable Read Mapping by Seed-and-Vote.” Nucleic Acids Research 41 (10): e108. https://doi.org/10.1093/nar/gkt214.

Libbrecht, and Noble. 2015. “Machine learning applications in genetics and genomics.” Nat. Rev. Genet. 16 (6): 321–32.

Li, and Durbin. 2009. “Fast and Accurate Short Read Alignment with Burrows–Wheeler Transform.” Bioinformatics 25 (14): 1754–60. https://doi.org/10.1093/bioinformatics/btp324.

Li, and Freudenberg. 2014. “Mappability and Read Length.” Front Genet 5 (November): 381. https://doi.org/10.3389/fgene.2014.00381.

Lister, Mukamel, Nery, et al. 2013. “Global Epigenomic Reconfiguration During Mammalian Brain Development.” Science 341 (6146): 1237905.

Lister, Pelizzola, Dowen, et al. 2009. “Human DNA Methylomes at Base Resolution Show Widespread Epigenomic Differences.” Nature 462 (7271): 315–22.

Li, Yu, Li, Lam, Yiu, Kristiansen, and Wang. 2009. “SOAP2: An Improved Ultrafast Tool for Short Read Alignment.” Bioinformatics 25 (15): 1966–7.

Love, Huber, and Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2.” Genome Biology 15 (12). https://doi.org/10.1186/s13059-014-0550-8.

Lövkvist, Dodd, Sneppen, and Haerter. 2016. “DNA Methylation in Human Epigenomes Depends on Local Topology of CpG Sites.” Nucleic Acids Res. 44 (11): 5123–32.

Lun, and Smyth. 2014. “De Novo Detection of Differentially Bound Regions for ChIP-Seq Data Using Peaks and Windows: Controlling Error Rates Correctly.” Nucleic Acids Res 42 (11): e95. https://doi.org/10.1093/nar/gku351.

Luo, Friedman, Shedden, Hankenson, and Woolf. 2009. “GAGE: Generally Applicable Gene Set Enrichment for Pathway Analysis.” BMC Bioinformatics 10 (1): 161. https://doi.org/10.1186/1471-2105-10-161.

Maaten, and Hinton. 2008. “Visualizing Data Using t-SNE.” Journal of Machine Learning Research 9 (Nov): 2579–2605.

Mathe, Sagot, Schiex, and Rouze. 2002. “Current methods of gene prediction, their strengths and weaknesses.” Nucleic Acids Res. 30 (19): 4103–17.

Maza, Frasse, Senin, Bouzayen, and Zouine. 2013. “Comparison of Normalization Methods for Differential Gene Expression Analysis in RNA-Seq Experiments: A Matter of Relative Size of Studied Transcriptomes.” Communicative & Integrative Biology 6 (6): e25849. https://doi.org/10.4161/cib.25849.

McKenna, Hanna, Banks, et al. 2010. “The Genome Analysis Toolkit: A MapReduce Framework for Analyzing Next-Generation DNA Sequencing Data.” Genome Research 20 (9): 1297–1303. https://doi.org/10.1101/gr.107524.110.

McPherson, Hormozdiari, Zayed, et al. 2011. “deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data.” PLOS Computational Biology 7 (5): e1001138. https://doi.org/10.1371/journal.pcbi.1001138.

Mermel, Schumacher, Hill, Meyerson, Beroukhim, and Getz. 2011. “GISTIC2.0 Facilitates Sensitive and Confident Localization of the Targets of Focal Somatic Copy-Number Alteration in Human Cancers.” Genome Biology 12 (4): R41.

Micsinai, Parisi, Strino, Asp, Dynlacht, and Kluger. 2012. “Picking ChIP-Seq Peak Detectors for Analyzing Chromatin Modification Experiments.” Nucleic Acids Res 40 (9): e70. https://doi.org/10.1093/nar/gks048.

Morgan, Anders, Lawrence, Aboyoun, Pagès, and Gentleman. 2009. “ShortRead: A Bioconductor Package for Input, Quality Assessment and Exploration of High-Throughput Sequence Data.” Bioinformatics 25 (19): 2607–8. https://doi.org/10.1093/bioinformatics/btp450.

Morris, and Mattick. 2014. “The Rise of Regulatory RNA.” Nature Reviews Genetics 15 (6): 423–37.

Mortazavi, Pepke, Jansen, Marinov, Ernst, Kellis, Hardison, Myers, and Wold. 2013. “Integrating and Mining the Chromatin Landscape of Cell-Type Specificity Using Self-Organizing Maps.” Genome Res 23 (12): 2136–48. https://doi.org/10.1101/gr.158261.113.

Mortazavi, Williams, McCue, Schaeffer, and Wold. 2008. “Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq.” Nature Methods 5 (7): 621–28. https://doi.org/10.1038/nmeth.1226.

Noushmehr, Weisenberger, Diefes, et al. 2010. “Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma.” Cancer Cell 17 (5): 510–22.

Numata, Ye, Hyde, et al. 2012. “DNA Methylation Signatures in Development and Aging of the Human Prefrontal Cortex.” The American Journal of Human Genetics 90 (2): 260–72.

Patro, Duggal, Love, Irizarry, and Kingsford. 2017. “Salmon: Fast and Bias-Aware Quantification of Transcript Expression Using Dual-Phase Inference.” Nature Methods 14 (4): 417–19. https://doi.org/10.1038/nmeth.4197.

Patro, Mount, and Kingsford. 2014. “Sailfish Enables Alignment-Free Isoform Quantification from RNA-Seq Reads Using Lightweight Algorithms.” Nature Biotechnology 32 (5): 462–64. https://doi.org/10.1038/nbt.2862.

Phillips, and Corces. 2009. “CTCF: Master Weaver of the Genome.” Cell 137 (7): 1194–1211.

Poplin, Chang, Alexander, et al. 2018. “A universal SNP and small-indel variant caller using deep neural networks.” Nat. Biotechnol. 36 (10): 983–87.

Rashid, Giresi, Ibrahim, Sun, and Lieb. 2011. “ZINBA Integrates Local Covariates with DNA-Seq Data to Identify Broad and Narrow Regions of Enrichment, Even Within Amplified Genomic Regions.” Genome Biol 12 (7): R67. https://doi.org/10.1186/gb-2011-12-7-r67.

Reynolds, Richards, Iglesia, and Rayward-Smith. 2006. “Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms.” Journal of Mathematical Modelling and Algorithms 5 (4): 475–504.

Risso, Ngai, Speed, and Dudoit. 2014. “Normalization of RNA-Seq Data Using Factor Analysis of Control Genes or Samples.” Nature Biotechnology 32 (9): 896–902. https://doi.org/10.1038/nbt.2931.

Risso, Schwartz, Sherlock, and Dudoit. 2011. “GC-Content Normalization for RNA-Seq Data.” BMC Bioinformatics 12 (December): 480. https://doi.org/10.1186/1471-2105-12-480.

Robertson, Schein, Chiu, et al. 2010. “De Novo Assembly and Analysis of RNA-Seq Data.” Nature Methods 7 (11): 909–12. https://doi.org/10.1038/nmeth.1517.

Robinson, McCarthy, and Smyth. 2010. “edgeR: A Bioconductor Package for Differential Expression Analysis of Digital Gene Expression Data.” Bioinformatics (Oxford, England) 26 (1): 139–40. https://doi.org/10.1093/bioinformatics/btp616.

Rousseeuw. 1987. “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis.” Journal of Computational and Applied Mathematics 20: 53–65.

Ruffalo, LaFramboise, and Koyutürk. 2011. “Comparative Analysis of Algorithms for Next-Generation Sequencing Read Alignment.” Bioinformatics 27 (20): 2790–6. https://doi.org/10.1093/bioinformatics/btr477.

Schübeler. 2015. “Function and Information Content of DNA Methylation.” Nature 517 (7534): 321–26.

Schwartz, and Pirrotta. 2007. “Polycomb Silencing Mechanisms and the Management of Genomic Programmes.” Nature Reviews Genetics 8 (1): 9–22.

Shao, Zhang, Yuan, Orkin, and Waxman. 2012. “MAnorm: A Robust Model for Quantitative Comparison of ChIP-Seq Data Sets.” Genome Biol 13 (3): R16. https://doi.org/10.1186/gb-2012-13-3-r16.

Smith, and Meissner. 2013. “DNA Methylation: Roles in Mammalian Development.” Nat. Rev. Genet. 14 (3): 204–20.

Smyth. 2004. “Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments.” Statistical Applications in Genetics and Molecular Biology 3 (1): 1–25.

Song, and Smith. 2011. “Identifying Dispersed Epigenomic Domains from ChIP-Seq Data.” Bioinformatics 27 (6): 870–71. https://doi.org/10.1093/bioinformatics/btr030.

Sood, Viner, and Hoffman. 2019. “DNAmod: The DNA Modification Database.” Journal of Cheminformatics 11 (1): 30.

Stadler, Murr, Burger, et al. 2011. “DNA-binding Factors Shape the Mouse Methylome at Distal Regulatory Regions.” Nature 480 (7378): 490–95.

Stanke, and Morgenstern. 2005. “AUGUSTUS: A Web Server for Gene Prediction in Eukaryotes That Allows User-Defined Constraints.” Nucleic Acids Research 33 (Web Server issue): W465–W467. https://doi.org/10.1093/nar/gki458.

Storey, and Tibshirani. 2003. “Statistical Significance for Genomewide Studies.” Proc. Natl. Acad. Sci. U. S. A. 100 (16): 9440–5.

Strahl, and Allis. 2000. “The Language of Covalent Histone Modifications.” Nature 403 (6765): 41–45.

Subramanian, Tamayo, Mootha, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proceedings of the National Academy of Sciences 102 (43): 15545–50. https://doi.org/10.1073/pnas.0506580102.

Tahiliani, Koh, Shen, et al. 2009. “Conversion of 5-Methylcytosine to 5-Hydroxymethylcytosine in Mammalian DNA by MLL Partner TET1.” Science 324 (5929): 930–35.

Tan, and Lenhard. 2016. “TFBSTools: An R/Bioconductor Package for Transcription Factor Binding Site Analysis.” Bioinformatics 32 (10): 1555–6. https://doi.org/10.1093/bioinformatics/btw024.

Teng, and Irizarry. 2017. “Accounting for GC-Content Bias Reduces Systematic Errors and Batch Effects in ChIP-Seq Data.” Genome Research. https://doi.org/10.1101/gr.220673.117.

Teng, and Irizarry. 2016. “Accounting for GC-Content Bias Reduces Systematic Errors and Batch Effects in ChIP-Seq Peak Callers.” bioRxiv, January. http://biorxiv.org/content/early/2016/11/30/090704.

Tibshirani. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society: Series B (Methodological) 58 (1): 267–88.

Tibshirani, Walther, and Hastie. 2001. “Estimating the Number of Clusters in a Data Set via the Gap Statistic.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2): 411–23.

Trapnell, Williams, Pertea, Mortazavi, Kwan, Baren, Salzberg, Wold, and Pachter. 2010. “Transcript Assembly and Quantification by RNA-Seq Reveals Unannotated Transcripts and Isoform Switching During Cell Differentiation.” Nature Biotechnology 28 (5): 511–15. https://doi.org/10.1038/nbt.1621.

Wang, and Burge. 2008. “Splicing Regulation: From a Parts List of Regulatory Elements to an Integrated Splicing Code.” RNA 14 (5): 802–13.

Wang, Gerstein, and Snyder. 2009. “RNA-Seq: A Revolutionary Tool for Transcriptomics.” Nature Reviews Genetics 10 (1): 57–63. https://doi.org/10.1038/nrg2484.

Wang, McLeod, and Weinshilboum. 2011. “Genomics and drug response.” N. Engl. J. Med. 364 (12): 1144–53.

Wardle, and Tan. 2015. “A ChIP on the Shoulder? Chromatin Immunoprecipitation and Validation Strategies for ChIP Antibodies [Version 1; Referees: 2 Approved].” F1000Research 4 (235). https://doi.org/10.12688/f1000research.6719.1.

Weinstein, Collisson, Mills, et al. 2013. “The Cancer Genome Atlas Pan-Cancer analysis project.” Nat. Genet. 45 (10): 1113–20.

Wilbanks, and Facciotti. 2010. “Evaluation of Algorithm Performance in ChIP-Seq Peak Detection.” PLoS ONE 5 (7): e11471. https://doi.org/10.1371/journal.pone.0011471.

Wu, Reeder, Lawrence, Becker, and Brauer. 2016. “GMAP and GSNAP for Genomic Sequence Alignment: Enhancements to Speed, Accuracy, and Functionality.” Methods in Molecular Biology (Clifton, N.J.) 1418: 283–334. https://doi.org/10.1007/978-1-4939-3578-9_15.

Xing, Mo, Liao, and Zhang. 2012. “Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-Seq Data.” PLoS Comput Biol 8 (7): e1002613. https://doi.org/10.1371/journal.pcbi.1002613.

Xu, Handoko, Wei, Ye, Sheng, Wei, Lin, and Sung. 2010. “A Signal-Noise Model for Significance Analysis of ChIP-Seq with Negative Control.” Bioinformatics 26 (9): 1199–1204. https://doi.org/10.1093/bioinformatics/btq128.

Zang, Schones, Zeng, Cui, Zhao, and Peng. 2009. “A Clustering Approach for Identification of Enriched Domains from Histone Modification ChIP-Seq Data.” Bioinformatics 25 (15): 1952–8. https://doi.org/10.1093/bioinformatics/btp340.

Zhang, Lin, Johnson, Rozek, and Sartor. 2014. “PePr: A Peak-Calling Prioritization Pipeline to Identify Consistent or Differential Peaks from Replicated ChIP-Seq Data.” Bioinformatics 30 (18): 2568–75. https://doi.org/10.1093/bioinformatics/btu372.

Zhang, Liu, Meyer, et al. 2008. “Model-Based Analysis of ChIP-Seq (MACS).” Genome Biol 9 (9): R137. https://doi.org/10.1186/gb-2008-9-9-r137.

Zhou, and Troyanskaya. 2015. “Predicting effects of noncoding variants with deep learning-based sequence model.” Nat. Methods 12 (10): 931–34.

Zou, and Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20.