Next Generation Sequencing to Detect Environmental Stress
Current Topics in Microbiology
Oral Presentation
Prepared by K. Ternus, M. Isbell
Signature Science, LLC, 8329 North Mopac Expressway, Austin, Texas, 78759, United States
Contact Information: [email protected]; 512-583-2367
ABSTRACT
Next generation sequencing (NGS) has the potential to detect changes within microbial communities in response to specific types of stress. However, the majority of species within complex environmental samples do not have an available reference genome sequence, so current state-of-the-art taxonomic classification methods provide information on only a small subset of environmental sequences. In addition, typically low sequencing throughput further reduces the amount of informative data available. Despite these limitations, NGS data can be mined using machine learning, innovative graphical data visualization, and statistical analyses to support biomarker discovery. This presentation will demonstrate how these data analysis approaches can be applied alongside current taxonomic classification tools to discern patterns indicative of environmental stress. In one example, we will show how publicly available Illumina MiSeq NGS datasets from soil were analyzed in an exploratory search for biomarkers. Using pairwise correlations, cluster analysis, and evaluation of species-level k-mer quantity across samples, we found that species content was strongly correlated with the geographic sites from which the samples originated. The results also suggested correlations between the relative abundance of certain species and the stress response to copper and/or arsenic.
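As an illustration of the pairwise-correlation step mentioned above, the following is a minimal sketch and not the authors' pipeline. It assumes species-level k-mer counts have already been produced by a taxonomic classifier. The sample names, species names, and counts are hypothetical values invented for the example, and the choice of Pearson correlation on relative abundances is an assumption, since the abstract does not specify the correlation measure.

```python
from itertools import combinations
from math import sqrt

# Hypothetical species-level k-mer counts per sample (illustrative values only,
# not data from the presentation).
kmer_counts = {
    "site_A_1": {"species_1": 900, "species_2": 100, "species_3": 50},
    "site_A_2": {"species_1": 850, "species_2": 120, "species_3": 60},
    "site_B_1": {"species_1": 100, "species_2": 700, "species_3": 400},
    "site_B_2": {"species_1": 120, "species_2": 650, "species_3": 450},
}


def relative_abundance(counts):
    """Convert raw k-mer counts into fractions that sum to 1 for one sample."""
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (assumes nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Build one abundance vector per sample over a shared, ordered species list.
species = sorted({sp for counts in kmer_counts.values() for sp in counts})
profiles = {
    sample: [relative_abundance(counts).get(sp, 0.0) for sp in species]
    for sample, counts in kmer_counts.items()
}

# Pairwise correlations between samples, keyed by alphabetically ordered pairs.
correlations = {
    (a, b): pearson(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}


def most_similar(sample):
    """Return the other sample whose abundance profile correlates best."""
    others = [s for s in profiles if s != sample]
    return max(others, key=lambda s: correlations[tuple(sorted((sample, s)))])


if __name__ == "__main__":
    for pair, r in correlations.items():
        print(pair, round(r, 3))
```

In this toy table, samples from the same hypothetical site correlate strongly with each other and negatively with samples from the other site, which is the kind of site-level grouping a subsequent cluster analysis would be expected to recover.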