This narrative shares the data used in the manuscript about Microbial Community Assembly Mediated by Environmental Stress in Groundwater (Ning et al.). The data are results from the ENIGMA 100-Well Survey.
Daliang Ning1, Yajiao Wang1, Yupeng Fan1, Jianjun Wang1,2, Joy D. Van Nostrand1, Liyou Wu1, Ping Zhang1,3, Daniel J. Curtis1, Renmao Tian1,4, Lauren Lui5, Terry C. Hazen6,7,8, Eric J. Alm9, Matthew W. Fields10, Farris Poole11, Michael W. W. Adams11, Romy Chakraborty12, David A. Stahl13, Paul D. Adams5,8, Adam P. Arkin5,8, Zhili He1,14, and Jizhong Zhou1,12,15,16
1 Institute for Environmental Genomics and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA
2 State Key Laboratory of Lake Science and Environment, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, China
3 Alkek Center for Metagenomics and Microbiome Research, Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
4 Institute for Food Safety and Health, Illinois Institute of Technology, Bedford Park, IL, USA
5 Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
6 Department of Earth and Planetary Sciences, Bredesen Center, Department of Civil and Environmental Sciences, Center for Environmental Biotechnology, and Institute for a Secure and Sustainable Environment, University of Tennessee, Knoxville, TN, USA
7 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
8 Department of Bioengineering, University of California, Berkeley, CA, USA
9 Department of Biological Engineering, Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, USA
10 Center for Biofilm Engineering and Department of Microbiology & Cell Biology, Montana State University, Bozeman, MT, USA
11 Department of Biochemistry & Molecular Biology, University of Georgia, Athens, GA, USA
12 Earth and Environmental Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
13 Department of Civil and Environmental Engineering, University of Washington, Seattle, WA, USA
14 Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai, China
15 School of Civil Engineering and Environmental Sciences, University of Oklahoma, Norman, OK, USA
16 School of Computer Science, University of Oklahoma, Norman, OK, USA
The values of all physical and chemical variables are stored in a csv file (100WSc.Env.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/RMJvEnJGtKk6hav.
The explanation of each variable, including the data type, unit, and description, is in a csv file (100WSc.Env.Description.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/uyZ33aGsoHC8m0G.
The data files are also available on GitHub: https://github.com/DaliangNing/iCAMP1/tree/master/Publication2/Data.
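Once downloaded, the environmental table can be read with any standard CSV tool. The following is a minimal Python sketch, not part of the study's own scripts. It assumes that the first column of the file holds row labels (for example, sample or variable names), which is an assumption about the layout rather than something stated here. The function name load_table and the local file path are hypothetical.

```python
import csv
from pathlib import Path


def load_table(path):
    """Read a CSV file into (column_names, {row_label: {column: value}}).

    Assumption: the first column holds row labels. All values are kept
    as strings; convert them as needed after checking the description file.
    """
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader)      # first line: column names
        columns = header[1:]       # drop the label column's own header
        rows = {}
        for record in reader:
            if not record:
                continue           # skip blank lines
            rows[record[0]] = dict(zip(columns, record[1:]))
    return columns, rows


if __name__ == "__main__":
    # Hypothetical local path: download 100WSc.Env.csv first.
    env_path = Path("100WSc.Env.csv")
    if env_path.exists():
        columns, rows = load_table(env_path)
        print(f"{len(rows)} rows, {len(columns)} columns")
```

The same function could be applied to 100WSc.Env.Description.csv to look up the data type and unit of each variable before converting values to numbers.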
The 16S rRNA gene sequencing data are currently publicly available in MG-RAST: http://metagenomics.anl.gov/mgmain.html?mgpage=project&project=mgp8190
The OTU table used in this study is in a csv file (100WSc.OTUtable.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/UPKcB8VVtLmeNvt
The representative sequences of the OTUs are in a fasta file (100WSc.Rep_Seq.fasta) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/WqEUgvG4TCjnJw4
The phylogenetic tree is in a nwk file (100WSc.Tree.nwk) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/J7hOzrSPMSRiJrr
The classification result (taxonomy information) is in a csv file (100WSc.Classif.QIIME2.Silva138.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/NmPMhFIHtO4YBZX
The data files are also available from https://github.com/DaliangNing/iCAMP1/tree/master/Publication2/Data.
The ASV (zOTU) table generated in this study is in a csv file (Q100WUN.resamp.zOTUtab.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/5bm0UBIi7ey2MO0
The sequences of the ASVs are in a fasta file (Q100WUN.resamp.Sequence.fasta) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/zYxGNVKnqjxmOgW
The phylogenetic tree is in a nwk file (Q100WUN.resamp.Tree.nwk) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/0xkWYuTVGSrc8G0
The classification result (taxonomy information) is in a csv file (Q100WUN.resamp.Classification.csv) here: https://iegst1.rccc.ou.edu/owncloud/index.php/s/lvmjc4XKPvvRrkO
The data files are also available from https://github.com/DaliangNing/iCAMP1/tree/master/Publication2/Data/ASVs.
All custom scripts and the latest version of the R package ‘iCAMP’ are available from GitHub (https://github.com/DaliangNing/iCAMP1).
Custom scripts are available from https://github.com/DaliangNing/iCAMP1/tree/master/Publication2/Code.
This study by ENIGMA (Ecosystems and Networks Integrated with Genes and Molecular Assemblies; http://enigma.lbl.gov), a Science Focus Area Program at Lawrence Berkeley National Laboratory, is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Biological & Environmental Research under contract number DE-AC02-05CH11231. The development of the theoretical framework and statistical methods was partly supported by NSF Grants EF-2025558 and DEB-2129235. The study was also supported by the Office of the Vice President for Research at the University of Oklahoma.