To assess the microbiome, a highly conserved gene found in all species of bacteria and archaea, the 16S ribosomal RNA (16S rRNA) gene, or 16S, can be targeted, amplified, and used for phylogenetic studies [11]. Because the gene is restricted to prokaryotes and also contains hypervariable regions, sequencing 16S not only distinguishes prokaryotic DNA from eukaryotic (for example, host) DNA within a given sample, but also provides a relatively unbiased taxonomic tool to differentiate bacterial species from one another. Next-generation sequencing platforms, such as 454, Illumina, and others, allow bar-coded, massively parallel 16S rRNA sequencing that yields millions of sequence reads in a single experiment. Whereas a decade ago the isolation and identification of a given species took months, dozens of samples, each containing thousands of different species, can now be sequenced and classified in one day. But this approach only answers the question “Who is there?”; in other words, what is the microbial composition of the community in terms of relative abundance?
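As a rough illustration of how bar-coded reads translate into a "Who is there?" profile, the sketch below groups already-classified reads by sample barcode and converts counts into relative abundances. The barcodes, sample names, taxon labels, and the function `relative_abundance_by_sample` are all hypothetical; a real survey would obtain taxon assignments from a dedicated classification pipeline rather than from a hand-written list.

```python
from collections import Counter, defaultdict


def relative_abundance_by_sample(records, barcode_to_sample):
    """Return {sample: {taxon: fraction of that sample's classified reads}}.

    records: iterable of (barcode, taxon) pairs, one per classified read.
    barcode_to_sample: dict mapping each barcode to a sample name.
    Reads carrying an unrecognized barcode are skipped.
    """
    counts = defaultdict(Counter)
    for barcode, taxon in records:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode not assigned to any sample
        counts[sample][taxon] += 1

    profiles = {}
    for sample, taxon_counts in counts.items():
        total = sum(taxon_counts.values())
        profiles[sample] = {t: n / total for t, n in taxon_counts.items()}
    return profiles


# Hypothetical barcodes and genus-level assignments (illustrative only).
barcode_to_sample = {"ACGT": "stool_A", "TGCA": "stool_B"}
records = (
    [("ACGT", "Bacteroides")] * 3
    + [("ACGT", "Prevotella")] * 1
    + [("TGCA", "Faecalibacterium")] * 2
    + [("TGCA", "Bacteroides")] * 2
    + [("GGGG", "Escherichia")] * 1  # unknown barcode, ignored
)

profiles = relative_abundance_by_sample(records, barcode_to_sample)
for sample in sorted(profiles):
    for taxon, fraction in sorted(profiles[sample].items()):
        print(f"{sample}\t{taxon}\t{fraction:.2f}")
```

Note that the resulting fractions describe each taxon's share of the classified reads in a sample, not absolute cell counts.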
To address an even more complex question (“What are the microbes doing?”), it is now feasible to perform whole-genome shotgun sequencing of a given microbiome (the metagenome). Using the same sequencing platforms as in 16S surveys, we can also elucidate the vast array of enzymatic and metabolic pathways encoded by these communities and, with these data, begin to understand their function and behavior. Multiple studies have found that a significant number of intestinal microbiome genes encode functions that are simply not encoded by the human genome. Among these are genes whose products are essential for the biological development and well-being of humans, including enzymes involved in the metabolism of otherwise indigestible polysaccharides, amino acids, and xenobiotics, and in the production of proteins associated with the maturation and regulation of the immune system. Concomitantly, an enormous computational biology (and biologist) capacity, along with sophisticated bioinformatic tools and truly multidisciplinary efforts, is required to filter, align, assemble, and annotate the colossal amount of data and metadata generated and to make scientific sense of it.