Name
A new era of microbiome research enabled by long-reads: applications for childhood undernutrition
Date & Time
Wednesday, May 7, 2025, 3:30 PM - 3:55 PM
Jeremiah Minich
Description

Over the past 20 years, next-generation sequencing has driven a boom in metagenomics research and many important discoveries about the role of the microbiome in human health, including pediatric undernutrition. Past research links the microbiome to wasting (affecting 45 million children), but its role in stunting (150 million children, 22%) remains unclear. Current short-read (SR) microbiome approaches are limited in resolution and focus on changes in taxonomic or functional abundances across participants. One of the biggest limitations of SR is the inability to assemble, and thus analyze, complete circular bacterial genomes, which impairs comparisons of the genome-wide genetic repertoire of microbes within and across populations. We hypothesized that complete metagenome-assembled genomes (cMAGs), generated from a longitudinal long-read (LR) metagenomics cohort, are critical for pangenome and microbial GWAS (mGWAS) analyses that identify microbial genetic associations with pediatric linear growth trajectories. Here, we perform the most comprehensive benchmarking comparison of long-read DNA sequencing technologies by sequencing 47 human pediatric fecal samples (8 participants, 5-6 time points, 11 months) across the three leading platforms: Oxford Nanopore Technologies (ONT) PromethION Kit 14 R10.4.1, Pacific Biosciences (PB) Revio SMRTbell prep kit 3.0, and Illumina (ILMN) synthetic long reads. Traditional single-molecule long-read approaches (PB and ONT) generated 51-72x more cMAGs per Gbp than legacy short-read approaches, while PB generated the most accurate, complete cMAGs at the lowest cost (PB: $16; ONT: $25; ILMN: $95 per cMAG). PB generated 2.4x more cMAGs than ONT when controlling for Gbp input.
In this Malawian pediatric undernutrition cohort, we generated 985 cMAGs (831 circular) from 47 samples, performed independent functional pangenome and mGWAS analyses across multiple clades, and identified microbial genetic associations with various environmental and biological phenotypes related to undernutrition. Our study is the most comprehensive metagenomic comparison of long-read platforms, and this resource demonstrates the power of comparing cMAGs with health trajectories and establishes a new standard for microbiome association studies. Furthermore, we have adapted and applied a low-cost long-read metagenomics library preparation method using seqWell LongPlex and performed long-read metagenomics on 288 samples across 4 PacBio Revio runs and 1 Vega run (96 samples per lane), covering an extended cohort of 42 participants and 6 time points.