The engine behind human gut microbiome analysis and data science

3d rendered medical illustration of the microbiome of the small intestine

By Elaine Smith

As his career unfolds, biostatistician Kevin McGregor is becoming very familiar with the human gut microbiome. His work is particularly relevant given that the human microbiome, the community of microorganisms that inhabit our bodies, appears to be linked to numerous health concerns, both physical and mental.

McGregor, an assistant professor in the Department of Mathematics and Statistics in the Faculty of Science, joined York University in 2021 after finishing his PhD at McGill University. He is part of the team creating and teaching the department’s new data science program, which makes its debut in 2023, and he also develops statistical models and associated software packages for understanding the makeup of the gut microbiome.

“My training is very quantitative, so I’m involved on the mathematical/statistical side,” says McGregor. “The microbiologists collect all the data and it’s my job to come up with the statistical methods to analyze it.”

Kevin McGregor

He might be involved in looking into one species of microorganism if it’s abundant and considered relevant to a particular disease, such as Crohn’s disease, or he might be exploring the interaction between various types of microbes in the overall network.

“Microorganisms don’t live independently; they may be symbiotic or competing for resources,” says McGregor. “We’re looking for correlations related to metabolic interactions. I usually develop a methodology for analysis and the accompanying software. The first step is more theoretical; then, I create a software package so the microbiologists can plug in the data and get answers.”

Most of the studies compare genetic sequencing data from the microbiomes of hundreds of individuals. Researchers look at the counts of the various species of microorganisms present to see if the patterns align with specific diseases or biomarkers.

One of the challenges of analyzing microbiome data is that the counts contain numerous zeroes, each of which seems to indicate that a certain organism is absent from an individual’s microbiome. Sometimes, these are false negatives; the stool sample that was used to sequence the individual’s gut microbiome simply didn’t include a specific microbe.

“They are statistically difficult to deal with,” McGregor says. “It requires that I develop a statistical method that can look at the network patterns but get around this challenge.”

The programs that McGregor devises must determine the probability that any given zero truly indicates the absence of that microbe. One of the methods he employs to weed out the false negatives is the zero-inflated logistic normal multinomial model.
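
To give a feel for the question such a model answers, here is a minimal sketch in Python. It is not McGregor's zero-inflated logistic normal multinomial model; it is a much simpler, single-microbe illustration that applies Bayes' rule under assumed inputs. The function name `prob_true_absence` and its parameters (`pi_absent`, `rel_abundance`, `depth`) are hypothetical, chosen only for this example.

```python
def prob_true_absence(pi_absent: float, rel_abundance: float, depth: int) -> float:
    """Probability that a microbe is truly absent, given a zero count was observed.

    Simplified illustration (an assumption, not the model described in the article):
      pi_absent     -- assumed prior probability that the microbe is truly absent
      rel_abundance -- assumed share of sequencing reads the microbe would
                       contribute if it were present
      depth         -- total number of sequencing reads in the sample
    """
    # Chance of seeing zero reads by bad luck alone when the microbe is present
    # (every one of the `depth` reads comes from some other organism).
    p_zero_if_present = (1.0 - rel_abundance) ** depth

    # Bayes' rule: a truly absent microbe always produces a zero count.
    p_zero = pi_absent + (1.0 - pi_absent) * p_zero_if_present
    return pi_absent / p_zero


if __name__ == "__main__":
    # The same rare microbe, sequenced at shallow and at deep coverage.
    for depth in (10, 1_000, 100_000):
        print(depth, prob_true_absence(pi_absent=0.3, rel_abundance=0.001, depth=depth))
```

In this toy setting, a zero in a shallowly sequenced sample says little about whether the microbe is really missing, while a zero in a deeply sequenced sample is much stronger evidence of true absence.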

Next comes the software development that allows him to “fit” the model: input real data and get an output. Genetic sequencing of the microbiome provides “tons of data,” says McGregor, and the models are complicated. The associated software can take “hours and hours to run” on a computer, so he looks for shortcuts, such as the variational Bayes method, a statistical tool that is computationally efficient. McGregor is currently supervising a postdoctoral fellow, Ismaïla Ba, PhD, who is working on this model.

McGregor says he loves the problem-solving aspect of his work, devising new models or improving existing ones. He also likes the real-world applications that his work makes possible and enjoys the opportunity to collaborate with researchers in a broad range of fields. He recently joined forces with Joseph De Souza, an assistant professor of systems neuroscience, to apply for grants that will allow them to examine microbiome data related to Parkinson’s disease. He’s also involved with the Integrated Microbiome Platforms for Advancing Causation Testing and Translation (IMPACTT) team, which is a multi-disciplinary microbiome research core across Canadian universities.

“My dream is to be viewed as having a positive impact on microbiome research, developing models and giving sound advice to researchers in the field,” McGregor says. “I’d also like to come up with statistically innovative techniques in this area and be recognized by the statistics community.”

McGregor’s career is young; look for its impact to grow exponentially.