Description
Machine learning (ML) approaches in metagenomic research are relatively new. Initially, most microbiome-ML studies focused on human-associated microbiomes, and only recently have these methods expanded to soil, agricultural, and food systems. In this context, ML serves two main purposes: (i) building predictive models, and (ii) complementing classical statistical analyses. Metagenomic datasets are often compositional, sparse, and extremely high-dimensional, with features far outnumbering samples. Here, tree-based ensembles like Random Forest (RF) offer particular advantages: they handle heterogeneous, zero-inflated data and generally perform well with minimal tuning. RF has thus become a popular default in microbiome analytics. However, applying such algorithms to small-sample datasets can lead to overfitting, so rigorous validation (e.g., nested cross-validation) is critical to obtain fair performance estimates. As larger microbiome datasets become available, more algorithms can be explored and systematically benchmarked. Notably, deep neural networks (DNNs) remain underutilized in microbiome research, possibly due to limited sample sizes and the noisy, compositional nature of microbiome data. We will discuss whether recent data expansions alleviate these issues and how DNNs, coupled with effective feature selection and noise-reduction techniques, can improve predictive modeling. The methods discussed highlight the practical relevance of ML in advancing food and agricultural microbiome applications, from improving food quality and safety to supporting sustainable agriculture.

| Period | 2 Dec. 2025 |
|---|---|
| Event title | DigAgro 2025 |
| Event type | Conference |
| Location | Tulln, Austria |
Research Field
- Exploration of Biological Resources
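
To illustrate the nested cross-validation mentioned in the description, here is a minimal sketch. It assumes scikit-learn and NumPy are available, and it uses synthetic sparse count data in place of a real microbiome abundance table. The helper name `nested_cv_scores`, the hyperparameter grid, and the fold counts are illustrative choices and are not taken from the talk.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score


def nested_cv_scores(X, y, seed=0):
    """Return one balanced-accuracy score per outer fold (hypothetical helper)."""
    # Inner loop: used only to choose hyperparameters.
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    # Outer loop: used only to estimate performance on data unseen during tuning.
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + 1)

    # Small illustrative grid; a real study would choose its own.
    param_grid = {"max_features": ["sqrt", 0.1], "min_samples_leaf": [1, 3]}
    search = GridSearchCV(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        param_grid,
        cv=inner,
        scoring="balanced_accuracy",
    )
    # Each outer training split reruns the full inner grid search, so the
    # outer test fold never influences hyperparameter selection.
    return cross_val_score(search, X, y, cv=outer, scoring="balanced_accuracy")


# Synthetic stand-in for a sparse, high-dimensional count table:
# 60 samples, 300 features, mostly zeros, with 5 features shifted in class 1.
rng = np.random.default_rng(0)
n_samples, n_features = 60, 300
X = rng.poisson(0.3, size=(n_samples, n_features)).astype(float)
y = np.repeat([0, 1], n_samples // 2)
X[y == 1, :5] += rng.poisson(3.0, size=(n_samples // 2, 5))

scores = nested_cv_scores(X, y)
print("outer-fold balanced accuracy:", np.round(scores, 2))
print("mean:", round(float(scores.mean()), 2))
```

Reporting the spread of the outer-fold scores alongside the mean gives a sense of how unstable the estimate is at small sample sizes, which is the setting the description warns about.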