Microbial Ecology and Diversity; Bioinformatics, IT & Databases; Microorganisms
Predicting prokaryotic phenotypes (observable traits that govern functionality, adaptability, and interactions) holds significant potential for fields such as biotechnology, environmental sciences, and evolutionary biology. In this study, we use machine learning to explore the relationship between prokaryotic genotypes and phenotypes. Using the highly standardized datasets in the BacDive database, we model eight physiological properties based on protein family inventories, evaluate model performance with multiple metrics, and examine the biological implications of our predictions. The high confidence values achieved underscore the importance of data quality and quantity for reliably inferring bacterial phenotypes. Our approach generates 50,396 new data points for 15,938 strains, now openly available in the BacDive database, thereby enriching existing phenotypic resources and enabling further research. The open-source software we provide can be readily applied to other datasets, such as those from metagenomic studies, and to various applications, including assessing the potential of soil bacteria for bioremediation.
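The general idea of predicting a binary trait from a protein family inventory and scoring the result with several metrics can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the family accessions, the toy strains and labels, the function names, and the choice of a Bernoulli naive Bayes classifier, which was picked for brevity and is not presented as the method used in the study.

```python
import math

# Hypothetical toy data: each strain is represented by the set of protein
# family accessions found in its genome, paired with a binary trait label.
# Accessions and labels are invented for illustration only.
TRAIN = [
    ({"PF00001", "PF00002"}, True),
    ({"PF00001", "PF00003"}, True),
    ({"PF00001", "PF00002", "PF00003"}, True),
    ({"PF00004"}, False),
    ({"PF00003", "PF00004"}, False),
    ({"PF00002", "PF00004"}, False),
]

TEST = [
    ({"PF00001"}, True),
    ({"PF00004"}, False),
    ({"PF00001", "PF00004"}, True),
]


def fit(samples):
    """Fit a Bernoulli naive Bayes model on family presence/absence."""
    families = sorted(set().union(*(fams for fams, _ in samples)))
    model = {"families": families, "prior": {}, "p": {}}
    for label in (True, False):
        group = [fams for fams, y in samples if y == label]
        model["prior"][label] = len(group) / len(samples)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        model["p"][label] = {
            f: (sum(f in fams for fams in group) + 1) / (len(group) + 2)
            for f in families
        }
    return model


def predict(model, fams):
    """Return True if the positive class has the higher log-posterior."""
    scores = {}
    for label in (True, False):
        score = math.log(model["prior"][label])
        for f in model["families"]:
            p = model["p"][label][f]
            score += math.log(p if f in fams else 1.0 - p)
        scores[label] = score
    return scores[True] > scores[False]


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    model = fit(TRAIN)
    y_true = [label for _, label in TEST]
    y_pred = [predict(model, fams) for fams, _ in TEST]
    print(precision_recall_f1(y_true, y_pred))
```

Reporting precision and recall alongside F1 matters here because phenotype labels are often imbalanced, so a single accuracy figure could hide poor performance on the rarer class.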
This is referenced by
Refined variant calling pipeline on RNA-seq data of breast cancer cell lines without matched-normal samples
Eberth S., Koblitz J., Steenpaß L. and Pommerenke C. BMC Res Notes 18(1): 67 (2025) |
01.04.2020-31.03.2023
| Research topics |
| Date | 07.06.2025 |
| Journal | Communications Biology |
| Issue | 1 |
| Volume | 8 |
| Pages | 897 |
| Publication Language | English |
| Open Access Status | Open Access (gold) |
| Online Ahead Of Print | No |