E-Ink News Daily

Large genome model: Open source AI trained on trillions of bases

Researchers have developed Evo 2, an open-source AI model trained on trillions of DNA base pairs from all three domains of life (bacteria, archaea, and eukaryotes). The model can identify complex genomic features such as regulatory DNA and splice sites in eukaryotic genomes, overcoming a limitation of the original Evo system, which worked well only on simpler bacterial genomes. This is a significant advance for genomic AI because it enables analysis of complex genome structures.

Background

Earlier AI systems such as Evo were limited to bacterial genomes, which are organized more simply: related genes cluster together. Eukaryotic genomes are more challenging, with features such as introns and distributed regulatory sequences that are harder for AI to interpret.

Source: Ars Technica
Published: Mar 5, 2026 at 06:14 AM
Score: 8.0 / 10