PhageAI is an AI-driven software platform that uses Machine Learning and Natural Language Processing techniques to support a deeper understanding of bacteriophage genomics.
To train the AI model, we used 594 manually selected bacteriophages from different species and families. Each of them was represented by a complete nucleotide sequence in FASTA format.
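For readers unfamiliar with the input format, the sketch below shows one way complete nucleotide sequences could be read from FASTA text. It is a minimal illustration and not PhageAI's own loader: the function name `parse_fasta` and the two example records (`phage_A`, `phage_B`) are hypothetical.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record ID (first token after '>') to its full sequence."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].split()[0]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalise case.
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical two-record example, not real phage data.
example = """>phage_A hypothetical example record
ATGCGTAC
GGTTAACC
>phage_B another hypothetical record
ttgacca
""".splitlines()

genomes = parse_fasta(example)
```

Each value in `genomes` is a single uppercase string per genome, which is the form a downstream embedding step would typically expect.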
We applied continuous embeddings of DNA sequences and feature ranking with recursive feature elimination to prepare optimal datasets for training a new Support Vector Machine model. The result is an accurate lifecycle classifier (virulent or temperate) with ~98% accuracy on both the training and validation sets.
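The general shape of such a pipeline (embedded sequences, recursive feature elimination, then a Support Vector Machine) can be sketched with scikit-learn. Everything specific here is an assumption made for illustration: the data are random placeholder vectors standing in for real DNA embeddings, and the embedding size, number of retained features, kernels and split ratio are arbitrary choices, not PhageAI's actual settings.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder data: 200 "genomes", each already embedded as a 50-dim vector.
# Real embeddings would come from a DNA sequence embedding model instead.
n_samples, n_dims = 200, 50
X = rng.normal(size=(n_samples, n_dims))
y = rng.integers(0, 2, size=n_samples)  # synthetic labels: 0 = temperate, 1 = virulent
X[y == 1, :5] += 1.5  # make the first five dimensions informative

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(
    StandardScaler(),
    # RFE needs an estimator exposing coef_, so a linear SVM ranks the features.
    RFE(SVC(kernel="linear"), n_features_to_select=10, step=5),
    # Final classifier trained on the retained features (kernel is an assumption).
    SVC(kernel="rbf"),
)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
val_acc = model.score(X_val, y_val)
selected = model.named_steps["rfe"].support_  # boolean mask of kept features
```

Comparing `train_acc` with `val_acc` is the same check the text describes: similar scores on both sets suggest the classifier is not simply memorising the training genomes.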
To confirm that score, the lifecycle classifier was also tested on unseen data provided by Proteon Pharmaceuticals S.A. The model predicted all 61 samples (49 virulent, 12 temperate) correctly at a 97% confidence level, in accordance with the experts' lifecycle assumptions.
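The arithmetic behind this external test can be made explicit. The sketch below rebuilds the label counts stated above (49 virulent, 12 temperate, all predicted correctly) and computes accuracy alongside a majority-class baseline for comparison. The individual samples are not available here, so the lists are reconstructed from the counts alone, and the helper names are hypothetical.

```python
from collections import Counter

# Labels reconstructed from the reported counts only (not the real samples).
expert_labels = ["virulent"] * 49 + ["temperate"] * 12
predicted = list(expert_labels)  # the text reports every prediction as correct


def confusion_counts(truth, preds):
    """Count (expert label, predicted label) pairs."""
    return Counter(zip(truth, preds))


def accuracy(truth, preds):
    """Fraction of predictions that match the expert label."""
    return sum(t == p for t, p in zip(truth, preds)) / len(truth)


counts = confusion_counts(expert_labels, predicted)
acc = accuracy(expert_labels, predicted)

# Baseline: always predicting the most frequent class (virulent).
majority_baseline = max(Counter(expert_labels).values()) / len(expert_labels)
```

Because the set is imbalanced, always answering "virulent" would already be right for 49 of 61 samples (about 80%), so getting all 12 temperate phages right is the more informative part of the result.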
The current methodology opens up opportunities for further research in the field of phage classification.