PhageAI is an AI-driven software platform that uses advanced Machine Learning and Natural Language Processing techniques to gain a deeper understanding of bacteriophage genomics.
We invented Phage2Vec technology, a general-purpose phage language model trained on 17 559 complete bacteriophage sequences. Machine Learning models for lifecycle prediction were trained on 4 694 manually selected bacteriophages from different species and families. Each sample was represented by a complete nucleotide sequence in FASTA format.
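To illustrate what "a complete nucleotide sequence in FASTA format" looks like as model input, here is a minimal sketch that reads FASTA text and splits each sequence into overlapping k-mers, the kind of "words" a sequence language model is commonly trained on. The function names `parse_fasta` and `kmer_tokens`, the k-mer length of 6, and the example record are assumptions made for illustration. The actual tokenization used by Phage2Vec is not described here.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks).upper()
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def kmer_tokens(sequence: str, k: int = 6) -> List[str]:
    """Split a sequence into overlapping k-mers with stride 1.

    The value of k is an assumption chosen for illustration only.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical record, not taken from any real phage genome.
example = """>phage_A hypothetical example record
ATGCGTAC
GGTA
"""
records = dict(parse_fasta(example))
tokens = kmer_tokens(records["phage_A hypothetical example record"], k=6)
```

Multi-line sequences are joined into one string before tokenization, so line wrapping in the FASTA file does not affect the resulting k-mers.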
Applying continuous embeddings of DNA sequences allowed us to prepare optimal datasets for training a new Support Vector Machine model. The result is a new, accurate lifecycle classifier (virulent, temperate or chronic) with ~98% accuracy on both the training set and the test set (unseen data).
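One common way to turn token-level vectors into a single continuous embedding per genome is to average them. The sketch below shows that idea under stated assumptions: mean pooling, the helper name `embed_sequence`, and the toy two-dimensional vectors are all hypothetical and are not confirmed as the method used in PhageAI.

```python
from typing import Dict, List, Sequence


def embed_sequence(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Average the vectors of all known tokens (mean pooling).

    Tokens missing from `vectors` are skipped. If no token is known,
    a zero vector of length `dim` is returned.
    """
    total = [0.0] * dim
    known = 0
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:
            continue
        for i in range(dim):
            total[i] += vec[i]
        known += 1
    if known == 0:
        return total
    return [value / known for value in total]


# Toy vectors for illustration only; a trained model would supply real ones.
toy_vectors = {"ATG": [1.0, 0.0], "TGC": [0.0, 1.0]}
embedding = embed_sequence(["ATG", "TGC", "GCA"], toy_vectors, dim=2)
```

A fixed-length vector like `embedding` is the kind of input a Support Vector Machine expects, for example scikit-learn's `SVC`, with one vector per phage and its lifecycle as the label.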
To confirm that score, the lifecycle classifier was also tested on additional unseen data provided by Proteon Pharmaceuticals S.A. All 61 samples (49 virulent, 12 temperate) were predicted correctly by the model with a 97% confidence level, in accordance with the experts' lifecycle assumptions.
The current methodology opens up opportunities for further research in the field of phage classification.