Bio-Prodict delivers solutions that guide scientific research in the fields of protein engineering, molecular design and DNA diagnostics. We apply novel approaches to the mining, storage and analysis of protein data and combine these with state-of-the-art analysis methods and visualization tools to create custom-built information systems for protein superfamilies.
I am currently employed at Bio-Prodict as a medior Data Scientist, where I use state-of-the-art machine learning techniques to develop novel solutions for bioinformatics problems.
I am primarily involved in the production of the Helix product. I work in a team that builds on the results of my internships to predict pathogenicity for different protein variants.
Python
Backend programming
Visualization
Machine learning
Scikit-learn
Keras
PyTorch
Data engineering
NoSQL databases (MongoDB, Google Datastore)
Google Cloud Platform (Kubernetes, Google Compute)
SQL databases (Postgres, SQLite, MySQL)
Scientific reading/writing
Matthews correlation coefficient (MCC) performance on an independent set of genes that do not appear in any other dataset. Our ensemble outperforms all competitors.
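
For readers unfamiliar with the metric, the Matthews correlation coefficient can be computed directly from the confusion-matrix counts. The snippet below is a minimal sketch and not the Helix evaluation code: the function name `mcc`, the label encoding (1 = pathogenic, 0 = benign) and the toy labels are all assumptions made for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed encoding (illustrative only): 1 = pathogenic, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the
    # confusion matrix is empty and the ratio is undefined.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up toy labels, not real variant data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = mcc(y_true, y_pred)
print(f"MCC = {score:.2f}")
```

The score ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it remains informative when benign and pathogenic variants are unevenly represented. In practice, `sklearn.metrics.matthews_corrcoef` from scikit-learn provides the same metric.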