Entry Date:
August 20, 2019

Machine Learning for Pharmaceutical Discovery and Synthesis Consortium


The Machine Learning for Pharmaceutical Discovery and Synthesis Consortium is a collaboration between the pharmaceutical and biotechnology industries and the departments of Chemical Engineering, Chemistry, and Computer Science at the Massachusetts Institute of Technology. The collaboration will facilitate the design of useful software for automating small molecule discovery and synthesis.

Chemists rely on expert knowledge to tweak the structure of molecules by hand, adding and subtracting functional groups (groups of atoms and bonds with specific properties). Even when they use systems that predict the optimal desired properties, chemists must still carry out each modification step themselves. Each step can take a significant amount of time and still fail to produce molecules with the desired properties.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Department of Electrical Engineering and Computer Science (EECS) have developed a model that better selects lead molecule candidates based on desired properties. It also modifies the molecular structure as needed to achieve higher potency, while ensuring that the molecule remains chemically valid.

The model takes molecular structure data as input and directly creates molecular graphs: detailed representations of a molecular structure, with nodes representing atoms and edges representing bonds. It breaks those graphs down into smaller clusters of valid functional groups, which it uses as “building blocks” to reconstruct and modify molecules more accurately.
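
To make the graph idea concrete, the following is a minimal sketch in Python. It is not the researchers' model. It assumes a molecule can be stored as a list of bonds between numbered atoms (hydrogens and bond orders omitted), and it uses a simple stand-in rule for "clusters": each ring system becomes one building block, and every bond outside a ring becomes its own building block. The names build_adjacency, reachable_without, decompose, phenol_atoms, and phenol_bonds are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_adjacency(bonds):
    """Map each atom index to the set of atom indices it is bonded to."""
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def reachable_without(adjacency, start, goal, skipped_bond):
    """Depth-first search from start to goal that never crosses skipped_bond."""
    skipped = set(skipped_bond)
    seen, stack = {start}, [start]
    while stack:
        atom = stack.pop()
        if atom == goal:
            return True
        for neighbor in adjacency[atom]:
            if {atom, neighbor} == skipped or neighbor in seen:
                continue
            seen.add(neighbor)
            stack.append(neighbor)
    return False


def decompose(bonds):
    """Split a molecular graph into clusters (sets of atom indices).

    Assumed simplification: ring systems and single non-ring bonds
    stand in for the "building blocks" described in the text.
    """
    adjacency = build_adjacency(bonds)
    # A bond lies on a ring if its two atoms stay connected once it is removed.
    ring_bonds = [b for b in bonds if reachable_without(adjacency, b[0], b[1], b)]
    chain_bonds = [b for b in bonds if b not in ring_bonds]

    clusters = []
    for a, b in ring_bonds:
        # Merge this ring bond with any existing cluster that shares an atom.
        merged, untouched = {a, b}, []
        for cluster in clusters:
            if cluster & merged:
                merged |= cluster
            else:
                untouched.append(cluster)
        clusters = untouched + [merged]
    # Every non-ring bond becomes its own two-atom cluster.
    clusters.extend({a, b} for a, b in chain_bonds)
    return clusters


# Phenol without hydrogens: an oxygen (atom 0) attached to a six-carbon ring.
phenol_atoms = {0: "O", 1: "C", 2: "C", 3: "C", 4: "C", 5: "C", 6: "C"}
phenol_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]

for cluster in decompose(phenol_bonds):
    print(sorted(cluster), [phenol_atoms[i] for i in sorted(cluster)])
```

For phenol, this produces two clusters: the six-carbon ring and the carbon-oxygen bond that hangs off it. The two clusters share atom 1, which is what would let a tree of building blocks be stitched back into a full molecule. A real system would also need chemical validity checks, which this sketch leaves out.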