
3D-Scaffold, a Deep Learning Approach to Identify Novel Molecules for Therapeutics

Data science tools for generating a target-focused chemical space can streamline early stages of drug development.

Illustration of a blue and white pill encased in glass with molecules around it and blue images of a brain and COVID
Scientists produced synthesizable molecules for therapeutic candidates using a desired core 3D structure, or scaffold, critical for targeting two key proteins in SARS-CoV-2. (Illustration by Nathan Johnson | Pacific Northwest National Laboratory)

The Science  

With new pathogens on the horizon, creating new pharmaceuticals to combat them is a pressing need. The discovery of a new therapeutic drug is a long and expensive process that can take many years before clinical approval. To assist in this endeavor, a multi-institutional team has developed a deep learning framework to identify novel molecules as drug candidates. Called 3D-Scaffold, the framework creates 3D coordinates of new molecules around a core structure, or scaffold. These coordinates can be directly tested against a protein target using computational screening. Using only a small amount of training data, this framework produced 3D coordinates of molecules calculated to have high binding affinity against two key proteins in the novel coronavirus, SARS-CoV-2. 

The Impact 

3D-Scaffold explores a vast chemical space, using computer simulation to screen large libraries of molecular structures for binding to a protein target. With 3D-Scaffold, researchers can generate novel, synthesizable structures that are likely to be effective against their targets by learning and reasoning from existing data sets of FDA-approved drugs. 3D-Scaffold is the first generative artificial intelligence model with medicinal chemistry applications that can generate 3D coordinates of target-specific therapeutics and a library of activity-based probes with desired scaffolds.

Summary 

3D-Scaffold is a deep generative model that produces 3D coordinates of novel molecules with desirable biophysical and biochemical properties. This framework preserves critical structural scaffolds during the generation process. First, 3D-Scaffold analyzes the chemical environment around the scaffold. Then it predicts distributions for the next type of atom to add. Finally, it sequentially attaches new atoms around the central scaffold. Molecules generated with 3D-Scaffold were predominantly valid, unique, novel, and synthesizable. They had drug-like properties similar to the molecules in the training set.
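The three steps described above (analyze the environment, predict the next atom type, attach the atom) can be pictured with a toy loop. This is a minimal sketch and not the 3D-Scaffold model itself: the atom vocabulary, the fixed bond length, the hand-written weights that stand in for a trained network, and every function name here are assumptions made for illustration only.

```python
import math
import random

# Hypothetical atom vocabulary; "STOP" ends generation.
ATOM_TYPES = ["C", "N", "O", "STOP"]
BOND_LENGTH = 1.5  # angstroms; a rough placeholder, not a fitted value


def environment_features(atoms):
    """Step 1: summarize the chemical environment as a count per element."""
    counts = {t: 0 for t in ATOM_TYPES if t != "STOP"}
    for element, _ in atoms:
        counts[element] += 1
    return counts


def next_type_distribution(features, max_atoms):
    """Step 2: return a probability for each candidate next atom type.

    A trained network would go here. This stand-in favors carbon and
    raises the stop probability as the molecule grows.
    """
    size = sum(features.values())
    weights = {"C": 3.0, "N": 1.0, "O": 1.0, "STOP": 5.0 * size / max_atoms}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def place_atom(atoms, rng):
    """Step 3 helper: put the new atom one bond length from a random existing atom."""
    _, (x, y, z) = rng.choice(atoms)
    theta = rng.uniform(0.0, math.pi)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return (
        x + BOND_LENGTH * math.sin(theta) * math.cos(phi),
        y + BOND_LENGTH * math.sin(theta) * math.sin(phi),
        z + BOND_LENGTH * math.cos(theta),
    )


def generate(scaffold, max_atoms=12, seed=0):
    """Grow a molecule atom by atom around a scaffold that is kept unchanged."""
    rng = random.Random(seed)
    atoms = list(scaffold)  # scaffold atoms come first and are never modified
    while len(atoms) < max_atoms:
        probs = next_type_distribution(environment_features(atoms), max_atoms)
        element = rng.choices(list(probs), weights=list(probs.values()))[0]
        if element == "STOP":
            break
        atoms.append((element, place_atom(atoms, rng)))
    return atoms


# A made-up three-atom scaffold: (element, (x, y, z)) pairs.
scaffold = [
    ("C", (0.0, 0.0, 0.0)),
    ("C", (1.4, 0.0, 0.0)),
    ("N", (2.1, 1.2, 0.0)),
]
molecule = generate(scaffold)
print(len(molecule), "atoms; scaffold preserved:", molecule[:3] == scaffold)
```

The point of the sketch is the control flow: the scaffold seeds the atom list, each new atom is sampled from a distribution conditioned on what is already present, and the loop ends when a stop token is drawn or a size limit is reached. A real model would also predict atom positions from learned distributions rather than placing them at random.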

Using domain-specific data sets as training sets, scientists generated covalent and non-covalent antiviral inhibitors targeting viral proteins. Then they performed virtual screening via docking simulations. The generated structures interacted favorably with SARS-CoV-2 protein targets. Most importantly, the model performs well with relatively small volumes of training data and generates drug candidates that mimic non-structured peptides with similar motifs, which can covalently bind protein targets. Further training and optimization of this framework can accelerate the identification and optimization of leads in drug discovery and development across a range of therapeutic targets. This research used high-performance computing resources at the Environmental Molecular Sciences Laboratory (EMSL), a DOE Office of Science User Facility located at Pacific Northwest National Laboratory.
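This summary describes the generated molecules as predominantly unique and novel. One common way to quantify those two terms is sketched below, under stated assumptions: each molecule is represented by a canonical string identifier so that identical molecules compare equal, uniqueness is the fraction of generated molecules that are distinct, and novelty is the fraction of distinct generated molecules absent from the training set. These definitions and the toy strings are illustrative and are not taken from the publication. Validity is left out because checking it requires a chemistry toolkit.

```python
def uniqueness(generated):
    """Fraction of generated molecules that are distinct from one another."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated molecules that do not appear in training."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(training)) / len(distinct)


# Toy identifiers (SMILES-like strings), not real model output.
training = ["CCO", "CCN", "c1ccccc1"]
generated = ["CCO", "CCC", "CCC", "CCCl"]

print(uniqueness(generated))         # 3 distinct molecules out of 4 generated
print(novelty(generated, training))  # 2 of the 3 distinct molecules are new
```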

Contact 

Neeraj Kumar 
Pacific Northwest National Laboratory 
neeraj.kumar@pnnl.gov

Funding 

The DOE Office of Science supported this research through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on the response to COVID-19, with funding provided by the Coronavirus CARES Act. The research was performed using capabilities at EMSL, a DOE Office of Science user facility sponsored by the Biological and Environmental Research program.

Publication

R. P. Joshi, et al. “3D-Scaffold: A Deep Learning Framework to Generate 3D Coordinates of Drug-like Molecules with Desired Scaffolds.” The Journal of Physical Chemistry B (2021). [DOI: 10.1021/acs.jpcb.1c06437]

Related Link

From Molecule to Medicine via Machine Learning, Pacific Northwest National Laboratory web feature