AI Tool Could Speed Discovery of New Cancer Drug Targets
Many cancer therapies work by docking into specific “binding pockets” on the surface of proteins that are driving the disease. Scientists are continually searching for new binding sites for future cancer drugs, but this process can be slow and cumbersome.
Now, researchers at Dana-Farber Cancer Institute, in collaboration with scientists at MIT and École Polytechnique Fédérale de Lausanne, have developed an artificial intelligence (AI) tool called AF2BIND that could help speed up this search, potentially overcoming a major bottleneck in cancer drug discovery.

In a study published in Nature Methods, the team used the tool to uncover thousands of previously unknown binding sites in disease-associated proteins across the human proteome, the full set of proteins produced by the body.
“We wanted to make a comprehensive list of all the proteins that have binding sites that could be potentially targeted by a small molecule,” says Nicholas Polizzi, PhD, an independent investigator in the Department of Cancer Biology at Dana-Farber, and lead author of the study. “Being able to track which proteins in the human proteome might be druggable—and where—would help focus drug discovery efforts.”
For patients, this could mean identifying new treatment targets faster, especially for cancers with limited options.
The power of prediction
In drug discovery, proteins are often described as “locks” and drugs as “keys.” For a drug to work, scientists must find a specific pocket on a protein’s surface where a small molecule can fit precisely to alter the protein’s activity.
Traditionally, scientists have searched for these binding sites by comparing proteins to similar ones that have already been studied, or by calculating water-accessible cavities using a protein’s 3D structure. But when a protein is unlike anything scientists have seen before, it becomes harder to predict where a drug might bind.
Recent advances in AI have made it possible to predict the 3D structures of proteins from their amino acid sequences, giving scientists an unprecedented view of their shapes, without having to wait for an experimentally determined structure, which may take years or never materialize. Even with these detailed maps, however, it is still not always clear where a “druggable” binding site is located, Polizzi explains.
To address this, the research team turned to AlphaFold2, an AI system that can predict protein structures. While AlphaFold2 does not directly identify where small molecules bind, the team found that it contains hidden information about how proteins interact with other molecules.
The researchers developed AF2BIND to tap into that information. The approach has AlphaFold2 predict a protein’s 3D structure together with what the team calls “bait” amino acids, which can approximate the kinds of interactions a small molecule might make with the protein. The way AlphaFold2 handles the baits gave the researchers “features” for training a new model (AF2BIND) to predict binding-site residues.
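For readers who want a concrete picture of what "training a model on features to predict binding-site residues" can look like, here is a minimal toy sketch. It is not the AF2BIND code: the feature vectors are synthetic random numbers rather than anything produced by AlphaFold2, the feature size of 4 is arbitrary, and the simple logistic-regression classifier and all function names are assumptions chosen only to illustrate the general idea of scoring each residue from a feature vector.

```python
import math
import random

random.seed(0)  # fixed seed so the toy data is reproducible

N_FEATURES = 4  # hypothetical feature size, chosen only for illustration


def make_residue(is_binding):
    """Return a synthetic feature vector for one residue (not real AF2BIND data)."""
    center = 1.0 if is_binding else -1.0
    return [random.gauss(center, 0.5) for _ in range(N_FEATURES)]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Probability-like score that a residue belongs to a binding site."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, epochs=50, lr=0.1):
    """Fit a logistic-regression classifier with plain stochastic gradient descent."""
    weights = [0.0] * N_FEATURES
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = score(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy "protein": 40 binding-site residues (label 1) and 160 other residues (label 0).
labels = [1] * 40 + [0] * 160
features = [make_residue(y == 1) for y in labels]

weights, bias = train(features, labels)
scores = [score(weights, bias, x) for x in features]
predicted = [1 if s >= 0.5 else 0 for s in scores]

# Accuracy is measured on the same toy data used for training, for brevity only.
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

In this sketch each residue receives a score between 0 and 1, and residues scoring at least 0.5 are flagged as likely binding-site residues. A real model would be trained on features derived from known protein structures and evaluated on proteins it had not seen.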
When applied across the human proteome, AF2BIND predicted 20,302 binding sites within 13,686 proteins. Of these, more than 8,000, many in disease-related proteins, had never been identified using existing methods.
“What sets our model apart is that we are predicting less obvious binding sites that can be a bit shallower than traditional methods like,” Polizzi says. “We are not using traditional biophysical features like pocket depth. Instead, we rely on abstract features learned by a neural network, which helps reduce human bias in the prediction. This allows us to identify novel small-molecule binding sites that might otherwise have been missed.”
Speeding up drug discovery
Looking ahead, the researchers are focused on pushing the boundaries of what the technology can uncover, especially when it comes to hidden, or cryptic, binding sites.
“These are pockets that don’t appear to be there in a three-dimensional structure, but we know the protein binds to a molecule because the pocket opens up,” Polizzi explains. “Our model already does a decent job at predicting some of these, and that’s something we want to improve in the next version.”
Targeting these hidden sites could open the door to a new class of highly specific drugs, particularly for proteins that have long been considered difficult, or even impossible, to target.
While tools like AF2BIND won’t eliminate the time required for clinical trials, Polizzi says, they could help scientists reach the trial stage faster, and with potentially more specific molecules that target unique sites with little overlap with other proteins.
“Drug discovery still remains slow because of clinical trials, which this won’t speed up,” he notes. “But it can speed up getting to the right binding site.”
