Why Finding the Right Pocket Is the First Step in Drug Discovery
Most small-molecule drugs work by fitting into a pocket on a protein's surface. The drug binds to the pocket, blocks or modifies the protein's function, and produces a therapeutic effect. This is the lock-and-key model that has driven pharmaceutical chemistry for over a century. But before you can design a key, you need to find the lock.
Finding druggable binding pockets is the critical first step in structure-based drug discovery. Get this wrong, and everything downstream fails: your virtual screening will target the wrong site, your docking scores will be meaningless, and your lead compounds will not bind. Get it right, and you have a clear target for rational drug design, a defined search space for virtual screening, and a structural basis for understanding why your hits work.
Not every pocket on a protein surface is druggable. Proteins are covered in grooves, clefts, and depressions, but only a small fraction of these can bind drug-like molecules with sufficient affinity to produce a pharmacological effect. The difference between a druggable pocket and a non-druggable depression comes down to a specific set of geometric and physicochemical properties. In this guide, we will explain exactly what makes a pocket druggable, walk through the major tools for finding pockets, and show you how to go from a protein structure to a docking-ready binding site using the SciRouter API.
What Makes a Binding Pocket Druggable?
Druggability is the likelihood that a pocket on a protein surface can bind a small molecule (MW < 900 Da) with nanomolar affinity. A pocket can be perfectly shaped to accommodate a ligand but still be undruggable if it lacks the right physicochemical properties. Decades of structure-activity relationship data have revealed the key features that distinguish druggable pockets from undruggable surface features.
Volume and Enclosure
Druggable pockets typically have a volume between 300 and 1,000 cubic angstroms. Below 300 Å³, there is simply not enough space to accommodate a drug-like molecule with sufficient contact points for tight binding. Above 1,000 Å³, the pocket is so large that small molecules rattle around without making enough interactions – these oversized cavities are more suited to protein-protein interactions or peptide ligands. Enclosure is equally important: a well-enclosed pocket buries the ligand away from solvent, strengthening hydrophobic interactions and reducing the entropic penalty of binding.
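As a rough illustration of the volume window described above, the check can be written as a small filter. The function name and the returned labels are hypothetical; only the 300 and 1,000 cubic angstrom bounds come from this section.

```python
def classify_pocket_volume(volume_a3: float) -> str:
    """Label a pocket volume (cubic angstroms) against the 300-1,000 window."""
    if volume_a3 < 0:
        raise ValueError("volume must be non-negative")
    if volume_a3 < 300:
        return "too small"   # not enough room for a drug-like molecule
    if volume_a3 > 1000:
        return "too large"   # better suited to peptides or PPI partners
    return "drug-sized"


for v in (220, 410, 680, 1500):
    print(f"{v} A^3 -> {classify_pocket_volume(v)}")
```

Volume alone says nothing about enclosure, so a filter like this is best used to discard obvious outliers before looking at the other properties.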
Hydrophobicity and Polarity Balance
The most druggable pockets have a balanced mix of hydrophobic and polar residues. Hydrophobic patches (leucine, isoleucine, valine, phenylalanine) provide the desolvation-driven binding energy that dominates most drug-target interactions. Polar residues (aspartate, glutamate, asparagine, serine) provide directional hydrogen bonds that confer specificity. A pocket that is entirely hydrophobic will bind many molecules nonspecifically. A pocket that is entirely polar will have poor affinity because water molecules compete for the hydrogen bond donors and acceptors. The ideal ratio is approximately 60-70% hydrophobic surface area with 2-4 hydrogen bond anchoring points.
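A quick way to sanity-check the hydrophobic balance of a pocket is to count residue types in its lining. This is only a sketch: it approximates the surface-area ratio above with a residue count, and the membership of the `HYDROPHOBIC` set is an assumption made for illustration, not a published classification.

```python
# Assumed hydrophobic set (three-letter codes); adjust to your own definition.
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO"}


def hydrophobic_fraction(residue_names):
    """Fraction of pocket-lining residues that fall in the HYDROPHOBIC set."""
    names = [n.upper() for n in residue_names]
    if not names:
        raise ValueError("pocket has no residues")
    return sum(n in HYDROPHOBIC for n in names) / len(names)


# Invented pocket lining, used only to exercise the function.
lining = ["LEU", "VAL", "PHE", "ILE", "MET", "ALA", "ASP", "SER", "ASN", "GLU"]
frac = hydrophobic_fraction(lining)
print(f"hydrophobic fraction: {frac:.2f}")
print(f"inside the 60-70% window: {0.60 <= frac <= 0.70}")
```

A count-based fraction ignores how much of each residue is actually exposed, so treat it as a first-pass estimate rather than a substitute for a surface-area calculation.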
Druggability Score: Putting It Together
Modern druggability scoring combines volume, enclosure, hydrophobicity, hydrogen bond capacity, and shape into a single score between 0 and 1. Scores above 0.7 indicate a highly druggable pocket – one where conventional medicinal chemistry is very likely to find a potent ligand. Scores between 0.4 and 0.7 represent challenging but tractable targets that may require fragment-based approaches or unusual chemotypes. Scores below 0.4 suggest that traditional small-molecule approaches are unlikely to succeed, and alternative modalities (biologics, PROTACs, molecular glues) should be considered.
Traditional Pocket Detection Tools
Several established tools have been the workhorses of pocket detection in structural biology and drug discovery. Understanding their approaches helps contextualize what modern API-based tools offer.
fpocket: Geometry-Based Detection
fpocket is the most widely used open-source pocket detection tool, published in 2009 and still actively maintained. It works by computing Voronoi tessellation and alpha spheres to identify cavities on the protein surface. Alpha spheres are spheres that contact four atoms on their boundary and contain no internal atoms – they naturally cluster in concavities. fpocket filters these clusters by size and physicochemical properties to rank pockets by druggability.
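To make the alpha-sphere definition concrete, the sketch below computes the sphere through four atom centres and checks that no other atom lies inside it. It is not fpocket's implementation: atoms are treated as points, the helper names are made up, and the clustering and radius filtering that fpocket applies afterwards are left out.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def circumsphere(p0, p1, p2, p3):
    """Return (center, radius) of the sphere through four non-coplanar points."""
    sq0 = sum(x * x for x in p0)
    # Equal distance to p0 and p_i gives: 2 (p_i - p0) . c = |p_i|^2 - |p0|^2
    rows = [[2.0 * (p[k] - p0[k]) for k in range(3)] for p in (p1, p2, p3)]
    rhs = [sum(x * x for x in p) - sq0 for p in (p1, p2, p3)]
    det = _det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("points are coplanar; no unique sphere")
    center = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(_det3(m) / det)
    return tuple(center), math.dist(center, p0)


def is_alpha_sphere(center, radius, atoms, tol=1e-9):
    """True if no atom centre lies strictly inside the sphere."""
    return all(math.dist(center, a) >= radius - tol for a in atoms)


# Four atoms on a unit sphere plus one distant atom (toy coordinates).
atoms = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, -3)]
center, radius = circumsphere(*atoms[:4])
print(f"radius = {radius:.3f}, empty = {is_alpha_sphere(center, radius, atoms)}")
```

In a real detector the candidate spheres come from the Voronoi vertices of the whole structure rather than from hand-picked quadruplets, which is why the tessellation step matters.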
fpocket is fast (sub-second for most proteins), requires only a PDB file as input, and provides a druggability score validated against experimental data. Its main limitation is that it uses a static structure – pockets that only appear during protein dynamics (cryptic sites) will be missed. The command-line interface requires local installation and parsing of output files.
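The output-parsing step might look something like the sketch below. It assumes fpocket was run as `fpocket -f protein.pdb` and that its per-pocket summary uses `Pocket N :` headers followed by `Key : value` lines; verify that layout against the output of your installed version. The sample text and all numbers in it are invented for illustration.

```python
import re


def parse_pocket_info(text):
    """Parse 'Pocket N :' blocks of 'Key : value' lines into a list of dicts."""
    pockets, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"Pocket\s+(\d+)\s*:", line)
        if header:
            current = {"id": int(header.group(1))}
            pockets.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            try:
                current[key.strip()] = float(value)
            except ValueError:
                current[key.strip()] = value.strip()  # keep non-numeric fields as text
    return pockets


# Invented sample in the assumed layout (not real fpocket output).
SAMPLE = """\
Pocket 1 :
    Score :  0.412
    Druggability Score :  0.805
    Volume :  684.2

Pocket 2 :
    Score :  0.250
    Druggability Score :  0.310
    Volume :  221.7
"""

for p in parse_pocket_info(SAMPLE):
    print(p["id"], p["Druggability Score"], p["Volume"])
```

Keeping the parser tolerant of unknown keys means it does not need to change if a different fpocket version reports additional descriptors.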
SiteMap (Schrödinger): Physics-Based Assessment
SiteMap, part of the Schrödinger drug discovery suite, uses an energy-based approach. It maps the protein surface with probe atoms, calculates van der Waals and electrostatic energies at each grid point, and identifies sites that are energetically favorable for ligand binding. SiteMap produces detailed SiteScore and Dscore (druggability score) metrics that correlate well with experimental binding data.
SiteMap is widely treated as a reference standard for druggability prediction in pharmaceutical companies, with Dscore > 0.98 indicating druggable sites with high confidence. However, it requires a commercial Schrödinger license (starting at several thousand dollars per year), runs only on specific platforms, and is not accessible via API for automated workflows.
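The probe-mapping idea described for SiteMap can be illustrated with a generic interaction-energy function. This is a textbook 12-6 Lennard-Jones plus Coulomb sum, not SiteMap's actual scoring; `epsilon`, `sigma`, and the dielectric constant are placeholder values rather than parameters from a real force field.

```python
import math

COULOMB_K = 332.06  # kcal*A/(mol*e^2), standard electrostatic conversion factor


def probe_energy(probe_xyz, atoms, probe_charge=0.0,
                 epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Sum Lennard-Jones and Coulomb terms between one probe and a set of atoms.

    atoms: iterable of (x, y, z, partial_charge). Energies are in kcal/mol.
    """
    total = 0.0
    for x, y, z, q in atoms:
        r = math.dist(probe_xyz, (x, y, z))
        if r < 1e-6:
            return math.inf  # probe sits on top of an atom
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)               # van der Waals
        total += COULOMB_K * probe_charge * q / (dielectric * r)  # electrostatics
    return total


# One neutral atom at the origin; scan a probe along the x axis.
atom = [(0.0, 0.0, 0.0, 0.0)]
for r in (3.0, 3.5, 3.93, 5.0):
    print(f"r = {r:.2f} A  E = {probe_energy((r, 0.0, 0.0), atom):+.4f} kcal/mol")
```

Evaluating a function like this on a 3D grid around the protein and keeping the favorable (negative) points is the general shape of an energy-based site map.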
DoGSiteScorer: Machine-Learning Enhanced
DoGSiteScorer combines geometric pocket detection with a machine learning druggability classifier trained on a curated dataset of druggable and non-druggable binding sites. It is available as a free web server and provides detailed pocket descriptors including volume, surface area, depth, and residue composition. The druggability prediction achieves an AUC of 0.88 on benchmark datasets, making it competitive with commercial tools.
The SciRouter Approach: API-First Pocket Detection
Traditional pocket detection tools require local installation, manual file preparation, and parsing of heterogeneous output formats. For a single protein, this is manageable. For screening dozens of targets, integrating into automated pipelines, or enabling AI agents to reason about protein druggability, you need an API.
The SciRouter pocket detection endpoint accepts a PDB ID or a protein structure (from ESMFold or any other prediction tool) and returns ranked binding pockets with druggability scores, volumes, residue lists, and center-of-mass coordinates that can be passed directly to downstream docking tools like DiffDock or AutoDock Vina.
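For AutoDock Vina, handing over a pocket center usually means writing the search-box entries of a Vina config file. The sketch below renders those entries from a center coordinate; the `center_*` and `size_*` keys are Vina's own, while the function name, the 20 Å default box, and the example coordinates are assumptions for illustration.

```python
def vina_box_config(center, size=(20.0, 20.0, 20.0)):
    """Render AutoDock Vina search-box lines from a pocket centre (angstroms)."""
    if len(center) != 3 or len(size) != 3:
        raise ValueError("center and size must each have three values")
    lines = [f"center_{axis} = {value:.3f}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {value:.3f}" for axis, value in zip("xyz", size)]
    return "\n".join(lines)


# Made-up pocket centre standing in for an API response.
pocket = {"center_x": 1.5, "center_y": -7.25, "center_z": 12.0}
print(vina_box_config([pocket["center_x"], pocket["center_y"], pocket["center_z"]]))
```

A fixed box is the simplest choice; scaling the box to the pocket volume or to the extent of the lining residues is a reasonable refinement.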
    import scirouter

    client = scirouter.SciRouter(api_key="sk-sci-YOUR_KEY")

    # Detect pockets on KRAS G12C (PDB: 6OIM)
    result = client.proteins.detect_pockets(pdb_id="6OIM")

    for i, pocket in enumerate(result.pockets):
        print(f"Pocket {i+1}:")
        print(f"  Druggability score: {pocket.druggability_score:.2f}")
        print(f"  Volume: {pocket.volume:.0f} A^3")
        print(f"  Center: ({pocket.center_x:.1f}, {pocket.center_y:.1f}, {pocket.center_z:.1f})")
        print(f"  Residues: {', '.join(pocket.residues[:10])}...")
        print()
The key advantage of API-based pocket detection is composability. The output feeds directly into other SciRouter endpoints: pass the pocket center coordinates to DiffDock for targeted docking, use the residue list to set up ProteinMPNN for interface design, or feed the druggability scores into a pipeline that automatically prioritizes targets.
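The target-prioritization idea can be sketched without any API calls by working on plain dictionaries shaped like the pocket results. The dictionary keys mirror the attribute names used in this article; the function name, the target names, and every number below are made up.

```python
def prioritize_targets(pockets_by_target, min_score=0.7):
    """Rank targets by their best pocket score, dropping those below min_score."""
    ranked = []
    for target, pockets in pockets_by_target.items():
        if not pockets:
            continue  # no pocket detected for this target
        best = max(pockets, key=lambda p: p["druggability_score"])
        if best["druggability_score"] >= min_score:
            ranked.append((target, best["druggability_score"], best["volume"]))
    return sorted(ranked, key=lambda row: row[1], reverse=True)


# Invented results for four hypothetical targets.
results = {
    "target_A": [{"druggability_score": 0.85, "volume": 680},
                 {"druggability_score": 0.31, "volume": 220}],
    "target_B": [{"druggability_score": 0.55, "volume": 350}],
    "target_C": [{"druggability_score": 0.91, "volume": 520}],
    "target_D": [],
}
for target, score, volume in prioritize_targets(results):
    print(f"{target}: best score {score:.2f}, volume {volume} A^3")
```

Ranking on the single best pocket is a simplification; a pipeline that cares about allosteric sites may prefer to keep every pocket above the cutoff.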
From Predicted Structure to Pockets
If your target protein does not have an experimental crystal structure, you can chain ESMFold prediction with pocket detection in a single workflow:
    import scirouter

    client = scirouter.SciRouter(api_key="sk-sci-YOUR_KEY")

    # Step 1: Predict structure from sequence
    # First 40 residues of KRAS G12C, shortened for illustration.
    # Use the full-length sequence in practice: this fragment ends before switch II.
    sequence = "MTEYKLVVVGACGVGKSALTIQLIQNHFVDEYDPTIEDSY"
    fold_result = client.proteins.fold(sequence=sequence)

    # Step 2: Detect pockets on the predicted structure
    pockets = client.proteins.detect_pockets(pdb_string=fold_result.pdb_string)

    # Step 3: Use top pocket for docking
    top_pocket = pockets.pockets[0]
    print(f"Top pocket druggability: {top_pocket.druggability_score:.2f}")
    print(f"Volume: {top_pocket.volume:.0f} A^3")

    # Step 4: Dock a known ligand into the top pocket
    dock_result = client.docking.diffdock(
        protein_pdb=fold_result.pdb_string,
        # Sotorasib (stereochemistry omitted)
        ligand_smiles="CC1=C(C(=NC=C1)C(C)C)N2C3=NC(=C(C=C3C(=NC2=O)N4CCN(CC4C)C(=O)C=C)F)C5=C(C=CC=C5F)O",
        pocket_center=[top_pocket.center_x, top_pocket.center_y, top_pocket.center_z],
    )
    print(f"Docking confidence: {dock_result.confidence:.2f}")
Interpreting Druggability Scores
Druggability scores are probabilities, not guarantees. Here is how to interpret them in practice and make decisions based on the results.
High Druggability (Score 0.7 – 1.0)
Pockets scoring above 0.7 are excellent drug discovery targets. They have sufficient volume, good enclosure, a favorable balance of hydrophobic and polar residues, and geometric features that accommodate drug-like molecules. Most enzyme active sites and well-characterized allosteric sites fall in this range. For these pockets, proceed directly to virtual screening and docking with high confidence that hits can be found.
Moderate Druggability (Score 0.4 – 0.7)
Pockets in this range are tractable but challenging. They may be slightly too shallow, too polar, or too solvent-exposed for conventional drug discovery approaches. Consider fragment-based screening (which requires lower affinity for initial hits), covalent inhibitors (which can compensate for weaker noncovalent binding), or exploring whether conformational changes might open the pocket further. KRAS was long considered undruggable until the cysteine introduced by the G12C mutation was exploited as a covalent handle.
Low Druggability (Score Below 0.4)
Pockets below 0.4 are generally not suitable for conventional small-molecule approaches. These are typically flat, featureless surfaces involved in protein-protein interactions. Alternative strategies include PROTACs (which only need weak binding), molecular glues, stapled peptides, or biologics like antibodies and nanobodies. Do not waste resources screening millions of compounds against a pocket with a score of 0.2 – the physics simply does not support tight small-molecule binding.
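The three tiers above translate directly into a small triage helper. The function name is hypothetical, and placing a score of exactly 0.7 in the high tier is an assumption, since the text only says "above 0.7".

```python
def triage(score):
    """Map a druggability score in [0, 1] to a tier and a suggested strategy."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("druggability score must be between 0 and 1")
    if score >= 0.7:  # assumption: the boundary value counts as high
        return "high", "virtual screening and docking"
    if score >= 0.4:
        return "moderate", "fragment-based screening or covalent inhibitors"
    return "low", "PROTACs, molecular glues, stapled peptides, or biologics"


for s in (0.85, 0.72, 0.55, 0.31):
    tier, strategy = triage(s)
    print(f"{s:.2f}: {tier} -> {strategy}")
```

Scores close to a boundary deserve a look at the underlying descriptors rather than a hard cutoff.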
Case Study: KRAS G12C – The Allosteric Pocket That Changed Oncology
KRAS is the most commonly mutated oncogene in human cancers, driving approximately 25% of all tumors. For four decades, KRAS was considered completely undruggable. The protein has a smooth, featureless surface surrounding its GTP-binding site, and the GTP-binding site itself has picomolar affinity for its natural substrate – no small molecule could compete. The prevailing consensus, stated in thousands of papers, was that KRAS could not be targeted by small molecules.
That consensus was shattered by the discovery of the Switch-II pocket. In 2013, Kevan Shokat's lab at UCSF showed that compounds tethered to the mutant cysteine of KRAS G12C (glycine at position 12 replaced by cysteine) bind in a previously unrecognized pocket beneath the switch-II region, one that is not apparent in ligand-free KRAS structures. The compounds bind KRAS in its GDP-bound (inactive) state, and the mutant cysteine provides a handle for covalent attachment. A crystal structure from this work is deposited as PDB 4LYJ.
Running pocket detection on the KRAS G12C structure (PDB: 6OIM, which includes the covalent inhibitor AMG 510 / sotorasib) reveals the Switch-II pocket clearly:
    import scirouter

    client = scirouter.SciRouter(api_key="sk-sci-YOUR_KEY")

    # Analyze KRAS G12C binding pockets
    result = client.proteins.detect_pockets(pdb_id="6OIM")

    # Pocket 1: Nucleotide binding site (orthosteric)
    # Pocket 2: Switch-II allosteric pocket (sotorasib site)
    for i, pocket in enumerate(result.pockets[:3]):
        print(f"Pocket {i+1}: score={pocket.druggability_score:.2f}, "
              f"volume={pocket.volume:.0f} A^3, "
              f"residues={len(pocket.residues)}")

    # Output:
    # Pocket 1: score=0.85, volume=680 A^3, residues=24 (nucleotide site)
    # Pocket 2: score=0.72, volume=410 A^3, residues=18 (Switch-II pocket)
    # Pocket 3: score=0.31, volume=220 A^3, residues=11 (surface groove)
The Switch-II pocket scores 0.72, just above the 0.7 threshold for high druggability. This is consistent with the historical difficulty: the pocket is real and tractable, but it required a covalent strategy to achieve sufficient potency. The volume of approximately 410 Å³ is on the smaller end of druggable space, which helps explain why the successful drugs (sotorasib, adagrasib) are compact, tightly designed molecules rather than large, sprawling inhibitors.
Key residues lining the Switch-II pocket include His95, Tyr96, Asp69, Met72, Gln99, and the mutant Cys12 itself. The pocket is predominantly hydrophobic with a few critical polar contacts. Sotorasib (SMILES, stereochemistry omitted: CC1=C(C(=NC=C1)C(C)C)N2C3=NC(=C(C=C3C(=NC2=O)N4CCN(CC4C)C(=O)C=C)F)C5=C(C=CC=C5F)O) fills this pocket precisely, with its acrylamide warhead forming a covalent bond to Cys12 and its pyridopyrimidinone core occupying the hydrophobic floor.
The KRAS G12C story illustrates why pocket detection should always examine the top 3–5 pockets, not just the highest-scoring one. The allosteric Switch-II pocket was not the highest-druggability site on KRAS, but it was the one that enabled a completely new therapeutic strategy.
From Pocket to Docking: The Complete Workflow
Once you have identified a druggable pocket, the next step is molecular docking: computationally placing small molecules into the pocket to predict binding affinity and pose. SciRouter integrates pocket detection with docking tools so you can run the entire workflow through a single API.
Step 1: Identify Your Target and Find Pockets
Start with either a PDB ID for an experimental structure or a protein sequence for ESMFold prediction. Run pocket detection and examine the top 3 pockets. Select the pocket that matches your biological hypothesis (active site for competitive inhibitors, allosteric sites for modulators).
Step 2: Prepare Ligands
Your ligand library can come from multiple sources: known actives from ChEMBL, commercial compound libraries, or AI-generated molecules from REINVENT4. Each compound needs a valid SMILES string. Use the SciRouter molecular properties endpoint to pre-filter compounds by drug-likeness before docking.
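As a stand-in for the drug-likeness pre-filter, the sketch below applies Lipinski's rule of five to properties that have already been computed. The property values are approximate or invented, the dictionary keys are hypothetical, and the SciRouter molecular properties endpoint is not called or modeled here.

```python
def lipinski_violations(props):
    """Count rule-of-five violations from precomputed molecular properties."""
    checks = [
        props["mw"] > 500,    # molecular weight in Da
        props["logp"] > 5,    # octanol-water partition coefficient
        props["hbd"] > 5,     # hydrogen-bond donors
        props["hba"] > 10,    # hydrogen-bond acceptors
    ]
    return sum(checks)


def drug_like(props, max_violations=1):
    """Common convention: tolerate at most one violation."""
    return lipinski_violations(props) <= max_violations


# Approximate values for aspirin and an invented greasy compound.
library = {
    "aspirin": {"mw": 180.2, "logp": 1.2, "hbd": 1, "hba": 4},
    "hypothetical_big_lipophile": {"mw": 720.0, "logp": 6.8, "hbd": 2, "hba": 9},
}
keep = [name for name, props in library.items() if drug_like(props)]
print(keep)
```

Filtering before docking keeps the expensive step focused on compounds that have a realistic chance of becoming oral drugs.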
Step 3: Dock into the Identified Pocket
    import scirouter

    client = scirouter.SciRouter(api_key="sk-sci-YOUR_KEY")

    # Find pockets
    pockets = client.proteins.detect_pockets(pdb_id="6OIM")
    target_pocket = pockets.pockets[1]  # Switch-II pocket

    # Dock multiple ligands into the identified pocket
    ligands = [
        # Sotorasib, stereochemistry omitted
        ("sotorasib", "CC1=C(C(=NC=C1)C(C)C)N2C3=NC(=C(C=C3C(=NC2=O)N4CCN(CC4C)C(=O)C=C)F)C5=C(C=CC=C5F)O"),
        ("aspirin", "CC(=O)Oc1ccccc1C(=O)O"),
        ("ibuprofen", "CC(C)Cc1ccc(cc1)C(C)C(=O)O"),
    ]
    for name, smiles in ligands:
        result = client.docking.diffdock(
            protein_pdb_id="6OIM",
            ligand_smiles=smiles,
            pocket_center=[target_pocket.center_x, target_pocket.center_y, target_pocket.center_z],
        )
        print(f"{name}: confidence={result.confidence:.2f}")
Step 4: Analyze and Rank Results
DiffDock returns a confidence score for each predicted pose. The score reflects how reliable the pose is, not how tightly the compound binds, so treat it as a ranking signal rather than an affinity estimate. Compounds with confidence above 0.7 are reasonable candidates for follow-up. Filter these through ADMET prediction to ensure drug-like properties before committing to synthesis. This complete pipeline – pocket detection, docking, ADMET filtering – is a common workflow for computational hit identification.
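The ranking step can be sketched on plain records that combine a docking confidence with an ADMET verdict. The record layout, the `admet_pass` flag, and all values are hypothetical; the 0.7 cutoff is the one suggested in this section.

```python
def shortlist(results, min_confidence=0.7):
    """Keep compounds above the confidence cutoff that also pass ADMET, best first."""
    kept = [r for r in results
            if r["confidence"] > min_confidence and r["admet_pass"]]
    return sorted(kept, key=lambda r: r["confidence"], reverse=True)


# Invented docking and ADMET results.
docked = [
    {"name": "compound_1", "confidence": 0.74, "admet_pass": True},
    {"name": "compound_2", "confidence": 0.91, "admet_pass": False},
    {"name": "compound_3", "confidence": 0.88, "admet_pass": True},
    {"name": "compound_4", "confidence": 0.42, "admet_pass": True},
]
for hit in shortlist(docked):
    print(f"{hit['name']}: confidence={hit['confidence']:.2f}")
```

Keeping the two filters in one place makes it easy to see why a high-confidence compound was dropped.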
Advanced Pocket Analysis Techniques
Cryptic Pockets and Conformational Sampling
Some of the most therapeutically important pockets are cryptic – they do not exist in the apo (ligand-free) protein structure but only form when the protein undergoes conformational changes. The MEK allosteric pocket and the BCL-2 BH3 groove are classic examples. Detecting cryptic pockets requires either molecular dynamics simulation (generating an ensemble of conformations and running pocket detection on each) or experimental fragment screening (soaking hundreds of fragment molecules and checking for binding by X-ray crystallography).
If you suspect a cryptic pocket exists, run pocket detection on multiple conformations of your target. If crystal structures exist with different ligands bound (or in different space groups), each may reveal different pocket conformations. The PDB often has multiple structures of the same protein – compare pocket results across all of them.
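Comparing pocket results across conformations amounts to grouping pocket centers that fall close together and noting which structures each group appears in. The greedy clustering below is one simple way to do that; the 4 Å cutoff, the function name, and the coordinates are assumptions, and the structures are assumed to be superposed in a common frame beforehand.

```python
import math


def cluster_pocket_centers(detections, cutoff=4.0):
    """Greedily cluster pocket centres observed across conformations.

    detections: list of (conformation_id, (x, y, z)) in a shared coordinate frame.
    Returns dicts with a running-mean 'center', a 'count', and the set of
    'conformations' in which a pocket was seen within `cutoff` angstroms.
    """
    clusters = []
    for conf_id, xyz in detections:
        for c in clusters:
            if math.dist(c["center"], xyz) <= cutoff:
                n = c["count"]
                c["center"] = tuple((m * n + v) / (n + 1)
                                    for m, v in zip(c["center"], xyz))
                c["count"] = n + 1
                c["conformations"].add(conf_id)
                break
        else:  # no existing cluster was close enough
            clusters.append({"center": tuple(xyz), "count": 1,
                             "conformations": {conf_id}})
    return clusters


# Toy pocket centres from an apo structure, a holo structure, and an MD frame.
found = [("apo", (0, 0, 0)), ("holo", (1, 0, 0)),
         ("holo", (20, 0, 0)), ("md_frame_7", (21, 1, 0))]
for c in cluster_pocket_centers(found):
    flag = "candidate cryptic site" if "apo" not in c["conformations"] else "persistent"
    print(c["center"], sorted(c["conformations"]), flag)
```

A site that shows up in ligand-bound or simulated conformations but never in the apo structure is the pattern to look for.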
Pocket Comparison Across Species
Before investing in a drug discovery campaign, check whether your pocket is conserved across species relevant to preclinical testing. A pocket that exists in human but not in mouse or rat will create problems for animal model studies. Use pocket detection on homologous proteins from different species and compare the residue lists, volumes, and druggability scores. Differences in even a single residue can significantly alter pocket geometry and drug binding.
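A residue-level comparison of two pockets can be sketched as follows. It assumes residue labels of the form `HIS95` and that both proteins share a numbering scheme, which for real orthologs normally requires a sequence alignment first. The function names and the two residue lists are made up and do not describe real species data.

```python
import re


def parse_residue(label):
    """Split a label such as 'HIS95' into ('HIS', 95)."""
    m = re.fullmatch(r"([A-Za-z]{3})(\d+)", label)
    if not m:
        raise ValueError(f"unrecognised residue label: {label!r}")
    return m.group(1).upper(), int(m.group(2))


def compare_pockets(residues_a, residues_b):
    """Report conserved, substituted, and unmatched pocket positions."""
    a = {pos: name for name, pos in map(parse_residue, residues_a)}
    b = {pos: name for name, pos in map(parse_residue, residues_b)}
    shared = a.keys() & b.keys()
    union = a.keys() | b.keys()
    conserved = sum(1 for p in shared if a[p] == b[p])
    return {
        "identity": conserved / len(union) if union else 1.0,
        "substitutions": {p: (a[p], b[p]) for p in sorted(shared) if a[p] != b[p]},
        "only_a": sorted(a.keys() - b.keys()),
        "only_b": sorted(b.keys() - a.keys()),
    }


# Invented residue lists for two hypothetical orthologs.
species_a = ["HIS95", "TYR96", "GLN99", "MET72"]
species_b = ["HIS95", "PHE96", "GLN99", "ASP69"]
print(compare_pockets(species_a, species_b))
```

Substitutions at positions that contact the ligand are the ones most likely to matter for cross-species potency.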
Protein-Protein Interaction Pockets
Protein-protein interactions (PPIs) are among the most important drug targets but also the hardest to drug. PPI interfaces are typically flat, large (1,500–3,000 Å²), and lack the deep concavities that pocket detection algorithms look for. Druggability scores for PPI sites are usually below 0.4. However, some PPIs do have hot-spot residues that contribute disproportionately to binding energy, and these hot spots sometimes sit in small grooves that are detectable. The MDM2-p53 interaction (PDB: 1YCR) is the canonical example: the p53 binding groove on MDM2 has a druggability score around 0.65, which enabled the development of nutlins and other MDM2 inhibitors.
Common Mistakes in Pocket Detection
- Using the wrong protein conformation – Pockets can open and close depending on whether the protein is in an active or inactive state. Use the conformation relevant to your therapeutic strategy.
- Ignoring crystal contacts – Crystal packing can create artificial pockets at protein-protein interfaces in the crystal lattice. These do not exist in solution. Check that your pocket is not at a crystal contact.
- Trusting only the top-ranked pocket – Always examine the top 3–5 pockets. Allosteric sites are often ranked lower than orthosteric sites but may be better drug targets.
- Neglecting water molecules – Structural water molecules can be integral parts of binding pockets. Some waters mediate key hydrogen bonds between the protein and ligand. Removing them can change the pocket shape and druggability score.
- Skipping druggability assessment – Finding a pocket is not the same as confirming it is druggable. Always check the druggability score before investing in a screening campaign.
Try It Now: Find Pockets on Your Protein
You can detect binding pockets right now using the free Pocket Finder tool – no account or API key required. Enter a PDB ID and get ranked pockets with druggability scores, volumes, and residue lists instantly.
For programmatic access and integration into automated pipelines, create a free SciRouter account at scirouter.ai/signup. The free tier includes 5,000 API calls per month. Combine pocket detection with DiffDock and AutoDock Vina to build complete structure-based drug discovery workflows.
Pocket detection transforms the nebulous question of "can we drug this target?" into a quantitative, answerable assessment. Whether you are starting a new drug discovery campaign, evaluating a target for tractability, or building an automated screening pipeline, identifying the right pocket is where it all begins.