Can This Molecule Be Made? How to Check Synthetic Accessibility Before You Order Synthesis

Learn how synthetic accessibility scores predict whether a molecule can be synthesized. SA score calculator, retrosynthesis basics, and real drug examples with API walkthrough.

Ryan Bethencourt
April 8, 2026

The Synthesizability Problem in Drug Discovery

You have designed the perfect molecule on your computer. It binds the target with nanomolar affinity, passes every ADMET filter, and sits squarely in drug-like chemical space. There is just one problem – nobody can actually make it in a lab.

This is one of the most common and costly failures in computational drug discovery. Teams spend weeks optimizing virtual molecules only to discover that their top candidates require twenty-step synthesis routes, exotic reagents, or reactions that do not work reliably at scale. This disconnect between computational design and synthetic reality can stall a program just as surely as poor binding affinity.

Synthetic accessibility scoring exists to catch this problem early. Before you invest in docking studies, ADMET profiling, or lead optimization, you can check whether your molecule is realistically synthesizable. A two-second computation can save months of wasted effort.

In this guide, we will explain how synthetic accessibility scores work, walk through real examples from approved drugs, and show you how to check any molecule using the SciRouter API. Whether you are screening a generative chemistry output of 500 molecules or evaluating a single lead candidate, SA scoring should be one of your first filters.

What Is a Synthetic Accessibility Score?

The synthetic accessibility (SA) score was introduced by Peter Ertl and Ansgar Schuffenhauer at Novartis in 2009. It estimates how difficult a molecule would be to synthesize using conventional organic chemistry, producing a single number on a scale from 1 (trivial to make) to 10 (essentially impossible).

The algorithm works by combining two components. The first is a fragment score that measures how common the molecule's substructures are in known compounds. Molecules built from frequently occurring fragments – the kind of pieces that appear in commercial building block catalogs – score well. Unusual or unprecedented fragment combinations score poorly. The fragment frequencies are typically derived from large databases like PubChem or ChEMBL, giving the model a statistical picture of what chemists actually make.

The second component is a complexity penalty that accounts for structural features known to make synthesis harder. This includes the number of stereocenters (each one can double the number of possible stereoisomers a route has to control), macrocyclic rings (notoriously hard to close), spiro and bridged ring systems, and the overall size of the molecule. Large, complex molecules with many chiral centers receive higher (worse) scores.
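
As a rough sketch of how the two components combine, the code below subtracts a complexity penalty from a fragment score and rescales the result onto the 1 to 10 range. The shape of the penalty terms and the rescaling constants loosely follow the open-source RDKit contrib implementation of the Ertl and Schuffenhauer score, but treat every constant here as an illustrative assumption rather than a verified copy of any implementation. The fragment score is passed in as a plain number instead of being computed from a fragment database, and the names `complexity_penalty` and `toy_sa_score` are made up for this example.

```python
import math


def complexity_penalty(n_heavy_atoms, n_stereocenters, n_spiro, n_bridgeheads, has_macrocycle):
    """Sum of structural penalties; a larger value means a harder molecule."""
    size = n_heavy_atoms ** 1.005 - n_heavy_atoms  # grows slowly with molecule size
    stereo = math.log10(n_stereocenters + 1)
    spiro = math.log10(n_spiro + 1)
    bridge = math.log10(n_bridgeheads + 1)
    macro = math.log10(2) if has_macrocycle else 0.0
    return size + stereo + spiro + bridge + macro


def toy_sa_score(fragment_score, penalty, raw_min=-4.0, raw_max=2.5):
    """Combine both components and rescale so 1 is easy and 10 is hard."""
    raw = fragment_score - penalty  # common fragments raise it, complexity lowers it
    scaled = 11.0 - (raw - raw_min + 1.0) / (raw_max - raw_min) * 9.0
    return max(1.0, min(10.0, scaled))  # clamp to the 1-10 scale


# Same assumed fragment score, with and without stereocenters.
simple = toy_sa_score(0.5, complexity_penalty(13, 0, 0, 0, False))
chiral = toy_sa_score(0.5, complexity_penalty(13, 4, 0, 0, False))
print(f"no stereocenters: {simple:.2f}, four stereocenters: {chiral:.2f}")
```

The point of the sketch is the direction of each effect: adding stereocenters, spiro centers, bridgeheads, or a macrocycle can only push the score up, while a higher fragment score pulls it down.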

The SA Score Scale in Practice

  • 1.0 – 2.0 (Very Easy): Simple molecules with common functional groups. Think aspirin, ibuprofen, and most commodity chemicals. Typically one to three synthetic steps from commercial starting materials.
  • 2.0 – 3.0 (Easy): Straightforward drug-like molecules. Most fragment-based and HTS-derived leads fall here. A competent medicinal chemistry CRO can make these without difficulty.
  • 3.0 – 4.0 (Moderate): Requires some planning but uses well-established reactions. Many approved drugs sit in this range. Five to ten synthetic steps are typical.
  • 4.0 – 6.0 (Difficult): Challenging synthesis requiring specialist expertise. May involve air-sensitive reactions, difficult ring closures, or tricky stereochemistry. Budget for multiple attempts and route scouting.
  • 6.0 – 8.0 (Very Difficult): Often requires novel reaction development or lengthy linear routes. Natural product-inspired molecules frequently land here. Synthesis campaigns can take months.
  • 8.0 – 10.0 (Impractical): Theoretical molecules that would be extremely difficult to synthesize with current technology. If your generative model produces molecules in this range, filter them out.
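
If you want these bands in code, a small lookup is enough. The sketch below mirrors the list above; because the listed ranges share their endpoints (2.0 appears in two bands), it assumes a boundary value belongs to the harder band. The function name `sa_category` is hypothetical.

```python
def sa_category(score):
    """Map an SA score to the bands listed above (upper bound exclusive)."""
    if not 1.0 <= score <= 10.0:
        raise ValueError(f"SA score must be between 1 and 10, got {score}")
    bands = [
        (2.0, "Very Easy"),
        (3.0, "Easy"),
        (4.0, "Moderate"),
        (6.0, "Difficult"),
        (8.0, "Very Difficult"),
    ]
    for upper, label in bands:
        if score < upper:
            return label
    return "Impractical"


for value in (1.2, 2.8, 4.2, 7.8):
    print(value, sa_category(value))
```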

Real Drug Examples: SA Scores Across the Spectrum

To make the SA scale concrete, let us look at actual marketed drugs and their approximate SA scores. These numbers illustrate the range of synthetic complexity that the pharmaceutical industry has successfully tackled.

Aspirin (SA Score ~1.2)

Aspirin (acetylsalicylic acid, SMILES: CC(=O)Oc1ccccc1C(=O)O) is about as easy as synthesis gets. It is a single acetylation of salicylic acid – one reaction, cheap reagents, near-quantitative yield. Its SA score of roughly 1.2 reflects this simplicity. The molecule has no stereocenters, no unusual rings, and every fragment is abundant in chemical databases.

Ibuprofen (SA Score ~1.6)

Ibuprofen (CC(C)Cc1ccc(cc1)C(C)C(=O)O) is another straightforward synthesis. The original Boots process uses six steps from isobutylbenzene. The BHC process, developed later, reduced it to three catalytic steps with much better atom economy. Despite being a blockbuster drug, ibuprofen's molecular structure is simple enough that it is often used as a teaching example in undergraduate organic chemistry courses.

Celecoxib (SA Score ~2.8)

Celecoxib (Celebrex) is a COX-2 inhibitor with a diaryl pyrazole core. Its synthesis is more involved than aspirin but still manageable – typically four to five steps. The pyrazole ring construction and sulfonamide installation are well-precedented reactions. An SA score around 2.8 correctly identifies this as a molecule that any pharmaceutical chemistry group could produce without significant difficulty.

Atorvastatin (SA Score ~4.2)

Atorvastatin (Lipitor) starts to show real synthetic complexity. The molecule has two stereocenters in the dihydroxy acid side chain that must be set with high stereoselectivity. The original synthesis, developed at Parke-Davis (now part of Pfizer), required about twelve steps. An SA score around 4.2 puts it in the "difficult but achievable" category, which matches reality: Pfizer invested heavily in process chemistry to make manufacturing economical.

Paclitaxel (SA Score ~7.8)

Paclitaxel (Taxol, CC1=C2C(C(=O)C3(C(CC4C(C3C(C(C2(C)C)(CC1OC(=O)C(C(C5=CC=CC=C5)NC(=O)C6=CC=CC=C6)O)O)OC(=O)C7=CC=CC=C7)(CO4)OC(=O)C)O)C)OC(=O)C) is the poster child for synthetic difficulty. The first total synthesis by Robert Holton required over 40 linear steps. Its SA score near 7.8 reflects the molecule's terrifying complexity: four fused rings, eleven stereocenters, and multiple sensitive functional groups. In practice, paclitaxel is produced by semi-synthesis from 10-deacetylbaccatin III extracted from yew tree needles, not by total synthesis.

Note
The gap between aspirin (SA ~1.2) and paclitaxel (SA ~7.8) spans the full range of what the pharmaceutical industry routinely produces. Most successful oral drugs cluster between 2.0 and 4.5 on the SA scale. If your computational designs consistently score above 5.0, your chemistry team will thank you for revisiting your design constraints.

Retrosynthesis: Beyond the Single Score

While SA scores tell you how hard a molecule is to make, retrosynthetic analysis tells you how to make it. Retrosynthesis works backward from the target molecule, identifying strategic bond disconnections that break it into simpler precursors. Each disconnection corresponds to a known chemical reaction run in reverse. The process continues recursively until all precursors are commercially available starting materials.
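
The recursion described above can be sketched in a few lines. This is a toy, not how production tools work: real retrosynthesis engines choose among many candidate disconnections using reaction templates or learned models, whereas this example hard-codes one disconnection per target. The `DISCONNECTIONS` table, the `PURCHASABLE` catalog, and the function `plan_route` are all hypothetical, and salicylic acid is deliberately left out of the catalog so that the recursion goes two levels deep.

```python
# Hypothetical one-step disconnections: target -> precursors.
DISCONNECTIONS = {
    "aspirin": ["salicylic acid", "acetic anhydride"],
    "salicylic acid": ["phenol", "carbon dioxide"],
}

# Hypothetical catalog of purchasable starting materials.
PURCHASABLE = {"acetic anhydride", "phenol", "carbon dioxide"}


def plan_route(target, max_depth=5):
    """Return forward-ordered (product, precursors) steps, or None if no route exists."""
    if target in PURCHASABLE:
        return []  # nothing to make, just buy it
    if max_depth == 0 or target not in DISCONNECTIONS:
        return None  # dead end: unknown molecule or route too long
    steps = []
    for precursor in DISCONNECTIONS[target]:
        sub_route = plan_route(precursor, max_depth - 1)
        if sub_route is None:
            return None
        steps.extend(sub_route)
    steps.append((target, DISCONNECTIONS[target]))
    return steps


for product, precursors in plan_route("aspirin"):
    print(f"{' + '.join(precursors)} -> {product}")
```

A result of `None` means the search hit a molecule with no known disconnection before reaching purchasable materials, which is the toy equivalent of a retrosynthesis tool failing to find a route.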

E.J. Corey formalized retrosynthetic analysis in the 1960s, earning the 1990 Nobel Prize in Chemistry for his work. Today, AI-powered retrosynthesis tools like ASKCOS, IBM RXN, and Spaya automate this process. They use neural networks trained on millions of published reactions to propose synthetic routes, estimate yields, and flag problematic steps.

For drug discovery workflows, the recommended approach is to use SA scores as a fast first-pass filter and then run retrosynthetic analysis on your top candidates. Screen 1,000 molecules by SA score in seconds, shortlist those that score below 4.0 (perhaps 50 of them), and then invest the computational time to generate full synthetic routes for that shortlist.

When SA Scores and Retrosynthesis Disagree

Occasionally, a molecule will have a moderate SA score (say 3.5) but retrosynthetic analysis reveals that the most obvious route requires a reaction with notoriously low yield or selectivity. Conversely, a molecule with a higher SA score (say 5.0) might have a clever three-step route that a retrosynthesis engine discovers. This is why both tools are complementary. The SA score is your rapid screening heuristic; retrosynthesis is your detailed route planning tool.
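
One way to keep both signals in view is a small triage function that takes the SA score and the length of the best route a retrosynthesis tool found. The sketch below is an assumption about how you might encode that policy, not a SciRouter feature: the function name `triage`, the six-step limit, and the wording of the labels are all made up, and route length is only a crude stand-in for route quality.

```python
def triage(sa_score, route_steps, sa_cutoff=4.0, max_steps=6):
    """Combine a fast SA score with a route length from a retrosynthesis tool.

    route_steps is None when no route was found or none was searched.
    """
    sa_ok = sa_score < sa_cutoff
    route_ok = route_steps is not None and route_steps <= max_steps
    if sa_ok and route_ok:
        return "advance"
    if route_ok:
        return "advance: short route despite high SA score"
    if sa_ok:
        return "review: SA score looks fine but no short route found"
    return "deprioritize"


print(triage(3.5, None))  # moderate score, no workable route
print(triage(5.0, 3))     # higher score, but a three-step route exists
```

The two printed cases correspond to the two kinds of disagreement described above.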

Checking Synthetic Accessibility with SciRouter

SciRouter provides a dedicated synthesis-check endpoint that returns the SA score, a categorical feasibility label, and additional details about molecular complexity. You can call it from the Python SDK, the REST API, or through the MCP server for agent-based workflows. No local software installation is required.

Check SA score for a single molecule
import os, requests

API_KEY = os.environ["SCIROUTER_API_KEY"]
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

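# Check synthetic accessibility for aspirin
response = requests.post(
    f"{BASE}/generate/synthesis-check",
    headers=HEADERS,
    json={"smiles": "CC(=O)Oc1ccccc1C(=O)O"},
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
result = response.json()

print("Molecule: Aspirin")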
print(f"SA Score: {result['sa_score']:.2f}")
print(f"Feasibility: {result['feasibility']}")
print(f"Stereocenters: {result['stereocenters']}")
print(f"Ring systems: {result['ring_systems']}")

The response includes the raw SA score (1–10 float), a human-readable feasibility category, and structural descriptors that contribute to the score. This gives you enough information to make a quick go/no-go decision on each molecule.
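
If you pass results around a larger pipeline, it can help to wrap the response in a small typed object so that a missing field fails immediately. The sketch below uses the four field names from the example above; the class name `SynthesisCheck`, the assumption that the two structural descriptors are integers, and the sample payload values are all illustrative and not taken from real API output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisCheck:
    sa_score: float
    feasibility: str
    stereocenters: int
    ring_systems: int

    @classmethod
    def from_json(cls, payload):
        """Build from a response dict, raising KeyError if a field is missing."""
        return cls(
            sa_score=float(payload["sa_score"]),
            feasibility=str(payload["feasibility"]),
            stereocenters=int(payload["stereocenters"]),
            ring_systems=int(payload["ring_systems"]),
        )

    @property
    def go(self):
        """Quick go/no-go using a 4.0 SA cutoff."""
        return self.sa_score < 4.0


# Made-up payload for illustration; not real API output.
example = SynthesisCheck.from_json(
    {"sa_score": 1.5, "feasibility": "easy", "stereocenters": 0, "ring_systems": 1}
)
print(example, example.go)
```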

Batch Screening: Filtering Generative Chemistry Output

The real power of SA scoring comes when you apply it to large sets of molecules. After running a generative chemistry model like REINVENT4, you might have hundreds of candidates. SA scoring lets you immediately discard the molecules that would be impractical to synthesize.

Batch SA screening of generated molecules
import os, requests

API_KEY = os.environ["SCIROUTER_API_KEY"]
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Suppose we have candidate molecules from a generative run
candidates = [
    {"name": "Aspirin", "smiles": "CC(=O)Oc1ccccc1C(=O)O"},
    {"name": "Ibuprofen", "smiles": "CC(C)Cc1ccc(cc1)C(C)C(=O)O"},
    {"name": "Caffeine", "smiles": "Cn1c(=O)c2c(ncn2C)n(C)c1=O"},
    {"name": "Celecoxib", "smiles": "Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(S(N)(=O)=O)cc2)cc1"},
    {"name": "Candidate_A", "smiles": "O=C(NC1CCCCC1)c1ccc(F)cc1"},
    {"name": "Candidate_B", "smiles": "CC(=O)Nc1ccc(O)cc1"},
]

synthesizable = []

for mol in candidates:
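    response = requests.post(
        f"{BASE}/generate/synthesis-check",
        headers=HEADERS,
        json={"smiles": mol["smiles"]},
        timeout=30,
    )
    response.raise_for_status()  # stop on HTTP errors rather than parsing an error body
    result = response.json()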

    sa = result["sa_score"]
    label = result["feasibility"]
    status = "PASS" if sa < 4.0 else "REVIEW" if sa < 6.0 else "REJECT"

    print(f"{mol['name']:20s}  SA={sa:.2f}  ({label:12s})  [{status}]")

    if sa < 4.0:
        synthesizable.append({**mol, "sa_score": sa})

print(f"\n{len(synthesizable)} of {len(candidates)} molecules passed SA filter (< 4.0)")
Tip
Set your SA cutoff based on your project context. For early-stage hit finding, a threshold of 4.0 keeps your options open. For late-stage lead optimization where you need to synthesize dozens of analogs quickly, tighten the cutoff to 3.0 or below.
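
The loop above sends one request at a time. For larger sets you may want a few requests in flight at once. The sketch below shows one way to do that with the standard library's thread pool. It assumes the API tolerates a handful of concurrent requests, which this guide does not state, so keep `max_workers` small and check your plan's rate limits. `score_batch` and `fake_score` are hypothetical names, and `fake_score` is an offline stand-in that you would replace with a function that calls the synthesis-check endpoint.

```python
from concurrent.futures import ThreadPoolExecutor


def score_batch(smiles_list, score_fn, max_workers=4):
    """Score SMILES concurrently; returns (smiles, score_or_exception) pairs in input order."""

    def safe_score(smi):
        try:
            return score_fn(smi)
        except Exception as exc:  # record the failure and keep the batch going
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_score, smiles_list))
    return list(zip(smiles_list, results))


def fake_score(smi):
    """Offline stand-in with a made-up formula; not a real SA score."""
    if not smi:
        raise ValueError("empty SMILES")
    return 1.0 + len(smi) / 20.0


batch = score_batch(["CCO", "", "c1ccccc1"], fake_score)
for smi, outcome in batch:
    print(repr(smi), outcome)
```

Returning the exception instead of raising it means one malformed SMILES does not abort a run of several hundred molecules; you can inspect the failures afterwards.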

Combining SA Scores with Molecular Properties

SA scoring is most powerful when combined with other property filters. A molecule that is easy to synthesize but fails Lipinski's rules is just as useless as one that is drug-like but impossible to make. The SciRouter API lets you chain multiple endpoints to build a comprehensive filter.

Combined SA + drug-likeness filter
import os, requests

API_KEY = os.environ["SCIROUTER_API_KEY"]
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

smiles_list = [
    "CC(=O)Oc1ccccc1C(=O)O",           # Aspirin
    "CC(C)Cc1ccc(cc1)C(C)C(=O)O",       # Ibuprofen
    "Cn1c(=O)c2c(ncn2C)n(C)c1=O",       # Caffeine
    "O=C(NC1CCCCC1)c1ccc(F)cc1",         # Candidate_A
]

for smi in smiles_list:
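    # Get molecular properties
    props_resp = requests.post(f"{BASE}/chemistry/properties",
        headers=HEADERS, json={"smiles": smi}, timeout=30)
    props_resp.raise_for_status()
    props = props_resp.json()

    # Get SA score
    sa_resp = requests.post(f"{BASE}/generate/synthesis-check",
        headers=HEADERS, json={"smiles": smi}, timeout=30)
    sa_resp.raise_for_status()
    sa = sa_resp.json()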

    mw = props["molecular_weight"]
    logp = props["logp"]
    hbd = props["hbd"]
    hba = props["hba"]
    sa_score = sa["sa_score"]

    # Apply combined filter (Lipinski limits are inclusive: MW <= 500, LogP <= 5)
    lipinski_ok = mw <= 500 and logp <= 5 and hbd <= 5 and hba <= 10
    sa_ok = sa_score < 4.0

    verdict = "PASS" if (lipinski_ok and sa_ok) else "FAIL"

    print(f"SMILES: {smi}")
    print(f"  MW={mw:.1f}  LogP={logp:.2f}  HBD={hbd}  HBA={hba}  SA={sa_score:.2f}")
    print(f"  Lipinski: {'PASS' if lipinski_ok else 'FAIL'}  |  SA: {'PASS' if sa_ok else 'FAIL'}  |  Overall: {verdict}")
    print()

The example above calls both endpoints for every molecule so that each line of output is complete. In a larger screen, run the SA check first and request drug-likeness properties only for molecules that pass. Eliminating unsynthesizable compounds early saves API calls on the more detailed property analysis and focuses your resources on molecules that have a realistic path to the lab.
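
A minimal sketch of that SA-first ordering is shown below. To keep it runnable offline, the two lookups are passed in as functions, and the stand-ins use made-up molecule names and numbers. The function name `screen` and its parameters are hypothetical; the property keys match the ones used in the example above.

```python
def screen(smiles_list, get_sa, get_props, sa_cutoff=4.0):
    """Run the cheap SA check first; fetch properties only for survivors."""
    passed = []
    for smi in smiles_list:
        if get_sa(smi) >= sa_cutoff:
            continue  # skip the property call entirely
        props = get_props(smi)
        lipinski_ok = (
            props["molecular_weight"] <= 500
            and props["logp"] <= 5
            and props["hbd"] <= 5
            and props["hba"] <= 10
        )
        if lipinski_ok:
            passed.append(smi)
    return passed


# Offline stand-ins with made-up values so the sketch runs without network access.
FAKE_SA = {"mol_easy": 2.1, "mol_hard": 6.3}
FAKE_PROPS = {"mol_easy": {"molecular_weight": 180.2, "logp": 1.2, "hbd": 1, "hba": 4}}
property_calls = []


def fake_props(smi):
    property_calls.append(smi)  # record which molecules triggered a property lookup
    return FAKE_PROPS[smi]


result = screen(["mol_easy", "mol_hard"], FAKE_SA.__getitem__, fake_props)
print(result, property_calls)
```

Only the molecule that passes the SA cutoff triggers a property lookup, which is where the saving in API calls comes from.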

When to Trust (and When to Question) SA Scores

SA scores are powerful screening tools, but understanding their limitations will make you a better computational chemist. Here are the situations where SA scores work well and where you should exercise caution.

SA Scores Work Well For

  • Drug-like small molecules: The fragment statistics behind the SA algorithm come from large collections of known compounds, so it is best calibrated for pharmaceutically relevant chemistry. For typical drug-like compounds (MW 200–600, 0–3 rings, common heteroatoms), SA scores are reliable guides.
  • Comparative ranking: Even when absolute SA values are debatable, the relative ranking of molecules by SA score is usually correct. If molecule A scores 2.5 and molecule B scores 5.5, molecule A is almost certainly easier to make.
  • Filtering generative output: When a generative model produces 500 molecules, SA scoring reliably separates the synthesizable candidates from the fantastical ones. This is the highest-value use case.
  • Early-stage triage: During hit identification, SA scores help prioritize which virtual hits to attempt in the lab first.

SA Scores Can Be Misleading For

  • Natural products: Many natural products have high SA scores because their de novo synthesis is genuinely hard. But they may be available by extraction or semi-synthesis, which the SA algorithm does not consider.
  • Peptides and macrocycles: SA scoring was not designed for peptide-like molecules or large macrocycles. These compound classes have specialized synthesis strategies (solid-phase peptide synthesis, ring-closing metathesis) that the fragment-based algorithm does not capture.
  • Reagent availability: A molecule might score well on SA but require a reagent that is out of stock, restricted, or prohibitively expensive. SA scores do not account for supply chain realities.
  • Scale-up considerations: A reaction that works at milligram scale in a research lab may fail at kilogram scale in manufacturing. SA scores reflect bench-scale feasibility only.
Warning
Never use SA scores as the sole criterion for advancing a compound. They are one signal among many. The best workflow combines SA scoring with retrosynthetic route planning, literature precedent checks, and ultimately the judgment of an experienced medicinal chemist.

Integrating SA Scoring into Your Drug Discovery Pipeline

The most effective drug discovery pipelines check synthetic accessibility at multiple stages. Here is how SA scoring fits into a complete computational workflow.

Stage 1: Virtual Library Filtering

Before running expensive docking or binding affinity predictions, filter your virtual library by SA score. Remove anything above 5.0 (or 4.0 for conservative programs). This can eliminate 30–50% of candidates, dramatically reducing downstream computation costs.

Stage 2: Post-Generation Triage

After running a generative model like REINVENT4, immediately score all output molecules for SA. Generative models sometimes produce exotic structures that score well on binding objectives but are synthetic nightmares. Catch these before investing in further profiling.

Stage 3: Lead Optimization Guardrails

During lead optimization, medicinal chemists propose analogs to improve potency, selectivity, or ADMET properties. Each proposed analog should be checked against SA thresholds to ensure the optimization is not drifting toward unsynthesizable chemical space. Set an SA ceiling and flag any designs that exceed it.
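
A guardrail like this is easy to automate once each analog has an SA score. The sketch below flags analogs that exceed a ceiling, and also those whose score has risen sharply relative to the parent compound. The drift check, the default values, and the function name `flag_analogs` are assumptions added for illustration; the scores in the usage example are made up.

```python
def flag_analogs(parent_sa, analog_scores, ceiling=4.0, max_drift=1.0):
    """Return {name: reason} for analogs that break the SA guardrails."""
    flags = {}
    for name, sa in analog_scores.items():
        if sa > ceiling:
            flags[name] = f"SA {sa:.1f} exceeds ceiling {ceiling:.1f}"
        elif sa - parent_sa > max_drift:
            flags[name] = f"SA rose {sa - parent_sa:.1f} above parent"
    return flags


# Made-up scores for a parent compound and three proposed analogs.
flags = flag_analogs(2.5, {"analog_1": 2.7, "analog_2": 3.8, "analog_3": 4.6})
for name, reason in flags.items():
    print(name, "->", reason)
```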

Stage 4: Candidate Selection

When choosing which compounds to advance to synthesis, use SA scores alongside predicted activity, ADMET profiles, and intellectual property landscape. A compound with slightly lower predicted affinity but an SA score of 2.0 may be a better investment than one with marginally better affinity and an SA score of 5.5 – because you will have it in hand weeks sooner.
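
One way to surface this trade-off without picking arbitrary weights is to keep only the candidates that no other candidate beats on both axes at once (a Pareto front). The sketch below assumes affinity is expressed so that higher is better, for example a predicted pKd, and that lower SA is better. The function name `pareto_front` and all values are made up for illustration.

```python
def pareto_front(candidates):
    """Keep candidates not dominated on (higher affinity, lower SA score).

    Each candidate is a (name, predicted_affinity, sa_score) tuple.
    """
    front = []
    for name, affinity, sa in candidates:
        dominated = any(
            other_aff >= affinity
            and other_sa <= sa
            and (other_aff > affinity or other_sa < sa)
            for _, other_aff, other_sa in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Made-up values for illustration only.
candidates = [
    ("cmpd_1", 8.4, 5.5),  # best affinity, hard to make
    ("cmpd_2", 8.1, 2.0),  # slightly weaker, easy to make
    ("cmpd_3", 7.5, 3.5),  # weaker and harder than cmpd_2
]
print(pareto_front(candidates))
```

Here cmpd_3 drops out because cmpd_2 is better on both axes, leaving the affinity-versus-accessibility choice between cmpd_1 and cmpd_2 to the project team.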

The Molecular Design Lab on SciRouter

SciRouter's Molecular Design Lab provides a visual interface that integrates SA scoring directly into the molecular design workflow. Generate molecules, view their SA scores alongside other properties, and filter interactively without writing any code.

The lab displays SA scores with color-coded indicators: green for easy (below 3.0), yellow for moderate (3.0–5.0), and red for difficult (above 5.0). You can sort and filter by SA score, combine it with Lipinski filters, and export your shortlisted molecules for further analysis or synthesis ordering.

For programmatic access, the same data is available through the REST API and Python SDK. Whether you prefer a graphical interface or a scripted pipeline, SciRouter gives you fast, reliable SA scoring without any local software installation.

Next Steps

Synthetic accessibility scoring is one piece of a larger molecular evaluation toolkit. Combine it with Molecular Properties for drug-likeness assessment, ADMET Prediction for safety and pharmacokinetic profiling, and Molecule Generator to create novel candidates that are optimized for both activity and synthesizability from the start.

To learn more about the molecular property calculations that complement SA scoring, see our Lipinski Rule of Five Calculator guide or the ADMET Prediction Explained deep dive.

Sign up for a free SciRouter API key and start checking synthetic accessibility today. With 500 free API calls per month, you can screen entire virtual libraries before committing a single dollar to synthesis.

Frequently Asked Questions

What is a synthetic accessibility score?

A synthetic accessibility (SA) score is a numerical estimate of how difficult it would be to synthesize a given molecule in a real chemistry lab. The most common SA scoring system uses a scale from 1 (very easy to synthesize) to 10 (extremely difficult or practically impossible). It considers factors like fragment complexity, ring systems, stereocenters, and the availability of known synthetic building blocks.

What SA score is considered synthesizable?

Generally, molecules with SA scores below 4.0 are considered easy to moderately difficult to synthesize. Scores between 4.0 and 6.0 are difficult but usually within reach of a skilled medicinal chemistry team. Scores above 6.0 indicate significant synthetic challenges, and molecules scoring above 8.0 are often practically unsynthesizable with current methods. Most approved oral drugs have SA scores between 1.0 and 4.5.

How accurate are SA score predictions?

SA scores are useful heuristics, not guarantees. They correlate well with the number of synthetic steps and overall difficulty for drug-like molecules, but they can miss context that a medicinal chemist would catch. For example, a molecule might have a low SA score but require an expensive or hazardous reagent. Use SA scores for initial filtering and prioritization, then consult a synthetic chemist for your top candidates.

What is the difference between SA score and retrosynthesis?

An SA score gives you a single number estimating overall difficulty. Retrosynthesis goes further by proposing actual synthetic routes, breaking the target molecule into simpler precursors step by step until commercially available starting materials are reached. Retrosynthesis is more informative but also more computationally expensive. SA scores are ideal for rapid screening of large compound libraries, while retrosynthesis is best for detailed planning of your top candidates.

Can SciRouter check synthesis feasibility for a batch of molecules?

Yes. The SciRouter synthesis-check endpoint accepts individual SMILES strings, and you can loop through a list of molecules programmatically using the Python SDK or REST API. For high-throughput screening, you can evaluate hundreds of molecules per minute. The API returns both the SA score and a categorical feasibility label (easy, moderate, difficult, very difficult) for each molecule.

Why do some approved drugs have high SA scores?

Some approved drugs like paclitaxel (Taxol) have high SA scores because their total synthesis from scratch is genuinely difficult. However, these drugs are often produced through semi-synthesis from natural precursor compounds or by fermentation, routes that the SA scoring algorithm does not consider. This is why SA scores should be interpreted as estimates of de novo chemical synthesis difficulty, not as absolute measures of whether a drug can be manufactured.
