Peptide design sits in a strange gap in computational biology. The big structure-prediction breakthroughs of the last few years — AlphaFold, RoseTTAFold, and their descendants — were optimized for full-length proteins and domains. Short peptides are not that. They are usually unique, not conserved across species, and frequently disordered in isolation. The tools that dominate the long-protein world are not always the right match.
This guide explains why short peptides are a different design problem, how ESMFold addresses the short-sequence gap, and how to put together a practical fold-property-mutate-iterate workflow that runs in a browser.
The MSA problem
Most pre-ESMFold structure predictors depend heavily on multiple sequence alignments. Given a target sequence, the first step is usually to search public protein databases for homologs, stack them into an alignment, and feed the evolutionary co-variation signal into the model. Residues that change together across evolutionary time often contact each other in 3D — that is the signal the model exploits to predict structure.
This is brilliant when you have deep alignments. It falls apart when you do not. And peptides — particularly engineered peptides, bioactive fragments, or short signaling motifs — frequently do not. Some of the reasons:
- Peptides are short. A 15-residue sequence is too short to return meaningful BLAST hits in many cases.
- Peptides are not conserved like domains. Functional peptides often tolerate substitution at many positions, diluting the evolutionary co-variation signal.
- Designed peptides have no homologs by definition. If you built the peptide yesterday, there is no evolutionary record of it anywhere.
All of this is why classical MSA-dependent predictors often return low-confidence structures on short, standalone peptides.
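To make the co-variation signal concrete, here is a minimal sketch that scores a pair of alignment columns with mutual information. It is a classical stand-in chosen for illustration, not what AlphaFold or RoseTTAFold compute internally. The toy alignment and the function name `column_mutual_information` are invented for this example.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up): columns 0 and 3 swap charge together,
# while column 1 varies independently of column 0.
toy_alignment = ["KAAE", "KGAE", "EAAK", "EGAK"]
coupled = column_mutual_information(toy_alignment, 0, 3)
independent = column_mutual_information(toy_alignment, 0, 1)

# A designed peptide with no homologs is an "alignment" of depth one.
single = column_mutual_information(["KAAE"], 0, 3)

print(coupled, independent, single)
```

At depth one every column pair scores zero, which is the position a newly designed peptide is in: there is nothing for an alignment-driven model to read.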
How ESMFold is different
ESMFold is built on top of a large protein language model (the ESM-2 family). Instead of relying on multiple sequence alignments at inference time, it uses learned representations from the language model to predict structure directly from a single sequence. No alignment step, no homolog search, just sequence in, structure out.
That design has two practical consequences for peptide work:
- It works on sequences without MSAs. Novel, engineered, and short peptides are fair game.
- It is fast. Without the MSA pipeline, end-to-end prediction is measured in seconds to low tens of seconds on a single GPU for short inputs. That speed unlocks iterative design.
ESMFold is not perfect. When a deep alignment is available, or when the target is a multi-chain complex, MSA-based methods still have the edge. For short, designed, or peptide-scale targets, ESMFold is frequently the better first tool.
Reading pLDDT for short peptides
Every ESMFold prediction comes with a pLDDT score — a per-residue confidence metric from 0 to 100, averaged to give a mean over the full sequence. It is the first thing you should look at on any new structure.
- Mean pLDDT above 85. High confidence. The predicted backbone is probably close to the true structure. For peptides, this typically means the sequence has a well-defined secondary structure element.
- Mean 70-85. Reasonable confidence. Treat the backbone shape as indicative rather than definitive. Side-chain details may be approximate.
- Mean 50-70. Partial or conditional folding. The peptide may adopt a structure only in specific contexts (membrane, binding partner) or may populate multiple conformations.
- Mean below 50. Likely disordered or unstable. ESMFold is essentially telling you it cannot commit to a single conformation. That is information — many bioactive peptides are intrinsically disordered until they engage a partner.
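The bands above are easy to apply programmatically. The sketch below assumes the prediction arrives as PDB text with pLDDT stored in the B-factor column on a 0 to 100 scale. That is the usual convention for ESMFold and AlphaFold output, but some pipelines report 0 to 1, so check your source. The function names are hypothetical, and assigning the boundary values 70 and 85 to the "reasonable" band and 50 to the "partial" band is a choice made here, not something the bands themselves specify.

```python
def per_residue_plddt(pdb_text):
    """Return one pLDDT value per residue, read from C-alpha ATOM records."""
    values = []
    for line in pdb_text.splitlines():
        # PDB fixed-width format: atom name in columns 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def classify_mean_plddt(mean_plddt):
    """Map a mean pLDDT (0-100 scale) onto the four bands described above."""
    if mean_plddt > 85:
        return "high confidence"
    if mean_plddt >= 70:
        return "reasonable confidence"
    if mean_plddt >= 50:
        return "partial or conditional folding"
    return "likely disordered"


def summarize(pdb_text):
    """Return (mean pLDDT, band label, low-confidence positions).

    Positions are 1-based and count C-alpha atoms in file order; they are
    not read from the PDB residue-number column.
    """
    scores = per_residue_plddt(pdb_text)
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    mean = sum(scores) / len(scores)
    weak = [i for i, s in enumerate(scores, start=1) if s < 50]
    return mean, classify_mean_plddt(mean), weak
```

The list of low-confidence positions is often more useful than the mean: a peptide with a confident helical core and frayed termini can share a mean with one that is uniformly uncertain.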
A practical design workflow
Here is a simple, repeatable workflow that Peptide Lab is built around:
1. Start with a seed sequence
Pick a known peptide (BPC-157, LL-37, GHK-Cu, Semax) or paste in a candidate of your own. The seed gives you a baseline structure and a baseline property profile to compare against.
2. Predict structure with ESMFold
Run the fold call. Inspect the mean pLDDT and the per-residue confidence. Is the backbone credible? Where are the low-confidence regions? That is your first real piece of structural information.
3. Evaluate physicochemical properties
Independent of 3D structure, peptides live or die by their bulk properties: net charge, hydrophobicity, helix propensity, aggregation risk. Compute these on the sequence directly — they are cheap and informative. Peptide Lab shows a 5-axis property radar covering solubility, stability, permeability, hemolysis risk, and structural confidence.
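As a rough illustration of how cheap these sequence-level properties are, the sketch below computes two of them: the GRAVY hydropathy average (Kyte-Doolittle scale) and a Henderson-Hasselbalch estimate of net charge. It is not Peptide Lab's implementation. The pKa values are illustrative ones in the range found in published tables, which disagree with each other by a few tenths of a unit, and the charge estimate assumes free, unmodified termini.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Illustrative pKa values (assumption); treat the resulting charge as approximate.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy; more positive means more hydrophobic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def net_charge(sequence, ph=7.0):
    """Henderson-Hasselbalch net charge, assuming free N- and C-termini."""
    # Fraction of each basic group that is protonated (charge +1).
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    # Fraction of each acidic group that is deprotonated (charge -1).
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


bpc157 = "GEPPPGKPADDAGLV"
print(f"GRAVY: {gravy(bpc157):.2f}")
print(f"Net charge at pH 7: {net_charge(bpc157):.2f}")
```

Both functions run in microseconds, so they can be evaluated on every candidate before deciding which ones are worth a fold call.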
4. Propose mutations
Pick a target: increase solubility, reduce hydrophobic surface, shift net charge, preserve helix register. A simple mutation scanner ranks single-residue substitutions by predicted improvement in the target metric.
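A single-residue scan of this kind can be sketched in a few lines. The example below is not Peptide Lab's scanner. It uses GRAVY (mean Kyte-Doolittle hydropathy) as a crude stand-in for "reduce hydrophobicity", and the function name `scan_single_mutations` and its parameters are hypothetical.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of the sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def scan_single_mutations(sequence, metric=gravy, lower_is_better=True, top_k=5):
    """Rank every single-residue substitution by its change in `metric`."""
    baseline = metric(sequence)
    ranked = []
    for pos, old in enumerate(sequence):
        for new in AMINO_ACIDS:
            if new == old:
                continue
            mutant = sequence[:pos] + new + sequence[pos + 1:]
            delta = metric(mutant) - baseline
            label = f"{old}{pos + 1}{new}"  # 1-based position, e.g. V15R
            ranked.append((delta, label, mutant))
    # Most negative delta first when lower is better, most positive otherwise.
    ranked.sort(key=lambda item: item[0], reverse=not lower_is_better)
    return ranked[:top_k]


top = scan_single_mutations("GEPPPGKPADDAGLV")
for delta, label, mutant in top:
    print(f"{label}  delta GRAVY {delta:+.3f}  {mutant}")
```

A scan driven by one metric will happily propose arginine at every hydrophobic position, including ones the peptide needs for function. That is why the ranked list is only a set of proposals for the re-fold step, not a set of decisions.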
5. Re-predict and iterate
Apply the best candidate mutation, re-run ESMFold, recompute properties, and compare. Did the structure survive the substitution? Did the property shift? If yes, accept and keep going. If no, back out and try the next candidate. Rinse and repeat.
That loop is exactly what Peptide Lab implements in-browser, with every step cached so repeat visits to the same sequence are instant.
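The accept-or-back-out logic of step 5 can be written as a small greedy loop. This is a sketch under stated assumptions, not Peptide Lab's code: `design_loop` and its parameters are hypothetical, the fold call is injected as a plain function, and "the structure survived" is approximated as "mean pLDDT fell by no more than `max_plddt_drop` points". The toy functions exist only so the example runs offline.

```python
def design_loop(seed, propose, fold_plddt, score, rounds=3, max_plddt_drop=5.0):
    """Greedy accept/reject loop over proposed mutants.

    propose(seq)    -> iterable of candidate sequences, best first
    fold_plddt(seq) -> mean pLDDT for seq (for example, a wrapper around a fold call)
    score(seq)      -> property score where higher is better
    """
    current = seed
    current_plddt = fold_plddt(current)
    current_score = score(current)
    history = [(current, current_plddt, current_score)]
    for _ in range(rounds):
        for candidate in propose(current):
            candidate_score = score(candidate)
            if candidate_score <= current_score:
                continue  # property did not improve; skip the fold entirely
            candidate_plddt = fold_plddt(candidate)
            if candidate_plddt < current_plddt - max_plddt_drop:
                continue  # structure did not survive; back out, try the next one
            current, current_plddt, current_score = (
                candidate, candidate_plddt, candidate_score
            )
            history.append((current, current_plddt, current_score))
            break
        else:
            break  # no candidate was accepted this round, so stop early
    return history


# Toy stand-ins (assumptions) so the loop runs without a network call.
def toy_propose(seq):
    for i, aa in enumerate(seq):
        if aa != "K":
            yield seq[:i] + "K" + seq[i + 1:]


def toy_fold(seq):
    return 60.0 if seq[0] == "K" else 80.0  # pretend K at position 1 unfolds it


def toy_score(seq):
    return seq.count("K")


history = design_loop("AAAA", toy_propose, toy_fold, toy_score)
for seq, plddt, value in history:
    print(seq, plddt, value)
```

Because the cheap property check runs before the fold, candidates that do not improve the target never cost a prediction. Wrapping `fold_plddt` in `functools.lru_cache` gives the same repeat-visit caching described above.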
Calling ESMFold through the SciRouter API
If you want to scale this workflow beyond the browser, you can call ESMFold directly through SciRouter's protein endpoint. Here is a minimal Python example that folds a short peptide:
import httpx

API_KEY = "sk-sci-your-key-here"  # placeholder; substitute your own key
BASE_URL = "https://scirouter-gateway-production.up.railway.app"

# BPC-157 as an example
sequence = "GEPPPGKPADDAGLV"

# Submit the fold job
response = httpx.post(
    f"{BASE_URL}/v1/proteins/fold",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"sequence": sequence},
    timeout=120.0,
)
response.raise_for_status()  # surface HTTP errors before parsing the body

result = response.json()
job_id = result["job_id"]
print(f"Submitted fold job: {job_id}")
Because ESMFold runs asynchronously for long sequences, you typically submit the job and then poll for the result:
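import time

# Continues from the previous snippet (reuses httpx, BASE_URL, API_KEY, job_id)
for _ in range(60):  # poll every 2 seconds, for roughly 2 minutes at most
    poll = httpx.get(
        f"{BASE_URL}/v1/proteins/fold/{job_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    poll.raise_for_status()
    data = poll.json()
    if data["status"] == "completed":
        structure_pdb = data["result"]["pdb"]
        mean_plddt = data["result"]["mean_plddt"]
        print(f"Mean pLDDT: {mean_plddt:.1f}")
        break
    if data["status"] == "failed":  # assumed status name; check the API docs
        raise RuntimeError(f"Fold job {job_id} failed")
    time.sleep(2)
else:
    # The loop ran out of attempts without seeing a terminal status
    raise TimeoutError(f"Fold job {job_id} did not finish in time")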
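For short peptides like BPC-157 (15 residues) or GHK-Cu (3 residues), the inference time is typically under 10 seconds once the worker is warm. For longer peptides (LL-37 at 37 residues, tirzepatide at 39 residues) it is still usually well under a minute. That is fast enough to put a fold call inside a design loop. Keep in mind that only the amino-acid sequence is folded: the copper ion in GHK-Cu and the non-standard residues and lipid side chain of tirzepatide are not represented.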
Where structure prediction is not enough
A pragmatic caveat: ESMFold predicts static backbone structures. It does not tell you:
- How a peptide binds a receptor. That is a docking or co-folding problem, and a co-folding tool like Chai-1 is better suited. DiffDock is aimed at small-molecule ligands rather than peptides.
- How the peptide behaves in a membrane environment. Amphipathic AMPs like LL-37 adopt different conformations in solution versus at a lipid interface.
- Stability to proteases. Structure does not directly predict half-life; you need sequence-level models for protease susceptibility or empirical measurement.
- Activity. A well-folded peptide can be inert. A disordered peptide can be potent. Structure confidence is not the same as bioactivity.
A good design workflow layers complementary methods rather than relying on any single model.
Start in Peptide Lab
The easiest way to try this workflow is to use SciRouter's peptide playground. Nothing to install:
- scirouter.ai/peptide-lab — public library browser
- scirouter.ai/dashboard/peptide-lab — full workspace with saved designs
Pick a seed, edit a sequence, fold it, inspect the properties, mutate, re-fold, save. That loop — sequence in, structure out, property out, next sequence in — is the core of modern peptide design, and you can run it without leaving your browser tab.