
REINVENT4 vs MolMIM vs DrugEx: Comparing AI Molecule Generators

Compare REINVENT4, MolMIM, and DrugEx for AI molecule generation. Architecture, strengths, benchmarks, and running REINVENT4 via SciRouter API.

Ryan Bethencourt
April 8, 2026

Three Architectures for De Novo Molecule Generation

Generating novel molecules with desired properties is a central challenge of computational drug discovery. REINVENT4, MolMIM, and DrugEx represent three fundamentally different approaches: reinforcement learning with customizable scoring functions, latent-space sampling for fast generation, and Pareto-based multi-objective reinforcement learning. Each makes different trade-offs between generation speed, chemical quality, and optimization flexibility.

This comparison examines each tool's architecture, practical strengths, limitations, and the scenarios where each excels. We also demonstrate how to run REINVENT4 through SciRouter's API, eliminating the complex local setup that typically gates access to these tools.

REINVENT4: The Reinforcement Learning Workhorse

Architecture

REINVENT4, developed at AstraZeneca and now open-sourced, uses a recurrent neural network (RNN) as its generative prior. The RNN is pre-trained on large SMILES datasets (typically ChEMBL or ZINC) to learn the grammar of valid chemical structures. It then undergoes reinforcement learning fine-tuning against user-defined scoring functions that encode the desired molecular properties.
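
The fine-tuning step can be sketched with the augmented-likelihood objective described in the original REINVENT publication: the agent is pulled toward the prior's log-likelihood plus a scaled score. The snippet below is a minimal illustration, not code from the REINVENT4 repository. It assumes per-sequence log-likelihoods are already available as plain floats; the function name, the `sigma` value, and the example numbers are all illustrative.

```python
def augmented_likelihood_loss(prior_logp: float, agent_logp: float,
                              score: float, sigma: float = 128.0) -> float:
    """Squared gap between the augmented prior likelihood and the agent likelihood.

    prior_logp: log-likelihood of the SMILES under the frozen prior
    agent_logp: log-likelihood of the same SMILES under the agent being trained
    score:      composite score in [0, 1] from the scoring function
    sigma:      scaling factor for the score (illustrative default)
    """
    augmented = prior_logp + sigma * score
    return (augmented - agent_logp) ** 2

# Illustrative numbers only: an agent that still equals the prior.
loss_unrewarded = augmented_likelihood_loss(prior_logp=-30.0, agent_logp=-30.0, score=0.0)
loss_rewarded = augmented_likelihood_loss(prior_logp=-30.0, agent_logp=-30.0, score=0.5)
print(loss_unrewarded, loss_rewarded)
```

A molecule that scores zero produces no loss when the agent matches the prior, so the agent stays put. A well-scoring molecule produces a large loss until the agent raises its likelihood, which is what steers sampling toward high-scoring chemistry while the prior term keeps the SMILES grammar intact.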

The key innovation in REINVENT4 is the scoring function framework. Users compose scoring functions from modular components – docking scores, QED, synthetic accessibility, Tanimoto similarity to a reference molecule, custom QSAR models, and more. The RL agent learns to generate SMILES strings that maximize the composite score.
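
The idea of composing a score from weighted components can be sketched in a few lines. This is a simplified stand-in, not REINVENT4's scoring API: the component names, the step-shaped `in_range` transform, and the weights are assumptions chosen for illustration, and real components would compute the raw values from the molecule itself.

```python
from typing import Callable, Dict

Transform = Callable[[float], float]

def in_range(low: float, high: float) -> Transform:
    """Step transform: 1.0 when the raw value lies in [low, high], else 0.0."""
    def transform(value: float) -> float:
        return 1.0 if low <= value <= high else 0.0
    return transform

def identity(value: float) -> float:
    """Pass-through for components already on a 0-1 scale, such as QED."""
    return value

def weighted_score(raw: Dict[str, float],
                   transforms: Dict[str, Transform],
                   weights: Dict[str, float]) -> float:
    """Weighted arithmetic mean of the transformed component values."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weights[name] * transforms[name](raw[name]) for name in weights)
    return weighted / total_weight

# Hypothetical component setup and candidate molecule properties.
transforms = {"qed": identity,
              "molecular_weight": in_range(300, 500),
              "logp": in_range(1.0, 4.0)}
weights = {"qed": 0.4, "molecular_weight": 0.3, "logp": 0.3}
candidate = {"qed": 0.8, "molecular_weight": 420.0, "logp": 5.2}

score = weighted_score(candidate, transforms, weights)
print(round(score, 2))
```

Here the candidate earns credit for QED and molecular weight but nothing for LogP, which falls outside the target window. Each transform maps a raw property onto a common 0-1 scale so that the weights are comparable across components.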

Strengths

  • Mature scoring infrastructure: Over 30 built-in scoring components covering drug-likeness, ADMET, similarity, and molecular fingerprints. Custom scoring functions can plug in via Python callables.
  • Multiple running modes: De novo generation, scaffold decoration, linker design, R-group exploration, and scaffold hopping. This versatility covers the full drug design lifecycle from hit finding to lead optimization.
  • Proven track record: Used in multiple published drug discovery campaigns at AstraZeneca, Merck, and academic groups. The most battle-tested generative chemistry tool available.
  • Active development: Version 4 (released 2023, actively maintained through 2026) adds transformer-based priors, improved sampling, and better integration with external scoring services.

Limitations

  • Complex configuration: TOML configuration files with dozens of parameters. Getting the scoring function weights right requires medicinal chemistry expertise.
  • Single-objective collapse: Multiple objectives are combined into a weighted sum, which can lead to Goodhart's Law-style optimization failures where the agent exploits one scoring component at the expense of others.
  • Training time: RL fine-tuning takes 30 minutes to several hours depending on the complexity of the scoring function and the number of epochs.
  • Installation complexity: Requires specific PyTorch versions, RDKit, and custom pip installation (PEP 660 not supported – must use pip install . not pip install -e .).
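
The single-objective collapse described above is easy to reproduce with toy numbers. The sketch below compares a weighted arithmetic mean with a weighted geometric mean, a common alternative aggregation that penalizes any component near zero. The function names and the two hypothetical score profiles are assumptions for illustration only.

```python
import math
from typing import List

def arithmetic_mean(scores: List[float], weights: List[float]) -> float:
    """Weighted sum normalized by total weight; one strong component can mask a weak one."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def geometric_mean(scores: List[float], weights: List[float]) -> float:
    """Weighted geometric mean; any component at zero drives the whole score to zero."""
    if min(scores) <= 0.0:
        return 0.0
    log_sum = sum(w * math.log(s) for w, s in zip(weights, scores))
    return math.exp(log_sum / sum(weights))

weights = [0.5, 0.5]
exploit = [1.0, 0.1]    # maximizes one component, neglects the other
balanced = [0.5, 0.5]   # moderate on both components

print("arithmetic:", arithmetic_mean(exploit, weights), arithmetic_mean(balanced, weights))
print("geometric: ", geometric_mean(exploit, weights), geometric_mean(balanced, weights))
```

Under the arithmetic mean the exploiting profile outranks the balanced one, so an RL agent is rewarded for gaming a single component. Under the geometric mean the ranking flips, which is why the choice of aggregation matters as much as the choice of weights.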

MolMIM: NVIDIA's Latent-Space Model

Architecture

MolMIM, part of NVIDIA's BioNeMo framework, is a latent-variable model for SMILES strings. The "MIM" in its name refers to Mutual Information Machine learning, the objective it is trained with, rather than to masked modeling. An encoder maps each SMILES string to a fixed-size latent vector, and a decoder learns to reconstruct the string from that vector. Training this way gives MolMIM an understanding of chemical grammar and an informative, clustered latent space.

For generation, MolMIM uses an interpolation strategy in its learned latent space. Given two input molecules, MolMIM can generate intermediate molecules that smoothly transition between their properties. This is conceptually different from RL-based generation – instead of optimizing a scoring function, MolMIM navigates a continuous chemical space.
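
The interpolation idea can be illustrated with plain vectors. This is a minimal sketch of linear interpolation only: the toy 2-D vectors stand in for latent codes that would, in practice, come from the model's encoder, and each interpolated point would still need to be passed through the decoder to obtain a SMILES string. The function name is hypothetical.

```python
from typing import List

def interpolate(z_start: List[float], z_end: List[float], steps: int) -> List[List[float]]:
    """Return `steps` evenly spaced points on the straight line from z_start to z_end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

# Toy latent codes for two hypothetical input molecules.
path = interpolate([0.0, 2.0], [4.0, 0.0], steps=5)
for point in path:
    print(point)
```

The midpoint of the path sits halfway between the two inputs in latent space. Whether the decoded molecule is also "halfway" in property space depends on how smooth the learned latent space is, which is a property of the trained model rather than of the interpolation itself.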

Strengths

  • Fast inference: Once trained, MolMIM generates molecules in milliseconds per sample. No RL training loop needed for generation.
  • Smooth interpolation: The latent space allows smooth transitions between molecular properties, which is useful for exploring chemical space around a hit compound.
  • NVIDIA ecosystem: Integrates with BioNeMo, DGX systems, and NVIDIA hardware acceleration. If you are already in the NVIDIA ecosystem, MolMIM fits naturally.
  • Diversity: Sampling from the learned latent space tends to produce structurally diverse molecules, which is valuable for library design and hit finding.

Limitations

  • Limited scoring control: No built-in scoring function framework comparable to REINVENT4. Property optimization requires external filtering or iterative guided generation.
  • NVIDIA dependency: Part of the proprietary BioNeMo suite. Requires NVIDIA GPUs and BioNeMo licenses for full functionality.
  • Less suited to constrained optimization: The interpolation approach is better for exploration than for optimizing specific property targets.
  • Fewer publications: Less academic validation compared to REINVENT4. Fewer published drug discovery campaigns using MolMIM as the primary generator.

DrugEx: Pareto Multi-Objective Optimization

Architecture

DrugEx, developed at the Computational Drug Discovery group at Leiden University, addresses the fundamental limitation of single-objective RL approaches: how do you balance competing drug design objectives without choosing weights in advance? DrugEx uses Pareto-based multi-objective reinforcement learning, maintaining a population of molecules on the Pareto front – the set of solutions where no objective can be improved without worsening another.

The generator architecture is similar to REINVENT (RNN-based SMILES generation), but the reward signal comes from Pareto dominance rather than a weighted sum. DrugEx uses a non-dominated sorting genetic algorithm (NSGA-II inspired) to select molecules that expand the Pareto front across all objectives simultaneously.
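
Pareto dominance and non-dominated sorting can be sketched without any chemistry libraries. The code below is a generic illustration of the technique, not DrugEx's implementation: it assumes every objective is to be maximized, and the (potency, solubility) tuples are made-up values.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points: List[Point]) -> List[Point]:
    """Points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def non_dominated_sort(points: List[Point]) -> List[List[Point]]:
    """Peel off successive Pareto fronts until every point is assigned a rank."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = pareto_front(remaining)
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# Hypothetical (potency, solubility) scores, both to be maximized.
molecules = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]
fronts = non_dominated_sort(molecules)
for rank, front in enumerate(fronts, start=1):
    print(rank, front)
```

The first front holds three molecules that trade potency against solubility in different ways; none is strictly better than another. A reward based on front rank treats all three as equally good, which is how a Pareto-based scheme avoids committing to a weight between the objectives.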

Strengths

  • True multi-objective optimization: No need to choose weights between competing objectives. The Pareto front reveals the full trade-off landscape between properties like potency, selectivity, solubility, and metabolic stability.
  • Decision-making transparency: Medicinal chemists can inspect the Pareto front and choose compounds based on which trade-offs are acceptable for their program, rather than having the algorithm make that choice.
  • Handles conflicting objectives: When high potency conflicts with good ADMET (as it often does), DrugEx finds the best compromises rather than collapsing to a single solution.
  • Open source: Fully open-source with good documentation and published benchmarks.

Limitations

  • Computationally expensive: Multi-objective RL with Pareto sorting is slower than single-objective RL. Training takes significantly longer than REINVENT4 for comparable numbers of objectives.
  • Smaller user community: Fewer practitioners and fewer integrations with common cheminformatics tools.
  • Fewer generation modes: Primarily focused on de novo generation. Less support for scaffold decoration and linker design compared to REINVENT4.
  • Pareto front interpretation: Large Pareto fronts (hundreds of non-dominated solutions) can be overwhelming to navigate without visualization tools.

Head-to-Head Comparison

The following list summarizes the key differences across the practical dimensions that matter for drug discovery projects:

  • Generation speed: MolMIM is fastest (milliseconds per molecule). REINVENT4 is moderate (seconds per molecule during RL). DrugEx is slowest (minutes per epoch due to Pareto sorting).
  • Optimization control: REINVENT4 offers the most granular control via its scoring component library. DrugEx provides the most principled multi-objective handling. MolMIM offers the least optimization control.
  • Chemical validity: All three tools produce over 95% valid SMILES, but REINVENT4 and DrugEx tend to produce slightly higher validity rates due to their RL-based training.
  • Diversity: MolMIM generates the most structurally diverse molecules. REINVENT4 can suffer from mode collapse under aggressive optimization. DrugEx maintains diversity through Pareto selection pressure.
  • Drug-likeness: REINVENT4 can explicitly optimize QED, Lipinski, and custom drug-likeness scores. MolMIM and DrugEx require post-hoc filtering or objective-level specification.
  • Ease of use: MolMIM (if in NVIDIA ecosystem) is simplest. REINVENT4 has a steep learning curve for configuration. DrugEx falls in between.
Note
No single tool dominates across all dimensions. The best choice depends on your project stage (hit finding vs. lead optimization), infrastructure (NVIDIA vs. generic GPU), and whether your design goals involve multiple conflicting objectives.
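
Diversity claims like the ones above are usually quantified with Tanimoto similarity between molecular fingerprints. The sketch below represents each fingerprint as a set of "on" bit indices and computes internal diversity as one minus the mean pairwise similarity. The toy bit sets are invented; in practice they would come from a fingerprinting tool such as RDKit's Morgan fingerprints. Returning 0.0 for two empty fingerprints is a convention chosen for this sketch.

```python
from itertools import combinations
from typing import List, Set

def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def internal_diversity(fingerprints: List[Set[int]]) -> float:
    """1 minus the mean pairwise Tanimoto similarity; higher means more diverse."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity

# Toy fingerprints as sets of on-bit indices.
fingerprints = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
print(round(tanimoto(fingerprints[0], fingerprints[1]), 3))
print(round(internal_diversity(fingerprints), 3))
```

A generator that has collapsed onto a few scaffolds will show internal diversity close to zero, so tracking this number across RL epochs is one way to spot mode collapse early.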

When to Use Each Tool

Choose REINVENT4 When

  • You have a well-defined scoring function (docking score, QSAR model, property targets)
  • You need scaffold decoration, linker design, or R-group exploration
  • You want the most battle-tested tool with the largest community
  • Your project is in lead optimization with specific property targets

Choose MolMIM When

  • You want fast, diverse hit generation without a specific optimization target
  • You need to explore chemical space around a known active compound
  • You are already using NVIDIA BioNeMo infrastructure
  • Speed of generation matters more than fine-grained property control

Choose DrugEx When

  • You have multiple competing objectives and do not know the right trade-off weights
  • You want to present the full trade-off landscape to a medicinal chemistry team
  • Your design goals involve fundamentally conflicting properties (for example, potency versus selectivity)
  • You value principled multi-objective optimization over speed

Running REINVENT4 via SciRouter

SciRouter provides REINVENT4 as a managed cloud service, eliminating the installation complexity and GPU infrastructure requirements. Here is how to generate molecules using the Python SDK:

Install the SDK
pip install scirouter
Generate molecules with REINVENT4
from scirouter import SciRouter

client = SciRouter()

# Generate novel molecules targeting a kinase inhibitor profile
result = client.generate.molecules(
    num_molecules=50,
    scoring={
        "qed": {"weight": 0.3, "target": 0.7},
        "sa_score": {"weight": 0.2, "target": 3.0},
        "molecular_weight": {"weight": 0.2, "min": 300, "max": 500},
        "logp": {"weight": 0.3, "min": 1.0, "max": 4.0},
    },
    similarity={
        "reference_smiles": "c1ccc(-c2cnc3ccccc3n2)cc1",  # 2-phenylquinoxaline scaffold
        "min_tanimoto": 0.3,
    },
)

print(f"Generated {len(result.molecules)} molecules\n")
for i, mol in enumerate(result.molecules[:10]):
    print(f"{i+1}. {mol.smiles}")
    print(f"   QED={mol.qed:.2f}  SA={mol.sa_score:.1f}  MW={mol.molecular_weight:.0f}  LogP={mol.logp:.1f}")
    print()

Filtering and Ranking Generated Molecules

After generation, apply additional filters using SciRouter's molecular properties and synthesis check endpoints:

Filter and rank molecules
# Check synthesis feasibility for top candidates
candidates = result.molecules[:20]

for mol in candidates:
    # Get detailed molecular properties
    props = client.chemistry.properties(smiles=mol.smiles)

    # Check synthetic accessibility
    synth = client.generate.synthesis_check(smiles=mol.smiles)

    print(f"SMILES: {mol.smiles}")
    print(f"  MW={props.molecular_weight:.0f}  LogP={props.logp:.1f}  HBA={props.hba}  HBD={props.hbd}")
    print(f"  SA score={synth.sa_score:.1f}  Feasibility: {synth.feasibility}")
    print()
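
The property values printed above map directly onto Lipinski's rule of five, which can be applied as a local filter. The sketch below uses a small dataclass as a stand-in for the property record; it is not the SciRouter response type, although the field names mirror the attributes used in the example. Allowing one violation follows the usual statement of the rule.

```python
from dataclasses import dataclass

@dataclass
class Props:
    """Stand-in for a molecular property record (hypothetical type)."""
    molecular_weight: float
    logp: float
    hbd: int  # hydrogen bond donors
    hba: int  # hydrogen bond acceptors

def lipinski_violations(p: Props) -> int:
    """Count how many of the four rule-of-five limits the molecule exceeds."""
    checks = [
        p.molecular_weight > 500,
        p.logp > 5,
        p.hbd > 5,
        p.hba > 10,
    ]
    return sum(checks)

def passes_rule_of_five(p: Props, max_violations: int = 1) -> bool:
    """Lipinski's rule tolerates at most one violation by default."""
    return lipinski_violations(p) <= max_violations

# Made-up property values for two hypothetical candidates.
small = Props(molecular_weight=350.0, logp=2.5, hbd=2, hba=5)
bulky = Props(molecular_weight=650.0, logp=6.1, hbd=2, hba=5)
print(passes_rule_of_five(small), passes_rule_of_five(bulky))
```

Running a cheap local check like this before calling a synthesis endpoint keeps the number of API calls proportional to the number of plausible candidates rather than to everything the generator produced.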

Combining Generators in a Pipeline

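A powerful strategy is to split generation into stages: generate diverse initial hits with broad exploration, filter them, then rank or optimize the survivors with targeted scoring. Because SciRouter currently exposes only REINVENT4, the example below runs every stage through it; the same structure applies when a different generator supplies the initial hits: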

Multi-stage generation pipeline
# Stage 1: Broad exploration with REINVENT4 (diverse mode)
exploration = client.generate.molecules(
    num_molecules=100,
    scoring={
        "qed": {"weight": 0.5, "target": 0.6},
        "sa_score": {"weight": 0.5, "target": 4.0},
    },
    temperature=1.0,  # high diversity
)
print(f"Stage 1: Generated {len(exploration.molecules)} diverse molecules")

# Stage 2: Filter by drug-likeness
drug_like = [
    m for m in exploration.molecules
    if m.qed > 0.5 and m.sa_score < 5.0 and 250 < m.molecular_weight < 550
]
print(f"Stage 2: {len(drug_like)} pass drug-likeness filters")

# Stage 3: Rank by synthesis feasibility
ranked = []
for mol in drug_like:
    synth = client.generate.synthesis_check(smiles=mol.smiles)
    ranked.append({"smiles": mol.smiles, "qed": mol.qed, "sa": synth.sa_score})

# Lower SA score means easier to synthesize, so sort ascending
ranked.sort(key=lambda x: x["sa"])
print("\nTop 5 candidates (easiest to synthesize):")
for i, r in enumerate(ranked[:5]):
    print(f"  {i+1}. {r['smiles']} (QED={r['qed']:.2f}, SA={r['sa']:.1f})")
Tip
SciRouter's Molecular Design Lab automates this multi-stage pipeline in a single API call. Define your target profile and constraints, and the lab chains generation, property filtering, ADMET screening, and synthesis checking automatically.

Benchmarks and Performance

Published results on the GuacaMol and MOSES benchmark suites provide standardized comparisons. The figures below are approximate ranges; exact values depend on the prior, the training data, and the sampling settings:

  • Validity (% valid SMILES): REINVENT4 achieves 98 to 99%. MolMIM achieves 96 to 98%. DrugEx achieves 97 to 99%. All three are production-ready on validity.
  • Uniqueness: MolMIM tends to generate the most unique structures (over 99%). REINVENT4 can show some repetition under aggressive optimization (90 to 98%). DrugEx maintains good uniqueness through Pareto diversity pressure (95 to 99%).
  • Novelty (vs. training set): All three generate over 95% novel molecules not found in their training data, confirming genuine de novo generation rather than memorization.
  • Rediscovery (target molecule recovery): REINVENT4 excels at rediscovering known actives when given appropriate scoring functions. MolMIM is less effective at targeted rediscovery. DrugEx falls in between.
  • Optimization efficiency: REINVENT4 reaches scoring optima faster (fewer RL epochs). DrugEx takes longer but produces better Pareto frontiers. MolMIM does not directly optimize scores.
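
The first three metrics in this list are simple ratios and can be computed as follows. This is a sketch following the definitions commonly used in MOSES-style evaluations, not the benchmark suites' own code. The `canonicalize` argument is an assumption: it should return a canonical SMILES string, or `None` for an unparsable input, which in practice could be built from RDKit's `Chem.MolFromSmiles` and `Chem.MolToSmiles`. The toy canonicalizer below only rejects strings containing "X" and exists purely so the example runs without dependencies.

```python
from typing import Callable, Dict, List, Optional, Set

def generation_metrics(generated: List[str],
                       training_set: Set[str],
                       canonicalize: Callable[[str], Optional[str]]) -> Dict[str, float]:
    """Validity, uniqueness, and novelty of a batch of generated SMILES.

    training_set is assumed to hold canonical SMILES already.
    """
    canonical = [canonicalize(s) for s in generated]
    valid = [c for c in canonical if c is not None]
    unique = set(valid)
    novel = unique - training_set
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }

def toy_canonicalize(smiles: str) -> Optional[str]:
    """Placeholder parser: treats any string containing 'X' as invalid."""
    return None if "X" in smiles else smiles

generated = ["CCO", "CCO", "c1ccccc1", "C(X", "CCN"]
training_set = {"CCO"}
metrics = generation_metrics(generated, training_set, toy_canonicalize)
print(metrics)
```

Note that each ratio uses a different denominator: validity is measured against everything generated, uniqueness against the valid subset, and novelty against the unique valid subset. Reported numbers are only comparable across tools when the same denominators and the same canonicalization are used.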

The Future of Generative Chemistry

The field is moving rapidly. Diffusion models (like those used in image generation) are being adapted for 3D molecular generation, potentially surpassing SMILES-based approaches. Graph neural network generators that operate directly on molecular graphs avoid SMILES validity issues entirely. And foundation models trained on vast chemical datasets may eventually combine the strengths of all three approaches.

For now, REINVENT4 remains the workhorse for production drug discovery, MolMIM excels at fast exploration, and DrugEx offers the most principled multi-objective optimization. SciRouter provides REINVENT4 as a managed service today, with additional generators planned for future releases.

Next Steps

Try generating molecules with REINVENT4 through the SciRouter API, then evaluate candidates with the molecular properties and synthesis check endpoints. For the full pipeline from target to synthesizable drug candidates, see our guide on SMILES to synthesis workflows.

Sign up at scirouter.ai/register for 500 free credits and start generating molecules today.

Frequently Asked Questions

Which tool is best for generating drug-like molecules from scratch?

REINVENT4 is the strongest choice for de novo generation of drug-like molecules. It has the most mature scoring infrastructure with built-in components for Lipinski compliance, QED, synthetic accessibility, and custom docking scores. Its reinforcement learning approach allows fine-grained control over which properties the generated molecules optimize for.

Can MolMIM be used for lead optimization?

MolMIM is better suited to hit generation and molecular exploration than lead optimization. Its latent-space sampling approach generates diverse molecules quickly, but it lacks the fine-grained scoring function customization that lead optimization requires. For lead optimization with specific multi-parameter objectives, REINVENT4 or DrugEx provide more control.

What makes DrugEx unique compared to REINVENT4 and MolMIM?

DrugEx is the only tool of the three that uses true Pareto-based multi-objective optimization. Instead of collapsing multiple objectives into a single weighted score, DrugEx maintains a Pareto front of non-dominated solutions that balance competing goals like potency, selectivity, and ADMET properties simultaneously. This is particularly valuable when you do not know the right trade-off weights in advance.

How does SciRouter integrate REINVENT4 with ADMET filtering?

SciRouter's Molecular Design Lab endpoint chains REINVENT4 generation with property calculation, ADMET prediction, and synthetic accessibility scoring in a single API call. You define your target constraints and ADMET thresholds, and the pipeline returns ranked candidates that pass all filters. This eliminates the need to run separate tools and write custom integration code.

Do these tools require a GPU to run?

REINVENT4 benefits from GPU acceleration during training but can run inference on CPU. MolMIM requires a GPU for both training and efficient inference (it is part of NVIDIA BioNeMo). DrugEx needs a GPU for the reinforcement learning training loop. Through SciRouter, all computation runs on cloud GPUs so you do not need local GPU hardware.

Can I combine molecules from different generators?

Yes. A common strategy is to generate diverse initial hits with MolMIM (fast, broad exploration), filter them with property and ADMET screens, then use REINVENT4 to optimize the best hits against specific scoring functions. DrugEx can then balance the final candidates across multiple objectives. SciRouter makes this pipeline straightforward through its unified API.
