Chemical Reaction Optimization with Catalyst Constraints

In this example, we demonstrate the use of interpoint constraints in a chemical optimization scenario. We optimize reaction conditions for a batch of chemical experiments where exactly 30 mol% of catalyst must be used across the entire batch, while not using more than 60 mL of solvent across the batch.

This scenario illustrates two common challenges in laboratory settings. The first is a catalyst requirement: exactly 30 mol% of the catalyst must be used across the entire batch because the catalyst is supplied in a sealed, sensitive package that cannot be reused once opened. The second is a solvent budget: total solvent consumption across the experiments is capped for cost efficiency.

This example uses both intrapoint and interpoint constraints. An intrapoint constraint, often simply referred to as a constraint, applies to each individual experiment and ensures that certain conditions are met within that single point. An interpoint constraint, in contrast, applies across a batch of experiments and enforces conditions on the collective set of points rather than on individual ones. Interpoint constraints are particularly useful when resources or conditions must be managed at the batch level. They allow us to:

Ensure total resource consumption meets exact requirements
Maintain chemical balances across multiple experiments
Optimize the collective use of expensive materials

For more details on interpoint constraints, see the user guide on constraints.
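To make the distinction concrete, the following minimal sketch evaluates both kinds of linear constraint on a hand-written batch in plain Python. It does not use BayBE: the helper names (`linear_lhs`, `satisfies_intrapoint`, `satisfies_interpoint`) and the sample values are hypothetical and only illustrate the semantics described above.

```python
import operator

# Map operator strings to comparison functions.
OPS = {">=": operator.ge, "<=": operator.le}


def linear_lhs(point, parameters, coefficients):
    """Return the weighted sum of the selected parameters for one experiment."""
    return sum(c * point[p] for p, c in zip(parameters, coefficients))


def satisfies_intrapoint(batch, parameters, coefficients, op, rhs):
    """The constraint must hold for every experiment on its own."""
    return all(
        OPS[op](linear_lhs(pt, parameters, coefficients), rhs) for pt in batch
    )


def satisfies_interpoint(batch, parameters, coefficients, op, rhs):
    """The constraint must hold for the sum over all experiments in the batch."""
    total = sum(linear_lhs(pt, parameters, coefficients) for pt in batch)
    return OPS[op](total, rhs)


# Hypothetical batch of three experiments (illustrative values only).
batch = [
    {"Solvent_Volume": 25.0, "Catalyst_Loading": 12.0},
    {"Solvent_Volume": 20.0, "Catalyst_Loading": 8.0},
    {"Solvent_Volume": 15.0, "Catalyst_Loading": 10.0},
]

# Each experiment stays within 30 mL, and the batch total is exactly 60 mL.
per_point_ok = satisfies_intrapoint(batch, ["Solvent_Volume"], [1], "<=", 30.0)
batch_ok = satisfies_interpoint(batch, ["Solvent_Volume"], [1], "<=", 60.0)
# A tighter batch budget of 50 mL is violated by the same experiments.
too_tight = satisfies_interpoint(batch, ["Solvent_Volume"], [1], "<=", 50.0)
print(per_point_ok, batch_ok, too_tight)  # True True False
```

The same three experiments pass the per-experiment check and the 60 mL batch check but fail a 50 mL batch budget, which shows that a batch-level condition cannot be verified by looking at the experiments one at a time.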

Imports and Settings

import os

import pandas as pd
from matplotlib import pyplot as plt

from baybe import Campaign
from baybe.constraints import ContinuousLinearConstraint
from baybe.parameters import NumericalContinuousParameter
from baybe.recommenders import BotorchRecommender
from baybe.searchspace import SearchSpace
from baybe.targets import NumericalTarget
from baybe.utils.dataframe import add_fake_measurements
from baybe.utils.random import set_random_seed

SMOKE_TEST = "SMOKE_TEST" in os.environ
BATCH_SIZE = 3
N_ITERATIONS = 4 if SMOKE_TEST else 15
TOLERANCE = 0.01

set_random_seed(1337)

Defining the Chemical Optimization Problem

We’ll optimize a synthetic chemical reaction with the following experimental parameters:

Solvent Volume (10-30 mL per experiment): The amount of solvent used
Reactant A Concentration (0.1-2.0 g/L): Primary reactant concentration
Catalyst Loading (1-20 mol%): Catalyst amount as percentage of limiting reagent
Temperature (60-120 °C): Reaction temperature

Note that these ranges are chosen arbitrarily and do not represent a specific real-world reaction.

parameters = [
    NumericalContinuousParameter(
        name="Solvent_Volume", bounds=(10.0, 30.0), metadata={"unit": "mL"}
    ),
    NumericalContinuousParameter(
        name="Reactant_A_Conc", bounds=(0.1, 2.0), metadata={"unit": "g/L"}
    ),
    NumericalContinuousParameter(
        name="Catalyst_Loading", bounds=(1.0, 20.0), metadata={"unit": "mol%"}
    ),
    NumericalContinuousParameter(
        name="Temperature", bounds=(60.0, 120.0), metadata={"unit": "°C"}
    ),
]

Constraint Definition

We define both intrapoint and interpoint constraints to demonstrate the difference:

Intrapoint constraint (applied to each individual experiment):

Reagent efficiency: For each experiment, solvent volume must be at least 5 times the reactant concentration (to ensure proper dilution)

Interpoint constraints (applied across the entire batch):

Catalyst constraint: Total catalyst loading across all experiments must equal exactly 30 mol%
Solvent budget: Total solvent across batch should be ≤ 60 mL
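Written out for a batch of $B$ experiments, the three constraints take the following form. The symbols $V_i$, $c_i$, and $x_i$ are introduced here for illustration only and denote the solvent volume, reactant A concentration, and catalyst loading of experiment $i$. The formulation assumes, as described above, that an interpoint constraint applies to the sum over the batch.

```latex
\begin{aligned}
V_i - 5\,c_i &\ge 0 \quad \text{for each } i = 1, \dots, B && \text{(intrapoint)} \\
\sum_{i=1}^{B} x_i &= 30 && \text{(interpoint, catalyst)} \\
\sum_{i=1}^{B} V_i &\le 60 && \text{(interpoint, solvent)}
\end{aligned}
```

In the constraint definitions, the entries of `coefficients` are the weights on the left-hand sides and `rhs` is the constant on the right-hand side.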

intrapoint_constraints = [
    ContinuousLinearConstraint(
        parameters=["Solvent_Volume", "Reactant_A_Conc"],
        operator=">=",
        coefficients=[1, -5],
        rhs=0.0,
        interpoint=False,
    ),
]

interpoint_constraints = [
    ContinuousLinearConstraint(
        parameters=["Catalyst_Loading"],
        operator="=",
        coefficients=[1],
        rhs=30.0,
        interpoint=True,
    ),
    ContinuousLinearConstraint(
        parameters=["Solvent_Volume"],
        operator="<=",
        coefficients=[1],
        rhs=60.0,
        interpoint=True,
    ),
]
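Because an interpoint constraint restricts a sum over the batch, whether it can be satisfied at all depends on the batch size and the per-experiment bounds. The sketch below is a plain-Python sanity check and not part of BayBE. `batch_sum_range` is a hypothetical helper, and the bounds are copied from the parameter definitions in this example. It considers each constraint in isolation and ignores any interaction between them.

```python
def batch_sum_range(bounds, batch_size):
    """Return the smallest and largest possible batch total for one parameter."""
    low, high = bounds
    return batch_size * low, batch_size * high


BATCH_SIZE = 3  # same batch size as used in this example

# Catalyst: the required total of 30 mol% must lie within the reachable range.
cat_min, cat_max = batch_sum_range((1.0, 20.0), BATCH_SIZE)
catalyst_feasible = cat_min <= 30.0 <= cat_max

# Solvent: the smallest possible total must not exceed the 60 mL budget.
sol_min, _ = batch_sum_range((10.0, 30.0), BATCH_SIZE)
solvent_feasible = sol_min <= 60.0

print(f"Catalyst total range: {cat_min}-{cat_max} mol%, feasible: {catalyst_feasible}")
print(f"Minimum solvent total: {sol_min} mL, feasible: {solvent_feasible}")
```

These ranges change with the batch size: with 7 experiments, the minimum solvent total would be 70 mL, so the 60 mL budget could no longer be met.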

Campaign Setup

We construct the search space by combining parameters with constraints, then create a campaign targeting maximum reaction yield. The BotorchRecommender with sequential_continuous=False is required for interpoint constraints as they operate on batches rather than individual experiments.

searchspace = SearchSpace.from_product(
    parameters=parameters,
    constraints=intrapoint_constraints + interpoint_constraints,
)

target = NumericalTarget(name="Reaction_Yield")
objective = target.to_objective()

recommender = BotorchRecommender(sequential_continuous=False)

campaign = Campaign(
    searchspace=searchspace,
    objective=objective,
    recommender=recommender,
)

Measurement Simulation

For this example, we use the add_fake_measurements utility to generate synthetic target values. This utility creates random measurements within the target’s expected range, which is useful for testing and demonstration purposes without requiring a complex reaction model.

Initial Training Data

We generate 5 random experiments from the search space to simulate existing data.

initial_data = searchspace.continuous.sample_uniform(5)
add_fake_measurements(initial_data, campaign.targets)
campaign.add_measurements(initial_data)

Optimization Loop with Constraint Validation

We run several optimization iterations, where each iteration recommends a batch of experiments that satisfy both intrapoint and interpoint constraints. After evaluating each batch, we validate that the interpoint constraints are satisfied and use assertions to ensure the optimization respects our resource limitations.

results_log = []

for it in range(N_ITERATIONS):
    recommendations = campaign.recommend(batch_size=BATCH_SIZE)

    add_fake_measurements(recommendations, campaign.targets)
    campaign.add_measurements(recommendations)
    total_sol = recommendations["Solvent_Volume"].sum()
    total_cat = recommendations["Catalyst_Loading"].sum()
    solvent_ok = total_sol <= (60.0 + TOLERANCE)
    catalyst_ok = abs(total_cat - 30.0) < TOLERANCE

    assert solvent_ok, f"Solvent constraint violated: {total_sol:.1f} mL (max 60.0 mL)"
    assert catalyst_ok, (
        f"Catalyst constraint violated: {total_cat:.1f} mol% (expected 30.0 mol%)"
    )

    results_log.append(
        {
            "iteration": it + 1,
            "total_solvent_mL": total_sol,
            "total_catalyst_mol%": total_cat,
            "individual_solvent_mL": recommendations["Solvent_Volume"].tolist(),
            "individual_catalyst_mol%": recommendations["Catalyst_Loading"].tolist(),
        }
    )
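The tolerance in the checks above accounts for the fact that numerical optimizers typically satisfy equality constraints only up to a small error. The following standalone sketch performs the same validation with a hypothetical helper, `check_batch_totals`, on hand-picked values rather than real recommendations. It uses `math.isclose`, which treats a deviation of exactly the tolerance as acceptable, whereas the loop above uses a strict comparison.

```python
import math


def check_batch_totals(solvent, catalyst, budget=60.0, required=30.0, tol=0.01):
    """Check the batch-level solvent budget and catalyst requirement.

    Returns a pair (solvent_ok, catalyst_ok).
    """
    total_solvent = sum(solvent)
    total_catalyst = sum(catalyst)
    solvent_ok = total_solvent <= budget + tol
    # Equality is tested up to an absolute tolerance, not exactly.
    catalyst_ok = math.isclose(total_catalyst, required, rel_tol=0.0, abs_tol=tol)
    return solvent_ok, catalyst_ok


# Illustrative values: the catalyst total misses 30 mol% by a tiny amount.
ok = check_batch_totals([18.0, 20.0, 21.5], [10.0, 12.5, 7.4999])
# A clearly violated catalyst total of 31.5 mol%.
bad = check_batch_totals([18.0, 20.0, 21.5], [10.0, 12.5, 9.0])
print(ok, bad)
```

The first batch passes both checks despite the small deviation in the catalyst total, while the second fails the catalyst check.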

Visualization

We create plots showing both individual experiment values and their totals to illustrate how interpoint constraints work. The individual lines show how the optimizer distributes resources across experiments within each batch, while the bold total lines demonstrate that the batch-level constraints are satisfied.

results_df = pd.DataFrame(results_log)

fig, axs = plt.subplots(1, 2, figsize=(10, 4))

plt.sca(axs[0])
for exp_idx in range(BATCH_SIZE):
    individual_values = [
        batch[exp_idx] for batch in results_df["individual_solvent_mL"]
    ]
    plt.plot(
        results_df["iteration"],
        individual_values,
        "o-",
        alpha=0.6,
        label=f"Exp {exp_idx + 1}",
    )

plt.plot(
    results_df["iteration"],
    results_df["total_solvent_mL"],
    "s-",
    color="blue",
    linewidth=2,
    label="Total",
)
plt.axhline(y=60, color="red", linestyle="--", label="Budget")
plt.title("Solvent (Constrained)")
plt.xlabel("Batch")
plt.ylabel("Solvent Volume (mL)")
plt.legend(loc="center left", bbox_to_anchor=(1, 0.5))

plt.sca(axs[1])
for exp_idx in range(BATCH_SIZE):
    individual_values = [
        batch[exp_idx] for batch in results_df["individual_catalyst_mol%"]
    ]
    plt.plot(
        results_df["iteration"],
        individual_values,
        "o-",
        alpha=0.6,
        label=f"Exp {exp_idx + 1}",
    )

plt.plot(
    results_df["iteration"],
    results_df["total_catalyst_mol%"],
    "s-",
    color="orange",
    linewidth=2,
    label="Total",
)
plt.axhline(y=30, color="red", linestyle="--", label="Required")
plt.title("Catalyst (Constrained)")
plt.xlabel("Batch")
plt.ylabel("Catalyst Loading (mol%)")
plt.legend(loc="center left", bbox_to_anchor=(1, 0.5))

plt.tight_layout()
if not SMOKE_TEST:
    plt.savefig("interpoint.svg")