# Backtesting

This example demonstrates the use of the [`simulate_transfer_learning`](baybe.simulation.transfer_learning.simulate_transfer_learning) function to learn across tasks:

* We construct a campaign,
* define two related test functions,
* use the data from the first function to train the second,
* and vice versa.

## Imports

```python
import os
```

```python
import numpy as np
import pandas as pd
import seaborn as sns
from botorch.test_functions.synthetic import Hartmann
```

```python
from baybe import Campaign
from baybe.objectives import SingleTargetObjective
from baybe.parameters import NumericalDiscreteParameter, TaskParameter
from baybe.searchspace import SearchSpace
from baybe.simulation import simulate_scenarios, simulate_transfer_learning
from baybe.targets import NumericalTarget
from baybe.utils.botorch_wrapper import botorch_function_wrapper
from baybe.utils.plotting import create_example_plots
```

## Settings

The following settings are used to set up the problem:

```python
SMOKE_TEST = "SMOKE_TEST" in os.environ  # reduce the problem complexity in CI pipelines

DIMENSION = 3  # input dimensionality of the test function
BATCH_SIZE = 1  # batch size of recommendations per DOE iteration
N_MC_ITERATIONS = 2 if SMOKE_TEST else 50  # number of Monte Carlo runs
N_DOE_ITERATIONS = 2 if SMOKE_TEST else 10  # number of DOE iterations
POINTS_PER_DIM = 3 if SMOKE_TEST else 7  # number of grid points per input dimension
```

## Creating the Optimization Objective

Each test function has a single output that is to be minimized. The corresponding [Objective](baybe.objective.Objective) is created as follows:

```python
objective = SingleTargetObjective(target=NumericalTarget(name="Target", mode="MIN"))
```

## Creating the Search Space

This example uses the [Hartmann Function](https://botorch.org/api/test_functions.html#botorch.test_functions.synthetic.Hartmann) as implemented by `botorch`. The bounds of the search space are dictated by the test function and can be extracted from the function itself:

```python
BOUNDS = Hartmann(dim=DIMENSION).bounds
```

First, we define one [NumericalDiscreteParameter](baybe.parameters.numerical.NumericalDiscreteParameter) per input dimension of the test function:

```python
discrete_params = [
    NumericalDiscreteParameter(
        name=f"x{d}",
        values=np.linspace(lower, upper, POINTS_PER_DIM),
    )
    for d, (lower, upper) in enumerate(BOUNDS.T)
]
```

Next, we define a [TaskParameter](baybe.parameters.categorical.TaskParameter) to encode the task context, which allows the model to establish a relationship between the training data and the data collected during the optimization process. Since we train across both tasks here, we do not specify any `active_values`:

```python
task_param = TaskParameter(
    name="Function",
    values=["Hartmann", "Shifted"],
)
```

With the parameters at hand, we can now create our search space:

```python
parameters = [*discrete_params, task_param]
searchspace = SearchSpace.from_product(parameters=parameters)
```

## Defining the Tasks

To demonstrate the transfer learning mechanism, we consider the problem of optimizing the Hartmann function using training data from a shifted, scaled, and noisy version of it, and vice versa. The model is, of course, not aware of this relationship but needs to infer it from the data gathered during the optimization process.
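Both tasks are built with BayBE's `botorch_function_wrapper`, which adapts a botorch test function into a plain Python callable that accepts the input coordinates and returns a single float. For orientation, a minimal sketch of such an adapter could look as follows; this is a hypothetical re-implementation for illustration, not BayBE's actual code:

```python
import torch
from botorch.test_functions.synthetic import Hartmann


def wrap_botorch_function(test_function):
    """Hypothetical sketch of a botorch-to-float adapter."""

    def wrapped(*x: float) -> float:
        # Accept both wrapped(x0, x1, x2) and wrapped((x0, x1, x2))
        values = x[0] if len(x) == 1 and hasattr(x[0], "__iter__") else x
        # Evaluate the test function on a 1D tensor and cast to a built-in float
        return float(test_function(torch.tensor(values)))

    return wrapped


# Example: evaluate the three-dimensional Hartmann function at the origin
print(wrap_botorch_function(Hartmann(dim=3))(0.0, 0.0, 0.0))
```

With this adapter pattern in mind, we define the shifted variant and collect both test functions: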
```python
def shifted_hartmann(*x: float) -> float:
    """Calculate a shifted, scaled, and noisy variant of the Hartmann function."""
    noised_hartmann = Hartmann(dim=DIMENSION, noise_std=0.15)
    return 2.5 * botorch_function_wrapper(noised_hartmann)(x) + 3.25
```

```python
test_functions = {
    "Hartmann": botorch_function_wrapper(Hartmann(dim=DIMENSION)),
    "Shifted": shifted_hartmann,
}
```

## Generating Lookup Tables

We generate a single lookup table containing the target values of both functions on the given parameter grid. During the simulation, the entries of one function serve as training data for the model, while the entries of the other function close the loop by providing the target values queried during its optimization.

```python
grid = np.meshgrid(*[p.values for p in discrete_params])
```

```python
lookups: dict[str, pd.DataFrame] = {}
for function_name, function in test_functions.items():
    lookup = pd.DataFrame({f"x{d}": grid_d.ravel() for d, grid_d in enumerate(grid)})
    lookup["Target"] = tuple(lookup.apply(function, axis=1))
    lookup["Function"] = function_name
    lookups[function_name] = lookup
lookup = pd.concat([lookups["Hartmann"], lookups["Shifted"]]).reset_index()
```

## Simulation Loop

```python
campaign = Campaign(searchspace=searchspace, objective=objective)
```

```python
results = simulate_transfer_learning(
    campaign,
    lookup,
    batch_size=BATCH_SIZE,
    n_doe_iterations=N_DOE_ITERATIONS,
    n_mc_iterations=N_MC_ITERATIONS,
)
```

For comparison, we also simulate the corresponding baseline scenarios without transfer learning.

```{note}
A more elegant way of comparing results with and without transfer learning is planned for the future.
```

```python
for func_name, function in test_functions.items():
    task_param = TaskParameter(
        name="Function",
        values=["Hartmann", "Shifted"],
        active_values=[func_name],
    )
    parameters = [*discrete_params, task_param]
    searchspace = SearchSpace.from_product(parameters=parameters)
    result_baseline = simulate_scenarios(
        {f"{func_name}_No_TL": Campaign(searchspace=searchspace, objective=objective)},
        lookups[func_name],
        batch_size=BATCH_SIZE,
        n_doe_iterations=N_DOE_ITERATIONS,
        n_mc_iterations=N_MC_ITERATIONS,
    )
    results = pd.concat([results, result_baseline])
```

All that remains is to visualize the results. As the example shows, the optimization speed can be increased significantly by using even small amounts of training data from related optimization tasks.

```python
results.rename(columns={"Scenario": "Function"}, inplace=True)
# Add a column to enable different line styles for the non-TL scenarios
results["Uses TL"] = results["Function"].apply(lambda val: "No_TL" not in val)
ax = sns.lineplot(
    data=results,
    markers=["o", "s"],
    markersize=13,
    x="Num_Experiments",
    y="Target_CumBest",
    hue="Function",
    style="Uses TL",
)
create_example_plots(ax=ax, base_name="backtesting")
```

```{image} backtesting_light.svg
:align: center
:class: only-light
```

```{image} backtesting_dark.svg
:align: center
:class: only-dark
```
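To complement the plot with a quantitative summary, one can also compare the cumulative best target values reached at the end of the campaigns. Below is a minimal post-processing sketch, assuming the `results` DataFrame from above with its `Function`, `Uses TL`, `Num_Experiments`, and `Target_CumBest` columns:

```python
# Average the cumulative best target value at the final number of experiments,
# separately for each scenario; since the target is minimized, lower is better.
final = results[results["Num_Experiments"] == results["Num_Experiments"].max()]
summary = final.groupby(["Function", "Uses TL"])["Target_CumBest"].agg(["mean", "std"])
print(summary)
```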