Initial Exploration Phase

When you start an optimization experiment, the optimizer first collects a set of exploratory data points before it can build a reliable predictive model. This article explains what happens during that phase, how many data points are needed, and how to plan your experiment accordingly.


What Is the Initial Exploration Phase?

Before the optimizer can learn from data, it needs a minimum number of measurements to train on. During this initial exploration phase, suggestions are space-filling — they are designed to cover your parameter space broadly rather than exploit any known pattern.

A few things to expect during this phase:

  • Suggestions are exploratory, not targeted. They sample the parameter space as evenly as possible.

  • Predictions are not yet available because the model has not been trained.

  • Expert context influences where in the parameter space the initial suggestions are drawn from (except when constraints are active).
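
To make "space-filling" concrete, here is a minimal sketch of one common space-filling technique, Latin hypercube sampling. This article does not state which method the optimizer actually uses, so treat the technique, the function name `latin_hypercube`, and the example parameter ranges as illustrative assumptions only.

```python
import random


def latin_hypercube(n_points, bounds, seed=None):
    """Draw n_points samples so that, for each parameter, its range is split
    into n_points equal strata and every stratum is used exactly once.

    Illustrative only: not necessarily the optimizer's own sampling method.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_points
        # One random value inside each stratum of this parameter's range.
        column = [low + (i + rng.random()) * width for i in range(n_points)]
        # Shuffle so strata are paired randomly across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: each row is one suggested experiment.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


# Hypothetical parameter space: temperature (20 to 80) and concentration (0.1 to 1.0).
samples = latin_hypercube(4, [(20.0, 80.0), (0.1, 1.0)], seed=0)
for temperature, concentration in samples:
    print(f"temperature={temperature:.1f}, concentration={concentration:.2f}")
```

Each of the four suggestions falls in a different quarter of the temperature range and a different quarter of the concentration range, which is what "covering the space broadly" looks like in practice.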

Once enough data has been collected, the optimizer automatically transitions to guided optimization — the model-driven phase where suggestions actively target promising regions.


How Many Data Points Are Needed?

The number of initial data points depends on your parameter space. The optimizer applies two rules:

Rule 1 — Minimum samples

The optimizer requires at least one data point per parameter, with a minimum of 2. For an experiment with N parameters:

Initial points needed ≥ max(2, N)

For example, an experiment with 5 parameters needs at least 5 initial data points before guided optimization can begin.

Rule 2 — Categorical coverage (when applicable)

If your experiment includes categorical parameters without properties, the optimizer also needs to have seen every option at least once. This is necessary because the model has no basis for reasoning about a categorical option it has never observed — "Solvent A" and "Solvent B" are just opaque labels with no inherent relationship.

In this case, the number of initial data points needed is:

Initial points needed ≥ max(Rule 1 result, largest number of options across all such categoricals)

For example, a solvent parameter with 6 options means the optimizer needs at least 6 initial data points — one to try each solvent.
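
The two rules can be combined in a few lines. The sketch below is only a restatement of the formulas above for planning purposes; the function name `initial_points_needed` and its arguments are hypothetical and are not part of any product API.

```python
def initial_points_needed(n_parameters, categorical_option_counts=()):
    """Estimate the number of initial data points from the two rules.

    n_parameters: total number of parameters in the experiment.
    categorical_option_counts: number of options for each categorical
        parameter that has no properties (empty if there are none).
    """
    # Rule 1: one data point per parameter, but never fewer than 2.
    rule_1 = max(2, n_parameters)
    # Rule 2: every option of each such categorical must be seen at least once,
    # so the largest option count is a second lower bound.
    return max([rule_1, *categorical_option_counts])


print(initial_points_needed(5))                                 # 5 parameters
print(initial_points_needed(4, categorical_option_counts=[6]))  # 3 numerical + 1 categorical with 6 options
```

The first call reproduces the 5-parameter example from Rule 1, and the second reproduces the 6-option solvent example from Rule 2.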


Examples

| Experiment setup | Initial data points needed | With parallelization 4 |
| --- | --- | --- |
| 3 numerical parameters | 3 | 1 initial iteration |
| 6 numerical parameters | 6 | 2 initial iterations |
| 3 numerical + 1 categorical (6 options, no properties) | 6 | 2 initial iterations |

The number of initial iterations is the number of data points needed divided by your parallelization setting, rounded up.
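
As a quick planning aid, the rounding-up step can be sketched as follows. The function name `initial_iterations` is a hypothetical helper, not a product feature; it simply applies the division described above.

```python
import math


def initial_iterations(points_needed, parallelization):
    """Number of rounds needed to collect the initial data points,
    given how many experiments run in parallel per round."""
    if parallelization < 1:
        raise ValueError("parallelization must be at least 1")
    # Divide and round up: a partially filled final batch still costs one round.
    return math.ceil(points_needed / parallelization)


for points in (3, 6):
    print(points, "points ->", initial_iterations(points, 4), "iteration(s)")
```

With a parallelization of 4, this gives 1 iteration for 3 points and 2 iterations for 6 points, matching the examples above.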


Shortening the Initial Phase

There are two ways to reduce the time spent in initial exploration:

  • Upload historical data — if you already have prior measurements that cover the parameter space, upload them during experiment setup. The optimizer will count them toward the initial design requirement and may skip the exploratory phase entirely.

  • Increase parallelization — a higher batch size reduces the number of initial iterations (though not the total number of data points required). More simultaneous experiments means you complete the initial phase in fewer rounds.


Good to Know

  • Once guided optimization starts, suggestions become increasingly targeted. The initial phase is a one-time cost — every iteration after it benefits from everything the model has learned.