Why Use Historical Data?

How to leverage previous experimental results to accelerate your optimization.

If you have results from previous experiments, whether from manual testing, a classical design of experiments (DoE), or a prior optimization campaign, you can feed them into SDLabs. The optimizer uses this data to skip or shorten the initial exploration phase and start making informed recommendations sooner.

Without historical data, the optimizer begins with a space-filling initial design to learn the landscape from scratch. With historical data, it already has a starting model and can focus on the most promising regions right away.


How It Affects the Optimization

  • Shorter initial exploration — The optimizer may need fewer (or zero) space-filling iterations before switching to model-guided optimization, depending on how much historical data you provide.

  • Better early recommendations — The surrogate model starts with real data instead of a blank slate, so even the first round of suggestions is more targeted.

  • Data quality matters — The optimizer trusts historical data the same way it trusts new measurements. If historical data is noisy, inconsistent, or from a different setup, it can mislead the model. Only include data that is relevant to the current experiment and conditions.

  • Coverage helps — Historical data that covers a wide range of the parameter space is more useful than data clustered around a narrow region. Broad coverage gives the model a better starting map of the landscape.
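
To make the coverage point concrete, here is a minimal local sketch that estimates how much of each variable's planned range your historical data spans. It is not part of SDLabs: the column names, the bounds, and the function `coverage_by_variable` are assumptions made for illustration, and the span between the smallest and largest value is only a crude proxy that ignores gaps inside the range.

```python
import csv
import io


def coverage_by_variable(rows, bounds):
    """Return the fraction of each variable's planned range spanned by the data.

    rows   -- list of dicts, e.g. from csv.DictReader
    bounds -- {variable name: (lower, upper)} for the planned experiment
    """
    coverage = {}
    for name, (lower, upper) in bounds.items():
        # Skip empty cells; everything else is expected to be numeric.
        values = [float(row[name]) for row in rows if (row.get(name) or "").strip()]
        if not values or upper <= lower:
            coverage[name] = 0.0
            continue
        # Clip to the planned bounds so out-of-range points do not inflate the score.
        lo = max(min(values), lower)
        hi = min(max(values), upper)
        coverage[name] = max(hi - lo, 0.0) / (upper - lower)
    return coverage


# Hypothetical historical data: temperature is well spread, pH is clustered.
SAMPLE_CSV = """temperature,ph,yield
20,6.9,0.41
50,7.0,0.63
80,7.1,0.58
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
bounds = {"temperature": (20.0, 80.0), "ph": (4.0, 10.0)}
coverage = coverage_by_variable(rows, bounds)
for name, fraction in coverage.items():
    print(f"{name}: {fraction:.0%} of planned range covered")
```

A low fraction for a variable suggests the historical data gives the model little information about most of that variable's range, so some exploration there is likely still needed.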


Two Ways to Add Historical Data

Option 1: Start an Experiment From a Dataset

Upload a CSV file during experiment creation. The platform automatically defines variables and results based on the columns in your file.

  • The experiment setup will be limited to the variables and results in your file.

  • You can expand the range of numerical variables (e.g. widen bounds beyond what the data covers).

  • You cannot add new options to categorical variables at this time — only the categories present in the data are available.

  • Your historical data becomes the starting dataset — the optimizer trains on it immediately.

This is the fastest way to get started when your historical data already defines the scope of the experiment.
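
Before uploading, it can help to preview what your file implies about the experiment scope. The sketch below is a rough local approximation, not the platform's own column detection, which may behave differently. The sample columns and the function `summarize_columns` are hypothetical. It reports the observed range of each numeric column and the categories present in each text column, which are the categories you would be limited to.

```python
import csv
import io


def summarize_columns(rows):
    """Classify each column as numeric (with min/max) or categorical (with its categories)."""
    summary = {}
    for name in rows[0].keys():
        values = [row[name].strip() for row in rows if row[name] and row[name].strip()]
        if not values:
            summary[name] = {"type": "empty"}
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # At least one value is not a number, so treat the column as categorical.
            summary[name] = {"type": "categorical", "categories": sorted(set(values))}
        else:
            summary[name] = {"type": "numeric", "min": min(numbers), "max": max(numbers)}
    return summary


# Hypothetical dataset with one categorical and two numeric columns.
SAMPLE_CSV = """solvent,temperature,yield
ethanol,25,0.52
water,40,0.61
ethanol,60,0.70
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_columns(rows)
for name, info in summary.items():
    print(name, info)
```

If a category you plan to test is missing from the summary, add at least one data point that uses it before uploading, or use Option 2 instead.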

Option 2: Attach Datasets to a Running Experiment

If your experiment is already set up, open the Historical Data section in the experiment configuration. From there, you can attach one or more datasets to the experiment, either by uploading new CSV files or by browsing datasets already in the platform (including datasets produced by other experiments). You then map each dataset's columns to your variables and results in a single side-by-side editor.

This is the recommended way to enrich an experiment that is already configured, especially when you want to combine data from several prior runs or sources. The optimizer trains on the attached datasets from the next iteration onward.

For step-by-step UI instructions, see Attaching Historical Data to an Experiment.

Quick alternative — For a small, ad-hoc set of prior results, you can also upload a CSV in the recommendations view: download the current suggestions as a template, format your data to match, and upload it to replace the current batch. The optimizer then trains on that data in the next iteration. For anything larger or with more than one source, prefer the Historical Data section above.
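
The "format your data to match" step can be scripted. The following is a minimal sketch that assumes the downloaded template is a plain CSV whose header lists the variable and result columns. The actual template may contain additional columns, so check its header first. The column names, the `column_map` dictionary, and the function `format_to_template` are all hypothetical.

```python
import csv
import io


def format_to_template(template_header, historical_rows, column_map):
    """Reorder and rename historical columns so they match the template header.

    column_map -- {template column: historical column}; an assumption for this sketch.
    Raises KeyError if a template column has no mapping or the source column is absent.
    """
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=template_header, lineterminator="\n")
    writer.writeheader()
    for row in historical_rows:
        writer.writerow({col: row[column_map[col]] for col in template_header})
    return output.getvalue()


# Hypothetical template header, as read from the downloaded suggestions file.
template_header = ["temperature", "ph", "yield"]

# Hypothetical historical data with differently named columns.
historical_csv = """Temp (C),pH,Yield
25,7.0,0.52
40,6.5,0.61
"""
column_map = {"temperature": "Temp (C)", "ph": "pH", "yield": "Yield"}

historical_rows = list(csv.DictReader(io.StringIO(historical_csv)))
formatted = format_to_template(template_header, historical_rows, column_map)
print(formatted)
```

Failing loudly on an unmapped column is deliberate here: a silently dropped column would reach the optimizer as missing data.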


Tips for Best Results

  • Match the conditions — Historical data should come from the same or very similar experimental setup. Data from a different reactor, scale, or protocol may not transfer well.

  • Include all relevant columns — Provide values for all variables and results. Missing columns will be treated as missing data.

  • More data is generally better — That said, 10 high-quality, well-spread data points are worth more than 100 noisy points clustered in one region.

  • Supported formats — CSV (.csv) only.
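
A quick local check can catch missing columns and empty cells before you upload. This is a sketch under assumptions, not a platform feature: the required column names and the function `find_problems` are hypothetical, and you would replace the sample text with the contents of your own file.

```python
import csv
import io


def find_problems(csv_text, required_columns):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in required_columns if name not in header]
    for row in reader:
        line_number = reader.line_num  # line in the source text, header is line 1
        for name in required_columns:
            if name in header and not (row.get(name) or "").strip():
                problems.append(f"line {line_number}: empty value for {name}")
    return problems


# Hypothetical file: the "solvent" column is absent and one pH value is empty.
sample = """temperature,ph,yield
25,7.0,0.52
40,,0.61
"""

problems = find_problems(sample, ["temperature", "ph", "yield", "solvent"])
for problem in problems:
    print(problem)
```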

For step-by-step upload instructions, see Upload a Dataset.

