Writing Effective Expert Context

Expert context is the place where you tell the AI what you already know about your problem. Treat it as a useful supplement to the optimization — additional information that can help, but that does not replace real experimental data. The optimizer always weighs the measurements it has collected more heavily than what you wrote in the context, and if the two disagree the data wins. Well-written context shifts the odds, especially early on when data is scarce; it does not override what the experiments are telling the model.

It also shapes the insights attached to every iteration. Those insights are written by an LLM that is given your expert context as part of its prompt — so the more grounded and specific your context, the more your insights read like a knowledgeable colleague explaining the results rather than a generic description of the search space. Writing good context pays off twice: better suggestions, and better explanations of those suggestions.

This article focuses on the structure of a good expert context: what to put in, what to leave out, and short illustrative snippets you can pattern-match against. For full end-to-end case studies showing how an expert context looks at setup, mid-run, and late in a real optimization, see:

Case Study: Optimizing a Suzuki Coupling Reaction — a single-objective reaction-yield optimization.
Case Study: Optimizing a Vegan Cheese Formulation — a multi-objective, multi-component formulation with constraints.

What Belongs in Expert Context

Think of expert context as everything a knowledgeable colleague would tell a new collaborator about the problem before they start running experiments. In practice, the most useful pieces are:

The problem and objective. A one- or two-sentence framing of what is being optimized and why. Mention the chemistry, process, or physical system in plain language.
What each variable means and how it acts. Describe the role of each parameter qualitatively — what it controls, what changing it tends to do. Avoid stating the optimum value; describe the mechanism instead.
Known ranges, sweet spots, or dead zones. If literature or past experience suggests a band where good results tend to appear, say so. If certain regions are known to fail (e.g. decomposition above a temperature, immiscibility in a solvent), state that too — negative knowledge is just as valuable as positive knowledge.
Interactions and dependencies. Mention variables that influence each other (e.g. "catalyst performance depends strongly on solvent polarity"). Also call out variables that you believe are essentially independent — that helps the AI avoid hunting for relationships that aren't there.
Sensitivities and constraints not captured by the bounds. Things like "small changes in additive loading have outsized effects" or "side reactions become significant above 50 °C" belong here.
Your intuitions and prior experience. If you have hunches from previous experiments — even ones you can't fully justify scientifically — they often carry information. Flag them as intuitions so the AI knows their epistemic weight.

What to Leave Out

The answer. Do not write down the exact optimal point or a tightly pinned optimum region. The whole purpose of running the optimization is to find that. Stating a specific optimum biases the AI toward it and undercuts exploration. A broad band drawn from literature or past experience is still welcome; a single best setting is not.
Restating the bounds. The variable ranges, types, and step sizes are already part of the experiment setup. There is no need to repeat them here.
Vague statements. "Moderate temperature works well" is much less useful than "Yields tend to be highest in the 80–120 °C window because of competing decomposition above that range." Specificity buys you targeted exploration.
Information unrelated to the optimization. Lab logistics, vendor names, or budget notes don't help the AI reason about the science. Keep the context focused on what could shift a hypothesis.
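
As a rough illustration of the specificity point above, the sketch below flags sentences in a draft that use a vague qualifier and contain no number at all. It is only a heuristic for self-review, not something the platform runs: the function name `flag_vague_sentences` and the word list `VAGUE_WORDS` are hypothetical, chosen for this example.

```python
import re

# Hypothetical list of qualifiers that often signal a vague statement.
VAGUE_WORDS = ("moderate", "reasonable", "works well", "fairly", "somewhat")


def flag_vague_sentences(context: str) -> list[str]:
    """Return sentences that use a vague qualifier and contain no digit."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_vague_word = any(word in lowered for word in VAGUE_WORDS)
        has_number = re.search(r"\d", sentence) is not None
        if has_vague_word and not has_number:
            flagged.append(sentence)
    return flagged


draft = (
    "Moderate temperature works well. "
    "Yields tend to be highest in the 80–120 °C window because of "
    "competing decomposition above that range."
)
flagged = flag_vague_sentences(draft)
print(flagged)
```

A flagged sentence is a prompt to add a range, a threshold, or a mechanism. Some qualitative statements are perfectly fine as written, so treat the result as a reading aid rather than a rule.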

How to Format It

There is no required structure, but two patterns tend to read well:

A short narrative paragraph framing the problem and objective, followed by a per-variable section describing what each one does.
A paragraph on the problem, then a "Key Interactions" or "Sensitivities" section pulling out the cross-variable effects.

You have up to 4,000 characters — enough room for a thorough description without writing a paper. If you run out, prioritize the per-variable descriptions and key interactions; those are what the AI uses most when generating candidates.
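
One way to apply that prioritization when drafting offline is sketched below. It assumes you keep your context as separate sections, each tagged with a priority number of your own choosing (lower means more important). The helper `assemble_context` and the priority values are hypothetical; only the 4,000-character limit comes from this article.

```python
MAX_CHARS = 4000  # character limit stated in this article


def assemble_context(sections: list[tuple[int, str]], limit: int = MAX_CHARS) -> str:
    """Keep the most important sections that fit, then join them in their original order.

    Each section is (priority, text); a lower priority number means more important.
    """
    separator = "\n\n"
    kept: set[int] = set()
    used = 0
    # Visit sections from most to least important; ties keep their original order.
    for index in sorted(range(len(sections)), key=lambda i: sections[i][0]):
        text = sections[index][1]
        cost = len(text) + (len(separator) if kept else 0)
        if used + cost <= limit:
            kept.add(index)
            used += cost
    # Emit the surviving sections in the order they were written.
    return separator.join(sections[i][1] for i in sorted(kept))


# Assumed priorities: per-variable notes and interactions first, framing next,
# loose intuitions last.
sections = [
    (2, "Problem: maximize isolated yield of the biaryl product."),
    (1, "Temperature: side reactions become significant above 50 °C."),
    (1, "Interactions: catalyst performance depends strongly on solvent polarity."),
    (3, "Intuition (unverified): small changes in additive loading have outsized effects."),
]
context = assemble_context(sections)
print(len(context), "characters")
```

Dropping whole sections keeps every remaining sentence intact. Cutting the text off at the limit could leave a half-finished claim in the context.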

Use Cases — Short Patterns

Expert context can do several distinct jobs, and most real-world contexts combine a few of them. Here are short patterns showing how each job looks on its own — pick the ones that match what you actually know.

1. Pure problem framing (minimal context)

When to use: You're exploring a new system and have no firm intuitions yet. You want to give the AI a domain anchor without claiming knowledge you don't have.

We are optimizing a Suzuki–Miyaura cross-coupling between an aryl bromide and a boronic ester. The objective is to maximize isolated yield of the biaryl product. The decision variables are catalyst loading, base equivalents, solvent (DMF, dioxane, toluene), and temperature. We have no specific prior expectations about which conditions will perform best in this substrate combination.

Even this much is enough to ground the warm-start sampling and make insights mention "biaryl product" and "cross-coupling" rather than abstract variable names.

2. Encoding literature or prior-work precedents

When to use: Published work on the same or similar systems suggests where good results tend to appear. You want the AI to start near those regions while still exploring around them.

Published results on related Pd-catalysed couplings report best yields with Pd loadings in the 1–5 mol% range and mildly basic conditions (1.5–3 equiv of base). Toluene and dioxane tend to outperform DMF for electron-poor aryl halides; for the present substrate, this is plausible but not confirmed.

Notice the explicit "plausible but not confirmed" qualifier — it tells the AI to use this as a starting hint, not a hard prior.

3. Encoding negative knowledge and failure modes

When to use: You know regions of the space that are unsafe, unstable, or already known to fail. Telling the AI explicitly is more reliable than hoping it figures it out from the data.

Temperatures above 80 °C trigger noticeable substrate decomposition and the reaction stops being clean. Yields at very low catalyst loading (< 0.5 mol%) tend to stall around 10–20% and are unlikely to improve further. Toluene at temperatures near its boiling point should also be avoided due to evaporation issues in our setup.

Negative knowledge is one of the most under-used kinds of expert context — users tend to write what they want and forget what they already know doesn't work.

4. Variable interactions and couplings

When to use: You believe certain variables can't be optimized independently — their effects depend on each other. This is one of the highest-value things to communicate, because it shapes how the AI explores cross-parameter combinations.

Solvent and catalyst should be considered jointly: certain catalysts are only effective in polar aprotic solvents (DMF, NMP), while others require lower-polarity environments (toluene, dioxane). Temperature interacts with base loading — higher temperatures need less base to achieve the same conversion. Substrate equivalents are essentially independent of the other variables in our explored range.

Calling out independent variables alongside dependent ones is just as useful: it tells the AI not to spend search effort hunting for relationships that aren't there.

5. A hypothesis you want tested early

When to use: You have a specific guess you'd like the AI to probe in the first iterations, before exploring more broadly. Framing it as a hypothesis (not as a fact) keeps it revisable once data comes in.

Hypothesis to test early. We suspect that pyrrolidine in DMF at moderate temperature will be one of the strongest catalyst/solvent combinations for this transformation, based on closely related precedents. It is worth testing this combination explicitly in the first few iterations before the optimizer commits to broader exploration. If early data does not support this, this section should be revised.

The last sentence is important: it primes you to come back and edit if the data points elsewhere, which avoids the "intuition that won't die" failure mode. The Suzuki coupling case study shows exactly this pattern playing out across three iterations.

6. Mid-run steering (used in combination with the patterns above)

When to use: The experiment is already running and the data is starting to suggest something. You want to nudge the next iterations toward a region or behaviour without rewriting the whole context.

Current focus. Recent iterations suggest that the low-temperature, high-catalyst-loading corner is the most promising region. Prioritize further sampling there over broader exploration. The earlier hypothesis that toluene would outperform DMF has not been supported by the data and should be deprioritized.

Keep this kind of steering content at the bottom of the field, below the stable background context — that way the foundation stays intact while you revise direction iteration to iteration. See Updating Expert Context During a Run for the full picture, and both case studies for examples of how the steering layer evolves alongside the data.
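
If you draft your context in a script or notebook before pasting it into the field, the layering described above can be sketched as a small helper that replaces only the trailing steering paragraph. This is an illustration under assumptions: `update_steering` is a hypothetical name, and the code assumes the steering block is the last thing in the text and starts with the "Current focus." label used in the snippet above.

```python
STEERING_HEADER = "Current focus."  # label taken from the steering snippet above


def update_steering(context: str, new_steering: str) -> str:
    """Replace the trailing steering paragraph and leave the background untouched."""
    # Everything before the first steering header is treated as stable background.
    background, _, _ = context.partition(STEERING_HEADER)
    background = background.rstrip()
    steering = f"{STEERING_HEADER} {new_steering.strip()}"
    return f"{background}\n\n{steering}" if background else steering


background = (
    "We are optimizing a Suzuki–Miyaura cross-coupling between an aryl bromide "
    "and a boronic ester. The objective is to maximize isolated yield."
)
after_first_update = update_steering(
    background, "Prioritize the low-temperature, high-catalyst-loading corner."
)
after_second_update = update_steering(
    after_first_update, "The toluene hypothesis has not been supported; deprioritize it."
)
print(after_second_update)
```

Because the background is never rewritten, each revision changes only the final paragraph, which makes it easy to compare the steering from one iteration to the next.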

Quick Checklist Before You Save

Have I described what the system is, not just what the inputs are?
Have I explained each variable's effect rather than its optimal value?
Have I mentioned known interactions between variables?
Have I included negative knowledge — regions or conditions that are known to fail?
Have I flagged intuitions as intuitions, not as facts?
Are my units consistent with the variable definitions in the experiment setup?

If yes to all of those, your expert context is doing real work. When you're ready to see this applied end-to-end, the Suzuki coupling case study and the vegan cheese formulation case study show what each pattern above looks like inside a real optimization run.