Case Study: Optimizing a Vegan Cheese Formulation

This case study walks through a multi-objective, multi-component formulation end to end, showing what the expert context looks like at setup, mid-run, and late in the optimization. It uses the vegan cheese formulation benchmark that ships with the platform — a real, multi-component food-science problem with three competing objectives and physical constraints on the recipe.

Formulations like this are different in character from reaction optimizations: the variables are mostly ratios, the objectives often pull in opposite directions, and the structure of the product comes from how the ingredients interact, not just from each one individually. Expert context shines here because so much of the relevant knowledge is about relationships between ingredients, which the AI cannot infer from bounds alone.

The Problem

Formulate a vegan cheese to maximize three quality properties simultaneously:

Elasticity — perceived bounce / stretch
Load-bearing capacity — resistance under compression (sliceability, structural integrity)
Melting score — how well the product melts on heating

Decision variables (all discrete, 0.5-unit increments, expressed as weight units in the recipe):

Component 1 — 0 to 4.0
Component 2 — 0 to 3.0
Component 3 — 0 to 3.5
Component 4 — 0 to 3.5
Fat 1 — 0 to 17.0
Fat 2 — 0 to 17.0
Powder — 0 to 3.5
Starch — 0 to 7.0

Constraints:

Fat 1 + Fat 2 ≤ 17 (total fat capped)
All eight components must sum exactly to 24 (the rest of the recipe is fixed)

Because the total mass is fixed, this is essentially a composition optimization: every unit added to one ingredient must be removed from one or more of the others. That coupling deserves an explicit mention in the expert context.
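The variable list and constraints above can be written down as a small feasibility check. This is a minimal sketch, not the platform's own representation: the snake_case ingredient names, the `violations` helper, and the example recipe are all illustrative assumptions, while the bounds, the 0.5-unit step, the fat cap, and the total of 24 come from the problem statement.

```python
# Bounds copied from the decision-variable list; names are illustrative.
BOUNDS = {
    "component_1": (0.0, 4.0),
    "component_2": (0.0, 3.0),
    "component_3": (0.0, 3.5),
    "component_4": (0.0, 3.5),
    "fat_1": (0.0, 17.0),
    "fat_2": (0.0, 17.0),
    "powder": (0.0, 3.5),
    "starch": (0.0, 7.0),
}
STEP = 0.5         # all variables move in 0.5-unit increments
TOTAL_MASS = 24.0  # the eight ingredients must sum to exactly this
FAT_CAP = 17.0     # Fat 1 + Fat 2 may not exceed this


def violations(recipe):
    """Return a list of constraint violations (empty if the recipe is feasible)."""
    problems = []
    for name, (low, high) in BOUNDS.items():
        value = recipe[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
        if value / STEP != round(value / STEP):
            problems.append(f"{name}={value} is not a multiple of {STEP}")
    if recipe["fat_1"] + recipe["fat_2"] > FAT_CAP:
        problems.append("total fat exceeds the 17-unit cap")
    if sum(recipe[name] for name in BOUNDS) != TOTAL_MASS:
        problems.append("ingredients do not sum to 24")
    return problems


# A hypothetical recipe, invented only to exercise the check.
example_recipe = {
    "component_1": 2.0, "component_2": 3.0, "component_3": 2.5, "component_4": 1.0,
    "fat_1": 6.0, "fat_2": 5.0, "powder": 2.0, "starch": 2.5,
}
print(violations(example_recipe))
```

Because every value is a multiple of 0.5, the sums are exact in floating point and the equality test on the total is safe here; a problem with finer increments would need a tolerance.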

Version 1 — Expert Context at Setup

The food scientist begins with a context that frames the product, describes each ingredient's role, and names the key interactions and trade-offs — without claiming optimal ratios, since the system is too complex for confident priors.

The goal is to formulate a vegan cheese by optimizing its composition to maximize three target properties simultaneously: Elasticity, Load-bearing capacity, and Melting score. These properties together determine consumer acceptability and functional performance (slicing, cooking, melt behaviour on pizza).

The total recipe mass is fixed (sum of all ingredients = 24, with total fat capped at 17). So this is fundamentally a composition trade-off: increasing any ingredient must come at the expense of others.

Ingredient roles:

Components 1–4: Structure-forming agents — likely proteins, gums, or hydrocolloids. They drive texture and water-binding. Different components likely contribute differently: some may favour elasticity (gel-formers), others load-bearing (rigid networks), others water retention.
Fat 1, Fat 2: Provide creaminess, melt behaviour, and flavour release. Different fats have distinct melting points and crystal structures — Fat 1 and Fat 2 are likely complementary, and a blend tends to give a richer melt profile than either alone.
Powder: Functional dry matter (protein isolate or similar). Adds body but in excess can make the product chalky or dry.
Starch: Drives meltability, viscosity, and stretch. Helps water retention upon heating, but too much produces a pasty, gummy texture.

Expected trade-offs between objectives:

Elasticity vs Load-bearing: often in tension — highly elastic formulations tend to deform under compression rather than resist it. Pure rigidity is brittle; pure elasticity collapses.
Melting score vs Load-bearing: products that melt well at heating temperatures usually have softer cold structure. A formulation that holds shape perfectly when cold may melt poorly.
Elasticity and Melting score are sometimes positively correlated through starch and fat content, but not always — over-formulating for melt can sacrifice elasticity through excess fat phase.

Key interactions:

Fat 1 and Fat 2 are likely complementary; blends typically outperform either alone for melting behaviour.
Components 1–4 can be synergistic in forming elastic networks but may compete for water — too many in combination at high doses risks phase separation.
Starch and Powder both contribute solids; together they can over-load the matrix and cause brittleness or phase separation.
Because of the fixed total mass constraint, fat fraction and non-fat solids are inversely coupled — a high-fat recipe is necessarily lower in structure-forming components, and vice versa.

Hypotheses to test early:

Some fat is essential (low-fat formulations are likely to fail on melt and mouthfeel) but very high fat (close to the 17-unit cap) risks structural collapse and oiling-off.
A modest amount of starch (probably non-zero, somewhere mid-range) is likely needed for melt, but excess will degrade texture.
A blended Fat 1/Fat 2 composition is likely to outperform Fat 1 alone or Fat 2 alone.

These are domain intuitions, not measurements. They should be revised once iterations have generated enough data to support or refute them.

What this version does well:

It frames the multi-objective nature explicitly. By calling out the Elasticity-vs-Load and Melting-vs-Load tensions, the food scientist tells the AI that this is a Pareto problem, not a single-peak optimization. The AI should propose candidates that sample different trade-offs, not just chase one objective.
It explicitly handles the fixed-mass constraint. Calling out the composition coupling tells the AI that "more of X means less of something else" — a key piece of reasoning when proposing recipe variations.
It encodes ingredient interactions, not just ingredient roles. The Fat 1 + Fat 2 complementarity, the Components-compete-for-water effect, and the Starch+Powder solids-loading effect are exactly the kind of relationships the AI cannot extract from bounds alone.
It frames priors as hypotheses, not facts. "Probably non-zero, somewhere mid-range" is honest about the level of confidence — strong enough to bias warm-start sampling, weak enough to be revisable.
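The "more of X means less of something else" reasoning can be made concrete by enumerating composition shifts: moves that take one 0.5-unit step from a donor ingredient and give it to a receiver, so the total stays at 24 by construction. The sketch below is an illustration under assumptions, not how the platform proposes candidates; the ingredient names, the `composition_shifts` helper, and the base recipe are hypothetical.

```python
from itertools import permutations

# Upper and lower bounds from the problem statement; names are illustrative.
BOUNDS = {
    "component_1": (0.0, 4.0), "component_2": (0.0, 3.0),
    "component_3": (0.0, 3.5), "component_4": (0.0, 3.5),
    "fat_1": (0.0, 17.0), "fat_2": (0.0, 17.0),
    "powder": (0.0, 3.5), "starch": (0.0, 7.0),
}
STEP = 0.5
FAT_CAP = 17.0


def composition_shifts(recipe):
    """Yield (donor, receiver, new_recipe) for every feasible one-step shift."""
    for donor, receiver in permutations(BOUNDS, 2):
        shifted = dict(recipe)
        shifted[donor] -= STEP      # take one step from the donor...
        shifted[receiver] += STEP   # ...and give it to the receiver
        within_bounds = (
            shifted[donor] >= BOUNDS[donor][0]
            and shifted[receiver] <= BOUNDS[receiver][1]
        )
        fat_ok = shifted["fat_1"] + shifted["fat_2"] <= FAT_CAP
        if within_bounds and fat_ok:
            yield donor, receiver, shifted


# Hypothetical base recipe (sums to 24), used only for illustration.
base = {
    "component_1": 2.0, "component_2": 3.0, "component_3": 2.5, "component_4": 1.0,
    "fat_1": 6.0, "fat_2": 5.0, "powder": 2.0, "starch": 2.5,
}
for donor, receiver, recipe in composition_shifts(base):
    if donor == "starch":
        print(f"starch -> {receiver}: total = {sum(recipe.values())}")
```

Framing a candidate as "move half a unit of Starch into Component 3" rather than "set Component 3 to 3.0" keeps the mass balance satisfied automatically and makes the trade explicit.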

After 6–8 Iterations: What the Data Showed

The first set of formulations and measurements yields a few clear patterns and some surprises:

Fat blending is confirmed. Formulations with a mix of Fat 1 and Fat 2 (roughly 5–8 units each) consistently produce higher Melting scores than formulations with one fat alone, even at the same total fat. The complementarity hypothesis is supported.
The Starch hypothesis is partially refuted. Mid-range Starch (3–4 units) does not perform as expected; the best Melting scores so far are coming from lower Starch (1.5–2.5 units). Higher Starch is hurting Load and Elasticity more than helping Melt.
Component 2 is unexpectedly important for Load. Across the Pareto front so far, runs with Component 2 near its upper bound (2.5–3.0) outperform runs with Component 2 near zero, holding other variables roughly constant. This wasn't anticipated.
The Elasticity / Load tension is real. The Pareto front shows a clear trade-off, with most "balanced" runs sitting in the middle. There is no recipe yet that scores well on both simultaneously.
Total fat closer to the 17-unit cap performs worse on Load. This suggests that fat is competing with structure-forming components for the limited mass budget.
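For readers less familiar with the term, the Pareto front referred to above is the set of runs that no other run beats on all three objectives at once. A minimal sketch of that filter follows; the score tuples are invented for illustration and are not measurements from the benchmark, and the function names are assumptions.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return indices of non-dominated score tuples (all objectives maximized)."""
    return [
        i for i, candidate in enumerate(scores)
        if not any(
            dominates(other, candidate)
            for j, other in enumerate(scores) if j != i
        )
    ]


# Hypothetical (elasticity, load_bearing, melting) scores, for illustration only.
scores = [
    (0.8, 0.3, 0.6),  # elastic, weak under load
    (0.4, 0.9, 0.3),  # strong under load, poor melt
    (0.6, 0.6, 0.5),  # balanced
    (0.5, 0.5, 0.4),  # worse than the balanced run on every objective
    (0.3, 0.4, 0.9),  # melts well, weak structure
]
print(pareto_front(scores))
```

The pairwise comparison is quadratic in the number of runs, which is not a concern at the scale of a few dozen formulations.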

The food scientist updates the expert context.

Version 2 — Expert Context Mid-Run

The background, ingredient-role descriptions, and constraint information are unchanged — the food science hasn't changed. The hypothesis layer is rewritten to reflect what the data has shown, and a "Current focus" block steers the next iterations.

[Background, ingredient roles, trade-offs, and key interactions sections unchanged from Version 1.]

What the data has shown so far.

Fat blending is confirmed: mixed Fat 1 / Fat 2 formulations outperform single-fat formulations on Melting score at the same total fat.
The earlier expectation that mid-range Starch (3–4 units) would be optimal for melt has not been borne out. The best Melting scores so far come from lower Starch (1.5–2.5 units); higher Starch hurts Load and Elasticity more than it helps Melt.
Component 2 is more important for Load-bearing than initially expected: high Component 2 (2.5–3.0) is a strong predictor of high Load. The role of Components 1, 3, and 4 is less clear and needs more exploration.
Total fat near the upper cap (close to 17) hurts Load — the fat-vs-structure mass competition is more aggressive than expected.
A clear Pareto trade-off between Elasticity and Load is emerging. No formulation yet excels on both.

Current focus. The next iterations should:

Explore the role of Components 1, 3, and 4 with Component 2 held at the high end of its range.
Probe the lower-Starch region (1–2.5 units) more densely, since this is where Melting score appears to be best.
Test whether the Elasticity / Load trade-off can be broken by adjusting the Components 1–4 ratio rather than total solids.
Avoid total-fat values near the 17-unit cap unless investigating Melt-dominant formulations.

What this version does well:

It updates priors with evidence. The "mid-range Starch is optimal" hypothesis was wrong, and it's been replaced rather than left in to keep pulling the AI in the wrong direction.
It flags an unexpected finding. Component 2's importance was a surprise — naming it explicitly tells the AI to treat it as a priority variable in future hypotheses.
It poses a directed question. "Can the Elasticity/Load trade-off be broken by adjusting the Components 1–4 ratio?" is a hypothesis the AI can attack on the next iteration. Multi-objective optimizations especially benefit from this kind of directed exploration, since the Pareto front is large.
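The layering used in Version 2, with stable background sections kept verbatim and only the evidence-driven blocks rewritten, can be sketched as a small assembly function. This is a hypothetical helper for keeping the text organized between iterations; it is not a platform API, and the section strings are abbreviated from the context shown earlier.

```python
def build_expert_context(stable_sections, findings, current_focus):
    """Join unchanged background text with the rewritten evidence layers."""
    parts = list(stable_sections)  # background, roles, trade-offs, interactions
    parts.append("What the data has shown so far.\n" + "\n".join(findings))
    parts.append(
        "Current focus. The next iterations should:\n" + "\n".join(current_focus)
    )
    return "\n\n".join(parts)


# Abbreviated, illustrative inputs based on the Version 2 text.
stable = [
    "Background: vegan cheese, three objectives, total mass fixed at 24, fat capped at 17.",
    "[Ingredient roles, trade-offs, and key interactions as in Version 1.]",
]
findings_v2 = [
    "Fat blending is confirmed.",
    "Best Melting scores so far come from lower Starch (1.5 to 2.5 units).",
    "High Component 2 (2.5 to 3.0) is a strong predictor of high Load.",
]
focus_v2 = [
    "Explore Components 1, 3, and 4 with Component 2 held at the high end.",
    "Probe the lower-Starch region more densely.",
]
context_v2 = build_expert_context(stable, findings_v2, focus_v2)
print(context_v2)
```

Keeping the stable sections separate from the findings makes it harder to leave a refuted hypothesis in place by accident, since the findings list is replaced wholesale at each update.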

After 12+ Iterations: Refining the Pareto Front

More iterations sharpen the picture:

The data reveals two distinct "good" regimes: a Melt-dominant formulation (higher fat blend, lower solids) and a Structure-dominant formulation (lower fat, higher Components 2 and 3, modest Starch).
Component 3 has emerged as the second most influential structural component after Component 2. Component 4 contributes much less than expected.
The Powder ingredient sits comfortably at 1.5–2.5 units across all the best formulations — its role is supportive but not differentiating.
A handful of recipes on the Pareto front simultaneously achieve good Elasticity and Load by trading Melting score; the food scientist wants to understand whether that trade can be relaxed.

Version 3 — Expert Context Late in the Run

[Background, ingredient roles, trade-offs, and key interactions sections unchanged.]

What the data has shown.

Two viable regimes have emerged on the Pareto front: a Melt-dominant formulation (Fat 1 + Fat 2 around 10–12 total, lower solids) and a Structure-dominant formulation (lower fat, Component 2 at 2.5–3.0, Component 3 at 2.5–3.0, moderate Starch).
Component 2 and Component 3 are the dominant structural levers; Component 4 contributes weakly.
Powder is comfortable at 1.5–2.5 across all good formulations; further exploration of this variable is low priority.
A small number of formulations achieve good Elasticity and Load simultaneously, at the cost of a moderate Melting score.

Current focus.

Refine the Structure-dominant regime by exploring small variations around the current best Component 2 / Component 3 / Starch values.
Test whether the Melt-dominant regime can be pushed to higher Elasticity without losing Melting score.
Investigate the "balanced" trio of formulations that score well on Elasticity and Load — can their Melting score be improved without sacrificing the other two?
Further exploration of Component 4, Powder, and very high Fat regimes is no longer useful.

What this version does well:

It names regimes, not just optima. Multi-objective problems rarely have a single answer — calling out the two distinct viable regimes lets the AI keep pushing both rather than collapsing to a single point.
It tells the AI what to stop exploring. Powder, Component 4, and extreme-fat regimes are explicitly deprioritized, which frees the remaining iteration budget for the regions that still hold uncertainty.
It asks targeted questions. "Can the balanced trio's Melting score be improved without sacrificing the other two?" is a concrete hypothesis the AI can attack with the next batch of recipes.
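Naming regimes also makes them easy to track in the results table. The sketch below labels a recipe using the ranges quoted in the Version 3 context: total fat of 10 to 12 units for Melt-dominant, and Components 2 and 3 at 2.5 or above for Structure-dominant. Treating "lower fat" as anything below 10 units is an assumption made here for illustration, as are the function name and the example recipes.

```python
def classify_regime(recipe):
    """Label a recipe with the regime it falls into, if any."""
    total_fat = recipe["fat_1"] + recipe["fat_2"]
    if 10.0 <= total_fat <= 12.0:
        return "melt-dominant"
    # Assumption: "lower fat" is taken to mean total fat below 10 units.
    if (
        total_fat < 10.0
        and recipe["component_2"] >= 2.5
        and recipe["component_3"] >= 2.5
    ):
        return "structure-dominant"
    return "unclassified"


# Hypothetical recipes (each sums to 24), invented for illustration.
melt_like = {
    "component_1": 2.0, "component_2": 3.0, "component_3": 2.5, "component_4": 1.0,
    "fat_1": 6.0, "fat_2": 5.0, "powder": 2.0, "starch": 2.5,
}
structure_like = {
    "component_1": 3.5, "component_2": 3.0, "component_3": 3.0, "component_4": 1.5,
    "fat_1": 4.5, "fat_2": 4.0, "powder": 2.0, "starch": 2.5,
}
print(classify_regime(melt_like), classify_regime(structure_like))
```

Counting how many recent candidates fall into each label is a quick way to check that neither regime is being neglected as the run continues.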

Takeaways from This Case Study

Formulation problems benefit enormously from interaction descriptions. Much of what makes a vegan cheese work is in how the ingredients combine, not in any one alone. Bounds and ranges convey almost none of that — expert context is the only place to encode it.
Multi-objective problems need explicit trade-off framing. Telling the AI that Elasticity and Load are in tension changes how it generates candidates — it will deliberately propose recipes that explore the Pareto front, rather than chasing a single composite metric.
Mass-balance constraints deserve a mention. When the variables must sum to a fixed total, the AI's hypotheses should be framed in terms of composition shifts, not absolute amounts. Calling that out in the context produces much more sensible candidates.
Surprises should be promoted to first-class context. The Component 2 finding was unexpected — adding it to the mid-run context shifted how the AI generated candidates for the rest of the run.
Late in a multi-objective run, naming regimes is more useful than naming optima. The food scientist may end up shipping two products from the same Pareto front. Telling the AI to maintain both regimes keeps it from collapsing too early.

This pattern — explicit trade-off framing, ingredient-interaction descriptions, regime-aware steering — applies to any multi-objective formulation problem, not just dairy alternatives. The variables and the chemistry change; the structure of the context does not.