The Unjournal · Plant-Based Meat Substitution Pivotal Question · Methodological Survey + Evaluation Synthesis

How this was made. Drafted by GPT Pro from existing Unjournal research and discussion (the elasticity-validation survey, the Bray et al. evaluation materials, and the PBM substitution literature), then built and polished into this interactive report in Claude Code. It is currently being reviewed and adjusted by hand. Treat figures and attributions as provisional until that review is complete; the governing evaluation lives on PubPub.

Can a food-demand elasticity carry the weight we put on it?

A combined survey of validation evidence for own-price and cross-price elasticities, the Bray et al. validation case, the plant-based-meat substitution literature, and how to frame the pivotal question and workshop around what the evidence can actually bear.

◢ Own-price elasticities

Often directionally reliable; sometimes useful for broad forecasting.

The evidence base is large and shows recognizable structure across foods and countries, but sharp validation is thinner than the volume of estimates suggests. Bray et al. find that standard observational scanner estimates fail badly against a randomized benchmark in their setting.

Contents

How to read this report

The report runs from the methodological question (can these parameters be trusted?) through the sharpest available test (Bray et al.), into the specific PBM evidence, and out to what the workshop and the pivotal question should actually ask.

01 / Standards

What would count as validation?

A food-demand model can be internally coherent, statistically significant, and theoretically elegant while still failing the decision problem. These six tiers run from weak to decision-grade. For the PBM question, only the top tiers really count.

For the pivotal question the most useful test is not "does the coefficient replicate?" but: train the model before the intervention, predict the animal-product basket response to an exogenous PBM price change, then compare to actual post-intervention meat, fish, dairy, egg, and non-meat purchases across grocery and food-away-from-home.
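
A minimal sketch of that train / predict / compare loop, reduced to one category and one price for brevity. The data, the log-log functional form, and the helper names (`fit_loglog_elasticity`, `holdout_error`) are illustrative assumptions, not anything taken from the studies discussed here; a real test would score the whole animal-product basket across channels.

```python
import math


def fit_loglog_elasticity(prices, quantities):
    """OLS fit of log(quantity) on log(price); the slope is a constant elasticity."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def holdout_error(pre_prices, pre_quantities, post_price, post_quantity):
    """Train on pre-intervention data only, then score the post-intervention forecast."""
    slope, intercept = fit_loglog_elasticity(pre_prices, pre_quantities)
    predicted = math.exp(intercept + slope * math.log(post_price))
    pct_error = 100.0 * (predicted - post_quantity) / post_quantity
    return {
        "elasticity": slope,
        "predicted": predicted,
        "actual": post_quantity,
        "pct_error": pct_error,
    }


# Hypothetical pre-period data, generated from q = 100 * p^-2 (not real scanner data).
pre_prices = [10.0, 9.5, 10.5, 9.0, 11.0]
pre_quantities = [100.0 * p ** -2 for p in pre_prices]

# Hypothetical post-intervention observation at a price outside the training range.
result = holdout_error(pre_prices, pre_quantities, post_price=8.0, post_quantity=1.8)
print(result)
```

The validation question is the size of `pct_error` on data the model never saw, not how well the slope fits the pre-period.
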
02 / The grading

A report card for the literature

Each row is an estimand; each column a job we might ask it to do. Grades run A (trustworthy) to D (unreliable without direct validation). Reliability collapses moving left-to-right toward harder transport, and top-to-bottom toward the cross-price object that drives the welfare calculation.

Columns: Estimand · Directional sign · Approx. magnitude · Forecast, same setting · Forecast after policy / large shift · PBM-relevant substitution

Grades: A / B+ (trust) · B / B- (usable with care) · C (weak; prior only) · D (unreliable unvalidated)

03 / Own-price

Own-price elasticities: what is reliable?

The best-supported claim is modest: for broad food categories, own-price demand curves slope downward, and estimated own-price elasticities are often informative enough for rough forecasting and policy simulation.

This is backed by several strands. Meta-analyses of food demand find plausible own-price patterns across thousands of estimates and many countries. Demand-system forecasts at broad-category levels can perform reasonably out of sample. Policy shocks such as soda taxes produce consumption changes in the expected direction. And randomized food-price experiments, including virtual-supermarket settings, typically show price responsiveness as expected.

The stronger claim is much weaker: that a standard observational scanner-data own-price elasticity for a specific item can reliably forecast the causal effect of a new price policy. The problems are price endogeneity, promotional confounding, stockpiling and timing, functional-form misspecification, aggregation, and incomplete data coverage.

My assessment: broad-category own-price elasticities can strongly shape priors. Store- and category-level scanner estimates are usable with caution, ideally with holdout validation. SKU-level observational estimates need validation before being treated as decision-grade. And large price changes or new regimes call for extra care: a local elasticity around current prices may not forecast a 20% cut, parity, or a permanent supply-side shift.

04 / Cross-price

Cross-price elasticities: why they are harder

Cross-price elasticities are the central object for PBM substitution, and they are much harder than own-price elasticities for at least five structural reasons.

1. The signal is smaller

Own-price effects are usually the largest in a demand system. Cross-effects are often much smaller and easily swamped by noise, seasonality, promotions, and household heterogeneity.

2. The matrix is high-dimensional

With N products there are N own-price and N(N−1) cross-price elasticities. A 20-product system has 380 cross-price terms — multiple comparisons, weak identification, and heavy dependence on structure.
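
The counting argument is easy to check directly. A small sketch (the function name is ours, for illustration only):

```python
def elasticity_counts(n_products):
    """Number of own-price and cross-price elasticities in an N-product system."""
    own = n_products
    cross = n_products * (n_products - 1)  # every ordered pair of distinct products
    return own, cross


for n in (5, 20, 50):
    own, cross = elasticity_counts(n)
    print(f"{n} products: {own} own-price terms, {cross} cross-price terms")
```

The cross-price count grows roughly with the square of the number of products, while the data available to identify each term does not.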

3. Sign interpretation is tricky

Income effects, budgeting, two-stage decisions, trip incidence, and aggregation can produce unintuitive signs. PBM and beef might be substitutes for a dinner occasion but complements in a flexitarian household's basket.

4. Cross-effects are local

Substitution depends on starting price, change size, permanence, awareness, quality, availability, and meal context. The effect of a 1% change near current prices may differ wildly from the effect of reaching parity with ground beef.

5. Coverage of the full substitution set is rare, and structure dominates

Most datasets cover only grocery purchases, missing restaurants and therefore total food consumption. And because the cross-price signal is weak, model assumptions matter a lot: AIDS, nested logit, multivariate logit, and random-coefficient models can produce different substitution matrices from the same data.

05 / The sharpest test

The Bray et al. validation case

Treat this as part of the validation literature, not a study of one Midwestern retailer. It is among the cleanest direct experimental benchmarks for observational scanner-data elasticity estimation: prices randomized across product-store-weeks, then observational estimates asked to reproduce the experimental answer.

Average own-price elasticity during the experiment (public Kellogg summary; bar length ∝ |elasticity|): experimental −0.34, observational −1.97.

Design: 82 test stores and 34 control stores; 389,890 prices set over 35 weeks.

The gap survived controls for promotions, event-study windows, base-price changes, longer horizons, disaggregation, and several instruments. Standard observational fixes did not reconcile the estimates. Pre-period estimates were about −2.05 (test) and −1.63 (control), implying a large difference-in-differences gap.

The two Unjournal evaluations sharpen the reading. One raises a "margin puzzle": if product demand is really that inelastic, why doesn't the retailer raise prices more? This suggests the experiment may miss basket effects, store choice, and multiproduct pricing — so "experiment = gold standard" is too quick.

The second argues the difference-in-differences object is an elasticity, not a quantity, so changing the price support may simply produce a different local object — functional-form failure rather than pure observational bias. The lesson may be that constant-elasticity log-log models do not transport across price regimes.
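
The second evaluator's point can be illustrated with a toy demand curve. The sketch below assumes a hypothetical linear demand function (the intercept and slope are made up, not taken from Bray et al.) and shows that one and the same curve implies very different point elasticities at different prices, so a log-log model fitted on one price support need not agree with one fitted on another.

```python
def linear_demand(price, intercept=100.0, slope=2.0):
    """Hypothetical linear demand: quantity = intercept - slope * price."""
    return intercept - slope * price


def point_elasticity(price, intercept=100.0, slope=2.0):
    """Point elasticity (dQ/dP) * (P/Q) along the linear demand curve."""
    quantity = linear_demand(price, intercept, slope)
    return -slope * price / quantity


# Same demand curve, two price regimes.
for price in (10.0, 40.0):
    print(f"price {price:>5.1f}: elasticity {point_elasticity(price):.2f}")
```

Here demand is inelastic at the low price and elastic at the high price, with no bias or confounding anywhere in the setup.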

The authors responded with clarifications and checks: Figure 2 estimates are OLS log-log (IV shown separately); compliance exceeded 95%; results are similar using transacted prices or instrumenting them; salience, stockout, and substitution-aware specifications were added; and a disaggregated analysis partitions by store group, period, starting price, and change magnitude. The gap remains in the narrower cells. They also softened the title from "cannot reproduce" to "does not reproduce."

Standard observational scanner-data elasticities can fail a sharp validation test, and we need to be much more explicit about the object, price support, time horizon, functional form, and validation target.

Synthesis · if own-price can fail, cross-price should be presumed more fragile until directly validated

Evaluator-attributed material here is described by role only ("one evaluator," "a second evaluator"). No names. The live, governing version of this evaluation lives on PubPub; this page is a working synthesis.
06 / PBM evidence

Five studies that don't converge

The PBM cross-price literature is a case study in non-convergence: signs flip across studies, methods, and timeframes. Treat it as a set of partially informative, conflicting signals, not as one estimate plus sampling error. Pills show each study's estimated direction by category, coloured by the sign of the cross-price relationship:

substitute (green): cross-price positive; a PBM price cut reduces this category (the welfare-relevant direction)
complement (red): cross-price negative; a PBM price cut raises this category
no clear effect (grey): estimate near zero or not statistically distinguishable

Synthesis for PBM substitution

07 / The reframe

Stop asking for one elasticity

The pivotal question should not be "what is the cross-price elasticity?" as if a single stable object were waiting to be estimated. It should be a counterfactual response function.

Under a supply-side price reduction for Impossible/Beyond-like products of X%, what is the distribution over changes in the full U.S. animal-product basket, over a defined horizon and channel?

That object decomposes into components that can each be elicited and updated separately:

08 / Interactive sketch

Parameter dashboard

A toy decomposition to make the structure tangible: set the PBM price cut and the diversion shares, and see the implied displacement. These are placeholder ranges for elicitation, not estimates. The point is the wiring, not the numbers.

On the defaults: these are weak, evidence-flavoured anchors, not estimates. The studies conflict even on the sign of PBM↔meat substitution (see the PBM evidence), so treat every number below as a placeholder to argue with. The reasoning:

own-price ≈ −1.50 (Zhao et al. 2023);
total displacement is set well below 1:1 (~40% of new PBM units) because, although PBM buyers are mostly omnivores, household event studies find little measured drop in meat spending;
beef takes the largest single share, since today's products are mainly ground-beef analogues and Freitas-Groff, Meyer & Woolley estimate them as gross substitutes for beef;
chicken, the welfare-critical category, is set low precisely because it is the worst-identified, with some studies finding complementarity.

PBM price cut (supply-side, %): 20%
PBM own-price elasticity (|response|): −1.50
Diversion → beef (share of new PBM units): 20%
Diversion → chicken (welfare-critical): 10%
Diversion → other animal (pork, fish, eggs, dairy): 10%
PBM purchase increase: +30%
Share of new PBM units that displace animal products: 40% (20% + 10% + 10%)
Welfare-weighted index (chicken weighted heavily): 0.80
Illustrative only. Welfare index weights chicken displacement at 5× beef as a stand-in for the small-animal / high-footprint concern; remainder of new PBM units assumed to come from non-animal foods or added consumption. Replace with elicited distributions in the live tool.
The equations behind the sketch

Three deliberately simple relationships drive the readouts above. They are accounting identities for the sketch, not estimated behavioural equations.

PBM purchase increase (%) = price cut (%) × |own-price elasticity| (e.g. 20% × 1.50 = +30%)

A constant-elasticity step: a price cut of c% raises PBM quantity by roughly c × |ε| percent. This is exactly the local approximation the report warns may not transport to large cuts or new price regimes — see why cross-price is harder and the Bray et al. case.
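
To see how far the local step can drift, the sketch below compares it with the exact quantity change implied by a constant-elasticity demand curve, q ∝ p^ε, at the same elasticity. Both functions are illustrative helpers, and the constant-elasticity curve is itself an assumption that may not hold over a large cut.

```python
def linear_step(price_cut_pct, abs_elasticity):
    """Local approximation used in the sketch: percent cut times |elasticity|."""
    return price_cut_pct * abs_elasticity


def exact_constant_elasticity(price_cut_pct, abs_elasticity):
    """Exact percent change in quantity if demand is q = A * p^(-|elasticity|)."""
    price_ratio = 1.0 - price_cut_pct / 100.0
    return 100.0 * (price_ratio ** (-abs_elasticity) - 1.0)


for cut in (1.0, 20.0, 50.0):
    approx = linear_step(cut, 1.5)
    exact = exact_constant_elasticity(cut, 1.5)
    print(f"{cut:>4.0f}% cut: linear step {approx:.1f}%, exact {exact:.1f}%")
```

For a 1% cut the two agree closely; for a 20% cut at |ε| = 1.50 the exact curve gives close to +40% rather than +30%, and the gap widens further for larger cuts.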

Animal-product displacement (%) = beef share + chicken share + other-animal share (capped at 100% of the new PBM units)

The diversion sliders are the share of newly sold PBM units assumed to replace each animal category. The remainder is taken to come from non-animal foods or net-new consumption, not from meat.

Welfare-weighted index = (beef × 1 + chicken × 5 + other × 1) ÷ 100

Chicken displacement is weighted 5× beef as a stand-in for the small-animal / high-footprint concern (many more birds per pound; see the PBM evidence on why chicken effects matter most and are least identified). The weight is a placeholder, not an elicited welfare conversion.
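
The three identities can be wired together in a few lines. This is a sketch of the accounting only; the function and argument names are ours, and the chicken weight of 5 is the placeholder described above, not an elicited value.

```python
def dashboard(price_cut_pct, abs_own_elasticity,
              beef_pct, chicken_pct, other_pct, chicken_weight=5.0):
    """Compute the three sketch readouts from the slider values (all in percent)."""
    # Constant-elasticity step: percent cut times |own-price elasticity|.
    pbm_increase_pct = price_cut_pct * abs_own_elasticity
    # Share of new PBM units displacing animal products, capped at 100%.
    displacement_pct = min(beef_pct + chicken_pct + other_pct, 100.0)
    # Welfare-weighted index: beef and other weighted 1, chicken weighted heavily.
    welfare_index = (beef_pct * 1.0 + chicken_pct * chicken_weight
                     + other_pct * 1.0) / 100.0
    return {
        "pbm_increase_pct": pbm_increase_pct,
        "displacement_pct": displacement_pct,
        "welfare_index": welfare_index,
    }


# Slider defaults from the sketch: 20% cut, |elasticity| 1.50, diversion 20/10/10.
readouts = dashboard(20.0, 1.5, beef_pct=20.0, chicken_pct=10.0, other_pct=10.0)
print(readouts)
```

With the listed defaults these identities give a +30% PBM purchase increase, 40% displacement, and a welfare-weighted index of 0.80.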

09 / Workshop design

Designing the elicitation

The evidence ladder

Weight rises with the directness of the causal variation and the completeness of the basket observed. Bar length is illustrative weight.

Pre-register the estimands

Forecaster / evaluator uncertainty questions

10 / The agenda

Sharp tests to prioritize

What evaluators and authors should be asked to do next.

11 / Blueprint

The ideal PBM price experiment

If we could commission one study, it would randomize prices across the entire PBM subcategory and measure the whole basket. The target output is not just elasticities but diversion ratios.

Target output: of the additional PBM units sold because of the price change, what fraction displaced each animal-product category?

Diversion ratios, not a single cross-price coefficient
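
One way to compute that target output from experimental data, as a sketch. It assumes quantities are measured in a common unit (for example servings or kilograms) and that each change is a causal treated-minus-control difference; the function name and all numbers are hypothetical.

```python
def diversion_ratios(delta_pbm_units, delta_by_category):
    """Fraction of additional PBM units that displaced each category.

    delta_pbm_units: increase in PBM units caused by the price cut (> 0).
    delta_by_category: causal change in units for each animal-product
        category (negative means the category fell).
    """
    if delta_pbm_units <= 0:
        raise ValueError("diversion ratios need a positive PBM response")
    return {
        category: -delta / delta_pbm_units
        for category, delta in delta_by_category.items()
    }


# Hypothetical experimental differences, in the same unit as the PBM change.
ratios = diversion_ratios(
    delta_pbm_units=1000.0,
    delta_by_category={"beef": -200.0, "chicken": -100.0, "other_animal": -100.0},
)
total_displacement = sum(ratios.values())
print(ratios, total_displacement)
```

A negative ratio would indicate a category that rose alongside PBM, which is the complementarity case some studies report for chicken.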

12 / For the project

The practical conclusion

The meta-question is central: are the methods strong enough to identify the cross-product substitution that drives the welfare calculation? For own-price, often yes, with caveats. For cross-price, only with strong design and validation. For PBM-to-animal substitution, not yet with enough precision to support a narrow point estimate.

Existing food-demand methods are informative, but the evidence that they deliver sharp, transportable cross-price elasticities is weak.

Continue the empirical synthesis, but represent uncertainty broadly.
Do not collapse the evidence into a single cross-price elasticity without a validation discount.
Commission and evaluate work that directly compares methods against held-out causal variation.
Frame the workshop around forecasts of animal-product basket changes, not just elasticities.
Consider whether funding a validation experiment carries high value of information.