The Unjournal · Plant-Based Meat Substitution Pivotal Question · Methodological Survey + Evaluation Synthesis
How this was made. Drafted by GPT Pro from existing Unjournal research and discussion (the elasticity-validation survey, the Bray et al. evaluation materials, and the PBM substitution literature), then built and polished into this interactive report in Claude Code. It is currently being reviewed and adjusted by hand. Treat figures and attributions as provisional until that review is complete; the governing evaluation lives on PubPub.
A combined survey of validation evidence for own-price and cross-price elasticities, the Bray et al. validation case, the plant-based-meat substitution literature, and how to frame the pivotal question and workshop around what the evidence can actually bear.
Often directionally reliable; sometimes useful for broad forecasting.
The evidence base is large and shows recognizable structure across foods and countries. But sharp validation is thinner than the volume of estimates suggests. Bray et al. find that standard observational scanner estimates fail badly against a randomized benchmark in their setting.
Much less reliable. More like a wide prior than a settled parameter.
Smaller, noisier, more numerous, more sensitive to aggregation and functional form. Meta-analyses show variation rather than convergence. For PBM-versus-meat, estimates change sign and magnitude across data sources and specifications.
The report runs from the methodological question (can these parameters be trusted?) through the sharpest available test (Bray et al.), into the specific PBM evidence, and out to what the workshop and the pivotal question should actually ask. Jump to any section.
A food-demand model can be internally coherent, statistically significant, and theoretically elegant while still failing the decision problem. These six tiers run from weak to decision-grade. For the PBM question, only the top tiers really count.
Each row is an estimand; each column a job we might ask it to do. Grades run A (trustworthy) to D (unreliable without direct validation). Reliability collapses moving left-to-right toward harder transport, and top-to-bottom toward the cross-price object that drives the welfare calculation.
| Estimand | Directional sign | Approx. magnitude | Forecast, same setting | Forecast after policy / large shift | PBM-relevant substitution |
|---|---|---|---|---|---|
The best-supported claim is modest: for broad food categories, own-price demand curves slope downward, and estimated own-price elasticities are often informative enough for rough forecasting and policy simulation.
This is backed by several strands. Meta-analyses of food demand find plausible own-price patterns across thousands of estimates and many countries. Demand-system forecasts at broad-category levels can perform reasonably out of sample. Policy shocks such as soda taxes produce consumption changes in the expected direction. And randomized food-price experiments, including virtual-supermarket settings, typically show the expected price responsiveness.
The stronger claim, that a standard observational scanner-data own-price elasticity for a specific item can reliably forecast the causal effect of a new price policy, is much less well supported. The problems are price endogeneity, promotional confounding, stockpiling and timing, functional-form misspecification, aggregation, and incomplete data coverage.
My assessment: broad-category own-price elasticities can strongly shape priors. Store- and category-level scanner estimates are usable with caution, ideally with holdout validation. SKU-level observational estimates need validation before being treated as decision-grade. And large price changes or new regimes call for extra care: a local elasticity around current prices may not forecast a 20% cut, parity, or a permanent supply-side shift.
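To make the last point concrete, here is a minimal sketch comparing the first-order rule of thumb (percent change in quantity ≈ elasticity × percent change in price) with the exact response along a constant-elasticity demand curve. The elasticity of −1.5 and the price cuts are illustrative inputs, not estimates, and the function names are hypothetical. Real demand need not be constant-elasticity at all, so the true gap could be larger or smaller than shown.

```python
def linear_approx_pct_change(elasticity: float, price_change_pct: float) -> float:
    """First-order rule: % change in quantity ~= elasticity * % change in price."""
    return elasticity * price_change_pct


def constant_elasticity_pct_change(elasticity: float, price_change_pct: float) -> float:
    """Exact % change in quantity if demand is q = A * p**elasticity everywhere."""
    price_ratio = 1.0 + price_change_pct / 100.0
    if price_ratio <= 0:
        raise ValueError("price cannot fall by 100% or more")
    return (price_ratio ** elasticity - 1.0) * 100.0


if __name__ == "__main__":
    eps = -1.5  # illustrative own-price elasticity, not an estimate
    for cut in (-1.0, -20.0, -50.0):
        approx = linear_approx_pct_change(eps, cut)
        exact = constant_elasticity_pct_change(eps, cut)
        print(f"price change {cut:6.1f}%  local rule {approx:7.2f}%  "
              f"constant-elasticity {exact:7.2f}%")
```

For a 1% cut the two calculations nearly coincide. For a 20% cut they already differ by roughly ten percentage points, before any question of whether the elasticity itself holds in the new price regime.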
Cross-price elasticities are the central object for PBM substitution, and they are much harder to estimate and validate than own-price elasticities, for at least five structural reasons.
Own-price effects are usually the largest in a demand system. Cross-effects are often much smaller and easily swamped by noise, seasonality, promotions, and household heterogeneity.
With N products there are N own-price and N(N−1) cross-price elasticities. A 20-product system has 380 cross-price terms — multiple comparisons, weak identification, and heavy dependence on structure.
Income effects, budgeting, two-stage decisions, trip incidence, and aggregation can produce unintuitive signs. PBM and beef might be substitutes for a dinner occasion but complements in a flexitarian household's basket.
Substitution depends on starting price, change size, permanence, awareness, quality, availability, and meal context. The effect of a 1% change near current prices may differ wildly from the effect of reaching price parity with ground beef.
Most datasets cover only grocery, missing restaurants and total food. And because the cross-price signal is weak, model assumptions matter a lot: AIDS, nested logit, multivariate logit, and random-coefficient models can produce different substitution matrices from the same data.
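The counting argument among the reasons above is easy to check. The sketch below tabulates own-price and cross-price terms for an unrestricted N-product system. The function name is hypothetical, and it assumes no symmetry or separability restrictions, which structural demand models typically impose to cut the count.

```python
def elasticity_counts(n_products: int) -> tuple[int, int]:
    """Return (own-price terms, cross-price terms) for an unrestricted system."""
    if n_products < 1:
        raise ValueError("need at least one product")
    own = n_products
    cross = n_products * (n_products - 1)  # every ordered pair of distinct products
    return own, cross


if __name__ == "__main__":
    for n in (2, 5, 20, 100):
        own, cross = elasticity_counts(n)
        print(f"N={n:3d}  own-price={own:3d}  cross-price={cross:5d}")
```

The cross-price count grows quadratically while the own-price count grows linearly, so the same dataset is spread across many more parameters per observation.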
Treat Bray et al. as part of the validation literature, not as a study of one Midwestern retailer. It is among the cleanest direct experimental benchmarks for observational scanner-data elasticity estimation: prices were randomized across product-store-weeks, and observational estimates were then asked to reproduce the experimental answer.
Average own-price elasticity during the experiment · bar length ∝ |elasticity| · public Kellogg summary
The gap survived controls for promotions, event-study windows, base-price changes, longer horizons, disaggregation, and several instruments. Standard observational fixes did not reconcile the estimates. Pre-period estimates were about −2.05 (test) and −1.63 (control), implying a large difference-in-differences gap.
The two Unjournal evaluations sharpen the reading. One raises a "margin puzzle": if product demand is really that inelastic, why doesn't the retailer raise prices more? This suggests the experiment may miss basket effects, store choice, and multiproduct pricing — so "experiment = gold standard" is too quick.
The second argues the difference-in-differences object is an elasticity, not a quantity, so changing the price support may simply produce a different local object — functional-form failure rather than pure observational bias. The lesson may be that constant-elasticity log-log models do not transport across price regimes.
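A stylized illustration of this second point: if true demand were linear rather than constant-elasticity, the local elasticity would depend on where in the price range it is measured, even with no observational bias at all. The linear demand curve, its parameters, and the function names below are hypothetical and are not taken from Bray et al. or the evaluations.

```python
def linear_demand(price: float, intercept: float = 100.0, slope: float = 10.0) -> float:
    """Hypothetical linear demand: q = intercept - slope * price."""
    return intercept - slope * price


def point_elasticity(price: float, intercept: float = 100.0, slope: float = 10.0) -> float:
    """Local own-price elasticity (dq/dp) * (p/q) along the linear curve."""
    quantity = linear_demand(price, intercept, slope)
    if quantity <= 0:
        raise ValueError("price is at or above the choke price")
    return -slope * price / quantity


if __name__ == "__main__":
    for p in (2.0, 5.0, 8.0):
        print(f"price {p:.1f}: quantity {linear_demand(p):5.1f}, "
              f"local elasticity {point_elasticity(p):.2f}")
```

Here the same demand curve yields a local elasticity of −0.25 at a price of 2 and −4.0 at a price of 8. A log-log model fitted on one price support would therefore report a different "elasticity" than one fitted on another, without either being biased for its own local object.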
The authors responded with clarifications and checks: Figure 2 estimates are OLS log-log (IV shown separately); compliance exceeded 95%; results are similar using transacted prices or instrumenting them; salience, stockout, and substitution-aware specifications were added; and a disaggregated analysis partitions by store group, period, starting price, and change magnitude. The gap remains in the narrower cells. They also softened the title from "cannot reproduce" to "does not reproduce."
Standard observational scanner-data elasticities can fail a sharp validation test, and we need to be much more explicit about the object, price support, time horizon, functional form, and validation target.
Synthesis · if own-price can fail, cross-price should be presumed more fragile until directly validated
The PBM cross-price literature is a case study in non-convergence: signs flip across studies, methods, and timeframes. Treat it as a set of partially informative, conflicting signals, not as one estimate plus sampling error. Pills show each study's estimated direction by category, coloured by the sign of the cross-price relationship:
The pivotal question should not be "what is the cross-price elasticity?" as if a single stable object were waiting to be estimated. It should be a counterfactual response function.
Under a supply-side price reduction for Impossible/Beyond-like products of X%, what is the distribution over changes in the full U.S. animal-product basket, over a defined horizon and channel?
That object decomposes into components that can each be elicited and updated separately:
A toy decomposition to make the structure tangible: set the PBM price cut and the diversion shares, and see the implied displacement. These are placeholder ranges for elicitation, not estimates. The point is the wiring, not the numbers.
On the defaults — weak, evidence-flavoured anchors, not estimates. The studies conflict even on the sign of PBM↔meat substitution (see the PBM evidence), so treat every number below as a placeholder to argue with. The reasoning: own-price ≈ −1.50 (Zhao et al. 2023); total displacement is set well below 1:1 (~40% of new PBM units) because PBM buyers are mostly omnivores yet household event studies find little measured drop in meat spending; beef takes the largest single share since today's products are mainly ground-beef analogues and Freitas-Groff, Meyer & Woolley estimate them as gross substitutes for beef; chicken — the welfare-critical category — is set low precisely because it is the worst-identified, with some studies finding complementarity. Hover any control for its basis.
Three deliberately simple relationships drive the readouts above. They are accounting identities for the sketch, not estimated behavioural equations.
A constant-elasticity step: a price cut of c% raises PBM quantity by roughly c × |ε| percent. This is exactly the local approximation the report warns may not transport to large cuts or new price regimes — see why cross-price is harder and the Bray et al. case.
The diversion sliders are the share of newly sold PBM units assumed to replace each animal category. The remainder is taken to come from non-animal foods or net-new consumption, not from meat.
Chicken displacement is weighted 5× beef as a stand-in for the small-animal / high-footprint concern (many more birds per pound; see the PBM evidence on why chicken effects matter most and are least identified). The weight is a placeholder, not an elicited welfare conversion.
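The three relationships above can be wired together in a few lines. This is a sketch of the accounting only, not the report's actual calculator. The −1.5 elasticity, the roughly 40% total displacement, and the 5× chicken weight follow the defaults described above. The split of the 40% across categories, the `other_animal` category and its weight of 1, the baseline of 1,000 units, and all function and parameter names are hypothetical placeholders.

```python
def toy_displacement(price_cut_pct, baseline_pbm_units,
                     own_price_elasticity=-1.5,
                     diversion_shares=None, welfare_weights=None):
    """Toy accounting: price cut -> new PBM units -> displaced animal units."""
    if diversion_shares is None:
        # Hypothetical split of the ~40% total displacement; beef largest, chicken low.
        diversion_shares = {"beef": 0.25, "chicken": 0.05, "other_animal": 0.10}
    if welfare_weights is None:
        # Chicken weighted 5x beef as a placeholder; other_animal weight is assumed.
        welfare_weights = {"beef": 1.0, "chicken": 5.0, "other_animal": 1.0}
    if sum(diversion_shares.values()) > 1.0:
        raise ValueError("diversion shares cannot sum to more than 1")

    # 1. Constant-elasticity step: a cut of c% raises PBM quantity by c * |eps| percent.
    pbm_pct_increase = price_cut_pct * abs(own_price_elasticity)
    new_pbm_units = baseline_pbm_units * pbm_pct_increase / 100.0

    # 2. Diversion: share of new PBM units assumed to replace each animal category.
    displaced = {cat: new_pbm_units * share for cat, share in diversion_shares.items()}
    from_non_animal = new_pbm_units - sum(displaced.values())

    # 3. Weighted readout: displaced units times the placeholder weights.
    weighted = sum(units * welfare_weights.get(cat, 1.0)
                   for cat, units in displaced.items())

    return {"new_pbm_units": new_pbm_units, "displaced": displaced,
            "from_non_animal": from_non_animal, "weighted_displacement": weighted}


if __name__ == "__main__":
    result = toy_displacement(price_cut_pct=20.0, baseline_pbm_units=1000.0)
    for key, value in result.items():
        print(key, value)
```

With a 20% cut and 1,000 baseline units, the sketch adds about 300 PBM units, of which about 120 displace animal products and 180 come from elsewhere. Every one of those figures moves directly with the placeholder inputs, which is the reason to elicit the inputs as distributions rather than read the outputs as forecasts.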
Weight rises with the directness of the causal variation and the completeness of the basket observed. Bar length is illustrative weight.
What evaluators and authors should be asked to do next. Expand each.
If we could commission one study, it would randomize the entire PBM subcategory price and measure the whole basket. The target output is not just elasticities but diversion ratios.
Target output: of the additional PBM units sold because of the price change, what fraction displaced each animal-product category?
Diversion ratios, not a single cross-price coefficient
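As a sketch of how that target output could be computed from such a study, the function below takes mean quantities per category in the control and treated arms and returns, for each animal category, the units lost per additional PBM unit sold. All quantities, category names, and the function name are hypothetical. The sketch assumes quantities are in comparable units (for example, pounds) and ignores sampling uncertainty, which a real analysis would need to report.

```python
def diversion_ratios(control_units: dict, treated_units: dict,
                     target: str = "pbm") -> dict:
    """Units displaced in each other category per additional unit of the target."""
    extra_target = treated_units[target] - control_units[target]
    if extra_target <= 0:
        raise ValueError("treatment did not increase target sales")
    return {
        category: (control_units[category] - treated_units[category]) / extra_target
        for category in control_units
        if category != target
    }


if __name__ == "__main__":
    # Hypothetical mean weekly units per store; not data from any study.
    control = {"pbm": 100.0, "beef": 1000.0, "chicken": 1200.0, "pork": 500.0}
    treated = {"pbm": 140.0, "beef": 988.0, "chicken": 1202.0, "pork": 498.0}
    for category, ratio in diversion_ratios(control, treated).items():
        print(f"{category:8s} diversion ratio {ratio:+.2f}")
```

In this made-up example, 30% of the additional PBM units displace beef, while the chicken ratio is negative, meaning chicken purchases rose slightly in the treated arm. A negative ratio is how complementarity would show up in this output.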
The meta-question is central: are the methods strong enough to identify the cross-product substitution that drives the welfare calculation? For own-price, often yes, with caveats. For cross-price, only with strong design and validation. For PBM-to-animal substitution, not yet with enough precision to support a narrow point estimate.
Existing food-demand methods are informative, but the evidence that they deliver sharp, transportable cross-price elasticities is weak.