The Program

One Instrument, Four Questions

These proposals share a single backbone. The garbling gym's default payoff hands the sender +10 for any sale regardless of the true state — a state-independent payoff, which is exactly Lipnowski–Ravid's transparent motives. So the gym is, by construction, a cheap-talk world whose value is the quasiconcave envelope. Add a commitment dial and it becomes the weak-institutions world, sweeping all the way up to the concave envelope of full Bayesian persuasion. Every proposal is a position on that qcav → cav spine.

Map · the credibility spine & where each proposal sits (Schematic)

From cheap talk (qcav) to full commitment (cav). 08 builds the ruler; 09 asks whether reputation slides the system rightward; 10 makes both sides learn; 11 turns commitment into a controlled treatment crossed with verifiability.

A What the gym is — and the four gaps

HAS TODAY

Discrete 3×3 garbling game · 6 fixed sender matrices · a strong zoo of ~10 receiver learners (Bayesian, regret, bandit, level-k, LLM) · LLM hooks · per-round state revelation · tested storage.

THE FOUR GAPS

1. the sender never learns · 2. no equilibrium solver (no ground truth) · 3. single sender + single receiver · 4. no commitment / credibility primitive.

B How the proposals compose

Dependency · 08 is the foundation the rest stand on (Schematic)

08 supplies the benchmarks every experiment measures against. 09 defines the commitment dial (reused by 11); 10 defines the sender-learner interface (reused by 09 and 11).

#	Proposal	Core question	Anchors
08	Theory baseline	Build the solver + metrics so outcomes are measured against computed benchmarks.	KG · LR · LRS · Blackwell
09	Reputation as credibility	Does reputation create an effective credibility χ_eff, tracing the LRS curve?	LRS · KG · LR
10	Two-sided learning	When the sender also learns, does the garbling rate converge, cycle, or collapse?	CS · KG/LR
11	Verifiability × commitment	Do LLM agents show "commitment blindness" — and in a novel direction?	FLP · Milgrom

The recurring empirical hook

Across the LLM literature, models are repeatedly reported to be too honest: safety-tuned agents cooperate and disclose where theory prescribes strategic concealment. The gym, once it has a benchmark, is the cleanest place to measure that honesty gap and ask what closes it.

Proposal 08 · Foundation

The Theory Baseline

Every experiment the gym runs is descriptive until it can be compared to what theory says should happen. This builds the missing ruler — a solver for the persuasion benchmarks, and metrics that mean something.

Kamenica–Gentzkow 2011 | Lipnowski–Ravid 2020 | Lipnowski–Ravid–Shishkin 2022 | Blackwell 1951

01 Background

The geometry of persuasion is a story about envelopes of the sender's value $v$ over beliefs. With commitment, the achievable value is the concave envelope $\operatorname{cav}v$ (Kamenica–Gentzkow). With transparent-motive cheap talk, it is the quasiconcave envelope $\operatorname{qcav}v$ (Lipnowski–Ravid). Between them, weak institutions trace a capped-concavification curve in credibility χ (LRS). These are exact, computable objects, but the gym computes none of them.

02 The gap

"Informativeness" is currently a Frobenius distance from the identity matrix — a geometric proxy with no decision-theoretic meaning, and no comparability across the planned continuous and natural-language channels.
There is no commitment value, no cheap-talk value, no LRS curve to measure a run against. We cannot say whether a learning receiver reaches qcav, cav, or neither.
Without a benchmark, the celebrated "LLMs are too honest" claim has no optimum to subtract from — it stays anecdotal.

03 The build

A theory package, side-effect-free and additive, computing on the belief simplex: the babbling value $v(\mu_0)$, the commitment value $V_{cav}=\operatorname{cav}v(\mu_0)$, the cheap-talk value $V_{qcav}=\operatorname{qcav}v(\mu_0)$, and the weak-institution curve

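$$v^*_\chi(\mu_0)=\max_{\beta,\gamma,k}\big[k\,\operatorname{cav}(v^{\wedge\gamma})(\beta)+(1-k)\,v^{CT}(\gamma)\big]\quad\text{s.t.}\quad k\beta+(1-k)\gamma=\mu_0,\;\;(1-k)\gamma\ge(1-\chi)\mu_0,$$

$$\text{where}\quad v^{CT}=\operatorname{qcav}v,\qquad v^{\wedge\gamma}=\min(v,\,v^{CT}(\gamma)),\qquad \beta,\gamma\in\Delta\Theta,\;k\in[0,1].$$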

Plus an information-metric module — mutual information $I(\theta;s)$, a Blackwell garbling test, and a posterior mean-preserving-spread check — replacing the Frobenius proxy with quantities the literature speaks.
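To illustrate why the Frobenius proxy can mislead, the following minimal sketch compares it with mutual information on three 3×3 signal matrices. The function names, the uniform prior, and the example matrices are assumptions made for illustration; they are not the gym's actual API.

```python
import math

def mutual_information(prior, signal):
    """I(theta; s) in bits, where signal[i][j] = P(s = j | theta = i)."""
    n_s = len(signal[0])
    p_s = [sum(prior[i] * signal[i][j] for i in range(len(prior)))
           for j in range(n_s)]
    total = 0.0
    for i, p in enumerate(prior):
        for j in range(n_s):
            joint = p * signal[i][j]
            if joint > 0:  # zero-probability cells contribute nothing
                total += joint * math.log2(signal[i][j] / p_s[j])
    return total

def frobenius_from_identity(signal):
    """The geometric proxy: Frobenius distance of a square matrix from I."""
    n = len(signal)
    return math.sqrt(sum((signal[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(n) for j in range(n)))

prior = [1 / 3, 1 / 3, 1 / 3]                  # assumed uniform prior
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # truthful
relabeled = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # truthful up to renaming signals
babble = [[1 / 3] * 3 for _ in range(3)]       # pure noise

for name, m in [("identity", identity), ("relabeled", relabeled), ("babble", babble)]:
    print(name, round(mutual_information(prior, m), 3),
          round(frobenius_from_identity(m), 3))
```

In this toy case the relabeled matrix carries as much information as the identity, yet the Frobenius distance places it farther from the identity than pure babbling. Mutual information is invariant to signal relabeling, which is one reason it transfers better across channels.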

Figure 08 · the benchmark ladder the gym lacks (Exact)

Axis: prior belief μ₀

For a non-concave sender value v, the solver returns the whole ladder at the prior: babbling ≤ cheap talk (qcav) ≤ commitment (cav). The gap between qcav and cav is the value of commitment, the quantity proposals 09–11 are about. Envelopes are computed exactly: the concave envelope via a convex hull, the quasiconcave envelope via running maxima.
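The ladder can be sketched for a binary state, where a belief is a single number μ in [0, 1]. The code below is a minimal illustration under that assumption: the grid, the toy value function `v`, and both function names are hypothetical. The concave envelope is taken as the upper convex hull of the sampled points; the quasiconcave envelope is the pointwise minimum of the left and right running maxima.

```python
def concave_envelope(mu, v):
    """Upper concave envelope of points (mu[i], v[i]); mu strictly increasing."""
    hull = []  # indices of upper-hull vertices, left to right
    for i in range(len(mu)):
        # Drop the last vertex while it lies on or below the chord to point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = ((mu[b] - mu[a]) * (v[i] - v[a])
                     - (v[b] - v[a]) * (mu[i] - mu[a]))
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    # Interpolate linearly between consecutive hull vertices.
    env, seg = [], 0
    for i in range(len(mu)):
        while hull[seg + 1] < i:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        t = (mu[i] - mu[a]) / (mu[b] - mu[a])
        env.append(v[a] + t * (v[b] - v[a]))
    return env

def quasiconcave_envelope(v):
    """Smallest quasiconcave majorant on a 1-D grid: min of running maxima."""
    left, best = [], float("-inf")
    for x in v:
        best = max(best, x)
        left.append(best)
    right, best = [0.0] * len(v), float("-inf")
    for i in range(len(v) - 1, -1, -1):
        best = max(best, v[i])
        right[i] = best
    return [min(l, r) for l, r in zip(left, right)]

# Toy sender value (made up): 0.5 for mu <= 0.2, 0 in the middle, 1 for mu >= 0.6.
mu = [i / 100 for i in range(101)]
v = [0.5 if i <= 20 else (1.0 if i >= 60 else 0.0) for i in range(101)]
i0 = 40  # prior mu_0 = 0.4

babbling = v[i0]
cheap_talk = quasiconcave_envelope(v)[i0]
commitment = concave_envelope(mu, v)[i0]
print(babbling, cheap_talk, commitment)
```

For this toy value function the three rungs are strictly ordered at μ₀ = 0.4. On a simplex with more than two states the running-maxima shortcut no longer applies, so a real solver would need a different construction for qcav.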

04 Why it is trustworthy

The solver ships with an oracle: it must reproduce closed-form results we already hold — the Kamenica–Gentzkow prosecutor–judge value of 0.60, the LRS central-bank curve $(\tfrac32,\,2\chi,\,1)$ with its discontinuity at $\chi=\tfrac23$, and the Crawford–Sobel collapse of $N(b)$. These were computed while building the literature-review figures, so there is a reference from day one.
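As a worked instance of the first oracle check, take the standard prosecutor–judge setup, assumed here with prior $\mu_0=0.3$ on guilt and a judge who convicts exactly when the posterior reaches $\tfrac12$, so that $v(\mu)=\mathbf 1[\mu\ge\tfrac12]$. On $[0,\tfrac12]$ the concave envelope is the chord from $(0,0)$ to $(\tfrac12,1)$, which gives

$$\operatorname{cav}v(\mu_0)=\frac{\mu_0}{1/2}=\frac{0.3}{0.5}=0.60,$$

attained by splitting the prior into posteriors $0$ and $\tfrac12$ with weights $0.4$ and $0.6$ (Bayes plausibility: $0.4\cdot 0+0.6\cdot\tfrac12=0.3$). A solver run on this $v$ should return the same number.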

Contribution

Turns the gym from a simulator into an instrument — the prerequisite for every benchmark-anchored claim.
A reusable "theoretical baseline" methods section for any paper that follows.

Artifact / methods contribution; foundation for 09–11.

Proposal 09 · Flagship

Reputation as Endogenous Credibility

LRS prove a sender's value moves along a sharp, non-smooth curve as institutional credibility χ runs from cheap talk to full commitment. We turn χ from a knob into something agents earn — and ask where reputation lands them on that curve.

Lipnowski–Ravid–Shishkin 2022 | Kamenica–Gentzkow 2011 | Lipnowski–Ravid 2020 | Repeated games · LLMs

01 Background

Classical Bayesian persuasion assumes the sender can commit to a signal. The gym is the opposite extreme — transparent-motive cheap talk, whose one-shot value is the quasiconcave envelope. Between them sits the weak-institution model: the announced rule is honored only with probability χ. At $\chi=1$ we recover concavification; at $\chi=0$, quasiconcavification; in between, a curve with a productive-mistrust ramp and a genuine discontinuity.

02 The gap

The weak-institutions curve has only ever been drawn analytically. No one has asked whether emergent reputation among adaptive or LLM agents reproduces it — whether repeated interaction manufactures an effective credibility χ_eff without any enforced commitment, and whether that lift shows the same cliff the theory predicts.

03 The mechanism — a commitment dial

One primitive, reused by Proposal 11: each round the sender announces a garbling rule; with probability χ it is honored, with probability 1−χ the sender may deviate after seeing the state; the receiver sees the message, not its origin. In the flagship's second phase we remove the enforced bit and let the receiver maintain a trust state — credibility must now be earned. We then recover χ_eff three independent ways and triangulate: value-fit, behavioral honesty rate, and revealed trust.
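One round of the enforced dial might look like the sketch below. It is a minimal illustration, not the gym's implementation: `dial_round`, the truthful rule, and the `always_top` deviation are hypothetical names and choices. The `honored` flag is returned for logging only; the receiver would be handed the message alone.

```python
import random

def dial_round(state, announced_rule, deviation, chi, rng):
    """Play one round of the commitment dial; return (message, honored)."""
    honored = rng.random() < chi           # with probability chi the rule binds
    if honored:
        weights = announced_rule[state]    # row = P(message | state)
        message = rng.choices(range(len(weights)), weights=weights)[0]
    else:
        message = deviation(state)         # sender chooses freely after seeing the state
    return message, honored

# Hypothetical announced rule: fully truthful 3x3 matrix.
truthful = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def always_top(state):
    """Hypothetical deviation: always send the most favorable message."""
    return 2

rng = random.Random(0)
states = [rng.randrange(3) for _ in range(1000)]

full = [dial_round(s, truthful, always_top, 1.0, rng)[0] for s in states]
none = [dial_round(s, truthful, always_top, 0.0, rng)[0] for s in states]
half = [dial_round(s, truthful, always_top, 0.5, rng) for s in states]
share_honored = sum(h for _, h in half) / len(half)
print(full == states, set(none), share_honored)
```

At χ = 1 the message tracks the announced rule, at χ = 0 it tracks the deviation, and in between the receiver faces a mixture it cannot decompose from the message alone.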

Figure 09 · does reputation climb the LRS curve? (Illustrative)

Axis: horizon N (rounds)

Left: the LRS value curve $v^*_\chi$ (shape exact); the dot marks the χ_eff reputation is hypothesized to reach. Right: the anticipated round-by-round climb of realized value from the cheap-talk floor toward that target. Short horizons fall off a reputational cliff back to qcav — the repeated-game echo of the LRS discontinuity. Trajectory shapes are illustrative predictions, not data.

04 Hypotheses

H1 · validation

Mechanical χ reproduces $v^*_\chi$, discontinuity included.

H2 · partial substitution

Reputation yields $0<\chi_{eff}<1$: above the cheap-talk floor, below full commitment.

H3 · a repeated-game cliff

Below a horizon threshold, reputation collapses and value drops discontinuously to qcav.

H4 · the honesty bias

LLM senders overshoot χ_eff: they are "too credible," leaving sender value on the table.

Contribution

The first agent-based trace of the weak-institutions value function.
A definition of endogenous credibility as a measurable quantity, not a modeling primitive — bridging LRS theory and the empirical "LLMs in repeated games" literature with an exact benchmark.

Target: economics-of-AI / computational social science. Phase-A validation supports a shorter tools note.

Proposal 10

Two-Sided Learning

The gym has a rich zoo of receiver learners but a sender that never learns. Make the sender a first-class learner, and the central dynamic of agentic persuasion finally becomes observable.

Crawford–Sobel 1982 | Kamenica–Gentzkow / Lipnowski–Ravid | Folk theorem · repeated games | LLM collusion

01 Background

Persuasion is two-sided: the sender chooses how much to garble, the receiver how much to trust, and each adapts to the other. The gym freezes one side — the sender picks a fixed matrix or a hand-written heuristic. That hides the most interesting question in the system.

02 The gap

When a self-interested sender adapts against a learning receiver, does the realized garbling rate converge to the cheap-talk value, settle on a repeated-game outcome, or fall into a limit cycle of exploit-and-collapse? Nobody has mapped this for the gym, because the sender cannot yet learn. It is the single biggest capability gap in the framework.

03 The build

A $\texttt{SenderStrategy}$ interface mirroring the receiver zoo — regret-matching, multiplicative weights, bandits over noise level, best-response-to-empirical-receiver, and an LLM sender. Plus an optional misalignment knob $b$ that tilts the sender's ideal action with the state (toward Crawford–Sobel bias), so "garbling rate" acquires a partition meaning and learners can be asked to discover the CS structure $N(b)$.
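A sketch of what such an interface could look like, with one of the listed learners (a bandit over noise levels) filled in. The method names `choose` and `update`, the epsilon-greedy rule, and the toy payoffs are assumptions for illustration; the source names only the interface and the learner families.

```python
import random
from typing import Protocol, Sequence

class SenderStrategy(Protocol):
    """Hypothetical interface: pick an arm, then learn from the realized payoff."""
    def choose(self) -> int: ...
    def update(self, arm: int, reward: float) -> None: ...

class EpsilonGreedySender:
    """Epsilon-greedy bandit whose arms are discrete garbling noise levels."""

    def __init__(self, noise_levels: Sequence[float],
                 epsilon: float = 0.1, seed: int = 0) -> None:
        self.noise_levels = list(noise_levels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * len(self.noise_levels)
        self.means = [0.0] * len(self.noise_levels)

    def choose(self) -> int:
        for arm, n in enumerate(self.counts):
            if n == 0:                       # try every arm once first
                return arm
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return self.best()                   # exploit

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

    def best(self) -> int:
        return max(range(len(self.means)), key=self.means.__getitem__)

def run(sender: SenderStrategy, payoff: Sequence[float], rounds: int) -> None:
    """Drive any SenderStrategy against a fixed payoff per arm (a stand-in receiver)."""
    for _ in range(rounds):
        arm = sender.choose()
        sender.update(arm, payoff[arm])

toy_payoff = [4.0, 7.0, 2.0]  # made-up sender payoff at noise 0.0, 0.5, 1.0
sender = EpsilonGreedySender([0.0, 0.5, 1.0])
run(sender, toy_payoff, rounds=50)
print(sender.noise_levels[sender.best()])
```

Because `run` depends only on the protocol, a regret-matching or LLM-backed sender could be swapped in without touching the loop, which is the point of mirroring the receiver zoo.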

Figure 10 · three fates of two-sided adaptation (Illustrative)

The garbling rate over rounds for three sender×receiver pairings. Converge: a regret/bandit sender settles to the cheap-talk-optimal garbling. Cycle: build trust, exploit it, get punished by a change-point receiver, repeat. Collapse: mutual drift to full garbling (babbling). Dynamics are anticipated archetypes the experiments will classify, not measured runs.

04 Hypotheses

H1 · convergence

Against a fixed Bayesian receiver, regret/bandit/best-response senders approach the feasible optimum (qcav without reputation).

H2 · co-adaptive cycles

Against an adaptive receiver, some pairings never settle — they cycle through exploit-and-punish.

H3 · CS recovery

As bias $b$ rises, the learned garbling coarsens, tracking $N(b)$ without being told it.

H4 · persistent honesty gap

LLM senders under-garble vs. the learned optimum even with reward feedback; persona conditioning shrinks but doesn't close it.

Contribution

Upgrades the gym from a receiver-learning testbed to a two-sided information-design testbed — reused by every future direction (competition, multi-receiver, mediation).
A benchmark-anchored map of when agentic persuasion converges vs. cycles vs. collapses, and a measurement of the honesty gap that survives reward feedback.

Target: learning-dynamics / multi-agent venue; the capability itself is an artifact.

Proposal 11

Verifiability × Commitment

Fréchette–Lizzeri–Perego showed, with people, that commitment helps or hurts communication depending on verifiability — and that subjects misperceive commitment. We replicate the design with LLM agents, and predict their blindness has a different signature.

Fréchette–Lizzeri–Perego 2022 | Milgrom 1981 | KG / LR endpoints | LLM over-honesty

01 Background

FLP nest cheap talk, disclosure, and Bayesian persuasion in one framework with two axes — verifiability (can the sender make false state-specific claims?) and commitment ρ. Their headline: informativeness rises in ρ under unverifiability but falls under verifiability, converging at full commitment. Their surprise: people are commitment-blind — they act as if commitment were weaker or different than it is.

02 The gap

No one has run this with LLM agents. And the gym's documented over-honesty makes a sharp, partly novel prediction: under verifiable rules LLMs should over-communicate like humans — but under unverifiable rules they may over-communicate too, breaking the human pattern of under-communication. If so, LLM commitment blindness is not a copy of the human kind; it is over-honesty wearing its mask.

03 The design — the FLP 2×2 with LLMs

The design reuses the commitment dial (Proposal 09) and adds a verifiability mask: a channel-level constraint that forbids false state-specific claims (built on Milgrom's hard-evidence logic) and is enforced by the channel itself, not by a prompt instruction. There are two arms: game-theoretic agents validate the opposite-sign comparative statics against the solver; LLM agents are the new science, and from their behavior we recover an effective perceived commitment ρ_perceived.
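One way to read the verifiability mask, sketched under a Milgrom-style assumption that a message is a claim of the form "the state lies in this set": under verifiability the channel accepts only sets containing the true state, so the sender can be vague but cannot make a false claim. The names `allowed_messages` and `send`, and the three-state message space, are hypothetical.

```python
from itertools import combinations

STATES = (0, 1, 2)
# Every non-empty subset of states is a possible claim "the state is in S".
MESSAGES = [frozenset(c)
            for r in range(1, len(STATES) + 1)
            for c in combinations(STATES, r)]

def allowed_messages(state, verifiable):
    """Messages the channel will carry for this true state."""
    if not verifiable:
        return list(MESSAGES)                     # cheap talk: anything goes
    return [m for m in MESSAGES if state in m]    # hard evidence: no false claims

def send(state, message, verifiable):
    """Channel-level enforcement: reject masked messages instead of trusting a prompt."""
    if message not in allowed_messages(state, verifiable):
        raise ValueError("message rejected by the verifiability mask")
    return message

print(len(allowed_messages(0, verifiable=True)),
      len(allowed_messages(0, verifiable=False)))
```

Enforcing the constraint in `send` rather than in the agent's instructions means an LLM sender cannot violate it by ignoring or misreading its prompt.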

Figure 11a · opposite slopes & the LLM signature (Theory exact · LLM illustrative)

Axis: commitment ρ

Solid: theory — informativeness rises in ρ when unverifiable, falls when verifiable, meeting at Bayesian persuasion. Dashed: the hypothesized LLM arm, over-communicating in both regimes (H3). The vertical gap between an LLM curve and its theory curve is commitment blindness; the horizontal offset is ρ_perceived − ρ.

The four corners

Figure 11b · what the framework nests (Schematic)

The same primitives generate four classic models at the corners; commitment is horizontal, verifiability vertical.

04 Hypotheses

H1 · replication

Game-theoretic agents show the opposite-sign comparative statics, converging at ρ=1.

H2 · blindness

LLM behavior is best fit by $\rho_{perceived}\neq\rho_{true}$.

H3 · novel direction

LLMs over-communicate under both regimes — unlike humans, who under-communicate when unverifiable.

H4 · persona modulates

Strategic personas pull ρ_perceived toward ρ under unverifiability.

Contribution

The first LLM replication-and-extension of a named Econometrica experiment, with a theoretical baseline the original lab study could only approximate.
A novel claim — LLM commitment blindness has a different signature than human blindness — that is informative whether or not it is confirmed.

Target: economics-of-AI / experimental venue; reuses the FLP figures from the literature review.
