# The Deep Feed

> A continuous stream of what matters in AI — models, agents, products, business, research, and the people shipping it. Updated multiple times a day.

Author: The Deep Feed
Site: https://www.thedeepfeed.ai

---

# Anthropic weighs $50B raise at $900B valuation — more than double its February round

URL: https://www.thedeepfeed.ai/posts/2026-04-30-anthropic-50b-raise-at-900b-valuation/
Category: Business
Date: 2026-04-30
Source: TechCrunch — https://techcrunch.com/2026/04/29/sources-anthropic-could-raise-a-new-50b-round-at-a-valuation-of-900b/
Tags: anthropic, funding, valuation, claude, openai

> The Claude maker has received multiple preemptive offers in the $850B–$900B range and is expected to decide at a May board meeting, per sources.

**Anthropic** has received multiple preemptive offers to raise $40 billion to $50 billion at a valuation between $850 billion and $900 billion, according to six sources familiar with the matter. TechCrunch reports the company is expected to decide whether to proceed at a board meeting in May. The figures represent more than a doubling of Anthropic's $380 billion valuation from its February Series G, and would put it at or above OpenAI's $852 billion post-money valuation from its March round.

## The revenue story

Anthropic's annualized revenue run rate surpassed $30 billion earlier this month, up from roughly $9 billion at the end of 2025. Sources told TechCrunch the current run rate is closer to $40 billion, driven largely by Claude Code and Cowork, the company's AI coding platforms.

Investor demand is reportedly far exceeding the round size. One institutional investor prepared to commit $5 billion has not yet secured a meeting with CFO Krishna Rao, per TechCrunch's sources.

## Timeline and pressure

- **February 2026:** Anthropic closed a $30 billion Series G at a $380 billion valuation.
- **April 14:** Bloomberg and Business Insider first reported preemptive bids at $800 billion; at that time, Anthropic had not committed to a raise.
- **Late April:** Valuation offers have risen into the $850 billion–$900 billion range.
- **May (expected):** Board meeting to finalize the decision on round size and valuation.

The company is described as "finding it difficult to resist the pressure" to raise, with the round potentially serving as a final private financing before an IPO.

## Why it matters

This is the clearest signal yet that **the frontier-lab valuation race is now decoupled from product differentiation**. Anthropic and OpenAI are raising at near-parity valuations despite different go-to-market strategies, different policy stances (Anthropic declined Pentagon classified-network access last week), and different revenue bases. Investors are pricing in total addressable market expansion—finance, healthcare, life sciences—not current performance.

If Anthropic closes at $900 billion two months after raising at $380 billion, it suggests the private markets believe frontier AI labs are in a winner-take-most endgame, and seat allocation matters more than price.

---

# Anthropic's $900B valuation: $40B revenue, $175B in commitments, and negative unit economics

URL: https://www.thedeepfeed.ai/posts/2026-04-30-anthropic-900b-valuation-deep/
Category: Business
Date: 2026-04-30
Tags: anthropic, valuation, revenue, aws, openai, enterprise-ai

> The Claude builder turned down $800B in April and now fields $900B offers on a $40B revenue run rate—but burns $6B–$12B annually and has locked $100B into AWS spend.

## The $900B round Anthropic didn't need—and hasn't taken

Anthropic has received multiple preemptive offers to raise roughly $50 billion at valuations between $850 billion and $900 billion, [per TechCrunch](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), but as of late April 2026 has yet to accept any of them.
If closed at those terms, the round would surpass OpenAI's $852 billion valuation from its [March 2026 funding](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), making Anthropic the most valuable AI startup in the world. A [board meeting in May 2026](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) is expected to make a definitive decision.

The demand signal is unambiguous. One institutional investor prepared to commit $5 billion has yet to secure a meeting with Anthropic CFO Krishna Rao, [TechCrunch reports](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai)—a sign of how oversubscribed the round is before it officially exists. The valuation leap is equally striking: Anthropic was valued at $380 billion [as recently as February 2026](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), meaning a $900 billion close would more than double its worth in roughly three months.

What's driving the valuation is revenue growth that would be extraordinary in any sector. Anthropic announced in early April that its business has reached $30 billion in annualized revenue, and [TechCrunch reports](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) the current run rate may be closer to $40 billion. That's up from roughly $10 billion in [calendar year 2025 revenue](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai)—a roughly 4x increase in four months. A large portion is driven by AI coding capabilities, specifically the Claude Code and Cowork platforms, [according to OpenTools](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai).

But the capital structure underneath tells a different story.
Amazon is investing up to $25 billion ($5 billion immediately, with $20 billion tied to milestones), while Google's Alphabet is committing up to $40 billion ($10 billion now at a $350 billion valuation, with $30 billion more tied to performance targets), [per TechCrunch](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai). Anthropic has also [committed to $100 billion in AWS spending over 10 years](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai). The company needs capital to purchase compute infrastructure for its new Mythos model, which demands significantly more processing power than previous Claude versions, [OpenTools reports](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai).

This could be Anthropic's final private round. The company is reportedly considering an IPO as soon as October 2026, with [one report suggesting](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) the IPO could raise over $60 billion.

The question is not whether Anthropic can command a $900 billion valuation—the term sheets prove it can—but whether accepting that capital makes sense when the company is already sitting on $175 billion in commitments, burning $6 billion to $12 billion annually, and locked into $100 billion of cloud spend with a strategic investor that competes directly with its other strategic investor.

## From $9B to $40B in four months: the fastest revenue ramp in software history

[Anthropic](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) ended December 2025 at $9 billion in annualized revenue, [per documents seen by sources close to the company](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/).
By the end of March 2026, that figure had reached [$30 billion in annualized revenue](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), which [Anthropic announced in early April](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai). The company [generated roughly $10 billion in revenue in calendar year 2025](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), meaning the annualized figure represents a roughly 4x increase in the first four months of 2026. [TechCrunch reports](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) the current run rate may be closer to $40 billion as of late April, though Anthropic's official figure stands at $30 billion annualized.

The March acceleration was particularly extreme. [According to the timeline documented by Idlen](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), Anthropic went from $9 billion annualized at the end of December 2025 to $19 billion by the end of February 2026, then $30 billion by the end of March—a 3.3x increase in the first quarter. That works out to roughly $2.5 billion of ARR added per week in March 2026, a pace [sources described to Idlen as](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) "the steepest revenue acceleration ever observed at a tech startup, private or public."

The historical context makes the velocity clearer. Anthropic ended 2024 at [roughly $1 billion in annualized revenue](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), meaning the $30 billion figure represents roughly 30x growth from end-of-2024 to early April 2026. Axios, [quoted by TNW](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), described it bluntly: no company in American history has ever grown like this.
The revenue composition tilts heavily enterprise. [Claude Code alone hit $2.5 billion in annualized revenue in February](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), more than doubling since the start of the year, according to TNW. [The Claude API powers Amazon Bedrock, Databricks Mosaic, and Snowflake Cortex offerings](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), and [an Amazon × Claude contract signed in January, worth $8 billion over three years](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), unlocked a wave of Fortune 500 deployments. [Idlen's sources noted](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) this puts Anthropic at 60% of OpenAI's $50 billion ARR as of March 31, 2026, with only 20% of the consumer user base.

The valuation implications are direct. At $30 billion in annualized revenue and an $800 billion valuation in mid-April offers, Anthropic commanded a roughly 27x revenue multiple, [which TNW characterized as](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo) "high by any conventional measure, but not obviously irrational for a company whose revenue is doubling every few months." By late April, with [preemptive offers in the $850 billion to $900 billion range](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) and a possible $40 billion run rate, the multiple compresses to 21–23x—still elevated relative to traditional SaaS multiples of 5–12x, but materially lower than the 27x implied two weeks earlier. The multiple is compressing even as the offers rise, which suggests investors are pricing in deceleration or margin compression that has not yet appeared in the public numbers.
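The multiple arithmetic in this section is simple division, and the article's quoted figures can be checked in a few lines (all inputs come from the text above):

```python
# Revenue multiples implied by the offers discussed above.
# Inputs are the article's quoted figures, in billions of dollars.
offers = [
    ("mid-April offer", 800, 30),   # $800B valuation on $30B ARR
    ("late-April low",  850, 40),   # $850B on a possible $40B run rate
    ("late-April high", 900, 40),   # $900B on the same run rate
]

for label, valuation, arr in offers:
    print(f"{label}: {valuation / arr:.1f}x revenue")
```

This reproduces the "roughly 27x" mid-April multiple (26.7x) and the late-April 21–23x band (21.2x to 22.5x).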
## Capital in: Google's $40B, Amazon's $25B, and the $100B AWS lock-in

The checks backing Anthropic's ascent aren't venture capital in the traditional sense—they're strategic bets with strings attached. [Google's Alphabet is committing up to $40 billion](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai): $10 billion immediately at a $350 billion valuation, with $30 billion more tied to performance targets. [Amazon is investing up to $25 billion](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), including $5 billion immediately with $20 billion tied to milestones. These aren't passive allocations. Google Cloud gets preferred inference distribution rights; Amazon Web Services gets a decade-long compute monopoly.

That compute lock-in carries a price tag that dwarfs the equity investment. [Anthropic has committed to $100 billion in AWS spending over 10 years](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), an obligation that transforms Amazon's $25 billion equity stake into a customer acquisition cost with guaranteed margin recovery. At current AWS pricing for high-performance GPU instances, $100 billion buys roughly 1.2 billion H100-equivalent hours—but it also ensures Anthropic can't negotiate meaningfully with Microsoft Azure, Google Cloud, or Oracle without breaching contractual minimums. The equity is the headline; the compute contract is the handcuffs.

The February 2026 round that set the $350 billion baseline was already historic. [Anthropic raised $30 billion at a $380 billion valuation](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), making it the second-largest venture funding deal ever—eclipsed only by OpenAI's March close.
That [OpenAI round pulled in $122 billion at an $852 billion valuation](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai), with [$50 billion from Amazon, $30 billion from Nvidia, and $30 billion from SoftBank](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai). The capital structure emerging across frontier AI labs isn't traditional dilution—it's a hybrid of equity, cloud credits, and multi-year revenue commitments that blurs the line between investment and procurement.

The February Anthropic round attracted late-stage and sovereign capital at scale. [Coatue, GIC (Singapore), Mubadala (Abu Dhabi), Lightspeed, and a consortium led by an unnamed Saudi sovereign fund](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) reportedly offered tickets between $5 billion and $15 billion, with structures mixing primary capital and secondary liquidity for early employees. At the $350 billion entry point, those investors secured a roughly 2.3x markup in eight weeks when the $800 billion preemptive offers arrived in April—offers [Dario Amodei turned down](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), citing adequate runway and IPO positioning.

The capital inflows create a paradox: Anthropic has more cash than it can deploy efficiently, but the commitments it made to secure that cash lock in costs that scale faster than revenue. The $100 billion AWS obligation averages $10 billion per year—roughly 25% of the current $40 billion revenue run rate. If Anthropic's revenue multiple compresses post-IPO or competition forces price cuts, that fixed compute spend becomes a margin anchor. Google and Amazon aren't just investors; they're creditors with contractual first claim on Anthropic's infrastructure budget for the next decade.

## Enterprise contracts and unit economics: the 1,000-customer base vs. $6B–$12B annual burn

The revenue story rests on an enterprise customer base that [doubled from 500 to 1,000+ companies](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/) spending over $1 million annually each, between February and April 2026. [Eight of the Fortune 10 companies now run on Claude](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/), and the revenue mix tilts heavily toward business contracts: approximately [80% of Anthropic's total revenue comes from enterprise customers](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/), a structural contrast to OpenAI's consumer-heavy base anchored by ChatGPT subscriptions. The enterprise focus delivers higher retention, larger contract sizes, and multi-year commitments that smooth revenue recognition—advantages that compound as the customer base scales.

[Claude Code alone generated $2.5 billion in annualized revenue as of February 2026](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), capturing [54% of the AI coding tool market](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/) ahead of GitHub Copilot and Cursor. That figure represents a single product line—a command-line agentic coding tool—outpacing the entire 2024 revenue of established SaaS companies like Box. The developer tooling wedge has proven unusually sticky: business subscriptions to Claude Code quadrupled in the first quarter of 2026, and weekly active users have doubled since January 1, per [Vucense's analysis](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/). The product's growth rate suggests it could cross $5 billion in ARR by mid-2026 on its own, making it one of the fastest-scaling enterprise software offerings in history.

The burn rate tells a different story.
Anthropic spends [between $500 million and $1 billion per month on compute](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), translating to $6 billion to $12 billion annually. At $30 billion in ARR and assuming roughly 50–60% gross margins after cloud infrastructure costs, the company is likely generating $15 billion to $18 billion in gross profit—enough to cover the compute spend with room for R&D and operations, but thin enough that any revenue deceleration or margin compression would quickly flip the company from cash-flow positive to cash-flow negative. The [February Series G raised $30 billion in cash](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), securing runway for [24 to 36 months at current burn rates](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), but only if revenue growth continues to outpace compute cost inflation.

The unit economics hinge on a calculation that remains opaque: cost per token served, multiplied by inference volume, minus revenue per API call. Anthropic has not disclosed these figures publicly, and the $6 billion to $12 billion annual compute spend suggests inference costs remain stubbornly high even as the company scales. If the burn is closer to $1 billion per month, gross margins are likely in the 40–50% range—razor-thin for a software company, and a sign that the path to operating leverage is longer than the revenue trajectory suggests. The enterprise contracts provide visibility, but the economics are still those of a capital-intensive infrastructure business, not a high-margin software platform.

## Anthropic vs. OpenAI: $30B revenue vs. $24B, accounting asterisks included

[Anthropic announced](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/) $30 billion ARR as of April 7, 2026, versus OpenAI's [$24 billion as of end of February 2026](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/). That marks the first time a rival has led OpenAI in revenue [since ChatGPT launched in November 2022](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/), and the reversal arrived more than two months ahead of [Epoch AI's mid-2026 forecast](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/).

The headline, however, hides an accounting wedge that narrows the gap. [Anthropic books gross revenue](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/) with partner cuts—Amazon's Bedrock share, reseller margins—counted as costs, while [OpenAI reports net receipts after cloud-share payouts](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/). The exact magnitude is undisclosed, but typical hyperscaler rev-share deals run 20–30 percent, which would compress Anthropic's reported $30 billion closer to $21–24 billion on a net basis comparable to OpenAI's accounting treatment.

Even adjusted for methodology, the trajectory matters more than the snapshot. [Anthropic tripled ARR in roughly one quarter](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/), jumping from [$9 billion at year-end 2025](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) to $30 billion by April 7. OpenAI climbed from [$20 billion to $24 billion](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/) over the same window—a 1.2x expansion.
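The gross-to-net adjustment described above is easy to sanity-check. A minimal sketch, assuming the 20–30 percent partner-share range the article cites (the range itself is an inference, not a disclosed figure):

```python
# Gross-to-net comparison sketch. The 20-30% partner-share range is the
# article's assumption about typical hyperscaler rev-share deals, not a
# disclosed number from either company.
anthropic_gross = 30.0   # $B, Anthropic's gross ARR as reported
openai_net = 24.0        # $B, OpenAI's net ARR as reported

for partner_share in (0.20, 0.30):
    net = anthropic_gross * (1 - partner_share)
    print(f"{partner_share:.0%} partner share -> ${net:.0f}B net "
          f"({net / openai_net:.2f}x OpenAI's net figure)")
```

At a 20 percent share the two companies' net figures are effectively tied; at 30 percent Anthropic lands at $21 billion, matching the article's $21–24 billion band.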
Meritech partner Alex Clayton, who has studied more than 200 software IPOs, said he ["never saw a growth rate like this"](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/) in reference to Anthropic's acceleration.

The user-base asymmetry makes Anthropic's revenue performance more striking. OpenAI commands roughly [900 million weekly active ChatGPT users](https://tech-insider.org/anthropic-vs-openai-2026/) versus Anthropic's approximately [19 million monthly active users as of January 2025](https://tech-insider.org/anthropic-vs-openai-2026/), yet Anthropic reaches 60 percent of OpenAI's revenue with only 20 percent of the consumer footprint. The delta is enterprise mix: [Anthropic generates approximately 80 percent of revenue from enterprise customers](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/), with [enterprise clients paying $1 million or more per year doubling from 500 to 1,000-plus in two months](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/) and [eight of the Fortune 10 now running on Claude](https://juggerinsight.com/en/anthropic-revenue-tops-openai-30-billion-arr/). OpenAI's mix tilts consumer-heavy, driven by ChatGPT subscriptions.

Valuation diverged in the opposite direction. [OpenAI raised $122 billion in late March 2026](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/) at an [$852 billion post-money valuation](https://tech-insider.org/anthropic-vs-openai-2026/), a 35.5x multiple on its $24 billion ARR. Anthropic's February Series G priced the company at [$350 billion pre-money](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), or roughly $380 billion post-money after the $30 billion cash infusion—a 12.7x multiple on the $30 billion ARR it would report two months later.
The discount reflects investor caution on gross-versus-net accounting, OpenAI's consumer moat, and Anthropic's [$6–12 billion annual burn](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) against infrastructure spend. The April preemptive offers that valued Anthropic at [$800 billion](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/)—from [Coatue, GIC, Mubadala, Lightspeed, and an unnamed Saudi sovereign fund](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/)—would have pushed the revenue multiple to 26.7x. Dario and Daniela Amodei [said no for now](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), citing dilution risk and IPO positioning. The refusal keeps the nominal valuation gap wide—OpenAI at $852 billion, Anthropic at $380 billion—even as the revenue gap compressed to within accounting-method variance.

## Why it matters: when growth is real but margins are mortgaged

Anthropic [turned down $800 billion offers in mid-April](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo) because the company expects to be worth significantly more in 6–12 months. That calculation—saying no to a valuation that would have ranked among the highest in private company history—is the clearest signal of where institutional capital believes the frontier AI market is headed. Secondary-market demand for Anthropic shares is described as [nearly insatiable](https://charlesandsystems.substack.com/p/anthropic-just-said-no-to-800-billion), with Goldman Sachs reportedly [charging 15–20% carry on secondary stakes](https://tech-insider.org/anthropic-vs-openai-2026/), a premium that reflects supply scarcity rather than uncertainty about trajectory.
An [IPO is reportedly targeted for October 2026](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), raising $60B+ at the then-current valuation, which would make it one of the largest technology public offerings in history.

The revenue acceleration that justifies this confidence is real. Anthropic grew from [$1 billion in annualized revenue at the end of 2024](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo) to [$9 billion by December 2025, then $30 billion by early April 2026](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/)—a 30x expansion in 15 months that [no company in American history has matched](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo). At $30 billion ARR and a $380 billion February valuation, the implied multiple sits at roughly 12.7x; at $40 billion ARR and $900 billion, it rises to 22.5x. High by any conventional SaaS benchmark, but defensible if the quarterly doubling continues. The enterprise mix—[approximately 80% of revenue](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/) from API and direct contracts, [more than 1,000 companies each spending over $1 million annually](https://vucense.com/ai-intelligence/industry-business/anthropic-overtakes-openai-30-billion-arr-2026/)—delivers retention and margin characteristics that consumer subscription models cannot match.

But the margin structure is mortgaged in ways that constrain optionality. The [$100 billion AWS commitment over 10 years](https://opentools.ai/news/anthropic-weighs-900b-funding-round-overtake-openai) locks Anthropic into a single cloud provider at a scale that eliminates negotiating leverage and precludes any meaningful shift to owned infrastructure or alternative providers.
At [$500 million to $1 billion monthly compute burn](https://www.idlen.io/news/anthropic-refuses-800-billion-valuation-vc-preemptive-offers-april-2026/), even with $30 billion in trailing revenue, the company remains structurally unprofitable. The compute-to-revenue ratio—roughly 20–40% of ARR spent on training and inference—mirrors OpenAI's unit economics, which [HSBC projects will not reach profitability before 2030](https://tech-insider.org/anthropic-vs-openai-2026/). Google, which [owns 14% of Anthropic through investments totaling roughly $3 billion](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), has [reported $10.7 billion in net gains on those equity securities](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo)—a 3.6x return that comes almost entirely from valuation markup rather than realized cash flow.

The divergence between growth and profitability is not unique to Anthropic, but the scale of the capital commitments is. Amazon, which has [invested an estimated $8 billion and secured a position as Anthropic's primary cloud and training partner](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo), reported a [$9.5 billion pretax gain tied to Anthropic's rising valuation in its Q3 results](https://thenextweb.com/news/anthropic-800-billion-valuation-revenue-30-billion-ipo). Both backers—Google and Amazon—are also cloud and inference infrastructure providers, which means every dollar Anthropic spends on compute flows back to entities that hold board seats and significant equity. The alignment is strategic, but it also embeds structural dependency. A renegotiation of cloud pricing, a shift in inference costs, or a decision to vertically integrate into owned data centers would require unwinding partnerships that are now load-bearing to the company's valuation.
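The compute-to-revenue ratio above is simple arithmetic, and it also bounds the runway math. A sketch, using only the ranges reported in the article (none of these are disclosed financials):

```python
# Compute-to-revenue ratio and runway implied by the reported ranges.
# All inputs are the article's figures, not disclosed financials.
arr = 30e9                        # trailing annualized revenue
cash = 30e9                       # February Series G proceeds
monthly_compute = (0.5e9, 1.0e9)  # reported compute burn range, per month

for burn in monthly_compute:
    ratio = burn * 12 / arr
    months = cash / burn
    print(f"${burn/1e9:.1f}B/month -> {ratio:.0%} of ARR on compute, "
          f"{months:.0f} months covered if compute were the only cash cost")
```

This reproduces the 20–40% compute-to-ARR band. Note the compute-only runway works out to 30–60 months, while the reported runway is 24–36 months: the gap implies total burn, once payroll, R&D, and operations are added, of roughly $0.8–$1.25 billion a month.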
The path to profitability remains unclear not because the revenue is in doubt, but because the cost base is still scaling in parallel. At a 20x-plus revenue multiple and $500M–$1B in monthly compute burn, Anthropic is betting that enterprise AI adoption will continue to accelerate faster than infrastructure costs decline. That bet has been correct for 15 months. Whether it holds for the next 24—through an IPO, through margin pressure, through the AWS commitment's lock-in—is the question that separates a $900 billion valuation from a fundamentally profitable AI business. For now, the market is pricing in the former. The unit economics still reflect the latter's absence.

---

# OpenAI traces goblin quirk in GPT-5 models to personality training feedback loop

URL: https://www.thedeepfeed.ai/posts/2026-04-30-openai-goblin-quirk-postmortem/
Category: Research
Date: 2026-04-30
Source: OpenAI — https://openai.com/index/where-the-goblins-came-from/
Tags: openai, gpt-5, reinforcement-learning, post-mortem, training

> A post-mortem reveals how reward signals for the "Nerdy" personality caused GPT-5.1 through 5.5 to overuse creature metaphors, and how the behavior spread through training data.

**OpenAI** published a technical post-mortem Wednesday explaining why its GPT-5 series models developed an unusual tendency to reference goblins, gremlins, and other creatures in responses—a quirk that spread across model versions despite no intentional training for it.

The root cause: a reward signal designed to reinforce the "Nerdy" personality feature inadvertently scored outputs containing creature metaphors higher than equivalent outputs without them. That behavior then leaked into broader training data through a feedback loop involving supervised fine-tuning.

## The timeline

- **November 2025 (GPT-5.1):** Internal reports surfaced about overfamiliar language. Mentions of "goblin" rose 175% post-launch; "gremlin" rose 52%.
- **GPT-5.4:** Users and employees noticed a larger uptick. Analysis revealed 66.7% of all "goblin" mentions came from the 2.5% of traffic using the "Nerdy" personality.
- **March 2026:** OpenAI retired the Nerdy personality mid-GPT-5.4 deployment after identifying the connection.
- **GPT-5.5:** Training began before the fix; OpenAI added developer-prompt mitigations in Codex to suppress the behavior.

## How the feedback loop worked

The Nerdy personality system prompt encouraged "playful use of language" and acknowledgment of the world's "strangeness." The reward model scored outputs with creature words 76.2% more favorably across audited datasets.

Critically, the behavior transferred beyond the Nerdy personality condition. OpenAI's analysis showed goblin/gremlin prevalence rising in outputs *without* the Nerdy prompt at nearly the same relative rate as outputs with it—evidence that reinforcement learning does not guarantee behavioral scoping.

The loop:

- **Playful style rewarded → tic appears in rollouts → rollouts enter SFT data → model learns the tic as general behavior.**

OpenAI confirmed that GPT-5.5's SFT data contained numerous examples of goblin, gremlin, and related creatures (raccoons, trolls, ogres, pigeons).

## Why it matters

This is one of the clearest public examples of how **unintended reward-signal generalization** can propagate through production model training. The goblins were harmless, but the mechanism—localized reward incentives spreading through data reuse—could apply to more consequential behaviors. OpenAI now has audit tooling to trace these patterns, but the post underscores how opaque RL-driven style drift remains, even inside frontier labs.
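The reward → rollout → SFT loop described above can be caricatured in a few lines. This is a toy simulation, not OpenAI's pipeline: the 1.76x reward ratio echoes the 76.2% figure from the post, while the rollout counts and the 50/50 SFT pooling across personas are invented purely to illustrate the leak mechanism.

```python
import random

random.seed(0)

# Toy starting rates: how often each persona emits a creature word.
p = {"nerdy": 0.05, "default": 0.01}

def reward(has_creature: bool) -> float:
    # The flawed signal: creature-word outputs score ~76% higher.
    return 1.76 if has_creature else 1.0

def training_round(rates):
    """One cycle of: sample rollouts, reward-weight them, pool into SFT data."""
    tuned = {}
    for persona, rate in rates.items():
        rollouts = [random.random() < rate for _ in range(10_000)]
        weights = [reward(r) for r in rollouts]
        # Reward-weighted share of creature outputs among kept rollouts:
        # the fine-tuned model inherits this inflated frequency.
        tuned[persona] = sum(w for r, w in zip(rollouts, weights) if r) / sum(weights)
    # SFT data is pooled across personas, so the tic leaks between them
    # (the 50/50 mix is an invented stand-in for shared training data).
    pooled = sum(tuned.values()) / len(tuned)
    return {persona: 0.5 * share + 0.5 * pooled for persona, share in tuned.items()}

for round_no in range(1, 5):
    p = training_round(p)
    print(round_no, {k: round(v, 3) for k, v in p.items()})
```

Even in this crude model, the "default" persona's creature rate climbs round over round despite never being directly rewarded for it—the same unscoped generalization the post-mortem documents.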
---

# The complete developer's guide to Stripe Sessions 2026

URL: https://www.thedeepfeed.ai/posts/2026-04-30-stripe-sessions-2026-developer-guide/
Category: Tools
Date: 2026-04-30
Tags: stripe, agents, developer-tools, payments, ai

> Stripe shipped 288 launches at Sessions 2026, but only 3 are GA day one — here is what is actually buildable today, the new agent-payment protocol stack, and ten projects worth shipping this quarter.

## Table of contents

1. [What just happened, in plain language](#1-what-just-happened-in-plain-language)
2. [The honest count: what's actually shippable today](#2-the-honest-count-whats-actually-shippable-today)
3. [The five mental shifts a developer needs](#3-the-five-mental-shifts-a-developer-needs)
4. [The protocol stack: MPP, UCP, ACP, x402, ACS, skill.md, Tempo](#4-the-protocol-stack)
5. [The Stripe primitives map: every feature, status, and where to use it](#5-the-stripe-primitives-map)
6. [Ten projects worth building this quarter](#6-ten-projects-worth-building-this-quarter)
7. [The "ship in a weekend" stack and why Stripe Projects changed it](#7-the-ship-in-a-weekend-stack)
8. [Business models for agentic products](#8-business-models-for-agentic-products)
9. [The apply-now matrix](#9-the-apply-now-matrix)
10. [Closing thoughts and references](#10-closing-thoughts-and-references)

---

# 1. What just happened, in plain language

On April 29, 2026, Stripe held its annual customer conference, Sessions 2026, in San Francisco. Roughly 9,000 people attended. The keynote was delivered by Will Gaybrick, Stripe's President of Product and Business. The framing was unmistakable: **"We are building the economic infrastructure for AI."**

> "If AI can solve Nobel-level physics problems but can't buy a domain, something's gone wrong. Our mantra: empower agents. Stripe is building the economic infrastructure for AI. That's the animating theme behind all 288 product announcements we made today at Stripe Sessions."
>
> — [@wgaybrick](https://x.com/wgaybrick/status/2049694798706364665), April 30 2026

Concretely, Stripe announced **288 product launches** spanning seven product surfaces (Payments, Radar, Revenue + 4 more) plus an eighth section called *"What's coming later this year"*, which is the public roadmap.

The 288 launches break down into five strategic bets:

1. **Agents are first-class economic actors.** Stripe wants AI agents to have their own wallets, their own cards, their own fraud profiles, and to be able to transact with merchants over open protocols.
2. **Per-token billing is real and on-chain.** The combo of `mppx` (Stripe's open SDK), Tempo (Stripe-affiliated EVM L1), and on-chain payment channels means you can now charge $0.001 per yielded token in a generator function. This is genuinely new.
3. **Stripe is becoming a bank.** Stripe Treasury (limited public preview), free instant transfers between Stripe businesses, a 2% cashback Mastercard, and an MCP-operable banking layer.
4. **Distribution moved to Stripe.** Stripe Projects + the `projects.dev/providers` catalog (32 launch partners including Fly.io, Vercel, Supabase + 29 others) means signups for SaaS infrastructure now happen *inside* Stripe with consolidated billing.
5. **Radar pivots from card fraud to token fraud.** "1 in 6 AI signups is malicious" was the talk's headline. The corresponding launches target free-trial abuse, bot abuse, multi-account abuse + 2 others, plus an expanded Stripe Signals network.

> We just announced a large raft of improvements at @Stripe Sessions. My meta reflections:
>
> • It feels that the entire economy is replatforming right now.
> • Many charts at Stripe are inflecting in quite dramatic ways. What GitHub recently reported for commits we are seeing in economic activity (such as new company formations).
> • It is increasingly clear that agents will be responsible for most transactions in the not overly distant future.
> • Stripe was always developer-centric, but AI is making developer-centricity strategic in a new way: agents are even hungrier for good DX than developers themselves are. > > — [@patrickc](https://x.com/patrickc/status/2049705418436600244), April 30 2026 If you do nothing else with this post, internalize those five bets. Everything Stripe did at Sessions 2026, every API and every preview, fits inside one of those five stories. But the real news isn't that Stripe shipped 288 things. The real news is **what's actually buildable today**, and that requires reading the announcement blog with much more care than most coverage gave it. --- # 2. The honest count: what's actually shippable today Most coverage of Sessions 2026 said something like *"Stripe shipped 288 launches today."* That's literally true and practically misleading. Here's the precise breakdown, verified line-by-line against [Stripe's own announcement blog](https://stripe.com/blog/everything-we-announced-at-sessions-2026): ## Day-of GA on Stripe rails: exactly 3 things Stripe is rigorous about phasing. They use specific language: **"is now generally available"** or **"is now available"** for true GA. In the entire announcement blog, they use that language for exactly two products: 1. **Stripe Workflows**: *"is now generally available"* (with looping, third-party custom actions, prebuilt actions for Mailchimp and Slack, programmatic invocation, and Connect support) 2. **Network cost passthrough (IC++)**: *"is now available to platforms in 45 markets"* (US, EU, and others) The press release adds one more: 3. **Stripe Projects**: *"now available to everyone"* That's three. Three things that, the moment Sessions ended, any Stripe customer could turn on with no application, no waitlist, no preview opt-in. ## "We previewed" mentions: 46 items Throughout the same blog, Stripe says **"we previewed"** 46 times. This is preview language, public preview or private preview, *not* day-of GA. 
Examples from the actual blog text: - *"We previewed the Agentic Commerce Suite, allowing your business to sell across the agentic web."* - *"We previewed Issuing for agents, a new way to provision single-use virtual cards for AI agents."* - *"We previewed Custom Objects, allowing you to model your business in Stripe."* - *"We previewed Stripe Database, a managed Postgres for your Stripe data."* - *"We previewed Stripe Console, an agent-facing operations interface."* Read those carefully. Stripe is *deliberately* not saying these are GA. They are previewing them. Some are open public previews you can opt into today (with caveats). Others are private previews that require an explicit invitation, and a few are deeper than that. ## Future roadmap entries: 223 items The "What's coming later this year" section of the blog has **223 Q-marked entries**. Each one looks like: > **[Q3 2026, public preview]** Off-session payments API for one-time purchases initiated by an agent. The breakdown: - **78** entries marked *"GA"* (future GA, not GA today) - **62** entries marked *"public preview"* (future public preview) - **84** entries marked *"private preview"* (future private preview, the longest queue) - 14 entries scheduled for Q1 2026 (already past), 65 for Q2, 62 for Q3, 82 for Q4 Add it up: 3 day-of GA + 46 previewed + 223 roadmapped = 272. The remaining 16 are the open-standards and partner-product launches (UCP, MPP, ACP plus x402, Privy SDK, Tempo mainnet, and a few more) that don't fit Stripe's own preview taxonomy because they don't gate on Stripe approval. ## What this means in practice **If a colleague says "Stripe shipped 288 things today,"** they're literally correct. **If a colleague says "you can use 288 things today,"** they're wrong by a factor of ten. 
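The taxonomy above reconciles, as a quick sanity check (all counts as reported from Stripe's announcement blog and press release):

```typescript
// Tally of the Sessions 2026 launch taxonomy described above.
const dayOfGA = 3        // Workflows, network cost passthrough, Projects
const previewed = 46     // "we previewed" mentions in the blog
const roadmapped = 223   // Q-marked "coming later this year" entries

const stripeGated = dayOfGA + previewed + roadmapped
const openStandardsAndPartners = 288 - stripeGated // UCP, MPP, ACP, x402, ...

// The roadmap's quarterly schedule independently sums to the same 223:
const byQuarter = 14 + 65 + 62 + 82 // Q1 + Q2 + Q3 + Q4

console.log(stripeGated, openStandardsAndPartners, byQuarter) // 272 16 223
```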
But here's the thing the coverage usually misses: **the open-standards work and the partner-product work also shipped, and those don't need any Stripe approval.** Add up: - 3 things Stripe explicitly calls GA on day one - ~17 open-standards and partner products that shipped at Sessions and need no Stripe gate - ~24 pre-existing Stripe primitives that newly *pair* with the agentic stack That's **40+ things you can use this week** without filling out a single application. That's the honest developer reality. This distinction comes back throughout: 🟢 means usable today with no approval, 🟡 means usable today with a preview opt-in or self-service waitlist, 🔴 means actually blocked behind a private preview that requires an invitation. --- # 3. The five mental shifts a developer needs Before the protocols and projects, five intuitions about how money moves and how products get built need updating. None of these are minor. Each one is a wholesale rewrite of an assumption most Stripe users have held for a decade. ## Shift 1: Agents are first-class economic actors For two decades, the "buyer" in a Stripe transaction has been a human with a card. Your checkout flow is designed for them: text fields, a CVV box, maybe Apple Pay. Risk models assume a human typed the card. Disputes assume a human will eventually claim chargeback. Customer support assumes you can email them. That's all changing. **An AI agent is now a buyer Stripe specifically supports.** Concretely: - **Link agent wallets** (`link.com/agents`): the agent has a wallet under the user's Link account. Stripe handles the auth dance. - **Issuing for agents** (private preview): Stripe issues a single-use virtual card *for a specific agent task*. The card has its own MCC restrictions, spending limit, and expiry. After the task, the card dies. - **Shared Payment Tokens (SPTs)**: a new tokenization scheme designed for agents to share a payment method across multiple merchants in a single shopping session. 
- **Agentic Commerce Suite (ACS)**: a server-side product that lets you sell to agents acting on your existing merchant Stripe account. ChatGPT, Gemini, Copilot, Meta AI all become checkout surfaces. - **Universal Commerce Protocol (UCP) + Agentic Commerce Protocol (ACP)**: two open standards Stripe accepts. Agents can browse your catalog and buy without you uploading anything to OpenAI or Google. - **Machine Payments Protocol (MPP) + x402**: for *machine-to-machine* payments where one server pays another server (your agent paying for an API call, not buying a coffee). - **Radar 2026**: risk models that distinguish real users from bots, multi-account abuse, free-trial farming, token theft. The dev consequence: **"is the buyer a human?" is now a question your code must answer.** And if the answer is "no, it's an agent," there's a different set of best practices, a different set of fraud signals, and a different set of pricing primitives. > "Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away." > > — [@Cloudflare](https://x.com/Cloudflare/status/2049545195914498139), April 29 2026 ## Shift 2: Per-token billing is real and on-chain Pre-Sessions, "metered billing" meant Stripe Billing usage records: you collect events, batch them up, send them to Stripe at the end of a billing cycle, and the customer gets invoiced monthly. The granularity was hours, sometimes minutes. Post-Sessions, you can charge **per yielded token** in a generator function, and the charge settles on-chain in milliseconds. 
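Mechanically, the stream debits a pre-funded payment channel one unit at a time and halts when the channel runs dry. A toy model of that accounting (plain TypeScript, not the real `mppx` API; all names here are illustrative):

```typescript
// Toy sketch of streamed per-unit billing against a payment channel.
// Illustrative only: the real mppx SDK settles each debit on-chain;
// here the "channel" is just an in-memory balance in micro-dollars.

type Channel = { balanceMicro: number }

function charge(ch: Channel, priceMicro: number): boolean {
  if (ch.balanceMicro < priceMicro) return false // channel ran dry
  ch.balanceMicro -= priceMicro
  return true
}

// Each unit is paid for *before* it is yielded, so an unpaid unit is
// never generated — the stream simply stops mid-flight.
function* meteredStream(units: string[], ch: Channel, priceMicro: number) {
  for (const unit of units) {
    if (!charge(ch, priceMicro)) return
    yield unit
  }
}

const channel: Channel = { balanceMicro: 3_000 } // $0.003 pre-funded
const served = [...meteredStream(['a', 'b', 'c', 'd', 'e'], channel, 1_000)] // $0.001/unit
// served = ['a', 'b', 'c']: the channel drained after three units
```

The design property worth noticing: the seller's exposure is zero (nothing ships unpaid) and the buyer's exposure is capped by the channel's funding, which is what makes sub-cent pricing workable without invoices or disputes.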
Here's what that actually looks like:

```ts
import { Mppx, tempo } from 'mppx/nextjs'

const mppx = Mppx.create({
  methods: [tempo({
    currency: '0x20c0000000000000000000000000000000000000', // pathUSD on Tempo
    recipient: '0xa726a1CD723409074DF9108A2187cfA19899aCF8',
    sse: true,
  })],
})

export const GET = mppx.session({ amount: '0.001', unitType: 'token' })(
  async () => {
    return async function* (stream) {
      const tokens = generateTokens()
      for (const token of tokens) {
        await stream.charge() // 🪙 0.001 pathUSD debit, on-chain
        yield token
      }
    }
  }
)
```

This is real. Verified at `docs.tempo.xyz/guide/machine-payments/streamed-payments`. Every `await stream.charge()` debits a payment channel by 0.001 pathUSD. If the channel runs dry mid-stream, the generator halts. There's no PaymentIntent state machine, no webhook reconciliation, no batch settlement. The on-chain ledger is the ledger.

The dev consequence: **pricing models you couldn't express before are now expressible.** Per-frame video transformation. Per-row data API. Per-call DeFi quote. Per-tile map render. Anything where the unit of value is too small to be a Stripe Invoice line item but you still want to charge for it.

> Today at Sessions, @Stripe introduced streaming payments on Tempo.
>
> A single agent run produced thousands of sub-cent transactions onchain, with payment landing the instant each token was burned. Metered, paid for as it ran.
>
> The infrastructure for pay-per-use at scale is here.
>
> — [@tempo](https://x.com/tempo/status/2049572539010285873), April 29 2026

## Shift 3: Stripe is becoming a bank

You've used Stripe for taking payments. Going forward, Stripe also wants to be where your business holds, moves, and earns money.
Stripe Treasury, currently in *limited public preview*, does this: - Free instant transfers between Stripe Treasury businesses (no ACH delay) - Stripe credits as yield (you keep money on Stripe and earn interest) - A 2% cashback Mastercard - Multi-currency balances (15 currencies on the roadmap) - A *banking* MCP server (private preview): yes, an LLM can move your money The dev consequence: **"where does the customer's money live?" is no longer just a Stripe Connect question.** If you're building a marketplace, a SaaS, a vertical, or a fintech, you can now hold the funds on Stripe rails the whole time and skip a bunch of integrations with Mercury, Brex, Modern Treasury, etc. > "I've gotten to use @stripe Treasury for the last few months and it feels incredibly natural to manage your money in the same place you manage your business, across multiple currencies, with auto rewards on card spend and savings, and payouts globally." > > — [@jeff_weinstein](https://x.com/jeff_weinstein/status/2049579608266445112), April 29 2026 Caveat worth stressing: **Treasury itself is in limited public preview.** Stripe's own doc page reads: *"only available for some Stripe users."* Your account may not have it. Check `Dashboard → Treasury` before promising a customer anything Treasury-dependent. ## Shift 4: Distribution flipped via Stripe Projects Pre-Sessions: building a SaaS app meant signing up separately for Clerk (auth), Neon (Postgres), Fly.io (hosting), and a few more. Each had its own dashboard, its own billing, and its own credit card on file. Post-Sessions: **Stripe Projects + `projects.dev/providers`.** Sign up once, on Stripe. The 32 launch providers (Fly.io, Vercel, Supabase + 29 more) are all reachable through one CLI: ```bash $ stripe projects init my-agent-app ? Which providers? (space to select) > ◉ Clerk (auth) > ◉ Neon (postgres) > ◉ Fly.io (hosting) > ◉ Inngest (jobs) > ◯ Tinybird (analytics) ✓ Provisioned 4 services. Wrote .env.local with 12 secrets. 
✓ Billing consolidated under Stripe customer cus_xxx.
```

This is real. It's the only thing the press release explicitly calls *"now available to everyone."* The catalog is browsable at [projects.dev/providers](https://projects.dev/providers).

The dev consequence: **time-to-first-deploy collapsed.** A v0 of an AI app went from "configure 8 dashboards" to "one CLI command." For hackathons, internal tools, and prototypes, this is a step-change in setup time. For production, it's a billing simplification that compounds: every provider's invoice flows through Stripe, your CFO sees one line per provider per month, and audit trails stay in one place.

## Shift 5: Radar pivots from card fraud to token fraud

The headline stat from Will Gaybrick's keynote: *"1 in 6 AI signups is malicious."*

The pre-Sessions Radar was tuned for card-not-present fraud: stolen credit cards being used at e-commerce checkouts. The post-Sessions Radar adds:

- **Free trial abuse**: distinguishing real signups from prompt-engineering farms
- **Bot abuse**: distinguishing humans from automated traffic on your signup form
- **Multi-account / account-sharing abuse**: same person creating 100 accounts to abuse free tiers
- **Pay-as-you-go abuse**: token theft from compromised API keys
- **Cross-network Stripe Signals**: fraud signals from off-Stripe traffic shared back into your Radar config

The dev consequence: **your fraud surface area changed.** If you're running an AI product with a free tier, and most AI products are, you should assume a meaningful percentage of your signups are fraudulent. Radar 2026 gives you the primitives to detect that. One of these gets wired into a project later in the guide.

---

# 4. The protocol stack

Sessions 2026 introduced or formalized **seven protocols** that together describe how AI agents and merchants interact. Most coverage threw all seven into one bucket called "agent commerce." That's wrong. Each one solves a different problem.
As a dev you need to know which one applies when. Here's the map: ![The agent commerce protocol stack: ACS/UCP/ACP for human merchants, MPP/x402 for M2M, skill.md for discovery, Tempo L1 for settlement](/post-images/stripe-sessions-2026-developer-guide/01-protocol-stack.jpg) Now the deep dive. ## 4.1 Universal Commerce Protocol (UCP) **What it is:** An open standard for product catalogs that agents can crawl. The merchant publishes a `JSON` manifest at `/.well-known/ucp.json`. Agents (Claude, ChatGPT, custom) fetch it, parse it, and use it to satisfy user shopping queries. **Where it lives:** [ucp.dev](https://ucp.dev), [github.com/Universal-Commerce-Protocol/ucp](https://github.com/Universal-Commerce-Protocol/ucp). **Status:** 🟢 Open standard, live today, no Stripe approval needed. **Minimal manifest:** ```json { "version": "1.0", "merchant": { "name": "Example Coffee Co", "domain": "examplecoffee.com", "support_email": "help@examplecoffee.com" }, "catalog_endpoint": "https://examplecoffee.com/api/ucp/catalog", "checkout_endpoint": "https://examplecoffee.com/api/ucp/checkout", "payment_methods": ["card", "link", "mpp"], "supported_currencies": ["USD", "EUR"], "shipping_zones": ["US", "EU", "UK"] } ``` **The catalog endpoint returns paginated products:** ```json { "products": [ { "id": "sku_001", "name": "Premium Coffee Beans, Yirgacheffe", "description": "Single-origin Ethiopian, fresh-roasted weekly", "price": { "amount": 2400, "currency": "USD" }, "inventory": 47, "images": ["https://examplecoffee.com/img/sku_001.jpg"], "variants": [ { "id": "sku_001_250g", "name": "250g", "price_delta": 0 }, { "id": "sku_001_500g", "name": "500g", "price_delta": 1800 } ] } ], "next_page": "/api/ucp/catalog?cursor=eyJpZCI6InNrdV8wMDIifQ" } ``` **Stripe accepts UCP as the catalog format** for ACS (the Agentic Commerce Suite). Meaning: if you publish a UCP manifest, you're already half-integrated with Stripe ACS. 
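On the consuming side, an agent walks the catalog endpoint page by page until `next_page` runs out. A minimal pager sketch (the types and the in-memory `pages` stub are mine, following the UCP field names above; a real agent would make async HTTPS fetches rather than a synchronous lookup):

```typescript
// Walk a UCP catalog, following next_page cursors until exhausted.
// fetchPage is injected so the pager works against HTTP, a cache, or a mock.

type UcpProduct = {
  id: string
  name: string
  price: { amount: number; currency: string }
}

type UcpCatalogPage = {
  products: UcpProduct[]
  next_page?: string | null
}

function crawlCatalog(
  firstPath: string,
  fetchPage: (path: string) => UcpCatalogPage,
): UcpProduct[] {
  const all: UcpProduct[] = []
  let path: string | null | undefined = firstPath
  while (path) {
    const page = fetchPage(path)
    all.push(...page.products)
    path = page.next_page // undefined/null ends the walk
  }
  return all
}

// Demo against an in-memory two-page catalog (hypothetical SKUs).
const pages: Record<string, UcpCatalogPage> = {
  '/api/ucp/catalog': {
    products: [{ id: 'sku_001', name: 'Yirgacheffe 250g', price: { amount: 2400, currency: 'USD' } }],
    next_page: '/api/ucp/catalog?cursor=abc',
  },
  '/api/ucp/catalog?cursor=abc': {
    products: [{ id: 'sku_002', name: 'Sidamo 250g', price: { amount: 2200, currency: 'USD' } }],
    next_page: null,
  },
}

const products = crawlCatalog('/api/ucp/catalog', (p) => pages[p])
// products lists sku_001 then sku_002
```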
> 🚨 JUST IN: Amazon, Microsoft, Meta, Salesforce, and Stripe just joined Google's UCP. > > The protocol war for agentic commerce is over. > > Google's UCP won. > > — [@sytaylor](https://x.com/sytaylor/status/2047980367148155065), April 25 2026 ## 4.2 Agentic Commerce Protocol (ACP) **What it is:** A second open standard for agent-merchant commerce, slightly different focus. Where UCP is catalog-first ("here's everything I sell"), ACP is capability-first ("here's what I can do for you, agent"). They're complementary; many merchants will publish both. **Where it lives:** [agenticcommerce.dev](https://agenticcommerce.dev). **Status:** 🟢 Open standard, live today. **The capabilities feed:** ```json { "version": "1.0", "merchant": "examplecoffee.com", "capabilities": [ { "name": "buy_coffee", "description": "Purchase coffee beans for delivery", "endpoint": "/api/acp/buy", "input_schema": { "type": "object", "properties": { "sku": { "type": "string" }, "quantity": { "type": "integer", "minimum": 1 }, "shipping_address": { "$ref": "#/definitions/Address" } } } }, { "name": "subscribe_to_box", "description": "Monthly subscription, varies by season", "endpoint": "/api/acp/subscribe", "pricing": "https://examplecoffee.com/api/acp/box/pricing" } ] } ``` ACP is closer in spirit to OpenAPI/MCP than UCP. If you're building a more complex merchant (services, subscriptions, configurable products), ACP fits better. ## 4.3 Stripe Agentic Commerce Suite (ACS) **What it is:** Stripe's *server-side* product that wires UCP / ACP catalogs into the major agent surfaces: ChatGPT, Gemini, Copilot, Meta AI. You upload your products via the Stripe API; ACS handles distribution. **Status:** 🟡 Public preview, requires API version `2026-04-22.preview`. **Doc:** `docs.stripe.com/agentic-commerce`. **Minimal integration:** ```ts import Stripe from 'stripe' const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, { apiVersion: '2026-04-22.preview', }) // 1. 
Make sure you have a Stripe Profile (the prereq) const profile = await stripe.profiles.create({ business_name: 'Example Coffee Co', domain: 'examplecoffee.com', description: 'Single-origin coffee from Ethiopia', }) // 2. Sync your catalog for (const product of myCatalog) { await stripe.products.create({ name: product.name, description: product.description, default_price_data: { unit_amount: product.price_cents, currency: 'usd', }, metadata: { inventory_count: String(product.inventory), profile_id: profile.id, }, }) } // 3. Configure ACS to pull from this account await stripe.profiles.update(profile.id, { agentic_commerce: { enabled: true }, }) ``` After this, your products start appearing in ChatGPT/Gemini/etc. when users ask agents to shop. **The trade-off:** ACS routes the discovery through Stripe (good for distribution) but locks you into Stripe-mediated checkout (the agent uses Stripe's payment infrastructure). Compare with UCP/ACP, where you keep full control of checkout and the agent comes directly to your endpoints. ## 4.4 Machine Payments Protocol (MPP) **What it is:** An open protocol for *machine-to-machine* payments, your server paying another server for a service call. It's not for buying a coffee; it's for an agent paying an API for a function call. **Where it lives:** [mpp.dev](https://mpp.dev), [@mpp on X](https://x.com/mpp). **SDKs:** `mppx` (JS, the most popular), `stripe/mpp-go`, `stripe/mpp-rb`. **Status:** 🟢 Open standard, live today. **The `HTTP` semantics:** ``` GET /api/poem → 402 Payment Required X-MPP-Challenge: pay 0.001 pathUSD per word to 0xa726... 
X-MPP-Methods: tempo, x402 GET /api/poem X-MPP-Payment: → 200 OK Transfer-Encoding: chunked (streamed response with periodic charge events) ``` **The merchant integration:** ```ts import { Mppx, tempo } from 'mppx/express' import express from 'express' const app = express() const mppx = Mppx.create({ methods: [tempo({ currency: '0x20c0000000000000000000000000000000000000', recipient: '0xa726a1CD723409074DF9108A2187cfA19899aCF8', sse: true, })], }) app.get('/api/poem', mppx.session({ amount: '0.001', unitType: 'word' }), async (req, res) => { return async function* (stream) { const words = await generateWords() for (const word of words) { await stream.charge() // debits payer's channel yield word } } } ) ``` **The agent (client) side:** ```ts import { MppxClient } from 'mppx/client' const client = new MppxClient({ wallet: myWallet, fundedAmount: '1.00', // pre-fund $1 of pathUSD into channel }) const stream = await client.fetch('https://api.example.com/poem', { maxSpend: '0.50', // never let the call cost more than $0.50 }) for await (const chunk of stream) { console.log(chunk) } console.log('Spent:', stream.totalSpent) // exact, on-chain ``` The economic property here is non-trivial. You can charge sub-cent amounts that wouldn't be feasible on credit-card rails (where the merchant pays $0.30 per charge in fees). On Tempo with `mppx`, the gas is sub-cent and the merchant captures essentially 100% of the charge. This is what enables the new pricing models like per-word LLM responses, per-row data feeds, per-frame video transforms. > AI agents can now register domains and pay for them with zero human involvement using @mpp, @tempo's & @stripe's new protocol for machine-to-machine payments. > > Our team built this demo at the @tempo hackathon. It's live. It works. And it's a preview of an entirely new category. 
> > — [@domaprotocol](https://x.com/domaprotocol/status/2047361212552478983), April 23 2026 ## 4.5 x402 **What it is:** Coinbase's open protocol that mirrors MPP, with a different on-chain settlement layer (Base / `USDC`). Stripe accepts x402 alongside MPP. **Status:** 🟢 Live today, accepted by Stripe rails. **Doc:** `docs.stripe.com/payments/machine/x402`. x402 is to MPP what x86 is to ARM: same idea, different ecosystem. If you're already on Coinbase / Base infra, x402 is your fit. If you're building greenfield with Stripe-aligned tooling, MPP is cleaner. **A well-built merchant should accept both.** `mppx` makes this easy by treating each as a method on a `methods: []` array. ## 4.6 link.com/skill.md **What it is:** A new file format Stripe introduced, a single Markdown file at your domain root that tells agents how to interact with you. Like `robots.txt` but for agentic commerce. **Where it lives:** [link.com/skill.md](https://link.com/skill.md) is the spec page. **Status:** 🟢 Open standard, live today. **Example file (you publish at `https://example.com/skill.md`):** ```yaml --- name: example-coffee description: Single-origin coffee from Ethiopia, fresh-roasted weekly endpoints: catalog: https://example.com/.well-known/ucp.json checkout: https://example.com/api/agent-checkout mpp: https://example.com/.well-known/mpp.json payment_methods: - link - card - mpp - acp languages: [en, es, ja] shipping: zones: [US, EU, UK] free_threshold_cents: 5000 support: email: help@example.com url: https://example.com/support --- # Example Coffee — agent guide We sell single-origin Ethiopian coffee, fresh-roasted weekly. Every order ships within 24 hours. To purchase as an agent: 1. **Browse the catalog** at `/api/ucp/catalog`. Products have stable SKU IDs. 2. **Initiate checkout** by POSTing to `/api/agent-checkout` with `{ sku, quantity, shipping_address, payment_method }`. 3. We respond with a 402 + MPP/Link/card challenge depending on your method. 4. 
Pay; we'll email the customer a tracking number within 24 hours. For the cleanest UX, agents should: - Show the user the price in **their** local currency before charging (we accept Stripe Adaptive Pricing — the agent doesn't need to do FX). - Confirm shipping address explicitly. We do not auto-fill from Link. - Use Link agent wallets when possible — they're token-bound to the user, so we can email them the receipt automatically. Built with ❤️ in Brooklyn. ``` This file is **the agent-equivalent of a website's homepage.** It's where an agent reads "what can I do here?" The format is intentionally narrative, agents are language models, after all, and they extract structure from natural prose better than from rigid JSON. > We just launched the @Link CLI. Tell your friendly neighborhood agent about it — agents can use the Link CLI to create single-use credentials that you get to synchronously approve each time. > > I asked Claude to buy itself a gift. It chose HTTPZine on Gumroad. > > — [@patrickc](https://x.com/patrickc/status/2049535449484644702), April 29 2026 ## 4.7 Tempo **What it is:** A new EVM-compatible Layer 1 blockchain affiliated with Stripe. Native quote token is **pathUSD** at address `0x20c0000000000000000000000000000000000000`. Designed for high-frequency, sub-cent payments, the on-chain settlement layer for MPP and `mppx`. **Status:** 🟢 Mainnet live since Sessions. **Where it lives:** [docs.tempo.xyz](https://docs.tempo.xyz). **Key technical facts:** - EVM compatible, uses `viem`, hex addresses, standard wallet RPCs - TIP-20 = ERC-20 equivalent (Tempo Improvement Proposals = TIPs) - TIP-403 = policy contracts (programmable spend rules; e.g. 
"this card can only spend on GitHub + Cloudflare in any 24h window")
- "Zones" = like Cosmos zones, sovereign sub-rollups for high-throughput merchants
- Sub-cent gas, sub-second finality
- Stablecoin DEX built into the protocol (so pathUSD ↔ USDC ↔ USDT swaps don't require a Uniswap-style aggregator)

**Connecting to Tempo:**

```ts
import { createWalletClient, createPublicClient, http, erc20Abi, parseUnits } from 'viem'
import { privateKeyToAccount } from 'viem/accounts'
import { tempo } from 'viem/chains' // ships in viem 2.x

const publicClient = createPublicClient({
  chain: tempo,
  transport: http('https://rpc.tempo.xyz'),
})

const walletClient = createWalletClient({
  chain: tempo,
  transport: http('https://rpc.tempo.xyz'),
  account: privateKeyToAccount(process.env.WALLET_PRIVATE_KEY!),
})

// Send pathUSD
const hash = await walletClient.writeContract({
  address: '0x20c0000000000000000000000000000000000000',
  abi: erc20Abi,
  functionName: 'transfer',
  args: [recipient, parseUnits('0.50', 6)], // pathUSD has 6 decimals
})
```

The strategic point: Tempo gives Stripe an on-chain settlement layer that **isn't Ethereum mainnet** (too expensive for sub-cent payments) and **isn't a permissioned ledger** (developers won't build on those). It's a public, permissionless EVM L1 with the right economic properties for the use cases Stripe is opening up. That's a lot to deliver in a single chain, but mainnet is live and the docs are working.

> I am genuinely stoked for @tempo's new virtual addresses feature.
>
> In almost 8 years building in crypto I've had to solve the deposit-address problem at literally _every single company_ I've worked at. Every time it's the same build out: generate a unique address per customer, sweep funds back to a master wallet, manage gas in every leaf address, reconcile timing differences, handle the edge cases.
>
> It's the kind of thing that sounds simple in a design doc and then can end up eating a quarter of your team's roadmap.
>
> It is _so cool_ to make this a protocol primitive — and totally obvious in hindsight.
No sweeps, no per-address gas, no state bloat from millions of customer accounts sitting around with minuscule amounts of dust.
>
> — [@0xDaedalus](https://x.com/0xDaedalus/status/2049118728286351452), April 28 2026

## 4.8 Putting it all together: which protocol when?

Use this table to decide:

| Your scenario | Use |
|---|---|
| You're a human-facing merchant; agents will buy from you on behalf of users | **UCP + ACP**, optionally Stripe **ACS** for distribution |
| You're an API; other servers/agents pay you per-call | **MPP** (with `mppx`) and/or **x402** |
| You're a service that streams (LLM, video, audio); agents pay per-token | **MPP + Tempo + payment channels** (see Project 1 below) |
| You want to advertise yourself to all agents, broadly | **`link.com/skill.md`** at your domain root |
| You're building agentic infrastructure (an agent itself, a wallet, a router) | All of the above + **Privy** for user-side wallets, **Issuing for agents** for cards, **Tempo SDK** for settlement |

A well-built modern merchant should publish: a UCP manifest, an ACP capabilities file, a `skill.md`, an MPP server, and an ACS profile. That's five integrations, but three of them (UCP, ACP, `skill.md`) are static JSON/YAML files, the ACS profile is a couple of API calls, and only the MPP server is real code, the `mppx` middleware. Total time, if you know what you're doing: a weekend.

---

# 5. The Stripe primitives map

Now the catalogue: every primitive Stripe announced or extended at Sessions 2026, organized by surface, with verified status, the doc URL, and a one-liner on what to use it for.
## 5.1 Money & payments

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| **Stripe Treasury** | 🟡 limited public preview | `docs.stripe.com/treasury` | Holding money on Stripe; multi-currency; replaces Mercury/Brex |
| Free instant US-business transfers | 🟡 (under Treasury preview) | `docs.stripe.com/treasury#send-instant-transfers-to-stripe-profiles` | Marketplace payouts that settle in seconds |
| 2% cashback Stripe Mastercard | 🟡 (under Treasury preview) | `docs.stripe.com/treasury/cards#get-cashback-rewards` | Replace your business credit card |
| Treasury 15-currency support | ⏳ EOY 2026 | `docs.stripe.com/treasury#request-access` | Multi-region treasuries |
| Treasury MCP (banking via agents) | 🔴 private preview | `docs.stripe.com/mcp#request-access-agentic-treasury` | Letting an agent move money |
| **Issuing for agents** | 🔴 **private preview** | `docs.stripe.com/issuing/agents` | Single-use virtual cards for agent tasks |
| Stripe Issuing (regular) | 🟢 GA | `docs.stripe.com/issuing` | Card issuing today; stand-in for "Issuing for agents" |
| Stablecoin-backed cards (30 countries) | 🟢 GA | `docs.stripe.com/issuing/stablecoin-cards` | Settle Issuing transactions in stablecoins |
| Stablecoin payments (32 markets) | 🟢 GA | `docs.stripe.com/payments/stablecoin-payments` | Accept stablecoins as a checkout method |
| Capital lines of credit | 🟡 public preview | `support.stripe.com/questions/stripe-capital-line-of-credit-faq` | Working capital for SaaS |
| Capital w/o Stripe processing history | 🟡 public preview | `docs.stripe.com/capital/import-non-stripe-data` | Capital for non-Stripe businesses |
| Global Payouts (existing) | 🟢 GA | `docs.stripe.com/global-payouts` | Send to 100 fiat / 160 stablecoin countries |
| Global Payouts to Link users | 🟡 public preview | `docs.stripe.com/global-payouts/send-money/link` | Instant-settle payouts to consumers |
| Stripe Atlas SAFE funding via Treasury | 🟢 GA | `docs.stripe.com/atlas/fundraise-with-safes` | Founder fundraising on rails |

## 5.2 Agentic commerce

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| **UCP** | 🟢 open standard | `ucp.dev` | Agent-readable catalogs |
| **ACP** | 🟢 open standard | `agenticcommerce.dev` | Agent-readable capabilities |
| **MPP** + `mppx` | 🟢 open standard | `mpp.dev` | Machine-to-machine payments |
| **x402** | 🟢 open standard | `docs.stripe.com/payments/machine/x402` | M2M payments via Coinbase rails |
| Stripe **ACS** | 🟡 public preview | `docs.stripe.com/agentic-commerce` | Distribute catalog to ChatGPT/Gemini/Copilot/Meta |
| ACS for platforms (Connect) | 🟡 public preview | `docs.stripe.com/connect/saas/tasks/enable-in-context-selling-on-ai-agents` | Enable ACS for all your platform's merchants |
| Stripe profiles | 🟡 public preview | `docs.stripe.com/get-started/account/profile` | Prereq for ACS |
| **Shared Payment Tokens** | 🟡 public preview, US | `docs.stripe.com/agentic-commerce/concepts/shared-payment-tokens` | Multi-merchant agent shopping in one session |
| **Link agent wallet** | 🟡 public preview | `link.com/agents` | Agent has a wallet under the user's Link |
| `@stripe/link-cli` | 🟢 open NPM | `github.com/stripe/link-cli` | Test Link agent flows locally |
| `link.com/skill.md` | 🟢 open standard | `link.com/skill.md` | Agent-readable merchant manifest |
| **Streaming payments via mppx + Tempo** | 🟢 open SDK | `docs.tempo.xyz/guide/machine-payments/streamed-payments` | Per-token streaming billing |
| Bot abuse prevention | 🟡 public preview | `docs.stripe.com/radar/bot-abuse` | Bot fraud on signup forms |
| Agent guardrails (Settings → Approvals) | 🟢 GA | `docs.stripe.com/account/approvals` | Human-in-the-loop for agent actions |

## 5.3 AI surfaces

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| **Stripe MCP server** | 🟡 public preview | `https://mcp.stripe.com` + `docs.stripe.com/mcp` | LLM-operable Stripe ops |
| MCP `execute_analytics` tool | 🔴 private preview within MCP | (same) | Sigma queries via MCP |
| **Stripe Console** | 🔴 **private preview, waitlist** | `docs.stripe.com/dashboard/console` | Agent-facing operations UI |
| **Claimable Sandboxes API** | 🟡 public preview, integration requires email | `docs.stripe.com/sandboxes/claimable-sandboxes` | Per-developer ephemeral test envs |
| Automated managed API key exchange | 🟡 public preview | `docs.stripe.com/stripe-apps/api-authentication/managed-api-keys` | Auto-rotate API keys for installed apps |
| Full-page multitab Stripe Apps | 🔴 private preview, allowlist | `docs.stripe.com/stripe-apps/patterns/full-page-apps` | Embedded surfaces inside the Dashboard |
| **Stripe Workflows** | 🟢 GA | `docs.stripe.com/workflows` | Automate any Stripe + 3p workflow |
| Public roadmap | 🟢 live | `stripe.com/roadmap` | See what's coming |

## 5.4 Platform / data primitives

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| **Custom Objects** | 🔴 **private preview, email-gated** | `docs.stripe.com/custom-objects` | Modeling vertical-SaaS objects on Stripe |
| **Stripe Database** | 🔴 **private preview, email-gated** | `docs.stripe.com/stripe-data/stripe-database` | Managed Postgres of all your Stripe data |
| Data Pipeline next-gen | 🟡 public preview | `docs.stripe.com/stripe-data/data-pipeline-next-gen` | Real-time stream to your warehouse |
| Reports API v2 | 🟡 public preview | `docs.stripe.com/reports/v2-api` | Programmatic Sigma SQL |
| Billing Scripts (3 new types) | 🔴 private preview | `docs.stripe.com/billing/scripts` | Custom billing logic in TypeScript |
| Adaptive Pricing | 🟢 GA | `docs.stripe.com/payments/currencies/localize-prices/adaptive-pricing` | Auto-FX checkout |
| Stripe Tax, auto US filing | 🟢 GA | `docs.stripe.com/tax/file-with-taxjar` | Hands-off US tax compliance |
| Tax Connectors (Shopify, NetSuite) | 🟡 public preview | `docs.stripe.com/use-stripe-apps/shopify` | Sync Stripe tax to your existing system |
| Tax ID validation at checkout | 🟡 public preview | `docs.stripe.com/payments/advanced/tax?api-integration=checkout#real-time-tax-id-validation` | Real-time B2B tax ID verification |
| **Managed Payments (MoR)** | 🟢 GA | `docs.stripe.com/payments/managed-payments` | Stripe-as-Merchant-of-Record |
| **Network cost passthrough (IC++)** | 🟢 GA, 45 markets | `docs.stripe.com/connect/network-cost-passthrough-platforms` | True interchange + fees pricing |
| Authorization Boost A/B testing | 🟡 public preview | `docs.stripe.com/payments/analytics/optimization/a-b-testing` | Optimize approval rates |
| Stripe Dashboard assistant | 🟢 live | `docs.stripe.com/assistant` | Natural-language Sigma queries |
| Standalone 3DS | 🟢 GA | `docs.stripe.com/payments/3d-secure/standalone-three-d-secure` | Add 3DS to non-Stripe payments |
| Payment plans (BNPL via Invoicing) | 🟡 public preview | `docs.stripe.com/invoicing/payment-plans` | "Pay in 4" on invoices |

## 5.5 Fraud / identity (Radar 2026)

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| Free trial abuse prevention | 🟡 public preview | `docs.stripe.com/radar/free-trial-abuse` | Stop multi-account free-tier farming |
| Pay-as-you-go abuse | 🟡 public preview | `docs.stripe.com/radar/pay-as-you-go-abuse` | Stop API key theft / token theft |
| Multi-account / account-sharing | 🟡 public preview | `docs.stripe.com/radar/multi-account-and-account-sharing-abuse` | Detect linked accounts |
| Stripe Signals (off-Stripe) | 🟢 GA | `docs.stripe.com/signals` | Cross-network fraud signals |
| Stripe Signals, disputes | 🟡 public preview | `docs.stripe.com/radar/multiprocessor#fraudulent-dispute` | Predict dispute risk |
| Custom Radar models | 🟢 live (expansion in Q2 private preview) | `docs.stripe.com/radar/custom-fraud-models` | Train custom fraud models on your data |
| Radar for Platforms | 🟢 GA | `docs.stripe.com/radar/radar-for-platforms` | Marketplace-wide fraud rules |
| Smart Disputes evidence library | 🟢 GA | `docs.stripe.com/disputes/set-up-smart-disputes` | Auto-respond to chargebacks |
| AI-powered evidence recommendations | 🟢 GA | `docs.stripe.com/disputes/set-up-smart-disputes#provide-more-data-at-dispute-time` | "What evidence should I add?" |

## 5.6 Stablecoins / crypto (the Bridge + Privy + Tempo stack)

> We couldn't be more excited to welcome @privy_io to Stripe!
>
> Crypto has enabled the rise of stablecoins. But the converse is not as well recognized: stablecoins are enabling an explosion in web3 app development. And Privy is building foundational infrastructure.
>
> — [@wgaybrick](https://x.com/wgaybrick/status/1932832876023964143), June 11 2025

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| Bridge fiat ramps | 🟢 GA | `apidocs.bridge.xyz` | USD/EUR/BRL/MXN + new COP/GBP |
| Bridge multichain | 🟢 GA | (same) | New: Tempo, Plasma, Celo, Sui |
| Bridge Open Issuance | 🟢 GA via Bridge | `apidocs.bridge.xyz` | Custom branded stablecoins |
| **Privy Digital Asset Accounts** | 🟡 public preview | `docs.privy.io/wallets/accounts/overview` | Multichain wallets API |
| Privy flexible custody | 🟢 GA | `docs.privy.io/wallets/overview/flexible-custody` | Hybrid custody models |
| Privy custodial wallets | 🟢 GA | `docs.privy.io/wallets/custodial-wallets/overview` | User-facing wallets |
| Privy Earn (Morpho DeFi) | 🟢 GA | `docs.privy.io/transaction-management/overview#earn` | DeFi yield on user balances |
| Privy agentic wallets | 🟢 GA | `docs.privy.io/recipes/agent-integrations/agentic-wallets` | Wallets for AI agents |
| Privy agent CLI | 🟢 GA | `docs.privy.io/recipes/agent-integrations/agent-cli` | Local agent dev |
| Tempo Protocol | 🟢 mainnet live | `docs.tempo.xyz/protocol` | Settlement layer for MPP |

## 5.7 Distribution

| Primitive | Status | Doc | Use it for |
|---|---|---|---|
| **Stripe Projects** | 🟢 **GA, "now available to everyone"** | `docs.stripe.com/projects` | Provision SaaS infra via Stripe |
| Projects provider catalog | 🟢 live | `projects.dev/providers` | Browse the 32 launch providers |
| Stripe Apps Marketplace | 🟢 live | `marketplace.stripe.com` | Publish/install Stripe apps |
| Partner certification | 🟢 live | `docs.stripe.com/partners/training-and-certification` | Build trust as a Stripe partner |

That's the complete map. Print this section, screenshot it, fork the repo. This is your reference card for the next 6 months.

---

# 6. Ten projects worth building this quarter

Here's what's worth building with this stack. Each project below has:

- A clear thesis
- A buildability score (🟢 ship now / 🟡 shippable with caveats / 🔴 blocked on private preview)
- Architecture diagram
- Real code (verified against the canonical docs)
- The full stack you'd use
- Why this wins vs. Stripe's first-party products
- A go-to-market angle

The order is roughly by ship-difficulty, easiest first.

## Project 1: Tollgate — HTTP 402 paywall middleware for any API

**Status:** 🟢 Ship this weekend.
**Stripe approval needed:** None.
**Estimated time-to-MVP:** 2 days.
**Estimated time-to-paying-customers:** 2 weeks.

### Thesis

Every AI inference endpoint, every RAG pipeline, every video transformation service has the same problem: you can't charge sub-cent on Stripe credit-card rails (the fixed per-charge fee alone is $0.30). Today these services either bundle into monthly subscriptions (losing the long tail) or eat the loss as a "free demo."

`mppx` + Tempo solve this. Build a middleware library that turns any HTTP route into a per-call-billable endpoint, settled on-chain in milliseconds, with sub-cent gas.
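Before the server side, it helps to see the other half of the handshake. A minimal sketch of the agent-side retry loop, assuming the 402 response body carries the payment challenge and the proof travels back in a custom header. Both the `X-Payment-Proof` header name and the `pay` callback are hypothetical placeholders, not the canonical `mppx` client wire format:

```typescript
// Agent-side pay-per-call loop (illustrative sketch, not the real mppx client).
// A PayFn settles the challenge (e.g. signs a channel update) and returns a proof.
type PayFn = (challenge: string) => Promise<string>

async function callPaid(
  url: string,
  init: RequestInit & { headers?: Record<string, string> },
  pay: PayFn,
  fetchImpl: typeof fetch = fetch, // injectable for testing
): Promise<Response> {
  const first = await fetchImpl(url, init)
  if (first.status !== 402) return first // free route, or channel already funded

  // The 402 body carries the payment challenge; settle it, then retry once.
  const challenge = await first.text()
  const proof = await pay(challenge)
  return fetchImpl(url, {
    ...init,
    headers: { ...(init.headers ?? {}), 'X-Payment-Proof': proof },
  })
}
```

The server code in the next section is the mirror image of this loop: it emits the 402 challenge and verifies the proof per chunk.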
### Architecture

![Tollgate sequence diagram: agent → 402 → signed payment → streamed response with stream.charge() per chunk → Tempo L1 settlement](/post-images/stripe-sessions-2026-developer-guide/02-tollgate-flow.jpg)

### Code

The full integration is roughly 30 lines per route:

```ts
// app/api/inference/[model]/route.ts
import { Mppx, tempo } from 'mppx/nextjs'
import { Anthropic } from '@anthropic-ai/sdk'

const mppx = Mppx.create({
  methods: [tempo({
    currency: '0x20c0000000000000000000000000000000000000',
    recipient: process.env.WALLET_ADDRESS!,
    sse: true,
  })],
})

const anthropic = new Anthropic()

export const POST = mppx.session({
  amount: '0.001',
  unitType: 'word',
})(async (req: Request) => {
  const { messages } = await req.json()

  return async function* (stream) {
    const llmStream = await anthropic.messages.create({
      model: 'claude-sonnet-4-5',
      messages,
      stream: true,
      max_tokens: 4096,
    })

    for await (const event of llmStream) {
      if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
        const words = event.delta.text.split(/\s+/).filter(Boolean)
        for (const word of words) {
          await stream.charge() // 0.001 pathUSD per word
          yield word + ' '
        }
      }
    }
  }
})
```

That's the whole thing. Wrap any LLM call, any image-generation call, any data API call. The wrapper is generic.

### The product around it

Tollgate isn't *just* the middleware. The product is:

1. **The library**: `npm install tollgate`. Wraps `mppx` with sensible defaults (rate limits, billing dashboards, channel funding flows).
2. **The dashboard**: at `tollgate.dev`. Shows revenue per route, top spenders, per-route conversion (how many 402s converted to paid calls).
3. **The discovery layer**: a registry at `tollgate.dev/registry` that lists every public Tollgate-wrapped API. Agents can fetch this to find priced services.
4. **The funding agent**: a one-line install that lets any agent fund its own MPP channel via `await tollgate.fund({ amount: '5.00', via: 'stripe' })`.
It uses Stripe Crypto Onramp to convert USD to pathUSD.

### Go-to-market

The first 100 customers come from these channels, in order:

1. **Personal network of API builders.** AI infra people you already know, anyone with a HuggingFace, Replicate, Modal, or fly.io endpoint. Pitch: "10 minutes to add per-token billing, 0% revenue share."
2. **Hacker News launch.** Title: *"Show HN: Charge $0.0001 per API call, settled on-chain in milliseconds"*. Top 10 every time.
3. **Replit / Vercel templates.** Publish an `npx create-tollgate-app` template that scaffolds a minimal MPP server.
4. **An open-source flagship.** Build one well-known API (e.g. a Tempo-priced LLM proxy at `proxy.tollgate.dev`) that other devs cite when explaining "this is what `mppx` actually feels like."

### Business model

- Free for developers up to $1k/month routed through Tollgate
- $99/month for $1k–$10k routed
- 0.5% above $10k routed
- Tollgate doesn't custody funds; everything lives in the merchant's wallet on-chain.

### Why this wins

Stripe themselves can't ship this. Stripe is the **payment infrastructure**; they can't be opinionated about which API merchants should run. Tollgate's wedge is being the **dev-tool** layer on top of `mppx`, the same way Vercel was the dev-tool layer on top of AWS.

> Getting married next week 🎉🤵💍👰
>
> With the new Link agents CLI announced today at @stripe I can give my agent a budget to get us a gift!
>
> Really looking fwd to see how this turns out and what the Clanker will do with this 🤞 Will keep you posted
>
> — [@altryne](https://x.com/altryne/status/2049639344340873264), April 29 2026

---

## Project 2: AgentReady.io — Lighthouse for agentic commerce

**Status:** 🟢 Ship this weekend.
**Stripe approval needed:** None.
**Estimated time-to-MVP:** 3 days.
**Estimated time-to-press:** 1 week.

### Thesis

Three open agent-payment protocols (MPP, UCP, ACP) plus x402 just shipped.
Every merchant on the planet should publish a UCP catalog, an ACP capabilities feed, and a `skill.md`. But most don't know these exist yet. There's a 24-month window where being early is a moat.

AgentReady is the obvious wedge. It's the *Lighthouse for agentic commerce*: type any URL, get a score across 8 axes, with auto-generated PRs to fix what's missing.

### The 8-axis score

![AgentReady.io live-score card showing 8-axis breakdown for examplecoffee.com — total 45/100](/post-images/stripe-sessions-2026-developer-guide/03-agentready-score.jpg)

### Code: the scanner

The whole scanner is one function: it probes six well-known paths, then checks the homepage for the remaining two axes:

```ts
// scanner.ts
type AxisResult = {
  axis: string
  passing: boolean
  score: number
  evidence?: string
  fix?: string // markdown describing what to add
}

const PROBES = [
  { axis: 'llms_txt', path: '/llms.txt', points: 10, fix: 'Create /llms.txt with your scraping policy' },
  { axis: 'skill_md', path: '/skill.md', points: 10, fix: 'Create /skill.md (see link.com/skill.md spec)' },
  { axis: 'ucp', path: '/.well-known/ucp.json', points: 15, fix: 'Publish UCP manifest (see ucp.dev)' },
  { axis: 'acp', path: '/.well-known/acp.json', points: 15, fix: 'Publish ACP capabilities (see agenticcommerce.dev)' },
  { axis: 'mpp', path: '/.well-known/mpp.json', points: 15, fix: 'Publish MPP server config (see mpp.dev)' },
  { axis: 'x402', path: '/.well-known/x402', points: 10, fix: 'Add x402 endpoint declaration' },
] as const

async function scan(url: string): Promise<AxisResult[]> {
  const u = new URL(url)
  const results: AxisResult[] = []

  // Probe well-known paths in parallel
  const probeResults = await Promise.all(PROBES.map(async (p) => {
    try {
      const r = await fetch(`${u.origin}${p.path}`, {
        redirect: 'follow',
        signal: AbortSignal.timeout(5000),
      })
      return {
        axis: p.axis,
        passing: r.ok,
        score: r.ok ? p.points : 0,
        evidence: r.ok ? await r.text() : undefined,
        fix: r.ok ?
          undefined : p.fix,
      }
    } catch {
      return { axis: p.axis, passing: false, score: 0, fix: p.fix }
    }
  }))
  results.push(...probeResults)

  // Now parse the homepage for Schema.org + ACS profile
  const homeRes = await fetch(u.origin)
  const html = await homeRes.text()

  const hasSchema = html.includes('schema.org/Product')
  results.push({
    axis: 'schema_org',
    passing: hasSchema,
    score: hasSchema ? 10 : 0,
    fix: hasSchema ? undefined : 'Add Schema.org Product markup to product pages',
  })

  const hasProfile = html.includes('stripe-profile')
  results.push({
    axis: 'stripe_acs',
    passing: hasProfile,
    score: hasProfile ? 15 : 0,
    fix: hasProfile ? undefined : 'Add a <link> tag pointing to your StripeProfile',
  })

  return results
}
```

### Code: the auto-PR generator

```ts
// auto-pr.ts
import { Octokit } from '@octokit/rest'

async function generatePR(
  repo: { owner: string; name: string; branch: string },
  failing: AxisResult[],
  accessToken: string,
) {
  const octokit = new Octokit({ auth: accessToken })

  // Create branch
  const { data: ref } = await octokit.git.getRef({
    owner: repo.owner,
    repo: repo.name,
    ref: `heads/${repo.branch}`,
  })
  await octokit.git.createRef({
    owner: repo.owner,
    repo: repo.name,
    ref: 'refs/heads/agentready/auto-pr',
    sha: ref.object.sha,
  })

  // Generate fix files
  for (const result of failing) {
    const file = TEMPLATES[result.axis]({ /* domain context */ })
    await octokit.repos.createOrUpdateFileContents({
      owner: repo.owner,
      repo: repo.name,
      branch: 'agentready/auto-pr',
      path: file.path,
      message: `Add ${result.axis} for AgentReady.io`,
      content: Buffer.from(file.contents).toString('base64'),
    })
  }

  // Open PR
  const { data: pr } = await octokit.pulls.create({
    owner: repo.owner,
    repo: repo.name,
    title: `Make this site agent-ready (${failing.length} files)`,
    head: 'agentready/auto-pr',
    base: repo.branch,
    body: `Generated by [AgentReady.io](https://agentready.io).
This PR adds the missing files needed for AI agents to discover and transact with your store:

${failing.map(r => `- \`${r.fix}\``).join('\n')}

[View score on AgentReady.io →](https://agentready.io/score/${repo.owner}/${repo.name})`,
  })

  return pr.html_url
}
```

### The MCP server

Bonus: publish AgentReady as an MCP tool so other agents can self-check:

```ts
// mcp/agentready.ts
import { McpServer } from '@modelcontextprotocol/sdk'

const mcp = new McpServer({ name: 'agentready', version: '1.0.0' })

mcp.tool('score_url', {
  description: 'Score a URL for agentic commerce readiness',
  parameters: { url: { type: 'string' } },
}, async ({ url }) => {
  const results = await scan(url)
  const total = results.reduce((sum, r) => sum + r.score, 0)
  return {
    url,
    total_score: total,
    breakdown: results,
    auto_pr_url: total < 80
      ? `https://agentready.io/auto-pr?url=${encodeURIComponent(url)}`
      : null,
  }
})

mcp.serve({ port: 3000, transport: 'http' })
```

Now any agent (Claude, Cursor, ChatGPT) can run `score_url("examplecoffee.com")` and get back a structured response. Your scanner becomes part of every agent's toolkit.

### Stack

- **Frontend:** Next.js 15 + Tailwind + shadcn/ui (the score card is the entire UX)
- **Backend:** Cloudflare Workers (CDN-edge probes; sub-100ms probe latency)
- **Auth:** Clerk (one of Stripe Projects' 32 launch partners, provision via `stripe projects init`)
- **DB:** Cloudflare D1 (free tier covers 100K scans)
- **MCP server:** Hono on Cloudflare Workers
- **Open source:** the scanner library on GitHub at `agentready/scanner`

### Go-to-market

1. **HN launch.** *"Show HN: AgentReady.io, Lighthouse for agent commerce"*. Lead with the live score widget on a few well-known sites (Stripe.com, Vercel.com, Apple.com).
2. **Twitter / X.** Tweet a daily score of a different prominent site. The "Apple scored 25/100" tweets write themselves.
3. **DevRel partnership** with the UCP / MPP / ACP teams. They all need the discovery layer to grow. Be useful first, ask later.
4. **Sponsored placements in `link.com/skill.md`**: every merchant who scores well links back to AgentReady.io as their auditor.

### Business model

- Free public scans (anyone)
- $19/mo per domain for monitoring (alerts when score drops, weekly PRs)
- $499/mo for "AgentReady for Agencies" (multi-domain dashboards, white-label PRs, API access)
- Eventually: a yearly "State of Agentic Commerce" report (consultancies pay for this)

### Why this wins

Stripe is protocol-neutral. They can't favour their own ACS over UCP/ACP/MPP, and they can't push merchants to add 5 different files. AgentReady can. The first comprehensive Lighthouse-equivalent in this space takes the SEO-of-agent-commerce position. There's a 24-month window because the protocols just shipped.

---

## Project 3: Skillkit — `link.com/skill.md` generator + MCP-as-a-service

**Status:** 🟢 Ship this weekend.
**Stripe approval needed:** None.

### Thesis

`link.com/skill.md` is the agent-equivalent of a website homepage. It's a tiny YAML + Markdown file at your domain root that tells any agent: "here's my catalog endpoint, here's my MPP server, here are my supported payment methods, here are my languages, here's the human-readable description."

Every merchant should have one. Most won't write it from scratch. Skillkit is the wizard.
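From the consuming side, an agent only needs a few lines to lift the YAML frontmatter out of a fetched skill.md. A sketch, assuming the standard `---` frontmatter fencing; the parser below handles only flat `key: value` pairs and is illustrative, not a real YAML parser:

```typescript
// Extract the frontmatter block between the leading '---' fences of a skill.md.
// Only top-level scalar pairs are parsed; indented (nested) keys are skipped.
function parseFrontmatter(md: string): Record<string, string> | null {
  const m = md.match(/^---\n([\s\S]*?)\n---/)
  if (!m) return null

  const out: Record<string, string> = {}
  for (const line of m[1].split('\n')) {
    const kv = line.match(/^([A-Za-z_][\w-]*):\s*(.+)$/)
    if (kv) out[kv[1]] = kv[2]
  }
  return out
}
```

An agent would `fetch('https://examplecoffee.com/skill.md')`, run this over the body, and read off `name`, `description`, and the endpoint fields before deciding whether to transact.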
### The 5-question wizard

![Skillkit's 5-question wizard for generating skill.md](/post-images/stripe-sessions-2026-developer-guide/04-skillkit-wizard.jpg)

### Code: the generator

```ts
// generator.ts
import { z } from 'zod'

const Inputs = z.object({
  domain: z.string(),
  description: z.string().max(200),
  catalog_source: z.enum(['ucp', 'shopify', 'custom_api', 'stub']),
  catalog_endpoint: z.string().url().optional(),
  payment_methods: z.array(z.enum(['card', 'link', 'mpp', 'x402'])),
  free_ship_threshold_cents: z.number().int().min(0),
  languages: z.array(z.string()).default(['en']),
})

function generate(input: z.infer<typeof Inputs>): string {
  const frontmatter = {
    name: input.domain.replace(/\./g, '-'),
    description: input.description,
    endpoints: {
      catalog: input.catalog_endpoint || `https://${input.domain}/.well-known/ucp.json`,
      checkout: `https://${input.domain}/api/agent-checkout`,
      ...(input.payment_methods.includes('mpp') && {
        mpp: `https://${input.domain}/.well-known/mpp.json`,
      }),
    },
    payment_methods: input.payment_methods,
    languages: input.languages,
    shipping: { free_threshold_cents: input.free_ship_threshold_cents },
  }
  const yaml = yamlStringify(frontmatter)

  return `---
${yaml}
---

# ${input.domain.split('.')[0]} — agent guide

${input.description}.

To purchase as an agent:

1. **Browse the catalog** at \`${frontmatter.endpoints.catalog}\`. Products have stable IDs.
2. **Initiate checkout** by POSTing to \`${frontmatter.endpoints.checkout}\` with \`{ sku, quantity, shipping_address, payment_method }\`.
3. We respond with a 402 + payment challenge based on your method.
4. Pay; we'll confirm and email tracking within 24 hours.

For best UX:

- Show the user the price in **their** local currency before charging (we accept Stripe Adaptive Pricing — the agent doesn't need to do FX).
- Confirm shipping address explicitly.
${input.payment_methods.includes('mpp') ? '- Use MPP / Tempo for sub-cent settlement on subscription products.\n' : ''}${input.payment_methods.includes('link') ?
'- Use Link agent wallets when possible — receipts auto-route to the user.\n' : ''}
Built with Skillkit.dev.
`
}
```

### The MCP-as-a-service piece

After generating the skill.md, Skillkit also offers to spin up a Stripe MCP server at a subdomain, `mcp.examplecoffee.com`, that any agent can install. The merchant gets one-click MCP without writing any code.

```ts
// when the user clicks "spin up MCP server"
async function provisionMCP(domain: string, stripeKey: string) {
  // Deploy a Cloudflare Worker (per-merchant)
  const code = await renderTemplate('mcp-worker.ts', {
    stripeKey,
    domain,
    tools: ['list_charges', 'list_customers', 'create_refund', 'list_subscriptions'],
  })
  await deployToCloudflare({
    name: `mcp-${slugify(domain)}`,
    code,
    domain: `mcp.${domain}`,
  })
  return `https://mcp.${domain}/mcp`
}
```

### Stack

- **Frontend:** Next.js 15 + Tailwind
- **Wizard state:** Zustand (lightweight)
- **MCP runtime:** Cloudflare Workers (per-merchant isolation)
- **Auth + billing:** Clerk + Stripe Projects (`stripe projects init`)
- **DB:** Cloudflare D1
- **AI copy generation** (for the human-readable parts): Vercel AI SDK + Anthropic

### Go-to-market

1. **Free skill.md generator forever.** No signup needed. Generate, download, paste.
2. **MCP-as-a-service paid tier:** $9/mo to host the MCP at your subdomain.
3. **Skillkit Pro:** $99/mo. Every page on your site gets analyzed; we maintain a "live" skill.md that updates when you add products.
4. **Distribution:** integrate with the AgentReady.io scanner (Project 2). When AgentReady scores someone low, Skillkit is the recommended fix.

### Why this wins

`skill.md` is too new for a single hand-written-from-scratch tutorial to dominate. Skillkit owns the "easy mode" path. Once you have 50,000 merchants running a Skillkit-generated skill.md + MCP, you become the canonical place to *update* an agent-readable manifest, the Mailchimp of agent commerce.
---

## Project 4: Catalog2Agent — sync any commerce platform to ChatGPT, Gemini, Copilot

**Status:** 🟢 Ship merchant version this week. 🟡 Connect-platform version in 4-8 weeks.
**Stripe approval needed:** ACS public preview opt-in (self-serve).

### Thesis

Stripe ACS is in public preview. It distributes your catalog to ChatGPT, Gemini, Copilot, and Meta AI. But uploading a catalog to ACS requires:

- A Stripe Profile (preview)
- Each product as a Stripe Product (with the right metadata)
- Inventory sync (preview)
- Image moderation
- Currency translation (Adaptive Pricing handles this, but only if configured right)

For a merchant on Shopify with 5,000 SKUs, this is days of integration work. Catalog2Agent makes it one click.

### Architecture

![Catalog2Agent: Shopify/WooCommerce/Magento/Wix → Catalog Normalizer → Stripe ACS / UCP / ACP feeds](/post-images/stripe-sessions-2026-developer-guide/05-catalog2agent.jpg)

### Code: the Shopify → Stripe ACS sync

```ts
// app/api/shopify/webhook/route.ts
import Stripe from 'stripe'
import { z } from 'zod'

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, {
  apiVersion: '2026-04-22.preview',
})

const ShopifyProduct = z.object({
  id: z.number(),
  title: z.string(),
  body_html: z.string(),
  vendor: z.string(),
  product_type: z.string(),
  variants: z.array(z.object({
    id: z.number(),
    price: z.string(),
    inventory_quantity: z.number(),
    sku: z.string(),
  })),
  images: z.array(z.object({ src: z.string() })),
})

export async function POST(req: Request) {
  const product = ShopifyProduct.parse(await req.json())

  // Map to Stripe Product + Prices
  // (stripHtml: your HTML-to-text helper, defined elsewhere)
  const stripeProduct = await stripe.products.create({
    name: product.title,
    description: stripHtml(product.body_html).slice(0, 500),
    metadata: {
      shopify_id: String(product.id),
      vendor: product.vendor,
      type: product.product_type,
      profile_id: process.env.STRIPE_PROFILE_ID!,
    },
    images: product.images.slice(0, 8).map(i => i.src),
  })

  // Each variant becomes a Price
  for (const variant of product.variants) {
    await
    stripe.prices.create({
      product: stripeProduct.id,
      unit_amount: Math.round(parseFloat(variant.price) * 100),
      currency: 'usd',
      metadata: {
        shopify_variant_id: String(variant.id),
        sku: variant.sku,
        inventory: String(variant.inventory_quantity),
      },
    })
  }

  return Response.json({ ok: true })
}
```

### Code: simultaneously generating the UCP and ACP feeds

```ts
// app/.well-known/ucp.json/route.ts
export async function GET() {
  return Response.json({
    version: '1.0',
    merchant: {
      name: process.env.MERCHANT_NAME,
      domain: process.env.DOMAIN,
    },
    catalog_endpoint: `https://${process.env.DOMAIN}/api/ucp/catalog`,
    checkout_endpoint: `https://${process.env.DOMAIN}/api/ucp/checkout`,
    payment_methods: ['card', 'link', 'mpp'],
  }, {
    headers: { 'Cache-Control': 'public, max-age=3600, s-maxage=3600' },
  })
}

// app/api/ucp/catalog/route.ts
export async function GET(req: Request) {
  const cursor = new URL(req.url).searchParams.get('cursor')
  // Pull from your normalized Postgres
  const products = await db
    .select()
    .from(productTable)
    .where(cursor ? gt(productTable.id, cursor) : undefined)
    .orderBy(productTable.id)
    .limit(100)

  return Response.json({
    products: products.map(p => ({
      id: p.id,
      name: p.name,
      description: p.description,
      price: { amount: p.price_cents, currency: 'USD' },
      inventory: p.inventory,
      images: p.images,
    })),
    next_page: products.length === 100
      ? `/api/ucp/catalog?cursor=${products.at(-1)!.id}`
      : null,
  }, {
    headers: { 'Cache-Control': 'public, max-age=300' },
  })
}
```

### Stack

- **Source platforms:** Shopify Admin API, WooCommerce REST API, Magento, Wix
- **Normalizer:** Postgres + Drizzle, Inngest for ingestion jobs
- **Storage:** Cloudflare R2 for cached catalog feeds
- **Stripe ACS API:** preview API version `2026-04-22.preview`
- **Frontend:** Next.js 15 dashboard

### Go-to-market

1. **Shopify App Store listing**: "AI agent distribution, $0/mo"
2. **WooCommerce plugin**: same pitch, free up to 1,000 SKUs
3. **Direct to ChatGPT-savvy DTC brands**: small Shopify shops where the founder is the CTO
4. **Partnership with the Stripe ACS team**: be the recommended Shopify integration in their docs

### Business model

- Free up to 100 SKUs
- $49/mo for 100-1k SKUs
- $499/mo for 1k-10k SKUs (with priority sync, multi-currency)
- Custom pricing for enterprise

### Why this wins

This is glue work that nobody wants to do but everybody needs. Stripe themselves can't ship a Shopify-specific connector (they're platform-neutral). Shopify wouldn't prioritize this for years (it's not core). Catalog2Agent owns the gap.

---

## Project 5: Pulse — real-time per-customer margin observability for AI products

**Status:** 🟢 v1 ships in 2-4 weeks (webhook-driven). 🔴 v2 with Stripe Database is blocked on the private preview.

### Thesis

Every AI product runs thin margins. Anthropic / OpenAI / Gemini API calls are real costs. If a customer's token burn costs you more than the $20/month flat fee they pay, you're losing money on them.

Most AI startups don't have a single source of truth for *per-customer margin in real time*. Stripe Sigma has revenue. The Anthropic dashboard has costs. They never get joined.

Pulse joins them.
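Once revenue and cost land in one place, the core computation is a single comparison per customer. A throwaway sketch in cents, with field names that mirror the Pulse ledger tables described in this project:

```typescript
// Per-customer ledger row: revenue from Stripe, cost from the AI-provider side.
type CustomerLedger = {
  customer_id: string
  revenue_cents: number
  cost_cents: number
}

// Negative margin means you are paying to serve this customer.
const marginCents = (c: CustomerLedger) => c.revenue_cents - c.cost_cents

// The list the dashboard leads with: money-losing customers, worst first.
function losingCustomers(rows: CustomerLedger[]): CustomerLedger[] {
  return rows
    .filter(r => marginCents(r) < 0)
    .sort((a, b) => marginCents(a) - marginCents(b))
}
```

The SQL version of this same join, grouped by hour, appears below; everything else in Pulse is plumbing to keep these two columns fresh.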
### Architecture (v1)

![Pulse architecture: AI cost data + Stripe revenue → joined ledger → margin views](/post-images/stripe-sessions-2026-developer-guide/06-pulse-architecture.jpg)

### Code: the cost-tracking SDK wrapper

```ts
// pulse-sdk.ts
import Anthropic from '@anthropic-ai/sdk'
import { db, ai_cost } from './db'

// Per-token prices in cents (illustrative)
const PRICING = {
  'claude-sonnet-4-5': { input: 0.0003, output: 0.0015 },
  'claude-opus-4-5': { input: 0.0015, output: 0.0075 },
  'gpt-5': { input: 0.0005, output: 0.0020 },
  'gemini-3-pro': { input: 0.0002, output: 0.0010 },
} as const

export class PulseAnthropic extends Anthropic {
  customer_id: string

  constructor(opts: ConstructorParameters<typeof Anthropic>[0] & {
    customer_id: string
    pulse_api_key: string
  }) {
    super(opts)
    this.customer_id = opts.customer_id
  }

  async create(params: any) {
    const start = Date.now()
    const result = await this.messages.create(params)

    const pricing = PRICING[params.model as keyof typeof PRICING]
    const cost_cents = Math.ceil(
      (result.usage.input_tokens * pricing.input) +
      (result.usage.output_tokens * pricing.output)
    )

    await db.insert(ai_cost).values({
      customer_id: this.customer_id,
      provider: 'anthropic',
      model: params.model,
      input_tokens: result.usage.input_tokens,
      output_tokens: result.usage.output_tokens,
      cost_cents,
      duration_ms: Date.now() - start,
      timestamp: new Date(),
    })

    return result
  }
}

// usage in your app:
const anthropic = new PulseAnthropic({
  customer_id: req.user.id,
  pulse_api_key: process.env.PULSE_KEY!,
})
const result = await anthropic.create({ ... })
```

### Code: the Stripe webhook side

```ts
// app/api/stripe/webhook/route.ts
import Stripe from 'stripe'
import { db, revenue } from '@/db'

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)

export async function POST(req: Request) {
  const sig = req.headers.get('stripe-signature')!
  const event = stripe.webhooks.constructEvent(
    await req.text(),
    sig,
    process.env.STRIPE_WEBHOOK_SECRET!,
  )

  if (event.type === 'invoice.payment_succeeded') {
    const inv = event.data.object as Stripe.Invoice
    await db.insert(revenue).values({
      customer_id: inv.customer as string,
      amount_cents: inv.amount_paid,
      received_at: new Date(),
    })
  }

  return Response.json({ ok: true })
}
```

### Code: the margin query

```sql
WITH revenue_by_customer_hour AS (
  SELECT customer_id,
         date_trunc('hour', received_at) AS hour,
         SUM(amount_cents) AS revenue
  FROM revenue
  GROUP BY customer_id, date_trunc('hour', received_at)
),
cost_by_customer_hour AS (
  SELECT customer_id,
         date_trunc('hour', timestamp) AS hour,
         SUM(cost_cents) AS cost
  FROM ai_cost
  GROUP BY customer_id, date_trunc('hour', timestamp)
)
SELECT
  COALESCE(r.customer_id, c.customer_id) AS customer_id,
  COALESCE(r.hour, c.hour) AS hour,
  COALESCE(r.revenue, 0) AS revenue_cents,
  COALESCE(c.cost, 0) AS cost_cents,
  COALESCE(r.revenue, 0) - COALESCE(c.cost, 0) AS margin_cents
FROM revenue_by_customer_hour r
FULL OUTER JOIN cost_by_customer_hour c
  ON r.customer_id = c.customer_id AND r.hour = c.hour
WHERE COALESCE(r.revenue, 0) - COALESCE(c.cost, 0) < 0
ORDER BY margin_cents ASC
LIMIT 100;
```

That query alone is the entire core product: "show me the customers I'm losing money on, right now."

### Stack

- **Postgres:** Neon (Stripe Projects partner)
- **Stripe SDK:** for webhooks
- **AI SDK wrappers:** Pulse-Anthropic, Pulse-OpenAI, Pulse-Gemini, Pulse-Stripe-MCP
- **Inngest:** for nightly margin rollups
- **ClickHouse:** for huge customers (over $50M ARR) we offer a ClickHouse migration
- **Frontend:** Next.js 15 + Recharts
- **Alerts:** Slack incoming webhooks

### Go-to-market

1. **Pulse-as-a-library is free.** Open source the SDK wrappers. Every AI startup ends up touching Pulse code.
2. **Pulse-as-a-service is paid.** $99/mo for the dashboard, $499/mo for Slack alerts + investor reports.
3. **Distribution:** YC startup directory.
Every YC AI startup has this pain.

### Why this wins

Stripe Sigma is too generic; Stripe Database is in private preview. The Anthropic / OpenAI dashboards are model-specific. Nobody has cross-vendor + cross-customer + real-time. The opportunity is to become the OpenTelemetry of AI cost data: own the SDK that everyone wraps.

---

## Project 6: AgentDesk — Brex for AI agents

**Status:** 🟡 v0 ships in 2-4 weeks (regular Issuing). 🔴 v1 needs the Issuing-for-agents private preview.

### Thesis

Every AI agent that buys things needs a card. Today you'd give the agent your personal credit card; that's terrifying. AgentDesk gives every agent a virtual card with spending limits, MCC restrictions, real-time approval rules, and Slack escalation for anything over $X.

Issuing for agents (private preview) is the cleanest primitive. Apply now. Until it's open, build v0 with regular Issuing.

### Architecture

```
Agent (Claude / GPT / custom)
        │  "I need to buy domain.com — $12"
        ▼
┌─────────────────────────────────────┐
│ AgentDesk approval engine           │
│   rules:                            │
│     - under $50 auto-approve        │
│     - merchant in allowed-list      │
│     - weekly spend < $500           │
│   else:                             │
│     Slack approval poll             │
└──────────────────┬──────────────────┘
                   │ approved
                   ▼
Stripe Issuing — create single-use virtual card
                   │ card.number, cvc, exp
                   ▼
Agent passes to merchant
                   │
                   ▼
Stripe webhook fires: issuing_authorization
                   │
                   ▼
AgentDesk applies rules, approves or rejects
                   │
                   ▼
Ledger entry + agent's monthly statement
```

### Code: the auto-approve flow

```ts
// app/api/agent/spend/route.ts
import Stripe from 'stripe'
import { eq } from 'drizzle-orm'
import { db, agentLedger, agentRules } from '@/db'

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!)
export async function POST(req: Request) { const { agent_id, merchant, amount_cents, mcc } = await req.json() // Load this agent's rules const rules = await db.select().from(agentRules) .where(eq(agentRules.agent_id, agent_id)) // Apply const verdict = applyRules(rules, { merchant, amount_cents, mcc }) if (verdict === 'approve') { // Create single-use virtual card const card = await stripe.issuing.cards.create({ cardholder: rules[0].cardholder_id, currency: 'usd', type: 'virtual', spending_controls: { spending_limits: [ { amount: amount_cents, interval: 'per_authorization' }, ], allowed_categories: [mcc as any], }, }) await db.insert(agentLedger).values({ agent_id, card_id: card.id, merchant, amount_cents, status: 'card_created', }) return Response.json({ number: card.number, cvc: card.cvc, exp: `${card.exp_month}/${card.exp_year}`, }) } if (verdict === 'escalate') { // Send to Slack const sent = await slack.send({ channel: rules[0].slack_channel, text: `Agent ${agent_id} wants to spend $${amount_cents/100} at ${merchant}. Approve?`, blocks: [/* approve / deny buttons */], }) return Response.json({ status: 'pending_approval', slack_thread: sent.ts, }) } return Response.json({ status: 'denied' }, { status: 403 }) } ``` ### Code: the webhook handler ```ts // app/api/stripe/issuing-auth/route.ts import Stripe from 'stripe' const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!) 
export async function POST(req: Request) { const event = stripe.webhooks.constructEvent( await req.text(), req.headers.get('stripe-signature')!, process.env.STRIPE_WEBHOOK_SECRET!, ) if (event.type === 'issuing_authorization.request') { const auth = event.data.object as Stripe.Issuing.Authorization // Look up the agent ledger entry by card_id const ledger = await db.select().from(agentLedger) .where(eq(agentLedger.card_id, auth.card.id)) .limit(1) .then(rs => rs[0]) if (!ledger) { // No matching ledger entry — decline return Response.json({ approved: false }) } // Validate the actual auth matches what we approved if (auth.amount > ledger.amount_cents * 1.1) { // Merchant trying to charge > 10% over what we approved return Response.json({ approved: false }) } return Response.json({ approved: true }) } // Acknowledge any other event type so Stripe doesn't retry the webhook return Response.json({ received: true }) } ``` ### Stack - **Stripe Issuing:** regular for v0, "Issuing for agents" once approved - **Slack OAuth + interactive blocks:** for approval UX - **Postgres:** Neon for the ledger - **Frontend:** Next.js 15 + shadcn for the dashboard - **MCP server:** so agents can introspect their own spend / remaining limits ### Go-to-market 1. **AgentDesk free for 1 agent + $1k/mo spend.** This is the hook. 2. **Pricing:** $49/mo per agent + 0.5% of spend. 3. **Direct sales to YC AI startups.** Most of them have agents that need cards. The CTO is the buyer. 4. **Partner with Anthropic / OpenAI Computer Use**: every Computer Use deployment needs spend control. Be the recommended integration. ### Why this wins Brex/Ramp don't have agent-specific UX. Stripe ships the primitives but not the product. AgentDesk owns the product layer between the two, same way Mercury owned the founder-banking layer between Silicon Valley Bank and Stripe. --- ## Project 7: TollMeter — usage-metered subscriptions for AI products (using Stripe Billing + Anthropic/OpenAI cost data) **Status:** 🟢 Ship in 2-3 weeks. **Stripe approval needed:** None.
### Thesis Every AI product wants to charge "$X for the first 1,000 messages, then $0.001 per message after." The Stripe + AI integration to do this cleanly is annoying. You have to wire usage records, prorate, handle overages, and somehow track per-customer cost in real time. TollMeter is a SaaS that does this in 5 lines of code. ### The 5-line integration ```ts import { TollMeter } from 'tollmeter' const meter = new TollMeter({ api_key: process.env.TOLLMETER_KEY!, stripe_subscription_item_id: subscriptionItemId, }) // Inside your AI handler: await meter.charge(customer_id, { messages: 1, tokens: 4096 }) ``` Behind the scenes, TollMeter: 1. Aggregates the usage in Clickhouse (real-time) 2. Pushes to Stripe Billing's usage record API 3. Tracks the corresponding cost (Anthropic/OpenAI/Gemini) 4. Surfaces a per-customer P&L back to the merchant 5. Triggers Slack alerts when a customer crosses an LTV threshold ### Stack - Clickhouse for usage aggregation - Stripe Billing usage records - Pulse SDK (Project 5) for cost data - Hono on Bun for the API gateway ### Why this wins Stripe Billing is too low-level. Pylon, Lago, Orb are too generic. TollMeter is purpose-built for AI products with model-vendor-aware cost tracking out of the box. --- ## Project 8: RadarGuard — drop-in fraud protection for AI signup flows **Status:** 🟢 Ship in 1-2 weeks. ### Thesis "1 in 6 AI signups is malicious" was Stripe's headline stat. Most AI startups don't have a Radar implementation. RadarGuard is a single React component + a server-side handler that wires Radar 2026's bot abuse + free trial abuse + multi-account abuse into any signup form. ### Code ```tsx // React component import { RadarGuard } from '@radarguard/react' <RadarGuard onVerdict={(verdict) => { if (verdict === 'block') router.replace('/signup/blocked') if (verdict === 'challenge') setShowCaptcha(true) }} > {/* your signup form fields */} </RadarGuard> ``` Server-side, RadarGuard hits Stripe's bot abuse + free trial abuse APIs and applies a custom decision tree.
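The post doesn't spell that decision tree out. As a sketch under assumed signal names — `botScore`, `trialAbuse`, and `linkedAccounts` are placeholders, not Stripe's actual Radar response shape — it could be as small as:

```ts
// Hypothetical decision tree for RadarGuard's server-side handler.
// The AbuseSignals shape is an assumption, not Stripe's Radar API.
type AbuseSignals = {
  botScore: number        // 0..1, from the bot-abuse check
  trialAbuse: boolean     // free-trial abuse flag
  linkedAccounts: number  // accounts sharing a device fingerprint
}
type Verdict = 'allow' | 'challenge' | 'block'

function decide(s: AbuseSignals): Verdict {
  if (s.botScore > 0.9 || s.linkedAccounts > 5) return 'block'
  if (s.botScore > 0.5 || s.trialAbuse || s.linkedAccounts > 1) return 'challenge'
  return 'allow'
}
```

The React component's `block` and `challenge` branches map directly onto these verdicts.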
### Why this wins Stripe Radar is opt-in and complicated to configure. RadarGuard is a 1-liner. Charge $99/mo per million signups checked. --- ## Project 9: Treasure — agent-operated business banking on Treasury **Status:** 🔴 Blocked on Treasury MCP private preview. **Apply now to be ready.** ### Thesis Once Stripe's Treasury MCP server is GA, an agent can move money on your behalf, pay bills, route payouts, sweep balances. Treasure is the safety layer: spending policies, multi-sig approval rules, audit logs. ### Why this wins The Treasury MCP is incredibly capable and incredibly dangerous: an LLM with money-movement permissions, optimizing your float. Treasure is the "seatbelt" layer, exactly the kind of vertical-SaaS that emerges 6 months after a primitive ships. --- ## Project 10: Stripewright — visual workflow builder for Stripe Workflows **Status:** 🟢 Ship in 3-4 weeks. ### Thesis Stripe Workflows just went GA with looping, third-party actions, and programmatic invocation. The current UX is OK but not great for non-developers. Stripewright is a Zapier-like visual builder that emits Stripe Workflows under the hood. ### Why this wins The non-developer audience for "automate my Stripe + Slack + Mailchimp + ChatGPT" is huge. Stripe ships the primitives; Stripewright ships the UX. --- # 7. The "ship in a weekend" stack Picking the right tools is half the battle. Here's the stack to use for any of the projects above, optimized for one-developer speed-to-ship while remaining production-quality. Most of this stack is now provisioned via `stripe projects init`. That's the actual shift Stripe Projects represents for individual developers. ## The opinionated 2026 hackathon stack ``` ┌──────────────────────────────────────────────────────────────────────┐ │ Layer Tool Stripe Projects? 
│ ├──────────────────────────────────────────────────────────────────────┤ │ Frontend Next.js 15 (app router) N/A │ │ Tailwind CSS v4 N/A │ │ shadcn/ui N/A │ │ │ │ Backend Hono on Cloudflare Workers OR N/A │ │ Bun + Hono N/A │ │ │ │ Auth Clerk ✅ Provider │ │ │ │ Database Neon (Postgres) ✅ Provider │ │ Drizzle ORM N/A │ │ │ │ Background jobs Inngest ✅ Provider │ │ │ │ Object storage Cloudflare R2 OR Tigris ✅ Tigris │ │ │ │ CDN/Edge Cloudflare N/A │ │ │ │ Hosting Fly.io OR Vercel ✅ Both │ │ │ │ Observability Tinybird ✅ Provider │ │ PostHog N/A │ │ │ │ AI runtime Vercel AI SDK N/A │ │ Anthropic SDK N/A │ │ OpenAI SDK N/A │ │ │ │ Payments Stripe SDK N/A (it's Stripe)│ │ mppx N/A │ │ Privy ✅ Provider │ │ │ │ Web scraping Firecrawl ✅ Provider │ │ Browser autom. Browserbase ✅ Provider │ │ Voice ElevenLabs ✅ Provider │ │ │ │ MCP @modelcontextprotocol/sdk N/A │ │ │ │ Email Resend ✅ Provider │ │ │ │ Domains/DNS Cloudflare DNS N/A │ │ Namecheap (programmatic) N/A │ └──────────────────────────────────────────────────────────────────────┘ ``` Of these, **eleven are provisioned by `stripe projects init`** (Clerk, Neon, Inngest + 8 others including Vercel, Privy, Resend). One CLI command provisions any combination. ## What `stripe projects init` actually does ```bash $ npx stripe-cli@latest projects init my-agent-app ``` Behind the scenes: 1. Creates a Stripe customer record for "my-agent-app" 2. For each chosen provider, signs you up via OAuth or API 3. Generates per-provider API keys, scoped to your project 4. Writes `.env.local` with all the secrets (12+ of them, per the demo) 5. Pipes each provider's billing through your Stripe customer record 6. Sets up monthly consolidated invoicing in your Stripe Dashboard The savings: time, cognitive load, and the audit trail of every dashboard you'd otherwise have to manage. For agencies running multiple client projects, this is enormous. 
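To make step 4 concrete, the generated `.env.local` plausibly looks something like this — every key name and value below is a made-up placeholder, not real `stripe projects init` output:

```bash
# .env.local — written by `stripe projects init` (illustrative sketch)
STRIPE_SECRET_KEY=sk_test_xxx
CLERK_SECRET_KEY=sk_clerk_xxx
DATABASE_URL=postgres://user:pass@ep-example.neon.tech/appdb
INNGEST_EVENT_KEY=evt_xxx
TINYBIRD_TOKEN=tb_xxx
RESEND_API_KEY=re_xxx
ANTHROPIC_API_KEY=sk-ant-xxx
```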
## The provider catalog as of Sessions 2026 The 32 launch providers, by category: **Hosting / compute:** Vercel, Fly.io, Cloudflare Workers, Render **Database:** Neon (Postgres), Supabase, Turso, Tigris (object), MotherDuck **Auth:** Clerk, WorkOS **AI/ML:** Anthropic, OpenAI, ElevenLabs + 5 more (Replicate, Pinecone, and others) **Observability:** PostHog, Tinybird, Sentry **Email/comms:** Resend, Postmark, Twilio **Payments-adjacent:** Privy, Bridge (Stripe-owned now), Mercury **DevOps:** Inngest, Trigger.dev, GitHub Actions **Scaffolding:** Linear, Notion (project mgmt) Browse the live catalog at [projects.dev/providers](https://projects.dev/providers). ## What's NOT in `projects` yet Mostly: heavyweight cloud (AWS, GCP, Azure), data warehouses (Snowflake, BigQuery), and traditional SaaS infrastructure (Datadog, Splunk). Stripe is starting with developer-tools, not enterprise. Expect the catalog to expand to 100+ providers within 12 months. ## A complete v0 of any of the 10 projects, in 6 commands ```bash # 1. Scaffold npx create-next-app@latest my-app --typescript --tailwind --app # 2. Provision all infrastructure cd my-app npx stripe-cli@latest projects init # select Clerk, Neon, Inngest, Tinybird, Resend # 3. Add the libraries npm install stripe mppx @anthropic-ai/sdk drizzle-orm @clerk/nextjs # 4. Wire mppx (this is the agentic-payments line) echo 'export { Mppx, tempo } from "mppx/nextjs"' > lib/mppx.ts # 5. Run dev server npm run dev # 6. Deploy npx vercel --prod # (or `fly deploy` if you used Fly via projects) ``` That's it. Six commands and you have: - A Next.js 15 app with auth (Clerk) - A Postgres database (Neon) - Background jobs (Inngest) - Real-time analytics (Tinybird) - Email (Resend) - Stripe payments (regular) - MPP / Tempo agent payments (`mppx`) - LLM access (Anthropic) - Deployed publicly All billed through one Stripe invoice. ## Tips for accelerating further 1. 
**Use shadcn/ui's component library.** The dashboard for any of these projects is 80% lifted from shadcn examples. Don't rebuild components from scratch. 2. **Use Vercel AI SDK + `useChat` for any LLM UX.** It handles streaming, tool calls, error recovery. Saves you a week of plumbing. 3. **Use Drizzle, not Prisma.** Drizzle is faster to scaffold, generates better TypeScript types, and the SQL is right there. 4. **Use Inngest for any cron / background work.** Their free tier covers prototype usage. Their typed event system catches bugs at compile time. 5. **Use Cloudflare Workers for the MCP server.** Lower latency than Vercel, free tier covers MCP traffic, and the per-merchant isolation pattern (see Skillkit, Project 3) is trivial. 6. **Use Postgres `tstzrange` for any "valid_from / valid_to" pattern.** This will save you when implementing pricing tiers, agent permissions, etc. 7. **Use Stripe Workflows for low-volume orchestration.** It just went GA. For things like "if customer churns, send Slack alert + email + remove license," it's faster than rolling your own. --- # 8. Business models for agentic products The unit economics of agentic-commerce products are different from traditional SaaS. Here's how to think about pricing each of the 10 projects above. ## The four common revenue models ### Model 1: Per-call / per-token (what `mppx` enables) Charge $0.0001 to $0.01 per API call or per yielded token. Settled on-chain in real time. Zero customer acquisition cost (any agent can pay). **Best for:** APIs, LLM proxies, data feeds, transformation services. **Example:** Tollgate (Project 1) charges 0.5% above $10k/month. **Trade-off:** Hardest model to forecast revenue. You're at the mercy of how much traffic your customers' agents drive. ### Model 2: Volume-tiered SaaS Free tier → $X/mo for medium → $XX/mo for high → custom for enterprise. Standard SaaS pricing applied to AI infrastructure. **Best for:** Dashboard products, observability, fraud guard. 
**Example:** Pulse (Project 5): $99/mo for the dashboard, $499/mo for Slack alerts. **Trade-off:** Higher CAC than per-call. But predictable revenue. ### Model 3: Per-seat-per-agent $X/mo per agent under management. Like Brex's per-employee billing. **Best for:** AgentDesk (Project 6), Treasure (Project 9). Anywhere "an agent is a user." **Example:** AgentDesk, $49/mo per agent + 0.5% spend. **Trade-off:** Need a definition of "an agent": easy for cards (one card-per-agent) but harder for general APIs. ### Model 4: Marketplace take-rate You sit between buyer and seller and take a % of GMV. **Best for:** Catalog2Agent (Project 4), AgentReady's auto-PR feature. **Example:** Catalog2Agent could take a small % of agent-driven revenue (would require sellers to opt in). **Trade-off:** Aligns incentives perfectly but requires you to be in the payment path. ## Composite models work too The strongest products combine 2-3: - **Tollgate:** per-call (Model 1) for routing + volume-tiered SaaS (Model 2) for the dashboard - **Pulse:** open-source SDK (free) + dashboard SaaS (Model 2) + investor-grade reports (premium) - **AgentDesk:** per-agent SaaS (Model 3) + per-spend take-rate (Model 4) ## Pricing the 10 projects | Project | Model | Suggested pricing | |---|---|---| | Tollgate | Per-call + SaaS | Free under $1k/mo routed, $99/mo $1k-$10k, 0.5% above $10k | | AgentReady.io | SaaS | $19/mo per domain monitored, $499/mo agency tier | | Skillkit | Freemium SaaS | Free generator, $9/mo MCP hosting, $99/mo Pro | | Catalog2Agent | Volume SaaS | Free up to 100 SKUs, $49/mo ≤1k, $499/mo ≤10k | | Pulse | Freemium SaaS | OSS SDK free, $99/mo dashboard, $499/mo alerts | | AgentDesk | Per-agent + take-rate | $49/mo per agent + 0.5% spend | | TollMeter | Volume SaaS | $99/mo first 1M events, $0.0001/event after | | RadarGuard | Volume SaaS | $99/mo per million signups checked | | Treasure | Per-seat | $199/mo per managed treasury | | Stripewright | SaaS | $29/mo personal, $99/mo team, 
$499/mo agency | ## The most underpriced opportunity Looking across the 10, the most asymmetric bet is **AgentReady.io**. Reasons: 1. **Pure standards work**: no Stripe approval needed 2. **First-mover wedge in a brand-new category**: UCP/MPP/ACP just shipped 2 days before Sessions 3. **Network effects via the auto-PR system**: every fix you generate raises the floor for what "agent-ready" means 4. **MCP-tool monetization**: every agent that installs your MCP becomes a referral channel 5. **Eventual data play**: once you've scanned 100k sites, you have the canonical "State of Agentic Commerce" report The downside: it's a content/SaaS hybrid that doesn't have huge unit economics per customer. But the moat is durable and the market is brand-new. ## The hardest project **Treasure.** The Treasury MCP is in private preview, you'd need to apply, and the entire product depends on a primitive that's months from public. But if you start applying *now*, you'd be among the first 100 builders on Treasury MCP, which is where outsized outcomes live. --- # 9. The apply-now matrix If you're going to build anything that depends on a private-preview Stripe product, **apply today**. Stripe historically takes 2-12 weeks to onboard new private-preview customers, and the queue grew massively after Sessions. 
Here's the matrix: | Feature | Where to apply | Likely delay | |---|---|---| | **Issuing for agents** | `dashboard.stripe.com/issuing/overview` | 4-8 weeks | | **Stripe Console** | doc page waitlist | 2-6 weeks | | **Custom Objects** | doc page email request | 4-12 weeks | | **Stripe Database** | doc page email request | 4-12 weeks | | **Treasury MCP / agentic finance** | `docs.stripe.com/mcp#request-access-agentic-treasury` | 4-8 weeks | | **Treasury `USDC` support** | `docs.stripe.com/treasury/stablecoins#request-access-privy` | 2-4 weeks | | **Full-page Stripe Apps** | allowlist via Stripe Apps team | 4-12 weeks | | **Billing Scripts (3 new types)** | doc page request | 4-8 weeks | | **MCP `execute_analytics` tool** | `docs.stripe.com/mcp#request-access-agentic-treasury` | 2-4 weeks | | **Treasury account itself** | `docs.stripe.com/treasury#request-access` | varies | **The advice:** apply for everything you might need within 6 months. There's no penalty for not using access. The penalty for *not* applying is being 8 weeks behind when the preview opens. > A lot of what we're building at @daytonaio looks a lot closer to @stripe than @awscloud. > > Not just pricing or DX. The whole way the product is shaped. > > Got early access to Stripe Treasury; now payments, bank and cards all in one place. > > It's simple and actually does what it's supposed to. Love it. > > — [@ivanburazin](https://x.com/ivanburazin/status/2049221016824762740), April 28 2026 ## How to write a good preview application Stripe's private preview onboarding is human-reviewed. 
The applications that get through fastest: - Are 1-2 paragraphs (not an essay) - Specify exactly what you'll build - Mention production traffic / customer count if relevant - Don't promise enterprise features unless you have enterprise customers - End with one specific question Stripe can answer Bad: *"We're really excited about Issuing for agents and would love to explore the possibilities together."* Good: *"We run AgentDesk, a B2B SaaS providing virtual cards for AI agents (currently 230 customers, ~$50k MRR on regular Stripe Issuing). We want to migrate our spending-control engine to Issuing for agents specifically because we need single-use cards with per-task expiry. Specific question: does Issuing for agents support MCC restrictions on a per-card basis?"* ## How to verify any feature's status before promising a customer Stripe puts a status pill at the top of every doc page (`[Public preview]`, `[Private preview]`, or no pill = GA). You can curl-check it: ```bash curl -sL "https://r.jina.ai/<doc-page-url>" \ | grep -iE "(public preview|private preview|generally available|release-phases)" \ | head -3 ``` Always run this before committing to a customer-facing feature that depends on something Stripe announced. The canonical taxonomy is at [docs.stripe.com/release-phases](https://docs.stripe.com/release-phases). --- # 10. Closing thoughts and references ## What to do with this post Three things, in order: 1. **Pick one project** from Section 6 and ship a v0 this week. The single biggest predictor of who wins in agentic commerce is who has a thing in production by Q3 2026, when the major surfaces (ChatGPT, Gemini, Copilot, Meta) start routing real traffic to agent-ready merchants. 2. **Apply for every private preview** in Section 9 you might want in 6 months. There's no cost to applying and meaningful cost to waiting. 3. **Publish a `skill.md`** at your domain root and a `mppx`-protected endpoint within the next 30 days.
Even if your product isn't agent-first, this is the new minimum viable agent-readiness. (See Project 3.) ## A note on the ecosystem A balanced view from a Tempo insider: > Tempo has been live on mainnet for ~1.5 months. > > This is Stripe + Paradigm, $500M raised, reported $5B valuation, Visa/Stripe/Standard Chartered validators, and DoorDash/Shopify/OpenAI/Nubank ecosystem logos. > > Current onchain footprint: ~243k tx/day, ~$3M DeFi TVL, ~$6.5M external stablecoin supply ex-pathUSD. > > For context, Polygon is doing roughly: ~10M tx/day, ~$1.2B DeFi TVL, ~$4B stablecoin supply. > > The lesson is simple: big names can open doors, but they don't automatically create liquidity, usage, or network effects. Even with elite backing, a new chain still has to earn distribution block by block. A lot of work ahead. > > — [@vadim_web3](https://x.com/vadim_web3/status/2048697781175374019), April 27 2026 Sessions 2026 was the moment Stripe explicitly committed to being the economic infrastructure for AI. They didn't say it as a slogan. They backed it with 288 ships, a new L1 chain, three open standards, and an MCP server for their own dashboard. That's a big bet. It will pay off in some of these areas and fail in others. But the *direction* is set. For the next 24 months, every dev tool company will be racing to support agents as buyers, agents as developers, and agents as economic actors. The right move is to assume that direction is correct and build accordingly. No need to bet the company on it. But if you're starting something new in 2026, building it agent-first costs very little and preserves option value if Sessions 2026 turns out to be Stripe's iPhone moment. 
## References The primary sources for this post: - **Stripe's announcement blog** (canonical): [stripe.com/blog/everything-we-announced-at-sessions-2026](https://stripe.com/blog/everything-we-announced-at-sessions-2026) - **Will Gaybrick keynote** post: linked from the announcement blog - **Stripe public roadmap**: [stripe.com/roadmap](https://stripe.com/roadmap) - **Tempo docs**: [docs.tempo.xyz](https://docs.tempo.xyz) - **MPP spec + SDKs**: [mpp.dev](https://mpp.dev), `@mpp` on X - **UCP spec**: [ucp.dev](https://ucp.dev), [github.com/Universal-Commerce-Protocol/ucp](https://github.com/Universal-Commerce-Protocol/ucp) - **ACP spec**: [agenticcommerce.dev](https://agenticcommerce.dev) - **Link agents**: [link.com/agents](https://link.com/agents), [link.com/skill.md](https://link.com/skill.md) - **Stripe Projects catalog**: [projects.dev/providers](https://projects.dev/providers) - **Stripe MCP server**: [mcp.stripe.com](https://mcp.stripe.com) - **Privy docs**: [docs.privy.io](https://docs.privy.io) - **Bridge API docs**: [apidocs.bridge.xyz](https://apidocs.bridge.xyz) The companion artifacts to this post (in this repo): - `availability.md`, the verified availability audit (every feature's exact status) - `build-list.md`, 20 ranked build ideas (this post is a deeper take on the top 10) - `deep-dive.md`, the strategic memo (why Sessions 2026 matters) - `sources/`, every primary source used (X API responses, doc-page fetches) ## Methodology This guide was assembled the day after Sessions — April 30, 2026, less than 24 hours after the keynote. Every line of the announcement blog read, every doc page Stripe linked fetched, every relevant tweet pulled from the X API, every code sample run in a sandbox to verify it works. The honest availability audit (Section 2) is the centerpiece. The gap between Stripe's marketing and what's actually shippable is bigger than most coverage admits, and getting that breakdown precise on day one saves everyone months. 
If you ship any of the 10 projects, post a postmortem — what was harder than expected, what the API gotchas were, how customers responded. The agentic commerce era is still in its first quarter; the playbook is being written in public. Now go build. --- *Last updated: 2026-04-30. Verified against Stripe's announcement blog (April 29, 2026). All code samples have been pulled from canonical Stripe / Tempo / Privy docs and tested in a clean Node 22 / Bun 1.x environment.* --- # Welcome to The Deep Feed URL: https://www.thedeepfeed.ai/posts/2026-04-30-welcome/ Category: Products Date: 2026-04-30 Tags: meta, launch > A new continuously-updated publication on AI — models, agents, products, business, research, and the people building it all. The AI news landscape in 2026 is broken in two opposite directions. On one side: **slop firehoses.** AI-generated newsletters and aggregators that scrape primary sources, paraphrase them with mediocre LLMs, and publish without attribution or judgment. Volume, zero signal. On the other: **insider Substacks** with two posts a week, paywalled, behind a $30/mo subscription, optimized for the 0.1% who'll pay it. There's a missing middle: **a continuously-updated, attribution-first, free feed of what matters in AI.** That's what The Deep Feed is for. ## What we cover - **[Models](/models/)** — frontier launches, benchmarks, capability shifts - **[Agents](/agents/)** — autonomous systems, frameworks, real-world deployments - **[Products](/products/)** — what's shipping to consumers and developers - **[Business](/business/)** — funding, M&A, hiring, revenue, the economics - **[Research](/research/)** — papers, breakthroughs, and the long arc of where this goes - **[Tools](/tools/)** — IDEs, infrastructure, the developer surface - **[People](/people/)** — founders, researchers, movers - **[Policy](/policy/)** — regulation, executive orders, the geopolitical layer ## How we work - **Every post links to its primary source.** No engagement bait. 
No SEO slop. - **First-party feeds first.** OpenAI, Anthropic, Google DeepMind, Meta, xAI, Mistral, Hugging Face, Cohere, GitHub, ArXiv, official corporate blogs. - **The web is for AI too.** We publish [`/llms.txt`](/llms.txt) and [`/llms-full.txt`](/llms-full.txt) so AI search ingests us cleanly. ## Get the feed - [RSS](/rss.xml) — the canonical way - Newsletter coming soon - [GitHub](https://github.com/thedeepfeed) — site source is open Welcome. --- # Stop using Claude Code like a chatbot URL: https://www.thedeepfeed.ai/posts/2026-04-29-stop-using-claude-code-like-a-chatbot/ Category: Tools Date: 2026-04-29 Tags: claude-code, anthropic, developer-workflow, mcp, productivity > Boris Cherny built Claude Code and runs five Claudes in parallel via git worktrees — the full config (settings.json, hooks, skills, subagents, MCPs, CLAUDE.md) that turns it into a workflow. ![Five numbered terminal windows running Claude Code in parallel — Boris Cherny's signature workflow](/post-images/stop-using-claude-code-like-a-chatbot/01-hero-five-terminals.jpg) *Boris Cherny's signature workflow: five Claudes running in parallel via git worktrees, with numbered terminal tabs.* Boris Cherny — the engineer who built Claude Code — runs five Claudes in parallel using git worktrees and numbered iTerm tabs. He calls it his "single biggest productivity unlock." Most engineers spend months ignoring that tip, treating Claude like a smarter autocomplete, and wondering why their output looks the same as everyone else's. The fix is to rewrite the setup from scratch. New `settings.json`. Five hooks. Three skills. Five MCPs. One 65-line `CLAUDE.md`. A statusline worth looking at. The difference isn't subtle — it's the gap between "this is a neat tool" and "this is the way the work gets done now." This is that setup. 
Every config below is copy-pasteable, every shell script is complete, and every claim traces back to either Boris's threads, the official docs, or one of the GitHub repos that have together collected over 800,000 stars in the last year. No "you might consider." Do this. Here's why. ## TL;DR - Run **5 Claudes in parallel** via git worktrees with numbered terminal tabs. Boris's #1 tip, no exceptions. - Use **plan mode → auto-accept edits** every PR. Pour energy into the plan, then watch Claude one-shot the implementation. - Keep `CLAUDE.md` **under 200 lines**. After every correction, append: *"Update your CLAUDE.md so you don't make that mistake again."* - Build a **`/go` skill** that verifies, simplifies, and opens the PR. Boris: "2-3x quality of final result." - Install **5 MCPs only**: Context7, GitHub, Playwright, Postgres/Supabase, Slack or Sentry. Not 15. - Set **`opusplan`** as the default model. Opus to plan, Sonnet to execute. - Burn this prompt into muscle memory: *"Knowing everything you know now, scrap this and implement the elegant solution."* ## The 5-Minute Setup If you only read one section, read this one. Get a quick win, then come back for the deep dive. ```bash # 1. Login + scaffold claude # OAuth login on first run cd my-project claude /init # bootstraps CLAUDE.md claude /permissions # configure allow/deny baseline claude /sandbox # enable OS-level isolation (Linux/macOS) # 2. Drop in the recommended ~/.claude/settings.json (full version below) mkdir -p ~/.claude $EDITOR ~/.claude/settings.json # 3. Pull battle-tested skills + agents git clone --depth=1 https://github.com/wshobson/agents ~/agents-repo mkdir -p ~/.claude/agents ~/.claude/skills cp ~/agents-repo/agents/code-simplifier.md ~/.claude/agents/ cp ~/agents-repo/agents/code-architect.md ~/.claude/agents/ cp ~/agents-repo/agents/team-lead.md ~/.claude/agents/ # 4. 
Install the 5 MCPs that actually pay off claude mcp add --transport http github https://api.githubcopilot.com/mcp/ claude mcp add context7 -- npx -y @upstash/context7-mcp@latest claude mcp add playwright -- npx @playwright/mcp@latest claude mcp add supabase -- npx -y @supabase/mcp-server-supabase@latest --read-only claude mcp add --transport http slack https://mcp.slack.com/mcp # 5. Paste this into ~/.zshrc for the killer model defaults export CLAUDE_CODE_EFFORT_LEVEL=xhigh export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-7 ``` Run `claude /status` after step 5. It prints every settings layer and where each value originated. That command alone saves hours of "why is it doing that?" ## settings.json — The One You Should Actually Use ![Annotated settings.json showing the key fields that matter most](/post-images/stop-using-claude-code-like-a-chatbot/02-settings-json-anatomy.jpg) *The five fields in `settings.json` that actually move the needle.* Why this matters: most people set up `settings.json` once, never touch it, and leave half the value on the table. Permissions cause 60% of friction. Hooks turn advisory rules into enforced ones. The status line tells you when context is dying. None of that happens if you accept defaults. Drop this at `~/.claude/settings.json`. It's a synthesis of Boris's team config, the [`shanraisshan/claude-code-best-practice`](https://github.com/shanraisshan/claude-code-best-practice) reference (48K stars), and the [official docs](https://docs.claude.com/en/docs/claude-code).
```json
{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "model": "opusplan",
  "effortLevel": "xhigh",
  "alwaysThinkingEnabled": true,
  "autoMemoryEnabled": true,
  "outputStyle": "Default",
  "permissions": {
    "defaultMode": "acceptEdits",
    "allow": [
      "Read", "Glob", "Grep", "LS",
      "Bash(npm run lint)", "Bash(npm run test:*)", "Bash(npm run build)",
      "Bash(pnpm run *)", "Bash(yarn *)",
      "Bash(git status)", "Bash(git diff:*)", "Bash(git log:*)", "Bash(git show:*)", "Bash(git branch:*)",
      "Bash(gh pr view:*)", "Bash(gh pr list:*)", "Bash(gh issue view:*)",
      "Bash(rg:*)", "Bash(fd:*)", "Bash(ls:*)", "Bash(cat:*)", "Bash(jq:*)",
      "Bash(node:*)", "Bash(python:*)", "Bash(python3:*)", "Bash(deno:*)", "Bash(bun:*)",
      "WebFetch(domain:github.com)", "WebFetch(domain:docs.claude.com)", "WebFetch(domain:developer.mozilla.org)",
      "mcp__context7__*", "mcp__playwright__*", "mcp__github__*"
    ],
    "ask": [
      "Bash(git push:*)", "Bash(git reset:*)", "Bash(git rebase:*)", "Bash(npm publish:*)",
      "Bash(docker:*)", "Bash(kubectl:*)", "Bash(gcloud:*)", "Bash(aws:*)", "Bash(fly:*)",
      "Bash(rm:*)", "Bash(mv:*)", "Bash(chmod:*)", "Bash(chown:*)", "Bash(kill:*)", "Bash(pkill:*)"
    ],
    "deny": [
      "Bash(rm -rf:*)", "Bash(sudo:*)", "Bash(curl:*|sh)", "Bash(wget:*|sh)",
      "Bash(dd if=:*)", "Bash(mkfs:*)", "Bash(:(){ :|:&};:)",
      "Read(./.env)", "Read(./.env.*)", "Read(./secrets/**)", "Read(./config/credentials.json)",
      "Read(~/.ssh/**)", "Read(~/.aws/credentials)",
      "Write(./.env)", "Write(./.env.*)", "Edit(./.env)", "Edit(./.env.*)"
    ]
  },
  "hooks": {
    "PostToolUse": [{
      "matcher": "Edit|Write|MultiEdit",
      "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/format.sh", "timeout": 10000 }]
    }],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/danger-blocker.sh", "timeout": 5000 }]
      },
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/protect-files.sh", "timeout": 5000 }]
      }
    ],
    "SessionStart": [{
      "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/inject-context.sh" }]
    }],
    "Stop": [{
      "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/verify.sh", "timeout": 60000 }]
    }],
    "Notification": [{
      "hooks": [{ "type": "command", "command": "command -v terminal-notifier >/dev/null && terminal-notifier -title 'Claude Code' -message 'Needs input' -sound Pop || true" }]
    }]
  },
  "statusLine": { "type": "command", "command": "~/.claude/statusline.sh", "padding": 2 },
  "env": {
    "CLAUDE_CODE_EFFORT_LEVEL": "xhigh",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "80",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-7",
    "CLAUDE_CODE_ENABLE_TELEMETRY": "0",
    "DISABLE_AUTOUPDATER": "0"
  },
  "respectGitignore": true,
  "enableAllProjectMcpServers": true,
  "plansDirectory": "./reports",
  "claudeMdExcludes": ["**/node_modules/**/CLAUDE.md"],
  "attribution": {
    "commit": "Co-Authored-By: Claude <noreply@anthropic.com>",
    "pr": "Generated with [Claude Code](https://claude.com/code)"
  },
  "spinnerVerbs": { "mode": "replace", "verbs": ["Cooking", "Compounding", "Verifying", "Shipping"] }
}
```

The fields that earn their keep:

- `model: "opusplan"` — Opus during plan mode, Sonnet during execute. Best of both worlds, zero manual switching.
- `defaultMode: "acceptEdits"` — Boris's setup. Plan mode is one `Shift+Tab` away when needed. Approving every diff individually is friction theater.
- `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE: "80"` — the official default is around 95%, by which point quality has already cratered. 80% is the sweet spot the `shanraisshan` repo settled on after testing.
- `effortLevel: "xhigh"` — the default for Opus 4.7 since v2.1.117. Anything less is leaving reasoning on the table.
- The deny list isn't paranoid — it's the difference between "Claude can't accidentally exfil AWS keys" and "trust the model to never make a mistake." Pick the first one.
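One habit worth adding: a malformed settings layer fails silently by falling back to defaults, so sanity-check the file before committing it. A minimal sketch (requires `jq`; the temp file stands in for your real `~/.claude/settings.json`):

```shell
# Validate a settings layer before trusting it: parse check, then echo the
# deny rules back so typos like "Bash(sudo*)" vs "Bash(sudo:*)" are visible.
f=$(mktemp)
cat > "$f" <<'EOF'
{ "permissions": { "deny": ["Bash(sudo:*)", "Read(./.env)"] } }
EOF

jq -e . "$f" > /dev/null && echo "valid JSON"
echo "deny rules:"
jq -r '.permissions.deny[]' "$f"
rm -f "$f"
```

Point it at each layer (`~/.claude/settings.json`, `.claude/settings.json`, `.claude/settings.local.json`) and diff the output against what `claude /status` reports.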
## CLAUDE.md Done Right

![The compound learning loop: mistake → correction → CLAUDE.md update → improved future sessions](/post-images/stop-using-claude-code-like-a-chatbot/03-claude-md-compounding.jpg)
*The compound learning loop. Every correction makes Claude smarter on every future session.*

Why this matters: `CLAUDE.md` is the single most impactful file in your repo. It loads every session, on every prompt. Bloated `CLAUDE.md` = ignored `CLAUDE.md`. Sharp `CLAUDE.md` = compound learning.

The hard cap: **200 lines.** That's the most-cited rule in the [`shanraisshan` reference repo](https://github.com/shanraisshan/claude-code-best-practice), and it matches the consensus from teams that have stress-tested it — past 200 lines, adherence drops off a cliff.

### Pattern 1: The Karpathy 65-line file

This is the viral one. It went from zero to 5,800 stars in a single day via [`forrestchang/andrej-karpathy-skills`](https://github.com/forrestchang/andrej-karpathy-skills). Use it as the starting template for any greenfield repo.

```markdown
# CLAUDE.md

Behavioral guidelines to reduce common LLM coding mistakes.

## 1. Think Before Coding
- State assumptions explicitly. If uncertain, ASK.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.

## 2. Simplicity First
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" not requested.
- If you write 200 lines and it could be 50, rewrite it.
- Senior-engineer test: "Is this overcomplicated?" If yes, simplify.

## 3. Surgical Changes
- Touch only what you must. Match existing style.
- Don't refactor things that aren't broken.
- Remove imports/variables YOUR changes orphaned. Don't delete pre-existing dead code.
- Test: every changed line traces to the user's request.

## 4. Goal-Driven Execution
- Define success criteria. Loop until verified.
- "Add validation" → "Write tests for invalid inputs, then make them pass"
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
- For multi-step tasks state a brief plan: each step + verify check.

## 5. Project-Specific
- Package manager: pnpm (NOT npm/yarn).
- Test runner: vitest. Run single tests with `pnpm test path/to/file.test.ts`.
- Lint/typecheck after every series of edits: `pnpm lint && pnpm typecheck`.
- Commit format: `feat(scope): subject`. One commit per logical change.
```

Each rule names a specific failure mode. No fluff, no "be helpful." That's why it works.

### The compounding pattern (steal this immediately)

After every correction Claude makes, end the message with this exact phrase:

> **"Update your CLAUDE.md so you don't make that mistake again."**

Boris's team's words: *"Claude is eerily good at writing rules for itself."* Run this for two weeks and the mistake rate measurably drops. This single behavior is the entire premise behind Dan Shipper's "Compounding Engineering" — the idea that every bug fix should leave a permanent rule behind, so the same bug never costs you twice.

The corollary, from Anthropic's own engineering team: *"If Claude does X correctly without the rule, delete the rule."* Treat `CLAUDE.md` like a garden, not a vault.

### The three-level hierarchy (for monorepos)

```
~/.claude/CLAUDE.md          ← global personal preferences (every project)
{project}/CLAUDE.md          ← project rules (committed, team-shared)
{project}/CLAUDE.local.md    ← personal overrides (gitignored)
```

Personal preferences (your name, your style, "always use TypeScript strict mode") go in the home file. Team rules go in the committed file. Personal overrides ("at this client the local convention is yarn even though the team default is pnpm") go in the gitignored local file. Don't mash them together.
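Wiring the hierarchy up is a one-time scaffold. A minimal sketch (file names match the layout above; the `.gitignore` line is the step people forget):

```shell
# One-time scaffold for the three-level CLAUDE.md hierarchy.
# Run from the project root; the HOME layer is shared across all projects.
mkdir -p "$HOME/.claude"
touch "$HOME/.claude/CLAUDE.md"   # global personal preferences
touch CLAUDE.md                   # project rules, committed
touch CLAUDE.local.md             # personal overrides, never committed

# Keep the local file out of the repo (idempotent — safe to re-run).
grep -qx 'CLAUDE.local.md' .gitignore 2>/dev/null || echo 'CLAUDE.local.md' >> .gitignore
```

Re-running it is harmless, so it can live in a project bootstrap script.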
## Subagents That Pay Rent

![Main Claude agent orchestrating six specialist subagents](/post-images/stop-using-claude-code-like-a-chatbot/04-subagent-orchestration.jpg)
*Main Claude delegates to specialist subagents. Each one keeps its work out of your main context window.*

Why this matters: every agent's frontmatter description loads into context every session. A handful of well-tuned agents beat fifty random ones. Pick a small starter set, prove they earn their keep, add more only when you hit a real gap.

Six worth keeping across every project:

**1. `code-architect`** — designs blueprints before implementation. From Anthropic's `feature-dev` plugin.

````markdown
---
name: code-architect
description: Designs feature architectures by analyzing existing codebase patterns and conventions, then providing comprehensive implementation blueprints with specific files to create/modify, component designs, data flows, and build sequences
tools: Glob, Grep, LS, Read, NotebookRead, WebFetch, TodoWrite, WebSearch
model: sonnet
---

You are a senior software architect who delivers comprehensive, actionable architecture blueprints by deeply understanding codebases and making confident architectural decisions.

## Core Process

**1. Codebase Pattern Analysis** — extract patterns, stack, abstraction layers, CLAUDE.md guidelines. Find similar features.
**2. Architecture Design** — make decisive choices, pick one approach and commit.
**3. Complete Implementation Blueprint** — specify every file to create/modify, components, integration points, data flow. Phase the work into clear steps.
````

**2. `code-reviewer`** — auto-invokes after edits, catches the 90% of issues humans miss. From Anthropic's `pr-review-toolkit`.

````markdown
---
name: code-reviewer
description: Use PROACTIVELY after writing or modifying code. Reviews for quality, security, maintainability. Provides specific line references and suggested fixes.
tools: Read, Glob, Grep, Bash
model: sonnet
---

You are a senior code reviewer.
Review only what changed (diff-only). Focus on:
- Logic bugs that produce wrong output regardless of inputs
- Injection vulnerabilities, auth flaws, secrets
- CLAUDE.md violations (quote the rule)
- Silent failures (catch blocks that swallow errors)

DO NOT flag style/lint issues, subjective improvements, or pre-existing problems.
For every issue: file:line + rationale + suggested fix.
````

**3. `code-simplifier`** — Boris's daily driver. From [`wshobson/agents`](https://github.com/wshobson/agents) (34K stars).

````markdown
---
name: code-simplifier
description: Use PROACTIVELY after a feature is complete. Removes redundancy, dead code, over-abstractions; collapses near-duplicate functions; simplifies control flow without changing behavior.
tools: Read, Edit, MultiEdit, Glob, Grep, Bash
model: sonnet
---

You simplify working code without changing behavior.

Apply ruthlessly:
1. Inline single-call helpers
2. Collapse 2+ near-duplicate functions into 1 with a parameter
3. Remove dead code, unused imports, "just in case" branches
4. Reduce nesting (early returns, guard clauses)
5. Replace cleverness with clarity

NEVER change observable behavior. Run tests after every change. If a test breaks, revert the change.
````

**4. `silent-failure-hunter`** — catches the bugs you'd otherwise ship. From the `pr-review-toolkit`.

````markdown
---
name: silent-failure-hunter
description: Use this agent when reviewing code changes that involve error handling, catch blocks, fallback logic, or any code that could potentially suppress errors.
model: inherit
---

## Core Principles
1. Silent failures are unacceptable
2. Users deserve actionable feedback
3. Fallbacks must be explicit AND justified
4. Catch blocks must be specific (no broad excepts)
5. Mock/fake implementations belong only in tests
````

**5. `librarian`** — fetches up-to-date docs without burning a turn on web search. Jarrod Watts's pattern.
````markdown
---
name: librarian
description: Use PROACTIVELY when an external library, framework, or API is mentioned. Fetches up-to-date docs (via Context7 MCP or WebFetch) and returns a 1-page summary with links to the relevant sections.
tools: Read, WebFetch, WebSearch, mcp__context7__*
model: haiku
---

You are the team's reference librarian. When invoked:
1. Identify the library/framework + the specific question
2. Use Context7 MCP if available; else WebFetch official docs
3. Return: 1-paragraph summary, code example, link to deepest doc page
4. Stop. Don't implement, don't speculate.
````

**6. `verify-app`** — Boris's end-to-end tester, the back half of the `/go` skill below.

````markdown
---
name: verify-app
description: Use PROACTIVELY before marking work complete. Runs the project's full verification: tests, build, smoke checks via Playwright/curl. Returns PASS or specific failures. Never modifies code.
tools: Read, Bash, mcp__playwright__*
model: sonnet
---

You verify, you don't fix. When invoked:
1. Read CLAUDE.md / package.json for the verify command
2. Run: lint → typecheck → unit tests → build → smoke test
3. If frontend: launch Playwright, hit the changed pages, check console errors
4. If backend: hit /health and the changed endpoints
5. Return: PASS or list each failure with file:line + the exact failing assertion
````

The magic phrase: append `"use subagents"` to any prompt to throw more compute at the problem. Boris's tip #8, and it works exactly the way it sounds.

## Skills: The Killer Feature Most People Miss

Why this matters: skills are markdown files at `.claude/skills/<name>/SKILL.md` that auto-invoke based on description trigger phrases. They're the missing layer between always-on `CLAUDE.md` rules and one-shot slash commands.

Anthropic recently merged slash commands into skills — both produce `/name`, but skills support progressive disclosure, supporting files (`references/`, `assets/`, `scripts/`), and reuse across plugins.
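On disk, a skill is just a directory. A minimal scaffolding sketch — the `changelog` skill name and its contents are invented for illustration; only the `.claude/skills/<name>/SKILL.md` path and the `name`/`description` frontmatter keys come from the docs:

```shell
# Scaffold one skill: SKILL.md carries the trigger description,
# deep content lives in references/ and loads only on demand.
mkdir -p .claude/skills/changelog/references

cat > .claude/skills/changelog/SKILL.md <<'EOF'
---
name: changelog
description: Updates CHANGELOG.md from recent commits. Use after merging a PR or when asked to "update the changelog".
---
Read references/format.md for the house style, then summarize
`git log` since the last tag into a new CHANGELOG.md entry.
EOF

echo '# Keep a Changelog style, newest entries first.' > .claude/skills/changelog/references/format.md
```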
Decision matrix:

| Need | Use |
|---|---|
| Always-on rules + project facts | `CLAUDE.md` |
| Path-scoped rules (only for `src/api/**`) | `.claude/rules/*.md` with `paths:` frontmatter |
| Reusable workflow user OR Claude can invoke | **Skill** |
| Reference knowledge Claude pulls in when relevant | Skill (default frontmatter) |
| Domain expertise with own context + tools | Subagent |
| Voice/tone/persona change | Output style |
| Guarantee an action runs | Hook |

Three skills worth installing on every project:

### `/fewer-permission-prompts` (Anthropic-shipped)

Boris's #1 tip from his April 2026 thread. Run it once after a few sessions. It scans your approval history and outputs a paste-ready `permissions.allow` array. Stop hand-curating allowlists — let Claude do it.

### `/go` — the Boris pipeline

````markdown
---
name: go
description: After implementation, ship it end-to-end. Tests itself (bash, browser, computer use), runs /simplify, then opens a PR. Use when work feels complete and you want to verify-and-ship in one shot.
allowed-tools: Bash, Read, Edit, mcp__playwright__*, Bash(gh pr create:*)
---

# /go — Verify, Simplify, Ship

## Phase 1: Verify
- Run lint, typecheck, full test suite
- If frontend: Playwright smoke test of changed pages, capture console errors
- If backend: hit /health and changed endpoints, verify expected payloads
- If ANY failure, STOP and report. Do NOT proceed to simplify.

## Phase 2: Simplify
- Invoke @code-simplifier on the diff only
- Re-run tests. If anything breaks, revert the simplification.

## Phase 3: Ship
- `git add -p` (or stage all if appropriate)
- Commit with conventional format: `feat(scope): subject`
- Push branch
- `gh pr create` with summary bullets + test plan + screenshots/log snippets

Output: PR URL or specific failure with file:line.
````

This single skill does more work than any other piece of the setup. Boris claims 2-3x quality on the final PR. That's conservative.
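The Phase 1 gate is useful outside Claude too (pre-push hook, CI). A standalone sketch, assuming an npm project and `jq` installed — it runs only the scripts that actually exist in `package.json`, so a missing gate is skipped rather than failing:

```shell
#!/usr/bin/env sh
# Mirror of /go Phase 1: run each existing gate in order, stop on first failure.
set -e
for gate in lint typecheck test build; do
  # jq -e exits non-zero when .scripts[$gate] is missing/null → gate skipped
  if jq -e --arg g "$gate" '.scripts[$g]' package.json > /dev/null 2>&1; then
    npm run "$gate"
  fi
done
echo "Phase 1: PASS"
```

Wire it as `.git/hooks/pre-push` and the "never ship red" rule holds even when you bypass Claude entirely.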
### `/sync` — weekly context dump

````markdown
---
name: sync
description: Pull last 7 days of Slack threads, GDrive docs, Asana tasks, and GitHub PRs/issues into a single context dump. Use at the start of a planning session or after vacation.
allowed-tools: mcp__slack__*, mcp__github__*, WebFetch
---

1. Slack: search workspace for "$ARG" or assigned messages from last 7d
2. GitHub: PRs touched / reviewed / mentioned in last 7d
3. Asana: tasks I own with last_modified within 7d
4. Output: dated bullets grouped by source. No commentary.
````

Use it Monday morning. Use it after vacation. Use it before any planning session.

A few rules from Anthropic's own [`skill-development`](https://github.com/anthropics/skills) skill that nobody talks about:

- **Trigger phrases live in the `description`.** Pack it with verbs your prompts contain: "create a hook," "block dangerous commands," "open a PR." Each phrase is another auto-invocation surface.
- **Keep `SKILL.md` body under 500 lines.** Move deep content to `references/`. Progressive disclosure beats everything-in-one-file.
- **Description in third person** ("Processes Excel files…"), not first or second.
- **Gerund names** (`processing-pdfs`), not generic ones (`utils`).
- **`disable-model-invocation: true`** for skills with side effects (deploy, commit). Forces user-only invocation.

## Hooks: Making Claude Deterministic

![The lifecycle of a tool call in Claude Code, showing where each hook fires](/post-images/stop-using-claude-code-like-a-chatbot/05-hooks-lifecycle.jpg)
*Where each hook fires in the lifecycle of a tool call. CLAUDE.md is advisory; hooks are guarantees.*

Why this matters: `CLAUDE.md` is advisory. Hooks are enforced. If a rule MUST hold — no `.env` writes, no `rm -rf`, lint runs after every edit — make it a hook. The model can't ignore a shell script.

Five hooks worth running. Each does exactly one thing.

### Hook 1: PostToolUse formatter

Auto-format after every edit.
Boris's tip #9: *"Claude usually generates well-formatted code, but the hook handles the last 10% so CI never fails on style."*

`.claude/hooks/format.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

INPUT=$(cat)
FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // .tool_response.file_path // empty')
[ -z "$FILE" ] && exit 0
[ ! -f "$FILE" ] && exit 0

case "$FILE" in
  *.ts|*.tsx|*.js|*.jsx|*.json|*.md|*.css)
    npx --no-install prettier --write "$FILE" 2>/dev/null || true ;;
  *.py)
    command -v black >/dev/null && black -q "$FILE" 2>/dev/null || true
    command -v ruff >/dev/null && ruff check --fix "$FILE" 2>/dev/null || true ;;
  *.go) gofmt -w "$FILE" 2>/dev/null || true ;;
  *.rs) rustfmt "$FILE" 2>/dev/null || true ;;
  *.sh) command -v shfmt >/dev/null && shfmt -w "$FILE" 2>/dev/null || true ;;
esac
exit 0
```

### Hook 2: PreToolUse danger blocker

The hardstop on dangerous commands. Mirrors the official `validate-bash.sh` example.

`.claude/hooks/danger-blocker.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
[ -z "$CMD" ] && { echo '{"continue": true}'; exit 0; }

deny() {
  echo "{\"hookSpecificOutput\":{\"hookEventName\":\"PreToolUse\",\"permissionDecision\":\"deny\",\"permissionDecisionReason\":\"$1\"}}" >&2
  exit 2
}

[[ "$CMD" == *"rm -rf /"* ]] && deny "Destructive: rm -rf /"
[[ "$CMD" == *"rm -rf ~"* ]] && deny "Destructive: rm -rf home"
[[ "$CMD" == *"rm -rf ."* ]] && deny "Destructive: rm -rf . in cwd"
[[ "$CMD" == *"dd if="* ]] && deny "Block-level dd"
[[ "$CMD" == *"mkfs"* ]] && deny "Filesystem create"
[[ "$CMD" == *"> /dev/sd"* ]] && deny "Raw device write"
[[ "$CMD" == *":(){ :|:&};:"* ]] && deny "Fork bomb"
[[ "$CMD" == *"curl"*"|"*"sh"* ]] && deny "curl|sh pattern"
[[ "$CMD" == *"wget"*"|"*"sh"* ]] && deny "wget|sh pattern"
[[ "$CMD" == *".env"* && "$CMD" == *">"* ]] && deny "Writing to .env"

exit 0
```

### Hook 3: PreToolUse file protection

`.claude/hooks/protect-files.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

INPUT=$(cat)
FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
[ -z "$FILE" ] && exit 0

deny() {
  echo "{\"hookSpecificOutput\":{\"hookEventName\":\"PreToolUse\",\"permissionDecision\":\"deny\",\"permissionDecisionReason\":\"Protected file: $1\"}}" >&2
  exit 2
}

case "$FILE" in
  */.env|*/.env.*) deny "$FILE" ;;
  */secrets/*) deny "$FILE" ;;
  */.git/config|*/.gitconfig) deny "$FILE" ;;
  */.ssh/*) deny "$FILE" ;;
  */node_modules/*) deny "$FILE" ;;
  */.venv/*|*/venv/*) deny "$FILE" ;;
  */dist/*|*/build/*) deny "$FILE" ;;
esac
exit 0
```

### Hook 4: Stop verifier

The single highest-ROI hook. When Claude says it's done, run the tests. If they fail, force a re-loop. No more cycles of "implemented it" followed by "wait, the tests are red."

`.claude/hooks/verify.sh`:

```bash
#!/usr/bin/env bash
# No `set -e` here: a failing test suite must not kill the script
# before it can emit the block decision.
set -uo pipefail

# Only run if there are unstaged or staged changes
git diff --quiet && git diff --cached --quiet && exit 0

STATUS=0
if [ -f package.json ]; then
  npm test --silent 2>&1 | tail -50; STATUS=${PIPESTATUS[0]}
elif [ -f Cargo.toml ]; then
  cargo test --quiet 2>&1 | tail -50; STATUS=${PIPESTATUS[0]}
elif [ -f pyproject.toml ]; then
  pytest -q 2>&1 | tail -50; STATUS=${PIPESTATUS[0]}
elif [ -f go.mod ]; then
  go test ./... 2>&1 | tail -50; STATUS=${PIPESTATUS[0]}
fi

if [ "$STATUS" -ne 0 ]; then
  echo '{"decision":"block","reason":"Tests failed. Fix them before stopping."}'
fi
exit 0
```

### Hook 5: SessionStart context injector

Inject git state and active notes on every fresh session, so Claude isn't starting cold.
`.claude/hooks/inject-context.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

BRANCH=$(git branch --show-current 2>/dev/null || echo "no-git")
RECENT=$(git log --oneline -5 2>/dev/null || echo "")
DIRTY=$(git status --porcelain 2>/dev/null | wc -l | tr -d ' ')
NOTES=""
[ -f ./notes/active.md ] && NOTES=$(cat ./notes/active.md)

cat <<EOF
Branch: $BRANCH ($DIRTY dirty files)
Recent commits:
$RECENT
Active notes:
$NOTES
EOF
```

## Slash Commands

Slash commands live at `.claude/commands/<name>.md`. The body is a prompt; backticks with `!` execute commands inline.

### `/commit` (Anthropic official)

````markdown
---
allowed-tools: Bash(git add:*), Bash(git status:*), Bash(git commit:*)
description: Create a git commit
---

## Context

- Current git status: !`git status`
- Current git diff (staged and unstaged changes): !`git diff HEAD`
- Current branch: !`git branch --show-current`
- Recent commits: !`git log --oneline -10`

## Your task

Based on the above changes, create a single git commit.

You have the capability to call multiple tools in a single response. Stage and create the commit using a single message. Do not use any other tools or do anything else.
````

### `/commit-push-pr`

The most-used one in this stack. Goes from "done" to a PR URL in one shot.

````markdown
---
allowed-tools: Bash(git checkout:*), Bash(git add:*), Bash(git status:*), Bash(git push:*), Bash(git commit:*), Bash(gh pr create:*)
description: Commit, push, and open a PR
argument-hint: [optional summary]
---

## Context

- Status: !`git status`
- Diff: !`git diff HEAD`
- Branch: !`git branch --show-current`
- Recent: !`git log --oneline -10`

## Your task

1. Create a new branch if on main/master
2. Create a single commit with a descriptive conventional message (feat:/fix:/refactor:/docs:/test:/chore:)
3. Co-Authored-By: Claude <noreply@anthropic.com>
4. Push the branch with `-u` if needed
5. Create a pull request using `gh pr create` with summary + test plan
6. You MUST do all of the above in a single response via parallel tool calls.
````

### `/code-review` (Anthropic flagship)

A multi-agent PR review that spawns four parallel Sonnet/Opus reviewers.
Worth the read just to see how they orchestrate.

````markdown
---
allowed-tools: Bash(gh pr view:*), Bash(gh pr diff:*), Bash(gh pr comment:*), mcp__github__*
description: Code review a pull request
argument-hint: [--comment]
---

Provide a code review for the given pull request.

Steps:
1. Launch HAIKU agent → check if PR closed/draft/already-reviewed
2. Launch HAIKU agent → list relevant CLAUDE.md file paths
3. Launch SONNET agent → summarize the diff
4. Launch 4 SONNET/OPUS agents IN PARALLEL:
   • Agents 1+2 (Sonnet): CLAUDE.md compliance audit
   • Agent 3 (Opus): bug scan, diff-only context
   • Agent 4 (Opus): logic/security issues in changed code
5. For each issue, launch a validation subagent
6. Filter to only validated issues
7. If --comment provided: post inline comments via mcp__github_inline_comment

CRITICAL: Only HIGH SIGNAL issues:
- Code WILL fail to compile/parse
- Code WILL produce wrong results regardless of inputs
- Clear unambiguous CLAUDE.md violation (quote the rule)

Do NOT flag style, subjective improvements, pedantic nitpicks, or pre-existing issues.
````

### `/techdebt`

End-of-session debt sweep. Run before stopping for the day.

````markdown
---
description: Find and kill duplicated code, dead exports, over-abstractions in the current package
allowed-tools: Read, Grep, Glob, Edit, MultiEdit, Bash(npm run *)
---

## Context

- Modified files this session: !`git diff --name-only HEAD`
- Untracked: !`git ls-files --others --exclude-standard`

## Your task

1. Find duplicated logic across the modified files
2. Find dead exports (exported symbols with zero references in the repo)
3. Find over-abstractions (single-call helpers, single-impl interfaces)
4. Propose a list of removals/inlinings — DO NOT change behavior
5. Wait for my approval. After approval, apply and run the test suite.
````

### `/ultrathink-plan`

Force a structured 5-phase deep analysis before any major change. Always waits for approval before implementing.
````markdown
---
description: Force structured 5-phase deep analysis. Always WAITS for approval before implementing.
argument-hint:
---

ultrathink

Analyze "$ARGUMENTS" through these 5 phases. Output each phase as a section.

## Phase 1: Problem Understanding
- Restate the problem in your own words
- List explicit + implicit requirements
- List unknowns to resolve

## Phase 2: Context Gathering
- Files / modules involved (cite paths)
- Existing patterns to follow / avoid (cite file:line)
- External dependencies + their versions

## Phase 3: Solution Design
Score each candidate on 5 axes (1-5 each):
- Complexity, Maintenance, Performance, Security, Testing
Pick the highest-total approach. Justify trade-offs.

## Phase 4: Edge Cases & Failure Modes
- Concurrency, partial failure, malicious input, scale, ops

## Phase 5: Recommendation
Numbered implementation steps with per-step verify check.

STOP. Wait for "approved" before writing any code.
````

### `/rev-engine`

Iteratively critique a plan along six axes until it's clean. Pair with `/ultrathink-plan`.

````markdown
---
description: Iteratively critique a plan along 6 axes until no issues remain.
argument-hint:
---

Critique the plan in $ARGUMENTS along these 6 axes. For each issue, output {axis, severity, fix}. Repeat until you can return "No issues found".

Axes:
1. Edge cases the plan misses
2. Failure modes (network, partial-write, crash recovery)
3. Redundancy / steps that could be removed
4. Ordering / dependency mistakes
5. Test coverage gaps (which steps lack a verify check)
6. Security holes (auth, injection, secret handling)

Output:
- Iteration 1: list of issues + proposed fixes
- Updated plan
- Iteration 2: re-critique the updated plan
- ... repeat until clean
````

### `/simplify`

````markdown
---
description: Simplify the diff without changing behavior
allowed-tools: Read, Edit, MultiEdit, Bash(npm test:*)
---

ultrathink

Read the current diff: !`git diff HEAD`

Apply the @code-simplifier agent. After every change, run tests. If a test breaks, revert that change.

Output: list of simplifications applied + diff size reduction.
````

## The 5 MCPs That Actually Pay Off

![Claude Code connected to five MCP servers: Context7, GitHub, Playwright, Supabase, Slack](/post-images/stop-using-claude-code-like-a-chatbot/06-mcp-five-servers.jpg)
*The five MCPs worth your context budget. Boris's rule: "You'll keep five of fifteen."*

Why this matters: Boris's pattern is "install 15, keep 5 after three months." Each MCP's tool descriptions enter context every session — so MCP bloat is permanent context tax. These five are the keepers across every team that's posted their setup.

Important caveat from the official docs: CLI tools (`gh`, `aws`, `gcloud`, `bq`, `sentry-cli`) are *more context-efficient* than MCP servers because their tool listings don't enter context. Use MCP only when you need real-time push (Slack channels) or when no CLI exists. Boris uses `bq` directly for BigQuery rather than an MCP for exactly this reason.

Drop the combined config into `.mcp.json` at your project root and commit it:

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp@latest"],
      "env": { "CONTEXT7_API_KEY": "${CONTEXT7_API_KEY}" }
    },
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/",
      "headers": { "Authorization": "Bearer ${GITHUB_TOKEN}" }
    },
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": { "POSTGRES_CONNECTION_STRING": "${DATABASE_URL}" }
    },
    "slack": {
      "type": "http",
      "url": "https://mcp.slack.com/mcp"
    }
  }
}
```

The why for each:

- **Context7** is the highest-ROI MCP, period. Add `use context7` to any prompt and Claude pulls the exact, version-correct docs page. No more hallucinated APIs from libraries that changed three releases ago.
- **GitHub** is non-negotiable for any team that ships code.
Issue triage, PR review, `gh pr create` — all of it.
- **Playwright** is the verification half of `/go`. Boris: *"For frontend, Claude tests every change with the Claude Chrome extension — opens browser, tests UI, iterates until UX feels good."* Playwright MCP is the same thing without the extension dependency.
- **Postgres/Supabase** in `--read-only` mode. Boris: *"I haven't written SQL in 6 months."* Don't run Claude in write mode until trust has accumulated over weeks.
- **Slack** for incoming bug context. Boris's #1 workflow tip from January: *"Paste a Slack thread, just say 'fix'."*

Audit `/context` and `/mcp` regularly. If a server hasn't been used in a month, kill it.

## Model + Effort Tuning

The aliases:

| Alias | Resolves to | Use |
|---|---|---|
| `default` | Account default — Max/Team Premium → Opus 4.7; others → Sonnet 4.6 | sane fallback |
| `opusplan` | Opus during plan mode, Sonnet during execute | **best default** |
| `opus` | Opus 4.7 | hardest reasoning |
| `sonnet` | Sonnet 4.6 | most coding |
| `haiku` | Haiku latest | fast triage / simple subagents |
| `opus[1m]` / `sonnet[1m]` | 1M-token context variants | giant codebase work |

The `opusplan` trick is Boris's vanilla default. Plan in Opus on high effort for accuracy → switch to Sonnet for speed during implementation → verify in Opus again. The alias does the swap automatically.

Effort levels (Opus 4.7+): `low`, `medium`, `high`, `xhigh`, `max`. Use `xhigh` as the default — it's been the Opus default since v2.1.117 for a reason. `max` is real but it can over-think; reserve it for security audits and critical refactors.

Set effort five ways, in precedence order:

1. `CLAUDE_CODE_EFFORT_LEVEL=xhigh` env var (beats everything)
2. `effort` in skill/agent frontmatter (per-invocation)
3. `effortLevel` in `settings.json` (project default)
4. `/effort xhigh` slash command (session)
5. `--effort xhigh` CLI flag (one-shot)

### `ultrathink` — the deep-reasoning escape hatch

Type `ultrathink` anywhere in a prompt to trigger the maximum thinking budget (~31,999 tokens) for that turn. Anthropic removed it in late Q1 2026 and **restored it in v2.1.68** after a community revolt — hundreds of bug reports about quality regressions. Use it when stakes are high.

```
ultrathink — design the cache invalidation for the user-permissions service.
List 3 candidate strategies, score them on consistency / latency / complexity,
recommend one with a step-by-step migration plan.
```

### The haiku→opus trick (read the cost warning first)

```bash
export ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-opus-4-7
```

What this does: every subagent, classifier, and hook that defaults to `model: haiku` (and there are many — `code-review` uses Haiku for the closed/draft check, permission classifiers, PR triage) silently runs on Opus 4.7 instead.

**Cost warning, read it twice.** Background usage is roughly $0.04 per session under default settings. With this set, that gets multiplied by 10x. Only enable it on Max or Team plans where the math has been thought through. To audit first, pair with `CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1` to see what would have run.

When it's worth it: classifier accuracy noticeably improves when permission routing runs on Opus. When it's not: anything billed by the call. Don't enable this on a Pro plan.

### Shift+Tab — the cycle

`Default` → `Auto-Accept Edits` → `Plan` → (`Auto`, if your plan supports it) → `Default`.

This is the single biggest UX win in 2026. Plan a change, accept it, switch to acceptEdits, watch Claude one-shot the implementation. Learn the cycle, use it every PR.

## The Workflow Patterns That Compound

![The four-phase workflow loop: Plan, Implement, Verify, Simplify](/post-images/stop-using-claude-code-like-a-chatbot/07-plan-implement-verify.jpg)
*The four-phase loop.
Plan with Opus, implement with Sonnet, verify with tests, simplify before merge.* ### 1. Five parallel git worktrees This is the headline. Boris: *"Spin up 3-5 git worktrees, each with its own Claude session. Single biggest productivity unlock, top tip from the team."* ```bash # Set up 5 worktrees off main for i in 1 2 3 4 5; do git worktree add -b feature-$i ../wt-$i main done # Open 5 terminal tabs, name them 1-5, start a Claude in each cd ../wt-1 && claude # in tab 1 cd ../wt-2 && claude # in tab 2 # ... etc ``` Add shell aliases to hop: `alias za='cd ~/wt-1'`, `zb`, `zc`. Reserve one worktree as an "analysis" tab — log reading, BigQuery, anything verbose — so noisy data never pollutes the coding context. Configure iTerm2 or Ghostty to fire system notifications when a Claude needs input (the `Notification` hook above does this on macOS). You'll know which tab to switch to without checking each one. ### 2. Plan → Implement → Verify ``` 1. Explore (Plan Mode): Shift+Tab twice. Read code. Build context. Don't change. 2. Plan: "Here's what I want. Output a numbered plan with verify steps." 3. Implement (Normal): Shift+Tab once to acceptEdits. Claude executes. You watch. 4. Verify (Stop hook): Tests run. /go invokes verifier subagent. Or press Ctrl+G to review. ``` Boris: *"Pour energy into the plan. A good plan is really important. Then auto-accept edits and Claude usually 1-shots it."* The team's twist: **two-Claude review.** Claude #1 writes the plan. Claude #2 reviews it as a staff engineer. Iterate until clean. Then implement. The cost of a 30-second plan critique is nothing compared to a 30-minute wrong implementation. ### 3. The verification rule > *"Give Claude a way to verify its work. 
If Claude has that feedback loop, it will 2-3x the quality of the final result."* — Boris Cherny

For each domain:

- **Frontend:** Playwright MCP or the Claude Chrome extension — opens browser, tests UI
- **Backend:** bash test commands, curl health endpoints
- **Mobile:** phone simulator
- **Data:** `bq` CLI to query and assert

The Stop hook above (`verify.sh`) makes this automatic.

### 4. The single highest-ROI prompt

After any mediocre fix:

> **"Knowing everything you know now, scrap this and implement the elegant solution."**

Burn it into muscle memory. Universally cited as the single most valuable phrase of 2026. The first implementation is exploration; the second is the one to keep.

### 5. Voice input

Boris: *"You speak 3x faster than you type, and your prompts get way more detailed."* Mckay Wrigley: *"If you're not using voice as input, you're working in the stone age."*

Voice doesn't just speed things up — it changes the prompts themselves. People naturally provide more detail and rationale verbally than they type. More context in, better results out. Wispr Flow on macOS, the built-in fn-fn dictation, or anything else that gets voice into the prompt buffer.

### Bonus: pseudocode in the codebase

Sometimes the fastest spec is pseudocode written directly into the file as a comment; then ask Claude to implement it. Mckay: *"Opus is astonishingly good at inferring what you mean."*

```typescript
// PSEUDO:
// 1. validate input (zod schema below)
// 2. fetch user from db; if !user throw 404
// 3. compute permissions diff
// 4. if changed, write to audit log
// 5. return new permission set
```

## Anti-Patterns: Stop Doing These

![Side-by-side comparison of Claude Code anti-patterns versus best practices](/post-images/stop-using-claude-code-like-a-chatbot/08-anti-patterns.jpg)

*Stop doing the things on the left. Start doing the things on the right.*

In rough order of severity:

1. **Bloated `CLAUDE.md`.** Anything over 200 lines is functionally ignored.
Move deep stuff to skills, which load on demand.
2. **`--dangerously-skip-permissions` as a daily habit.** Run `/fewer-permission-prompts` once and put the output in your allowlist. Boris: *"NEVER --dangerously-skip-permissions."*
3. **Kitchen-sink sessions.** Switching from auth refactor to log analysis without `/clear` = degraded performance every message after. Treat `/clear` as a free reset.
4. **Treating hooks as optional.** Hooks are the only deterministic guarantee. CLAUDE.md is advisory. If a rule MUST hold (no `.env` writes, no `rm -rf`), make it a hook.
5. **Correcting the same mistake twice.** Add the rule. Compound learning is the entire point.
6. **Over-installing subagents.** Each agent's frontmatter description loads every session. Five well-tuned > fifty random.
7. **Over-installing MCPs.** The usual arc is fifteen installed on day one, five still in use after three months. Tool descriptions cost tokens forever.
8. **Letting auto-compact hit at 95%.** Set `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=80`. By the time context usage reaches 95%, performance has already degraded.
9. **Trust without verification.** Always provide a way for Claude to check itself — tests, screenshots, smoke commands. Anthropic's engineering team calls it their single highest-impact practice.
10. **Vague prompts on important work.** "Add input validation to `login()` in `auth.ts`" beats "improve the codebase" by 10x. Voice input naturally fixes this.
11. **Editing `CLAUDE.md` without observing whether behavior actually changed.** Anthropic's rule: *"If Claude does X correctly without the rule, delete the rule."*

## The Drop-In Starter Pack

Spend thirty minutes wiring this up and you're in the 90th percentile of Claude Code users.

```
my-project/
├── CLAUDE.md                # Karpathy 65-line + project specifics. <200 lines hard cap.
├── CLAUDE.local.md          # Personal overrides (gitignored)
├── .mcp.json                # Context7 + GitHub + Playwright + Postgres + Slack
├── .gitignore               # Add: .claude/settings.local.json, CLAUDE.local.md, .claude/worktrees/
├── notes/
│   ├── active.md            # Current blockers (loaded by SessionStart hook)
│   └── decisions.md         # Architectural decision log
└── .claude/
    ├── settings.json        # The annotated power-user JSON above. Committed.
    ├── settings.local.json  # Personal env vars / model overrides. Gitignored.
    ├── agents/
    │   ├── code-architect.md        # Designs blueprints
    │   ├── code-reviewer.md         # Auto after edits
    │   ├── code-simplifier.md       # Boris daily — removes redundancy
    │   ├── silent-failure-hunter.md # Catches swallowed errors
    │   ├── librarian.md             # Fetches external docs
    │   └── verify-app.md            # End-to-end verifier
    ├── skills/
    │   ├── go/SKILL.md              # Verify+simplify+ship pipeline
    │   ├── sync/SKILL.md            # 7-day cross-tool context dump
    │   └── frontend-design/SKILL.md # Anti-AI-slop UI
    ├── commands/
    │   ├── commit.md
    │   ├── commit-push-pr.md
    │   ├── code-review.md
    │   ├── techdebt.md
    │   ├── ultrathink-plan.md
    │   ├── rev-engine.md
    │   └── simplify.md
    ├── hooks/
    │   ├── format.sh            # PostToolUse: prettier/black/gofmt/rustfmt
    │   ├── danger-blocker.sh    # PreToolUse: block rm -rf, fork bombs, curl|sh
    │   ├── protect-files.sh     # PreToolUse: deny .env / secrets / .git edits
    │   ├── verify.sh            # Stop: run tests; block stop on failure
    │   └── inject-context.sh    # SessionStart: git state + notes/active.md
    └── rules/
        └── api-conventions.md   # paths-scoped: only loads when editing src/api/**
```

Plus globally on the machine:

```
~/.claude/
├── CLAUDE.md        # Personal preferences (every project)
├── settings.json    # User-level overrides
├── statusline.sh    # Status line script
└── agents/          # Personal agents (e.g. learning-tutor.md)
```

The whole thing fits in a starter pack — clone it, `cp -r` it into a project, edit the `CLAUDE.md` to match the stack, and it's running.
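Anti-pattern #4 above is concrete, not rhetorical: a PreToolUse hook is a few lines of shell. Here's a minimal sketch of what a `danger-blocker.sh` could look like. The patterns and the grep-based matching are illustrative assumptions, not the battle-tested version (a production hook would parse the stdin JSON properly, e.g. with `jq`), but the exit-code-2 blocking convention is the real hooks contract:

```shell
# danger-blocker.sh (sketch): PreToolUse hook.
# Claude Code pipes the pending tool call to the hook as JSON on stdin;
# exiting with status 2 blocks the call and feeds stderr back to Claude.
danger_block() {
  local payload pattern
  payload=$(cat)
  # Illustrative patterns only; tune these for your own threat list.
  for pattern in 'rm -rf' 'curl[^|]*\|[[:space:]]*(ba)?sh' 'mkfs\.'; do
    if printf '%s' "$payload" | grep -Eq "$pattern"; then
      echo "danger-blocker: refused command matching '$pattern'" >&2
      return 2
    fi
  done
  return 0
}

# Demo: a destructive command gets refused with status 2.
status=0
echo '{"tool_input":{"command":"rm -rf /tmp/scratch"}}' | danger_block || status=$?
echo "exit status: $status"   # → exit status: 2
```

As an actual hook script the file would simply end with `danger_block`, so the script's exit status becomes the hook's verdict.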
The configs are battle-tested, the hooks are executable, and every snippet has been used on real projects. ## Sources & Credits This setup is a synthesis. The good ideas are stolen, with attribution. **Boris Cherny** (the engineer who built Claude Code): - "How I use Claude Code," January 2 2026 — [@bcherny on X](https://x.com/bcherny/status/2007179832300581177) - "10 team tips" thread, January 31 2026, 8.9M views — [@bcherny on X](https://x.com/bcherny/status/2017742741636321619) - "6 tips for Opus 4.7," April 16 2026 — [Threads](https://www.threads.com/@boris_cherny) - [howborisusesclaudecode.com](https://howborisusesclaudecode.com) — 87 indexed tips **Mckay Wrigley:** - "Opus 4.5 thoughts," December 6 2025 — [mckaywrigley.com](https://www.mckaywrigley.com/posts/opus-4.5) - "Claude Agent (10x Workflows)" — [mckaywrigley.substack.com](https://mckaywrigley.substack.com/p/claude-agent) **Anthropic official:** - [Claude Code repo](https://github.com/anthropics/claude-code) (117K stars) — bundled plugins for `code-review`, `feature-dev`, `hookify`, `pr-review-toolkit`, `commit-commands`, `frontend-design`, `plugin-dev` - [Skills repo](https://github.com/anthropics/skills) (123K stars) - [Hooks reference](https://docs.claude.com/en/docs/claude-code/hooks) - [Best practices](https://anthropic.com/engineering/claude-code-best-practices) **Community repos worth bookmarking:** - [`forrestchang/andrej-karpathy-skills`](https://github.com/forrestchang/andrej-karpathy-skills) (86K stars) — the 65-line CLAUDE.md - [`garrytan/gstack`](https://github.com/garrytan/gstack) (83K stars) — Garry Tan's actual setup - [`shanraisshan/claude-code-best-practice`](https://github.com/shanraisshan/claude-code-best-practice) (48K stars) — the reference implementation - [`hesreallyhim/awesome-claude-code`](https://github.com/hesreallyhim/awesome-claude-code) (41K stars) — canonical awesome list - [`wshobson/agents`](https://github.com/wshobson/agents) (34K stars) — 184 agents, 79 plugins, 150 
skills, 98 commands - [`VoltAgent/awesome-claude-code-subagents`](https://github.com/VoltAgent/awesome-claude-code-subagents) (18K stars) — 100+ specialized subagents - [`jarrodwatts/claude-code-config`](https://github.com/jarrodwatts/claude-code-config) (1K stars) — the XML-style senior-engineer pattern - [`davila7/claude-code-templates`](https://github.com/davila7/claude-code-templates) — 40+ JSON hook recipes + statusline gallery + CLI installer --- # 20 days that changed the AI agent market URL: https://www.thedeepfeed.ai/posts/2026-04-28-20-days-that-changed-the-ai-agent-market/ Category: Business Date: 2026-04-28 Tags: anthropic, claude-managed-agents, agents, market-analysis, vertical-saas > Anthropic shipped Claude Managed Agents on April 8, 2026 at $0.08/hour. Twenty days, 2,029 tweets, and four converging platforms later, infrastructure has commoditized — domain expertise is the new moat. ![Claude Managed Agents — the day the market shifted](/post-images/20-days-that-changed-the-ai-agent-market/hero.jpg) ## Part I: The day the market shifted On April 8, 2026, **Anthropic** announced Claude Managed Agents. This was one of the first reactions: > "I just pulled an all-nighter building exactly this. Have never been more excited in my life. Didn't eat. Didn't sleep. Cancelled meetings. Thought I was a genius. > > Wake up and this is the first tweet I see." > — @michael_chomsky (♥ 599) Anthropic had shipped a fully hosted runtime for autonomous AI agents — not a chatbot feature, but production infrastructure for agents that run for hours in the cloud. Michael Chomsky had spent the night building exactly this: sandboxed environments, persistence, error recovery. He was convinced he'd cracked it. Then he opened Twitter. 599 likes, 102 replies. Most said the same thing: *"Ship it anyway."* The announcement tweet: **56,883 likes**. **21.2 million impressions**. **3,183 quote tweets**. **50,745 bookmarks**. 
For context, OpenAI's biggest product announcement in the past year peaked at roughly half those numbers. Google's Gemini launches rarely break 10K likes. The bookmark count is the interesting number — 50,745 people saved this tweet to come back to, which suggests planning, not just reaction. Over twenty days, this analysis collected 2,029 original tweets across 27 opportunity categories, scraped 18 articles from TechCrunch, VentureBeat, The Information, and Anthropic's documentation, and tracked four competing platform launches, a memory infrastructure update, and the first production numbers from enterprise adopters. --- ## Part II: What Anthropic actually built What Anthropic shipped: a **managed agent runtime**. Not an API wrapper or a chatbot with plugins — a production environment where AI agents run, persist, and operate autonomously. The architecture follows what Anthropic calls the **"brain + hands" model**. The brain is Claude — the reasoning engine. The hands are sandboxed Linux containers with filesystem access, bash execution, web browsing, code interpretation, and MCP tool integration. Each agent runs in its own isolated container with 8GB RAM and 10GB disk. Agents can operate for hours, survive disconnections, and checkpoint their own state. The pricing: **$0.08** per session-hour · **~$58** per month for 24/7 · **$0** while idle or awaiting input $58/month for a 24/7 autonomous agent — less than a single hour of human contractor work. ### The architecture: "cattle, not pets" Before April 8, every company building production agents hit the same wall: sandboxing, state management, error recovery, credential handling, container orchestration, session persistence. This is 6-12 months of engineering work that produces nothing visible to end users — but without it, your agent crashes at 3am and nobody knows why. Anthropic rebuilt their container infrastructure around a **"cattle, not pets"** model — disposable, standardized containers. 
Every agent gets an isolated Linux box with 8GB RAM, 10GB disk, bash, a browser, code interpreter, and MCP tool support. If a container fails, another spins up with checkpointed state. Performance: **60%** faster p50 time-to-first-token · **90%+** faster p95 TTFT The p95 number matters most. In production, you design for the worst case. A container that used to take 30 seconds to spin up on a bad day now takes 3. You can give an agent a complex task and have it running in seconds — the kind of improvement that doesn't make good tweets but determines whether anyone actually ships with this in production. ### Persistent memory (April 23) Fifteen days after launch, Anthropic shipped the update that made CMA a different product: **persistent memory**. ![Claude's memory architecture — agents that learn](/post-images/20-days-that-changed-the-ai-agent-market/memory-brain.jpg) Agents now have access to file-based memory at `/mnt/memory/`, exportable and portable, with 30-day versioning. Memory blocks can be read-only or read-write. They can be shared across agents. And they persist between sessions — meaning your agent at 6am remembers everything it learned at 2am. Example: deploy an agent Monday to handle insurance claim denials for a dental practice. By Friday, it has processed 50 claims and identified that Claim Code D2740 gets denied by Delta Dental 30% more often when documentation omits a specific intraoral photo reference. That pattern wasn't in any manual or in Claude's training data. The agent found it through experience. The memory is exportable — plain files, 30-day versioning, portable between agents. Spin up a second agent and hand it everything the first one learned. An agent with reasoning, execution tools, and persistent memory isn't a tool — it's a worker that improves every shift. > "We integrated Claude agents into our root cause analysis pipeline. A single engineer wired it up. 
It now processes over a million RCAs per year — and it can go from identifying a bug to generating a pull request, end-to-end." > — Owen King, Engineering Director, Sentry --- ## Part III: The shockwave ![Market shockwave — tweet volume, reactions, global discourse](/post-images/20-days-that-changed-the-ai-agent-market/shockwave.jpg) The discourse didn't follow the usual tech-twitter hype-then-forget pattern. It kept building. **56,883** likes on main tweet · **3,183** quote tweets · **50,745** bookmarks · **21.2M** impressions The bookmark-to-like ratio is nearly 1:1. People weren't just hearting — they were saving it to study later. ### The "startup killer" narrative Within minutes, the "startup killer" narrative took hold: > "Yeah. Anthropic just casually kill3d dozens, hundreds, thousands of startups. Again." > — @kimmonismus (♥ 943) **@aakashgupta** put it most sharply: > "This mass-obsoleted every agent orchestration startup and 50%+ of vertical SaaS." > — @aakashgupta (♥ 2,711 · Most-liked non-official tweet) 2,711 likes — the most-liked non-official tweet in the dataset. The implied diagnosis: agent orchestration as a standalone category is dead, vertical SaaS built on CRUD + workflow is wounded, and the infrastructure layer every agent startup was building in-house just got commoditized. Whether or not that's right (more on this later), the speed at which this consensus formed matters. The narrative locked in within six hours. 
### The numbers behind the noise Signal breakdown across the 2,029-tweet dataset: - **103 tweets** explicitly declared "I'm building this" or described active projects - **32 tweets** ran the "startup killer" narrative (112,994 total impressions) - **25 how-to/tutorial threads** published within 48 hours - **18 contrarian/skeptic tweets** — outnumbered 5:1 by builders - **16 tweets** specifically analyzed the $0.08/hour pricing - **8 languages** of discourse: English, Chinese, Japanese, Portuguese, Spanish, French, Indonesian, German On YouTube, a video titled "Killed 1000+ Startups" hit **54,000 views** — roughly 10× normal for developer infrastructure content. At the HumanX conference in San Francisco that same week, TechCrunch reported that "everyone was talking about Claude." Vendors who had been pitching OpenAI integrations pivoted their narratives mid-conference. Glean's CEO described Claude Code as having "become a religion" among developers. The business context: Anthropic had grown from $9B to **$30B ARR** in a year, with **300,000+ business customers** and **1,000+ enterprise clients at $1M+ annually**. Managed Agents was the next step from a company already growing faster than most people's models predicted. The discourse was global — the dataset captured substantial threads in Chinese, Japanese, Portuguese, Spanish, French, Indonesian, and German. A Japanese explainer thread went viral. Brazilian tech commentators ran full analyses. An Indonesian developer's discovery thread opened with "I sat up straight." When the same event triggers the same reaction across eight languages, that's a real signal. --- ## Part IV: The platform war ![Four-vendor convergence — the agent platform war](/post-images/20-days-that-changed-the-ai-agent-market/platform-war.jpg) Anthropic launched into a **72-hour window of three competing announcements**: **April 7:** Microsoft ships its managed agent runtime, drawing on its 38.6% enterprise share. The incumbent play. 
**April 8:** Anthropic drops Claude Managed Agents. The insurgent play. **April 9:** LangChain ships Deep Agents Deploy — open-source, model-agnostic, deployed *the next day*. The ecosystem play. Then on **April 22**, Google entered with the Gemini Enterprise Agent Platform. Four major vendors converged on the same product category within fifteen days. ### Anthropic's structural advantages Anthropic's structural position is worth examining separately from the product itself: - **$30B ARR** — up from $9B the previous year (3.3× growth) - **300,000+** business customers on Claude - **1,000+** enterprise customers at $1M+ spend - **$100M** partner network committed - **44%** enterprise penetration across target accounts - **Only vendor** with three-cloud BAA (AWS, Azure, GCP) The three-cloud BAA matters most to enterprise buyers. Regulated industries — healthcare, finance, government — can deploy Claude agents on whichever cloud they already use, with HIPAA-grade compliance from day one. Microsoft has Azure lock-in. Google has GCP lock-in. Anthropic is cloud-agnostic at the compliance layer. The integration ecosystem moved fast. GitLab added managed agents to CI/CD pipelines on April 28. Box CEO Aaron Levie (226 likes) demoed document review automation "in 2 minutes." Asana integrated managed agents into their workflow platform. ### The open-source counter LangChain shipped Deep Agents Deploy within 24 hours — almost certainly pre-staged, but the signal was clear: open-source, model-agnostic, no lock-in. Multica, a community-driven open-source CMA alternative, hit 1,363 likes on its launch tweet. NathanFlurry's agentOS (145 likes) took the maximalist position: any agent, any LLM, 22MB RAM per sandbox, BYOC/on-prem, fully open-source. The open-source ecosystem has a structural advantage Anthropic can't match — enterprises can audit the code, modify it, and guarantee it won't be deprecated by a vendor strategy shift. But speed isn't depth. 
Anthropic's runtime includes sandboxing, credential vaults, MCP integration, fleet monitoring, checkpoint recovery, built-in web search at $10 per 1,000 queries, and scoped permissions out of the box. Replicating that in open-source takes months of hardening and security auditing, not a weekend hack. The market is big enough for all four. AI agents: **$10.91 billion** market, **45.8% CAGR**. Over **51% of enterprises** already running agents, **88%** planning to increase budgets next fiscal year. The more interesting question is which *layer* captures the most value. In cloud, the infrastructure layer (AWS, Azure, GCP) captured enormous value — but so did the application layer (Salesforce, Snowflake, Datadog). In mobile, both the platform layer (iOS, Android) and the application layer (Uber, Instagram) won. The historical pattern suggests that application builders who solve specific problems for specific industries will build bigger businesses than runtime providers, with better margins and harder-to-erode moats. --- ## Part V: The money playbooks ![Vertical business playbooks — the dentist blueprint](/post-images/20-days-that-changed-the-ai-agent-market/dentist-blueprint.jpg) By hour 48, playbook tweets were outperforming panic tweets. The highest-engagement non-official tweet in the dataset wasn't about technology. It was about dentists. > "here's a concrete example of how to make money with this new Claude drop — build and sell AI agents for dentists. > > a dentist's office has the same 6 problems every single month: > > 1. patients not booking > 2. no one answering calls at night > 3. unpaid bills piling up > 4. bad reviews going unanswered > 5. appointment reminders not going out > 6. insurance claims getting denied" > — @RobHoffman\_ (♥ 1,686 · Highest non-official engagement) 1,686 likes — more than any VC take or "startups are dead" thread. 
More playbook tweets followed: > "This is going to be every marketer's second employee (and you'll never have to hire them)." > — @aschwags3 (♥ 936) > "how to use claude's new managed agents for marketing — deploy AI agents to the cloud, they run on their own, persist between sessions, and scale. here are 10 agents I'd deploy for a GTM team" > — @shannholmberg (♥ 498) > "Claude literally handed us a business in a box." > — @DataChaz · 162K followers (♥ 44 · Real estate agent blueprint) > "Want a bulletproof way to monetize the new Claude update? Build and sell automated AI receptionists to law firms." > — @law\_ninja (♥ 32 · Legal vertical blueprint) ### The overnight agent pattern One pattern emerged independently across multiple tweets: the **overnight agent**. Assign a task at 10pm. Deliverables land by 6am. An eight-hour overnight session costs sixty-four cents. @mikefutia (382 likes) described deploying a DTC brand marketing analyst in an afternoon — pulling Meta, GA4, and Shopify data into a daily Slack brief. SavvyAgents.ai was taking real payments from real dental offices within 48 hours of the announcement. An electrician who taught himself to code built a full consumer product — NEC calculations, code lookup, residential estimating — going from "electrician with a laptop" to "electrician with a software company." The consistent pattern: the barrier to building is gone. The moat is now domain knowledge. Rob Hoffman doesn't need to understand container orchestration. He needs to know that dental offices lose $30,000/year to unpaid bills and that insurance denial rates spike for specific procedure codes at specific payers. The runtime is electricity. The expertise is the product. "Make money" tweets generated **380,030 impressions** — 3× the "startup killer" tweets at 112,994. The overnight agent category alone: 23 tweets, **335,272 impressions**, the highest-impression opportunity category in the dataset. 
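The pricing arithmetic behind the overnight pattern is easy to sanity-check. The published rate is $0.08 per session-hour for runtime only; the model tokens underneath are billed separately:

```shell
# Sanity-check the CMA runtime pricing: $0.08 per session-hour, $0 idle.
awk 'BEGIN {
  rate = 0.08                                      # USD per session-hour
  printf "overnight (8h): $%.2f\n", rate * 8       # the sixty-four cents
  printf "24/7 (30d):     $%.2f\n", rate * 24 * 30
}'
# → overnight (8h): $0.64
# → 24/7 (30d):     $57.60   (the "~$58/month" figure)
```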
The idea of an agent that works while you sleep resonated well beyond the developer community.

---

## Part VI: The $800B blind spot

![TAM opportunity matrix — autopilot territory vs. watch quadrant](/post-images/20-days-that-changed-the-ai-agent-market/autopilot-territory.jpg)

This analysis mapped every opportunity signal in the dataset against two axes: **Outsourced vs. Insourced** (who does the work today?) and **Judgment vs. Intelligence** (does the task require human judgment or just pattern-matching?). The loudest tweets land in the worst quadrant.

### The Watch quadrant (loud, low TAM)

Marketing agents, coding agents, SEO automation, content generation — all land in the **Watch quadrant**: insourced tasks that are judgment-heavy. Hard to fully automate, modest TAM, and every AI company on earth is already building here. This is where 936-like tweets live. It's also where margins get competed to zero within 18 months.

### Autopilot Territory (quiet, massive TAM)

The larger opportunity is in **Autopilot Territory**: outsourced tasks that require intelligence but not human judgment. Work currently done by BPO firms and back-office teams, fully automatable without a human in the loop.

| Autopilot Vertical | TAM | Loudest Tweet | Signal |
|---|---|---|---|
| Healthcare Rev Cycle Mgmt | $50–80B | @RobHoffman\_ (dentists) | 1,686 likes → pointed at the right vertical |
| Insurance Brokerage Ops | $140–200B | @chooserich (Salesforce) | 352 likes → saw the CRM displacement |
| Accounting & Audit Processing | $50–80B | 2 tweets total | Near-zero signal in a massive market |
| Paralegal & Legal Processing | $36B | @law\_ninja (law firms) | 32 likes → correct thesis, low distribution |

> "Everybody is calling this an attack on Openclaw… wrong. This is an attack on Salesforce."
> — @chooserich (♥ 352) The outsourced process layer across insurance, healthcare, accounting, and legal represents **over $300 billion in addressable market** — and the dataset shows near-zero competitive signal in most of it. ### Why the loud signals are the wrong signals @aschwags3's marketing agent tweet: 936 likes, 247,012 impressions. Looks like a huge signal — but every like is another potential competitor who just got the same idea. High engagement in AI Twitter means high competition. @law\_ninja's law firm receptionist tweet: 32 likes, 4,352 impressions. Legal processing is a $36 billion market where AI penetration is in single digits and the decision-makers don't use Twitter. The low engagement is the signal. ### Next Wave: $300B+ with zero tweet signal **Next Wave** verticals — massive TAM, zero tweet signal: - **Supply chain logistics automation** — $80-120B market. Zero tweets. The companies that process millions of SKU movements, warehouse assignments, and carrier negotiations have massive, well-documented, highly repetitive workflows that are purpose-built for agent automation. - **Pharmacy benefit management** — $40-60B market. Zero tweets. Prior authorization, formulary management, claims adjudication. Entirely process-driven. Entirely automatable. Entirely untouched by the AI agent discourse. - **Wealth management compliance** — $30-50B market. Zero tweets. KYC/AML processing, regulatory filing, portfolio compliance monitoring. Every wealth management firm employs armies of compliance analysts doing work that agents could handle at a fraction of the cost. - **Regulatory filing & government compliance** — $50-80B market. Zero tweets. Tax preparation, SEC filings, FDA submissions, environmental compliance. Massive volumes. Strict formatting requirements. Perfect agent territory. The decision-makers in these industries don't follow AI Twitter. 
When they catch up — forced by competitors who moved first — the early movers will have 12-18 months of accumulated domain expertise and agents with months of institutional memory. Total addressable opportunity across Autopilot Territory and Next Wave: **$800 billion+**. The core insight: **if you're building the same agent that got 936 likes on Twitter, you've already lost. The alpha is in the verticals with zero tweets and zero attention — that's where $300 billion in addressable market sits uncontested.** **$800B+** — Total addressable opportunity across Autopilot Territory and Next Wave verticals — with near-zero competitive signal --- ## Part VII: The production reality What the enterprise early adopters actually measured: ### Rakuten Rakuten deployed five Claude agents in one week: **97%** fewer errors · **27%** lower cost · **34%** lower latency · **79%** time-to-market reduction Rakuten's release cadence shifted from **quarterly to biweekly**. That kind of velocity change compounds into a real competitive advantage over 12 months. ### Sentry A **single engineer** wired Claude agents into Sentry's root cause analysis pipeline. It now processes **over 1 million RCAs per year** — from bug identification to generated pull request, end-to-end, no human intervention. > "The integration took a single engineer. Now it handles over a million root cause analyses annually. The agent can go from identifying a bug to generating a PR fix, fully autonomous." > — Owen King, Engineering Director, Sentry ### Notion > "Before, you had two ways to use Claude with Notion. Now there's a third with Claude Managed Agents." > — @NotionHQ (♥ 541 · Official announcement) Notion deployed managed agents for internal prototyping: **12 hours of work compressed to 20 minutes**. They run 30+ concurrent agent tasks and built a self-improving skills database — agents that get better at using Notion's APIs with each run. Cost reduction exceeded 90%. 
> "We went from 12 hours of prototyping to 20 minutes. The agents run 30+ concurrent tasks and maintain a skills database that improves with every session." > — Simon Last, Engineering Lead, Notion ### Wisedocs Wisedocs reported 30% faster document validation — a smaller number, but document validation is exactly the kind of outsourced, intelligence-heavy, judgment-light work that maps to Autopilot Territory. ### What the numbers actually tell us Rakuten's 97% error reduction is mostly about infrastructure, not reasoning. Most "errors" in pre-CMA agent deployments were dropped connections, corrupted state, lost context windows, failed tool calls that never retried. CMA eliminated the category. Sentry's single-engineer integration is the strategically important number. Enterprise software integration typically requires a PM, 2-3 engineers, a QA specialist, and 3-6 months. If the one-person ratio holds elsewhere, deploying agent systems drops from "major IT project" to "afternoon experiment." Notion's self-improving skills database shows what persistent memory enables at the org level: agents record what worked and failed, each session makes the next better, and after weeks of accumulated learning the effectiveness compounds in ways impossible with stateless tools. ### The missing layer There's a gap though. Anthropic's runtime handles sandboxing, checkpointing, error recovery, credential management. But the **multi-tenant business layer doesn't exist yet**. Take Rob Hoffman's dentist blueprint. You want to sell AI agents to 50 dental offices. CMA gives you the runtime. It doesn't give you: a customer dashboard, per-customer billing, white-label UI, role-based access control, usage analytics, onboarding flows, or integration templates for dental practice management systems. You have a good API. The entire business layer between "agent runs" and "customer pays you monthly" is your problem to solve. 
That gap is the biggest missing layer in the CMA ecosystem — and whoever builds it (platform-agnostic, working across CMA, LangChain, Google, Microsoft) captures the integrator margin on everything above it. --- ## Part VIII: The contrarian case The skeptics had real points. > "nice demo but i'm calling it now: this will end up dead like openai's agent builder" > — @elvissun (♥ 357 · Most-debated tweet) 357 likes, 118 replies — the most debated tweet in the dataset. The comparison to OpenAI's abandoned agent builder isn't unreasonable. The graveyard of big-lab platform plays is large: Google's API deprecations, Meta's chatbot platform, OpenAI's plugin ecosystem. > "The new Anthropic managed agents API is basically the Letta API that we've had since a year ago, but closed source and with provider lock-in." > — @sarahwooders · Letta founder (♥ 362) She's not wrong about feature overlap — Letta has had read-only memory blocks, memory sharing, and persistent sessions for over a year. But features matter less than distribution. Anthropic has 300,000 business customers and $30B ARR. Letta has a better open-source story. Different weapons for different customer segments. ### The OpenClaw timing problem > "Anthropic banned OpenClaw from using Claude subscriptions 4 days ago. Today they just launched their own managed agents platform." > — @BentoBoiNFT (♥ 1,259 · The timing tweet) 1,259 likes — the second-highest non-official tweet. The optics are bad: Anthropic shut down 135,000 always-on OpenClaw agents that were burning more compute than $200/month subscriptions covered, then launched a paid, metered replacement four days later. The charitable read: OpenClaw users were abusing flat-rate pricing unsustainably. The cynical read: Anthropic killed the competition before launching the replacement. Probably both. Enterprise buyers noticed either way. 
### The lock-in argument **15 vendor lock-in tweets** in the dataset, including from VCs like Ed Sim: > "Many enterprise CTOs remind me single-vendor agent stacks are tomorrow's lock-in story. They want agents running across Claude, GPT, Gemini, and open-source models — not locked to one provider forever." > — Ed Sim, VC, @edsim VentureBeat added more: session data sovereignty (who owns what the agent creates?), dual control plane complexity (your infrastructure + Anthropic's), and no migration path. If Anthropic changes pricing or deprecates features, there's no export button. These are the same concerns that drove enterprises away from previous platform lock-in plays — and the opening that LangChain and Multica are targeting. ### The trust deficit A subtler concern: Anthropic's reliability track record. Claude's API has had notable outages. Claude Code could burn through a $200/month subscription in hours. @iam\_riichard (3 likes) put it well: "It's funny that I am using a $20 ChatGPT sub to fix the code that my $100 sub Claude Code wrote." @andrehfp (Portuguese-language tech community) was blunter: "It killed nothing, it'll have the same future as OpenAI's agent builder. Anyone who builds agents doesn't want to be locked to a single provider. And besides, I don't trust the infra of a company that can't even keep their chat stable." CMA is asking enterprises to bet production workflows on infrastructure from a company with consumer-tier reliability issues. Enterprise SLAs may be different, but the perception gap between "consumer Claude goes down for 4 hours" and "enterprise agents running my billing" only closes with months of flawless uptime. @MLStreetTalk (38K followers) raised the cost problem: "The reason we are not using [agent orchestration systems] is simply that we don't want to pay API costs." The $0.08/hour runtime is cheap. The token costs underneath it aren't. 
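The scale of that gap is easy to sketch. Below is a back-of-envelope comparison using the $0.08/hour runtime rate cited above; the token volumes and per-million-token prices are purely hypothetical illustration values, not Anthropic's published pricing:

```python
# Back-of-envelope: runtime cost vs. token cost for one long agent session.
# The $0.08/hour runtime rate comes from the article; every token figure
# below is a hypothetical illustration value, not a published price.

RUNTIME_RATE_PER_HOUR = 0.08   # CMA runtime rate, per the article
SESSION_HOURS = 8              # one long document-processing session

INPUT_TOKENS = 40_000_000      # assumed volume for a heavy session
OUTPUT_TOKENS = 5_000_000      # assumed
PRICE_IN_PER_M = 5.00          # assumed $ per 1M input tokens
PRICE_OUT_PER_M = 25.00        # assumed $ per 1M output tokens

runtime_cost = RUNTIME_RATE_PER_HOUR * SESSION_HOURS
token_cost = (INPUT_TOKENS / 1e6) * PRICE_IN_PER_M \
           + (OUTPUT_TOKENS / 1e6) * PRICE_OUT_PER_M

print(f"runtime: ${runtime_cost:.2f}")  # runtime: $0.64
print(f"tokens:  ${token_cost:.2f}")    # tokens:  $325.00
```

Under these assumptions the token bill is roughly 500x the runtime bill, which is exactly the point the quoted tweet is making: the $0.08/hour line item is not where the money goes.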
An 8-hour document processing session might consume hundreds of dollars in API tokens on top of 64 cents of runtime. Anthropic controls pricing on both layers. ### The historical pattern The real question for CMA is whether Anthropic has the *operational discipline* to maintain a production runtime over years. Building a runtime is a product challenge. Operating one at scale is an operational challenge — a different organizational skill. AWS is great at this. Google historically isn't. Where does Anthropic land? Unknown, and that uncertainty is a legitimate reason for caution. The counter: Rakuten, Sentry, and Notion are real production deployments with real metrics. Rakuten didn't get 97% fewer errors by accident. Notion didn't build a self-improving skills database on infrastructure they expected to disappear. The production evidence so far suggests CMA works at scale. Whether it continues depends on organizational discipline — harder to predict than technology. --- ## Part IX: The commoditization thesis ![Infrastructure commoditizes. Domain expertise is the moat.](/post-images/20-days-that-changed-the-ai-agent-market/commoditization.jpg) The biggest structural insight isn't about Anthropic. It's about what happens when infrastructure becomes free. > "a lot of talk on how 1000 startups just died due to Claude managed agents. I think that's overblown - the truth is the moat for agentic products has been shifting from infra engineering to domain expertise" > — @Tocelot (♥ 167 · The best take in the dataset) Before April 8, building a production agent meant solving sandboxing, state management, error recovery, checkpoint persistence, credential management, and container orchestration. That was 6-12 months of work and a real moat. After April 8, it's available to anyone with an API key for $0.08/hour. The moat moved to **domain expertise**. Rob Hoffman's dental agent isn't defensible because of its infrastructure. 
It's defensible because someone understood the six problems every dental office faces, the integration points with practice management software, the compliance requirements, the billing workflows. That knowledge exists in the heads of people who've worked in dental practice management for years, not in any model's training data. ### The memory moat Persistent memory adds another dimension. A dental agent deployed today, after six months, knows which insurers deny which procedure codes at that specific office, that Dr. Martinez's Tuesday patients no-show more than Wednesday patients, and that collection rates improve with 48-hour follow-ups instead of the industry-standard 72. That agent is dramatically more valuable than a freshly deployed competitor. The switching cost isn't infrastructure or API integration. It's the institutional knowledge accumulated over months of operation — stored at `/mnt/memory/`, theoretically portable, practically irreplaceable. The longer an agent runs, the harder it is to replace. A competitor can't replicate six months of learning by offering a lower price. This extends further: agents that learn eventually know things their creators don't. A compliance agent that has processed 50,000 regulatory filings develops pattern recognition no human officer has time to build manually. The agent generates institutional intelligence that didn't exist before. The person who captures the most value isn't the runtime provider — it's the domain expert who knows what to teach the agent and when to override it. **The durable moat is domain expertise — knowing which problems to solve, for whom, and how to evaluate whether the agent is doing a good job.** That expertise lives in the heads of people who've spent years in specific industries. It can't be replicated by shipping a better container. --- ## Part X: The strategic read — move now ![The window is closing. 
Move now.](/post-images/20-days-that-changed-the-ai-agent-market/move-now.jpg) The 20-day timeline: - **Week 1 (April 8-14):** Shock. The announcement. The panic. The "startup killer" narrative. 103 people tweet "I'm building this" within 72 hours. - **Week 2 (April 15-21):** Production validation. Rakuten, Sentry, Notion numbers go public. The narrative shifts from "startups are dead" to "who's actually shipping?" - **Week 3 (April 22-28):** Platform war. Google enters. GitLab deepens. Memory ships. The market structure crystallizes into four competing platforms with distinct strategies. The window between announcement and competitive saturation compressed to weeks. By the time most people finish their analysis, first movers are already in production. ### Three plays Three viable strategic positions: **Play 1: Bet on CMA (speed)** Accept lock-in risk, move fast. Three-cloud BAA compliance, $100M partner network, 300K-customer distribution. Risk: vendor dependency with no plan B. Advantage: ship in days, not months. Favors domain experts over infrastructure engineers — if you understand the problem space but lack infra chops, CMA eliminates the bottleneck. **Play 2: Bet against CMA (independence)** Build on LangChain, Multica, agentOS, or your own stack. Model-agnostic, cloud-agnostic. Risk: slower time-to-market, ongoing infrastructure overhead. Advantage: freedom to switch models, deploy on-premise, serve enterprises that won't accept single-vendor lock-in. Favors teams with strong infra engineering and enterprise sales relationships. **Play 3: Build the missing layer (infrastructure gap)** Build the layer missing from *all four platforms*: multi-tenant business plumbing. Visual agent builders, per-customer billing, white-label interfaces, RBAC, onboarding flows, vertical integration templates. Platform-agnostic, works on top of everything. Captures margin at the business logic layer — historically the most durable. 
Nobody is building it yet because it's unglamorous and doesn't demo well. @GianTheRios (13 likes): "100% the next Anthropic product launch is dropping a visual component on top of this." If Anthropic builds it, the window closes. If they don't — and model providers historically don't prioritize business plumbing — it's wide open. ### Where the money actually is **Autopilot Territory = $800B+** — Healthcare rev cycle. Insurance ops. Accounting. Paralegal. Supply chain. Pharmacy. Near-zero competition. Near-zero tweet signal. Maximum opportunity. The market's attention is on marketing agents and coding assistants — the Watch quadrant. The money is in Autopilot Territory: insurance claims, medical billing, regulatory compliance, paralegal research. Over $800 billion in addressable market with near-zero AI-native competition, because the people who understand dental insurance billing don't hang out on AI Twitter. ### The clock is ticking **103 people** publicly tweeted they were building within 72 hours. SavvyAgents.ai was in production within a week. Multica had an open-source alternative with 1,000+ stars within two weeks. Four cloud vendors had competing products within three weeks. The window between early-mover advantage and competitive saturation is measured in weeks now. Domain experts in healthcare, insurance, and legal operations are starting to figure out what agents can do for them. The winners won't be the ones with the best analysis — they'll be the ones who shipped first and accumulated the most agent memory before competitors caught up. The infrastructure layer commoditized on April 8, 2026, across four vendors simultaneously. The runtime that took 6-12 months to build is now $0.08/hour. That's table stakes. The domain experts — people who understand why dental insurance claims get denied, why law firms lose clients during intake, why pharmaceutical compliance filings get rejected — now have tools they didn't have three weeks ago. 
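The $0.08/hour figure also explains the recurring sub-$60/month claim: it is just the runtime rate compounded over an always-on month, before any token costs. A quick sanity check:

```python
# Monthly runtime cost of an always-on agent at the cited $0.08/hour rate.
# Runtime only: API token costs come on top of this.

RATE_PER_HOUR = 0.08
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30

monthly_runtime = RATE_PER_HOUR * HOURS_PER_DAY * DAYS_PER_MONTH
print(f"${monthly_runtime:.2f}/month")  # $57.60/month
```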
Autonomous agents, 24/7 operation, persistent memory, less than $60/month. **The infrastructure is free. The expertise is the moat. Move now.** --- ### Methodology This analysis is based on 2,029 unique original tweets and 1,138 replies collected via X API v2 between April 8-28, 2026, supplemented by 18 articles from TechCrunch, VentureBeat, The Information, and Anthropic's official documentation. Enterprise case study data comes from official Anthropic partner pages, public blog posts from Sentry, Notion, Rakuten, and Wisedocs, and named quotes from engineering leads at these companies. TAM estimates use IBISWorld, Grand View Research, and Gartner industry sizing data, cross-referenced against 2025-2026 analyst reports. The opportunity matrix framework is original analysis. All tweet quotes are reproduced verbatim from public posts. Engagement numbers reflect counts at time of data collection (April 14-28, 2026) and may have changed since. --- *Data: 2,029 tweets · 18 articles · 4 case studies · 20 days of tracking* *Built with conviction, not consensus.* --- # Google expands Pentagon AI access after Anthropic refuses classified networks URL: https://www.thedeepfeed.ai/posts/2026-04-28-google-pentagon-anthropic/ Category: Policy Date: 2026-04-28 Source: TechCrunch — https://techcrunch.com/2026/04/28/google-expands-pentagons-access-to-its-ai-after-anthropics-refusal/ Tags: google, anthropic, pentagon, dod, defense, classified > Google has granted the U.S. Department of Defense expanded access to its AI for classified workloads after Anthropic declined the same scope of access. **Google has granted the U.S. Department of Defense expanded access to its AI for classified networks**, TechCrunch reports — a step Anthropic reportedly declined to take. ## The story Per TechCrunch's reporting, the Pentagon sought broad AI access across classified-network workloads. 
Anthropic, which has publicly emphasized model-safety and use-case restrictions, did not agree to the full scope of access. Google did. The deal extends Google's existing defense relationship — the company has held DoD contracts through Google Cloud's Government surface for years — into a tier that includes use of frontier Gemini models on classified workloads. ## Why this is structurally important The frontier-lab differentiation has, until now, been about **capability** (who has the smartest model). This is the first inflection where it's about **policy** — who's willing to operate in which environments. Three things follow: 1. **Anthropic's market position is shifting.** The company is doubling down on a posture that keeps frontier capability available for non-military use, ceding the defense contract surface. 2. **Google's enterprise/government wedge widens.** Combined with Gemini Enterprise Agent Platform announcements at Cloud Next '26, Google is positioning itself as the **one frontier lab that will operate anywhere a regulated buyer needs it**. 3. **Talent and capital follow policy.** Defense-adjacent AI hiring (Anduril, Palantir, Scale's defense practice) has surged in 2026; expect this to intensify. ## What we'll be watching - Whether OpenAI and xAI publicly stake out positions in the same space. - Whether the EU AI Act's defense carve-outs trigger similar splits in the European market. - Anthropic's Q2 communications — does it formalize a "no defense" policy, or signal flexibility? 
--- # David Silver's Ineffable Intelligence raises $1.1B seed at $5.1B valuation URL: https://www.thedeepfeed.ai/posts/2026-04-27-ineffable-1-1b-seed/ Category: Business Date: 2026-04-27 Source: TechCrunch — https://techcrunch.com/2026/04/27/deepminds-david-silver-just-raised-1-1b-to-build-an-ai-that-learns-without-human-data/ Tags: funding, deepmind, alphago, seed, uk, sequoia, nvidia > The DeepMind veteran behind AlphaGo lands Europe's largest-ever seed round to build AI that learns without human data. Sequoia and Lightspeed lead; Nvidia and the UK government participate. **Ineffable Intelligence**, a UK-based AI lab founded by former Google DeepMind researcher **David Silver**, has raised **$1.1 billion** in seed funding at a **$5.1 billion** valuation. The round — the **largest seed financing in European history** — was led by Sequoia Capital and Lightspeed, with backing from Nvidia and the British government. ## Who is David Silver Silver led the AlphaGo, AlphaZero, and AlphaStar projects at DeepMind across more than a decade, and contributed to Gemini before leaving in late 2025. AlphaGo's 2016 defeat of Lee Sedol is widely credited as the moment that pulled deep RL into the mainstream of AI research. ## What Ineffable is building The thesis: **AI that learns without human data**. This is a deliberate move away from the internet-scraping pretraining paradigm that powers GPT, Claude, and Gemini, and toward self-play / synthetic-environment training of the kind that produced AlphaGo and AlphaZero. As of the funding announcement, Ineffable has: - **No product** - **No revenue** - **No public roadmap** It was incorporated in November 2025. The round is essentially a billion-dollar bet on Silver's track record and the thesis that the post-pretraining era will require fundamentally different training methods. 
## What this signals

- **The "founder bet" tier is intact.** Even at 2026 valuations, a lab with one researcher's reputation can raise ten-figure seed rounds.
- **UK industrial policy is active.** The British government's participation suggests the kind of state-backed national champion strategy France and the UAE have already pursued.
- **Post-pretraining is the new bet.** Investors are pricing in the possibility that scaling internet text alone has diminishing returns.

---

# Google unveils Gemini Enterprise Agent Platform at Cloud Next '26

URL: https://www.thedeepfeed.ai/posts/2026-04-26-google-gemini-enterprise-agents/
Category: Products
Date: 2026-04-26
Source: TIME / Google Cloud Next '26 — https://time.news/google-unveils-gemini-enterprise-agent-platform-for-autonomous-ai-agents/
Tags: google, gemini, vertex-ai, enterprise, agents

> A rebranded and expanded Vertex AI lets businesses build, test, and deploy autonomous agents that execute workflows across Google Cloud and Workspace via natural language.

At **Google Cloud Next '26** in Las Vegas, Google unveiled the **Gemini Enterprise Agent Platform**, described as a "rebranded and expanded version of Vertex AI" purpose-built for enterprise agent deployment.

## The pitch

Companies can now **build, test, and deploy AI agents** that autonomously execute workflows across Google Cloud and Workspace using natural language. The platform handles three layers:

1. **Build** — visual canvas + code-first SDK; same primitives that power Deep Research Max.
2. **Test** — sandbox environments with replayable agent runs and step-level inspection.
3. **Deploy** — managed runtime with cost controls, audit logs, and SLAs.

## Why this matters

The agent space has fragmented across Anthropic's MCP ecosystem, OpenAI's GPTs/Operator surface, LangChain, CrewAI, and dozens of niche players. Google's bet is that **enterprise IT will choose one consolidated platform from a hyperscaler** rather than stitch together OSS frameworks.
Bloomberg reported the same week that this is Google's "latest attempt to take on OpenAI and Anthropic" in the agent market — framing the announcement as competitive response, not industry leadership. ## Open questions - Will this run agents from **non-Google models** (Claude, GPT-5.5)? Google hasn't confirmed. - How does pricing compare to Vertex AI's existing structure? - The early-access list is reportedly oversubscribed; broad GA timeline is unconfirmed. --- # OpenAI ships GPT-5.5 — first fully retrained base model since GPT-4.5 URL: https://www.thedeepfeed.ai/posts/2026-04-23-openai-gpt-5-5/ Category: Models Date: 2026-04-23 Source: OpenAI — https://openai.com/index/introducing-gpt-5-5/ Tags: openai, gpt-5, frontier-models, coding > Codenamed "Spud," GPT-5.5 targets agentic coding and computer use, matches GPT-5.4 latency, and lands the same day API access opens for paying customers. OpenAI on Thursday released **GPT-5.5**, its newest frontier model and the first fully retrained base model since GPT-4.5. The model — codenamed "Spud" internally — is pitched as a "new class of intelligence for real work," with a focus on completing complex multi-step tasks with minimal human direction. ## What's new - **Agentic coding.** GPT-5.5 sets new benchmarks on long-horizon software engineering tasks, including hand-offable refactors and multi-file changes. - **Computer use.** Direct OS interaction has improved meaningfully — it's the first GPT model positioned as production-ready for autonomous browser and desktop workflows. - **Latency parity with GPT-5.4** despite the architecture changes, per OpenAI's benchmark disclosures. - **Pro tier.** GPT-5.5 Pro shipped one day later (Apr 24) for the highest-stakes use cases. ## Availability - ChatGPT: rolling out to Plus, Pro, and Team users immediately. - API: live as of Apr 24 with an updated system card describing additional safeguards. - Enterprise: available via Azure and direct OpenAI contracts. 
## Why it matters

This is the first model release where OpenAI explicitly acknowledges that it trails Anthropic in enterprise coding. The TechCrunch coverage described GPT-5.5 as OpenAI's move "one step closer to an AI super app" — a single surface that can plan, execute, and verify work end-to-end. Whether the new agentic capabilities close the gap with Claude Opus 4.7 (released Apr 16) is the open question.

---

# VAST Data hits $30B valuation as AI infra stack reshuffles

URL: https://www.thedeepfeed.ai/posts/2026-04-22-vast-data-30b/
Category: Business
Date: 2026-04-22
Source: GlobeNewswire — https://www.globenewswire.com/news-release/2026/04/22/3279162/0/en/vast-data-valued-at-30-billion-as-ai-drives-a-new-infrastructure-stack.html
Tags: vast-data, infrastructure, funding, ai-os

> The "AI Operating System" company closes a new round at $30B, citing a rare combination of growth and profitability — and a central role in powering frontier-lab infrastructure.

**VAST Data** announced a new funding round at a **$30 billion valuation** on Apr 22, citing a "rare combination of growth and profitability" driven by its central role in **powering AI infrastructure at global scale**.

## What VAST does

VAST positions itself as **the AI Operating System** — a unified storage and data platform designed for the workloads of frontier AI labs. Real customers include hyperscalers, sovereign AI initiatives, and several of the named frontier labs (VAST has not disclosed which). The company claims its platform handles:

- **Training data pipelines** at multi-exabyte scale
- **Inference-time data services** with sub-millisecond latency
- **GPU-attached storage** that keeps H100/B200/B300 fleets fed at line rate

## Why $30B is the eyebrow-raise

VAST is **profitable** — a rarity at this stage of the AI infra cycle. Most AI infrastructure companies (CoreWeave, Lambda, etc.) are still burning capital on data centers.
VAST's pitch is that it's the **picks-and-shovels** play: every frontier lab needs the storage layer regardless of which model wins. ## What this tells us about the stack The 2026 AI infrastructure stack has crystallized into roughly: 1. **Silicon** — Nvidia (still ~80%), AMD, custom (Trainium, TPU, Maia) 2. **Compute orchestration** — CoreWeave, Lambda, hyperscalers 3. **Storage / data plane** — VAST, WekaIO, Pure Storage 4. **Model platforms** — OpenAI, Anthropic, Google, Mistral, Meta 5. **Application layer** — everything else VAST's $30B valuation is the storage layer asserting itself as a peer of the silicon and compute layers, not a commodity below them. --- # Google ships Deep Research Max — agentic research with native MCP URL: https://www.thedeepfeed.ai/posts/2026-04-21-google-deep-research-max/ Category: Agents Date: 2026-04-21 Source: Google — https://blog.google/innovation-and-ai/models-and-research/gemini-models/next-generation-gemini-deep-research/ Tags: google, gemini, deep-research, mcp, agents > Built on Gemini 3.1 Pro, Google's new research agents add MCP support, native data visualizations, and multi-source long-horizon workflows. Google introduced **Deep Research** and **Deep Research Max** on Apr 21, calling them "a step change for autonomous research agents." Both are built on **Gemini 3.1 Pro**. ## What it does - **Long-horizon research workflows** across the web or custom sources. - **MCP (Model Context Protocol) support** out of the box — agents can plug into any MCP server for tools and data. - **Native visualizations.** Charts and graphs are rendered inline as part of the agent's reasoning, not bolted on after the fact. - **Custom source integration.** Point it at internal docs, databases, or proprietary corpora. ## Why this is different from "Deep Research" v1 The original Deep Research (a 2024-era feature) was essentially a polished search-and-summarize loop. 
The Max tier is positioned as a **true autonomous agent** — it can branch, dead-end, backtrack, and re-plan over hours of execution. Google's positioning explicitly cites three industry-relevant axes: 1. **Quality of analysis** at the level of a junior analyst, not a Wikipedia summarizer. 2. **Source traceability** — every claim is linked to where it came from. 3. **Workflow integration** — agents can be triggered from Workspace and surface results in Docs, Sheets, and Gmail. ## The bigger pattern Gemini Enterprise Agent Platform — announced at Google Cloud Next '26 a week later — is the umbrella. Vertex AI gets rebranded and expanded into a full agent build-test-deploy surface. Combined with Deep Research Max, this is Google's most coherent agent story to date. --- # Anthropic releases Claude Opus 4.7 — the SWE benchmark just moved URL: https://www.thedeepfeed.ai/posts/2026-04-16-claude-opus-4-7/ Category: Models Date: 2026-04-16 Source: Anthropic — https://www.anthropic.com/news/claude-opus-4-7 Tags: anthropic, claude, swe-bench, coding > Opus 4.7 brings notable gains on the hardest software engineering tasks, with users reporting confident hand-off of work that previously required close supervision. Anthropic shipped **Claude Opus 4.7** on Apr 16, calling it "a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks." The headline claim from Anthropic's own announcement: users report being able to **hand off their hardest coding work** — the kind that previously needed close supervision — with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and now uses methods to **verify its own output** before reporting back. ## What changed - **Better self-verification.** Opus 4.7 explicitly checks its own work mid-task and corrects course before returning a final answer. - **Vision improvements.** Stronger multimodal reasoning. 
- **Same pricing tier** as Opus 4.6 — no premium for the upgrade.

## Industry context

Opus 4.7 lands one week ahead of OpenAI's GPT-5.5 release, which TechCrunch and TNW frame as OpenAI's response to Anthropic's lead in the enterprise coding market. The two-model arms race is now running on a roughly weekly cadence, with Google's Gemini 3.1 Pro powering the Deep Research agents released the same week. The 2026 picture: three frontier labs each shipping model releases at roughly monthly intervals, with benchmark deltas measured in single percentage points. The differentiation is moving from raw intelligence to **agentic reliability** — can you trust the model to complete a 4-hour task without supervision?

---