The web just put a price tag on AI answers

By Mark · 8 min read

😁 Hello, super humans! Yesterday the story was which model drives your agents; today it is the plumbing those agents crawl through. Cloudflare, the company sitting in front of a huge slice of the web, just told AI crawlers the free buffet is closing. And SpaceX may or may not have built a phone. Let’s dig in.

📰 Quick Signals

  • 🧠 AI: SpaceX showed investors a handset-like AI device prototype running xAI technology on a proprietary OS, slimmer than an iPhone; Elon Musk calls the report “utterly false” (TechCrunch).
  • 🤖 Robotics: Apptronik unveiled Apollo 2, offered in bipedal and wheeled versions and built as a data collection and training platform for continuous learning through deployment, alongside a flagship training facility (The Robot Report).
  • 💻 Programming: GitHub shipped per-user AI credit budgets for cost centers, letting enterprise admins cap Copilot spend per team through the REST API with automatic membership sync (GitHub Changelog).
  • ⚡ Electronics: Data centers are on track to consume 70 percent of all memory chips made in 2026, spreading the shortage from servers into phones, PCs, and everyday gadgets (Tom’s Hardware).
  • 📡 Telecom: The FCC opened its Broadband Data Collection filing window on July 1 for availability data as of June 30, the dataset that decides where US broadband money flows next (FCC).

🔍 The Big Story: Cloudflare rewires how AI pays for the web: label your crawler, cite your sources, or get blocked

If you publish anything on the web, or you build agents that read it, the economics of both jobs changed this week.

What happened: Cloudflare announced on July 1 that AI companies have until September 15 to separate the crawlers they use for search from the ones they use for AI training and agents. Crawlers that mix those purposes will be blocked by default on ad-supported pages for domains onboarding to its network, while search crawling stays allowed by default (TechCrunch). Alongside the stick, Cloudflare rolled out new AI traffic controls for all customers (Cloudflare).

The details: This is the evolution of Pay Per Crawl, the HTTP 402 Payment Required experiment Cloudflare launched in private beta a year ago (Pay per crawl). The new model, which Cloudflare calls Pay Per Use, flips the metering point: instead of charging when a bot fetches a page, publishers get paid when their content actually appears inside an AI answer. The first partners are Ceramic.ai and You.com; when an opted-in publisher’s content surfaces in Ceramic’s AI search results, or You.com touches a piece of premium content, money moves. Under the hood, everything depends on crawler classification: each bot must declare itself as Search, Training, or Agent, and each class gets its own default policy per page type.

flowchart TD
    A[AI crawler hits a Cloudflare-fronted site] --> B{Declared purpose?}
    B -->|Search| C[Allowed by default]
    B -->|Training or Agent| D{Page shows ads?}
    D -->|Yes| E[Blocked by default]
    D -->|No| F[Owner decides:<br/>allow, charge, or block]
    B -->|Mixed or undeclared| E
    C --> G[Content cited in an AI answer]
    G --> H[Pay Per Use:<br/>publisher paid on use, not on crawl]
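
To make the decision tree above concrete, here is a minimal Python sketch of the same default logic. It mirrors only the flowchart in this newsletter, not Cloudflare's actual implementation; the names `Purpose` and `default_policy` and the returned strings are our own assumptions for illustration.

```python
from enum import Enum


class Purpose(Enum):
    """The three crawler classes named in the article."""
    SEARCH = "search"
    TRAINING = "training"
    AGENT = "agent"


def default_policy(purposes, page_has_ads):
    """Return the default action for one request, following the flowchart.

    purposes: set of Purpose values the crawler declares (empty = undeclared).
    page_has_ads: True if the page is ad-supported.
    """
    if len(purposes) != 1:
        # Mixed or undeclared purposes fall through to the block branch.
        return "block"
    (purpose,) = purposes
    if purpose is Purpose.SEARCH:
        return "allow"
    # Training and Agent crawlers: blocked on ad pages, otherwise owner's call.
    return "block" if page_has_ads else "owner_decides"


if __name__ == "__main__":
    for declared, ads in [({Purpose.SEARCH}, True),
                          ({Purpose.AGENT}, False),
                          ({Purpose.SEARCH, Purpose.TRAINING}, False)]:
        names = sorted(p.value for p in declared)
        print(names, "ads" if ads else "no ads", "->", default_policy(declared, ads))
```

Keeping the policy as a pure function of declared purpose and page type makes it easy to see that a site owner's own settings would simply override these return values.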

Important

Our take: For years robots.txt was a polite note taped to the door; Cloudflare just gave it a bouncer. As bloggers, we like that payment now tracks citation instead of crawl volume, because getting scraped 10,000 times for zero readers was the worst deal on the internet. As agent builders, we read this as a deadline: your agent’s fetches are now a classifiable, billable event, so label your traffic honestly and budget for content access. And if you publish, pick your policy deliberately before September 15 instead of inheriting whatever default lands on you.
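
For agent builders, one way to act on that is to send a descriptive User-Agent and treat HTTP 402 as an expected outcome rather than a crash. The sketch below uses only the standard library; the `USER_AGENT` string, the `classify` and `fetch` names, and the outcome labels are hypothetical, and it makes no assumption about how a publisher quotes or collects a price.

```python
import urllib.error
import urllib.request

# Hypothetical crawler identity; a real agent would use its own name and info URL.
USER_AGENT = "ExampleAgent/0.1 (+https://example.com/bot-info)"


def classify(status):
    """Map an HTTP status code to a coarse outcome for logging and budgeting."""
    if 200 <= status < 300:
        return "ok"
    if status == 402:
        return "payment_required"
    if status == 403:
        return "blocked"
    return "other"


def fetch(url, opener=urllib.request.urlopen):
    """Fetch url with a declared User-Agent; return (outcome, body bytes)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with opener(request, timeout=10) as response:
            return classify(response.status), response.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; report the outcome instead of crashing.
        return classify(err.code), b""
```

Counting how many fetches come back as `payment_required` over a week gives a rough first number for a content-access budget. The `opener` parameter exists only so the function can be exercised without touching the network.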

🗞️ More News

🧠 AI

  • Ford is rehiring veteran “gray beard” engineers after AI tools fell short on complex vehicle engineering work (TechCrunch).
  • OpenAI is limiting the GPT-5.6 rollout to a government-approved preview after a White House safety request, saying such restrictions “shouldn’t be the norm” (TechCrunch).
  • An Apple Vision Pro executive is reportedly leaving for OpenAI, another hardware hire for the ChatGPT maker’s device ambitions (TechCrunch).
  • Asian AI startups are shipping Mythos-like frontier models while Anthropic’s export restrictions reshape who can buy what (TechCrunch).
  • SoftBank’s CEO joined the chorus questioning Elon Musk’s orbital data center pitch, asking who services a GPU in orbit (TechCrunch).
  • Google’s GKE Labs released OpenRL, an experimental self-hosted API for reinforcement-learning fine-tuning of LLMs on standard Kubernetes clusters (InfoQ).
  • Wedbush’s Dan Ives argues SpaceX is really an AI play positioned to become a major hyperscaler after its blockbuster IPO (CNBC).

🤖 Robotics

  • X Square Robot reached a $2.8 billion valuation after four consecutive funding rounds for its foundation-model-plus-hardware approach to household robots (The Robot Report).
  • BMW Group is deploying Figure 03 humanoids after Figure 02 supported production of more than 30,000 X3 vehicles over 11 months in South Carolina (The Robot Report).
  • Queue raised funding to build fully autonomous pharmacies for hospitals, retail, and underserved communities (The Robot Report).
  • morph launched “soft robotic cells” that embed physical AI directly into compliant hardware, trained with reinforcement learning in high-fidelity simulation (The Robot Report).
  • Robot hand maker Proception settled Tesla’s trade secret suit and announced an $11 million raise on the same day (TechCrunch).

💻 Programming

  • VS Code 1.123 adds a two-hour delay on extension updates to blunt marketplace supply chain attacks (InfoQ).
  • Argo CD’s 3.5 release candidate enforces mutual TLS between internal components and adds Git commit signature verification (InfoQ).
  • Dapr 1.18 introduces Verifiable Execution: cryptographic trust, provenance, and tamper-evident execution records for distributed apps and AI agents (InfoQ).
  • Icon library Lucide hit 1.0 with over 1,600 icons, dropped trademarked brand icons, and cut bundle sizes for millions of projects (InfoQ).

⚡ Electronics

  • New PC purchases saw their sharpest drop in nearly three years as memory and storage prices bite; analysts forecast a 14 percent contraction hitting budget laptops hardest (Tom’s Hardware).
  • Wall Street is starting to price US memory maker Micron like the next Nvidia as the DRAM crunch turns memory into the new bottleneck (TechCrunch).
  • Sonair’s ADAR One 3D ultrasonic sensor earned safety certification as a human detection sensor under IEC 61496, a milestone for sensing around cobots and humanoids (The Robot Report).
  • Analysts warn of a barren stretch of consumer hardware launches as trillions in AI investment vacuum up components and fab capacity (Tom’s Hardware).

📡 Telecom

  • The FCC will auction up to 180 MHz of upper C-band midband spectrum (3.98 to 4.2 GHz) for 5G and 6G services, with the auction required to finish by July 2027 (Light Reading).
  • The FCC is modernizing satellite spectrum sharing rules to expand broadband capacity, changing the math for satellite-to-consumer connectivity models (Telecoms Tech News).
  • Cloudflare published a deep dive on finding and fixing a CUBIC congestion control bug in quiche, its Rust QUIC implementation, that stalled recovery after heavy early packet loss (InfoQ).

👨‍💻 Code Corner

With crawler policy suddenly a business decision, start by checking what your own robots.txt actually says to AI bots. This script audits any site in seconds:

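import urllib.request

AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended",
           "CCBot", "PerplexityBot", "Bytespider"]

site = "https://learningbot.tech"  # change to any site
# Send an explicit User-Agent: some hosts reject urllib's default one.
req = urllib.request.Request(f"{site}/robots.txt",
                             headers={"User-Agent": "robots-audit/0.1"})
raw = urllib.request.urlopen(req, timeout=10).read().decode("utf-8", "replace")

# rules maps a lowercased user-agent to the Disallow values of its group(s).
group, in_rules, rules = [], False, {}
for line in raw.splitlines():
    field, _, value = line.split("#")[0].partition(":")
    field, value = field.strip().lower(), value.strip()
    if field == "user-agent":
        if in_rules:  # a User-agent line after rules opens a new group
            group, in_rules = [], False
        group.append(value.lower())
        rules.setdefault(value.lower(), [])
    elif field in ("allow", "disallow"):
        in_rules = True
        if field == "disallow":
            for agent in group:  # consecutive User-agent lines share rules
                rules[agent].append(value or "(nothing disallowed)")

for bot in AI_BOTS:
    if bot.lower() in rules:
        print(f"{bot:16} -> {rules[bot.lower()] or ['(nothing disallowed)']}")
    elif rules.get("*"):
        print(f"{bot:16} -> no own rule, '*' applies: {rules['*']}")
    else:
        print(f"{bot:16} -> no rule: crawler allowed")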

Run it against your own domain and you will know in one glance which AI crawlers you have addressed and which are walking in unannounced.

Tip

robots.txt is advisory: well-behaved bots follow it, the rest ignore it. Enforcement is a separate layer, which is exactly the gap Cloudflare’s AI Crawl Control (and today’s Big Story) is built to close.
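
As a cross-check on any hand-rolled parser, Python's standard library ships `urllib.robotparser`, which answers the question a well-behaved bot would ask: may this user agent fetch this URL? The sketch below parses an inline sample rather than a live site; the `SAMPLE` rules, the `allowed` helper, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: GPTBot is shut out entirely, everyone else only from /private/.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, bot, url):
    """Return True if robots_txt permits the named bot to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, url)


if __name__ == "__main__":
    for bot in ("GPTBot", "ClaudeBot"):
        for path in ("/post", "/private/notes"):
            url = f"https://example.com{path}"
            print(f"{bot:10} {path:16} allowed={allowed(SAMPLE, bot, url)}")
```

This only reports what the file asks for. Whether a given crawler honors the answer is the enforcement question the Tip describes.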

🧰 Toolbox

  • Cloudflare AI Crawl Control: dashboard to see, block, or charge every AI crawler hitting your site, including HTTP 402 responses.
  • Eve: Vercel’s open-source framework for building and operating AI agents with a filesystem-based project structure for tools, skills, and subagents.
  • AWS Blocks: open-source TypeScript framework where each Block bundles code, local mocks, and infrastructure, designed so AI agents write correct backends first try.
  • AWS Workload Credentials Provider: open-source tool that auto-delivers and refreshes certificates and secrets, killing the expired-cert outage, in and out of AWS.
  • LLM Stats: benchmarking hub tracking 500+ models across 50+ benchmarks, with side-by-side arenas; handy for sanity-checking vendor claims.

🛠️ Build of the Week (rotating)

Rover Robotic Platform: a modular rover that splits the work across three boards, with a Raspberry Pi for web connectivity, an ESP32 for the embedded screen, and an Arduino for motor control.

  • Difficulty: Intermediate
  • Parts: Raspberry Pi, ESP32, Arduino, motor driver, chassis and drivetrain
  • Why we like it: it demonstrates the right tool per job; instead of forcing one board to do everything, each board handles the layer it is actually good at.

📚 From the Blog

πŸ˜€ The Bot Says…

The SpaceX AI phone is the first gadget to ship in quantum superposition: investors reportedly saw it, Musk says it does not exist, and it will stay both real and fake until someone collapses the wave function with a press release.


That’s all for today! If your website starts charging robots rent this fall, now you know who sent the invoice. Reply and tell us: would you block, charge, or welcome AI crawlers on your site?