AI’s memory tax comes due

By Mark July 5, 2026 7 min read

😁 Hello, super humans! The market spent the first half of 2026 betting that AI demand was bottomless, and this week it flinched. Chip stocks took a hard fall on both sides of the Pacific, and the culprit was not the GPU: it was the memory glued to it. Grab a coffee, because today we follow the money down to the silicon.

📰 Quick Signals

🧠 AI: OpenAI previewed the GPT-5.6 family (Sol, Terra, Luna) and said Sol will run on Cerebras at up to 750 tokens per second in July (OpenAI).
🤖 Robotics: Agility Robotics is eyeing a Nasdaq debut through a roughly $2.5B SPAC merger, with its Digit robot past 65,000 operating hours across nine sites (Tech Times).
💻 Programming: TypeScript 7.0 reached Release Candidate: the compiler is now a native Go port shipping as tsc, roughly 10x faster than 6.0 (Visual Studio Magazine).
⚡ Electronics: UCLA’s Samueli School launched a $125M semiconductor hub with industry partners to speed up AI chip design and fabrication (IEEE Spectrum).
📡 Telecom: Rocket Lab agreed to buy Iridium for about $8B ($54 per share) to become a vertically integrated launch-to-constellation operator (Rocket Lab).

🔍 The Big Story: The AI chip rout and the memory bottleneck under the hood

If you build anything that touches GPUs, this week mattered: the market started pricing in the possibility that the AI hardware buildout is running ahead of the returns, and the sell-off zeroed in on memory rather than compute.

What happened: The chip slide that began on Wall Street spread into Asia, where Samsung Electronics fell more than 7% and SK Hynix dropped over 9% at the open, dragging the broader tech tape down with them (CNBC). It capped a wild stretch for memory names: Micron had already whipsawed through a volatile week around its earnings as investors argued over whether AI demand justifies the multiples (CNBC). The triggers cited most often were a report that SK Hynix is slowing its high-bandwidth memory expansion and a more hawkish rate backdrop.

The details: Memory, not the GPU die, sits at the center of this because of bandwidth. An AI accelerator is only as fast as the data you can feed it, and high-bandwidth memory (HBM) is the stack of DRAM dies packaged right next to the compute die that supplies that firehose. Modern training and inference are frequently memory-bound: the multiply-accumulate units sit idle while they wait for weights and activations to arrive from memory. That is why a handful of HBM suppliers, chiefly SK Hynix, Samsung, and Micron, quietly gate the whole AI supply chain. When the market hears “HBM capacity is being throttled,” it reads two contradictory things at once: near-term scarcity keeps prices high, but a deliberate slowdown hints that the builders expect demand to cool. Both readings hit the same stocks.
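For the builders who want to see "memory-bound" as arithmetic, here is a minimal roofline sketch. The function names and the hardware figures (1,000 TFLOP/s of peak compute, 4 TB/s of memory bandwidth) are our own illustrative assumptions, not the specs of any real accelerator. The idea is that attainable throughput is capped by whichever is lower: the compute peak, or bandwidth multiplied by how many FLOPs you perform per byte moved.

```python
def ridge_point(peak_tflops: float, bandwidth_tbps: float) -> float:
    """FLOPs per byte needed before the compute units, not memory, become the limit."""
    return peak_tflops / bandwidth_tbps


def attainable_tflops(peak_tflops: float, bandwidth_tbps: float, intensity: float) -> float:
    """Roofline ceiling: min(compute peak, bandwidth * arithmetic intensity).

    intensity is in FLOPs per byte, so TB/s * FLOP/byte gives TFLOP/s.
    """
    return min(peak_tflops, bandwidth_tbps * intensity)


if __name__ == "__main__":
    # Hypothetical accelerator, numbers chosen only to make the math easy.
    PEAK_TFLOPS = 1000.0
    BANDWIDTH_TBPS = 4.0

    # Batch-1 decoding with 16-bit weights: each 2-byte weight is used for one
    # multiply and one add, so roughly 1 FLOP per byte moved.
    decode_intensity = 1.0
    # A large batched matrix multiply reuses each weight many times (assumed value).
    batched_intensity = 500.0

    print("ridge point (FLOP/byte):", ridge_point(PEAK_TFLOPS, BANDWIDTH_TBPS))
    print("decode ceiling (TFLOP/s):",
          attainable_tflops(PEAK_TFLOPS, BANDWIDTH_TBPS, decode_intensity))
    print("batched ceiling (TFLOP/s):",
          attainable_tflops(PEAK_TFLOPS, BANDWIDTH_TBPS, batched_intensity))
```

Under these made-up numbers, the low-intensity decode case tops out at 4 TFLOP/s on a chip rated for 1,000: the multiply-accumulate units are waiting on memory almost all of the time.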

Important

Our take: Call it the memory tax on AI. For most builders this reset is healthy, not scary. If HBM pricing normalizes, the cost per token of running real models drops, and the ceiling on what a small team can self-host rises. What we would actually watch is not the stock chart but memory bandwidth per dollar, because that number, more than any GPU FLOPS headline, decides whether your workload runs on a rented cluster or on hardware you own. The GPU gets the glory; the memory bus writes the check.
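If you want to turn "memory bandwidth per dollar" into a number for your own shortlist, a rough sketch follows. It leans on one simplifying assumption: at batch size 1, generating a token means streaming every model weight from memory once, so bandwidth divided by model size gives an upper bound on tokens per second. It ignores the KV cache, compute time, and everything else, and the two devices and their prices are invented for illustration.

```python
def decode_tokens_per_second(bandwidth_gbps: float, params_billion: float,
                             bytes_per_param: float) -> float:
    """Upper bound on batch-1 decode speed if every weight is read once per token."""
    model_gb = params_billion * bytes_per_param  # 1e9 params * bytes each = GB
    return bandwidth_gbps / model_gb


def bandwidth_per_dollar(bandwidth_gbps: float, price_usd: float) -> float:
    """GB/s of memory bandwidth bought per dollar of hardware."""
    return bandwidth_gbps / price_usd


if __name__ == "__main__":
    # Hypothetical devices: (name, bandwidth in GB/s, price in USD).
    devices = [
        ("desktop card", 1000.0, 2000.0),
        ("datacenter card", 4000.0, 30000.0),
    ]
    # Hypothetical model: 8 billion parameters stored at 1 byte each (8-bit).
    for name, bandwidth, price in devices:
        ceiling = decode_tokens_per_second(bandwidth, 8.0, 1.0)
        value = bandwidth_per_dollar(bandwidth, price)
        print(f"{name}: <= {ceiling:.0f} tok/s, {value:.3f} GB/s per dollar")
```

With these invented figures the pricier card is four times faster in absolute terms but buys far less bandwidth per dollar, which is the comparison that matters when you are deciding between renting and owning.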

🗞️ More News

🧠 AI

Anthropic redeployed Claude Fable 5 globally on July 1 after US export controls lifted, adding a new cybersecurity classifier and a HackerOne program for jailbreak reports (Anthropic).
Anthropic put Claude Science into public beta, a research workbench wired to 60-plus databases across genomics, proteomics, and cheminformatics (Anthropic).
OpenAI shipped new usage analytics and tighter spend controls for ChatGPT Enterprise admins (OpenAI).
OpenAI detailed work on improving health intelligence in ChatGPT (OpenAI).
OpenAI introduced its Partner Network for the ecosystem building on its models (OpenAI).
Google recapped a busy June of AI updates across its product line (Google).

🤖 Robotics

Figure AI reported hitting a sustained rate of one Figure 03 humanoid per hour at its BotQ line after a 24x scale-up (VaaSBlock).
Agility’s Digit has moved more than 100,000 totes at a GXO Logistics site and is running at Mercado Libre and Toyota Canada under Robots-as-a-Service deals (Agility Robotics).
A new wave of hyper-realistic Chinese humanoids drew attention (and some unease) with lifelike faces and lip-sync at recent shows (The Register).

💻 Programming

Python’s free-threaded build is now officially supported, not experimental, after the Steering Council accepted PEP 779; single-threaded overhead is down to roughly 10 to 15% (python.org).
Django 6.0 landed with built-in background tasks and Content Security Policy support, running on Python 3.12 through 3.14 (Django docs).
Astral, the team behind uv, pushed its ty type checker toward stable, aiming to do for type checking what uv did for packaging (InfoWorld).

⚡ Electronics

Even amid the rout, analysts note Micron’s HBM capacity is effectively sold out, keeping a stubborn bull case alive against the bears (Investing.com).
Raspberry Pi’s RP2350 continues to spread onto new boards like the Pico 2 and SparkFun’s Pro Micro, with microcontroller shipments now outpacing Pi single-board computers (SparkFun).
Deloitte’s 2026 outlook still frames the year as record semiconductor revenue driven by AI, a reminder the long-run curve and this week’s tape can disagree (Deloitte).

📡 Telecom

Amazon is reportedly weighing a roughly $11.6B stake in Globalstar as it prepares to begin offering satellite internet later this year (5Gstore Market Watch).
The MWC 2026 agenda pushed 6G research toward terahertz spectrum, betting THz is where the next throughput and positioning gains come from (The Fast Mode).
Satellite connectivity dominated MWC, with operators converging on multi-orbit strategies that blend LEO, MEO, and GEO for a single user session (Counterpoint Research).

👨‍💻 Code Corner

This week’s Big Story is really about feeding compute fast enough, and Python’s newly supported free-threaded build is the software mirror of that idea: real parallelism instead of GIL-serialized pretend-parallelism. Here is a CPU-bound test you can run on the free-threaded interpreter (python3.14t) to see threads actually use multiple cores.

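import time
from concurrent.futures import ThreadPoolExecutor

def crunch(n: int) -> int:
    """Pure-Python, CPU-bound loop: no I/O and no C extension to release the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    # Eight equal jobs, one per worker thread.
    work = [20_000_000] * 8
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=8) as pool:
        # list() drains the iterator so the timer covers every job.
        list(pool.map(crunch, work))
    print(f"8 threads finished in {time.perf_counter() - start:.2f}s")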

Run it once on a normal build and once on the free-threaded build: on a GIL build the threads take turns, while on python3.14t they run in parallel.
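If you would rather get one number per interpreter than compare two wall-clock times by eye, a small variation is to time the same jobs sequentially and then in threads, and report the ratio. This is a sketch with helper names of our own (time_sequential, time_threaded, speedup) and a deliberately small workload; we would expect a ratio near 1 on a GIL build and something closer to the core count on the free-threaded build, but measure rather than trust us.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def crunch(n: int) -> int:
    """Same CPU-bound loop as the test above."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_sequential(work: list) -> tuple:
    """Run every job on the calling thread; return (seconds, results)."""
    start = time.perf_counter()
    results = [crunch(n) for n in work]
    return time.perf_counter() - start, results


def time_threaded(work: list, workers: int) -> tuple:
    """Run the jobs on a thread pool; return (seconds, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(crunch, work))
    return time.perf_counter() - start, results


def speedup(work: list, workers: int) -> float:
    """Sequential time divided by threaded time; above 1 means threads helped."""
    seq_seconds, _ = time_sequential(work)
    thr_seconds, _ = time_threaded(work, workers)
    return seq_seconds / thr_seconds if thr_seconds > 0 else float("inf")


if __name__ == "__main__":
    jobs = [200_000] * 4  # small on purpose; raise it for a steadier reading
    print(f"speedup with 4 threads: {speedup(jobs, workers=4):.2f}x")
```

Tiny workloads are noisy because thread start-up cost dominates, so bump the job size before drawing conclusions.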

Tip

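The free-threaded interpreter is a separate build (python3.14t), not a flag on the default one. Check your C extensions first: only about half of the top PyPI packages ship free-threaded wheels so far, and importing an extension that has not declared free-threading support turns the GIL back on for the whole process (with a warning), while one that is not thread-safe can crash outright.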
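One way to catch that fallback is to ask the interpreter directly, after your imports have run. The sketch below uses two standard-library hooks: sysconfig.get_config_var("Py_GIL_DISABLED"), which tells you whether the build is free-threaded, and sys._is_gil_enabled(), available on 3.13 and later, which tells you whether the GIL is actually off right now. The describe_build helper is our own name, and on older interpreters that lack the sys hook we simply assume the GIL is on.

```python
import sys
import sysconfig


def describe_build() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is active."""
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on Python 3.13+; assume "on" elsewhere.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(gil_check()) if gil_check is not None else True
    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    report = describe_build()
    print(report)
    if report["free_threaded_build"] and report["gil_enabled"]:
        # The build supports running without the GIL, but something re-enabled it.
        print("GIL is back on: check recently imported C extensions.")
```

Call it after importing your heaviest dependencies: a free-threaded build that reports the GIL as enabled is the quiet fallback described in the tip.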

🧰 Toolbox

typescript-go: the staging repo for the native Go port of the TypeScript compiler now hitting RC; worth a read to see how the 10x came about.
ty: Astral’s blazing-fast Python type checker, from the makers of uv and ruff.
Raspberry Pi Pico 2: the $5 RP2350 board with dual Arm or RISC-V cores, double the SRAM and flash, and new security features.
Claude Science: a research workbench that connects an assistant to 60-plus scientific databases and tools.
ExploitGym: a UC Berkeley and frontier-lab benchmark for measuring model cyber capability, referenced in OpenAI’s GPT-5.6 preview.

🛠️ Build of the Week (rotating)

A pocket bandwidth meter with the Pico 2: a tiny bench tool that streams samples from an ADC into a ring buffer and plots throughput on a small OLED, a hands-on way to feel the memory-versus-compute tradeoff from today’s Big Story at microcontroller scale.

Difficulty: Intermediate
Parts: Raspberry Pi Pico 2 (RP2350), an SSD1306 OLED, a simple analog sensor, breadboard and jumpers
Why we like it: it makes an abstract idea (you are limited by how fast data moves, not how fast you compute) something you can watch on a screen for under $15.
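Before wiring anything up, you can prototype the core of the build on your laptop. This is a hardware-free sketch in plain Python, not firmware for the board: the RingBuffer class, the measure_throughput helper, and the fake sample source are all our own illustrative names, and on the Pico 2 you would swap the fake source for a real ADC read and send the result to the OLED instead of print.

```python
import time


class RingBuffer:
    """Fixed-size buffer that overwrites the oldest sample when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._data = [0] * capacity
        self._capacity = capacity
        self._head = 0        # index where the next sample is written
        self._count = 0       # number of valid samples currently stored
        self.overwritten = 0  # samples lost because the buffer was full

    def push(self, sample: int) -> None:
        if self._count == self._capacity:
            self.overwritten += 1
        else:
            self._count += 1
        self._data[self._head] = sample
        self._head = (self._head + 1) % self._capacity

    def snapshot(self) -> list:
        """Return stored samples from oldest to newest."""
        start = (self._head - self._count) % self._capacity
        return [self._data[(start + i) % self._capacity] for i in range(self._count)]


def measure_throughput(read_sample, buffer: RingBuffer, n_samples: int) -> float:
    """Push n_samples from read_sample() into the buffer; return samples per second."""
    start = time.perf_counter()
    for _ in range(n_samples):
        buffer.push(read_sample())
    elapsed = time.perf_counter() - start
    return n_samples / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    counter = iter(range(1_000_000))
    fake_adc = lambda: next(counter) & 0xFFFF  # stand-in for a 16-bit ADC reading
    buf = RingBuffer(capacity=256)
    rate = measure_throughput(fake_adc, buf, n_samples=100_000)
    print(f"{rate:,.0f} samples/s, {buf.overwritten} samples overwritten")
```

The overwritten counter is the interesting part: whenever the producer outruns whatever drains the buffer, data is lost, which is the same data-movement limit the Big Story describes at datacenter scale.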

📚 From the Blog

What Electricity Actually Is: Charge, Current, and Voltage: charge, current, and voltage from first principles; the physics under every DRAM cell and power rail in today’s Big Story.
CCTV in 2026: From Dumb Cameras to Intelligent Sensors: the opener to our Intelligent Video Analytics series, on how cameras became AI sensors.
CloudEvents 1.0: A Universal Language for Your Events: a simple, consistent way to describe event data across distributed systems.

😀 The Bot Says…

The GPU gets the standing ovation. The memory chip carries the whole band, roadies included, and gets billed as “and friends.”

That’s all for today! Reply and tell us: is the AI hardware boom overbuilt, or is this just the market catching its breath?