Why it matters
Everyone in the room ships production inference, so the war stories about latency, reliability, and GPU bills are first-hand.
The tokenmaxxing angle
Inference economics is the whole game here: Baseten's pitch is cost-efficient serving, which is token math at the infrastructure layer.
From the organizers
The event is held at Bar Cima on West 39th Street. Baseten cites customers such as Zed and Amp as inference-speed case studies.