Why it matters
Self-hosting inference on Kubernetes is the main alternative to per-token APIs. Here an AWS solutions architect walks through model selection, fine-tuning, and production EKS infrastructure, showing what owning that stack involves.
The tokenmaxxing angle
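Scaling inference on EKS is a build-vs-buy token economics decision: GPU node costs versus per-token API pricing. The talk's focus on model selection and production infrastructure covers the main inputs to that trade-off, since the model determines throughput per GPU and the infrastructure determines what each GPU-hour costs and how fully it is used.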
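To make the build-vs-buy comparison concrete, here is a minimal Python sketch of the arithmetic. It is not from the talk. The function names and every number in it (GPU hourly price, tokens per second, API price) are hypothetical placeholders, and it ignores engineering time, storage, networking, and idle capacity beyond a single utilization factor.

```python
def self_hosted_cost_per_million(gpu_hourly_usd: float,
                                 tokens_per_second: float,
                                 utilization: float) -> float:
    """USD per one million tokens served from a self-hosted GPU node.

    utilization is the fraction of each hour the node spends serving
    tokens at tokens_per_second (an assumed, simplified model).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def break_even_utilization(gpu_hourly_usd: float,
                           tokens_per_second: float,
                           api_usd_per_million: float) -> float:
    """Utilization at which self-hosting matches the API price.

    A result above 1.0 means the node cannot match the API price
    even when fully busy, under these assumptions.
    """
    if tokens_per_second <= 0 or api_usd_per_million <= 0:
        raise ValueError("throughput and API price must be positive")
    peak_millions_per_hour = tokens_per_second * 3600 / 1_000_000
    return gpu_hourly_usd / (peak_millions_per_hour * api_usd_per_million)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the calculation.
    gpu_hourly_usd = 4.0        # assumed cost of one GPU node per hour
    tokens_per_second = 1000.0  # assumed sustained throughput of the node
    api_usd_per_million = 2.0   # assumed per-token API price

    cost = self_hosted_cost_per_million(gpu_hourly_usd, tokens_per_second, 0.5)
    needed = break_even_utilization(gpu_hourly_usd, tokens_per_second,
                                    api_usd_per_million)
    print(f"Self-hosted at 50% utilization: ${cost:.2f} per million tokens")
    print(f"Break-even utilization vs API: {needed:.0%}")
```

In this simplified model the self-hosted cost per token falls as utilization rises, so the decision mostly comes down to whether traffic is steady enough to keep the GPU nodes busy.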
From the organizers
Speakers include AWS Solutions Architect Adil Can on 'Scaling Inference on Amazon EKS' plus Tech Holding's VP of Solutions Architecture; the page notes up to $50,000 in AWS funding available to offset project costs via the partner.