As expected, AI was everywhere at KubeCon + CloudNativeCon in Atlanta this year, but the real energy was focused on something less headline-grabbing and more foundational: solving everyday operational challenges. Amid the buzz about intelligent systems and futuristic workflows, practitioners remained grounded in urgent, practical work: managing tool sprawl, tackling Kubernetes complexity, and confronting the chaos of "day two" operations.
Operations Remains Human-Centered
There's real promise in AI, especially in areas like automation and observability. But many teams are still figuring out how to integrate AI into legacy systems that are already under stress. What stood out most was how human-centered the cloud native community remains: committed to reducing toil, improving developer experience, and building resilient platforms that work when the pager goes off at 3am.
A prime example of this grounded perspective came from Adobe's Joseph Sandoval. In his keynote, "Maximum Acceleration: Cloud Native at the Speed of AI," Sandoval acknowledged the dramatic potential of AI-native infrastructure but made clear it's not just a tooling revolution. "We've entered the agent economy," he said, describing systems that can "observe, reason, and act." But to support these workloads, we must evolve Kubernetes itself: "We're moving from tracing requests to tracing reasoning, from metrics to meaning." Kubernetes, he argued, has become the foundation for AI, if unintentionally, offering the flexibility and control these systems demand.
This potential is already visible in the real world: Niantic's Pokémon GO team, for example, demonstrated how they use Kubernetes and Kubeflow to run a global machine learning–powered scheduling platform that predicts player participation and orchestrates in-game events across millions of locations. But autonomy, Sandoval cautioned, only works when it's built on operational trust: smarter scheduling, adaptive orchestration, and rock-solid security boundaries.
This call to strengthen foundational infrastructure echoed across the event, especially in platform engineering discussions. Abby Bangser's keynote framed platform engineering not as yet another revolution but as a response to complexity: "We build platforms to reduce the complexity and scope for those building on top, not to give them new systems to learn." Great platforms, she argued, are judged not by shiny architecture diagrams but by how effectively they empower developers. Internal platforms become an economy of scale: bespoke to a business yet broadly enabling. And most importantly: "The only success is a more effective and happier development team." (If you're interested in going deeper, check out her report, Platform as a Product, coauthored with Daniel Bryant, Colin Humphreys, and Cat Morris.)
Ambitious AI Requires Practical Engineering
Throughout the conference, this emphasis on developer experience and practical operations consistently overshadowed AI hype. That context made the CNCF's launch of Kubernetes AI Conformance feel especially timely. "As AI moves into production, teams need consistent infrastructure they can rely on," said Chris Aniszczyk, CNCF's CTO. The goal is to create guardrails so AI workloads behave predictably across different environments. This maturity is already visible: KServe's graduation to incubating status is a sign that foundational work is gradually catching up to AI ambition.

Meanwhile, the hallway conversations were filled with a very real and immediate concern: the announced retirement of Ingress NGINX, which currently runs in nearly half of all Kubernetes clusters. Teams suddenly had to reckon with critical migration planning, a reminder that while we talk about building intelligent systems of the future, our operational reality is still deeply rooted in managing vital but aging components today.
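To make that migration planning concrete, here's an illustrative sketch (all resource names are hypothetical) of how a minimal NGINX-class Ingress rule maps onto the Gateway API's HTTPRoute, a common migration target for teams moving off the retiring controller:

```yaml
# Before: a minimal Ingress bound to the retiring NGINX controller.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress          # placeholder name
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: app-service   # placeholder backend Service
            port:
              number: 8080
---
# After: the equivalent route expressed with the Gateway API,
# attached to a Gateway managed by whichever implementation replaces NGINX.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
  - name: shared-gateway     # a cluster-provided Gateway (hypothetical)
  hostnames:
  - app.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: app-service
      port: 8080
```

The community's ingress2gateway tool can automate much of this translation, though behavior configured through NGINX-specific annotations typically needs manual review.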
There were really two converging stories being told. Platform engineering talks focused on hard-earned lessons and production-hardened architectures. Speakers from Capital One, for example, demonstrated how their internal platform, Dragon, evolved through thoughtful iteration and real-world adaptation into a scalable, resilient platform. Meanwhile, the complexities of the emerging AI space were highlighted in sessions like "Navigating the AI/ML Networking Maze in Kubernetes: Lessons from the Trenches," which detailed how AI/ML workloads are pushing HPC networking concepts like RDMA and MPI into Kubernetes, creating a "new learning curve," and discussed the "intricacies of integrating specialized hardware."
The real intrigue is watching these worlds collide in real time: platform engineers being asked to operationalize AI workloads they barely trust, and AI teams realizing their models require more than just compute. They still need to solve problems like traffic routing, identity, observability, and failure isolation.
The Ecosystem Continues to Mature
As the ecosystem evolves, some clear frontrunners are emerging. eBPF (especially via Cilium) has become the backbone of modern networking and observability. Gateway API has matured into a powerful next-generation alternative to Kubernetes Ingress, with broad support across popular ingress and service mesh providers. OpenTelemetry is becoming the standard for collecting signals at scale. Dynamic Resource Allocation (DRA), a critical Kubernetes API extension, and the Model Context Protocol (MCP) are both clearly emerging as key enablers for the new generation of AI-driven workloads. These aren't just tools; they're foundations for a future where infrastructure must be more intelligent and more manageable at once.

It's fitting that the CNCF marked its tenth birthday at this KubeCon: 10 years of evolving an ecosystem shaped not by flashy trends but by consistent, collaborative tooling that quietly powers today's most critical platforms. With over 200 projects under its umbrella, the foundation now turns toward the AI-native future with the same mindset: build stable layers first, then empower innovation on top. The path forward won't come from yet another algorithm, agent, or abstraction layer but from the less glamorous, deeply necessary work: derisking complexity, stabilizing orchestration layers, and enabling the teams who live in production.
The teams slogging through ingress controller deprecations today are building the trust needed for tomorrow's agent-native systems. Before we can hand over real responsibility to AI agents, we need platforms resilient enough to contain their failures and flexible enough to enable their success. The next event, KubeCon + CloudNativeCon Europe, takes place in Amsterdam March 23–26 in the new year, and we're looking forward to seeing more sessions that further this conversation.
