2026-03-17 08:55:52
This redesign is more than a style update—it’s a step toward clearer technical communication and better user experience. Try the new HAMi website at https://project-hami.io and submit issues on GitHub.
Over the past two months, I conducted a thorough refactor of the documentation website (see GitHub). Externally, it looks like a “visual redesign”, but from the perspective of community maintainers and content builders, it’s a comprehensive upgrade of information architecture, content system, and frontend experience.
This article aims to systematically explain three things: why we did this refactor, what exactly changed, and what these changes mean for the HAMi community.
HAMi is a CNCF-hosted open source project initiated and contributed by Dynamia, with growing influence in GPU virtualization, heterogeneous compute scheduling, and AI infrastructure. The community content is expanding, and user types are becoming more diverse: from first-time visitors to engineers and enterprise users seeking deployment docs, architecture diagrams, case studies, and ecosystem information.
The original site was functional, but as content grew, several issues became apparent:
For a fast-evolving open source community, the website is not just a “place for docs”, but the public interface of the community. It needs to serve as project introduction, knowledge gateway, adoption proof, community connector, and brand expression.
So the goal of this refactor was clear: not just superficial beautification, but to truly upgrade the website into HAMi’s systematic community entry point.
This update was not a single-point change, but a series of systematic improvements.
The most obvious change is the homepage.
We redesigned the homepage structure, moving away from simply stacking content blocks, and instead organizing the page around the main narrative: “Project Positioning → Core Capabilities → Ecosystem Entry → Content Accumulation → Community Trust”.
Specifically, the homepage received several key upgrades: Hero animations and atmosphere layers, research/story sections, new resource entry sections, refreshed CTAs, a unified background design, and an ongoing reduction of visual noise. Together, they solve a core problem: enabling visitors to understand, within seconds, what HAMi is and why it is worth exploring further.
Key diagrams were redrawn for clearer technical communication. This helps users grasp HAMi’s role in AI infrastructure.

For HAMi, this change is critical. The community faces not just a single feature, but a set of system-level challenges involving Kubernetes, schedulers, GPU Operators, heterogeneous devices, and enterprise platforms. Improved diagrams make the website a better technical entry point.
Another important direction was strengthening the “community proof” layer.
Many open source project sites fall into the trap of having complete docs, but users can’t tell if the project is truly adopted, if the community is active, or if the ecosystem is expanding. The HAMi website redesign consciously addresses this.



Blog cards, lists, and metadata were unified for easier reading and sharing. Blogs are now a core communication layer.

Navigation, card layouts, footer, and search were improved for smoother mobile browsing.

Footer layout was enhanced for better navigation and credibility. Built-in search replaced unreliable external solutions, improving content accessibility.


From screenshots, it looks like “the website looks better”. But from a community-building perspective, its significance is deeper.
First, HAMi’s external expression is more systematic.
The website is no longer just a collection of scattered pages, but is forming a complete narrative chain: users can understand project value from the homepage, capability details from docs, practical paths from blogs, and community impact from ecosystem modules.
Second, community content assets are reorganized.
Previously, valuable articles, diagrams, and explanations existed but were hard to find. Now, through homepage sections, navigation, and search refactor, these contents are more effectively connected.
Third, HAMi’s community image is more mature.
A mature open source project needs not just an active code repository, but clear, stable, and sustainable website expression. Structure, style, and usability are part of the community’s engineering capability.
Fourth, this lays the foundation for expanding case studies, adopters, contributors, and ecosystem content.
With the framework sorted, adding more case studies, collaboration entry points, or showcasing more adopters and partners will be more natural and easier for users to understand.
In summary, I believe this refactor got three things right:
These may not be as flashy as launching a new feature, but they directly impact content dissemination, user comprehension, and the project’s long-term image.
For infrastructure projects like HAMi, technical capability is fundamental, but clearly communicating, organizing, and continuously presenting that capability is also a form of infrastructure.
This HAMi documentation and website refactor is essentially an upgrade to the community’s “expression layer” infrastructure.
It improves the visual and reading experience and reorganizes the content system: homepage narrative, search paths, mobile access, and community signal display. The homepage redesign, architecture diagram redraw, unified blog style, mobile optimization, enhanced footer, and the switch from external to built-in search together constitute a true “refactor”.
Externally, it helps more people quickly understand HAMi; internally, it provides a stable platform for the community to accumulate case studies, expand the ecosystem, and serve adopters and contributors.
The website is not an accessory to the open source community, but part of its long-term influence. HAMi’s redesign is about taking this seriously.
If you’re interested in Kubernetes GPU virtualization, add me on WeChat (ID: jimmysong).
2026-03-15 11:34:06
AI is quietly reshaping the infrastructure landscape, and GTC 2026 may become a key node in this transformation.
Next week, one of the most important technology conferences in the AI industry, NVIDIA GTC 2026, will be held in San Jose, USA.
For many people, GTC is just a GPU technology conference. But if you follow the development of the AI industry over the past few years, you’ll find an interesting phenomenon:
Many important narratives about AI infrastructure are gradually taking shape at GTC.
From CUDA, DGX, to AI Factory, and most recently Jensen Huang’s proposed AI Five-Layer Cake, NVIDIA is constantly attempting to redefine the computing infrastructure of the AI era.
This is why many people call GTC:
AI’s “Woodstock.”

This year’s GTC (March 16-19) is expected to cover various levels of the AI stack, including:
According to NVIDIA’s official blog, this year’s keynote will focus on the complete AI stack from chips to applications.
If we put these signals together, we can actually see a larger trend:
AI is transforming from an “applied technology” into “infrastructure.”
From a longer time scale, the technological revolutions in human history are essentially infrastructure revolutions.
We usually speak of four industrial revolutions.
In the table below, you can see the infrastructure corresponding to each industrial revolution:
| Industrial Revolution | Infrastructure |
|---|---|
| Steam Revolution | Steam Engine |
| Electrical Revolution | Power Grid |
| Digital Revolution | Computer |
| Internet Era | Network |
The steam engine allowed humans to utilize mechanical power on a large scale for the first time. Production no longer relied on human or animal power, but on machines.
Electricity changed not only the source of power, but also the organization of production. Assembly lines, large-scale manufacturing, and modern industrial systems are all built on the foundation of the power grid.
Computers allowed information to be processed digitally. Software became a production tool.
The internet connects all computers together. Cloud computing transforms computing resources into infrastructure. And AI gives machines a certain degree of “cognitive ability.”
If we observe these industrial revolutions, we discover a pattern:
Each industrial revolution produces a new General Purpose Infrastructure.
And AI is likely to become the next-generation infrastructure.
NVIDIA even directly stated in a recent article:
AI is essential infrastructure, like electricity and the internet.
In other words:
AI is no longer just an applied technology, but a new factor of production.
Recently, Jensen Huang proposed a very interesting concept: AI Five-Layer Cake.

AI is broken down into five layers:
This model actually illustrates one thing:
AI is a complete industrial system.
Jensen Huang even described AI at Davos as:
“One of the largest-scale infrastructure constructions in human history.”
This year’s GTC is expected to announce several important directions.
The focus of AI in the past was training. But the main workload of AI in the future is likely to be inference.
Analysts expect that by 2030, 75% of computing demand in the AI data center market will come from inference.
The past AI model was:
User → Model → Answer
The Agent model is more complex:
User → Agent → Tools → Model → Action
The flowchart below shows the main interaction paths in the Agent model:
AI is no longer just answering questions, but executing tasks.
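The Agent interaction path above (User → Agent → Tools → Model → Action) can be sketched as a minimal loop. This is a toy illustration with invented function and tool names, not any vendor’s framework API:

```python
# Minimal sketch of the Agent path: User -> Agent -> Tools -> Model -> Action.
# All names here (model, TOOLS, agent) are illustrative, not a real framework.

def model(prompt: str) -> str:
    """Stand-in for an LLM call: decides whether a tool is needed."""
    if "weather" in prompt:
        return "CALL get_weather"
    return "ANSWER I don't know"

TOOLS = {
    "get_weather": lambda: "sunny, 22C",
}

def agent(user_request: str) -> str:
    decision = model(user_request)      # Agent consults the model
    if decision.startswith("CALL "):
        tool_name = decision.split(" ", 1)[1]
        result = TOOLS[tool_name]()     # Agent executes a tool (the "Action")
        return f"{tool_name}: {result}"
    return decision.split(" ", 1)[1]    # plain answer, no tool call

print(agent("what is the weather today?"))  # -> "get_weather: sunny, 22C"
```

The point is structural: the model no longer returns the final answer directly; it drives a loop that may invoke tools and produce actions, which is exactly what makes Agent workloads different for infrastructure.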
Recent media reports suggest that NVIDIA may launch a new Agent platform: NemoClaw, aimed at helping enterprises deploy AI Agents.
If this project is truly released, it means NVIDIA’s stack will become the following structure:
This is actually a complete AI stack.
The emergence of Agents brings new computing workload issues.
Past AI workloads were mainly training and inference. But Agents bring a third type: Agent workloads.
The figure below shows the diverse workload types related to Agents:
This workload is highly fragmented: GPUs are no longer occupied for long stretches, but instead face many small requests. That poses new challenges for infrastructure.
For the past few years, I’ve been thinking about a question:
What is AI-native infrastructure?
It is clearly not just “Kubernetes with GPUs.” I’m more inclined to believe it needs to possess several characteristics.
In the cloud computing era, CPU is the core resource. In the AI era, GPU is the core resource.
Real-world AI chips are not limited to NVIDIA:
Future AI infrastructure must be able to manage heterogeneous computing.
GPU is a very expensive resource. If it cannot be shared, utilization will be very low. This is why GPU virtualization and slicing are becoming increasingly important.
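A toy packing exercise shows why sharing matters economically. The job sizes below are made up, and first-fit packing is only an illustration, not HAMi’s actual scheduling algorithm:

```python
# Toy illustration of why GPU sharing raises utilization.
# Whole-GPU allocation: each job occupies one GPU regardless of need.
# Sliced allocation: jobs are packed by the memory they actually request.
# Job sizes are invented; this is not any scheduler's real algorithm.

GPU_MEM_GIB = 80                    # e.g. one 80 GiB accelerator
jobs_gib = [10, 6, 24, 8, 12, 4]    # per-job memory requests

# Whole-GPU model: one GPU per job.
gpus_whole = len(jobs_gib)
util_whole = sum(jobs_gib) / (gpus_whole * GPU_MEM_GIB)

def first_fit(jobs, capacity):
    """Pack jobs into as few GPUs as possible (first-fit heuristic)."""
    bins = []                       # remaining capacity per GPU
    for job in jobs:
        for i, free in enumerate(bins):
            if job <= free:
                bins[i] -= job
                break
        else:                       # no existing GPU fits: open a new one
            bins.append(capacity - job)
    return len(bins)

gpus_sliced = first_fit(jobs_gib, GPU_MEM_GIB)
util_sliced = sum(jobs_gib) / (gpus_sliced * GPU_MEM_GIB)

print(f"whole GPUs: {gpus_whole}, utilization {util_whole:.0%}")   # 6, 13%
print(f"sliced GPUs: {gpus_sliced}, utilization {util_sliced:.0%}") # 1, 80%
```

With these invented numbers, whole-GPU allocation needs six devices at roughly 13% utilization, while slicing packs the same work onto one device at 80%. Real workloads are messier, but the direction of the gap is the same.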
AI scheduling includes not only traditional CPU and memory, but also:

- GPU
- VRAM
- Topology
- Bandwidth
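A node filter over these dimensions might look like the following sketch. All field names (`free_vram_gib`, `nvlink`, `bandwidth_gbps`, etc.) are invented for illustration; real schedulers expose these through their own resource models:

```python
# Hypothetical multi-dimensional GPU-aware node filtering.
# Field names are invented; not any real scheduler's API.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_gpus: int
    free_vram_gib: int
    nvlink: bool
    bandwidth_gbps: int

@dataclass
class Request:
    gpus: int
    vram_gib: int
    needs_nvlink: bool
    min_bandwidth_gbps: int

def feasible(node: Node, req: Request) -> bool:
    """A node passes only if every dimension is satisfied."""
    return (node.free_gpus >= req.gpus
            and node.free_vram_gib >= req.vram_gib
            and (node.nvlink or not req.needs_nvlink)
            and node.bandwidth_gbps >= req.min_bandwidth_gbps)

nodes = [
    Node("a100-1", free_gpus=2, free_vram_gib=60, nvlink=True,  bandwidth_gbps=200),
    Node("t4-1",   free_gpus=4, free_vram_gib=32, nvlink=False, bandwidth_gbps=25),
]
req = Request(gpus=1, vram_gib=40, needs_nvlink=True, min_bandwidth_gbps=100)
print([n.name for n in nodes if feasible(n, req)])  # -> ['a100-1']
```

The design point: once scheduling is multi-dimensional, a node with plenty of free GPUs can still be infeasible because a single dimension (here, VRAM) falls short.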
Combining the above trends, the future AI stack may present the following structure:
This structure is very close to NVIDIA’s Five-Layer Cake.
Combining signals from GTC, AI Factory, Agents, and AI Five-Layer Cake, we can see a very obvious trend:
AI is rewriting computing infrastructure.
Future competition may not just be “who has the best model,” but:
Who has the best AI Infrastructure.
Just like the past few decades:
The future may be:
AI Infrastructure determines intelligence capability.
If we stretch the time scale a bit longer, we may be in a new historical stage.
AI is no longer just a technological tool. It is becoming new infrastructure.
Just like:
And AI-native infrastructure is likely to become one of the most important technology directions for the next decade.
2026-02-13 22:32:46
The future of GPU scheduling isn’t about whose implementation is more “black-box”—it’s about who can standardize device resource contracts into something governable.

Have you ever wondered: why are GPUs so expensive, yet overall utilization often hovers around 10–20%?

This isn’t a problem you solve with “better scheduling algorithms.” It’s a structural problem: GPU scheduling is undergoing a shift from “proprietary implementation” to “open scheduling,” similar to how networking converged on CNI and storage converged on CSI.
In the HAMi 2025 Annual Review, we noted: “HAMi 2025 is no longer just about GPU sharing tools—it’s a more structural signal: GPUs are moving toward open scheduling.”
By 2025, the signals of this shift became visible: Kubernetes Dynamic Resource Allocation (DRA) graduated to GA and became enabled by default, NVIDIA GPU Operator started defaulting to CDI (Container Device Interface), and HAMi’s production-grade case studies under CNCF are moving “GPU sharing” from experimental capability to operational excellence.
This post analyzes this structural shift from an AI Native Infrastructure perspective, and what it means for Dynamia and the industry.
In multi-cloud and hybrid cloud environments, GPU model diversity significantly amplifies operational costs. One large internet company’s platform spans H200/H100/A100/V100/4090 GPUs across five clusters. If you can only allocate “whole GPUs,” resource misalignment becomes inevitable.
“Open scheduling” isn’t a slogan—it’s a set of engineering contracts being solidified into the mainstream stack.
Before: GPUs were extended resources. The scheduler couldn’t tell whether a unit represented memory, compute, or a device type.

Now: Kubernetes DRA provides objects like DeviceClass, ResourceClaim, and ResourceSlice. This lets drivers and cluster administrators define device categories and selection logic (including CEL-based selectors), while Kubernetes handles the full loop: match devices → bind claims → place Pods onto nodes with access to allocated devices.
Even more importantly, with Kubernetes 1.34 the core APIs in the resource.k8s.io group graduated to GA: DRA became stable and enabled by default, and the community committed to avoiding breaking changes going forward. This means the ecosystem can invest with confidence in a stable, standard API.
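As a rough sketch of what the declarative flow looks like, here is a DRA-style claim. The class and driver names are placeholders, and the exact field shapes vary across resource.k8s.io API versions (the shape below follows v1beta1), so check the version your cluster serves:

```yaml
# Hedged sketch of a DRA ResourceClaim (v1beta1 shape; placeholder names).
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: one-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com  # defined by a DeviceClass + DRA driver
```

A Pod then references the claim through its `resourceClaims`, and Kubernetes closes the loop described above: match devices, bind the claim, and place the Pod on a node with access to the allocated device.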
Before: Device injection relied on vendor-specific hooks and runtime class patterns.
Now: The Container Device Interface (CDI) abstracts device injection into an open specification. NVIDIA’s Container Toolkit explicitly describes CDI as an open specification for container runtimes, and NVIDIA GPU Operator 25.10.0 defaults to enabling CDI on install/upgrade—directly leveraging runtime-native CDI support (containerd, CRI-O, etc.) for GPU injection.
This means “devices into containers” is also moving toward replaceable, standardized interfaces.
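Concretely, a CDI spec is just a declarative file on the node that the runtime reads. The vendor/class name and device paths below are placeholders following the CDI specification’s schema:

```yaml
# Hedged sketch of a CDI device spec (e.g. a file under /etc/cdi/).
# Vendor/class and paths are placeholders, not a real driver's output.
cdiVersion: "0.6.0"
kind: example.com/gpu
devices:
- name: gpu0
  containerEdits:
    deviceNodes:
    - path: /dev/example-gpu0
```

Because the runtime (containerd, CRI-O, etc.) interprets this spec directly, the injection step no longer depends on vendor-specific hooks.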
On this standardization path, HAMi’s role needs redefinition: it’s not about replacing Kubernetes—it’s about turning GPU virtualization and slicing into a declarative, schedulable, governable data plane.
HAMi’s core contribution expands the allocatable unit from “whole GPU integers” to finer-grained shares (memory and compute), forming a complete allocation chain:
This transforms “sharing” from ad-hoc “it runs” experimentation into engineering capability that can be declared in YAML, scheduled by policy, and validated by metrics.
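Declaring a fractional share in YAML looks roughly like this. The resource names follow HAMi’s documented conventions for NVIDIA devices; the values are illustrative:

```yaml
# Illustrative HAMi vGPU request: one slice of a GPU,
# capped at ~3 GiB of device memory and ~30% of compute.
apiVersion: v1
kind: Pod
metadata:
  name: vgpu-demo
spec:
  containers:
  - name: worker
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 1        # number of vGPU slices
        nvidia.com/gpumem: 3000  # device memory, in MiB
        nvidia.com/gpucores: 30  # percent of compute
```

The key property is that the share is part of the Pod spec itself, so it can be versioned, reviewed, scheduled by policy, and validated against metrics like any other declared resource.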
HAMi’s scheduling doesn’t replace Kubernetes—it uses a Scheduler Extender pattern to let the native scheduler understand vGPU resource models:
This architecture positions HAMi naturally as an execution layer under higher-level “AI control planes” (queuing, quotas, priorities)—working alongside Volcano, Kueue, Koordinator, and others.

CNCF public case studies provide concrete answers: in a hybrid, multi-cloud platform built on Kubernetes and HAMi, 10,000+ Pods run concurrently, and GPU utilization improves from 13% to 37% (nearly 3×).

Here are highlights from several cases:
These cases demonstrate a consistent pattern: GPU virtualization becomes economically meaningful only when it participates in a governable contract—where utilization, isolation, and policy can be expressed, measured, and improved over time.
From Dynamia’s perspective (and as VP of Open Source Ecosystem), the strategic value of HAMi becomes clear:

This boundary is the foundation for long-term trust—project and company offerings remain separate, with commercial distributions and services built on the open source project.
The internal alignment memo recommends a bilingual approach:
- First layer: lead globally with “GPU virtualization / sharing / utilization” (Chinese copy can directly use “GPU virtualization and heterogeneous scheduling,” but the English first layer should avoid “heterogeneous” as a headline)
- Second layer: when users discuss mixed GPUs or workload diversity, introduce “heterogeneous” to confirm capability boundaries; never as the opening hook
- Core anchor: maintain “HAMi (project and community) ≠ company products” as the non-negotiable baseline for long-term positioning
DaoCloud’s case study already set vendor-agnostic and CNCF toolchain compatibility as hard constraints, framing vendor dependency reduction as a business and operational benefit—not just a technical detail. Project-HAMi’s official documentation lists “avoid vendor lock” as a core value proposition.
In this context, the right commercialization landing isn’t “closed-source scheduling”—it’s productizing capabilities around real enterprise complexity:
My strong judgment: over the next 2–3 years, GPU scheduling competition will shift from “whose implementation is more black-box” to “whose contract is more open.”
The reasons are practical:
These signals suggest that heterogeneity will grow: mixed accelerators, mixed clouds, mixed workload types.
Low-latency inference tiers (beyond just GPUs) will force resource scheduling toward “multi-accelerator, multi-layer cache, multi-class node” architectural design—scheduling must inherently be heterogeneous.
In this world, “open scheduling” isn’t idealism; it’s risk management. Building “control plane + data plane” combinations around DRA/CDI and other solidifying open interfaces, combinations that are pluggable, governable across tenants, and able to co-evolve with the ecosystem, looks like the truly sustainable path for AI Native Infrastructure.
The next battleground isn’t “whose scheduling is smarter”—it’s “who can standardize device resource contracts into something governable.”
When you place HAMi 2025 back in the broader AI Native Infrastructure context, it’s no longer just the year of “GPU sharing tools”—it’s a more structural signal: GPUs are moving toward open scheduling.

The driving forces come from both ends:
For Dynamia, HAMi’s significance has transcended “GPU sharing tool”: it turns GPU virtualization and slicing into declarative, schedulable, measurable data planes—letting queues, quotas, priorities, and multi-tenancy actually close the governance loop.
2026-02-08 20:20:05
“The best way to learn AI is to start building. These resources will guide your journey.”

In my ongoing effort to keep the AI Resources list focused on production-ready tools and frameworks, I’ve removed 44 collection-type projects—courses, tutorials, awesome lists, and cookbooks.
These resources aren’t gone—they’ve been moved here. This post is a curated collection of those educational materials, organized by type and topic. Whether you’re a complete beginner or an experienced practitioner, you’ll find something valuable here.
My AI Resources list now focuses on concrete tools and frameworks—projects you can directly use in production. Collections, while valuable, serve a different purpose: education and discovery.
By separating them, I:
Awesome lists are community-curated collections of the best resources. They’re perfect for discovering new tools and staying updated.
Structured learning paths from universities and tech companies.
Machine Learning for Beginners
Practical code examples and recipes.
In-depth guides on specific topics.
Reusable templates and workflows.
Academic and evaluation resources.
System Prompts and Models of AI Tools
Agent frameworks and production tools remain in the AI Resources list, including:
These are functional tools you can use to build applications, not educational collections. They belong in the AI Resources list.
I removed 44 collection-type projects from the AI Resources list to keep it focused on production tools:
These resources remain incredibly valuable for learning and discovery. They just serve a different purpose than the production-focused tools in my AI Resources list.
Next Steps:
Acknowledgments: This collection was compiled during my AI Resources cleanup initiative. Special thanks to all the maintainers of these awesome lists, courses, and collections for their invaluable contributions to the AI community.
2026-02-08 16:00:00
“If I have seen further, it is by standing on the shoulders of giants.” — Isaac Newton

In the excitement surrounding LLMs, vector databases, and AI agents, it’s easy to forget that modern AI didn’t emerge from a vacuum. Today’s AI revolution stands upon decades of infrastructure work—distributed systems, data pipelines, search engines, and orchestration platforms that were built long before “AI Native” became a buzzword.
This post is a tribute to those traditional open source projects that became the invisible foundation of AI infrastructure. They’re not “AI projects” per se, but without them, the AI revolution as we know it wouldn’t exist.
| Era | Focus | Core Technologies | AI Connection |
|---|---|---|---|
| 2000s | Web Search & Indexing | Lucene, Elasticsearch | Semantic search foundations |
| 2010s | Big Data & Distributed Computing | Hadoop, Spark, Kafka | Data processing at scale |
| 2010s | Cloud Native | Docker, Kubernetes | Model deployment platforms |
| 2010s | Stream Processing | Flink, Storm, Pulsar | Real-time ML inference |
| 2020s | AI Native | Transformers, Vector DBs | Built on everything above |
Before we could train models on petabytes of data, we needed ways to store, process, and move that data.
GitHub: https://github.com/apache/hadoop
Hadoop democratized big data by making distributed computing accessible. Its HDFS filesystem and MapReduce paradigm proved that commodity hardware could process web-scale datasets.
Why it matters for AI:
GitHub: https://github.com/apache/kafka
Kafka redefined data streaming with its log-based architecture. It became the nervous system for real-time data flows in enterprises worldwide.
Why it matters for AI:
GitHub: https://github.com/apache/spark
Spark brought in-memory computing to big data, making iterative algorithms (like ML training) practical at scale.
Why it matters for AI:
Before RAG (Retrieval-Augmented Generation) became a buzzword, search engines were solving retrieval at scale.
GitHub: https://github.com/elastic/elasticsearch
Elasticsearch made full-text search accessible and scalable. Its distributed architecture and RESTful API became the standard for search.
Why it matters for AI:
GitHub: https://github.com/opensearch-project/opensearch
When AWS forked Elasticsearch, it ensured search infrastructure remained truly open. OpenSearch continues the mission of accessible, scalable search.
Why it matters for AI:
The evolution from relational databases to vector databases represents a paradigm shift—but both have AI relevance.
Why they matter for AI:
When Docker and Kubernetes emerged, they weren’t built for AI—but AI couldn’t scale without them.
GitHub: https://github.com/kubernetes/kubernetes
Kubernetes became the operating system for cloud-native applications. Its declarative API and controller pattern made it perfect for AI workloads.
Why it matters for AI:
Istio (2016) and Knative (2018): service mesh and serverless platforms that proved:
Why they matter for AI:
API gateways weren’t designed for AI, but they became the foundation of AI Gateway patterns.
These API gateways solved rate limiting, auth, and routing at scale. When LLMs emerged, the same patterns applied:
AI Gateway Evolution:
Traditional API Gateway (2010s)
↓
Rate Limiting → Token Bucket Rate Limiting
Auth → API Key + Organization Management
Routing → Model Routing (GPT-4 → Claude → Local Models)
Observability → LLM-specific Telemetry (token usage, cost)
↓
AI Gateway (2024)
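The “Token Bucket Rate Limiting” step above is a classic algorithm. Here is a minimal sketch that meters LLM tokens rather than request counts; the numbers are illustrative and this is not any real gateway’s API:

```python
# Minimal token-bucket limiter, refilling "LLM tokens per second"
# rather than requests. A toy sketch, not a production implementation.

class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)   # bucket starts full
        self.refill_per_sec = refill_per_sec
        self.last = 0.0

    def allow(self, cost: int, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=1000, refill_per_sec=100)
print(bucket.allow(800, now=0.0))  # True: bucket starts full
print(bucket.allow(800, now=1.0))  # False: only 200 left + 100 refilled
print(bucket.allow(800, now=6.0))  # True: 5 more seconds of refill
```

The AI-gateway twist is only in what `cost` means: instead of “one request,” it is the (estimated or measured) token count of an LLM call, which is what makes per-tenant cost control possible.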
Why they matter for AI:
Data engineering needs pipelines. ML engineering needs pipelines. AI agents need workflows.
GitHub: https://github.com/apache/airflow
Airflow made pipeline orchestration accessible with its DAG-based approach. It became the standard for ETL and data engineering.
Why it matters for AI:
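The DAG idea at Airflow’s core can be shown without Airflow itself. Below is a pure-Python topological ordering of a toy ETL-to-ML pipeline; the task names and dependencies are invented, and no Airflow API is used:

```python
# Toy DAG execution order, in the spirit of Airflow's DAG model.
# Task names and edges are invented; graphlib is in the standard library.
from graphlib import TopologicalSorter

# task -> set of upstream tasks it depends on
dag = {
    "extract": set(),
    "clean":   {"extract"},
    "train":   {"clean"},
    "report":  {"clean", "train"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # -> ['extract', 'clean', 'train', 'report']
```

This is the same contract an ML pipeline needs: each step runs only after its upstream data dependencies are satisfied, which is why Airflow’s model carried over so naturally to ML orchestration.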
Modern workflow platforms that evolved from Airflow’s foundations:
Why they matter for AI:
Before we could train on massive datasets, we needed formats that supported ACID transactions and schema evolution.
These table formats brought reliability to data lakes:
Why they matter for AI:
What do all these projects have in common?
Modern “AI Native” infrastructure didn’t replace these projects—it builds on them:
| Traditional Project | AI Native Evolution | Example |
|---|---|---|
| Hadoop HDFS | Distributed model storage | HDFS for datasets, S3 for checkpoints |
| Kafka | Real-time feature pipelines | Kafka → Feature Store → Model Serving |
| Spark ML | Distributed ML training | MLlib → PyTorch Distributed |
| Elasticsearch | Vector search | ES → Weaviate/Qdrant/Milvus |
| Kubernetes | ML orchestration | K8s → Kubeflow/KServe |
| Istio | AI Gateway service mesh | Istio → LLM Gateway with mTLS |
| Airflow | ML pipeline orchestration | Airflow → Prefect/Flyte for ML |
This post honors these projects, but we’re also removing them from our AI Resources list. Here’s why:
They’re not “AI Projects”—they’re foundational infrastructure.
But their absence doesn’t diminish their importance.
By removing them, we acknowledge that:
The next time you build on this stack, remember: you’re standing on the shoulders of Hadoop, Kafka, Elasticsearch, Kubernetes, and countless others. They built the roads we now drive on.
Just as Hadoop and Kafka enabled modern AI, today’s AI infrastructure will become tomorrow’s foundation:
The cycle continues. The giants of today will be the foundations of tomorrow.
As we clean up our AI Resources list to focus on AI-native projects, we don’t forget where we came from. Traditional big data and cloud native infrastructure made the AI revolution possible.
To the Hadoop committers, Kafka maintainers, Kubernetes contributors, and all who built the foundation: Thank you.
Your work enabled ChatGPT, enabled Transformers, enabled everything we now call “AI.”
Standing on your shoulders, we see further.
Acknowledgments: This post was inspired by the need to refactor our AI Resources list. The 27 projects mentioned here are being removed—not because they’re unimportant, but because they deserve their own category: The Foundation.
2026-02-06 20:56:35
Time flies—it’s already been a month since I joined Dynamia. In this article, I want to share my observations from this past month: why AI Native Infra is a direction worth investing in, and some considerations for those thinking about their own career or technical direction.
After nearly five years of remote work, I officially joined Dynamia last month as VP of Open Source Ecosystem. This decision was not sudden, but a natural extension of my journey from cloud native to AI Native Infra.
But this article is not just about my personal choice. I want to answer a more universal question: In the wave of AI infrastructure startups, why is compute governance a direction worth investing in?
For the past decade, I have worked continuously in the infrastructure space: from Kubernetes to Service Mesh, and now to AI Infra. I am increasingly convinced that the core challenge in the AI era is not “can the model run,” but “can compute resources be run efficiently, reliably, and in a controlled manner.” This conviction has only grown stronger through my observations and reflections during this first month at Dynamia.
This article answers three questions: What is AI Native Infra? Why is GPU virtualization a necessity? Why did I choose Dynamia and HAMi?
The core of AI Native Infrastructure is not about adding another platform layer, but about redefining the governance target: expanding from “services and containers” to “model behaviors and compute assets.”
I summarize it as three key shifts:
In essence, AI Native Infra is about upgrading compute governance from “resource allocation” to “sustainable business capability.”
Many teams focus on model inference optimization, but in production, enterprises first encounter the problem of “underutilized GPUs.” This is where GPU virtualization delivers value.
In short: GPUs must not only be allocatable, but also splittable, isolatable, schedulable, and governable.
This is the most frequently asked question. Here is the shortest answer:
Open source projects are not the same as company products, but the two evolve together. HAMi drives industry adoption and technical trust, while Dynamia brings these capabilities into enterprise production environments at scale. This “dual engine” approach is what makes Dynamia unique.
HAMi (Heterogeneous AI Computing Virtualization Middleware) delivers three key capabilities on Kubernetes:
Currently, HAMi has attracted over 360 contributors from 16 countries, with more than 200 enterprise end users, and its international influence continues to grow.
AI infrastructure is experiencing a new wave of startups. The vLLM team’s company raised $150 million, SGLang’s commercial spin-off RadixArk is valued at $4 billion, and Databricks acquired MosaicML for $1.3 billion—all pointing to a consensus: Whoever helps enterprises run large models more efficiently and cost-effectively will hold the keys to next-generation AI infrastructure.
Against this backdrop, the positioning of Dynamia and HAMi is even clearer. Many teams focus on “model performance acceleration” and “inference optimization” (like vLLM, SGLang), while we focus on “resource scheduling and virtualization”—enabling better orchestration of existing accelerated hardware resources.
The two are complementary: the former makes individual models run faster and cheaper, while the latter ensures that compute allocation at the cluster level is efficient, fair, and controllable. This is similar to extending Kubernetes’ CPU/memory scheduling philosophy to GPU and heterogeneous compute management in the AI era.
My observations this month have convinced me that compute governance is the most undervalued yet most promising area in AI infrastructure. If you are considering a career or technical investment, here is my assessment:
First, this is a real and urgent pain point
Model training and inference optimization attract a lot of attention, but in production, enterprises first encounter the problem of “underutilized GPUs”—structural idleness, scheduling failures, fragmentation waste, and vendor lock-in anxiety. Without solving these problems, even the fastest models cannot scale in production. GPU virtualization and heterogeneous compute scheduling are the “infrastructure below infrastructure” for enterprise AI transformation.
Second, this is a clear long-term track
Frameworks like vLLM and SGLang emerge constantly, making individual models run faster. But who ensures that compute allocation at the cluster level is efficient, fair, and controllable? This is similar to extending Kubernetes’ success in CPU/memory scheduling to GPU and heterogeneous compute management in the AI era. This is not something that can be finished in a year or two, but a direction for continuous construction over the next five to ten years.
Third, this is an open and verifiable path
Dynamia chose to build on HAMi as an open source foundation, first solving general capabilities, then supporting enterprise adoption. This means the technical direction is transparent and verifiable in the community. You can form your own judgment by participating in open source, observing adoption, and evaluating the ecosystem—rather than relying on the black-box promises of proprietary solutions.
Fourth, this is a window of opportunity that is opening now
AI infrastructure is being redefined. Investing in its construction today will continue to yield value in the coming years. The vLLM team’s company raised $150 million, SGLang’s commercial spin-off RadixArk is valued at $4 billion, Databricks acquired MosaicML for $1.3 billion—all validating the same trend: Whoever helps enterprises run large models more efficiently will hold the keys to next-generation AI infrastructure.
I hope to bring my experience in cloud native and open source communities to the next stage of HAMi and Dynamia: turning GPU resources from a “cost center” into an “operational asset.” This is not just my career choice, but my judgment and investment in the direction of next-generation infrastructure.
Add me on WeChat (ID: jimmysong) to join the HAMi community focused on GPU virtualization and heterogeneous compute scheduling.
If you are also interested in HAMi, GPU virtualization, AI Native Infra, or Dynamia, feel free to reach out.
From cloud native to AI Native Infra, my observations this month have only strengthened my conviction: The true upper limit of AI applications is determined by the infrastructure’s ability to govern compute resources.
HAMi addresses the fundamental issues of GPU virtualization and heterogeneous compute scheduling, while Dynamia is driving these capabilities into large-scale production. If you are also looking for a technical direction worth long-term investment, AI Native Infra—especially compute governance and scheduling—is a track with real pain points, a clear path, an open ecosystem, and an opening window of opportunity.
Joining Dynamia is not just a career choice, but a commitment to building the next generation of infrastructure. I hope the observations and reflections in this article can provide some reference for you as you evaluate technical directions and career opportunities.