Enthusiasm around generative AI has produced a wave of AI startups and is fueling massive investment that Goldman Sachs predicts will surpass $1 trillion over the next few years. Amazon is just the latest to put its money where its mouth is, announcing a $110 million investment in generative AI to fund its Build on Trainium program. Build on Trainium will provide compute hours for researchers to envision, experiment with, and create new AI architectures, machine learning (ML) libraries, and performance optimizations designed for large-scale, distributed AWS Trainium UltraClusters. Trainium UltraClusters are essentially cloud-based collections of AI accelerators that can be unified into a single system to tackle highly complex computational tasks.
Built on AWS Trainium Chips
The AWS Trainium chip is tailored for deep learning training and inference. Any AI advances that emerge from this Amazon generative AI investment will be made broadly available as open-source offerings. Researchers can tap into the Trainium research UltraCluster, which offers up to 40,000 Trainium chips optimized for AI workloads, far more computational power than most academic institutions could ever hope to afford or assemble on their own.
Because high-performance computing resources, graphics processing units (GPUs), and other elements of the AI arsenal don't come cheap, budget constraints could stall AI progress. This Amazon AI investment will help some university-based students and researchers overcome such constraints. One example is the Catalyst research group at Carnegie Mellon University (CMU) in Pittsburgh, Pennsylvania, which is using Build on Trainium to study ML systems and develop compiler optimizations for AI.
“AWS’s Build on Trainium initiative enables our faculty and students large-scale access to modern accelerators, like AWS Trainium, with an open programming model,” said Todd C. Mowry, a professor of computer science at CMU. “It allows us to greatly expand our research on tensor program compilation, ML parallelization, and language model serving and tuning.”
To accelerate AI innovation, Amazon has also been investing in technology that makes researchers' lives easier. For example, its Neuron Kernel Interface (NKI) gives researchers direct access to the AWS Trainium instruction set, letting them quickly build optimized compute kernels for their new models and large language models (LLMs).
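For a concrete sense of what NKI programming looks like, here is a minimal sketch of an element-wise addition kernel, modeled on AWS's published NKI samples. It assumes a Neuron SDK environment with the neuronxcc package installed; decorator names and module paths can vary between SDK releases, so treat it as illustrative rather than definitive.

```python
from neuronxcc import nki
import neuronxcc.nki.language as nl


@nki.jit
def nki_tensor_add_kernel(a_input, b_input):
    """Element-wise addition of two tensors on a Trainium NeuronCore."""
    # Element-wise ops require matching shapes.
    assert a_input.shape == b_input.shape

    # Keep the first dimension within the on-chip tile partition limit;
    # larger tensors would need explicit tiling across multiple loads.
    assert a_input.shape[0] <= nl.tile_size.pmax

    # Allocate the output tensor in device memory (HBM).
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype,
                          buffer=nl.shared_hbm)

    # Load both inputs from HBM into on-chip SBUF memory.
    a_tile = nl.load(a_input)
    b_tile = nl.load(b_input)

    # The addition itself runs on-chip.
    c_tile = a_tile + b_tile

    # Write the result back to HBM and return it to the caller.
    nl.store(c_output, value=c_tile)
    return c_output
```

On the host side, a kernel like this can be invoked as an ordinary function on tensors that live on a Neuron device (for example, through torch-neuronx), with the Neuron compiler lowering it to Trainium instructions. By reducing the barrier to this kind of hand-tuned optimization, NKI points toward one of the first breakthroughs you can expect to see: more focused, smaller-scale LLMs.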
“Small, purpose-built LLMs will address specific generative AI and agentic AI use cases,” said Kevin Cochrane, CMO of cloud infrastructure provider Vultr. “2025 will see increased attention to matching AI workloads with optimal compute resources, driving exponential demand for specialized GPUs.”