AWS Piles Features Into SageMaker, Bedrock


By: Mary Jander


LAS VEGAS, NV — AWS is charging hard to simplify the processes involved in creating generative AI applications. The goal is to make it easier for the cloud provider to score new customers.

In a keynote talk at the AWS re:Invent conference this week, Swami Sivasubramanian, VP of AI and Data at AWS, outlined a raft of new services that streamline model training and inferencing for enterprise customers.

“We’re seeing the convergence of big data, analytics, machine learning, and generative AI,” said Sivasubramanian. And AWS has built on a range of past successes to meet demand, he said.

The central platform supporting this trend is AWS SageMaker, he said, which helps firms such as Intuit and GE Healthcare build and deploy machine learning models. SageMaker, now dubbed SageMaker AI, sports a bunch of enhancements designed to grease the wheels of ML-based analytics, including the following:

SageMaker HyperPod flexible training plans. At the heart of SageMaker is the distribution, monitoring, and orchestration of clusters of model accelerators based on AWS Trainium or NVIDIA GPU instances. Now users can feed in the parameters of what they’ll require for a particular ML workload, and SageMaker AI will create a training plan that fits their requirements and timeframe—and set up the infrastructure and run the workload.
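The plan-selection idea described above can be sketched in a few lines: given a workload's capacity needs, training duration, and deadline, pick the cheapest capacity window that fits. All names, prices, and dates below are invented for illustration; this is not the SageMaker AI API.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class CapacityOffer:
    start: date           # when the reserved accelerators become available
    instances: int        # accelerator instances in the offer
    price_per_hour: float

def pick_plan(offers, needed_instances, train_hours, deadline):
    """Return the cheapest offer with enough capacity to finish by deadline."""
    feasible = [
        o for o in offers
        if o.instances >= needed_instances
        and o.start + timedelta(hours=train_hours) <= deadline
    ]
    return min(feasible, key=lambda o: o.price_per_hour, default=None)

# Hypothetical capacity windows a provider might offer.
offers = [
    CapacityOffer(date(2024, 12, 10), 16, 42.0),
    CapacityOffer(date(2024, 12, 5), 8, 35.0),
    CapacityOffer(date(2024, 12, 6), 16, 39.5),
]
best = pick_plan(offers, needed_instances=16, train_hours=72,
                 deadline=date(2024, 12, 20))
```

The real service would also provision the chosen infrastructure and run the workload; the sketch covers only the matching step.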

SageMaker HyperPod task governance. SageMaker HyperPod now coordinates modeling workloads to ensure that no GPU instance is left waiting. Through a virtual dashboard, administrators can set priorities for ML workloads and assign specific compute resources to get things done. AWS gives an example: “[W]hen training for a high-priority model needs to be completed as soon as possible but all compute resources are in use, HyperPod frees up resources from lower-priority tasks to support the training. HyperPod pauses the low-priority task, saves the checkpoint, and reallocates the freed-up compute resources. The preempted low-priority task will resume from the last saved checkpoint as resources become available again.”
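The preempt-and-checkpoint behavior AWS describes can be illustrated with a minimal in-memory scheduler. This is an invented stand-in, not the HyperPod API: lower number means higher priority, and when a high-priority task arrives with all slots busy, the lowest-priority running task is checkpointed, requeued, and later resumes from its saved step.

```python
import heapq

class Scheduler:
    def __init__(self, slots):
        self.slots = slots
        self.running = {}       # name -> {"priority": int, "step": int}
        self.waiting = []       # heap of (priority, name)
        self.checkpoints = {}   # name -> last saved training step

    def tick(self, steps=1):
        for task in self.running.values():
            task["step"] += steps  # simulate training progress

    def submit(self, name, priority):
        if len(self.running) < self.slots:
            self.running[name] = {"priority": priority,
                                  "step": self.checkpoints.get(name, 0)}
            return
        victim = max(self.running, key=lambda n: self.running[n]["priority"])
        if priority < self.running[victim]["priority"]:
            # Preempt: checkpoint the low-priority task and free its slot.
            state = self.running.pop(victim)
            self.checkpoints[victim] = state["step"]
            heapq.heappush(self.waiting, (state["priority"], victim))
            self.running[name] = {"priority": priority,
                                  "step": self.checkpoints.get(name, 0)}
        else:
            heapq.heappush(self.waiting, (priority, name))

    def finish(self, name):
        self.running.pop(name, None)
        if self.waiting:
            prio, nxt = heapq.heappop(self.waiting)
            self.submit(nxt, prio)  # resumes from its checkpoint, if any

sched = Scheduler(slots=1)
sched.submit("low-priority-finetune", priority=5)
sched.tick(100)                            # low task reaches step 100
sched.submit("urgent-train", priority=1)   # preempts; checkpoint saved
sched.finish("urgent-train")               # low task resumes at step 100
```

The key design point mirrored from the AWS description is that preemption saves state first, so the low-priority task loses no completed work when it resumes.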


Swami Sivasubramanian, VP of AI and Data, AWS. Source: AWS

There were more SageMaker announcements at the conference as well.

Bedrock Still in the Spotlight

Sivasubramanian also described a number of enhancements to Amazon Bedrock, the model library and workbench that was the focus of several announcements yesterday as well. Among these is the addition of prompt caching to Bedrock, which avoids repeatedly reprocessing common prompts that can tax compute resources. By caching these prompts, AWS claims customers can reduce compute costs by up to 90% and latency by up to 85% for supported models in Bedrock.
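The idea behind prompt caching can be sketched simply: when many requests share a long prompt prefix (a system prompt, shared context documents), the processed state of that prefix is computed once and reused, so only each request's unique suffix pays full cost. The cache and "state" below are invented stand-ins, not the Bedrock implementation.

```python
import hashlib

class PromptCache:
    def __init__(self):
        self.cache = {}   # prefix hash -> precomputed prefix state
        self.hits = 0
        self.misses = 0

    def get_prefix_state(self, prefix: str):
        key = hashlib.sha256(prefix.encode()).hexdigest()
        if key in self.cache:
            self.hits += 1    # cached: skip reprocessing the prefix
        else:
            self.misses += 1
            self.cache[key] = f"state({len(prefix)} chars)"  # stand-in
        return self.cache[key]

cache = PromptCache()
# A long shared system prompt that every request repeats.
system_prompt = "You are a support assistant for ExampleCo. " * 50
for question in ["Reset my password", "Cancel my order", "Update billing"]:
    state = cache.get_prefix_state(system_prompt)  # shared prefix, cheap after first use
    # ...run the model on `state` plus the unique question here...
```

Three requests share one prefix, so the expensive work happens once; the savings AWS cites come from exactly this reuse at model scale.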

Another feature, Bedrock Intelligent Prompt Routing, directs each prompt to the foundation model best able to deliver a quality result at the lowest cost. Presently, the feature routes requests between Claude 3.5 Sonnet and Claude Haiku, or between Llama 3.1 8B and Llama 3.1 70B. Both prompt caching and Intelligent Prompt Routing are in preview.
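A toy version of the routing idea: send each prompt to the cheapest model predicted to handle it, escalating harder prompts to the stronger model. The difficulty heuristic, model names, and costs below are placeholders, not AWS's routing logic or pricing.

```python
# Hypothetical model catalog: a cheap small model and a capable large one.
MODELS = [
    {"name": "small-model", "cost_per_1k_tokens": 0.00025, "capability": 1},
    {"name": "large-model", "cost_per_1k_tokens": 0.00300, "capability": 3},
]

def estimate_difficulty(prompt: str) -> int:
    """Toy heuristic: long or multi-step prompts count as 'hard'."""
    hard_markers = ("explain why", "step by step", "compare", "prove")
    if len(prompt) > 500 or any(m in prompt.lower() for m in hard_markers):
        return 3
    return 1

def route(prompt: str) -> str:
    """Pick the cheapest model whose capability meets the prompt's difficulty."""
    difficulty = estimate_difficulty(prompt)
    capable = [m for m in MODELS if m["capability"] >= difficulty]
    return min(capable, key=lambda m: m["cost_per_1k_tokens"])["name"]
```

A production router would predict response quality with a model rather than keyword heuristics, but the cost-versus-capability tradeoff is the same.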

Bedrock’s Retrieval Augmented Generation (RAG) platform, Amazon Bedrock Knowledge Bases, now supports natural-language retrieval of structured data for use in AI workloads. A new GraphRAG function, in preview, identifies relationships within data sources, eliminating extra steps formerly needed to obtain RAG input. GraphRAG requires Amazon Neptune Analytics, AWS’s graph analytics database.
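The GraphRAG idea can be sketched as graph expansion: rather than retrieving each record in isolation, follow explicit relationships between entities so related context comes back in one retrieval. The graph and entity names below are invented; the actual feature builds on Amazon Neptune Analytics.

```python
from collections import deque

# Hypothetical entity graph linking records that plain keyword
# retrieval would return separately, if at all.
GRAPH = {
    "Order-123": ["Customer-7", "Product-42"],
    "Customer-7": ["SupportTicket-9"],
    "Product-42": [],
    "SupportTicket-9": [],
}

def retrieve_with_neighbors(seed: str, hops: int = 2) -> set:
    """Breadth-first expansion: collect entities within `hops` of the seed."""
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in GRAPH.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

context = retrieve_with_neighbors("Order-123")
```

Starting from the order, the traversal also surfaces the customer, the product, and the customer's open support ticket, which is the kind of multi-hop context assembly that previously required extra pipeline steps.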

These are just a few of the Bedrock announcements from AWS this week.

During these and other announcements, Sivasubramanian stressed how AWS has built on past technologies to deliver AI assistance to enterprises looking to use GenAI for revenue-generating applications. While many of the solutions require prerequisite services from AWS, the results should make life easier for AWS users wrangling AI processes. All of which is meant to build the argument for putting AWS at the center of a cloud strategy.

Futuriom Take: In yesterday’s keynote, Swami Sivasubramanian, VP of AI and Data at AWS, outlined many new functions, both available and in preview, that should help enterprises adopt AI applications. These new services are clearly designed for AWS customers, or those whose primary cloud provider is AWS.