Amazon Web Services announced an AI chatbot for enterprise use called Amazon Q, new generations of its AI training chips, expanded partnerships and more during AWS re:Invent, held from November 27 to December 1, in Las Vegas.
AWS CEO Adam Selipsky’s keynote, held on day two of the conference, focused on generative AI and how to enable organizations to train powerful models through cloud services.
Graviton4 and Trainium2 chips announced
AWS announced new generations of its Graviton chips, which are server processors for cloud workloads, and Trainium, which provides compute power for AI foundation model training.
Graviton4 (Figure A) has 30% better compute performance, 50% more cores and 75% more memory bandwidth than Graviton3, Selipsky said. The first instance based on Graviton4 will be the R8g Instances for EC2 for memory-intensive workloads, available through AWS.
Trainium2 is coming to Amazon EC2 Trn2 instances, and each instance will be able to scale up to 100,000 Trainium2 chips. That provides the ability to train a 300-billion parameter large language model in weeks, AWS stated in a press release.
Figure A
Anthropic will use Trainium and Amazon’s high-performance machine learning chip Inferentia for its AI models, Selipsky and Dario Amodei, chief executive officer and co-founder of Anthropic, announced. These chips may help Amazon muscle into Microsoft’s space in the AI chip market.
Amazon Bedrock: Content guardrails and other features added
Selipsky made several announcements about Amazon Bedrock, the foundation model building service, during re:Invent:
- Agents for Amazon Bedrock are generally available today.
- Custom models built with bespoke fine-tuning and ongoing pretraining are open in preview for customers in the U.S. today.
- Guardrails for Amazon Bedrock are coming soon; Guardrails lets organizations conform Bedrock to their own AI content limitations using a natural language wizard.
- Knowledge Bases for Amazon Bedrock, which bridge foundation models in Amazon Bedrock to internal company data for retrieval augmented generation, are now generally available in the U.S.
Amazon Q: Amazon enters the chatbot race
Amazon launched its own generative AI assistant, Amazon Q, designed for natural language interactions and content generation for work. It can fit into an enterprise’s existing identities, roles and security permissions.
Amazon Q can be used throughout an organization and can access a wide range of other business software. Amazon is pitching Amazon Q as business-focused and specialized for individual employees who may ask specific questions about their sales or tasks.
Amazon Q is especially suited for developers and IT pros working within AWS CodeCatalyst because it can help troubleshoot errors or network connections. Amazon Q will exist in the AWS management console and documentation within CodeWhisperer, in the serverless computing platform AWS Lambda, or in workplace communication apps like Slack (Figure B).
Figure B
Amazon Q has a feature that allows application developers to update their applications using natural language instructions. This feature of Amazon Q is available in preview in AWS CodeCatalyst today and will soon be coming to supported integrated development environments.
SEE: Data governance is one of the many factors that needs to be considered during generative AI deployment. (TechRepublic)
Many Amazon Q features within other Amazon services and products are available in preview today. For example, contact center administrators can access Amazon Q in Amazon Connect now.
Amazon S3 Express One Zone opens its doors
Amazon S3 Express One Zone, now in general availability, is a new S3 storage class purpose-built for high-performance, low-latency cloud object storage for frequently accessed data, Selipsky said. It’s designed for workloads that demand single-digit millisecond latency, such as finance or machine learning. Today, customers move data from S3 to custom caching solutions; with Amazon S3 Express One Zone, they can pick their own geographical availability zone and bring their frequently accessed data next to their high-performance computing. Selipsky said Amazon S3 Express One Zone can be run with 50% lower access costs than the standard Amazon S3.
Salesforce CRM available on AWS Marketplace
On Nov. 27, AWS announced Salesforce’s partnership with Amazon will extend to certain Salesforce CRM products accessed on AWS Marketplace. Specifically, Salesforce’s Data Cloud, Service Cloud, Sales Cloud, Industry Clouds, Tableau, MuleSoft, Platform and Heroku will be available for joint customers of Salesforce and AWS in the U.S. More products are expected to be available, and the geographical availability is expected to be expanded next year.
New options include:
- The Amazon Bedrock AI service will be available within Salesforce’s Einstein Trust Layer.
- Salesforce Data Cloud will support data sharing across AWS technologies including Amazon Simple Storage Service.
“Salesforce and AWS make it easy for developers to securely access and leverage data and generative AI technologies to drive rapid transformation for their organizations and industries,” Selipsky said in a press release.
Conversely, AWS will be using Salesforce products such as Salesforce Data Cloud more often internally.
Amazon removes ETL from more Amazon Redshift integrations
Extract, transform and load (ETL) can be a cumbersome part of coding with transactional data. Last year, Amazon announced a zero-ETL integration between Amazon Aurora MySQL and Amazon Redshift.
Today AWS introduced more zero-ETL integrations with Amazon Redshift:
- Aurora PostgreSQL
- Amazon RDS for MySQL
- Amazon DynamoDB
All three are available globally in preview now.
The next thing Amazon wanted to do is make search in transactional data smoother; many people use Amazon OpenSearch Service for this. In response, Amazon announced DynamoDB zero-ETL with OpenSearch Service is available today.
Plus, in an effort to make data more discoverable in Amazon DataZone, Amazon added a new capability to add business descriptions to data sets using generative AI.
Introducing Amazon One Enterprise authentication scanner
Amazon One Enterprise enables security management for access to physical locations in industries such as hospitality, education or technology. It’s a fully managed online service paired with the Amazon One palm scanner for biometric authentication administered through the AWS Management Console. Amazon One Enterprise is currently available in preview in the U.S.
NVIDIA and AWS make cloud pact
NVIDIA announced a new set of GPUs available through AWS: the NVIDIA L4 GPUs, NVIDIA L40S GPUs and NVIDIA H200 GPUs. AWS will be the first cloud provider to bring the H200 chips with NVLink to the cloud. Through this link, the GPU and CPU can share memory to speed up processing, NVIDIA CEO Jensen Huang explained during Selipsky’s keynote. Amazon EC2 G6e instances featuring NVIDIA L40S GPUs and Amazon EC2 G6 instances powered by L4 GPUs will start to roll out in 2024.
In addition, the NVIDIA DGX Cloud, NVIDIA’s AI building platform, is coming to AWS. An exact date for its availability hasn’t yet been announced.
NVIDIA brought on AWS as a primary partner in Project Ceiba, NVIDIA’s 65 exaflop supercomputer including 16,384 NVIDIA GH200 Superchips.
NVIDIA NeMo Retriever
Another announcement made during re:Invent is the NVIDIA NeMo Retriever, which allows enterprise customers to supply more accurate responses from their multimodal generative AI applications using retrieval-augmented generation.
Specifically, NVIDIA NeMo Retriever is a semantic-retrieval microservice that connects custom LLMs to applications. NVIDIA NeMo Retriever’s embedding models set up the semantic relationships between words. Then, that data is fed into an LLM, which processes and analyzes the textual data. Business customers can connect that LLM to their own data sources and knowledge bases.
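The retrieval-augmented generation flow described above (embed text, find the closest internal documents, pass them to an LLM along with the question) can be illustrated with a minimal Python sketch. This is not NVIDIA NeMo Retriever’s API: the bag-of-words "embedding" stands in for a learned embedding model, the function names (embed, cosine, retrieve, build_prompt) are hypothetical, and the knowledge base entries are invented for illustration. The final LLM call is left out; the sketch stops at the prompt an LLM would receive.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Combine the retrieved context and the question into a prompt for an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical internal knowledge base entries, invented for illustration.
KNOWLEDGE_BASE = [
    "Expense reports are due on the fifth business day of each month.",
    "The VPN client must be updated before connecting from a new laptop.",
    "Quarterly sales reviews are held in the Las Vegas office.",
]

if __name__ == "__main__":
    print(build_prompt("When are expense reports due?", KNOWLEDGE_BASE))
```

In a production system, the count vectors would be replaced by dense vectors from an embedding model and the linear scan by a vector index, but the retrieve-then-prompt structure stays the same.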
NVIDIA NeMo Retriever is available in early access now through the NVIDIA AI Enterprise Software platform, where it can be accessed through the AWS Marketplace.
Early partners working with NVIDIA on retrieval-augmented generation services include Cadence, Dropbox, SAP and ServiceNow.
Note: TechRepublic is covering AWS re:Invent virtually.