Here for the @AMD DC event. Starts at 10am PT, follow this thread along with the stream! 🧵
I expect to see @LisaSu, Mark Papermaster, Forrest Norrod, and Victor Peng on stage talking about #AI, #Bergamo, and #MI300
youtube.com/live/l3pe_qx95…
I'm with these goobers!
@dylan522p @PaulyAlcorn @Patrick1Kennedy
Lisa on stage
Optimizing for different workloads in the DC, including AI
Focused on building industry-standard CPUs. Now the standard in the cloud: 640 EPYC instances available in the cloud today
Genoa in November: 96 cores, PCIe Gen 5, CXL
Enterprise leadership in industry-standard workloads
Power efficiency is #1 on industry standard efficiency tests
Best CPU for AI on the market, per TPCx-AI vs the competition
Details on these tests are likely in the back of the slide deck
AWS to the stage
Happy Lisa
AWS Nitro + 4th Gen EPYC. I think this is a #Bergamo comment
New M7a instances. Best price/perf x86 EC2 instance. Video transcoding, simulation, BF16
AMD uses these instances internally for data analytics workloads
Expanding to EDA too
But future workloads need optimised infra
Now for cloud native computing - scale out with containers. Benefit from density and energy efficiency. Enter #Bergamo
128 cores per socket, 82B transistors
Uses the same IO die, but new 16-core core dies (CCDs).
Optimised for density, not peak perf, but exactly the same ISA for software compatibility. The core is 35% smaller and physically optimized, with the same socket and same IO support. Same platform support as Genoa
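Quick napkin math on where the density claim comes from. The per-CCD layouts below are from AMD's public materials, but treat the node math as an illustrative sketch, not an official comparison:

```python
# Back-of-envelope core counts: Bergamo (Zen 4c) vs Genoa (Zen 4).
# CCD counts per AMD's public materials; the 2P-node math is illustrative.
genoa_ccds, genoa_cores_per_ccd = 12, 8      # Zen 4 CCDs, 8 cores each
bergamo_ccds, bergamo_cores_per_ccd = 8, 16  # Zen 4c CCDs, 16 cores each

genoa_cores = genoa_ccds * genoa_cores_per_ccd        # 96 cores/socket
bergamo_cores = bergamo_ccds * bergamo_cores_per_ccd  # 128 cores/socket

print(f"Genoa:   {genoa_cores}/socket, {2 * genoa_cores} per 2P node")
print(f"Bergamo: {bergamo_cores}/socket, {2 * bergamo_cores} per 2P node")
```

Fewer, denser CCDs on the same IO die is how the core count climbs without changing the platform.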
Up to 2.6x vs the competition in cloud-native workloads
Double the density, double the efficiency, vs the competition
Shipping now in volume to hyperscale customers. Meta to the stage
Enablement through an open source platform via OCP. Meta is using AMD in AI
Meta can rely on AMD to deliver, time and time again
Deploying #Bergamo internally, 2.5x over Milan, substantial TCO gains. Easy decision. Partnered with AMD to provide design optimisations at a silicon level too
Time for #Genoa-X. Dan McNamara to the stage
This is all technical computing.
2nd-gen V-Cache for over 1GB of L3 per socket
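A sketch of where "over 1GB of L3 per socket" comes from. The per-CCD figures are my assumptions based on Zen 4's 32MB base L3 plus a 64MB stacked V-Cache die, as on Milan-X:

```python
# Genoa-X L3 napkin math, assuming 12 CCDs on the top SKU,
# each with 32MB base L3 + 64MB of stacked V-Cache (assumption).
ccds = 12
base_l3_mb = 32
vcache_mb = 64
total_l3_mb = ccds * (base_l3_mb + vcache_mb)
print(total_l3_mb, "MB of L3 per socket")  # 1152 MB, comfortably over 1GB
```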
Four new SKUs, 16-96 cores, available today
Xeon 8490H vs Genoa-X
These slides went fast. Azure on stage to talk about HPC
Ansys Fluent: 3.6x over first-gen EPYC, using Milan-X
Memory optimized HX instances with Genoa-X.
Customer adoption: Petronas (tie-in with Mercedes F1?). Looks like oil and gas is getting back in the limelight as an important vertical
GA on azure for #Genoa-X
Now for #Siena. Coming later this year
Citadel talking about workloads requiring 100k cores and 100PB databases. Moved to latest-gen AMD for a 35% speedup.
1 million cores*. Forrest says very few workloads require that much, so efficiency and performance matter.
Density is required to be as close to the financial market as possible. Latency is key, so Xilinx is also in the pipeline.
Here we go. Using alveo.
Solarflare NICs for millions of trades a day. Architecture needs to be optimized together.
Same thinking drove the acquisition of #Pensando. Network complexity has exploded. Managing these resources is more complicated, especially with security.
Removing the CPU tax due to the infrastructure, before you even get to load balancing
#Pensando P4 DPU. Forrest calls it the best networking architecture team in the industry
Freeing the CPU from its overhead
I'm in the wrong seat. Can't see any of the hardware images. Can see all the text though
SmartNICs already in the cloud. Available as VMware vSphere solutions.
New P4 DPU offload in a switch. #Pensando silicon alongside the switching silicon.
HPE Aruba switch
Enables end to end security solutions
Just says this was the first half of the presentation. So now: AI
Aiaiaiaiaiaiai
Lisa back to the stage. AI is the next big megatrend
AMD wants to accelerate AI solutions at scale. AMD has #AI hardware
AMD already has lots of AI partners
$150B TAM by 2027
That includes CPU and GPU
Better photo. AMD going down the HPCxAI route.
Victor Peng to the stage!
The journey of AMD's AI stack. Proven at HPC scale
AI software platforms. Edge requires Vitis
Reminder: it's "Rock-'em", not "Rock-emm".
Lots of ROCm is open source
Running 100k+ validation tests nightly on latest AI configs
PyTorch founder to the stage
"I'm excited about #MI300" - PyTorch founder
Day 0 support for #ROCm on PyTorch 2.0
Not every model is an LLM
@huggingface CEO on the stage.
New @AMD and @huggingface partnership being announced today. Instinct, radeon, ryzen, versal. AMD hardware in HF regression testing. Native optimization for AMD platforms.
We're talking training and inference. AMD hardware has advantages.
Still waiting for #MI300!
Lisa back to the stage.
At the center is the GPU
New compute engine on CDNA3
Now sampling #MI300A
13 chiplets
CPU chiplets can be swapped out for a GPU-only version
So you replace 3 CPU chiplets with 2 GPU chiplets and add more HBM, for a total of 192GB of HBM3. That's 5.2 TB/sec of memory bandwidth.
153 BILLION TRANSISTORS.
That's #MI300X from @AMD
The H100 only has 80GB. That means AMD gets better scaling, better TCO, and reduced overhead.
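The capacity argument is easy to sketch: fewer accelerators are needed just to hold a model's weights. The parameter counts and 2-byte (FP16/BF16) weights below are illustrative assumptions, and real deployments also need headroom for KV cache and activations:

```python
import math

def accels_needed(params_billions: float, hbm_gb: int,
                  bytes_per_param: int = 2) -> int:
    """Minimum accelerators to hold just the model weights in HBM."""
    weights_gb = params_billions * 1e9 * bytes_per_param / 1e9
    return math.ceil(weights_gb / hbm_gb)

# Illustrative model sizes only, not benchmark results.
for params in (40, 70, 175):
    print(f"{params}B params: "
          f"MI300X(192GB) x{accels_needed(params, 192)}, "
          f"H100(80GB) x{accels_needed(params, 80)}")
```

By this rough measure, a 70B-parameter FP16 model (140GB of weights) fits on a single 192GB part but needs two 80GB parts, which is the scaling/TCO point being made on stage.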
8x #MI300X in OCP infrastructure for open standards. Accelerates TTM and decreases dev costs. Easy to implement. $AMD
#MI300X for LLMs
That's a wrap for today. More sessions later, not sure about embargoes, but will say what I can when I can!