▶ @Yale Low Power Fabric for Brain-Computer Interfaces, Abhishek Bhattacharjee
▶ @ETH_en SoC for Visual Proc in Nano-UAVs, Alfio Di Mauro
▶ @Stanford Reconfig Array SoC for Dense Linear Algebra, Kathleen Feng
▶ @Arm Morello, Richard Grisenthwaite
5/n Machine Learning
▶ @GroqInc Tensor Streaming MP, Dennis Abts
▶ @UntetherAI 1456 RISC-V Core At-Memory Inference, Robert Beachler
▶ @Tesla DOJO Microarchitecture, Emil Talpes
▶ @Tesla DOJO System Scaling, Bill Chang
▶ @CerebrasSystems HW/SW Co-Design of WSE, Sean Lie
▶ @NVIDIA Orin, Michael Ditty
▶ @NODAR 3D Vision, Leaf Jiang
▶ @NVIDIA Grace, Jonathon Evans
8/n Mobile & Edge
▶ @AMD Ryzen 6000, Jim Gibney
▶ @Intel Meteor Lake and Arrow Lake, Wilfred Gomes
▶ @Mediatek Dimensity 9000, Hugh Mair
▶ @Intel Xeon D 2700/1700, Praveen Mosur
9/n The Day 0 Tutorials
▶ CXL Overview
▶ CXL 2/3 Coherency, Fabric
▶ MLIR (Multi-Level Intermediate Representation) from Google, Nod.AI, Arm, SiFive, Microsoft
10/ That's all the talks, as of 24th May. Feels like a real chip conference this year, covering a lot of areas and not losing too much to ML. Looking forward to insights into Dojo, Optical, networking, the chip deep dives
So, discussing all-core frequency on Ryzen 7000. We saw the demo with 5.5 GHz peak, and AMD said 5.2-5.5 GHz was common for that game.
We are doing some napkin math about what a proper workload might be. Thread (1/n):
So certain games don't tax the CPU all that much. The code path doesn't spread out, doesn't use many execution units, and it could be a very light workload. The core power requirements might be low, and so frequency can be boosted.
As we see with CPU tests, some tests hammer the core with high IPC (P95), others with low IPC (Cinebench).
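To put a rough shape on that argument, here is a minimal sketch using the textbook first-order dynamic power model, P ~ activity x C x V^2 x f. It is not AMD's boost algorithm. It assumes voltage scales linearly with frequency (so power goes as activity x f^3), ignores leakage, thermal limits and any hard frequency cap, and every number fed into it is hypothetical. The name `boosted_frequency` is made up for illustration.

```python
def boosted_frequency(f_ref_ghz, activity_ratio):
    """Estimate the frequency reachable at the same core power budget.

    f_ref_ghz:      frequency sustained under the reference (heavy) workload.
    activity_ratio: switching activity of the lighter workload relative to
                    the reference workload (1.0 means identical).

    Assumption: P ~ activity * C * V^2 * f with V proportional to f,
    so at fixed power f scales with (1 / activity_ratio) ** (1/3).
    """
    if activity_ratio <= 0:
        raise ValueError("activity_ratio must be positive")
    return f_ref_ghz * (1.0 / activity_ratio) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical inputs: a heavy workload holding 5.0 GHz, and lighter
    # workloads that toggle only a fraction of the core's logic.
    for ratio in (1.0, 0.9, 0.75):
        f = boosted_frequency(5.0, ratio)
        print(f"activity {ratio:.2f} -> {f:.2f} GHz at the same core power")
```

The cube-root relationship is the point here: under these assumptions even a sizeable drop in core activity buys only a modest frequency gain, which is why the per-core power curve matters for the napkin math.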
With the Ryzen 7000, let's work on core power. We'll start with this graph of core power, under a high IPC workload, for the 7nm 5950X:
Q: Lot going on macro. 54-55% organic growth in 1Q. Puts and Takes? Supply? Server?
A: Strong Q1, lots going on. Strength in Q1 was broad - gain share in server, bring supply online, strong semi and C&G. Softness in PC, but shift mix ASP to premium. Into Q2, lots in play, but managed supply well, work with customers. Xilinx has high demand
Q: FY22 Guidance - expect upside over 31% organic growth? View on DC Capex and PC? AMD is being conservative in PC?
EUV + FinFET
50nm gate pitch
30nm fin pitch
40nm min metal pitch
16 metal layers
Enhanced Copper at lower layers for lower line resistance
8 VT options (4N+4P)
Claims of 2x area scaling of HP logic library, plus +20% perf at iso-power over Intel 7.
#VLSI22 Thread Part 2:
Also from Intel, Low power 6T SRAM on Intel 4:
Old 6T design:
5.8x power at 23.8 Mb/mm2
Old 8T design:
1.0x power at 13.7 Mb/mm2
New 6T design:
1.03x power at 19.4 Mb/mm2
TL;DR can now offer low power SRAM at better density. No word on latency
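The trade-off is easier to see as ratios. A quick sketch using only the power and density figures quoted above; the dictionary keys and the `compare` helper are names made up for this example.

```python
# Relative power and bit density (Mb/mm^2) as quoted in the thread.
designs = {
    "old_6t": {"power": 5.8, "density": 23.8},
    "old_8t": {"power": 1.0, "density": 13.7},
    "new_6t": {"power": 1.03, "density": 19.4},
}


def compare(new, base):
    """Return (density ratio, power ratio) of `new` relative to `base`."""
    n, b = designs[new], designs[base]
    return n["density"] / b["density"], n["power"] / b["power"]


if __name__ == "__main__":
    for base in ("old_8t", "old_6t"):
        density_ratio, power_ratio = compare("new_6t", base)
        print(f"new_6t vs {base}: {density_ratio:.2f}x density, "
              f"{power_ratio:.2f}x power")
```

On those numbers the new 6T cell lands at roughly 1.4x the density of the old 8T design for about 3% more power, and at roughly 0.8x the density of the old 6T design for under a fifth of its power.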
It's great that Arc supports AV1 encode. But to say it's great for streaming right away is not quite right.
No streaming service currently deals with a direct AV1 streaming upload iirc. We're still a few quarters (up to 2yrs+) away from that. Correct me if I'm wrong. 1/
2/ Netflix can deliver AV1 for your decode.
YouTube can deliver AV1 for your decode.
You can upload pre-recorded to YouTube in AV1.
You can't stream to YouTube in AV1.
You can't stream to Twitch in AV1.
Again, correct me if I'm wrong, but Intel said this in briefings.
3/ Even if you record in AV1 offline to upload later, YouTube doesn't yet use dedicated AV1 hardware to process (wait for VCU2?).
So your AV1 pre-recorded video takes longer to convert on their backend. Only useful if you have, or are already hitting, upload limits.