Dylan Patel
Oct 18 23 tweets 22 min read
#OCPSummit22 kicking off. First keynote by Intel
"We have an amazing track record of improving energy efficiency" - @intel Zane Bell
Umm...
He's talking about datacenters, but Moore's law slide here is a bit funny given the history.
The bit on server resilience is very important.
Intel is releasing a spec for immersion cooling, and will offer warranty too
"Air is running out of steam. It's time to embrace immersion cooling" - @intel Zane Bell
"More energy in immersion cooling than ever, the time is now"
#OCPSummit22
Up next at #OCPSummit22 is @Meta
They do 90M AI inferences per second for Instagram!
Their models are massive, but compute is not as high for DLRMs due to continuous and categorical features stored in embedding tables.
@OpenComputePrj
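A rough sketch of why a DLRM can be huge in memory yet light on compute: an embedding lookup reads a handful of rows from a very large table and does almost no arithmetic on them. The table size, embedding dimension, and layer width below are made-up illustrative numbers, not figures from Meta's talk.

```python
# Illustrative numbers only (assumptions, not figures from the keynote).

def embedding_table_bytes(num_rows, dim, bytes_per_value=4):
    """Memory needed to hold one embedding table of fp32 values."""
    return num_rows * dim * bytes_per_value

def lookup_flops(ids_per_sample, dim):
    """Sum-pooling n looked-up rows costs (n - 1) * dim additions."""
    return max(ids_per_sample - 1, 0) * dim

def dense_layer_flops(in_dim, out_dim):
    """One multiply and one add per weight in a fully connected layer."""
    return 2 * in_dim * out_dim

table_bytes = embedding_table_bytes(num_rows=100_000_000, dim=128)
pool_flops = lookup_flops(ids_per_sample=20, dim=128)
mlp_flops = dense_layer_flops(in_dim=1024, out_dim=1024)

print(f"one embedding table: {table_bytes / 1e9:.1f} GB")
print(f"pooling FLOPs per sample: {pool_flops}")
print(f"one 1024x1024 dense layer, FLOPs per sample: {mlp_flops}")
```

With these assumed sizes a single table needs tens of gigabytes, while the per-sample arithmetic on it is orders of magnitude smaller than one modest dense layer.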
@Meta are announcing Grand Teton which is their new AI server.
Looks like it has 8x A100 PCIe, not SXM?
Not sure what CPU.
#OCPsummit22
New open rack v3 at @OpenComputePrj
48V
Up to 20kW
Flexible in what type of servers it supports.
Can support 300% transients!
#OCPSummit22
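Back-of-the-envelope arithmetic on the rack numbers above, as a sketch only. The 12V comparison bus is a hypothetical added for contrast, and reading "300% transients" as brief peaks at 3x the nominal load is an assumption, not something confirmed by the slides.

```python
def bus_current_amps(power_watts, bus_volts):
    """I = P / V for a DC busbar."""
    return power_watts / bus_volts

rack_power_w = 20_000                           # "up to 20kW" from the keynote
i_48v = bus_current_amps(rack_power_w, 48)      # 48V busbar from the keynote
i_12v = bus_current_amps(rack_power_w, 12)      # hypothetical 12V comparison

# Resistive loss in the same conductor scales with current squared (P = I^2 * R).
loss_ratio = (i_12v / i_48v) ** 2

# ASSUMPTION: "300% transients" means short peaks at 3x nominal load.
peak_current_48v = 3 * i_48v

print(f"48V bus current at 20kW: {i_48v:.0f} A")
print(f"12V bus current at 20kW: {i_12v:.0f} A")
print(f"I^2*R loss, 12V relative to 48V: {loss_ratio:.0f}x")
print(f"assumed 3x transient on 48V: {peak_current_48v:.0f} A")
```

The point of the higher bus voltage is the squared term: a quarter of the current means roughly a sixteenth of the conduction loss in the same copper.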
@Meta also talking about the issues in model training.
The interconnects aren't scaling and that leads to as much as 57% of time spent waiting for networking in model training
Discussed this in the article linked
#OCPSummit22
semianalysis.com/p/meta-discuss…
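To put the 57% figure in perspective, here is a minimal Amdahl-style sketch. It assumes the waiting time does not overlap with compute at all, which is a simplification; the function names are illustrative.

```python
def compute_fraction(network_wait_fraction):
    """Share of wall-clock time the accelerators spend doing useful work."""
    return 1.0 - network_wait_fraction

def max_speedup_if_network_free(network_wait_fraction):
    """Upper bound on speedup if all network waiting were eliminated."""
    return 1.0 / (1.0 - network_wait_fraction)

wait = 0.57  # worst case quoted in the talk
print(f"useful compute time: {compute_fraction(wait):.0%}")
print(f"best-case speedup from a perfect network: {max_speedup_if_network_free(wait):.2f}x")
```

Under that assumption, a perfect interconnect would make the worst-case training job a bit more than twice as fast on the same accelerators.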
@Meta is re-presenting these slides, but networking is taking most of the bandwidth and accelerators will need 1TB/s of accelerator-to-accelerator IO with a purpose-built non-blocking fabric.
Discussed in linked article.
semianalysis.com/p/meta-discuss…
#OCPSummit22 @OpenComputePrj
Datacenters used 1% to 1.5% of worldwide power in 2020.
By 2030, datacenters will consume 3% to 13% of worldwide power generation!
#OCPSummit22 @OpenComputePrj
"More compute with less power" - @jwittich is the goal for @AmpereComputing
They project in 2025, Ampere servers using @Arm ISA will require less than half the power + area as "legacy" x86 based servers.
Rack density, scalability, utilization rates, noisy neighbors
#OCPSummit22
Some harder figures and claims from @AmpereComputing @jwittich
Power efficiency in NGINX: 3.8x vs Ice Lake
Much more consistent performance, nearly no noisy neighbors
Scales near linearly in perf
Standard 12.8kW rack, more than double the cores and 2x the performance
#OCPSummit22
@AmpereComputing is now announcing their next generation Siryn based platforms.
They support 2U, 2P, using DDR5 2DPC and PCIe 5.0, based on the Siryn architecture which we detailed exclusively at SemiAnalysis!
@jwittich @OpenComputePrj #OCPSummit22
Here is where we exclusively detailed @AmpereComputing's next generation Siryn architecture.
semianalysis.com/p/is-ampere-co…
Next up is @Broadcom talking about Ethernet, the leader in Ethernet switches.
~600M Ethernet ports are shipped annually.
4x as many Ethernet ports as people born each year.
#OCPSummit22 @OpenComputePrj
"Basically give every newborn 4 ethernet ports when they are born and say happy birthday! Try that with Infiniband, so expensive it would exceed world GDP"
This was the funniest thing I've heard in a keynote ever
Broadcom savage on stage about @nvidia @NVIDIANetworkng
#OCPSummit22
Infiniband used to be far lower latency, but less flexible.
Broadcom is talking up their advancements with HPC Ethernet bringing the latency gap to 0.
The data they presented is from Los Alamos National Lab @LosAlamosNatLab
#OCPSummit22
Now talking about transient oversubscription, flow collisions, and incast overload.
Addressing these queues and load balancing is critical to maintain low latency.
#OCPSummit22
Broadcom acquired a new company to help with these issues.
They will maintain Ethernet leadership
Incast can be solved by drop congestion notification aka packet trimming.
Transient congestion with packet spraying
In-network telemetry can help buffer packets
#OCPSummit22
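A toy sketch of why packet spraying helps with flow collisions. It compares per-flow placement (every packet of a flow pinned to one link, as with ECMP-style hashing) against per-packet round-robin spraying. The flow sizes are hypothetical, and `flow_id % num_links` is only a stand-in for a real 5-tuple hash; this is not Broadcom's implementation.

```python
def flow_hash_loads(flows, num_links):
    """Per-flow placement: every packet of a flow uses the same link."""
    loads = [0] * num_links
    for flow_id, packets in flows.items():
        loads[flow_id % num_links] += packets  # stand-in for a 5-tuple hash
    return loads

def spray_loads(flows, num_links):
    """Per-packet spraying: packets are dealt round-robin across all links."""
    loads = [0] * num_links
    next_link = 0
    for packets in flows.values():
        for _ in range(packets):
            loads[next_link] += 1
            next_link = (next_link + 1) % num_links
    return loads

# Hypothetical traffic: two large flows (ids 0 and 4) hash to the same link.
flows = {0: 1000, 4: 1000, 1: 10, 2: 10}
hashed = flow_hash_loads(flows, num_links=4)
sprayed = spray_loads(flows, num_links=4)

print("per-flow hashing:", hashed)
print("packet spraying: ", sprayed)
```

The trade-off is that sprayed packets can arrive out of order, so the receiving side has to tolerate or repair reordering.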
@Meta head of networks on stage talking about closer collaboration with @Broadcom
Mentioned lots of open software and copackaged optics.
Does this mean that software from @AristaNetworks, a huge shipper of Broadcom silicon-based switch boxes to Meta, is now going to be dropped?
$ANET
Caliptra being announced by @Google alongside @Microsoft @AMD @nvidia
No Intel included is noteworthy.
Trusted computing IP will be hardened in silicon at the chip level and package level.
This seems huge!
#OCPSummit22
Caliptra is going to be HUGE
For confidential compute so cloud tenants can trust their data and code are secure even from the cloud provider.
It's an open source root of trust.
Reusable silicon block designed into many chips and validates their trustworthiness.
Wow
#OCPSummit22
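For readers new to the idea, here is a generic measured-boot sketch of what a root of trust does: it hashes each boot stage into a running measurement, so any tampering changes the final value that a verifier checks. This is a textbook TPM-style hash-extend written for illustration; it is not Caliptra's actual design or API, and the stage names are hypothetical.

```python
import hashlib

def extend(measurement: bytes, component: bytes) -> bytes:
    """new = SHA-384(old || SHA-384(component)), a TPM-style hash extend."""
    component_digest = hashlib.sha384(component).digest()
    return hashlib.sha384(measurement + component_digest).digest()

def measure_boot(stages):
    """Fold every boot stage, in order, into one running measurement."""
    measurement = bytes(48)  # all-zero starting value, SHA-384 sized
    for blob in stages:
        measurement = extend(measurement, blob)
    return measurement

# Hypothetical boot stages; real ones would be firmware images.
good = measure_boot([b"rom", b"firmware-v1", b"runtime"])
tampered = measure_boot([b"rom", b"firmware-EVIL", b"runtime"])

print("measurements match:", good == tampered)
```

Because each step hashes the previous measurement, a change to any stage, or to the order of stages, produces a different final value.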
Caliptra specifications, RTL, and firmware, which is written in Rust, are open.
0.5 released.
It's part of @OpenComputePrj and the @linuxfoundation CHIPS Alliance.
Microsoft has a live demo here at the conference. AMD, Microsoft, Nvidia, and Google supporting it.
#OCPSummit22
Microsoft and Nuvoton are releasing a new root of trust engine for BMCs too.
Integrated security system.
Determines what the BMC can access on debug requests.
TPM functionality.
Can ensure that BMCs are not modified!
#OCPSummit22
Project Kirkland announced.
Partnership with Microsoft, Google, Intel, Infineon.
Currently Infineon chip, but will be open sourced.
Trust for TPM and prevent physical bus and interconnect attacks.
#OCPSummit22