
Paul Triolo

Nov 24 • 32 tweets • 6 min read

#October7Surprise US export controls specifically on advanced computing-related semiconductors, graphics processing units (GPUs) in particular, are difficult to understand. A speculative thread. 🧵

Semiconductors that could fall w/in scope of new controls: GPUs, tensor processing units (TPUs), neural processors, in-memory + vision processors, text + adaptive processors, coprocessors/accelerators, field-programmable logic devices (FPLDs), and ASICs.

But how, and by whom, will it be decided which of these semiconductors, particularly in the complex system configurations now typical in the industry, meet the performance thresholds? Discussions with government and industry officials suggest some major confusion here. Some thoughts follow.

The new controls specific to processing units, Export Control Classification Number (ECCN) 3A090, set performance metrics along two related axes, data transfer and computational power.

The first is bidirectional throughput: a bidirectional transfer rate over all inputs/outputs of 600 Gigabyte/s or more to or from integrated circuits other than volatile memories. That is an important caveat.

The other hits on compute power: Tera Operations Per Second (TOPS): One or more digital processor units executing machine instructions having a bit length per operation multiplied by processing performance measured in TOPS, aggregated over all processor units, of 4800 or more.
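As a rough illustration of how the two metrics quoted above might be checked together, here is a minimal Python sketch. It assumes one literal reading of the quoted text: bit length times TOPS is summed over all processor units and compared with 4800, and the transfer rate is compared with 600 Gigabyte/s. Requiring both conditions at once is an assumption, and the names `ProcessorUnit`, `weighted_performance`, and `meets_thresholds`, along with the sample chips, are hypothetical.

```python
from dataclasses import dataclass

# Threshold values as quoted in the thread for ECCN 3A090.
TRANSFER_THRESHOLD_GBYTES_PER_S = 600
WEIGHTED_PERFORMANCE_THRESHOLD = 4800


@dataclass
class ProcessorUnit:
    """Hypothetical description of one digital processor unit."""
    tops: float      # tera operations per second at the given bit length
    bit_length: int  # bit length per operation, e.g. 8 for INT8


def weighted_performance(units):
    """Sum bit_length * TOPS over all processor units."""
    return sum(u.bit_length * u.tops for u in units)


def meets_thresholds(units, transfer_gbytes_per_s):
    """Assumption: both axes must be met for the control to apply."""
    return (
        transfer_gbytes_per_s >= TRANSFER_THRESHOLD_GBYTES_PER_S
        and weighted_performance(units) >= WEIGHTED_PERFORMANCE_THRESHOLD
    )


# Made-up chips for illustration, not real product specifications.
fast_chip = [ProcessorUnit(tops=600, bit_length=8)]
slow_chip = [ProcessorUnit(tops=200, bit_length=8)]

print(weighted_performance(fast_chip))   # 4800
print(meets_thresholds(fast_chip, 600))  # True
print(meets_thresholds(fast_chip, 400))  # False
print(meets_thresholds(slow_chip, 900))  # False
```

Under this reading a chip can be fast on one axis and still fall outside the rule, which is part of why classifying complex systems is not straightforward.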

While the full extent of what will be caught by the new rule on performance remains unclear, it will clearly catch a number of processors and systems based on them manufactured by Nvidia. Some speculation here. bis.doc.gov/index.php/docu…

The firm’s August SEC Edgar filing, the first indication that new controls were coming, notes that in addition to A100 and H100 GPUs, DGX or any other systems which incorporate A100 or H100 integrated circuits and the A100X are also covered.
sec.gov/Archives/edgar…

DGX systems incorporate A100s or H100s into a platform. The DGX A100 server-class workstation is an “AI Data Center in a Box”, ideal for experimentation/development teams, according to Nvidia. The DGX A100, incorporating 8 A100s, is “the universal system for all AI workloads.”

The Nvidia DGX Station A100 brings AI supercomputing to data science teams, offering data center tech w/o a data center. Designed for multiple, simultaneous users, DGX Station A100 leverages server-grade components in easy-to-place workstation form factor. nvidia.com/en-us/data-cen…

So what about the performance metrics? The 600 GB/s threshold appears to come from the DGX H100 specs. For an 8 H100 GPU system, the data transfer rate comes in at 400 Gb/s for InfiniBand.

InfiniBand is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. nvidia.com/content/dam/en…

For performance, it appears that the 4800 TOPS number comes from the DGX A100 system. Here, 4800 TOPS = 4.8 petaOPS, and the DGX A100 system is rated at 5 petaOPS for INT8 calculations, or 5000 TOPS. nvidia.com/content/dam/en…

H100 systems also exceed the performance bar by a wide margin, coming in at 32 petaFLOPS FP8, or 32000 TOPS. All of these systems are designed to be combined into larger configurations, to run bigger AI/ML workloads. So it is not that hard to exceed the threshold, but who decides when?
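The unit conversions behind these comparisons can be sketched in a few lines of Python. This follows the thread's reading of the threshold as a plain 4800 TOPS figure, treats petaFLOPS and petaOPS interchangeably as the thread does, and uses only the system ratings quoted above. The function and variable names are illustrative.

```python
TOPS_PER_PETAOPS = 1000  # 1 petaOPS = 1000 teraOPS
THRESHOLD_TOPS = 4800    # threshold figure as read in the thread


def peta_to_tops(peta_ops):
    """Convert a rating in peta-operations/s to tera-operations/s."""
    return peta_ops * TOPS_PER_PETAOPS


# System ratings as quoted in the thread, in peta-operations per second.
ratings = {"DGX A100 (INT8)": 5, "DGX H100 (FP8)": 32}

for name, peta in ratings.items():
    tops = peta_to_tops(peta)
    ratio = tops / THRESHOLD_TOPS
    print(f"{name}: {tops} TOPS, {ratio:.2f}x the threshold")
# DGX A100 (INT8): 5000 TOPS, 1.04x the threshold
# DGX H100 (FP8): 32000 TOPS, 6.67x the threshold
```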

Nvidia’s SEC filing also cited A100X converged accelerators as covered. These appear to come individually and have specs that do not appear to exceed those in 3A090. nvidia.com/content/dam/en…

To add to the complexity, operations per second come in many flavors, depending on the underlying data types, floating point FPx or integer, INTx.

A computer system that can achieve 200 peta-FLOPS of FP64 is much more powerful than a system capable of 200 peta-FLOPS at just FP8. FP64 is the standard for supercomputers on the TOP500.

All of this complicates determining performance for specific systems. Some companies are using the rise in machine-learning workloads as an opportunity to break away from FP64 and quote the lower-precision FP8, which makes their performance numbers seem larger.

There is some logic to this, as AI models do not require full FP64 and will do just as well with a lower precision, such as FP8.
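One way to see why precision matters is to weight the same headline rate by bit length, as the 3A090 compute metric quoted earlier does. The sketch below is illustrative only: it treats FLOPS as operations per second and reuses the hypothetical 200 peta-FLOPS figure from the FP64 vs FP8 comparison. The function name is an assumption.

```python
def bit_weighted_tops(peta_flops, bit_length):
    """Convert petaFLOPS to TOPS, then multiply by bit length per operation."""
    tops = peta_flops * 1000  # 1 peta = 1000 tera
    return tops * bit_length


fp64 = bit_weighted_tops(200, 64)
fp8 = bit_weighted_tops(200, 8)

print(fp64)         # 12800000
print(fp8)          # 1600000
print(fp64 // fp8)  # 8
```

With this weighting, the FP64 system scores eight times higher than the FP8 system even though both advertise the same 200 peta-FLOPS.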

According to Nvidia CEO Jensen Huang: “Nvidia DGX A100 is the ultimate instrument for advancing AI…is the first AI system built for the end-to-end machine learning workflow - from data analytics to training to inference.

And with the giant performance leap of the new DGX, machine learning engineers can stay ahead of the exponentially growing size of AI models and data.”

What are these systems typically used for? Well, things like drug or vaccine discovery for one, “enabling scientists to do years’ worth of AI-accelerated work in months or days.”

Then there is the DGX SuperPOD: 140 DGX A100 systems all clustered together, capable of 700 petaFLOPS of 'AI computing power.'
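The SuperPOD figure is consistent with simple multiplication of the per-system rating. A quick check, assuming the 5 peta rating per DGX A100 quoted earlier is the number being summed and reading the threshold as 4800 TOPS as the thread does (names are illustrative):

```python
DGX_A100_PETA = 5       # per-system rating quoted earlier in the thread
SUPERPOD_SYSTEMS = 140  # DGX A100 systems in one SuperPOD
THRESHOLD_TOPS = 4800   # threshold figure as read in the thread


def cluster_peta(systems, per_system_peta):
    """Aggregate rating of a cluster, assuming ratings simply add up."""
    return systems * per_system_peta


total_peta = cluster_peta(SUPERPOD_SYSTEMS, DGX_A100_PETA)
print(total_peta)  # 700

# Same figure in TOPS, relative to the threshold.
print(f"{total_peta * 1000 / THRESHOLD_TOPS:.1f}x")  # 145.8x
```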

The fields of bioinformatics, cheminformatics and chemogenomics in particular, incl computer-aided drug discovery (CADD), have also taken advantage of AI/ML methods running on GPU systems like the DGX A100s.

Future exascale supercomputers will provide high levels of parallelism using heterogeneous CPU and GPU environments. Already, many supercomputers use large numbers of high end GPUs for acceleration.

Clearly the goal of the controls is to give US officials a veto power over how these powerful systems are used in China. But determining which systems meet performance thresholds and what applications are of concern will be challenging.

Nvidia’s SEC filing warns: To extent that a customer requires products covered by new license requirement, Company may seek a license for customer but has no assurance that USG will grant any exemptions or licenses for any customer, or that USG will act on them in timely manner.

There do not appear to be any carveouts for licensing for GPU systems like the DGX A100/H100. Under the foreign direct product Entity List part of the new rules (Footnote 4), the licensing policy for Footnote 4 entities is presumptive denial.

For 5 leading Chinese AI companies, SenseTime, Megvii, iFlytek, Intellifusion, and Yitu, there is a possibility of a license on a case by case basis if the shipment is necessary to detect, identify, or treat infectious diseases.

It is not clear how this would be determined. No other case by case exemptions are part of the new rules, say for CADD, bioinformatics, or other non-military end uses, which constitute the vast majority of use cases in China.

It will also be necessary to watch how the rule is applied to the fast-evolving technology hardware environment in China. Systems optimized for AI workloads are constantly under development around the world.

Just this week Samsung indicated that Baidu would be a customer for its 3 nm process for AI chips in the cloud. Not clear how the performance criteria will be computed here, but you can be sure someone is already thinking about that. kedglobal.com/korean-chipmak…
