Longhorn
Kernel/hypervisor engineer @awscloud EC2. Hobby @checkra1n. Mastodon: https://t.co/DsXP8PFgL0 Bluesky: https://t.co/dAOfFSSqY4
Mar 26, 2023 5 tweets 2 min read
Single-die scaling is slowing, but multi-die scaling keeps going.

Also an interesting note:

... computing is not a chip problem
It’s a software and chip problem

More details on Grace Hopper…
Mar 26, 2023 4 tweets 2 min read
The Developer’s View to Secure an Application and
Data on NVIDIA H100 with Confidential Computing

static.rainfocus.com/nvidia/gtcspri…

Host pinned memory becomes managed memory.

However, cudaHostRegister and friends are not available.
Sep 23, 2022 6 tweets 1 min read
More Smart App Control breakage: spawning AppInstallerFullTrustAppServiceClient.exe is broken, so it seems that I can't rely on the installer app to install appx/msix and have to use PowerShell instead.

WebClient too...
Sep 22, 2022 5 tweets 2 min read
NVIDIA Optical Flow SDK

developer.nvidia.com/opticalflow-sdk

In version 4.0, extended with the frame rate up conversion library.
Jul 17, 2022 4 tweets 1 min read
Playing around with OpenCL on a Galaxy S22 with Termux:

Termux expects /vendor/lib64/libOpenCL.so to be the ICD path.

This is not the case on this Samsung device, where it is the path for the ICD loader. The ICD loader then loads ICDs from /vendor/Khronos/OpenCL/vendors.

However, private API enforcement is still there for libSGPUOpenCL, so you should load the ICD loader instead.

My workaround was to set LD_LIBRARY_PATH=/vendor/lib64.
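
To see which ICDs the loader would pick up, a small Python sketch can list the vendors directory. It assumes the directory holds `.icd` text files whose first line names a driver library, as the Khronos ICD loader does on Linux; whether the Samsung loader uses the same layout is an assumption, and `list_icds` is a hypothetical helper, not part of Termux or OpenCL.

```python
from pathlib import Path

# Directory named in the thread; its internal layout is an assumption.
VENDORS_DIR = "/vendor/Khronos/OpenCL/vendors"


def list_icds(vendors_dir=VENDORS_DIR):
    """Return {icd_file_name: first_line} for each readable *.icd file."""
    found = {}
    try:
        icd_files = sorted(Path(vendors_dir).glob("*.icd"))
    except OSError:
        return found  # directory missing or not readable from this app
    for icd in icd_files:
        try:
            lines = icd.read_text(errors="replace").splitlines()
        except OSError:
            continue  # skip files we are not allowed to read
        found[icd.name] = lines[0].strip() if lines else ""
    return found


if __name__ == "__main__":
    for name, library in list_icds().items():
        print(f"{name} -> {library}")
```

An empty result only means nothing was readable at that path from the current process; it does not prove the device has no ICDs.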
Jul 3, 2022 5 tweets 1 min read
To be successful for gaming markets, a multi-die/package GPU will have to show up as a single GPU to software.

Developers don’t want to deal with the complexity of explicit multi-GPU solutions outside of GPGPU scenarios, as evidenced by explicit multi-GPU solutions floundering.

Reducing the data traffic between the GPUs can entail keeping separate copies of the most-accessed data in local memory.

Access counters can determine when duplicating is worthwhile, so that full memory mirroring is not needed.

On writes, the page is “merged” into one copy again.
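
A toy Python model of that idea, as a sketch only: pages have a home GPU, remote reads bump an access counter, a page is duplicated once the counter crosses a threshold, and any write merges it back to one copy. The class name, the threshold value, and the "count link transfers" metric are all made up for illustration; real hardware would do this in the memory system, not in software like this.

```python
from collections import defaultdict

DUP_THRESHOLD = 3  # hypothetical: remote reads before a page gets duplicated


class PageDirectory:
    """Toy model: each page has one home GPU plus optional read-only copies."""

    def __init__(self, threshold=DUP_THRESHOLD):
        self.threshold = threshold
        self.home = {}                        # page -> GPU with the primary copy
        self.copies = defaultdict(set)        # page -> GPUs holding duplicates
        self.remote_reads = defaultdict(int)  # (page, gpu) -> access counter
        self.link_transfers = 0               # accesses that crossed the GPU link

    def place(self, page, gpu):
        self.home[page] = gpu

    def read(self, page, gpu):
        if gpu == self.home[page] or gpu in self.copies[page]:
            return "local"
        self.link_transfers += 1
        self.remote_reads[(page, gpu)] += 1
        if self.remote_reads[(page, gpu)] >= self.threshold:
            self.copies[page].add(gpu)  # hot enough: keep a local duplicate
        return "remote"

    def write(self, page, gpu):
        # A write "merges" the page: drop duplicates and reset their counters.
        self.copies[page].clear()
        for key in [k for k in self.remote_reads if k[0] == page]:
            del self.remote_reads[key]
        if gpu == self.home[page]:
            return "local"
        self.link_transfers += 1
        return "remote"
```

With the default threshold, the first three reads of a page from the non-home GPU cross the link and later reads are local, until a write collapses the page back to its home copy.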
Jun 7, 2022 4 tweets 1 min read
Well. Rosetta 2 needs a quite recent CPU (post v8.2) to work because of the instructions used.

Does it work on non-Apple arm64 CPUs? 🤔 Yes.

(This settles once and for all the argument over whether this needs anything Apple-specific outside of TSO support*. The answer is no.)

* This particular CPU doesn't support TSO, resulting in the emulation of a particular beast: an x86 with relaxed memory ordering.
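
What "x86 with relaxed memory ordering" can mean for guest code is easiest to see with the classic message-passing litmus test. The Python sketch below is a deliberately simplified model, not how Rosetta or any CPU works: "ordered" keeps each thread in program order (enough for this test, since TSO only relaxes store-then-load and neither thread has that pattern), while "relaxed" lets each thread's two independent accesses swap. All names are hypothetical.

```python
from itertools import permutations

# Message-passing litmus test; x and y both start at 0.
WRITER = [("store", "x", 1), ("store", "y", 1)]      # publish data, then flag
READER = [("load", "y", "r1"), ("load", "x", "r2")]  # read flag, then data


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def thread_orders(thread, allow_reordering):
    """Program order only, or every permutation in the toy relaxed model."""
    if allow_reordering:
        return [list(p) for p in permutations(thread)]
    return [list(thread)]


def outcomes(allow_reordering):
    """Return the set of (r1, r2) results reachable under the toy model."""
    seen = set()
    for w in thread_orders(WRITER, allow_reordering):
        for r in thread_orders(READER, allow_reordering):
            for trace in interleavings(w, r):
                mem, regs = {"x": 0, "y": 0}, {}
                for op, addr, arg in trace:
                    if op == "store":
                        mem[addr] = arg
                    else:
                        regs[arg] = mem[addr]
                seen.add((regs["r1"], regs["r2"]))
    return seen
```

In this model `outcomes(False)` never contains `(1, 0)` (flag seen, data stale), while `outcomes(True)` does, which is the kind of result x86 code is written to assume cannot happen.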
Jun 4, 2022 4 tweets 2 min read
Windows 10 Enterprise multi-session FAQ
docs.microsoft.com/en-us/azure/vi…

Restricted to Azure only, with a very PR-y FAQ answer to that.

Combined with Office 365 on Windows Server 2016-19 going EOL in October 2025, this means that MS is doing a strong Azure push.

docs.microsoft.com/en-us/deployof…

Aka, "if you're an Azure competitor then we aren't giving your customers nice options going forward".

On top of that, lifetime Office licenses only have 5 years of security updates instead of 10, to further entice migrations to Office 365 and Azure.
docs.microsoft.com/en-us/deployof…
May 24, 2022 4 tweets 2 min read
From @Microsoft Build 2022: GA for Windows 10 IoT Enterprise on the @NXP i.MX 8M and 8M Mini platforms. This is a full Windows SKU, with Win32 app compatibility and 10 years of BSP support.

And that's a first on a non-Qualcomm arm64 SoC.
May 11, 2022 4 tweets 1 min read
My comment on the NVIDIA GPU kernel module:

> The open flavor of kernel modules supports Turing, Ampere, and forward. […] the open kernel modules depend on the GPU System Processor (GSP) first introduced in Turing.

GSP firmware:

34M gsp.bin

TL;DR: this was done by moving a lot of the meaty bits to the GPU itself, a much more firmware-heavy approach than previously, allowed by the relatively high performance of the GSP.

And of course, because firmware is handled under separate rules, this raises questions.