There's a lot of weird debate about whether Rust in the kernel is useful or not... in my experience, it's way more useful than I could've ever imagined!

I went from first render to a stable desktop that can run games, browsers, etc. in about two days of work on my driver (!!!)
All the concurrency bugs just vanish with Rust! Memory gets freed when it needs to be freed! Once you learn to make Rust work with you, I feel like it guides you into writing correct code, even beyond the language's safety promises. It's seriously magic! ✨
There is absolutely no way I wouldn't have run into race conditions, UAFs, memory leaks, and all kinds of badness if I'd been writing this in C.

In Rust? Just some logic bugs and some core memory management issues. Once those were fixed, the rest of the driver just worked!!
I tried kmscube, and it was happily rendering frames. Then I tried to start a KDE session, and it crashed after a while, but you know what didn't cause it? 3 processes trying to use the GPU at the same time, allocating and submitting commands in parallel. In parallel!!
Once things work single-threaded, having all the locking and threading just magically work as intended, with no weird races or things stepping on top of each other, is, as far as I'm concerned, completely unheard of for a driver this complex.
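A userspace sketch of why that works: shared state lives behind a lock, and the data is simply unreachable without taking it, so the compiler rules out the race. The names here (`GpuState`, `run`) are invented for illustration, not the driver's real types, and kernel Rust uses its own lock wrappers rather than `std`, but the compile-time guarantee is the same.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for shared driver state (not the real
// driver's types): the Vec is only reachable through the Mutex.
struct GpuState {
    allocations: Vec<usize>,
}

// Spawn `n` worker threads that each push `iters` entries in parallel,
// like several processes allocating and submitting at the same time.
fn run(n: usize, iters: usize) -> usize {
    let state = Arc::new(Mutex::new(GpuState { allocations: Vec::new() }));
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for _ in 0..iters {
                    // Touching `allocations` without taking the lock
                    // simply doesn't compile; the race is impossible.
                    state.lock().unwrap().allocations.push(id);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let n_alloc = state.lock().unwrap().allocations.len();
    n_alloc
}

fn main() {
    println!("{} allocations, no races", run(3, 1000));
}
```

Forget the `lock()` and the code doesn't race, it fails to build — which is the whole point.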
And then all the memory management just... happens as if by magic. A process using the GPU exits, and all the memory and structs it was using get freed. Dozens of lines in my log of everything getting freed properly. I didn't write any of that glue, Rust did it all for me!
(Okay, I wrote the part that hooks up the DRM subsystem to Rust, including things like dropping a File struct when the file is closed, which is what triggers all that memory management to happen... but then Rust does the rest!)
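A toy model of that cleanup cascade, with everything recorded so the order is visible. All names here are illustrative, not the driver's real types, and a real kernel driver would log via the kernel's printing macros rather than a `Vec`; the point is just that dropping the top-level struct frees everything it owns, in order, with no hand-written glue.

```rust
use std::sync::Mutex;

// Records the cleanup order; purely for illustration.
static LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

struct GpuBuffer {
    id: u32,
}

impl Drop for GpuBuffer {
    fn drop(&mut self) {
        LOG.lock().unwrap().push(format!("freed buffer {}", self.id));
    }
}

struct File {
    buffers: Vec<GpuBuffer>,
}

impl Drop for File {
    fn drop(&mut self) {
        // Nothing manual here: after this body runs, `buffers` and
        // every GpuBuffer inside it are dropped automatically.
        LOG.lock().unwrap().push("file closed".to_string());
    }
}

// Simulate a process closing its file descriptor: dropping the File
// triggers the whole cascade.
fn close_file() -> Vec<String> {
    let file = File {
        buffers: vec![GpuBuffer { id: 0 }, GpuBuffer { id: 1 }],
    };
    drop(file);
    LOG.lock().unwrap().clone()
}

fn main() {
    for line in close_file() {
        println!("{line}");
    }
}
```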
I actually spent more time tracking down a single forgotten `*` in the DCP driver (written in C by Alyssa and Janne, already tested) that was causing heap overflows than I spent tracking down CPU-side safety issues (in unsafe code) in Rust on my brand new driver, in total.
Even things like handling ERESTARTSYS properly: Linux Rust encourages you to use Result<T> everywhere (the kernel variant where Err is an errno), and then you just stick a `?` after whatever call sleeps or waits on a condition (as the compiler tells you) and it all just works!
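A userspace mimic of that pattern, under stated assumptions: `Error`, `wait_for_completion`, and `submit` are invented stand-ins, not the real `kernel` crate types. The shape is what matters: Err carries an errno, and `?` forwards it up the call chain so every caller unwinds correctly on an interrupted sleep without any explicit plumbing.

```rust
// Errno-carrying error type, mimicking the kernel Rust convention.
#[derive(Debug, PartialEq)]
struct Error(i32);

// Errno 512; the kernel returns it (negated) when a sleep is
// interrupted by a signal and the syscall should be restarted.
const ERESTARTSYS: Error = Error(512);

type Result<T = ()> = core::result::Result<T, Error>;

// Pretend this sleeps until the GPU finishes a job; if a signal
// arrives mid-wait, we must unwind with ERESTARTSYS.
fn wait_for_completion(interrupted: bool) -> Result {
    if interrupted {
        return Err(ERESTARTSYS);
    }
    Ok(())
}

// The `?` propagates any errno to the caller automatically, so the
// restart case is handled without a single explicit branch here.
fn submit(interrupted: bool) -> Result<u64> {
    wait_for_completion(interrupted)?;
    Ok(42) // e.g. a sequence number for the completed job
}

fn main() {
    assert_eq!(submit(false), Ok(42));
    assert_eq!(submit(true), Err(ERESTARTSYS));
    println!("errno propagation works");
}
```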
Seriously, there is a huuuuuuge difference between C and Rust here. The Rust hype is real! Fearless concurrency is real! And having to have a few unsafe {} blocks does not in any way negate Rust's advantages!
Some people seem to be misunderstanding the first tweet in this thread... I didn't write a driver in 2 days, I debugged a driver in 2 days! The driver was already written by then!

What I'm saying is that Rust stopped many classes of bugs from existing. Sorry if I wasn't clear!
There was also a bit of implementation work involved in those 2 days of work though - buffer sharing in particular wasn't properly implemented when I got first renders, so that was part of it, but the bulk of the driver was already done.
Apparently I have to clarify again?

I did write the driver myself (and the DRM kernel abstractions I needed). The 2 days were the debugging once the initial implementation was done. The Rust driver took 7 weeks, and I started reverse engineering this GPU 6 months ago...
This was the first stream where I started evaluating Rust to write the driver (I'd been eyeing the idea for a while, but this was the first real test).

Initially it was just some userspace experiments to prove the concept, then moved onto the kernel.

I mostly worked on stream, and it's been 12 streams plus the debugging one, so I guess writing the driver took about 12 (long) days of work, plus a bit extra (spread out over 7 weeks, because I stream twice per week and took one week off).
So just to be totally clear:

Reverse engineering and prototype driver: ~4 calendar months, ~20 (long) days of work

Rust driver development (including abstractions): ~7 calendar weeks, ~12 (long) days of work

Debugging to get a stable desktop: 5 calendar days, 2 days of work.


