Fiora Esoterica @fioraesoterica
so since there's a thread going around elsewhere on this hellsite about programming language ISAs, here's some hot takes as an assembly programmer and compiler maker
many ISAs, especially older ones, are built as if someone came up with a list of basic primitive operations and wrote them out and told an ISA designer to put them all in
this isn't what you actually want.

what you want is a list of operations that are:

a) maximally powerful relative to their implementation cost and opcode space cost
b) easy for a compiler or human to use to *do* all those basic primitives

things you DON'T need:

c) orthogonal
i should clarify as well that the optimization heuristic for a) depends on the situation. for example, on the GPU I worked on, instruction length wasn't as important so long as the extra space spent was worth it.
let's give some examples.

PPC: RLWINM is an extremely powerful instruction that replaces a whole ton of *other* instructions in many cases but is conceptually very simple.

it uses a fair bit of opcode space, but someone's probably done the math on it being worth it.
note that despite RLWINM existing and replacing SHL/SHR in most cases, it doesn't replace them entirely; variable shifts must still exist! it is an *utterly* non-orthogonal instruction! yet it's a core element of a "classic RISC" architecture. RISC never existed, by the way
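to make the "replaces a whole ton of other instructions" claim concrete, here's a rough python model of what RLWINM does: rotate a 32-bit word left by an immediate, then AND it with a mask running from bit MB to bit ME (PowerPC numbers bits with 0 as the most significant). the helper names `rotl32` and `mask32` are made up for this sketch; `slwi`, `srwi`, `clrlwi` and `extrwi` are the usual simplified mnemonics that assemblers expand into RLWINM. treat it as an illustration, not a reference implementation.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """rotate a 32-bit value left by n bits (n in 0..31)."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask32(mb, me):
    """ones from bit mb through bit me, where bit 0 is the MSB.
    if mb > me the run of ones wraps around the word."""
    from_mb = MASK32 >> mb                       # ones in bits mb..31
    up_to_me = (MASK32 << (31 - me)) & MASK32    # ones in bits 0..me
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)

def rlwinm(rs, sh, mb, me):
    """rotate left word immediate, then AND with mask."""
    return rotl32(rs, sh) & mask32(mb, me)

# immediate shifts and bitfield ops, all as one rlwinm each
def slwi(rs, n):       # shift left by immediate n (0..31)
    return rlwinm(rs, n, 0, 31 - n)

def srwi(rs, n):       # logical shift right by immediate n (0..31)
    return rlwinm(rs, (32 - n) % 32, n, 31)

def clrlwi(rs, n):     # clear the top n bits
    return rlwinm(rs, 0, n, 31)

def extrwi(rs, n, b):  # extract n bits starting at bit b, right-justified
    return rlwinm(rs, (b + n) % 32, 32 - n, 31)
```

every one of these takes its shift amount as an immediate, which is exactly why register-amount shifts still need their own instructions.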
GPU example:

select.CC dst, a, b, c, d
dst = a CC b ? c : d
(depending on context, imagine variants of this with fewer register parameters, for example)
this is cheaper than you think. do the comparison in a separate pipeline stage so you don't need to feed all the inputs to the ALU.

allow immediates and this becomes an extremely powerful instruction. imagine how much code can be replaced with this!
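a small python sketch of how much falls out of one compare-and-select, assuming the semantics written above (dst = a CC b ? c : d). the condition names and the helpers `min_`, `max_`, `step` and `clamp` are hypothetical, picked only for illustration; they aren't taken from any real GPU ISA.

```python
import operator

# hypothetical condition codes for the sketch
CONDITIONS = {
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
    "ge": operator.ge, "gt": operator.gt,
}

def select(cc, a, b, c, d):
    """dst = (a CC b) ? c : d"""
    return c if CONDITIONS[cc](a, b) else d

def min_(a, b):
    return select("lt", a, b, a, b)       # one instruction

def max_(a, b):
    return select("gt", a, b, a, b)       # one instruction

def step(edge, x):
    return select("ge", x, edge, 1, 0)    # one instruction, two immediates

def clamp(x, lo, hi):
    t = select("lt", x, lo, lo, x)        # two instructions total
    return select("gt", t, hi, hi, t)
```

without the fused form, each of these is a compare plus a branch or a conditional move, so the select roughly halves the instruction count for this kind of code.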
PowerPC actually had a very similar instruction a long time ago: see "fsel", the 4-byte compromise for this.
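for reference, fsel's comparison is hardwired to "first operand >= 0.0", which is where the compromise comes in: you get the select, but you have to compute the thing being compared yourself. a rough python model, with `fmax_via_fsel` and `fabs_via_fsel` as made-up helper names showing the usual two-instruction idioms. these idioms ignore NaN, signed-zero and overflow corner cases, so take them as a sketch only.

```python
def fsel(fra, frc, frb):
    """frD = frC if frA >= 0.0 else frB.
    a NaN in frA fails the comparison, so frB is selected."""
    return frc if fra >= 0.0 else frb

def fmax_via_fsel(x, y):
    # fsub t, x, y
    # fsel d, t, x, y
    return fsel(x - y, x, y)

def fabs_via_fsel(x):
    # fneg t, x
    # fsel d, x, x, t
    return fsel(x, x, -x)
```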
GPU, ARM, x86, and others example: you know what's a *really* common operation? a = b + (c << scale)! Make that a primitive. Maybe just a few values of scale. It's still good.
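a quick python sketch of why the scaled add earns its opcode space, assuming 32-bit registers and only four allowed scales (the same 1/2/4/8 multipliers x86 addressing uses). `add_scaled` and the other helper names are invented for the example.

```python
MASK32 = 0xFFFFFFFF
ALLOWED_SCALES = (0, 1, 2, 3)   # assumption: shift by 0..3, i.e. multiply by 1, 2, 4, 8

def add_scaled(b, c, scale):
    """a = b + (c << scale), wrapping at 32 bits."""
    if scale not in ALLOWED_SCALES:
        raise ValueError("scale not encodable in this sketch")
    return (b + (c << scale)) & MASK32

def element_address(base, index, log2_size):
    # array indexing: &array[index] in one instruction
    return add_scaled(base, index, log2_size)

def times5(x):
    # multiply by a small constant without a multiplier: x + (x << 2)
    return add_scaled(x, x, 2)

def times9(x):
    # x + (x << 3)
    return add_scaled(x, x, 3)
```

array indexing alone shows up in nearly every loop, so even this restricted form replaces a shift plus an add very often.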
you can go on endlessly with concepts like these.

some Q&A:

Q. this instruction sounds cool but will murder my pipeline!
A. then don't do it! the point here is to expose maximum power without major design costs.
Q. i thought orthogonality was good!
A. an overcomplete basis allows better compression than an orthogonal one

Q. isn't execution complexity costly?
A. so is L1 cache and instruction dispatch resources
extremely tl;dr version: the thing you should learn from PPC is not "woooooooo RISC is good wooooooo" but rather "RLWINM is good, how can we do this more often"
also try to learn from every other ISA

for example, ARM32 added some features that ARM64 later dropped

why?

think about each one before going and doing the same yourself
more thoughts: you probably should minimize flag updates to avoid dependency hell, but if you have flags it is useful to be able to use them

i personally kinda like powerpc's flag system, especially the "dot" thing where you can specify if something updates flags
in fact i would say your arch should have one of the following for compression:

1. ability to set flags from typical arithmetic ops (x86, ppc)
or
2. ability to do the comparison in the select/jump/etc (gpu)
if your ISA results in people doing add/cmp/jcc at the end of loops you've made a mistake
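to put a number on that, here's a toy python model that counts the loop-control instructions executed by two loop tails: a count-up loop ending in add/cmp/jcc, and a count-down loop where the subtract sets the zero flag itself so only a branch follows. the mnemonics in the comments are generic and the cost model (one unit per instruction, n >= 1) is an assumption of the sketch, not a measurement of any real core.

```python
def count_up_add_cmp_jcc(n):
    """loop tail: add i, 1 / cmp i, n / jl top"""
    i = 0
    body_runs = 0
    overhead = 0
    while True:
        body_runs += 1       # stand-in for the loop body
        i += 1               # add i, 1
        less = i < n         # cmp i, n
        overhead += 3        # add + cmp + jl
        if not less:         # jl top not taken: fall out of the loop
            break
    return body_runs, overhead

def count_down_flag_setting(n):
    """loop tail: subs i, 1 (sets Z) / jnz top"""
    i = n
    body_runs = 0
    overhead = 0
    while True:
        body_runs += 1       # stand-in for the loop body
        i -= 1               # subs i, 1
        zero = (i == 0)      # flag comes for free from the subtract
        overhead += 2        # subs + jnz
        if zero:             # jnz top not taken: fall out of the loop
            break
    return body_runs, overhead
```

same trip count, a third fewer control instructions per iteration. a compare fused into the branch gets you to the same place by the other route.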