what you want is a list of operations that are:
a) maximally powerful relative to their implementation cost and opcode space cost
b) easy for a compiler or human to use to *do* all those basic primitives
things you DON'T need:
c) orthogonality
PPC: RLWINM (rotate left word immediate, then AND with mask) is an extremely powerful instruction that replaces a whole ton of *other* instructions in many cases but is conceptually very simple.
it uses a fair bit of opcode space, but someone's probably done the math on it being worth it.
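to make the "replaces a whole ton of other instructions" point concrete, here's a rough C model of rlwinm: rotate left by SH, then AND with a mask running from bit MB to bit ME, in IBM bit numbering where bit 0 is the most significant. the helper names rotl32 and ppc_mask are made up for this sketch. the equivalences in the comment are the usual simplified mnemonics that assemblers expand to rlwinm.

```c
#include <stdint.h>

/* rotate a 32-bit value left by n bits (n is taken mod 32) */
static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* mask with ones from bit mb through bit me, IBM numbering (bit 0 = MSB).
   when mb > me the mask wraps around the end of the word. */
static uint32_t ppc_mask(unsigned mb, unsigned me) {
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* ones in bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me); /* ones in bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* rlwinm rA, rS, SH, MB, ME */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* the same instruction covers, among others:
     slwi   rA,rS,n    = rlwinm rA,rS,n,0,31-n     (shift left)
     srwi   rA,rS,n    = rlwinm rA,rS,32-n,n,31    (logical shift right)
     clrlwi rA,rS,n    = rlwinm rA,rS,0,n,31       (clear high n bits)
     extrwi rA,rS,n,b  = rlwinm rA,rS,b+n,32-n,31  (extract n bits starting at bit b)
*/
```

one rotate plus one mask, and you get shifts, field extracts and bit clears out of a single encoding.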
select.CC dst, a, b, c, d
dst = a CC b ? c : d
(depending on context, imagine variants of this with fewer register parameters, for example)
allow immediates and this becomes an extremely powerful instruction. imagine how much code can be replaced with this!
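a minimal C sketch of what select.CC would mean, and a few things that fall out of it once c/d (or b) can be immediates. everything here is hypothetical: the cond_t enum, the set of condition codes, and the wrapper names are just for illustration, and signed compares are assumed.

```c
#include <stdint.h>

/* hypothetical condition codes; a real ISA would pick its own set */
typedef enum { CC_EQ, CC_NE, CC_LT, CC_GE } cond_t;

/* dst = a CC b ? c : d, with signed comparison */
static int32_t select_cc(cond_t cc, int32_t a, int32_t b, int32_t c, int32_t d) {
    int taken;
    switch (cc) {
    case CC_EQ: taken = (a == b); break;
    case CC_NE: taken = (a != b); break;
    case CC_LT: taken = (a <  b); break;
    default:    taken = (a >= b); break; /* CC_GE */
    }
    return taken ? c : d;
}

/* select.LT dst, a, b, a, b */
static int32_t min_s32(int32_t a, int32_t b) { return select_cc(CC_LT, a, b, a, b); }

/* select.GE dst, a, b, a, b */
static int32_t max_s32(int32_t a, int32_t b) { return select_cc(CC_GE, a, b, a, b); }

/* select.LT dst, a, #0, #0, a   (clamp negatives to zero) */
static int32_t clamp_zero(int32_t a) { return select_cc(CC_LT, a, 0, 0, a); }

/* select.EQ dst, a, b, #1, #0   (materialize a boolean) */
static int32_t is_equal(int32_t a, int32_t b) { return select_cc(CC_EQ, a, b, 1, 0); }
```

each wrapper is one instruction, with no branch and no separate compare.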
some Q&A:
Q. this instruction sounds cool but will murder my pipeline!
A. then don't do it! the point here is to expose maximum power without major design costs.
A. as for orthogonality: an overcomplete basis allows better compression than an orthogonal one
Q. isn't execution complexity costly?
A. so are L1 cache and instruction dispatch resources, and denser code spends less of both
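for example, ARM32 had some features that ARM64 later dropped: conditional execution on nearly every instruction, load/store multiple, the PC as a general-purpose register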
why?
think about each one before going and doing the same yourself
i personally kinda like powerpc's flag system, especially the "dot" thing (add vs add.) where each instruction can specify whether it updates flags
1. ability to set flags from typical arithmetic ops (x86, ppc)
or
2. ability to do the comparison in the select/jump/etc (gpu)
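the two options can be modeled side by side in C. this is only a sketch: flags_t is loosely modeled on the LT/GT/EQ bits that PowerPC record forms write (result compared against zero), the record parameter stands in for the "dot", and all names are invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* hypothetical flag register: result compared against zero */
typedef struct { bool lt, gt, eq; } flags_t;

/* option 1: an ordinary arithmetic op that optionally records flags.
   record = false leaves the flags untouched, like the non-dot form. */
static int32_t sub_op(int32_t a, int32_t b, bool record, flags_t *f) {
    /* subtract in unsigned to get wraparound, then reinterpret as signed */
    int32_t r = (int32_t)((uint32_t)a - (uint32_t)b);
    if (record) {
        f->lt = r < 0;
        f->gt = r > 0;
        f->eq = r == 0;
    }
    return r;
}

/* a later select/jump just reads the flags, with no compare of its own */
static int32_t select_on_eq(const flags_t *f, int32_t c, int32_t d) {
    return f->eq ? c : d;
}

/* option 2: the consumer does the comparison itself, no flag state at all */
static int32_t select_eq(int32_t a, int32_t b, int32_t c, int32_t d) {
    return a == b ? c : d;
}
```

with option 1 a loop counter decrement can set the flags the loop branch needs for free; with option 2 there's no hidden state to track, but every select/jump has to carry its own operands.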