Per Vognsen
One-Pass Gang
Dec 21, 2021 5 tweets 3 min read
@sc13ts Yeah. Even in gamedev it isn't easy and almost certainly won't make you rich. When I was at RAD I ended up doing a bunch of technical sales (since engineers there do support, tech sales, etc.) and I formulated a theory about what makes a good product type.
1. Initial integration time (and cost) needs to be low/near zero to get an eval and close the sale.
2. Low competition from open source.
3. Needs to avoid typical NIH zones where programmers might (often irrationally) want to build their own.
Dec 21, 2021 4 tweets 1 min read
For fast rank queries on bitmaps, you often use a multi-level tree where the stored popcounts use fewer bits at finer levels. Well, the same idea also works for ropes, e.g. your B-tree interior nodes near the bottom of the tree can use fewer bits for the stored prefix sums.
If you're targeting a fixed cache line footprint for your interior nodes (e.g. 1 or 2 cache lines), this lets you achieve higher node fanout near the bottom of the tree, e.g. the popcounts can be 2 bytes near the bottom while scaling to however many bytes are needed at the top.
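A minimal sketch of the narrower-counts-at-finer-levels idea for bitmap rank, assuming a two-level layout. The class name, the 64-bit block size, and the 8-blocks-per-superblock grouping are illustrative choices, not taken from the thread. The coarse level stores 8-byte absolute counts; the fine level stores 2-byte counts relative to the enclosing superblock, which is why they can stay small.

```python
from array import array


class TwoLevelRank:
    """Rank index over a list of 0/1 values (hypothetical layout for illustration)."""

    BLOCK_BITS = 64        # bits covered by one fine-level entry
    BLOCKS_PER_SUPER = 8   # 8 * 64 = 512 bits per coarse-level entry

    def __init__(self, bits):
        self.bits = list(bits)
        self.super_counts = array("Q")  # coarse level: absolute counts, 8 bytes each
        self.block_counts = array("H")  # fine level: relative counts, 2 bytes each
        total = 0
        super_base = 0
        for start in range(0, len(self.bits), self.BLOCK_BITS):
            block_index = start // self.BLOCK_BITS
            if block_index % self.BLOCKS_PER_SUPER == 0:
                super_base = total
                self.super_counts.append(total)
            # Relative count is at most 7 * 64 = 448, so 2 bytes is plenty.
            self.block_counts.append(total - super_base)
            total += sum(self.bits[start:start + self.BLOCK_BITS])

    def rank1(self, i):
        """Return the number of set bits in bits[0:i]."""
        i = min(i, len(self.bits))
        if i <= 0:
            return 0
        block = (i - 1) // self.BLOCK_BITS
        superblock = block // self.BLOCKS_PER_SUPER
        # Coarse count + fine count + a scan of at most one block.
        tail = sum(self.bits[block * self.BLOCK_BITS:i])
        return self.super_counts[superblock] + self.block_counts[block] + tail
```

In a real implementation the final scan would be a masked hardware popcount over a packed word, and the same narrowing trick would apply to the prefix sums stored in a rope's interior nodes near the leaves.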
Jan 3, 2021 7 tweets 2 min read
@shachaf I just remembered I wrote some brain dumps on error recovery a while back with some notes on the effect of lexical syntax design. news.ycombinator.com/item?id=249039…
@shachaf This reminded me that one of the benefits of my rigid indentation rules for multi-line lexemes is that you can scan for top-level decls with the pattern "\n<letter>" and for well-formed files it's guaranteed to give you only decls.
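A small sketch of the "\n<letter>" scan, assuming a toy syntax in which every continuation line of a multi-line lexeme is indented and closing braces are not letters. The function name, the sample source text, and the extra handling for a declaration on the very first line of the file are assumptions made for illustration.

```python
import re

# A newline (or the start of the file) immediately followed by an ASCII letter.
# The start-of-file case is an added assumption; the pattern in the text is "\n<letter>".
DECL_START = re.compile(r"(?:\A|\n)(?=[A-Za-z])")


def top_level_decl_offsets(source):
    """Return the offsets at which candidate top-level declarations begin."""
    return [match.end() for match in DECL_START.finditer(source)]


# Hypothetical source: the string literal spans two lines, but its
# continuation line is indented, so it never matches the pattern.
sample = (
    'func main() {\n'
    '    print("hello\n'
    '        world")\n'
    '}\n'
    'var x = 1\n'
)

for offset in top_level_decl_offsets(sample):
    print(sample[offset:].split("\n", 1)[0])
```

Because the scan never needs to lex the file, it can resynchronize after an error or index declarations without parsing anything in between.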
Jan 3, 2021 4 tweets 1 min read
Has anyone played with DXGI multi-plane overlays? Apparently they're supported on Intel integrated GPUs now which could be interesting for windowed apps on laptops if it lets you avoid the DWM compositing bottleneck. There's some info here although they are focused on a different usage scenario than windowed apps: software.intel.com/content/www/us…
Jan 1, 2021 5 tweets 1 min read
Is there any work on collapsing states in a DFA to generate a smaller state machine which accepts a superset of the original? Seems like it could be promising for crunching DFAs into PSHUFB territory but I'm not sure of how to automate the selection of state collapses. There are probably some good static heuristics you could come up with but the most promising approach seems like it would be based on a training corpus where you want to optimize the false positive rate for conservative matches.
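One possible way to collapse states while keeping the result deterministic is state merging with folding, as used in grammar inference: merge two states, then keep merging any successors that would otherwise conflict. This is a sketch under that assumption, not a method from the thread, and it does not address the open question of which states to collapse. The DFA is represented as a partial transition table (a missing edge means reject); all names are illustrative.

```python
def collapse_states(delta, start, accepting, pairs):
    """Merge each (a, b) pair of states, folding successors to stay deterministic.

    delta maps every state to a dict of symbol -> next state.
    Returns (new_delta, new_start, new_accepting) for the smaller machine.
    """
    parent = {state: state for state in delta}
    merged = {state: dict(edges) for state, edges in delta.items()}

    def find(state):
        # Union-find lookup with path halving.
        while parent[state] != state:
            parent[state] = parent[parent[state]]
            state = parent[state]
        return state

    work = list(pairs)
    while work:
        a, b = work.pop()
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        parent[rb] = ra
        for symbol, target in merged.pop(rb).items():
            if symbol in merged[ra]:
                # Both classes move on this symbol: their targets must merge too.
                work.append((merged[ra][symbol], target))
            else:
                merged[ra][symbol] = target

    new_delta = {
        state: {symbol: find(target) for symbol, target in edges.items()}
        for state, edges in merged.items()
    }
    return new_delta, find(start), {find(state) for state in accepting}


def accepts(delta, start, accepting, text):
    """Run a partial DFA over text; a missing edge rejects."""
    state = start
    for symbol in text:
        state = delta[state].get(symbol)
        if state is None:
            return False
    return state in accepting


# Accepts exactly "aab". Collapsing states 1 and 2 yields a 3-state
# machine for a+b, which is a superset of the original language.
original = ({0: {"a": 1}, 1: {"a": 2}, 2: {"b": 3}, 3: {}}, 0, {3})
smaller = collapse_states(*original, pairs=[(1, 2)])
```

Every accepting run of the original maps onto an accepting run of the collapsed machine, so the result can only produce false positives, never false negatives. A corpus-driven search could then score candidate pairs by the false positive rate they introduce.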
May 10, 2020 5 tweets 1 min read
For all the (generally reasonable) advice to avoid linear-depth recursion in production code, I see little evidence that programmers at large know how to transform non-trivial recursive code into non-recursive stack-based code in a systematic way. Aside from security/robustness concerns, it's also useful to transform push-based ("visitor") code to pull-based ("iterator") code. Even in languages with generators or coroutines, their overhead might be prohibitive.
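As a small illustration of both transformations, here is a sketch that turns a recursive, push-based in-order tree walk into a pull-based iterator with an explicit stack. The `Node` type and function names are made up for the example; a binary tree is only the simplest case, and less regular recursions need more state per stack entry.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def visit_inorder(node, visit):
    """Push-based version: recursion depth grows with tree depth."""
    if node is None:
        return
    visit_inorder(node.left, visit)
    visit(node.value)
    visit_inorder(node.right, visit)


class InorderIterator:
    """Pull-based version: the call stack becomes an explicit list."""

    def __init__(self, root):
        self.stack = []
        self._descend_left(root)

    def _descend_left(self, node):
        # Corresponds to the chain of recursive calls on node.left.
        while node is not None:
            self.stack.append(node)
            node = node.left

    def __iter__(self):
        return self

    def __next__(self):
        if not self.stack:
            raise StopIteration
        node = self.stack.pop()
        # Corresponds to the recursive call on node.right.
        self._descend_left(node.right)
        return node.value
```

Each stack entry plays the role of a suspended call frame: a node on the stack is one whose left subtree is done but whose own visit and right subtree are still pending. The caller decides when to pull the next value, and the depth is bounded by heap memory rather than the call stack.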
Apr 25, 2020 5 tweets 1 min read
Something I hadn't really considered with memory-mapped or bulk-read IO is that once you're reading through the buffer linearly interleaved with other processing, it's going to kick everything else out of the caches because of LRU replacement. If you do traditional buffered IO where you keep updating the same buffer, that buffer's cache lines are just going to stay in cache and not pressure the rest of your working set.
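The access pattern for the traditional buffered approach can be sketched as follows: one fixed-size buffer is refilled in place, so the processing loop keeps touching the same addresses. The function name and the 64 KiB default are assumptions, and in Python the interpreter overhead will swamp any cache effect, so this shows only the shape of the loop, not a benchmark.

```python
import io


def count_newlines_buffered(stream, buffer_size=64 * 1024):
    """Count newline bytes using a single buffer that is reused for every read."""
    buffer = bytearray(buffer_size)  # allocated once, refilled in place
    count = 0
    while True:
        filled = stream.readinto(buffer)
        if not filled:
            break
        # Only the first `filled` bytes are valid after a short read.
        count += buffer.count(b"\n", 0, filled)
    return count


# Usage with an in-memory stream standing in for a file opened in binary mode.
lines = count_newlines_buffered(io.BytesIO(b"a\nb\nc\n"), buffer_size=4)
```

With a memory-mapped file, the equivalent loop would walk a fresh range of addresses on every iteration, which is the behavior that pressures the rest of the working set.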