a5k: Another World on a chip! This is a hardware remake of the Another World VM and renderer (no traditional CPU), that fits on a UP5K #fpga (5K LUTs, 128KB SPRAM).
Thread! (1/n)
(Written in #Silice, running on @1bitsquared icebreaker + VGA PMOD, intro only, no audio)
2/ Another World by @EricChahi was one of my favorite #Amiga500 games. It's a great game with beautiful polygon-based graphics. Its architecture is also fascinating: the whole game runs in a custom VM.
As the game turned 31 I thought a hardware version would make a great present!
3/ @fabynou has several great blog entries detailing the game's inner workings, with links to additional resources including source code. I won't go into much detail here, so check them out for an in-depth overview! fabiensanglard.net/another_world_…
4/ The game is organized around a small VM that calls a blitter, a rasterizer and a font drawer to draw in four framebuffers. The framebuffers are cleverly used to cache background renderings and produce animations.
5/ The game uses a palette of 16 colors, 4 bits per pixel, with palette tricks for transparency. Interestingly, four 320x200x4bits framebuffers amount to 128,000 bytes, which is almost exactly the 128KB of SPRAM we have on the UP5K #fpga.
I simply could not resist.
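A quick check of the framebuffer arithmetic, as a small sketch (the variable names are illustrative, not taken from the design):

```python
# Four 320x200 framebuffers at 4 bits per pixel versus the UP5K's SPRAM.
WIDTH, HEIGHT = 320, 200
BITS_PER_PIXEL = 4
NUM_BUFFERS = 4

bytes_per_buffer = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
total_bytes = NUM_BUFFERS * bytes_per_buffer
spram_bytes = 128 * 1024  # 128KB of SPRAM on the UP5K

print(bytes_per_buffer, total_bytes, spram_bytes - total_bytes)
```

The four buffers fill the SPRAM almost entirely, which is why code and data have to live elsewhere.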
6/ But where's the code and data then? Well, in SPIflash of course! Think of the VM as a script engine: by running SPIflash at 50MHz and a hardware blitter+rasterizer at 25MHz, we really don't have to worry about performance.
7/ My goal was primarily to reproduce the intro sequence (minus the audio). I love this intro. I would load the game just to watch it, and I am even more impressed by it now that I understand what happens behind the scenes.
8/ The on-board LEDs are showing the frame swap (green LED constantly blinking), the palette swaps (red LED) and the blitter, rasterizer and 'text' drawing busy signals.
9/ The VM, blitter and rasterizer easily fit in 5K LUTs. I did a very relaxed implementation and spent little time optimizing, which explains the resource usage 😅
10/ I extracted a code+data package by instrumenting (heavily hacking!!) the C++ remake. Fortunately data packages for the intro are loaded first, and then do not change during the sequence. I think the same is true for each game 'part', so maybe I can support the full game!
11/ Having a reference implementation, I verified that my hardware VM and rasterizer give the exact same result in simulation, and they do! Most of the glitches I've had were due to sync issues between the VGA scan, blitter and rasterizer. A few glitches remain ...
12/ Fun note: the rasterizer code seems odd at first, but that's because the polygons are preprocessed for performance. A polygon is stored as a left(top-bottom)-right(bottom-top) sequence, e.g.:
13/ This makes drawing spans super easy. For a polygon of 2N vertices, the left side is from 0 to N-1, the right side from 2N-1 down to N, and both sides exactly match: the y coordinate is the same between corresponding left-right vertices!
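To illustrate why this layout makes spans easy, here is a minimal sketch that walks the left and right sides in lockstep. The function name, the `(x, y)` tuple format and the integer interpolation are assumptions for illustration; the original rasterizer and the hardware version may step x differently (e.g. with fixed-point increments).

```python
def polygon_spans(verts):
    """Return (y, x_left, x_right) spans for a polygon stored as
    left side top-to-bottom, then right side bottom-to-top.

    verts holds 2N (x, y) tuples; vertex i (left) and vertex
    2N-1-i (right) are assumed to share the same y.
    """
    n = len(verts) // 2
    spans = []
    for i in range(n - 1):
        (lx0, y0), (lx1, y1) = verts[i], verts[i + 1]
        (rx0, _), (rx1, _) = verts[2 * n - 1 - i], verts[2 * n - 2 - i]
        h = y1 - y0
        for y in range(y0, y1):  # empty when h == 0, so no division by zero
            t = y - y0
            x_left = lx0 + (lx1 - lx0) * t // h
            x_right = rx0 + (rx1 - rx0) * t // h
            spans.append((y, x_left, x_right))
    return spans


# A quad: left side (2,0) -> (0,4), right side (8,4) -> (6,0).
quad = [(2, 0), (0, 4), (8, 4), (6, 0)]
for span in polygon_spans(quad):
    print(span)
```

Because the y coordinates match on both sides, a single loop over y is enough: no edge sorting and no active edge table.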
14/ I was running low on LUTs so I took a 'shortcut' for the text. Since the text strings are static, I pre-rendered them and stored them as textures in SPIflash. I then blit the text images into the active buffer. No font, no drawString/drawChar required 😎
15/ It was great fun doing this! Another World just turned 31 years old, but it remains a unique and wonderful game, well worth playing, and a technical marvel to dive into.
Q5K: Quake level viewer in 5K LUTs on a low cost, low power ice40 up5k #fpga! Custom #GPU, @risc_v CPU and SOC, capable of rendering #Quake levels with lightmaps.
How? Thread 👇
(Written in #Silice, here running on the #mch2022 badge fpga)
The (tiny) GPU is my DMC-1 (Doom-Meets-Comanche) GPU, which also powers the Doomchip-onice demos (remember? Doom with a terrain!!).
It targets the ice40 UP5K, an entry-level fpga with great support from the Open Source toolchain #yosys/#nextpnr. 2/n
There were four main hardware changes to enable Quake level rendering: 1) 32-bits per-column depth, 2) streaming of level data from QPI memory (SPIflash on icebreaker, PSRAM on mch2022 badge), 3) multi-texturing for lightmaps (!!)
How much DooM can fit in a USB port? Quite a bit it turns out! A minuscule #Fomu #fpga board hosts my hardware/software re-implementation of the DooM render loop in the confines of a USB port (uses ~4200 LUTs and < 128 kB of internal RAM). (1/n)
This is a tiny piece of DooM in a 2.1x2.7 mm #fpga. That is pretty small! (can you see it below on the #Fomu board? you might have to zoom ...).
Within it I created a #riscv computer with specialized texturing and column drawing hardware. Designed to render DooM 1994 levels! (2/n)
The OLED screen is connected to the #Fomu through jumper wires soldered on the pads (a trick inspired by @brunolevy01 Fomu vga mod). (3/n)
The DooM-chip! It will run E1M1 till the end of times (or till power runs out, whichever comes first).
Algorithm is burned into wires, LUTs and flip-flops on an #FPGA: no CPU, no opcodes, no instruction counter.
Running on Altera CycloneV + SDRAM. (1/n)
Everything is described in a language I am working on: SDRAM controller, divider, BSP traversal, texture unit, etc.
Main renderer (w/o data) is 666 lines of code (!).
It was a great test case: I made quite a few improvements, fixed some issues, and learned a lot about CycloneV + Quartus.
(2/n)
Rendering uses the original BSP tree (of course!) but is modified to better fit a hardware implementation; columns are raycast and drawn immediately front-to-back, stopping as soon as fully filled.
(3/n)
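The front-to-back idea with early exit can be sketched for a single screen column as follows. This is only an illustration under assumptions: the `(top, bottom, color)` hit format and the function name are hypothetical, and the actual design walks the BSP tree in hardware rather than a Python list.

```python
def draw_column(hits, height):
    """Fill one screen column from hits sorted nearest first.

    hits: list of (top, bottom, color), covering rows top..bottom-1.
    Returns the column and how many hits were examined before it was full.
    """
    column = [None] * height
    remaining = height
    examined = 0
    for top, bottom, color in hits:
        examined += 1
        for y in range(max(top, 0), min(bottom, height)):
            if column[y] is None:  # nearer surfaces already own their pixels
                column[y] = color
                remaining -= 1
        if remaining == 0:  # column fully filled: stop early
            break
    return column, examined


column, examined = draw_column([(0, 2, "A"), (2, 4, "B"), (0, 4, "C")], 4)
print(column, examined)
```

Since each pixel is written at most once, the work per column is bounded by the column height plus the number of hits examined.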
Wolfenstein 3D render loop in pure hardware! No CPU, no instruction pointer, no opcodes, only wires and flip-flops. Here it runs on a Mojo V3 board (Xilinx Spartan 6) + SDRAM. Reading @fabynou's black books while learning about #FPGA could only lead to this ;-)
(1/n)
Implemented from scratch using my language, from the SDRAM double-framebuffer to the Wolf3D DDA algorithm (and this is the original one: fixed point, DDA loop with only adds and shifts, tangent table!). 320x200, 256-color palette with 18-bit colors, and VGA output -- old school!
(2/n)
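For readers unfamiliar with this kind of DDA, here is a heavily simplified sketch of a fixed-point grid traversal whose inner loop uses only adds, shifts and compares. It is not the original Wolf3D code nor the hardware design: it assumes 16.16 fixed point, a ray heading strictly towards +x and +y, a map enclosed by walls, and tangent/cotangent values passed in directly instead of being read from a table. All names are illustrative.

```python
FRAC = 16
ONE = 1 << FRAC  # 1.0 in 16.16 fixed point


def cast_ray(grid, px, py, tan_fx, cot_fx):
    """Find the first wall tile hit by a ray heading towards +x, +y.

    grid[row][col] is nonzero for walls; px, py, tan_fx, cot_fx are 16.16.
    Returns (side, tile_x, tile_y).
    """
    xtile = (px >> FRAC) + 1  # next vertical grid line
    ytile = (py >> FRAC) + 1  # next horizontal grid line
    # Setup needs one multiply per axis; the loop below needs none.
    yint = py + ((((xtile << FRAC) - px) * tan_fx) >> FRAC)  # y at x = xtile
    xint = px + ((((ytile << FRAC) - py) * cot_fx) >> FRAC)  # x at y = ytile
    while True:
        if yint < (ytile << FRAC):
            # The vertical grid line is crossed first.
            if grid[yint >> FRAC][xtile]:
                return ("vertical", xtile, yint >> FRAC)
            xtile += 1
            yint += tan_fx
        else:
            # The horizontal grid line is crossed first.
            if grid[ytile][xint >> FRAC]:
                return ("horizontal", xint >> FRAC, ytile)
            ytile += 1
            xint += cot_fx


# 5x5 map with border walls, player at (1.5, 1.5), ray slope 0.5.
grid = [[1 if x in (0, 4) or y in (0, 4) else 0 for x in range(5)] for y in range(5)]
hit = cast_ray(grid, 3 * ONE // 2, 3 * ONE // 2, ONE // 2, 2 * ONE)
print(hit)
```

Each step advances one intercept by a constant increment, which is the property that makes this style of loop map well to simple hardware.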