
Sylvain Lefebvre

10 May, 17 tweets, 11 min read

How much DooM can fit in a USB port? Quite a bit, it turns out! A minuscule #Fomu #fpga board hosts my hardware/software re-implementation of the DooM render loop within the confines of a USB port (uses ~4200 LUTs and < 128 kB of internal RAM). (1/n)

This is a tiny piece of DooM in a 2.1x2.7 mm #fpga. That is pretty small! (can you see it below on the #Fomu board? you might have to zoom ...).

Within it, I created a #riscv computer with specialized texturing and column-drawing hardware, designed to render DooM 1994 levels! (2/n)


The OLED screen is connected to the #Fomu through jumper wires soldered on the pads (a trick inspired by @brunolevy01 Fomu vga mod). (3/n)

CPU-side, the renderer traverses the BSP tree and determines the order of the sub-sectors (SSECTORS). For each screen column, it then intersects the walls of the sub-sectors (SEGS), front to back. The hardware takes care of the actual column drawing and texturing. (4/n)
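
A minimal sketch of the front-to-back sub-sector ordering described above, written in Python rather than the Silice/RISC-V code of the actual project. The `Node` layout and the names `side_of` and `front_to_back` are assumptions made for illustration; they loosely follow the DooM NODES idea (a partition line given by a point and a direction) and are not taken from the author's source.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    # Partition line: passes through (x, y) with direction (dx, dy).
    x: float
    y: float
    dx: float
    dy: float
    right: Union["Node", int]  # child on the right side (an int is a sub-sector id)
    left: Union["Node", int]   # child on the left side


def side_of(node: Node, px: float, py: float) -> int:
    """Return 0 if (px, py) lies on the right of the partition line, 1 otherwise."""
    cross = (px - node.x) * node.dy - (py - node.y) * node.dx
    return 0 if cross >= 0 else 1


def front_to_back(root: Union[Node, int], px: float, py: float) -> List[int]:
    """List sub-sector ids from nearest to farthest as seen from (px, py)."""
    order: List[int] = []
    stack = [root]  # explicit stack instead of recursion
    while stack:
        item = stack.pop()
        if isinstance(item, int):
            order.append(item)  # leaf: a sub-sector
            continue
        if side_of(item, px, py) == 0:
            near, far = item.right, item.left
        else:
            near, far = item.left, item.right
        stack.append(far)   # visited second
        stack.append(near)  # visited first
    return order


# Hypothetical three-sub-sector map: sub-sector 2 is x < 0,
# sub-sector 1 is x > 0 and y > 0, sub-sector 0 is x > 0 and y < 0.
inner = Node(0, 0, 1, 0, right=0, left=1)
root = Node(0, 0, 0, 1, right=inner, left=2)
print(front_to_back(root, 5, 5))
```

The child on the viewer's side of each partition is always visited first, which is what yields a front-to-back order without any sorting.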


SSECTORS? SEGS? Time for a refresher on DooM BSP! Check out @fabynou's excellent DooM Black Book and @FSouchu's post on his amazing PICO-8 port. (5/n)

freds72.itch.io/poom/devlog/24…

The design fits nicely within the #ice40 UP5K #fpga. See the DSP usage? 8 / 8 ;-) (most are used for ground/ceiling texturing, aka flats). (6/n)

Command buffers (FIFOs) sit between the CPU, the hardware renderer, and the OLED SPI controller. This keeps everyone happily busy. (7/n)
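
To illustrate how a FIFO decouples a fast producer from a slower consumer, here is a small software model in Python. It is only a sketch: the class `CommandFifo`, the function `simulate`, and the cycle-based timing are hypothetical and do not reflect the actual Silice hardware design.

```python
from collections import deque


class CommandFifo:
    """Bounded FIFO: push fails when full, pop returns None when empty."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = deque()

    def push(self, cmd) -> bool:
        if len(self.items) >= self.capacity:
            return False  # full: the producer has to retry later
        self.items.append(cmd)
        return True

    def pop(self):
        return self.items.popleft() if self.items else None


def simulate(num_cmds: int, capacity: int, consume_every: int):
    """The producer tries to push one command per cycle; the consumer pops
    one command every `consume_every` cycles. Returns the received commands."""
    fifo = CommandFifo(capacity)
    produced = 0
    received = []
    cycle = 0
    while len(received) < num_cmds:
        if cycle % consume_every == 0:
            cmd = fifo.pop()
            if cmd is not None:
                received.append(cmd)
        if produced < num_cmds and fifo.push(produced):
            produced += 1
        cycle += 1
    return received


print(simulate(num_cmds=8, capacity=4, consume_every=2))
```

The producer only stalls once the buffer is full, and the commands still arrive in the order they were issued.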

Everything is written from scratch in #Silice, from the #riscv CPU to the texture unit. Original game data is extracted from doom1.wad by the Lua pre-processor of Silice. (8/n)

github.com/sylefeb/Silice

The #Fomu SPI-flash is used to initialize the fpga (bitstream + SPRAM: code, level data). However, with the Fomu we cannot write at arbitrary locations in SPI-flash from a host computer (using dfu-util), except ...


I found a simple trick to safely store data on the #Fomu SPI-flash: concatenate the data to the bitstream file, and dfu-util is happy to upload everything. The data is then at address 262144 (warmboot slot) + 104106 (bitstream size).
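
The address arithmetic behind this trick can be sketched in a few lines of Python. The two constants are the values quoted in the tweet; the function name `append_payload` is hypothetical, and the sketch only builds the concatenated image in memory (uploading it is left to dfu-util, as described above).

```python
WARMBOOT_SLOT = 262144   # flash address of the warmboot slot (from the thread)
BITSTREAM_SIZE = 104106  # bitstream size in bytes (from the thread)


def append_payload(bitstream: bytes, payload: bytes, slot_base: int = WARMBOOT_SLOT):
    """Concatenate payload after the bitstream.

    Returns the combined image and the flash address at which the payload
    starts, assuming the image is written starting at slot_base.
    """
    image = bitstream + payload
    payload_address = slot_base + len(bitstream)
    return image, payload_address


# Stand-in bitstream of the right size; a real one would be read from a file.
image, address = append_payload(bytes(BITSTREAM_SIZE), b"level data")
print(address, hex(address))
```

With the numbers from the thread, the data starts at 262144 + 104106 = 366250, that is 0x596AA.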

(cc @brunolevy01, @unaimarcor, @juanmard)

The textures are heavily downsampled, but I have a couple of ideas to improve that. Many details are also incorrect, but these are mostly minor (texture scale and alignment, e.g. upper unpegged, lower unpegged, offsets, etc.).


This DooM demo should run on any #ice40 UP5K, and it does run as-is on the mighty IceBreaker board by @1bitsquared which I also used for development.


And don't miss other great works on DooM + #fpga:
- @ultraembedded #ulx3s port

https://twitter.com/ultraembedded/status/1284515315062317061?lang=en

- @tnt IceBreaker + PSRAM hack port
- @wren6991 DoomSOC (work in progress) github.com/Wren6991/DOOMS…
(?/n)


The port by @tnt is particularly impressive. It is based on a #riscv architecture targeting an #ice40 UP5K on an IceBreaker board. It does, however, require a mod to add a PSRAM chip to the IceBreaker. Check out the great explanatory video!
(?+1/n)

(edit: that's DooM 1993 ;) )


(edit: @FSouchu POOM is a complete remake, rather than a port, as it fully revisits the game for the PICO-8)

Of course now I am thinking about many optimizations!
- texture compression (I have a block-based prototype, à la S3TC)
- better handling of flats drawing (ground/ceiling), which uses too much hardware for my taste
- improved CPU code (skipping some redundancies)
Stay tuned ;-)
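
For readers unfamiliar with block-based texture compression, here is a toy Python sketch of the general S3TC idea: each 4x4 block stores two endpoint values plus a 2-bit index per texel that selects one of four interpolated levels. This is not the author's prototype; it works on single 8-bit values per texel for simplicity, and the function names are hypothetical.

```python
def block_levels(lo: int, hi: int):
    """Four levels evenly spread between the two endpoints."""
    return [lo, (2 * lo + hi) // 3, (lo + 2 * hi) // 3, hi]


def compress_block(texels):
    """texels: 16 values in 0..255. Returns (lo, hi, sixteen 2-bit indices)."""
    lo, hi = min(texels), max(texels)
    levels = block_levels(lo, hi)
    # For each texel, pick the level closest to its value.
    indices = [min(range(4), key=lambda i: abs(levels[i] - t)) for t in texels]
    return lo, hi, indices


def decompress_block(lo: int, hi: int, indices):
    levels = block_levels(lo, hi)
    return [levels[i] for i in indices]


# Storage per block: 2 endpoint bytes + 16 texels * 2 bits = 6 bytes,
# compared with 16 bytes uncompressed.
COMPRESSED_BYTES = 2 + (16 * 2) // 8

block = [0] * 8 + [255] * 8
lo, hi, idx = compress_block(block)
print(decompress_block(lo, hi, idx) == block, COMPRESSED_BYTES)
```

Blocks with at most two distinct values round-trip exactly; other blocks are approximated by the nearest of the four levels.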


More from @sylefeb

Sylvain Lefebvre

@sylefeb

8 May 20

The DooM-chip! It will run E1M1 till the end of times (or till power runs out, whichever comes first).
Algorithm is burned into wires, LUTs and flip-flops on an #FPGA: no CPU, no opcodes, no instruction counter.
Running on Altera CycloneV + SDRAM. (1/n)

Everything is described in a language I am working on: SDRAM controller, divider, BSP traversal, texture unit, etc.
Main renderer (w/o data) is 666 lines of code (!).
A great test case, made quite a few improvements, fixed some issues, learned a lot on CycloneV + Quartus.
(2/n)

Rendering uses the original BSP tree (of course!) but is modified to better fit a hardware implementation ; columns are raycast and drawn immediately front-to-back, stopping as soon as fully filled.
(3/n)

fabiensanglard.net/doomIphone/doo…


Sylvain Lefebvre

@sylefeb

27 Apr 20


Wolfenstein 3D render loop in pure hardware! No CPU, no instruction pointer, no opcodes, only wires and flip-flops. Here it runs on a Mojo V3 board (Xilinx Spartan 6) + SDRAM. Reading @fabynou's black books while learning about #FPGA could only lead to this ;-)
(1/n)

Implemented from scratch using my language, from the SDRAM double-framebuffer to the Wolf3D DDA algorithm (and this is the original one: fixed point, DDA loop with only adds and shifts, tangent table!). 320x200, 256-color palette with 18-bit colors, and VGA output -- old school!
(2/n)


DDA algorithm heavily building on the original @ID_AA_Carmack AsmRefresh impl.:
github.com/id-Software/wo…

Fascinating to look back into it! Love these runtime patches ;-)

mov [BYTE cs:horizop],OP_JLE ; patch a jle in

(can't do on FPGAs ... unless reconfig at runtime ...??)
(3/n)

