wow, this paper is remarkable! performing computations with DRAM in a way that is very similar to how certain undocumented 6502 instructions implement wired-OR.
i want to explain it now. the heart of a DRAM is a very tiny capacitor. it is tied into a matrix of wiring using a single transistor. access to the capacitors is done a row at a time; the row line goes high, turning on the transistors for the entire row.
so to read a row, let's say we activate the transistors in row R1. the capacitor in a single bit dumps its charge into the bit line (the vertical wire) and shares the electrical charge with whatever capacitance is there. the bit line is precharged to 1/2 VCC.
if the bit holds a 1, we're *adding* charge to the bit line, so the bit line voltage increases slightly above 1/2 VCC. then the sense amplifiers fire up, detect that the voltage > 1/2 VCC, and *pull it up the rest of the way to VCC*. (a stored 0 pulls the bit line slightly below 1/2 VCC instead, and the sense amplifiers drive it down to 0V.)
so the sense amplifier *recharges* the bit that we just read back as well as transmitting the state of the bit line to the data output. incidentally this is how refresh works: you open the row, and the sense amplifiers recharge all the bits in that row (either to 1 or 0).
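here's a rough sketch of that read cycle in python. the capacitance values (`C_CELL`, `C_BITLINE`) and the function names are made up for illustration, not taken from the paper or any datasheet; the only point is that a small cell capacitor nudges a big precharged bit line to one side of 1/2 VCC, and the sense amplifier does the rest.

```python
VCC = 1.0          # supply voltage, normalized
C_CELL = 1.0       # assumed cell capacitance (arbitrary units)
C_BITLINE = 10.0   # assumed bit line capacitance (much larger than the cell)

def share_charge(cell_voltage, bitline_voltage=VCC / 2):
    """Voltage after the cell capacitor is connected to the bit line."""
    total_charge = C_CELL * cell_voltage + C_BITLINE * bitline_voltage
    return total_charge / (C_CELL + C_BITLINE)

def sense(bitline_voltage):
    """Sense amplifier decision: which side of 1/2 VCC are we on?"""
    return 1 if bitline_voltage > VCC / 2 else 0

def read_bit(stored_bit):
    """Read one cell; returns (bit, bit line voltage before sensing, restored cell voltage)."""
    shared = share_charge(stored_bit * VCC)
    bit = sense(shared)
    restored = bit * VCC   # the amplifier drives the line (and the open cell) to full rail
    return bit, shared, restored

for stored in (0, 1):
    print(stored, read_bit(stored))
```

with these made-up numbers a stored 1 only moves the bit line a few percent above 1/2 VCC, which is why the amplifier (and the recharge it performs) is needed at all.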
this is where the authors get clever. they open a row, and then they skip the precharge cycle, and then open a second row. instead of sitting at 1/2 VCC, the bit lines are all at 0V or VCC depending on the contents of R1. opening R2 in this state *copies* the entire row R1 into R2!
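a toy model of the copy trick, at the level of bits rather than voltages. the `open_row` and `copy_row` helpers are hypothetical names for this sketch, and it assumes the sense amplifiers always win against the small cell capacitors of the second row.

```python
def open_row(rows, name, bitlines=None):
    """Activate one row and return the bit line values afterwards.

    bitlines=None models precharged bit lines (a normal read): the cells
    decide what the sense amplifiers latch. Otherwise the bit lines are
    still driven to full rail, so the newly opened cells get overwritten.
    """
    if bitlines is None:
        bitlines = list(rows[name])
    rows[name] = list(bitlines)   # open cells are (re)charged from the bit lines
    return bitlines

def copy_row(rows, src, dst):
    """Open src, skip the precharge, then open dst."""
    bitlines = open_row(rows, src)    # normal activation
    open_row(rows, dst, bitlines)     # no precharge in between

rows = {"R1": [1, 0, 1, 1], "R2": [0, 0, 0, 0]}
copy_row(rows, "R1", "R2")
print(rows)
```

the source row is untouched because the amplifiers restore it during its own activation, exactly like a refresh.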
then the authors get *devilishly clever* and manage to open 3 rows at once! they use the principle of charge sharing to implement logical AND: two rows hold the operands and the third holds a constant 0, so the bit line only ends up above 1/2 VCC when at least two of the three cells hold a 1. in the example of 1 & 0, there is not enough electrical charge and the result < 1/2 VCC, therefore the output is 0.
for logical OR, they flip the constant in that third bit cell to a '1'. therefore it only takes the charge from 1 other bit cell to exceed 1/2 VCC and output a 1.
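to see why the constant matters, here is a small charge-sharing sketch for one bit line with three cells open at once. the capacitance values are assumptions picked for illustration (real ratios differ), and `three_row_activate` is a hypothetical helper, not something from the paper.

```python
VCC = 1.0          # supply voltage, normalized
C_CELL = 1.0       # assumed cell capacitance (arbitrary units)
C_BITLINE = 10.0   # assumed bit line capacitance

def three_row_activate(a, b, constant):
    """Open three cells on one precharged bit line and sense the result."""
    charge = C_CELL * VCC * (a + b + constant) + C_BITLINE * VCC / 2
    voltage = charge / (3 * C_CELL + C_BITLINE)
    return 1 if voltage > VCC / 2 else 0   # sense amplifier decision

def dram_and(a, b):
    return three_row_activate(a, b, 0)     # constant cell holds 0

def dram_or(a, b):
    return three_row_activate(a, b, 1)     # constant cell holds 1

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", dram_and(a, b), "OR:", dram_or(a, b))
```

in this model the bit line ends up above 1/2 VCC only when at least two of the three cells hold a 1, so the constant cell decides whether you get AND or OR.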
so to sum up, you get the ability to copy 65536 bits (contents of a single row) all at once. you get the ability to run 65536 logical AND or logical OR operations at once.
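a software analogy for the "whole row at once" part, treating a row as one 65536-bit python integer. this is only a sketch of the logic (a bitwise majority of three rows), with hypothetical helper names; it says nothing about timing or reliability on real chips.

```python
import random

ROW_BITS = 65536                   # row size used in the thread
ALL_ONES = (1 << ROW_BITS) - 1     # a row filled with constant 1s

def majority(a, b, c):
    """Bitwise majority of three rows: each result bit is 1 if at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)

def row_and(a, b):
    return majority(a, b, 0)          # third row is all 0s

def row_or(a, b):
    return majority(a, b, ALL_ONES)   # third row is all 1s

a = random.getrandbits(ROW_BITS)
b = random.getrandbits(ROW_BITS)
print(row_and(a, b) == (a & b), row_or(a, b) == (a | b))
```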
all of this done on commodity DRAM chips but with a custom DRAM controller that can shorten the timing cycles.
(the paper covers a bunch of other stuff, measuring reliability of the process over temperature, voltage, and manufacturer.)
the paper is directly useful for folks who are doing lots of number crunching in FPGAs with soft-IP core DRAM controllers. for folks using PCs with hard-IP DRAM controllers, there's nothing really useful here, sorry.
oh, and one thing i forgot to mention: they didn't discuss this in the paper, but it should be possible to make multiple copies of a single row. you open a row, skip the precharge, then open a row for a copy, then keep opening more and more rows until you've made all the copies.