@ethereum First, quick shout out to @christine_dkim who has been writing recaps as well over the past few calls. I encourage folks to read her thread too for an additional perspective.
@ethereum@christine_dkim As for the call itself, the first thing we covered was the set of recent shadow forks of Goerli. TL;DR: shadow forks allow us to run The Merge on a minority of nodes on the network, but still receive transactions going to the main chain.
@ethereum@christine_dkim Because of this, we can test clients working in an environment closer to live networks (e.g. the network has a large state, history, etc.) and keep getting a regular stream of transactions, most of which will be valid, at least until the state on both chains diverges too much.
@ethereum@christine_dkim This is helpful because it can help us find issues related to sync, performance, etc. which are harder to trigger on brand new devnets. And these runs were no exception!
@ethereum@christine_dkim The first shadow fork this week found several issues in clients related to timeouts. The second, encouragingly, went a bit smoother. We are planning a third shadow fork of Goerli on Monday and then a mainnet shadow fork later next week!
@ethereum@christine_dkim The mainnet shadow fork will be a great way to collect data about how post-merge nodes behave in mainnet conditions, and how the large blocks, state and history affect node sync, stability, performance, etc.
@ethereum@christine_dkim For example, on the Goerli shadow forks, there were some issues with the Geth/Teku combo on 8GB RAM nodes because both of them required a bit over 4GB and kept fighting for RAM.
@ethereum@christine_dkim It's worth noting that these shadow forks, with each possible combination of EL/CL client pairs, are being set up largely by @parithosh_j: if you see him in Amsterdam, you probably should buy him a beer 🍻
@ethereum@christine_dkim@parithosh_j After this overview of the shadow forks, we spent time discussing the various timeout issues we had seen on them. There was a lot of discussion about the nuances of implementations here, and I recommend the livestream for those interested in that level of detail.
@ethereum@christine_dkim@parithosh_j In short, most CLs tend to send ELs the call to retrieve a block too soon after they send the call to produce it. Such a short gap doesn't give ELs enough time to properly construct blocks, and leads to the CLs submitting empty beacon blocks.
@ethereum@christine_dkim@parithosh_j This is often triggered in cases where the EL still is validating the most recent blocks on the chain. The CL expects a response in a few hundred milliseconds, but in practice it could wait up to 4s in the slot to get a block from the EL.
@ethereum@christine_dkim@parithosh_j That said, there are some issues in setting explicit timeouts, as it opens attack vectors where someone tries to create blocks that take slightly longer than the specified timeout to be validated.
@ethereum@christine_dkim@parithosh_j We couldn't come to a solution for the best values to use directly on the call, but we'll keep discussing it async and hopefully have something everyone is OK with on next week's CL call.
@ethereum@christine_dkim@parithosh_j Next up, we discussed another issue we've seen recently, which has to do with how EL clients, when they realize a block is invalid, return the hash of the last valid block they've seen on that fork, called latestValidHash.
@ethereum@christine_dkim@parithosh_j Returning this helps CL clients triage between various fork trees and discard invalid ones. When the EL is synced and has a full view of the current state, this is something easy to do.
@ethereum@christine_dkim@parithosh_j That said, when the EL is syncing, it may not have a good view into all the possible fork trees, and depending on EL client implementations, it can be harder to return the "true" latestValidHash. @mkalinin2 has a document explaining the issue: hackmd.io/GDc0maGsQeKfP8…
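To make the latestValidHash idea concrete, here is a minimal sketch, not any client's actual implementation. It assumes the EL keeps a map of known blocks, each marked valid, invalid, or not yet validated; the `Block` type and `latest_valid_hash` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Block:
    hash: str
    parent_hash: str
    # True/False once the EL has executed the block, None if it has not
    # been validated yet (for example, while the node is still syncing).
    valid: Optional[bool]


def latest_valid_hash(blocks: Dict[str, Block], invalid_hash: str) -> Optional[str]:
    """Walk back from an invalid block to its most recent valid ancestor.

    Returns None when the walk reaches an unknown or unvalidated ancestor,
    which is the "EL is syncing" case where no reliable answer exists.
    """
    current = blocks.get(blocks[invalid_hash].parent_hash)
    while current is not None:
        if current.valid is True:
            return current.hash
        if current.valid is None:
            # We cannot tell whether this ancestor is valid, so give up.
            return None
        current = blocks.get(current.parent_hash)
    return None
```

With a fully validated chain the walk always ends at a valid ancestor. The `None` branch is where implementations can differ while syncing, which is the hard case described above.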
@ethereum@christine_dkim@parithosh_j@mkalinin2 Again, the discussion on the call went into much more depth so I'd encourage folks interested in the details of sync implementations to watch the full recording.
@ethereum@christine_dkim@parithosh_j@mkalinin2 As with the timeout issues, this is something we'll keep discussing async over the week and hopefully have a solution to on the CL call next Thursday.
@ethereum@christine_dkim@parithosh_j@mkalinin2 We've been tracking the difficulty bomb's progression (see: ethresear.ch/t/blocks-per-w…). It's hard to predict when the bomb will show up and by how much because, once it starts kicking in, it gets harder to estimate how long it will take for the next increment to be hit.
@ethereum@christine_dkim@parithosh_j@mkalinin2 This is because the bomb increases its impact every 100k blocks, but each batch of 100k blocks will take longer to mine because of the bomb's effect.
@ethereum@christine_dkim@parithosh_j@mkalinin2 Additionally, if new hashrate comes on the network, it can mitigate the bomb's effect, to an extent. Conversely, if hashrate leaves, the effect can be increased.
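As a rough illustration of why the bomb steps up every 100k blocks, here is a sketch of the exponential term in the proof-of-work difficulty formula. It assumes the usual form, 2^(period - 2) with the period counted on a "fake" block number that past upgrades have pushed back by a fixed delay. The function name and the `delay` parameter are illustrative, and the actual delay value has to come from the fork in effect.

```python
def bomb_term(block_number: int, delay: int, period_length: int = 100_000) -> int:
    """Exponential difficulty-bomb term added to a block's difficulty.

    `delay` is the number of blocks the bomb has been pushed back by;
    it is a caller-supplied assumption here, not a value from this thread.
    """
    fake_block_number = max(0, block_number - delay)
    exponent = fake_block_number // period_length - 2
    if exponent < 0:
        # The bomb has not started contributing yet.
        return 0
    return 2 ** exponent
```

The term doubles at every period boundary, which is why each step is felt more than the last. New hashrate raises the non-bomb part of the difficulty, so the same bomb term matters relatively less, and the reverse holds when hashrate leaves.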
@ethereum@christine_dkim@parithosh_j@mkalinin2 The Merge makes this the hardest part to estimate, because you'd expect miners to start selling their GPUs and hashrate to drop, but when and how quickly is impossible to predict.
@ethereum@christine_dkim@parithosh_j@mkalinin2 So, with all these caveats, if we make some rough estimations, we can expect block times to hit ~15s in early July and ~17s in late July. Again, large error bars here!
@ethereum@christine_dkim@parithosh_j@mkalinin2 Personally, I think ~15s is probably the max increase we can tolerate, so ideally we'd want to either merge before hitting the 17s blocks, or delay the bomb. Others might disagree here, but we didn't go into that too much on the call.
@ethereum@christine_dkim@parithosh_j@mkalinin2 I think that if we want to hit that, we need to be ready to start forking testnets by late April, or 2 ACD calls from now. If we get there and aren't ready for testnets, it seems like the best path would be to delay the bomb (unless we want 17-25s blocktimes).
@ethereum@christine_dkim@parithosh_j@mkalinin2 Then, we discussed how long we'd want testnets to run for. There seemed to be consensus that we want to see them run smoothly for longer than in usual forks (which usually have 1 week between networks + 3-5 weeks after that).
@ethereum@christine_dkim@parithosh_j@mkalinin2 We didn't come to a clear schedule, but given that it's a harder than usual type of upgrade (because we're using a TTD and not block number + we need to stand up new beacon chains for 2/3 testnets), we might have client releases which gradually add testnets, vs. all at once.
@ethereum@christine_dkim@parithosh_j@mkalinin2 I'll spend some time before the next call sketching out what this could look like. We'll also have a better picture of how client implementations are going once we've done a few more shadow forks, including mainnet!
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes In short, the update reflects the rough consensus from the last ACD, to append a list of withdrawals as part of the block headers instead of repurposing the ommers list which, post-merge, will be empty.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes There are also some details added about why a system-level operation was chosen rather than a new transaction type. On the call, there were some minor questions about the withdrawal index field's purpose.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz Moody gave an overview of the impact of the EIP, which would provide significant gas savings and enable new contract patterns, an update on some of the technical issues (see the issue above) and shared he believes this EIP is the best version of a transient storage feature.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz I personally am wary of overcommitting so early in the process. Additionally, a few developers shared that while they were sympathetic to 1153, *if* we had more room in Shanghai, it wouldn't be their first choice for what to include.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz We added the Considered for Inclusion status in 2018/19 or so as a way to highlight that while we would try and implement + ship all CFI EIPs, there was no guarantee that implementation issues wouldn't arise so it wasn't certain CFI EIPs would make it into an upgrade.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz To put this in perspective, say The Merge happens in the summer-ish, then Shanghai is 6+ months after that (Winter 2022/23), then we'd be making a decision for what comes after *that*, so basically at least Summer 2023. It's hard to know our priorities won't change by then!
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz That said, I appreciate how frustrating this can be for EIP champions, and realize that several EIPs are now in this limbo state. There really isn't a silver bullet here, because our main bottleneck is the fact that there is a limited number of changes we can test + deploy safely.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz Next up on the call, unfortunately with only 5 minutes left, @q9fmz came to propose a change to help deal with Goerli ETH hoarding. Recently, faucets have been drained and people have been purchasing GoETH in the hopes that it might accrue value.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz@q9fmz This has made it much harder for developers who want to use Goerli for its intended purpose, testing, to get GoETH. Afri had two proposals which would mint a huge amount of GoETH to basically ensure the coins would always be worthless.
@ethereum@christine_dkim@parithosh_j@mkalinin2@ralexstokes@sendmoodz@q9fmz There was some discussion on the call about the pros/cons of this approach vs. potentially just restarting the network from scratch, and unfortunately we didn't have enough time to go deep into them. We're going to continue the conversation async in the Goerli repo linked above!
First up on the call, @parithosh_j gave an overview of the Kiln launch. The PoW chain went live last Wednesday, and the Beacon Chain on Friday.
@parithosh_j We were hoping to merge mid-week this week, but there was an unexpected increase in hashrate on the network. This forced us to use the override feature we built into clients to set a new terminal total difficulty value. This worked perfectly! All clients merged at the new TTD!
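For readers less familiar with TTD, here is a small sketch of how a TTD-based trigger and an override can fit together. It assumes the common definition of the terminal PoW block: the first block whose total difficulty reaches the TTD while its parent's does not. Both function names are hypothetical and do not correspond to any client's flags or APIs.

```python
from typing import Optional


def effective_ttd(configured_ttd: int, override_ttd: Optional[int] = None) -> int:
    """Pick the TTD a node should use, preferring an operator override."""
    return override_ttd if override_ttd is not None else configured_ttd


def is_terminal_pow_block(block_td: int, parent_td: int, ttd: int) -> bool:
    """True for the first block whose total difficulty reaches the TTD."""
    return block_td >= ttd and parent_td < ttd
```

Because total difficulty grows faster when hashrate rises, the block at which a given TTD is reached cannot be pinned down in advance the way a block number can, which is why an override is useful.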
@ethereum First on the call, we discussed the latest updates to Kiln: our upcoming merge testnet. @vdWijden has been trying to get all client combinations working and to put together a doc with instructions about how to pair them.
@ethereum@vdWijden One thing that wasn't fully aligned yet was client authentication between the EL + CL nodes. We had rough agreement on the call about how to proceed, and will make sure all clients behave similarly. In short, all calls on the auth'd port will require a token.
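The thread doesn't spell out the token format, so the following is only a sketch under an assumption: a JWT-style token signed with HMAC-SHA256 over a secret shared between the EL and CL, carrying an issued-at (`iat`) claim. The `make_token` and `verify_token` helpers are hypothetical names, not part of any client.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_token(secret: bytes, issued_at: int) -> str:
    """Build an HS256-signed token with a single `iat` claim."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url(json.dumps({"iat": issued_at}, separators=(",", ":")).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str) -> bool:
    """Check only the signature; a real server would also check `iat` freshness."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    header, payload, signature = parts
    expected = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(_b64url(expected).encode("ascii"), signature.encode("utf-8"))
```

In this sketch the CL would attach the result of `make_token` to every request on the authenticated port, and the EL would reject any request where `verify_token` fails.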
Wrapped up another @ethereum #AllCoreDevs today. Covered the Goerli outage, Kiln updates, merge testing, beacon chain withdrawals and a few more misc. items.
@ethereum Re: the stream, apologies but due to a Zoom config issue, the first part of the discussion is missing the audio from everyone except me. The notes will have a full transcript for that part of the call too.
@ethereum First on the call, we discussed the Goerli issue from overnight: a number of validators were down and the alerting software didn't catch it. On the call, the root cause of the issue hadn't been found yet, but several teams were investigating.
Another @ethereum #AllCoreDevs wrapped up this morning. Deep dives into The Merge & Shanghai, one of my favorite calls in a while. Def recommend the full recording as the signal/noise ratio was very high. If you don't have 90 mins, here's a recap!
We've covered these issues on previous calls, and fixed them in the latest spec, Kiln, but the report gives a great overview of what happened and the impacts it had.
@ethereum First on the call, @mkalinin2 presented an upgrade to the Engine API spec to address some issues we've seen on Kintsugi w.r.t. re-orgs. Full link here: github.com/ethereum/execu…
@ethereum@mkalinin2 In short, the PR makes it optional for EL clients to actually execute a payload when receiving an executePayload call (and hence renames that call to newPayload for clarity).
@ethereum First, updates on the Kintsugi 🍵 devnet! There was a lot of testing over the holidays, and @vdWijden managed to break the network again.
TL;DR: his fuzzer created a block which replaced certain fields with others, and because of caching+validation issues, some clients accepted an invalid block as valid.