On July 13, Celo Mainnet suffered a stall that lasted 24 hours. I’m writing this post to share the post mortem document about the incident.
You can find the details of it here.
Aside from that, we’ll be holding a community-wide post mortem meeting tomorrow, July 27th, at 8AM PST / 3PM UTC. The meeting will be recorded.
Meeting Link: meet.google.com/kim-ezgt-nwe
The Hotfix on July 20th
Additionally, I wanted to share that on July 20th a governance hotfix was approved to decrease the block gas limit from 50 million gas to 20 million gas. (For those not familiar with the hotfix mechanism, check our docs.)
After the network stall, we started looking for other ways the network could generate consensus messages that exceed the 10MB size limit imposed by the p2p network protocol layer, so that we could catch future issues proactively. Unfortunately, with a block gas limit of 50 million gas, there are still ways to generate a consensus message larger than 10MB.
To fix this, we’re refactoring consensus messages to avoid this scenario. While that change is in progress, it was important to take immediate action to prevent another outage, so we submitted a hotfix proposal to revert the block gas limit to its value before CGP-53 (20 million gas).
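As a rough illustration of why the block gas limit bounds message size, here is a back-of-the-envelope sketch. It is not the analysis from the post mortem: the figure of 4 gas per payload byte is an assumption borrowed from Ethereum-style calldata pricing for zero bytes, it ignores per-transaction overhead, and the function names are hypothetical. It only shows how a lower gas limit shrinks the largest payload a single block could carry.

```python
# Illustrative upper bound only. The gas-per-byte value is an ASSUMPTION
# (Ethereum-style pricing for zero calldata bytes), not a figure from the post mortem.
P2P_MESSAGE_LIMIT_BYTES = 10 * 1024 * 1024  # the 10MB p2p limit mentioned in the post
ASSUMED_GAS_PER_BYTE = 4                    # assumption, see note above


def max_payload_bytes(block_gas_limit: int, gas_per_byte: int = ASSUMED_GAS_PER_BYTE) -> int:
    """Largest payload (in bytes) that fits in one block if all gas pays for data."""
    return block_gas_limit // gas_per_byte


def can_exceed_limit(block_gas_limit: int, limit_bytes: int = P2P_MESSAGE_LIMIT_BYTES) -> bool:
    """True if a single block's payload alone could be larger than the p2p limit."""
    return max_payload_bytes(block_gas_limit) > limit_bytes


if __name__ == "__main__":
    for gas_limit in (50_000_000, 20_000_000):
        print(gas_limit, max_payload_bytes(gas_limit), can_exceed_limit(gas_limit))
```

Under these assumptions, a 50 million gas block could carry about 12.5 million bytes of payload, which is above the limit, while a 20 million gas block tops out at about 5 million bytes.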
Using celocli, you can check the hotfix:
celocli governance:show --hotfix 0x781f90afc086489bda4e9ad15b881ca7f505bb4bdc6f46a4e0dfb016ad702467
We didn’t want to publish technical information before the Hotfix was approved. Once it was approved, we held a technical debrief last Thursday, and we are now able to share the post mortem and have a proper community call with everyone.
Thank you for your patience!