Kusama: missed Paravalidation vote investigation
It finally happened. My first ever "incident" during Kusama para-validation (missed vote). Let me explain what happened, how I noticed it and what investigations I conducted to understand the reason behind it.
What happened?
My Kusama validator node missed a vote during para-validation session 43004. A healthy validator should normally not miss any votes: it must stay connected to its peers, continuously import and validate new blocks, and exchange information with the rest of the network.
A missed vote can happen, for example, if the network connection goes down, if the hard drive fails, or if the node reboots during a para-validation session because of an update.
How did I notice?
Ever heard of SubVT? It’s an amazing application that allows monitoring of the network and validators. By checking my para-validation statistics, as I always do to keep an eye on how my node is performing, I noticed something unusual.
Even though this was just one missed vote among several hundred successful votes, I was still worried because it should not have happened. I had been proud to say that I had thousands of votes with zero missed, as in my post on the Polkadot forum.
What did I do?
First, I checked whether my validator node was running normally and whether there were any errors in the logs printed by the polkadot binary. Everything seemed to be perfectly fine: no error logs, no updates or reboots…
I checked other reports and statistics for my node, for example on One-T: no visible issue, my node was still performing at A+ grade.
Then I went through the logs looking for anomalies or odd events not necessarily categorized as errors. I also checked my server performance: CPU, RAM, storage… Everything was fine.
Thanks to SubVT, I knew that the missed vote occurred in session 43004, so I started my investigation by listing all the blocks imported during that session. To do that, I collected every imported block between these two log lines (inclusive):
Oct 24 13:27:55 kusama polkadot[984360]: 2024-10-24 13:27:55 🏆 Imported #25479552 (0xd4ff…810a → 0xc26c…49a2)
Oct 24 14:27:48 kusama polkadot[984360]: 2024-10-24 14:27:48 🏆 Imported #25480137 (0x3dc1…7c2c → 0xef08…c78d)
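Extracting the import lines from the node log can be sketched in Python as follows. This is only an illustration of the idea, not the tooling I actually used: the function name `parse_imports` and the regular expression are my own assumptions, and the sketch assumes the node logs its timestamps in UTC.

```python
import re
from datetime import datetime, timezone

# Matches the node's own timestamp and the block number of an import line.
IMPORT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?Imported #(\d+)"
)


def parse_imports(log_text, first_block, last_block):
    """Return (block_number, unix_timestamp) for each import line in range."""
    imports = []
    for line in log_text.splitlines():
        match = IMPORT_RE.search(line)
        if match is None:
            continue  # not an "Imported" line
        stamp, number = match.group(1), int(match.group(2))
        if first_block <= number <= last_block:
            # Assumption: the log clock is UTC.
            moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
            moment = moment.replace(tzinfo=timezone.utc)
            imports.append((number, int(moment.timestamp())))
    return imports


SAMPLE_LOG = """\
Oct 24 13:27:55 kusama polkadot[984360]: 2024-10-24 13:27:55 🏆 Imported #25479552 (0xd4ff…810a → 0xc26c…49a2)
Oct 24 14:27:48 kusama polkadot[984360]: 2024-10-24 14:27:48 🏆 Imported #25480137 (0x3dc1…7c2c → 0xef08…c78d)
"""

for number, ts in parse_imports(SAMPLE_LOG, 25479552, 25480137):
    print(number, ts)
```

A block number that is imported more than once simply appears several times in the resulting list, each time with its own import timestamp.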
This gave me a total of 728 imported blocks, each with its import timestamp. First, I checked whether there had been any connectivity issue during that period, the 24th of October.
There had been a slight increase in traffic since Tuesday, but the monitoring showed no errors or connection issues.
Next, I looked online to see whether other people had run into missed votes and found this GitHub issue: https://github.com/paritytech/polkadot-sdk/issues/3613. That issue was much more serious, as most votes were missed, and it was caused by blocks being imported too late. It did not match my situation, but it still gave me an idea.
I called the Kusama Subscan API for each of my 728 blocks to check when it was produced, using a bash script.
It turns out that every block in that session/epoch was imported at the exact second it was produced, or within one second of it: I had no delay! Everything was perfectly fine.
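The comparison step itself can be sketched in Python as follows. This is not the bash script mentioned above, only a minimal illustration of the same idea. The `produced_at` dictionary stands in for the production timestamps returned by the Subscan API, and the function name and the one-second `tolerance_s` are assumptions of mine.

```python
def compare_imports(imports, produced_at, tolerance_s=1):
    """Build one report line per imported block.

    imports:     list of (block_number, import_unix_ts) taken from the node log.
    produced_at: dict of block_number -> production unix ts
                 (stand-in for what an explorer API would return).
    tolerance_s: hypothetical threshold; a delay up to this many seconds
                 still counts as "on time".
    """
    report = []
    for number, imported_ts in sorted(imports, reverse=True):
        produced_ts = produced_at.get(number)
        if produced_ts is None:
            report.append(f"Block {number}: no production timestamp available")
            continue
        delay = imported_ts - produced_ts
        if delay <= tolerance_s:
            status = "Imported on time"
        else:
            status = f"Imported late by {delay}s"
        report.append(
            f"Block {number}: {status} (List: {imported_ts}, API: {produced_ts})"
        )
    return report


# The first and last blocks of the session, with the timestamps from this post.
imports = [(25479552, 1729776475), (25480137, 1729780068)]
produced_at = {25479552: 1729776474, 25480137: 1729780068}

for line in compare_imports(imports, produced_at):
    print(line)
```

Blocks are reported from newest to oldest, and a block whose import timestamp exceeds its production timestamp by more than the tolerance would be flagged as late instead.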
Block 25480137: Imported on time (List: 1729780068, API: 1729780068)
Block 25480136: Imported on time (List: 1729780062, API: 1729780062)
Block 25480135: Imported on time (List: 1729780056, API: 1729780056)
Block 25480134: Imported on time (List: 1729780050, API: 1729780050)
Block 25480133: Imported on time (List: 1729780044, API: 1729780044)
[...]
Block 25479554: Imported on time (List: 1729776486, API: 1729776486)
Block 25479554: Imported on time (List: 1729776486, API: 1729776486)
Block 25479554: Imported on time (List: 1729776486, API: 1729776486)
Block 25479553: Imported on time (List: 1729776480, API: 1729776480)
Block 25479552: Imported on time (List: 1729776475, API: 1729776474)
So, what happened?
Well, it seems that missing a single vote like this is actually quite common. I found many A/A+ rated validators on https://apps.turboflakes.io that had missed one or a few votes.
I also checked my validator's history on One-T and realized this was my first and only missed vote (the black line). My missed vote ratio is extremely low (0.0003).
Even better, I realized that in the session where I missed my vote, my validation group and all para-authorities had a higher missed vote ratio! Something caused multiple validators to miss votes in that session.
Funnily enough, there were several sessions where validators missed a lot of votes, like 43015, where my validator group had a missed vote ratio of 0.2 (very high) while I had zero missed votes.
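To put those ratios in perspective, here is a small Python sketch. It assumes the ratio is simply the number of missed votes divided by all expected votes (missed plus successful). One-T may compute it with more detail, and the vote counts below are hypothetical, chosen only to reproduce the two ratios quoted above.

```python
def missed_vote_ratio(missed, successful):
    """Missed votes divided by all expected votes (assumed definition)."""
    total = missed + successful
    if total == 0:
        raise ValueError("no votes recorded")
    return missed / total


# Hypothetical counts: one missed vote among a few thousand votes.
print(round(missed_vote_ratio(1, 3332), 4))

# Hypothetical counts: a group missing 2 votes out of every 10.
print(missed_vote_ratio(2, 8))
```

Under this assumed definition, a ratio of 0.0003 corresponds to roughly one missed vote in 3,300, while 0.2 means one vote in five was missed.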
Conclusion
More fear than harm: I realized that despite this single missed vote, my validator has been performing excellently. Isolated missed votes happen to everyone, and my ratio is extremely low compared to others!
By doing my research, I even found reports of One-T mentioning my validator among the top 16 best validators, multiple times!
This concluded my investigation: everything is working perfectly and there was nothing to worry about!
As you can see, I am very serious about the performance of my validator. I will always keep an eye on it and make sure it runs at its best with a low commission fee. If you want to nominate a trusted and transparent validator, consider nominating me!
Kusama: Dfw2gjWfm19j3Q9ewn9PJiDyCdFsBs9QQ2ZiRAZ3k2QNVar
Polkadot: 126cWhehuBFhQvbDqt26dWBNgELfkpc72WvJV3sx82qRogkT