We at DaiLambda are working on improving the storage layer of the Tezos blockchain, called the context. The context stores versions of the blockchain state, including balances and smart contracts.
Objective
We want to benchmark Tezos context reconstruction over recent blocks, say 10000 of them.
- Only recent blocks, not from the genesis.
- Blocks must be preloaded to exclude the network costs.
Currently we have 2 ways to replay blocks: reconstruct and replay.
tezos-node reconstruct commits to the context, but always starts from the genesis
tezos-node reconstruct reconstructs the contexts from the genesis. This takes too long for a benchmark: several days or even a week. We also do not want to benchmark the context reconstruction of old cemented blocks, since a running node builds contexts only from floating blocks, never from cemented ones.
tezos-node replay can replay blocks from a recent block, but does not commit contexts
The tezos-node replay command replays the specified block levels, but it NEVER commits contexts: new versions of the contexts are built in memory, then their hashes are compared with those of the contexts already imported on disk. For a precise benchmark, we want to commit the newly created contexts rather than just check the context hashes, since disk I/O is always a major performance factor of the node.
Solution: replay with reconstruction
We have developed a hybrid of these 2 methods, replay+reconstruct. Using it, we can replay recent Tezos blocks and then commit their context updates to disk. From the viewpoint of the context storage layer, a benchmark using this replay with reconstruction is very similar to what an actual Tezos node performs.
Suppose that we want to replay blocks with the context reconstruction between levels $level1 and $level2.
The idea is:
- store/ carries the blocks between $level1 and $level2.
- context/ carries only the context at $level1.
- Let the tezos-node replay command apply the blocks between $level1 and $level2 consecutively and commit their contexts to the context/ directory.
Prepare tezos-node
Your node must have the tezos-node replay command.
$ ./tezos-node --help
...
replay
Replay a set of previously validated blocks
...
Prepare a full node
Prepare a full node and let $srcdir be the directory of the full node.
Its storage version must be 0.0.5 or newer:
$ cat $srcdir/version.json
{ "version": "0.0.5" }
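The version check can be scripted. The following is a minimal sketch, assuming version.json has the one-line shape shown above and that `sort -V` (a GNU coreutils extension) is available. The function names `storage_version` and `storage_is_recent` are hypothetical helpers, not part of tezos-node.

```shell
# storage_version DIR: print the version string stored in DIR/version.json.
# Naive extraction: assumes the field looks like "version": "x.y.z" on one line.
storage_version() {
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1/version.json" | head -n 1
}

# storage_is_recent DIR: succeed if the storage version is 0.0.5 or newer.
storage_is_recent() {
    v=$(storage_version "$1")
    [ -n "$v" ] || return 1
    # sort -V orders version strings; 0.0.5 must be the smaller of the two.
    lowest=$(printf '%s\n%s\n' "0.0.5" "$v" | sort -V | head -n 1)
    [ "$lowest" = "0.0.5" ]
}

# Usage (not executed here):
#   storage_is_recent "$srcdir" || echo "storage upgrade needed"
```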
Upgrading storage
If the storage version is 0.0.4, you have to upgrade the data directory by:
$ ./tezos-node upgrade storage --data-dir $srcdir
NOTE: Recent master of tezos-node has a bug around --data-dir. You MUST make sure $srcdir/config.json exists and its data-dir field points to $srcdir.
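A quick way to verify the data-dir field is sketched below. It is only a naive textual check: it assumes the field sits on one line as "data-dir": "..." and compares it to the directory by plain string equality, so a relative path or a trailing slash would be reported as a mismatch. The function name `check_data_dir_field` is hypothetical.

```shell
# check_data_dir_field DIR: succeed if DIR/config.json exists and its
# "data-dir" field is textually equal to DIR.
check_data_dir_field() {
    dir=$1
    config="$dir/config.json"
    [ -f "$config" ] || { echo "missing $config" >&2; return 1; }
    # Pull out the first "data-dir" value found in the file.
    value=$(sed -n 's/.*"data-dir"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$config" | head -n 1)
    [ "$value" = "$dir" ]
}

# Usage (not executed here):
#   check_data_dir_field "$srcdir" || echo "fix data-dir in $srcdir/config.json"
```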
Check the data
You must first check that $level1 and $level2 are available in the node:
$ ./tezos-node replay --data-dir $srcdir $(($level1 + 1)) $level2
Snapshot of $level1
Make a snapshot of level $level1:
$ ./tezos-node snapshot export --data-dir $srcdir --block $level1 tezos-mainnet-snapshot-full.$level1
NOTE: The snapshot MUST be taken with the new store. Currently, snapshots of v9.1, called “legacy snapshots”, are NOT properly imported by the latest tezos-node.
Prepare the starting context
Now import the snapshot into a new directory $importdir:
$ mkdir $importdir
$ ./tezos-node snapshot import --data-dir $importdir tezos-mainnet-snapshot-full.$level1
Prepare the replay directory
Make another directory $replaydir for the replay:
$ mkdir $replaydir
Copy the store and JSON files of $srcdir:
$ cp -a $srcdir/store $replaydir/
$ cp $srcdir/*.json $replaydir/
Copy the context of $importdir:
$ cp -a $importdir/context $replaydir/
Reinitialize the directory:
$ ./tezos-node config reset --data-dir $replaydir
Now $replaydir/ is ready for reconstruction from $level1:
- store/: blocks enough to reconstruct between $level1 and $level2.
- context/: the context of $level1, the starting point of the reconstruction.
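Before starting a long benchmark, it may be worth confirming that the replay directory has the expected layout. The sketch below only checks that store/, context/ and version.json exist; it does not inspect their contents. The function name `check_replaydir` is hypothetical.

```shell
# check_replaydir DIR: succeed if DIR contains store/, context/ and version.json.
check_replaydir() {
    dir=$1
    for sub in store context; do
        [ -d "$dir/$sub" ] || { echo "missing directory $dir/$sub" >&2; return 1; }
    done
    [ -f "$dir/version.json" ] || { echo "missing $dir/version.json" >&2; return 1; }
}

# Usage (not executed here):
#   check_replaydir "$replaydir" && echo "ready for replay"
```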
Reconstruction by replay
Now the following should work:
$ ./tezos-node replay --data-dir $replaydir $(($level1 + 1)) $(($level1 + 2)) ...
tezos-node replay cannot take a range of block levels. You need to specify the levels one by one.
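Since the levels have to be listed one by one, the list can be generated rather than typed. This is a minimal sketch using `seq`; the function name `replay_levels` is a hypothetical helper.

```shell
# replay_levels LEVEL1 LEVEL2: print the levels LEVEL1+1 .. LEVEL2,
# separated by spaces, on a single line.
replay_levels() {
    # Unquoted command substitution joins the seq output with spaces.
    echo $(seq "$(($1 + 1))" "$2")
}

# Usage (not executed here):
#   ./tezos-node replay --data-dir "$replaydir" $(replay_levels "$level1" "$level2")
```

The command substitution in the usage line is left unquoted on purpose, so that each level is passed as a separate argument.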
When you rerun the reconstruction, you have to reset the context/ directory:
$ rm -rf $replaydir/context
$ cp -a $importdir/context $replaydir
$ ./tezos-node replay --data-dir $replaydir lev1 lev2 ...
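For repeated benchmark runs, the reset can be wrapped in a small function. This is a sketch of the two commands above with some guards added; the function name `reset_context` is hypothetical.

```shell
# reset_context IMPORTDIR REPLAYDIR: replace REPLAYDIR/context with a fresh
# copy of IMPORTDIR/context.
reset_context() {
    src=$1
    dst=$2
    # Refuse to run with empty arguments, so rm -rf never sees "/context".
    [ -n "$src" ] && [ -n "$dst" ] || { echo "usage: reset_context IMPORTDIR REPLAYDIR" >&2; return 1; }
    [ -d "$src/context" ] || { echo "no context in $src" >&2; return 1; }
    rm -rf "$dst/context"
    cp -a "$src/context" "$dst/"
}

# Usage (not executed here):
#   reset_context "$importdir" "$replaydir"
```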
Testing a modified node using historic data
If a modification to a node implementation changes context hashes, pure replay+reconstruct fails because block application may produce context hashes different from the expected ones.
To make replay+reconstruct work even in this situation, tezos-node replay has a new option --ignore-context-hash-mismatch in https://gitlab.com/dailambda/tezos/-/tree/jun@replay-reconstruct . With this option, context hash mismatches are ignored: if a block $B_i$ with an expected context hash $H_i$ produces a context with a different hash $H'_i$, the mismatch does not stop the validation. The replay continues and commits the resulting context under $H'_i$, remembering the pair $(H_i, H'_i)$ in memory. At the replay of the next block $B_{i+1}$, the node requires the context of its predecessor $B_i$. The context hash recorded in $B_i$ is $H_i$, which is NOT found in the context DB. Instead, we check out the context of $H'_i$ paired with $H_i$.
Thanks to this --ignore-context-hash-mismatch option, we can quickly test and benchmark node modifications over historic block data, even if they produce different context hashes.
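The remapping of expected hashes to actual hashes can be pictured with a toy model. The sketch below is only an illustration in shell, not the node's implementation (which is part of the OCaml code base): it keeps the remembered pairs in a temporary file and resolves an expected hash to the hash that should actually be checked out. The names `remember_mismatch` and `resolve_context_hash` are hypothetical.

```shell
# Toy model of the pairs (H_i, H'_i) remembered during a replay.
remap_file=$(mktemp)

# remember_mismatch EXPECTED ACTUAL: record that the block expected EXPECTED
# but the committed context has hash ACTUAL.
remember_mismatch() {
    printf '%s %s\n' "$1" "$2" >> "$remap_file"
}

# resolve_context_hash EXPECTED: print the hash to check out. If a mismatch
# was recorded for EXPECTED, print the paired actual hash; otherwise print
# EXPECTED unchanged.
resolve_context_hash() {
    actual=$(awk -v h="$1" '$1 == h { print $2; exit }' "$remap_file")
    if [ -n "$actual" ]; then
        printf '%s\n' "$actual"
    else
        printf '%s\n' "$1"
    fi
}
```

In this model, replaying $B_{i+1}$ would call `resolve_context_hash` on $H_i$ and check out whatever it prints.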
Future work
- It takes a very long time to export and import a snapshot for $level1. What we need here is just one context; no store needs to be exported. It would be nice to have a tool that copies just one version of the context.
- Benchmarking fails if the shell or a protocol has modifications that affect context hashes. Now we have the --ignore-context-hash-mismatch option.
- Better UI.