Geth fast sync

How long does it take to do geth sync?

Syncing an Ethereum node is a pain point for many people. Everyone working with Ethereum is bound to run into it.

The current default mode of sync for Geth is called fast sync. Instead of starting from the genesis block and reprocessing all the transactions that ever occurred (which could take weeks), fast sync downloads the blocks, and only verifies the associated proof-of-works.

Downloading all the blocks is a straightforward and fast procedure, and it reassembles the entire chain relatively quickly.

Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case: since no transaction was executed, the node does not yet have any account state (balances, nonces, smart contract code and data).

These need to be downloaded separately and cross-checked with the latest blocks. This phase is called the state trie download. It actually runs concurrently with the block downloads; alas, it takes a lot longer nowadays than downloading the blocks.

The accounts on their own are not enough: they need to be cryptographically linked to each block so that nodes can verify that the account data has not been tampered with. This cryptographic linking is done by creating a tree data structure above the accounts, each level aggregating the layer below it into an ever smaller layer, until you reach the single root.

This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie. Ok, so why does this pose a problem? This trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes).
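The layer-by-layer aggregation described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it builds a plain 16-way hash tree over a list of made-up account records and uses SHA-256 from the standard library, whereas Ethereum uses Keccak-256 and a Merkle Patricia trie keyed by account address. The names `merkle_root` and `BRANCHING` are hypothetical.

```python
import hashlib

BRANCHING = 16  # assumption: mirrors the up-to-16 references per trie node


def h(data):
    # SHA-256 is a stand-in here; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()


def merkle_root(accounts):
    """Aggregate account records layer by layer until one root hash remains."""
    if not accounts:
        return h(b"")
    layer = [h(a) for a in accounts]
    while len(layer) > 1:
        # Each new layer hashes groups of up to BRANCHING hashes from the
        # layer below, so every layer is smaller than the previous one.
        layer = [
            h(b"".join(layer[i:i + BRANCHING]))
            for i in range(0, len(layer), BRANCHING)
        ]
    return layer[0]


# Made-up account records, for illustration only.
accounts = [f"account-{i}:balance={i * 10}".encode() for i in range(100)]
root = merkle_root(accounts)

# Changing a single account changes the root, so tampering is detectable.
tampered = list(accounts)
tampered[42] = b"account-42:balance=999999"
assert merkle_root(tampered) != root
```

The point of the sketch is the last assertion: a block only needs to carry the single root hash, yet any change to any account underneath it produces a different root.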

To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs, to verify that no one in the network is trying to cheat you. This by itself is already a crazy number of data items.

The part where it gets even messier is that this data is constantly morphing: at every block (15s), nodes are deleted from this trie and new ones are added.

This means your node needs to synchronize a dataset that is changing many times per second. But until you actually do gather all the data, your local node is not usable, since it cannot cryptographically prove anything about any accounts.

At that point you are just done with the block download phase and still running the state downloads. You can see this yourself via the seemingly endless "Imported state entries" log messages. You need to wait that out too before your node truly comes online.
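Besides watching the logs, the same two phases can be observed through the standard `eth_syncing` JSON-RPC method. The sketch below is a hedged example, not an official tool: it assumes a local node with HTTP RPC enabled on the default port 8545, and it treats the `pulledStates` and `knownStates` fields as optional, since they are Geth-specific and may be absent depending on version and sync mode. The helper names and the sample response are hypothetical.

```python
import json
import urllib.request

RPC_URL = "http://127.0.0.1:8545"  # assumption: local node with HTTP RPC enabled


def fetch_sync_status(url=RPC_URL):
    """Call the eth_syncing JSON-RPC method and return its result."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_syncing", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["result"]


def describe_sync(result):
    """Turn an eth_syncing result into a one-line summary."""
    if result is False:
        return "not syncing (either fully synced or not started)"
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    summary = f"blocks {current}/{highest}"
    # pulledStates/knownStates are Geth-specific and may be missing.
    if "pulledStates" in result and "knownStates" in result:
        pulled = int(result["pulledStates"], 16)
        known = int(result["knownStates"], 16)
        summary += f", state entries {pulled}/{known} known so far"
    return summary


# Hypothetical response, for illustration only (values are hex strings).
sample = {
    "currentBlock": "0x3e8",
    "highestBlock": "0x428",
    "pulledStates": "0x2710",
    "knownStates": "0x4e20",
}
print(describe_sync(sample))
```

Against a running node you would call `describe_sync(fetch_sync_status())`. Note that the known state count is not a fixed total: it grows as the node discovers more of the trie, so the ratio is not a reliable progress percentage.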

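The node also cannot tell you how much state is left to download, because it does not know the size of the trie in advance. The reason is that a block in Ethereum only contains the state root, a single hash of the root node. When the node begins synchronizing, it knows about exactly 1 node and tries to download it. That node can reference up to 16 further nodes, each of which can reference more, so the full size is only known once everything has been downloaded.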
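To make the discovery process concrete, here is a toy simulation. It is a sketch under loud assumptions: the "network" is just an in-memory dictionary, every inner node has exactly 16 children (real trie nodes reference up to 16), and the functions `build_toy_trie` and `sync_state` are hypothetical names, not part of Geth.

```python
from collections import deque


def build_toy_trie(depth, fanout):
    """Build a toy trie as {node_id: [child_ids]}; returns (root_id, nodes)."""
    nodes = {}
    counter = 0

    def make(level):
        nonlocal counter
        node_id = f"node-{counter}"
        counter += 1
        if level == depth:
            nodes[node_id] = []  # leaf: references nothing further
        else:
            nodes[node_id] = [make(level + 1) for _ in range(fanout)]
        return node_id

    root = make(0)
    return root, nodes


def sync_state(root_id, fetch):
    """Download every node reachable from the root, discovering as we go."""
    known = deque([root_id])  # discovered but not yet downloaded
    downloaded = 0
    history = []  # (downloaded, known-but-missing) after each request
    while known:
        node_id = known.popleft()
        children = fetch(node_id)  # one simulated network request
        downloaded += 1
        known.extend(children)  # only now do we learn these nodes exist
        history.append((downloaded, len(known)))
    return downloaded, history


root, nodes = build_toy_trie(depth=3, fanout=16)
total, history = sync_state(root, nodes.__getitem__)
print(history[0])  # after the first download: (1, 16)
print(total)       # 1 + 16 + 256 + 4096 = 4369
```

The syncing side starts with a single hash and only learns the total (4369 nodes in this toy case) once the queue of known-but-missing nodes finally drains, which is why a real node cannot show a meaningful percentage during state download.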

A: As explained above, you are not stuck; you have just finished the block download phase and are waiting for the state download phase to complete. This latter phase nowadays takes a lot longer than just getting the blocks.

Q: Why does downloading the state take so long? I have good bandwidth.

A: State sync is mostly limited by disk IO, not bandwidth.

The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes.

This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way. The end result is that even a fast sync nowadays incurs a huge disk IO cost, which is too much for a mechanical hard drive.

Q: So can I run a full node on a mechanical hard drive?

A: Unfortunately not. You should, however, be able to run a light client with minimal impact on system resources. If you wish to run a full node, an SSD is your only option.

Thanks to karalabe for the explanation. Shoot your questions in the comments and I will answer them here or in another article.

Here is the second part of the Series:.

