
I built my own node today; it has not been able to catch up with the latest height #2746

Open
COLUD4 opened this issue Nov 17, 2022 · 29 comments

Comments

@COLUD4

COLUD4 commented Nov 17, 2022

version: 1.6.5.1
Ubuntu 20.04

@sing1ee

sing1ee commented Nov 17, 2022

same here

always out of sync

@macrocan

+1
Still syncing slowly even after upgrading CPU, memory, and disk.

@hot-westeros

same here

always out of sync

@sing1ee

sing1ee commented Nov 17, 2022

+1 Still syncing slowly even after upgrading CPU, memory, and disk.

It doesn't matter what hardware you have; it's just a network issue.

@dandavid3000

Same issue. CPU and RAM usage peak because of this too.

@kaelabbott

Same issue, even after pulling the new version and running repair-state. Syncing is still slow.

@sing1ee

sing1ee commented Nov 17, 2022

Try this.

If you are using leveldb:

exchaind start \
  --pruning everything \
  --chain-id exchain-66 \
  --mempool.sort_tx_by_gp \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --mempool.recheck=0 \
  --mempool.force_recheck_gap=2000 \
  --disable-abci-query-mutex=1 \
  --mempool.size=200000 \
  --mempool.max_gas_used_per_block=120000000 \
  --fast-query=1 \
  --enable-bloom-filter=1 \
  --home /data_folder/

If you are using rocksdb:

exchaind start \
  --pruning everything \
  --chain-id exchain-66 \
  --db_backend rocksdb \
  --mempool.sort_tx_by_gp \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --mempool.recheck=0 \
  --mempool.force_recheck_gap=2000 \
  --disable-abci-query-mutex=1 \
  --mempool.size=200000 \
  --mempool.max_gas_used_per_block=120000000 \
  --fast-query=1 \
  --enable-bloom-filter=1 \
  --home /data_folder/

Switching to rocksdb is recommended, as leveldb will no longer be maintained.
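
If you are not sure which backend a node is currently using, one quick check is the db_backend setting in its config file. This is a minimal sketch, assuming the usual Tendermint-style layout under the --home directory; /data_folder/ is just the example path from the commands above:

# Sketch: show which database backend the node is configured with.
# Adjust the home path to match your own setup.
grep -n "db_backend" /data_folder/config/config.toml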

@kaelabbott

@sing1ee thanks for the response. I tried the commands above and my node is still syncing slowly. I'm using rocksdb and version 1.6.5.1 as well.

@cwbhhjl
Contributor

cwbhhjl commented Nov 17, 2022

Notice about OKC Network.

Today, OKC has onboarded XEN successfully. However, due to our low gas fees and XEN's popularity, there were many users who attempted to mint XEN through scripts. These scripts consume a huge amount of gas in a single transaction which consequently filled the Mempool, resulting in temporary congestion.

We have since increased the capacity of our RPC Mempool and the gas limit of each block to alleviate the congestion.

To prevent similar incidents, we will launch a proposal for the community to vote on the option to limit excessive consumption of resources by single transactions. Updates will follow soon! Stay tuned

https://t.me/XENCryptoTalk/367273

In my personal opinion, this was caused by a wrong gas policy: the block gas limit was even increased, and the gas price could not be adjusted correctly. https://www.oklink.com/en/okc/block/15416037

The execution pressure on blocks during this period is too high, causing most nodes to take a long time to sync the state.

There are some ways to help you sync faster:

  1. upgrade the machine
  2. use rocksdb
  3. turn on the asynchronous commit option --iavl-enable-async-commit=true --iavl-cache-size=10000000

But these may not help much. Just let the node sync, even if it is very slow.
You can also wait for the official release of a new data snapshot to skip synchronizing today's block data (an example command combining points 2 and 3 is sketched below).
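
For reference, a minimal sketch of a start command that applies points 2 and 3 above (rocksdb plus asynchronous IAVL commit); the flags and chain id are the ones already shown in this thread, and the home path is a placeholder:

# Sketch only: combine the rocksdb backend with asynchronous IAVL commits.
exchaind start \
  --chain-id exchain-66 \
  --db_backend rocksdb \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --home /data_folder/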

@jackie2022tec

Same issue since 17/11. Who can help?

@sing1ee

sing1ee commented Nov 18, 2022

Notice about OKC Network.
Today, OKC has onboarded XEN successfully. However, due to our low gas fees and XEN's popularity, there were many users who attempted to mint XEN through scripts. These scripts consume a huge amount of gas in a single transaction which consequently filled the Mempool, resulting in temporary congestion.
We have since increased the capacity of our RPC Mempool and the gas limit of each block to alleviate the congestion.
To prevent similar incidents, we will launch a proposal for the community to vote on the option to limit excessive consumption of resources by single transactions. Updates will follow soon! Stay tuned

https://t.me/XENCryptoTalk/367273

In my personal opinion, this was caused by a wrong gas policy: the block gas limit was even increased, and the gas price could not be adjusted correctly. https://www.oklink.com/en/okc/block/15416037

The execution pressure on blocks during this period is too high, causing most nodes to take a long time to sync the state.

There are some ways to help you sync faster:

  1. upgrade the machine
  2. use rocksdb
  3. turn on the asynchronous commit option --iavl-enable-async-commit=true --iavl-cache-size=10000000

But these may not help much. Just let the node sync, even if it is very slow. You can also wait for the official release of a new data snapshot to skip synchronizing today's block data.

I have tried, but it seems to be getting slower; only 30 blocks are synchronized in 10 minutes.

@cwbhhjl
Contributor

cwbhhjl commented Nov 18, 2022

@sing1ee The gas limit of the block is now 120 million, which I think is still too high. There is a lot of 'scripted' XEN minting with a very low gas price in each block.

@0xChupaCabra

same problem here, using rocksdb and the iavl flags mentioned in this discussion

@giskook
Contributor

giskook commented Dec 1, 2022

same problem here, using rocksdb and the iavl flags mentioned in this discussion

Hi @stepollo2,
Can you share your start command, version, and logs?
We suggest using v1.6.5.9.

If you run the node as RPC, please start exchaind with:

exchaind start --home $your_home

If you run the node as a validator, please start exchaind with:

exchaind start --node-mode=val --home $your_home

We benefited from improving our disk's IOPS (16000) and throughput (1000M). Hope it helps.
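
If you want to compare your own disk against those numbers, a rough check with fio is sketched below. The tool choice, target path, and job parameters are assumptions, not an official OKC procedure; point --filename at a file on the data disk:

# Sketch: measure random-read IOPS and bandwidth on the data disk with fio.
fio --name=randread --filename=/data_folder/fio-test --size=4G \
    --rw=randread --bs=4k --ioengine=libaio --iodepth=64 --numjobs=4 \
    --runtime=30 --time_based --group_reporting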

@0xChupaCabra

Here

same problem here, using rocksdb and the iavl flags mentioned in this discussion

Hi @stepollo2, Can you share your start command, version, and logs? We suggest using v1.6.5.9.

If you run the node as RPC, please start exchaind with:

exchaind start --home $your_home

If you run the node as a validator, please start exchaind with:

exchaind start --node-mode=val --home $your_home

We benefited from improving our disk's IOPS (16000) and throughput (1000M). Hope it helps.

Here are the details requested:

exchaind version
v1.6.5.8

cat /etc/systemd/system/exchain.service
[Unit]
Description=OKX service
After=network.target
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
RestartSec=1
User=XXXX
ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

[Install]
WantedBy=multi-user.target

Snippet from the logs:

Dec 01 17:47:35 ovh-1 exchaind[1710424]: I[2022-12-01|17:47:35.365][1710424] Height<15425676>, Tx<6>, BlockSize<4854>, GasUsed<94425787>, InvalidTxs<0>, lastRun<11275ms>, RunTx<ApplyBlock<13096ms>, abci<11275ms>, persist<1819ms>>, MempoolTxs<0>, Workload<1.00|1.00|1.00|1.00>, MempoolTxs[0], Iavl[getnode<340987>, rdb<30504>, rdbTs<52429ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<11255ms>, refund<0ms>]. module=main
Dec 01 17:47:48 ovh-1 exchaind[1710424]: I[2022-12-01|17:47:48.792][1710424] Height<15425677>, Tx<6>, BlockSize<16069>, GasUsed<78695088>, InvalidTxs<0>, lastRun<11914ms>, RunTx<ApplyBlock<13426ms>, abci<11914ms>, persist<1510ms>>, MempoolTxs<0>, Workload<1.00|1.00|1.00|1.00>, MempoolTxs[0], Iavl[getnode<278986>, rdb<24139>, rdbTs<44188ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<11827ms>, refund<0ms>]. module=main
Dec 01 17:48:08 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:08.769][1710424] Height<15425678>, Tx<7>, BlockSize<2973>, GasUsed<113176930>, InvalidTxs<0>, lastRun<17851ms>, RunTx<ApplyBlock<19975ms>, abci<17851ms>, persist<2122ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<402953>, rdb<36778>, rdbTs<72608ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<17824ms>, refund<0ms>]. module=main
Dec 01 17:48:24 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:24.074][1710424] Height<15425679>, Tx<6>, BlockSize<3061>, GasUsed<90351907>, InvalidTxs<0>, lastRun<13627ms>, RunTx<ApplyBlock<15303ms>, abci<13627ms>, persist<1674ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<326127>, rdb<29052>, rdbTs<51812ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<13598ms>, refund<0ms>]. module=main
Dec 01 17:48:42 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:42.926][1710424] Height<15425680>, Tx<8>, BlockSize<2998>, GasUsed<113197930>, InvalidTxs<0>, lastRun<16965ms>, RunTx<ApplyBlock<18851ms>, abci<16966ms>, persist<1884ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<403674>, rdb<36418>, rdbTs<63500ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<16927ms>, refund<0ms>]. module=main
Dec 01 17:48:57 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:57.021][1710424] Height<15425681>, Tx<7>, BlockSize<15861>, GasUsed<101354033>, InvalidTxs<0>, lastRun<12198ms>, RunTx<ApplyBlock<14093ms>, abci<12198ms>, persist<1893ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<356042>, rdb<31227>, rdbTs<57318ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<12152ms>, refund<0ms>]. module=main
Dec 01 17:49:11 ovh-1 exchaind[1710424]: I[2022-12-01|17:49:11.397][1710424] Height<15425682>, Tx<7>, BlockSize<3813>, GasUsed<141031927>, InvalidTxs<0>, lastRun<12808ms>, RunTx<ApplyBlock<14375ms>, abci<12809ms>, persist<1564ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<492964>, rdb<45232>, rdbTs<48284ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<12776ms>, refund<1ms>]. module=main

@giskook
Contributor

giskook commented Dec 2, 2022

Hi @stepollo2 ,

exchaind version
v1.6.5.8

Version is OK

ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

Does your machine have a memory problem? I saw you set --rocksdb.opts max_open_files=100; if your machine has enough memory you can increase the value, or just remove this flag. If not, keep the flag.

Workload<1.00|1.00|1.00|1.00>

The workload is pretty heavy.

BTW, what is your machine's configuration,
and what are the disk's IOPS and throughput?

@0xChupaCabra

0xChupaCabra commented Dec 3, 2022

Hi @stepollo2 ,

exchaind version
v1.6.5.8

Version is OK

ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

Does your machine have a memory problem? I saw you set --rocksdb.opts max_open_files=100; if your machine has enough memory you can increase the value, or just remove this flag. If not, keep the flag.

Workload<1.00|1.00|1.00|1.00>

The workload is pretty heavy.

BTW, what is your machine's configuration, and what are the disk's IOPS and throughput?


I updated the binaries to the latest release anyway. exchaind currently runs on sdc.

xxx@ovh-1:~$ iostat
Linux 5.15.0-52-generic (ovh-1)         12/03/22        _x86_64_        (40 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.99    0.00    4.42    2.08    0.00   72.51

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
loop0             0.00         0.00         0.00         0.00       5325          0          0
loop1             0.00         0.00         0.00         0.00       3759          0          0
loop2             0.00         0.00         0.00         0.00       1123          0          0
loop3             0.00         0.02         0.00         0.00      59533          0          0
loop4             0.00         0.00         0.00         0.00       5234          0          0
loop5             0.00         0.00         0.00         0.00       1143          0          0
loop6             0.00         0.00         0.00         0.00         28          0          0
md2               0.00         0.01         0.14         0.33      17333     383844     881164
md3             317.50     17259.33       925.71       527.47 46763173981 2508163852 1429154104
sda             186.64      8679.09       929.73       528.57 23515499410 2519037601 1432125892
sdb             190.49      8926.61       929.99       527.80 24186140809 2519756111 1430035268
sdc             320.91      6305.24     14369.75     18280.88 17083678957 38934005708 49531003968
sdd             528.96      7125.34     31270.58     24830.51 19305695953 84725839552 67276842052
sde             503.94      6517.02     16490.06     23203.49 17657495117 44678861056 62868519456
sdf             369.31      4330.12     43404.36     23085.22 11732223141 117601608456 62548089384
sdg             396.14      5029.14     12995.20     16875.20 13626171025 35209747336 45722384544
sdh            1391.41     74930.11     84132.24     19543.80 203018824977 227951480988 52952797804


xxx@ovh-1:~$ free -g
               total        used        free      shared  buff/cache   available
Mem:             754         353           8           0         391         395
Swap:              0           0           0

Disks are 14 TB SSDs. I can also test on another machine with 7 TB NVMe drives.

@dandavid3000

dandavid3000 commented Dec 3, 2022

Same issue. CPU and RAM usage peak because of this too.

Since the gas incident happened, I've never been able to run the node again.
I tried multiple times with the latest snapshots.
The sync is pretty slow, and after half a day RAM usage can reach 40-50 GB. Linux kills the node multiple times.

I'm pretty sure that nothing is wrong with the hardware here: a 2 TB NVMe and 64 GB RAM machine.

@giskook
Contributor

giskook commented Dec 4, 2022

Hi @stepollo2,
It seems your node runs on your own machine instead of a cloud service provider.
I suggest dropping the flag --rocksdb.opts max_open_files=100.

Do you run the node as RPC or validator?
If you want to run as a validator, please set the flag --node-mode=val in the start command.

@giskook
Contributor

giskook commented Dec 4, 2022

Hi @dandavid3000 ,

I tried multiple times with the latest snapshots

Could you please provide exchaind's version information?

The sync is pretty slow, and after half a day RAM usage can reach 40-50 GB. Linux kills the node multiple times.

Could you provide exchaind's start command? Do you run the node as RPC, validator, or archive node?

I'm pretty sure that nothing is wrong with the hardware here: a 2 TB NVMe and 64 GB RAM machine.

This machine seems good enough. Does your node run on a cloud service provider? If so, please check the disk's IOPS and throughput.

@0xChupaCabra

Hi @stepollo2, It seems your node runs on your own machine instead of a cloud service provider. I suggest dropping the flag --rocksdb.opts max_open_files=100.

Do you run the node as RPC or validator? If you want to run as a validator, please set the flag --node-mode=val in the start command.

I run the node for RPC only

@giskook
Contributor

giskook commented Dec 4, 2022

Hi @stepollo2 ,

I run the node for RPC only

Let's drop the flag --rocksdb.opts max_open_files=100

@0xChupaCabra

Hi @stepollo2 ,

I run the node for RPC only

Let's drop the flag --rocksdb.opts max_open_files=100

Dropped already but is not helping much :/

@giskook
Contributor

giskook commented Dec 4, 2022

Hi @stepollo2 ,

Dropped already but is not helping much :/

  1. Could you send me the whole log? If the log is too large, you can send it to [email protected]

  2. And it seems your exchaind is located at /usr/local/bin/exchaind.
    Could you run the command /usr/local/bin/exchaind version --long | grep -E "version|commit" to confirm the version?

@cwbhhjl
Contributor

cwbhhjl commented Dec 4, 2022

@stepollo2 I recommend using the latest snapshot for syncing.

https://static.okex.org/cdn/oec/snapshot/index.html

The height you are currently synchronizing is where the blocks were most congested; each block consumes hundreds of millions of gas, so synchronization will definitely be slow.
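
A minimal sketch of fetching and unpacking a snapshot from that index follows; the URL is a placeholder and the filename is only a pattern taken from elsewhere in this thread, so pick the newest entry on the index page and check the archive layout before extracting:

# Sketch: download a data snapshot and unpack it, with exchaind stopped.
# <snapshot-url> is the link shown on the index page, e.g. for a file named
# like mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz.
wget "<snapshot-url>"
tar -xzvf mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz -C /data_folder/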

@cwbhhjl
Contributor

cwbhhjl commented Dec 4, 2022

@dandavid3000 Can you provide a few lines of the node log for the latest height?

@dandavid3000

dandavid3000 commented Dec 4, 2022

Hi @dandavid3000 ,

I tried multiple times with the latest snapshots

Could you please provide exchaind's version information?

The sync is pretty slow, and after half a day RAM usage can reach 40-50 GB. Linux kills the node multiple times.

Could you provide exchaind's start command? Do you run the node as RPC, validator, or archive node?

I'm pretty sure that nothing is wrong with the hardware here: a 2 TB NVMe and 64 GB RAM machine.

This machine seems good enough. Does your node run on a cloud service provider? If so, please check the disk's IOPS and throughput.

I ran the node on a local PC.

Linux 5.19.5-051905-generic (precision-3460) 	04/12/2022 	_x86_64_(24 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5,11    0,00    1,84    2,93    0,00   90,12

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
loop0             0,00         0,00         0,00         0,00         21          0          0
loop1             0,00         0,00         0,00         0,00        360          0          0
loop10            0,00         0,15         0,00         0,00     177328          0          0
loop11            0,03         1,97         0,00         0,00    2374569          0          0
loop12            0,00         0,00         0,00         0,00        431          0          0
loop13            0,01         0,38         0,00         0,00     460149          0          0
loop14            0,00         0,00         0,00         0,00         67          0          0
loop15            0,00         0,00         0,00         0,00        233          0          0
loop16            0,00         0,00         0,00         0,00        649          0          0
loop17            0,00         0,00         0,00         0,00         18          0          0
loop2             0,01         0,51         0,00         0,00     613403          0          0
loop3             0,00         0,00         0,00         0,00       1297          0          0
loop4             0,16         9,40         0,00         0,00   11335279          0          0
loop5             0,00         0,00         0,00         0,00       1083          0          0
loop6             0,06         2,14         0,00         0,00    2578811          0          0
loop7             0,00         0,00         0,00         0,00       1076          0          0
loop8             0,00         0,00         0,00         0,00        453          0          0
loop9             0,01         0,05         0,00         0,00      57515          0          0
nvme0n1          46,99       375,54       492,72       327,43  452981416  594329933  394947352
nvme1n1        1868,08     42470,52     40381,53         0,00 51228512692 48708749400          0
nvme2n1          84,15      1527,69       921,23      1413,32 1842719093 1111201988 1704765472
sda             174,56      2264,85      3999,26         0,00 2731890600 4823959116          0

I tested multiple times with different exchaind versions. The latest one is v1.6.5.10 with mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}
I[2022-12-04|21:50:24.851][1007860] Height<15617983>, Tx<11>, BlockSize<22333>, GasUsed<34817639>, InvalidTxs<0>, lastRun<607ms>, RunTx<ApplyBlock<1352ms>, abci<607ms>, persist<744ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<204865>, rdb<38993>, rdbTs<7606ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<603ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:25.542][1007860] Stopping peer for error. module=p2p peer="Peer{MConn{175.41.191.69:26656} 7fa5b1d1f1e48659fa750b6aec702418a0e75f13 out}" err=EOF
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 2): dial tcp 8.130.29.139:46966: i/o timeout. module=pex [email protected]:46966
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 5): dial tcp 13.228.20.99:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 1): dial tcp 47.91.245.244:33254: i/o timeout. module=pex [email protected]:33254
I[2022-12-04|21:50:25.727][1007860] Height<15617984>, Tx<2>, BlockSize<9677>, GasUsed<13670144>, InvalidTxs<0>, lastRun<252ms>, RunTx<ApplyBlock<861ms>, abci<253ms>, persist<606ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88273>, rdb<15420>, rdbTs<2369ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<248ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 3): dial tcp 18.192.220.49:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 3.64.37.17:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 35.74.98.204:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 13.213.145.109:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 5): dial tcp 13.213.117.128:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 54.248.224.222:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 3.37.121.32:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 5): dial tcp 13.250.251.11:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 3): dial tcp 13.125.38.24:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 52.221.126.186:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:26.812][1007860] Height<15617985>, Tx<11>, BlockSize<21692>, GasUsed<28449129>, InvalidTxs<1>, lastRun<463ms>, RunTx<ApplyBlock<1072ms>, abci<463ms>, persist<607ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<168172>, rdb<31068>, rdbTs<6280ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<458ms>, refund<1ms>]. module=main 
I[2022-12-04|21:50:27.834][1007860] Height<15617986>, Tx<7>, BlockSize<18198>, GasUsed<27690757>, InvalidTxs<0>, lastRun<456ms>, RunTx<ApplyBlock<1004ms>, abci<457ms>, persist<546ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<166789>, rdb<30781>, rdbTs<5014ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<453ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:27.894][1007860] dialing failed (attempts: 6): auth failure: secret conn failed: read tcp 192.168.0.12:41982->54.249.109.150:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:28.641][1007860] Height<15617987>, Tx<6>, BlockSize<14695>, GasUsed<21094604>, InvalidTxs<0>, lastRun<357ms>, RunTx<ApplyBlock<792ms>, abci<358ms>, persist<432ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<128990>, rdb<23130>, rdbTs<3728ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<353ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:28.830][1007860] dialing failed (attempts: 4): auth failure: secret conn failed: read tcp 192.168.0.12:46820->35.72.176.238:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:29.693][1007860] Height<15617988>, Tx<8>, BlockSize<20362>, GasUsed<28042968>, InvalidTxs<0>, lastRun<482ms>, RunTx<ApplyBlock<1033ms>, abci<483ms>, persist<548ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167322>, rdb<30908>, rdbTs<5062ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<478ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:30.510][1007860] Height<15617989>, Tx<7>, BlockSize<16132>, GasUsed<21271813>, InvalidTxs<1>, lastRun<348ms>, RunTx<ApplyBlock<802ms>, abci<348ms>, persist<451ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<129387>, rdb<23665>, rdbTs<5196ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<344ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:30.828][1007860] Height<15617990>, Tx<1>, BlockSize<5721>, GasUsed<6835078>, InvalidTxs<0>, lastRun<153ms>, RunTx<ApplyBlock<303ms>, abci<153ms>, persist<149ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<47150>, rdb<7850>, rdbTs<1673ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<150ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:31.336][1007860] Height<15617991>, Tx<3>, BlockSize<9718>, GasUsed<13822929>, InvalidTxs<0>, lastRun<190ms>, RunTx<ApplyBlock<493ms>, abci<190ms>, persist<302ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88530>, rdb<15567>, rdbTs<2546ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<187ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:33.262][1007860] Height<15617992>, Tx<12>, BlockSize<29757>, GasUsed<48598568>, InvalidTxs<0>, lastRun<805ms>, RunTx<ApplyBlock<1912ms>, abci<807ms>, persist<1104ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<278964>, rdb<54028>, rdbTs<9928ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<803ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:33.333][1007860] Height<15617993>, Tx<0>, BlockSize<1934>, GasUsed<0>, InvalidTxs<0>, lastRun<2ms>, RunTx<ApplyBlock<54ms>, abci<2ms>, persist<50ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<1510>, rdb<60>, rdbTs<22ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<0ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:34.443][1007860] Height<15617994>, Tx<10>, BlockSize<19576>, GasUsed<28286245>, InvalidTxs<1>, lastRun<455ms>, RunTx<ApplyBlock<1097ms>, abci<455ms>, persist<641ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167825>, rdb<31291>, rdbTs<6553ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<451ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:35.519][1007860] Height<15617995>, Tx<6>, BlockSize<17281>, GasUsed<27514085>, InvalidTxs<0>, lastRun<419ms>, RunTx<ApplyBlock<1060ms>, abci<420ms>, persist<639ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167015>, rdb<31363>, rdbTs<6780ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<415ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:36.552][1007860] Height<15617996>, Tx<8>, BlockSize<20339>, GasUsed<27940433>, InvalidTxs<1>, lastRun<444ms>, RunTx<ApplyBlock<1017ms>, abci<444ms>, persist<570ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167053>, rdb<31044>, rdbTs<5290ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<439ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:37.346][1007860] Height<15617997>, Tx<8>, BlockSize<14719>, GasUsed<21417054>, InvalidTxs<0>, lastRun<350ms>, RunTx<ApplyBlock<778ms>, abci<351ms>, persist<425ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<129900>, rdb<23353>, rdbTs<3773ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<347ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:38.422][1007860] Height<15617998>, Tx<9>, BlockSize<20073>, GasUsed<28139571>, InvalidTxs<1>, lastRun<450ms>, RunTx<ApplyBlock<1059ms>, abci<451ms>, persist<607ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167223>, rdb<31521>, rdbTs<5098ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<447ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:38.945][1007860] Height<15617999>, Tx<3>, BlockSize<9730>, GasUsed<13822929>, InvalidTxs<0>, lastRun<196ms>, RunTx<ApplyBlock<509ms>, abci<196ms>, persist<311ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88809>, rdb<15426>, rdbTs<2497ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<192ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:40.378][1007860] Height<15618000>, Tx<6>, BlockSize<18090>, GasUsed<27766776>, InvalidTxs<0>, lastRun<468ms>, RunTx<ApplyBlock<1419ms>, abci<468ms>, persist<949ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<166775>, rdb<30710>, rdbTs<6448ms>, savenode<4071>], DeliverTxs[RunAnte<0ms>, RunMsg<465ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:40.785][1007860] Height<15618001>, Tx<2>, BlockSize<5875>, GasUsed<6987863>, InvalidTxs<0>, lastRun<180ms>, RunTx<ApplyBlock<392ms>, abci<180ms>, persist<210ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<47118>, rdb<7699>, rdbTs<1275ms>, savenode<316>], DeliverTxs[RunAnte<0ms>, RunMsg<177ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:41.177][1007860] CommitSchedule. module=iavl Height=15618000 Tree=acc IavlHeight=30 NodeNum=32980 trc="commitSchedule<800ms>, cacheNode<15ms>, Pruning<515ms>, batchSet<28ms>, batchCommit<240ms>"
I[2022-12-04|21:50:43.708][1007860] Height<15618002>, Tx<10>, BlockSize<27219>, GasUsed<41692972>, InvalidTxs<0>, lastRun<740ms>, RunTx<ApplyBlock<2907ms>, abci<741ms>, persist<2163ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<242802>, rdb<47575>, rdbTs<9205ms>, savenode<32980>], DeliverTxs[RunAnte<0ms>, RunMsg<735ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:44.944][1007860] Height<15618003>, Tx<6>, BlockSize<14091>, GasUsed<37481454>, InvalidTxs<0>, lastRun<488ms>, RunTx<ApplyBlock<1219ms>, abci<489ms>, persist<728ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168253>, rdb<31326>, rdbTs<5197ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<484ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:46.436][1007860] Height<15618004>, Tx<9>, BlockSize<26244>, GasUsed<28850050>, InvalidTxs<1>, lastRun<535ms>, RunTx<ApplyBlock<1444ms>, abci<535ms>, persist<907ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168321>, rdb<32125>, rdbTs<5040ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<531ms>, refund<1ms>]. module=main 
I[2022-12-04|21:50:47.873][1007860] Height<15618005>, Tx<5>, BlockSize<17169>, GasUsed<27493073>, InvalidTxs<0>, lastRun<515ms>, RunTx<ApplyBlock<1421ms>, abci<515ms>, persist<905ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<166770>, rdb<31545>, rdbTs<6736ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<512ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:48.939][1007860] Height<15618006>, Tx<6>, BlockSize<13966>, GasUsed<20945233>, InvalidTxs<0>, lastRun<399ms>, RunTx<ApplyBlock<1052ms>, abci<399ms>, persist<651ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128533>, rdb<23743>, rdbTs<5791ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<395ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:55.426][1007860] CommitSchedule. module=iavl Height=15618000 Tree=evm IavlHeight=33 NodeNum=988161 trc="commitSchedule<15058ms>, cacheNode<966ms>, Pruning<7562ms>, batchSet<1180ms>, batchCommit<5349ms>"
I[2022-12-04|21:50:55.426][1007860] Height<15618007>, Tx<13>, BlockSize<32461>, GasUsed<42751153>, InvalidTxs<0>, lastRun<703ms>, RunTx<ApplyBlock<6471ms>, abci<704ms>, persist<5767ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<243135>, rdb<47694>, rdbTs<8619ms>, savenode<988161>], DeliverTxs[RunAnte<1ms>, RunMsg<699ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 2): dial tcp 111.200.241.59:7270: i/o timeout. module=pex [email protected]:7270
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 3): dial tcp 47.90.29.31:56925: i/o timeout. module=pex [email protected]:56925
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 43.129.73.94:43756: i/o timeout. module=pex [email protected]:43756
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 8.218.77.5:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 3.135.138.205:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:55.786][1007860] Stopping peer for error. module=p2p peer="Peer{MConn{35.74.8.189:26656} c8f32b793871b56a11d94336d9ce6472f893524b out}" err=EOF
I[2022-12-04|21:50:56.257][1007860] Height<15618008>, Tx<7>, BlockSize<15705>, GasUsed<21262131>, InvalidTxs<1>, lastRun<366ms>, RunTx<ApplyBlock<814ms>, abci<367ms>, persist<446ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<129071>, rdb<24110>, rdbTs<5017ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<362ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 3.37.121.32:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 54.150.183.225:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 2): dial tcp 13.214.12.163:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 3.37.251.158:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 52.221.126.186:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 35.74.98.204:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 3.64.37.17:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 54.151.166.67:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 13.213.145.109:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 52.78.236.126:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 54.248.224.222:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 2): dial tcp 54.180.61.142:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:57.113][1007860] Height<15618009>, Tx<4>, BlockSize<13486>, GasUsed<20657995>, InvalidTxs<0>, lastRun<325ms>, RunTx<ApplyBlock<798ms>, abci<325ms>, persist<472ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128139>, rdb<23678>, rdbTs<3930ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<322ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:58.148][1007860] Height<15618010>, Tx<5>, BlockSize<10180>, GasUsed<14305694>, InvalidTxs<0>, lastRun<262ms>, RunTx<ApplyBlock<1020ms>, abci<263ms>, persist<756ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<89250>, rdb<15676>, rdbTs<11898ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<259ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:58.968][1007860] Height<15618011>, Tx<4>, BlockSize<13494>, GasUsed<20657995>, InvalidTxs<0>, lastRun<302ms>, RunTx<ApplyBlock<806ms>, abci<303ms>, persist<502ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<127891>, rdb<23747>, rdbTs<5538ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<300ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:00.013][1007860] Height<15618012>, Tx<9>, BlockSize<21490>, GasUsed<28213316>, InvalidTxs<1>, lastRun<445ms>, RunTx<ApplyBlock<1031ms>, abci<446ms>, persist<584ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<167136>, rdb<31408>, rdbTs<6283ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<442ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:00.961][1007860] Height<15618013>, Tx<7>, BlockSize<14439>, GasUsed<37739646>, InvalidTxs<0>, lastRun<419ms>, RunTx<ApplyBlock<934ms>, abci<419ms>, persist<513ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168684>, rdb<31110>, rdbTs<6028ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<414ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:02.302][1007860] Height<15618014>, Tx<8>, BlockSize<21935>, GasUsed<34491069>, InvalidTxs<0>, lastRun<606ms>, RunTx<ApplyBlock<1327ms>, abci<607ms>, persist<719ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<204902>, rdb<39193>, rdbTs<7290ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<603ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:03.131][1007860] Height<15618015>, Tx<9>, BlockSize<18821>, GasUsed<21305044>, InvalidTxs<1>, lastRun<414ms>, RunTx<ApplyBlock<814ms>, abci<414ms>, persist<397ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128472>, rdb<23260>, rdbTs<3548ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<411ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:04.174][1007860] Height<15618016>, Tx<12>, BlockSize<19537>, GasUsed<45179910>, InvalidTxs<0>, lastRun<437ms>, RunTx<ApplyBlock<1028ms>, abci<438ms>, persist<587ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<206645>, rdb<38982>, rdbTs<7202ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<433ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:04.948][1007860] Height<15618017>, Tx<6>, BlockSize<16968>, GasUsed<21270964>, InvalidTxs<0>, lastRun<346ms>, RunTx<ApplyBlock<757ms>, abci<346ms>, persist<409ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128894>, rdb<23534>, rdbTs<3522ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<342ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:05.722][1007860] Height<15618018>, Tx<7>, BlockSize<16313>, GasUsed<21245151>, InvalidTxs<1>, lastRun<335ms>, RunTx<ApplyBlock<760ms>, abci<335ms>, persist<424ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128857>, rdb<23399>, rdbTs<3755ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<332ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:06.016][1007860] Height<15618019>, Tx<2>, BlockSize<5964>, GasUsed<6987851>, InvalidTxs<0>, lastRun<98ms>, RunTx<ApplyBlock<280ms>, abci<98ms>, persist<179ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<47309>, rdb<7737>, rdbTs<1249ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<95ms>, refund<0ms>]. module=main

@giskook
Contributor

giskook commented Dec 5, 2022

Hi @dandavid3000

I tested multiple times with different exchaind versions. The latest one is v1.6.5.10 with mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz.

Could you use the latest data snapshot, mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz? It will help with syncing.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}

If you run your node as a validator, you should set --node-mode=val; if you run the node as RPC, it is better to use the s1 data: mainnet-s1-fss-20221204-15764063-rocksdb.tar.gz
If you want to lower your memory usage, there are some ways to help (a combined example command is sketched after the tcmalloc steps below):

  1. lower --rocksdb.opts max_open_files; currently there is no limit, I think we can try setting it to 30000 first;
  2. lower --iavl-cache-size and --iavl-fast-storage-cache-size; the default value is 10000000;
  3. lower --commit-gap-height; the default value is 100;
  4. using tcmalloc may help:

how to install tcmalloc with OKC:

1. cd exchain
2. make tcmalloc
3. make mainnet OKCMALLOC=tcmalloc
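
A minimal sketch of a start command that applies points 1-3 above for an RPC node; the lowered values, the --commit-gap-height flag name (corrected from the apparent typo above), and the home path are illustrative assumptions, not recommended defaults:

# Sketch only: memory-reduction flags from the list above.
exchaind start \
  --chain-id exchain-66 \
  --db_backend rocksdb \
  --rocksdb.opts max_open_files=30000 \
  --iavl-cache-size=5000000 \
  --iavl-fast-storage-cache-size=5000000 \
  --commit-gap-height=50 \
  --home ${EXCHAIND_PATH}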

@dandavid3000

Hi @dandavid3000

I tested multiple times with different exchaind versions. The latest one is v1.6.5.10 with mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz.

Could you use the latest data snapshot, mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz? It will help with syncing.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}

If you run your node as a validator, you should set --node-mode=val; if you run the node as RPC, it is better to use the s1 data: mainnet-s1-fss-20221204-15764063-rocksdb.tar.gz If you want to lower your memory usage, there are some ways to help:

  1. lower --rocksdb.opts max_open_files; currently there is no limit, I think we can try setting it to 30000 first;
  2. lower --iavl-cache-size and --iavl-fast-storage-cache-size; the default value is 10000000;
  3. lower --commit-gap-height; the default value is 100;
  4. using tcmalloc may help:

how to install tcmalloc with OKC:

1. cd exchain
2. make tcmalloc
3. make mainnet OKCMALLOC=tcmalloc

Thanks for your help. I confirmed that the node is working well on my side.
Here is something I noticed during the sync: memory usage is high and can reach 40-50 GB of RAM.
I tried all the suggested flags above and confirmed they did not help.
However, after finishing the sync, memory usage was reduced to a reasonable level. Some suggestions for those who have this issue:

  • Download the latest snapshot to shorten the sync time
  • Use at least NVMe for a fast sync; ordinary SSDs are really bad with IAVL.
