-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: distinguish between normal node startup and snapshot loading #319
feat: distinguish between normal node startup and snapshot loading #319
Conversation
@panlei-coder please fix ci failure and file confliction. |
ok |
…i-coder/pikiwidb into new_log_playback_to_latest
Walkthrough此次变更涉及多个文件,主要集中在对外部项目获取的调整、配置文件的修改、存储和快照处理方法的更新,以及类方法的重构。这些更改旨在提升项目的功能和性能,同时增强系统的稳定性与灵活性。 Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant RaftNodeCmd
participant PSTORE
participant PRAFT
User->>RaftNodeCmd: 启动DoCmdSnapshot
RaftNodeCmd->>PSTORE: 获取self_snapshot_index
PSTORE->>RaftNodeCmd: 返回self_snapshot_index
RaftNodeCmd->>PRAFT: 调用DoSnapshot(self_snapshot_index)
PRAFT-->>RaftNodeCmd: 快照完成
RaftNodeCmd-->>User: 快照成功
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Outside diff range and nitpick comments (1)
src/praft/psnapshot.cc (1)
Line range hint
69-89
: The snapshot generation logic is comprehensive. However, consider replacing the hardcodedlast_log_index
with a dynamically retrieved or configured value to avoid potential issues in different environments.- auto last_log_index = 30000; // @todo + auto last_log_index = PSTORE.GetBackend(db_id)->GetStorage()->GetSmallestFlushedLogIndex(); // Dynamic retrieval
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (14)
- cmake/braft.cmake (1 hunks)
- cmake/openssl.cmake (2 hunks)
- pikiwidb.conf (1 hunks)
- save_load.sh (2 hunks)
- src/CMakeLists.txt (1 hunks)
- src/cmd_raft.cc (1 hunks)
- src/db.cc (1 hunks)
- src/praft/praft.cc (5 hunks)
- src/praft/praft.h (2 hunks)
- src/praft/psnapshot.cc (3 hunks)
- src/storage/include/storage/storage.h (1 hunks)
- src/storage/src/log_index.cc (2 hunks)
- src/storage/src/redis.h (1 hunks)
- src/storage/src/storage.cc (1 hunks)
Files skipped from review due to trivial changes (3)
- cmake/openssl.cmake
- pikiwidb.conf
- src/CMakeLists.txt
Additional context used
Shellcheck
save_load.sh
[info] 6-6: Use ./glob or -- glob so names with dashes won't become options. (SC2035)
[info] 7-7: Use ./glob or -- glob so names with dashes won't become options. (SC2035)
Additional comments not posted (12)
save_load.sh (1)
14-16
: The addition of repeatedredis-benchmark
andredis-cli
commands appears to be for stress testing or functionality checks.cmake/braft.cmake (1)
19-23
: The change in GIT repository and tag points to a different branch for testing. Ensure that this is intended for permanent integration or revert after testing.src/db.cc (1)
40-43
: Switching fromstd::bind
to a lambda expression fordo_snapshot_function
enhances readability and performance. Good use of modern C++ practices.src/praft/psnapshot.cc (1)
Line range hint
69-89
: Handling of files in the snapshot directory is appropriately managed to ensure only relevant files are considered.src/praft/praft.h (1)
140-141
: The addition of new methods and a member variable supports new functionalities and enhances the RAFT module's capabilities. Good use of default arguments for backward compatibility.Also applies to: 170-170
src/storage/src/log_index.cc (1)
172-172
: The addition of logging statements enhances system visibility and aids in debugging and monitoring. Good practice.src/storage/src/redis.h (1)
113-115
: Implementation ofGetSmallestFlushedLogIndex
looks correct.This method correctly retrieves the smallest flushed log index, which is crucial for ensuring that the node's state is consistent with the persisted state.
src/praft/praft.cc (3)
245-252
: Added methodsGetTerm
andGetLastLogIndex
enhance the functionality of the PRaft class by providing access to specific log details.Also applies to: 254-261
659-682
: The logic inon_snapshot_load
handles different startup scenarios effectively. However, ensure that the methodGetSmallestFlushedLogIndex
is implemented correctly and efficiently.Verification successful
The method
GetSmallestFlushedLogIndex
is indeed implemented in the codebase. Here are the relevant locations:
- Declaration:
src/storage/include/storage/storage.h
- Implementation:
src/storage/src/storage.cc
This confirms that the method is implemented and used correctly.
src/storage/include/storage/storage.h: LogIndex GetSmallestFlushedLogIndex() const;
src/storage/src/storage.cc: LogIndex Storage::GetSmallestFlushedLogIndex() const {
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the implementation of `GetSmallestFlushedLogIndex`. # Test: Search for the implementation and usage of `GetSmallestFlushedLogIndex`. Expect: Efficient and correct implementation. ast-grep --lang cpp --pattern $'class Storage { $$$ LogIndex GetSmallestFlushedLogIndex() const { $$$ } $$$ }' src/storage/include/storage/storage.hLength of output: 494
Script:
#!/bin/bash # Description: Verify the implementation of `GetSmallestFlushedLogIndex`. # Test: Search for the implementation and usage of `GetSmallestFlushedLogIndex`. Expect: Efficient and correct implementation. # Refine the search pattern and broaden the search scope. ast-grep --lang cpp --pattern $'LogIndex GetSmallestFlushedLogIndex() const { $$$ }'Length of output: 85
Script:
#!/bin/bash # Description: Verify the implementation of `GetSmallestFlushedLogIndex`. # Test: Search for the implementation and usage of `GetSmallestFlushedLogIndex`. Expect: Efficient and correct implementation. # Broaden the search to include potential variations of the method name and search across all files. rg --glob '!*.md' 'GetSmallestFlushedLogIndex'Length of output: 869
154-154
: Ensure the path construction uses secure and efficient string handling.src/storage/include/storage/storage.h (1)
1126-1127
: Add documentation forGetSmallestFlushedLogIndex
.It's good practice to document public methods, especially for complex functionalities like this one. Consider adding a brief comment explaining what "smallest flushed log index" refers to and how it should be used.
src/storage/src/storage.cc (1)
2520-2527
: The implementation ofGetSmallestFlushedLogIndex
method is correct and efficient.
src/cmd_raft.cc
Outdated
// @todo need to get self_snapshot_index | ||
// auto self_snapshot_index = PSTORE.GetBackend(client->GetCurrentDB())->GetStorage()->GetSmallestFlushedLogIndex(); | ||
auto s = PRAFT.DoSnapshot(); // self_snapshot_index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complete the implementation for self_snapshot_index
or clarify the TODO.
The commented-out code suggests that self_snapshot_index
is intended to be used but is not implemented. Would you like assistance in completing this part of the code?
src/praft/praft.cc
Outdated
@@ -9,6 +9,7 @@ | |||
|
|||
#include <cassert> | |||
|
|||
#include "braft/raft.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The verification process did not find any direct or indirect usage of entities from braft/raft.h
in src/praft/praft.cc
. The references to raft
found were in comments and error messages, which do not necessitate the inclusion of the header file.
- Lines: 12-12
- File:
src/praft/praft.cc
Analysis chain
Ensure the newly added #include "braft/raft.h"
is necessary.
Let's further investigate if there are any indirect usages or dependencies that might require the inclusion of braft/raft.h
.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the newly added include is used in the file.
# Test: Search for usage of entities from `braft/raft.h`. Expect: At least one usage.
ast-grep --lang cpp --pattern $'$_ { $$$ braft::raft::$$$ }' src/praft/praft.cc
Length of output: 80
Script:
#!/bin/bash
# Description: Verify if there are any indirect usages or dependencies that might require the inclusion of `braft/raft.h`.
# Test 1: Search for any usage of `braft` namespace in the file.
ast-grep --lang cpp --pattern $'$_ { $$$ braft::$$$ }' src/praft/praft.cc
# Test 2: Search for any usage of `raft` keyword in the file.
rg --word-regexp 'raft' src/praft/praft.cc
Length of output: 343
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
兔子 留言说这个头文件没用到,却引入了,请确认
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexStocks 经过验证,#include "braft/raft.h"
头文件在 src/praft/praft.cc
中没有被直接使用。仅在注释和错误消息中出现了 raft
关键字。因此,可以安全地移除这个头文件。
Analysis chain
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify if the newly added include is used in the file.
# Test: Search for usage of entities from `braft/raft.h`. Expect: At least one usage.
ast-grep --lang cpp --pattern $'$_ { $$$ braft::raft::$$$ }' src/praft/praft.cc
# Test 1: Search for any usage of `braft` namespace in the file.
ast-grep --lang cpp --pattern $'$_ { $$$ braft::$$$ }' src/praft/praft.cc
# Test 2: Search for any usage of `raft` keyword in the file.
rg --word-regexp 'raft' src/praft/praft.cc
Length of output: 424
@@ -1,4 +1,5 @@ | |||
#!/bin/bash | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The save_load.sh
script contains glob patterns (rm -fr *
) that could potentially be affected by filenames with dashes. To prevent filenames with dashes from being interpreted as options, consider using ./*
instead of *
.
- Line 2:
rm -fr *
should be updated torm -fr ./ *
- Line 7:
rm -fr *
should be updated torm -fr ./ *
Analysis chain
Consider using ./
or --
with glob patterns to prevent filenames with dashes from being interpreted as options.
Also applies to: 7-7
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if filenames with dashes are being interpreted as options.
# Test: Check if filenames with dashes exist and could potentially be interpreted as options.
find . -name "*-*" -exec echo "Filename with dash found: {}" \;
Length of output: 3925
Script:
#!/bin/bash
# Description: Check for glob patterns in save_load.sh that could be affected by filenames with dashes.
# Test: Search for glob patterns in save_load.sh
grep -E '\*|\?' save_load.sh
Length of output: 217
#include "braft/snapshot.h" | ||
#include "butil/files/file_path.h" | ||
|
||
#include "pstd/log.h" | ||
#include "pstd/pstd_string.h" | ||
|
||
#include "config.h" | ||
#include "praft.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additions and changes to snapshot path parsing are necessary for the new functionality. Consider adding unit tests to cover potential edge cases in the parsing logic.
Would you like me to help in creating unit tests for the snapshot path parsing logic?
Also applies to: 31-51
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/praft/praft.cc (5 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/praft/praft.cc
@CodiumAI-Agent /improve |
PR Code Suggestions ✨
|
@CodiumAI-Agent /review |
PR Review 🔍
Code feedback:
|
@CodiumAI-Agent /improve |
PR Code Suggestions ✨
|
src/praft/psnapshot.cc
Outdated
@@ -66,6 +79,14 @@ braft::FileAdaptor* PPosixFileSystemAdaptor::open(const std::string& path, int o | |||
PSTORE.HandleTaskSpecificDB(tasks); | |||
AddAllFiles(snapshot_path, &snapshot_meta_memtable, snapshot_path); | |||
|
|||
// update snapshot last log index and last_log_term | |||
auto& new_meta = const_cast<braft::SnapshotMeta&>(snapshot_meta_memtable.meta()); | |||
auto last_log_index = 30000; // @todo PSTORE.GetBackend(db_id)->GetStorage()->GetSmallestFlushedLogIndex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这行是不是忘换了
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯嗯 忘记换了
*/ | ||
// @todo GetSmallestFlushedLogIndex | ||
uint64_t replay_point = PSTORE.GetBackend(db_id_)->GetStorage()->GetSmallestFlushedLogIndex(); | ||
node_->set_self_playback_point(replay_point); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node_->set_self_playback_point(replay_point+1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里 set_self_playback_point 设置的 last_applied_index,所以就是 replay_point
src/praft/praft.h
Outdated
@@ -137,6 +137,8 @@ class PRaft : public braft::StateMachine { | |||
std::string GetGroupID() const; | |||
braft::NodeStatus GetNodeStatus() const; | |||
butil::Status GetListPeers(std::vector<braft::PeerId>* peers); | |||
uint64_t GetTerm(uint64_t log_index); | |||
uint64_t GetLastLogIndex(bool is_flush = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如上所示,把 uint64_t 改为 LogIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (1)
src/praft/psnapshot.cc (1)
验证失败:
db_id
在快照元数据更新逻辑中未正确初始化。
- 在
src/praft/psnapshot.cc
文件中,db_id
被初始化为-1
,但没有其他赋值操作。- 这可能导致
assert(db_id >= 0);
断言失败,并引发潜在问题。请检查并确保
db_id
在使用前被正确初始化。Analysis chain
Line range hint
69-89
:
验证快照元数据更新逻辑确保
db_id
被正确初始化并用于更新快照元数据。Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that db_id is correctly initialized and used in snapshot metadata update. # Test: Search for db_id initialization and usage. Expect: Proper initialization and usage in snapshot metadata update. rg --type cpp -A 5 'db_id ='Length of output: 1375
Script:
#!/bin/bash # Description: Trace the assignment of db_id in src/praft/psnapshot.cc to ensure it is set to a valid value before usage. # Search for assignments to db_id in src/praft/psnapshot.cc rg 'db_id\s*=\s*' -A 5 src/praft/psnapshot.ccLength of output: 212
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (8)
- CMakeLists.txt (1 hunks)
- cmake/braft.cmake (1 hunks)
- pikiwidb.conf (2 hunks)
- src/cmd_raft.cc (1 hunks)
- src/praft/CMakeLists.txt (1 hunks)
- src/praft/praft.cc (6 hunks)
- src/praft/praft.h (3 hunks)
- src/praft/psnapshot.cc (3 hunks)
Files skipped from review due to trivial changes (2)
- CMakeLists.txt
- cmake/braft.cmake
Additional comments not posted (14)
src/praft/CMakeLists.txt (1)
38-38
: 更改已批准添加
src/storage/include
目录是必要的,以便访问存储相关的头文件。src/praft/psnapshot.cc (3)
14-14
: 更改已批准添加
braft/snapshot.h
是必要的,以便处理快照相关的功能。
18-18
: 更改已批准添加
pstd/pstd_string.h
是必要的,以便处理字符串相关的功能。
21-21
: 更改已批准添加
praft.h
是必要的,以便访问 PRaft 类的功能。src/praft/praft.h (3)
21-21
: 更改已批准添加
storage/storage.h
是必要的,以便访问存储相关的功能。
141-142
: 更改已批准新方法
GetTerm
和GetLastLogIndex
是必要的,以便检索日志索引和任期。
171-171
: 更改已批准新成员变量
is_node_first_start_up_
是必要的,以跟踪节点启动状态。pikiwidb.conf (2)
322-322
: 确认减少 db-instance-num 的原因将
db-instance-num
从 3 改为 1 可能会影响数据库性能。请确认这样做的原因。
346-346
: 确认启用 RAFT 的影响启用
use-raft
选项会对系统的分布式一致性产生重大影响。请确认所有相关配置已正确设置,并且系统已准备好启用 RAFT。src/praft/praft.cc (5)
245-252
: 验证 GetTerm 方法的错误处理请确保
GetTerm
方法在节点未初始化时正确处理错误,并确认返回类型是否合适。
254-261
: 验证 GetLastLogIndex 方法的错误处理请确保
GetLastLogIndex
方法在节点未初始化时正确处理错误,并确认返回类型是否合适。
543-552
: 验证 DoSnapshot 方法的重构请确保
DoSnapshot
方法在处理同步和异步快照操作时正确处理所有情况。
657-687
: 验证节点启动和快照加载的逻辑请确保新逻辑正确区分正常启动和快照加载,并正确处理重放点。
12-12
: 确认包含 braft/raft.h 的必要性请确认
#include "braft/raft.h"
是否在本文件中有直接或间接的使用。
int db_id = -1; | ||
|
||
// parse snapshot path | ||
butil::FilePath parse_snapshot_path(path); | ||
std::vector<std::string> components; | ||
bool is_find_db = false; | ||
parse_snapshot_path.GetComponents(&components); | ||
for (auto component : components) { | ||
for (const auto& component : components) { | ||
snapshot_path += component + "/"; | ||
|
||
if (is_find_db && pstd::String2int(component, &db_id)) { | ||
is_find_db = false; | ||
} | ||
|
||
if (component.find("snapshot_") != std::string::npos) { | ||
break; | ||
} else if (component == "db") { | ||
is_find_db = true; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议:为快照路径解析逻辑添加单元测试
快照路径解析逻辑对于新功能是必要的。建议添加单元测试以覆盖潜在的边界情况。
需要我帮助创建快照路径解析逻辑的单元测试吗?
auto self_snapshot_index = PSTORE.GetBackend(client->GetCurrentDB())->GetStorage()->GetSmallestFlushedLogIndex(); | ||
INFO("DoCmdSnapshot self_snapshot_index:{}", self_snapshot_index); | ||
auto s = PRAFT.DoSnapshot(self_snapshot_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
更改已批准
检索并使用 self_snapshot_index
是必要的,以便处理新的快照功能。
需要帮助完成 self_snapshot_index 的实现吗?
注释的代码表明 self_snapshot_index
计划被使用,但尚未实现。需要我帮助完成这部分代码吗?
cmake/braft.cmake
Outdated
GIT_REPOSITORY "https://github.com/pikiwidb/braft.git" | ||
GIT_TAG master | ||
GIT_TAG v1.1.2-alpha1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
改为 v1.1.2-alpha2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯嗯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (6)
- cmake/braft.cmake (1 hunks)
- pikiwidb.conf (1 hunks)
- src/db.cc (3 hunks)
- src/storage/src/log_index.h (1 hunks)
- src/storage/src/redis.h (1 hunks)
- src/storage/src/storage.cc (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- cmake/braft.cmake
Additional comments not posted (5)
src/db.cc (1)
97-103
: 处理旧存储的逻辑在打开新存储之前关闭并重置旧存储的处理逻辑是合理的。这确保了资源得到正确释放,避免了潜在的资源泄露问题。
src/storage/src/log_index.h (1)
137-137
: 增强常量正确性通过将
GetLastFlushIndex
方法返回类型改为常量引用,并将方法标记为 const,增强了代码的常量正确性。这有助于防止不必要的修改并可能提高性能,因为避免了不必要的数据复制。pikiwidb.conf (1)
322-322
: 调整数据库实例数量将
db-instance-num
从 3 改为 1,这可能是为了简化配置或优化某些环境的性能。确保这一改变与系统需求一致,并且不会引入配置错误。src/storage/src/redis.h (1)
113-113
: 新增方法获取最小刷新日志索引在
Redis
类中新增GetSmallestFlushedLogIndex
方法,用于获取最小刷新日志索引。这对于确保数据一致性和优化日志管理非常重要。确保该方法正确实现,并与现有功能良好集成。src/storage/src/storage.cc (1)
2519-2526
: 审查新增的GetSmallestFlushedLogIndex
方法此方法用于计算最小的已刷新日志索引。代码逻辑看起来正确,使用了
INT64_MAX
作为初始值,并在所有实例中找到最小值。这种方法是有效的,因为它确保了即使在极端情况下也能找到正确的最小日志索引。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (1)
src/praft/praft.cc (1)
Line range hint
113-153
: 优化初始化逻辑和错误处理函数
PRaft::Init
在初始化过程中多次调用server_.reset()
来处理错误状态。建议将这些重复的清理逻辑封装到一个单独的函数中,以提高代码的可维护性和可读性。- server_.reset(); + resetServerAndNode();并在类中添加一个新的成员函数来处理重置逻辑。
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- src/praft/praft.cc (5 hunks)
- src/storage/include/storage/storage.h (1 hunks)
Additional comments not posted (1)
src/storage/include/storage/storage.h (1)
1126-1127
: 新增方法审查:GetSmallestFlushedLogIndex此方法似乎用于获取最小的已刷新日志索引,这对于区分正常节点启动和快照加载非常关键。建议添加适当的文档注释来描述此方法的用途、预期行为以及任何可能的边界条件或异常情况。
if (is_node_first_start_up_) { | ||
// get replay point of one db's | ||
/* | ||
1. When a node starts normally, because all memory data is flushed to disks and | ||
snapshots are truncated to the latest, the flush-index and apply-index are the | ||
same when the node starts, so the maximum log index should be obtained. | ||
2. When a node is improperly shut down and restarted, the minimum flush-index should | ||
be obtained as the starting point for fault recovery. | ||
*/ | ||
uint64_t replay_point = PSTORE.GetBackend(db_id_)->GetStorage()->GetSmallestFlushedLogIndex(); | ||
node_->set_self_playback_point(replay_point); | ||
is_node_first_start_up_ = false; | ||
INFO("set replay_point: {}", replay_point); | ||
|
||
/* | ||
If a node has just joined the cluster and does not have any data, | ||
it does not load the local snapshot at startup. Therefore, | ||
LoadDBFromCheckPoint is required after loading the snapshot from the leader. | ||
*/ | ||
if (GetLastLogIndex() != 0) { | ||
return 0; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
优化快照加载逻辑
函数 PRaft::on_snapshot_load
使用布尔标志 is_node_first_start_up_
来区分首次启动。考虑使用状态枚举而不是布尔值,以清晰地表示节点的不同状态。
- bool is_node_first_start_up_;
+ enum class NodeStartupState { First, Subsequent };
+ NodeStartupState startup_state_;
在逻辑中相应地替换这个状态的检查和设置。
Committable suggestion was skipped due to low confidence.
storage::LogIndex PRaft::GetTerm(uint64_t log_index) { | ||
if (!node_) { | ||
ERROR("Node is not initialized"); | ||
return 0; | ||
} | ||
|
||
return node_->get_term(log_index); | ||
} | ||
|
||
storage::LogIndex PRaft::GetLastLogIndex(bool is_flush) { | ||
if (!node_) { | ||
ERROR("Node is not initialized"); | ||
return 0; | ||
} | ||
|
||
return node_->get_last_log_index(is_flush); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
检查默认返回值和错误处理
函数 PRaft::GetTerm
和 PRaft::GetLastLogIndex
在节点未初始化时返回 0。考虑使用一个明确的错误代码或异常处理机制,而不是返回一个可能会被误解的魔术数字。
- return 0;
+ throw std::runtime_error("Node is not initialized");
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
storage::LogIndex PRaft::GetTerm(uint64_t log_index) { | |
if (!node_) { | |
ERROR("Node is not initialized"); | |
return 0; | |
} | |
return node_->get_term(log_index); | |
} | |
storage::LogIndex PRaft::GetLastLogIndex(bool is_flush) { | |
if (!node_) { | |
ERROR("Node is not initialized"); | |
return 0; | |
} | |
return node_->get_last_log_index(is_flush); | |
} | |
storage::LogIndex PRaft::GetTerm(uint64_t log_index) { | |
if (!node_) { | |
ERROR("Node is not initialized"); | |
throw std::runtime_error("Node is not initialized"); | |
} | |
return node_->get_term(log_index); | |
} | |
storage::LogIndex PRaft::GetLastLogIndex(bool is_flush) { | |
if (!node_) { | |
ERROR("Node is not initialized"); | |
throw std::runtime_error("Node is not initialized"); | |
} | |
return node_->get_last_log_index(is_flush); | |
} |
@@ -66,6 +79,14 @@ braft::FileAdaptor* PPosixFileSystemAdaptor::open(const std::string& path, int o | |||
PSTORE.HandleTaskSpecificDB(tasks); | |||
AddAllFiles(snapshot_path, &snapshot_meta_memtable, snapshot_path); | |||
|
|||
// update snapshot last log index and last_log_term | |||
auto& new_meta = const_cast<braft::SnapshotMeta&>(snapshot_meta_memtable.meta()); | |||
auto last_log_index = PSTORE.GetBackend(db_id)->GetStorage()->GetSmallestFlushedLogIndex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已经维护了一个 last_flush_index_ ,这里使用 GetLastFlushIndex 可以少很多计算。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯嗯,已经用了
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (5)
- cmake/braft.cmake (1 hunks)
- cmake/openssl.cmake (1 hunks)
- src/CMakeLists.txt (1 hunks)
- src/db.cc (3 hunks)
- src/praft/CMakeLists.txt (1 hunks)
Files skipped from review due to trivial changes (2)
- cmake/openssl.cmake
- src/CMakeLists.txt
Files skipped from review as they are similar to previous changes (2)
- cmake/braft.cmake
- src/praft/CMakeLists.txt
// close the old storage, then open the new storage | ||
std::unique_ptr<storage::Storage> old_storage = std::move(storage_); | ||
if (old_storage != nullptr) { | ||
old_storage->Close(); | ||
old_storage.reset(); | ||
} | ||
storage_ = std::make_unique<storage::Storage>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议:在 DB::LoadDBFromCheckpoint
中使用 Lambda 表达式。
为了保持一致性,建议将 storage_options.do_snapshot_function
中的 std::bind
替换为 Lambda 表达式。
- storage_options.do_snapshot_function =
- std::bind(&pikiwidb::PRaft::DoSnapshot, &pikiwidb::PRAFT, std::placeholders::_1, std::placeholders::_2);
+ storage_options.do_snapshot_function = [raft = &pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) {
+ raft->DoSnapshot(std::forward<decltype(self_snapshot_index)>(self_snapshot_index),
+ std::forward<decltype(is_sync)>(is_sync));
+ };
Committable suggestion was skipped due to low confidence.
storage_options.do_snapshot_function = [raft = &pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) { | ||
raft->DoSnapshot(std::forward<decltype(self_snapshot_index)>(self_snapshot_index), | ||
std::forward<decltype(is_sync)>(is_sync)); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议:在 Lambda 表达式中使用值捕获。
使用引用捕获 raft
可能导致潜在的生命周期问题,尤其是在 raft
对象的生命周期可能比 storage_options
短的场景中。建议改为值捕获或确保 raft
的生命周期总是足够长。
- storage_options.do_snapshot_function = [raft = &pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) {
+ storage_options.do_snapshot_function = [raft = pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) {
raft->DoSnapshot(std::forward<decltype(self_snapshot_index)>(self_snapshot_index),
std::forward<decltype(is_sync)>(is_sync));
};
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
storage_options.do_snapshot_function = [raft = &pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) { | |
raft->DoSnapshot(std::forward<decltype(self_snapshot_index)>(self_snapshot_index), | |
std::forward<decltype(is_sync)>(is_sync)); | |
}; | |
storage_options.do_snapshot_function = [raft = pikiwidb::PRAFT](auto&& self_snapshot_index, auto&& is_sync) { | |
raft->DoSnapshot(std::forward<decltype(self_snapshot_index)>(self_snapshot_index), | |
std::forward<decltype(is_sync)>(is_sync)); | |
}; |
1.方案:
(1)如果数据库正常关闭,理论上是不需要进行日志回放的,因为数据库在关闭是会将内存数据刷盘,并将日志截断到最新的位置,这样就可以保证正常关闭下的节点启动是no-op的。那么在数据库非正常关闭时,节点启动如果加载了快照,只会是一个多余的操作,因为快照本身就是空的,只有snapshot_meta数据。所以,在节点启动的时候,第一次调用on_snapshot_load接口理论上是不需要进行快照的加载,需要对第一个启动和正常快照安装进行区分。但是有一点需要注意,节点在刚加入集群且没有数据时,它在初始化之后是不会加载本地的快照(因为快照文件夹是空的),所以在安装来自 leader 的快照时需要 LoadDBFromCheckpoint 过程。
(2)节点在启动的时候,它是按照快照中 snapshot_meta 记录的 last_applied_index 作为日志回放点,那么在故障重启时,节点需要过滤掉一些日志,这无疑是影响效率的。我们可以在节点调用 on_load_snapshot 开始位置就设置这个日志回放点的位置,根据 DB 中最小 flush-index 来设置(这里的逻辑需要进一步的完善)。
(3)braft 的 pr : pikiwidb/braft#4
2.测试
因为获取 DB 中最小的 flush-index 目前还没有完善,所以在测试的时候将一个 DB 下 rocksdb 的数量设置成了1,使用save_load.sh 脚本进行测试。
Summary by CodeRabbit
新功能
Storage
和Redis
类)。PRaft
类中的GetTerm
和GetLastLogIndex
方法。配置更新
pikiwidb.conf
配置文件:db-instance-num
从 3 改为 1,use-raft
从 "no" 改为 "yes"。CMakeLists.txt
中ADDRESS_SANITIZER
选项默认设置,从ON
改为OFF
。性能改进
DB::Open
方法的快照功能和数据库加载逻辑。代码重构
DoSnapshot
方法以处理同步和异步快照操作。PRaft
类中的注释错误,并优化了prefix
的构造方式。