Add case test_metric_longhorn_backup #2145

Open · wants to merge 1 commit into base: master

Conversation

@roger-ryao (Contributor) commented Oct 18, 2024

Which issue(s) this PR fixes:

longhorn/longhorn#9430

What this PR does / why we need it:

This PR adds a test case to cover the backup metrics.

Special notes for your reviewer:

@c3y1huang @ChanYiLin

Additional documentation or context

longhorn/longhorn#9429

Summary by CodeRabbit

  • New Features
    • Introduced enhanced testing for backup metrics, including validation of metrics for user-created and recurring backups.
    • Added a new helper function to improve metric validation across all nodes.
  • Bug Fixes
    • Improved existing test functions to ensure comprehensive coverage of backup operations.
  • Tests
    • Added new test functions to validate backup metrics and ensure correct reporting.
    • Defined new constants for recurring job parameters to streamline testing processes.

@roger-ryao roger-ryao requested a review from a team as a code owner October 18, 2024 08:51
coderabbitai bot commented Oct 18, 2024

Walkthrough

The changes involve modifications to the test_metric.py file, enhancing the testing of backup metrics and recurring jobs. New imports from the backupstore module have been added, and several functions related to backup operations are introduced. Additionally, constants for recurring job parameters are defined, and a new test function is created to validate the metrics associated with backups. Existing test functions are updated to ensure comprehensive coverage of the new backup functionality.

Changes

File Change Summary
manager/integration/tests/test_metric.py - Added imports from backupstore module.
- Introduced new functions for backup operations.
- Defined new constants for recurring job parameters.
- Added test_metric_longhorn_backup to validate backup metrics.
- Modified existing test functions to integrate new backup functionality.

Poem

In the meadow, where backups grow,
New tests sprout, like seeds we sow.
Metrics dance in the morning light,
Ensuring our backups are just right.
With every hop, our code takes flight! 🐇✨



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (1)
manager/integration/tests/test_metric.py (1)

161-175: Ensure consistent parameter naming between functions

The parameter metric_labels in wait_for_metric_sum_on_all_nodes is passed to check_metric_sum_on_all_nodes as metric_labels, but check_metric_sum_on_all_nodes defines this parameter as expected_labels. For consistency and readability, consider using the same parameter name in both functions.
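A minimal sketch of the suggested rename, with both helpers using expected_labels. The metric-checking body is stubbed out here for illustration only; the real check_metric_sum_on_all_nodes in test_metric.py scrapes each node's metrics endpoint:

```python
import time

RETRY_COUNTS = 3
RETRY_INTERVAL = 0.01


def check_metric_sum_on_all_nodes(client, core_api, metric_name,
                                  expected_labels, expected_value):
    # Stub for illustration: the real helper sums the metric across
    # all nodes and asserts it equals expected_value.
    assert expected_value == 42


def wait_for_metric_sum_on_all_nodes(client, core_api, metric_name,
                                     expected_labels, expected_value):
    # Same parameter name (`expected_labels`) as the checker, so the
    # call sites read consistently.
    for _ in range(RETRY_COUNTS):
        time.sleep(RETRY_INTERVAL)
        try:
            check_metric_sum_on_all_nodes(client, core_api, metric_name,
                                          expected_labels, expected_value)
            return
        except AssertionError:
            continue
    check_metric_sum_on_all_nodes(client, core_api, metric_name,
                                  expected_labels, expected_value)


wait_for_metric_sum_on_all_nodes(None, None, "longhorn_backup_state",
                                 {"volume": "vol-1"}, 42)
```

The client, metric name, and label values above are placeholders, not the test's actual fixtures.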

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 460c7b0 and 8424fb5.

📒 Files selected for processing (1)
  • manager/integration/tests/test_metric.py (4 hunks)
🧰 Additional context used
🪛 Ruff
manager/integration/tests/test_metric.py

4-4: backupstore imported but unused

Remove unused import: backupstore

(F401)

Comment on lines +703 to +711
for backup in backups:
    if backup['snapshotName'] == "volume-head":
        continue

    backup_size = int(backup['size'])

assert backup_size > 0

⚠️ Potential issue

Potential issue with backup size assignment

In the loop iterating over backups, the variable backup_size is assigned the size of each backup without accumulating. If multiple backups exist, backup_size will hold the size of the last backup processed. Ensure that you are capturing the correct backup size intended for the test, possibly by identifying the specific backup needed or summing the sizes if appropriate.
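One way to address this, sketched here with hypothetical backup-list entries mirroring the structure used in the test, is to sum the sizes of all non-volume-head backups instead of overwriting the variable on each iteration:

```python
# Hypothetical backup list; real entries come from bv.backupList().data.
backups = [
    {'snapshotName': 'volume-head', 'size': '0'},
    {'snapshotName': 'snap-1', 'size': '1024'},
    {'snapshotName': 'snap-2', 'size': '2048'},
]

# Accumulate sizes across all completed backups, skipping volume-head.
backup_size = sum(
    int(b['size']) for b in backups
    if b['snapshotName'] != "volume-head"
)

assert backup_size > 0
```

If the test instead needs the size of one specific backup, filtering by the backup's snapshot name would be the alternative.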

Comment on lines +751 to +759
for backup in backups:
    if backup['snapshotName'] == "volume-head":
        continue

    recurring_backup_size = int(backup['size'])

assert recurring_backup_size > 0

⚠️ Potential issue

Potential issue with recurring backup size assignment

Similar to the user backup size, in the loop iterating over backups, recurring_backup_size is assigned the size of each backup without accumulating. If multiple backups exist, recurring_backup_size will hold the size of the last backup processed. Ensure that you are capturing the correct backup size for the recurring backup test, possibly by identifying the specific backup associated with the recurring job.

Comment on lines +701 to +712
backup_size = 0
backups = bv.backupList().data
for backup in backups:
    if backup['snapshotName'] == "volume-head":
        continue

    backup_size = int(backup['size'])

assert backup_size > 0


🛠️ Refactor suggestion

Refactor to eliminate code duplication when calculating backup sizes

The code for obtaining backup_size (lines 701-709) and recurring_backup_size (lines 749-757) is similar. Consider refactoring this duplicated code into a helper function to improve maintainability and reduce redundancy.

Here's an example of a helper function:

def get_backup_size(backup_volume):
    backups = backup_volume.backupList().data
    for backup in backups:
        if backup['snapshotName'] == "volume-head":
            continue
        return int(backup['size'])
    return 0

You can then use this function to obtain the backup sizes:

backup_size = get_backup_size(bv)

Also applies to: 749-757

@@ -1,12 +1,14 @@
import pytest
import requests
import time
import backupstore

⚠️ Potential issue

Remove unused import backupstore

The import statement import backupstore is not used in the code and should be removed to clean up the imports.

Apply this diff to remove the unused import:

-import backupstore

@roger-ryao roger-ryao self-assigned this Oct 18, 2024
@roger-ryao roger-ryao force-pushed the issue9430 branch 2 times, most recently from 233c3f7 to 32cfb33 Compare October 18, 2024 10:13
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
manager/integration/tests/test_metric.py (1)

710-710: Add more descriptive assertion messages

The current assertions for backup size being greater than 0 lack descriptive messages. Adding more informative messages can help in debugging if the test fails.

Consider updating the assertions like this:

assert backup_size > 0, f"User backup size should be greater than 0, but got {backup_size}"
assert recurring_backup_size > 0, f"Recurring backup size should be greater than 0, but got {recurring_backup_size}"

Also applies to: 758-758

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 233c3f7 and 32cfb33.

📒 Files selected for processing (1)
  • manager/integration/tests/test_metric.py (5 hunks)
🧰 Additional context used
🔇 Additional comments (3)
manager/integration/tests/test_metric.py (3)

38-56: LGTM: New imports and constants are well-organized

The new imports from the common module and the added constants for recurring job parameters are relevant to the new test function and follow the existing code style.


160-173: LGTM: New helper function is well-implemented

The wait_for_metric_sum_on_all_nodes function provides a useful abstraction for waiting and checking metric sums across all nodes. It follows the existing code style and uses an appropriate retry mechanism.


654-773: Overall, good addition to test coverage for backup metrics

The new test function test_metric_longhorn_backup is a valuable addition to the test suite. It comprehensively covers both user-created and recurring backups, verifying the longhorn_backup_actual_size_bytes and longhorn_backup_state metrics. The implementation is generally good, following the existing code style and using appropriate helper functions.

A few suggestions for improvement have been made:

  1. Replace the fixed sleep with a more robust waiting mechanism.
  2. Refactor duplicate code for getting backup size into a helper function.
  3. Add more descriptive assertion messages.

These changes will enhance the reliability and maintainability of the test.

Comment on lines +745 to +749
# wait for the recurring backup job to run.
time.sleep(60)
bv = client.by_id_backupVolume(volume_name)
wait_for_backup_count(bv, 1)

🛠️ Refactor suggestion

Consider replacing sleep with a more robust waiting mechanism

Using a fixed time.sleep(60) may introduce unnecessary delays or may not be sufficient if the backup takes longer. Consider using a polling mechanism or a wait function that checks for the specific condition you're waiting for.

Here's a suggested approach:

def wait_for_backup_completion(client, volume_name, timeout=300, interval=2):
    start_time = time.time()
    while time.time() - start_time < timeout:
        bv = client.by_id_backupVolume(volume_name)
        if len(bv.backupList().data) > 0:
            return True
        time.sleep(interval)
    raise TimeoutError(f"Backup for volume {volume_name} did not complete within {timeout} seconds")

# Replace the sleep and subsequent lines with:
wait_for_backup_completion(client, volume_name)
bv = client.by_id_backupVolume(volume_name)

Comment on lines +703 to +711
backup_size = 0
backups = bv.backupList().data
for backup in backups:
    if backup['snapshotName'] == "volume-head":
        continue

    backup_size = int(backup['size'])

assert backup_size > 0

🛠️ Refactor suggestion

Refactor duplicate code for getting backup size

The code for obtaining the backup size is duplicated for user backups and recurring backups. Consider refactoring this into a helper function to improve maintainability and reduce redundancy.

Here's a suggested helper function:

def get_backup_size(backup_volume):
    backups = backup_volume.backupList().data
    for backup in backups:
        if backup['snapshotName'] == "volume-head":
            continue
        return int(backup['size'])
    return 0

# Then use it like this:
backup_size = get_backup_size(bv)
assert backup_size > 0, "Backup size should be greater than 0"

Also applies to: 751-758

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 32cfb33 and 70caa8a.

📒 Files selected for processing (1)
  • manager/integration/tests/test_metric.py (5 hunks)
🧰 Additional context used
🔇 Additional comments (9)
manager/integration/tests/test_metric.py (9)

38-44: LGTM: New imports added for backup-related functionality

The new imports are relevant to the added backup testing functionality. They include necessary functions for creating and managing backups, as well as working with recurring jobs.


46-55: LGTM: Constants added for recurring job configuration

These constants are well-defined and will be used in the new test function for configuring recurring backup jobs. The naming is clear and follows the existing coding style.


654-681: LGTM: Well-documented test function

The test function is well-documented with a clear scenario description, issue reference, and expected outcomes. This makes it easy for other developers to understand the purpose and expectations of the test.


693-711: Potential issue with backup size assignment

In the loop iterating over backups, the variable backup_size is assigned the size of each backup without accumulating. If multiple backups exist, backup_size will hold the size of the last backup processed. Ensure that you are capturing the correct backup size intended for the test, possibly by identifying the specific backup needed or summing the sizes if appropriate.


713-726: LGTM: Proper metric verification for user-initiated backup

The code correctly verifies both the size and state metrics for the user-initiated backup. The use of wait_for_metric_sum_on_all_nodes ensures that the test waits for the metrics to be updated across all nodes.


731-747: LGTM: Well-structured recurring backup job creation

The creation of the recurring backup job is well-structured and uses the previously defined constants. The use of check_recurring_jobs and wait_for_cron_job_count ensures that the job is properly created before proceeding.


746-749: Consider replacing sleep with a more robust waiting mechanism

Using a fixed time.sleep(60) may introduce unnecessary delays or may not be sufficient if the backup takes longer. Consider using a polling mechanism or a wait function that checks for the specific condition you're waiting for.

Here's a suggested approach:

def wait_for_backup_completion(client, volume_name, timeout=300, interval=2):
    start_time = time.time()
    while time.time() - start_time < timeout:
        bv = client.by_id_backupVolume(volume_name)
        if len(bv.backupList().data) > 0:
            return True
        time.sleep(interval)
    raise TimeoutError(f"Backup for volume {volume_name} did not complete within {timeout} seconds")

# Replace the sleep and subsequent lines with:
wait_for_backup_completion(client, volume_name)
bv = client.by_id_backupVolume(volume_name)

751-759: Potential issue with recurring backup size assignment

Similar to the user backup size, in the loop iterating over backups, recurring_backup_size is assigned the size of each backup without accumulating. If multiple backups exist, recurring_backup_size will hold the size of the last backup processed. Ensure that you are capturing the correct backup size for the recurring backup test, possibly by identifying the specific backup associated with the recurring job.


761-774: LGTM: Proper metric verification for recurring backup

The code correctly verifies both the size and state metrics for the recurring backup. The use of wait_for_metric_sum_on_all_nodes ensures that the test waits for the metrics to be updated across all nodes.

Comment on lines +160 to +173
def wait_for_metric_sum_on_all_nodes(client, core_api, metric_name, metric_labels, expected_value):  # NOQA
    for _ in range(RETRY_COUNTS):
        time.sleep(RETRY_INTERVAL)

        try:
            check_metric_sum_on_all_nodes(client, core_api, metric_name,
                                          metric_labels, expected_value)
            return
        except AssertionError:
            continue

    check_metric_sum_on_all_nodes(client, core_api, metric_name,
                                  metric_labels, expected_value)


🛠️ Refactor suggestion

Consider enhancing the wait_for_metric_sum_on_all_nodes function

The function is well-implemented, but consider the following improvements:

  1. Add parameters for RETRY_COUNTS and RETRY_INTERVAL to make the function more flexible.
  2. Consider using exponential backoff instead of fixed intervals for more efficient retrying.
  3. Add logging to provide more visibility into the waiting process.
  4. Consider returning a boolean to indicate success or failure instead of relying on an exception.

Example implementation:

def wait_for_metric_sum_on_all_nodes(client, core_api, metric_name, metric_labels, expected_value, max_wait_time=300, initial_interval=1):
    start_time = time.time()
    interval = initial_interval
    while time.time() - start_time < max_wait_time:
        try:
            check_metric_sum_on_all_nodes(client, core_api, metric_name, metric_labels, expected_value)
            return True
        except AssertionError:
            time.sleep(interval)
            interval = min(interval * 2, 60)  # exponential backoff, max 60 seconds
    
    return False

This implementation provides more flexibility and better handles long-running waits.

Comment on lines +684 to +691
# create a volume and attach it to a node.
volume_size = 50 * Mi
client.create_volume(name=volume_name,
                     numberOfReplicas=1,
                     size=str(volume_size))
volume = wait_for_volume_detached(client, volume_name)
volume.attach(hostId=self_hostId)
volume = wait_for_volume_healthy(client, volume_name)

🛠️ Refactor suggestion

Consider parameterizing volume creation

The volume creation process is hardcoded. Consider parameterizing the volume size and number of replicas to make the test more flexible and reusable.

Example:

def create_test_volume(client, name, size=50*Mi, replicas=1):
    client.create_volume(name=name, numberOfReplicas=replicas, size=str(size))
    volume = wait_for_volume_detached(client, name)
    volume.attach(hostId=get_self_host_id())
    return wait_for_volume_healthy(client, name)

volume = create_test_volume(client, volume_name)

Comment on lines +728 to +729
# delete the existing backup before creating a recurring backup job.
delete_backup_volume(client, volume_name)

🛠️ Refactor suggestion

Consider adding error handling for backup volume deletion

The delete_backup_volume call should be wrapped in a try-except block to handle potential errors during deletion. This will make the test more robust.

Example:

try:
    delete_backup_volume(client, volume_name)
except Exception as e:
    pytest.fail(f"Failed to delete backup volume: {str(e)}")

# delete the existing backup before creating a recurring backup job.
delete_backup_volume(client, volume_name)

roger-ryao (Contributor, Author) commented:

Hi @c3y1huang

In my test case test_metric_longhorn_backup, I referenced wait_for_metric_count_all_nodes for similar usage. However, I have a question about the test design: although I delete the backups of the volume that were not created by the recurring job, I am concerned about potential data-caching issues when reusing the same volume for a backup recurring job while checking the Longhorn backup metric. Could you please help review this test case?

Thanks

roger-ryao (Contributor, Author) commented:

After longhorn/longhorn-manager#3216 was merged, my test case passed, and I did not observe any potential data caching issues when using the same volume to apply a backup recurring job to check the Longhorn backup metrics.

(screenshot: Screenshot_20241024_154554)

@yangchiu (Member) commented Nov 6, 2024

cc @ChanYiLin @c3y1huang as it relates to longhorn/longhorn#9429
