Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-16327 control: Update dmg storage query usage for MD-on-SSD P2 #15418

Merged
merged 2 commits into from
Nov 11, 2024

Conversation

tanabarr
Copy link
Contributor

@tanabarr tanabarr commented Oct 28, 2024

Add --mem-ratio and --show-usable flags to dmg storage query usage to
improve usability in MD-on-SSD phase-2 mode. Update table output to
show space usage per rank and tier in MD-on-SSD mode. Add relevant
pool create and storage query usage cmd examples in the admin guide.

Required-githooks: true

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follows the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

@tanabarr tanabarr requested review from a team as code owners October 28, 2024 23:47
Copy link

github-actions bot commented Oct 28, 2024

Ticket title is 'Update dmg storage query usage for MD-on-SSD P2'
Status is 'In Review'
Labels: 'control-plane,usability,md_on_ssd2'
https://daosio.atlassian.net/browse/DAOS-16327

Copy link
Contributor

@knard-intel knard-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor fix needed.

docs/admin/pool_operations.md Outdated Show resolved Hide resolved
docs/admin/pool_operations.md Outdated Show resolved Hide resolved
docs/admin/pool_operations.md Outdated Show resolved Hide resolved
src/control/cmd/dmg/pretty/storage_query.go Show resolved Hide resolved
src/control/server/storage/bdev.go Outdated Show resolved Hide resolved
@tanabarr tanabarr force-pushed the tanabarr/control-queryusage-mdonssd-2 branch from 0644365 to ef771e6 Compare November 1, 2024 20:34
@tanabarr tanabarr requested review from a team as code owners November 1, 2024 20:34
@tanabarr tanabarr changed the base branch from feature/vos_on_blob_p2 to niu/vos_on_blob_p2/p2_landing November 1, 2024 20:36
@tanabarr tanabarr removed request for a team November 1, 2024 20:43
@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15418/4/display/redirect

Base automatically changed from niu/vos_on_blob_p2/p2_landing to master November 4, 2024 13:02
@gnailzenh gnailzenh requested review from a team as code owners November 4, 2024 13:02
@daosbuild1
Copy link
Collaborator

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15418/5/testReport/

@daltonbohning
Copy link
Contributor

Does this need a rebase before reviews?

@brianjmurrell
Copy link
Contributor

Does this need a rebase before reviews?

It definitely needs a rebase (merge at this point since reviewing has started) so probably better to do that before more reviewing happens.

Copy link
Contributor

@brianjmurrell brianjmurrell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Packaging changes LGTM.

Features: control
Skip-func-hw-test-medium-md-on-ssd: false
Skip-func-hw-test-large-md-on-ssd: false
Required-githooks: true

Signed-off-by: Tom Nabarro <[email protected]>
@tanabarr tanabarr force-pushed the tanabarr/control-queryusage-mdonssd-2 branch from 78e0927 to ece44f2 Compare November 5, 2024 13:52
knard-intel
knard-intel previously approved these changes Nov 5, 2024
Copy link
Contributor

@knard-intel knard-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall is good for me. just minor fixes which could be done in a follow-up PR.


$ dmg pool create bob2 --size 100% --mem-ratio 50%
3. If we then try the same with 2 ranks/engines where bdev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the first example, there is only one rank from my understanding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


6. Adding another engine/rank on the same host results in more than double DATA
capacity because RAM-disk capacity is halved across two engines/ranks on the same
host and this results in a reduction of META and increase in DATA per-rank sizes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: an explanation on why total Memory File Size (i.e. sum of the two ranks) is lower than the one engine configuration, could be helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

---- -----
T1 data,meta,wal

Rank T1-Total T1-Usable T1-Usage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanation of the last columns meaning could be helpful. I am not sure if it is the calculated according to the available size or the usable one

@daosbuild1
Copy link
Collaborator

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15418/7/testReport/

src/control/server/storage/bdev.go Outdated Show resolved Hide resolved
src/control/lib/control/storage.go Show resolved Hide resolved
docs/admin/pool_operations.md Outdated Show resolved Hide resolved
docs/admin/pool_operations.md Outdated Show resolved Hide resolved
docs/admin/pool_operations.md Outdated Show resolved Hide resolved
pretty.PrintHostStorageUsageMap(resp.HostStorage, &out)
}
if dbg.Len() > 0 {
cmd.Debugf("%s", dbg.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cmd.Debugf("%s", dbg.String())
cmd.Debug(dbg.String())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is useful because some parsing of escape characters is different if you run it through format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the newlines aren't parsed when you just print the string? Didn't know that. That sort of seems like a bug in our logging to be honest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I think it was something to do with the percentage sign parsing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hm. That makes some sense, but it seems like the non-format version shouldn't be mangling anything at all.

src/control/cmd/dmg/storage_query.go Show resolved Hide resolved
src/control/cmd/dmg/storage_query_test.go Outdated Show resolved Hide resolved
src/control/cmd/dmg/pretty/storage_query.go Outdated Show resolved Hide resolved
src/control/cmd/dmg/pretty/storage_query.go Outdated Show resolved Hide resolved
Features: control
Skip-func-hw-test-medium-md-on-ssd: false
Skip-func-hw-test-large-md-on-ssd: false
Required-githooks: true

Signed-off-by: Tom Nabarro <[email protected]>
@tanabarr tanabarr requested a review from kjacque November 8, 2024 09:07
@tanabarr
Copy link
Contributor Author

tanabarr commented Nov 8, 2024

bump @mjmac @knard-intel reviews please

Copy link
Contributor

@knard-intel knard-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tanabarr tanabarr self-assigned this Nov 11, 2024
@tanabarr tanabarr added control-plane work on the management infrastructure of the DAOS Control Plane meta-on-ssd Metadata on SSD Feature labels Nov 11, 2024
@tanabarr tanabarr requested a review from a team November 11, 2024 10:17
@NiuYawei NiuYawei merged commit d4070e8 into master Nov 11, 2024
62 checks passed
@NiuYawei NiuYawei deleted the tanabarr/control-queryusage-mdonssd-2 branch November 11, 2024 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
control-plane work on the management infrastructure of the DAOS Control Plane meta-on-ssd Metadata on SSD Feature
Development

Successfully merging this pull request may close these issues.

7 participants