diff --git a/docs/admin/pool_operations.md b/docs/admin/pool_operations.md index 8c3db202c4b..4cee49dad8d 100644 --- a/docs/admin/pool_operations.md +++ b/docs/admin/pool_operations.md @@ -186,179 +186,391 @@ allocated in memory, set `dmg pool create --mem-ratio` option to `50%`. This implies that the ratio of metadata on memory and on storage should be 0.5 and therefore metadata-on-SSD allocation is twice that of metadata-in-memory. -A MD-on-SSD pool created with a `--mem-ratio` between 0 and 100 percent is -said to be operating in "phase-2" mode. +#### MD-on-SSD dmg pool create --mem-ratio examples -#### MD-on-SSD phase-2 pool create examples +These examples cover the recommended way to create a pool in MD-on-SSD +mode with a fractional mem-ratio and using the `--size` percentage option. -These examples cover the recommended way to create a pool in MD-on-SSD phase-2 -mode using the `--size` percentage option. +1. The first simplistic example is run on a single host with a single +rank/engine where bdev roles META and DATA are not shared. -The following example is run on a single host with dual engines where bdev -roles META and DATA are not shared. Two pools are created with VOS index file -size equal to half the meta-blob size (`--mem-ratio 50%`). Both pools use -roughly half the original capacity available (first using 50% and the second -100% of the remainder). +This is a snippet of the server config file engine section showing storage +definitions with `bdev_roles` "meta" and "data" assigned to separate tiers: +```bash + storage: + - + class: ram + scm_mount: /mnt/daos + - + class: nvme + bdev_list: ["0000:81:00.0"] + bdev_roles: [wal,meta] + - + class: nvme + bdev_list: ["0000:82:00.0"] + bdev_roles: [data] +``` + +This pool command requests to use all available storage and maintain a 1:1 +Memory-File to Metadata-Storage size ratio (mem-ratio): +```bash +$ dmg pool create bob --size 100% --mem-ratio 100% + +Pool created with 15.91%,84.09% storage tier ratio +-------------------------------------------------- + UUID : cf70ac58-a9cd-4efd-8a96-a53697353633 + Service Leader : 0 + Service Ranks : 0 + Storage Ranks : 0 + Total Size : 951 GB + Metadata Storage : 151 GB (151 GB / rank) + Data Storage : 800 GB (800 GB / rank) + Memory File Size : 151 GB (151 GB / rank) +``` Rough calculations: `dmg storage scan` shows that for each rank, one 800GB SSD is assigned for each tier (first: WAL+META, second: DATA). `df -h /mnt/daos*` -reports usable ramdisk capacity for each rank is 66GiB. -- Expected Data storage would then be 400GB for a 50% capacity first pool and - 100% capacity second pool per-rank. -- Expected Meta storage at 50% mem-ratio would be `66GiB*2 = 132GiB == 141GB` - giving ~70GB for 50% first and 100% second pools. -- Expected Memory file size (aggregated) is `66GiB/2 = 35GB` for 50% first and - 100% second pools. +reports usable ramdisk capacity for the single rank is 142 GiB (152 GB). +- Expected Data storage would then be 800GB for the pool (one rank). +- Expected Meta storage at 100% mem-ratio would be the total ramdisk capacity. +- Expected Memory-File size would be identical to Meta storage size. + + +2. 
If the `--mem-ratio` is reduced to 50% in the above example, we end up with +a Metadata-Storage size that is double the Memory-File size (a larger +proportion of META is allocated on SSD because of the lower mem-ratio), which +results in a larger total pool size: ```bash -$ dmg pool create bob --size 50% --mem-ratio 50% +$ dmg pool create bob --size 100% --mem-ratio 50% -Pool created with 14.86%,85.14% storage tier ratio +Pool created with 27.46%,72.54% storage tier ratio -------------------------------------------------- - UUID : 47060d94-c689-4981-8c89-011beb063f8f + UUID : 8e2cf446-3382-4d69-9b84-51e4e9a20c08 Service Leader : 0 - Service Ranks : [0-1] - Storage Ranks : [0-1] - Total Size : 940 GB - Metadata Storage : 140 GB (70 GB / rank) - Data Storage : 800 GB (400 GB / rank) - Memory File Size : 70 GB (35 GB / rank) + Service Ranks : 0 + Storage Ranks : 0 + Total Size : 1.1 TB + Metadata Storage : 303 GB (303 GB / rank) + Data Storage : 800 GB (800 GB / rank) + Memory File Size : 151 GB (151 GB / rank) +``` + + +3. Next we try the same but with bdev roles META and DATA shared. This +illustrates how metadata overheads are accounted for when the same devices +share roles (and will be used to store both metadata and data). -$ dmg pool create bob2 --size 100% --mem-ratio 50% +This is a snippet of the server config file engine section showing storage +definitions with `bdev_roles` "meta" and "data" assigned to the same (single) +tier: +```bash + storage: + - + class: ram + scm_mount: /mnt/daos + - + class: nvme + bdev_list: ["0000:81:00.0", "0000:82:00.0"] + bdev_roles: [wal,meta,data] +``` + +This pool command requests to use all available storage and maintain a 1:1 +Memory-File to Metadata-Storage size ratio (mem-ratio): +```bash +$ dmg pool create bob --size 100% --mem-ratio 100% -Pool created with 14.47%,85.53% storage tier ratio +Pool created with 17.93%,82.07% storage tier ratio -------------------------------------------------- - UUID : bdbef091-f0f8-411d-8995-f91c4efc690f - Service Leader : 1 - Service Ranks : [0-1] - Storage Ranks : [0-1] - Total Size : 935 GB - Metadata Storage : 135 GB (68 GB / rank) - Data Storage : 800 GB (400 GB / rank) - Memory File Size : 68 GB (34 GB / rank) - -$ dmg pool query bob - -Pool 47060d94-c689-4981-8c89-011beb063f8f, ntarget=32, disabled=0, leader=0, version=1, state=Ready -Pool health info: -- Rebuild idle, 0 objs, 0 recs -Pool space info: -- Target count:32 -- Total memory-file size: 70 GB -- Metadata storage: - Total size: 140 GB - Free: 131 GB, min:4.1 GB, max:4.1 GB, mean:4.1 GB -- Data storage: - Total size: 800 GB - Free: 799 GB, min:25 GB, max:25 GB, mean:25 GB - -$ dmg pool query bob2 - -Pool bdbef091-f0f8-411d-8995-f91c4efc690f, ntarget=32, disabled=0, leader=1, version=1, state=Ready -Pool health info: -- Rebuild idle, 0 objs, 0 recs -Pool space info: -- Target count:32 -- Total memory-file size: 68 GB -- Metadata storage: - Total size: 135 GB - Free: 127 GB, min:4.0 GB, max:4.0 GB, mean:4.0 GB -- Data storage: - Total size: 800 GB - Free: 799 GB, min:25 GB, max:25 GB, mean:25 GB + UUID : b24df7a5-17d5-4e87-9986-2dff18078b6e + Service Leader : 0 + Service Ranks : 0 + Storage Ranks : 0 + Total Size : 1.5 TB + Metadata Storage : 151 GB (151 GB / rank) + Data Storage : 1.3 TB (1.3 TB / rank) + Memory File Size : 151 GB (151 GB / rank) ``` -The following examples are with a single host with dual engines where bdev -roles WAL, META and DATA are shared. +Looking at this output and comparing with example no. 
1 we observe that +because both SSDs are sharing META and DATA roles, more capacity is available +for DATA. + -Single pool with VOS index file size equal to the meta-blob size (`--mem-ratio -100%`). +4. If the `--mem-ratio` is then reduced to 50% in the above example, we end up +with double the Metadata-Storage size which detracts from the DATA capacity. ```bash -$ dmg pool create bob --size 100% --mem-ratio 100% +$ dmg -i pool create bob -z 100% --mem-ratio 50% + +Creating DAOS pool with 100% of all storage +Pool created with 20.32%,79.68% storage tier ratio +-------------------------------------------------- + UUID : 2b4147eb-ade3-4d76-82c4-b9c2c377f8d1 + Service Leader : 0 + Service Ranks : 0 + Storage Ranks : 0 + Total Size : 1.5 TB + Metadata Storage : 303 GB (303 GB / rank) + Data Storage : 1.2 TB (1.2 TB / rank) + Memory File Size : 151 GB (151 GB / rank) +``` + +META has been doubled at the cost of DATA capacity. -Pool created with 5.93%,94.07% storage tier ratio + +5. Adding another engine/rank on the same host results in more than double DATA +capacity because RAM-disk capacity is halved across two engines/ranks on the same +host and this results in a reduction of META and increase in DATA per-rank sizes. +The RAM-disk capacity for each engine is based on half of the available system +RAM. When only one engine exists on the host, all of the available system RAM +(less some calculated reserve) is used for the engine RAM-disk. + +```bash +$ dmg -i pool create bob -z 100% --mem-ratio 50% + +Creating DAOS pool with 100% of all storage +Pool created with 8.65%,91.35% storage tier ratio ------------------------------------------------- - UUID : bad54f1d-8976-428b-a5dd-243372dfa65c + UUID : ee7af142-3d72-45bf-9dc2-e1060c0de5be Service Leader : 1 Service Ranks : [0-1] Storage Ranks : [0-1] - Total Size : 2.4 TB - Metadata Storage : 140 GB (70 GB / rank) - Data Storage : 2.2 TB (1.1 TB / rank) - Memory File Size : 140 GB (70 GB / rank) + Total Size : 3.0 TB + Metadata Storage : 258 GB (129 GB / rank) + Data Storage : 2.7 TB (1.4 TB / rank) + Memory File Size : 129 GB (64 GB / rank) +``` + +6. A larger pool with 6 engines/ranks across 3 hosts using the same shared-role +configuration and pool-create commandline as the previous example. + +```bash +$ dmg -i pool create bob -z 100% --mem-ratio 50% + +Creating DAOS pool with 100% of all storage +Pool created with 8.65%,91.35% storage tier ratio +------------------------------------------------- + UUID : 678833f3-ba0a-4947-a2e8-cef45c3c3977 + Service Leader : 3 + Service Ranks : [0-1,3-5] + Storage Ranks : [0-5] + Total Size : 8.9 TB + Metadata Storage : 773 GB (129 GB / rank) + Data Storage : 8.2 TB (1.4 TB / rank) + Memory File Size : 386 GB (64 GB / rank) ``` -Rough calculations: 1.2TB of usable space is returned from storage scan and -because roles are shared required META (70GB) is reserved so only 1.1TB is -provided for data. +Here the size has increased linearly with the per-rank sizes remaining the +same. + + +7. Now for a more involved example with shared roles. Create two pools of +roughly equal size each using half available capacity and a `--mem-ratio` of +50%. + +An administrator can use the `dmg storage query usage` command to gauge +available capacity across ranks and tiers. 
Adding `--show-usable` flag shows +capacity that could be used to store DATA once META overheads of a new pool +have been taken into account: -Logging shows: ```bash -DEBUG 2024/09/24 15:44:38.554431 pool.go:1139: added smd device c7da7391-9077-4eb6-9f4a-a3d656166236 (rank 1, ctrlr 0000:d8:00.0, roles "data,meta,wal") as usable: device state="NORMAL", smd-size 623 GB (623307128832), ctrlr-total-free 623 GB (623307128832) -DEBUG 2024/09/24 15:44:38.554516 pool.go:1139: added smd device 18c7bf45-7586-49ba-93c0-cbc08caed901 (rank 1, ctrlr 0000:d9:00.0, roles "data,meta,wal") as usable: device state="NORMAL", smd-size 554 GB (554050781184), ctrlr-total-free 1.2 TB (1177357910016) -DEBUG 2024/09/24 15:44:38.554603 pool.go:1246: based on minimum available ramdisk capacity of 70 GB and mem-ratio 1.00 with 70 GB of reserved metadata capacity, the maximum per-rank sizes for a pool are META=70 GB (69792169984 B) DATA=1.1 TB (1107565740032 B) +$ dmg -i storage query usage --show-usable --mem-ratio 50% -l wolf-[310-312] + +Tier Roles +---- ----- +T1 data,meta,wal + +Rank T1-Total T1-Usable T1-Usage +---- -------- --------- -------- +0 1.6 TB 1.4 TB 14 % +1 1.6 TB 1.4 TB 14 % +2 1.6 TB 1.4 TB 14 % +3 1.6 TB 1.4 TB 14 % +4 1.6 TB 1.4 TB 14 % +5 1.6 TB 1.4 TB 14 % ``` -Now the same as above but with a single pool with VOS index file size equal to -a quarter of the meta-blob size (`--mem-ratio 25%`). +The last column indicates the percentage of the total capacity that is not +usable for new pool data. + +First create a pool using 50% of available capacity: ```bash -$ dmg pool create bob --size 100% --mem-ratio 25% +$ dmg -i pool create bob -z 50% --mem-ratio 50% -Pool created with 23.71%,76.29% storage tier ratio --------------------------------------------------- - UUID : 999ecf55-474e-4476-9f90-0b4c754d4619 - Service Leader : 0 - Service Ranks : [0-1] - Storage Ranks : [0-1] - Total Size : 2.4 TB - Metadata Storage : 558 GB (279 GB / rank) - Data Storage : 1.8 TB (898 GB / rank) - Memory File Size : 140 GB (70 GB / rank) +Creating DAOS pool with 50% of all storage +Pool created with 8.65%,91.35% storage tier ratio +------------------------------------------------- + UUID : 11b9dd1f-edc9-47c7-a61f-cee52d0e7ed4 + Service Leader : 3 + Service Ranks : [1-5] + Storage Ranks : [0-5] + Total Size : 4.5 TB + Metadata Storage : 386 GB (64 GB / rank) + Data Storage : 4.1 TB (681 GB / rank) + Memory File Size : 193 GB (32 GB / rank) +``` + +`dmg storage query usage` can be used to show available capacity on each +rank and tier after the first pool has been created: +```bash +$ dmg -i storage query usage -l wolf-[310-312] + +Tier Roles +---- ----- +T1 data,meta,wal + +Rank T1-Total T1-Free T1-Usage +---- -------- ------- -------- +0 1.6 TB 749 GB 53 % +1 1.6 TB 749 GB 53 % +2 1.6 TB 749 GB 53 % +3 1.6 TB 752 GB 53 % +4 1.6 TB 749 GB 53 % +5 1.6 TB 749 GB 53 % ``` -Rough calculations: 1.2TB of usable space is returned from storage scan and -because roles are shared required META (279GB) is reserved so only ~900GB is -provided for data. 
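+Rough calculation (approximate, using the rounded per-rank sizes reported
+above): the first pool takes about 64 GB of META plus 681 GB of DATA per rank,
+or roughly 745 GB of the ~1.6 TB tier; the further difference from the
+reported 749 GB free can be attributed to WAL space, other per-pool overheads
+and rounding of the reported totals, which is consistent with the ~53% usage
+shown for each rank.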
+`dmg storage query usage --show-usable` can show usable capacity taking into +account META overheads: -Logging shows: ```bash -DEBUG 2024/09/24 16:16:00.172719 pool.go:1246: based on minimum available ramdisk capacity of 70 GB and mem-ratio 0.25 with 279 GB of reserved metadata capacity, the maximum per-rank sizes for a pool are META=279 GB (279168679936 B) DATA=898 GB (898189230080 B) +$ dmg -i storage query usage --show-usable --mem-ratio 50% -l wolf-[310-312] + +Tier Roles +---- ----- +T1 data,meta,wal + +Rank T1-Total T1-Usable T1-Usage +---- -------- --------- -------- +0 1.6 TB 573 GB 64 % +1 1.6 TB 573 GB 64 % +2 1.6 TB 573 GB 64 % +3 1.6 TB 578 GB 63 % +4 1.6 TB 573 GB 64 % +5 1.6 TB 573 GB 64 % ``` -Now with 6 ranks and a single pool with VOS index file size equal to a half of -the meta-blob size (`--mem-ratio 50%`). +Second create a pool using 100% of remaining capacity: ```bash -$ dmg pool create bob --size 100% --mem-ratio 50% +$ dmg -i pool create ben -z 100% --mem-ratio 50% + +Creating DAOS pool with 100% of all storage +Pool created with 9.80%,90.20% storage tier ratio +------------------------------------------------- + UUID : 48391eed-71b2-47b4-9ad1-780c7143c027 + Service Leader : 5 + Service Ranks : [0,2-5] + Storage Ranks : [0-5] + Total Size : 3.8 TB + Metadata Storage : 374 GB (62 GB / rank) + Data Storage : 3.4 TB (573 GB / rank) + Memory File Size : 187 GB (31 GB / rank) +``` + +The Memory-File-Size is roughly half the dual-rank-per-host RAM-disk size of 64 +GB. The META per-rank size is double the Memory-File-Size as expected for a 50% +mem-ratio. + +Comparing per-rank values with examples no. 5 & 6, we can see that the first +pool has roughly 50% of their META and DATA per-rank values, as expected. The +second pool is slightly smaller, so the cumulative size of the two pools reads +`4.5+3.8 == 8.3 TB` rather than the `8.9 TB` of example no. 6, which can be +partly explained by extra per-pool overheads and rounding in size calculations. -Pool created with 11.86%,88.14% storage tier ratio + +8. Now for a similar experiment to example no. 7 but with separate META and +DATA roles. + +```bash +$ dmg -i storage query usage --show-usable --mem-ratio 50% -l wolf-[310-312] +Tier Roles +---- ----- +T1 meta,wal +T2 data + +Rank T1-Total T1-Usable T1-Usage T2-Total T2-Usable T2-Usage +---- -------- --------- -------- -------- --------- -------- +0 800 GB 800 GB 0 % 800 GB 800 GB 0 % +1 800 GB 800 GB 0 % 800 GB 800 GB 0 % +2 800 GB 800 GB 0 % 800 GB 800 GB 0 % +3 800 GB 800 GB 0 % 800 GB 800 GB 0 % +4 800 GB 800 GB 0 % 800 GB 800 GB 0 % +5 800 GB 800 GB 0 % 800 GB 800 GB 0 % +``` + +Because META and DATA roles exist on separate tiers, usable space is the same +as free. 
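+For reference, a tier layout along the lines of example no. 1 is assumed for
+this experiment, with the META (plus WAL) and DATA roles on separate SSDs in
+each engine's config section (PCI addresses below are illustrative only):
+```bash
+  storage:
+  -
+    class: ram
+    scm_mount: /mnt/daos
+  -
+    class: nvme
+    bdev_list: ["0000:81:00.0"]
+    bdev_roles: [wal,meta]
+  -
+    class: nvme
+    bdev_list: ["0000:82:00.0"]
+    bdev_roles: [data]
+```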
+ +First create a pool using 50% of available capacity: + +```bash +$ dmg -i pool create bob -z 50% --mem-ratio 50% + +Creating DAOS pool with 50% of all storage +Pool created with 13.87%,86.13% storage tier ratio -------------------------------------------------- - UUID : 4fa38199-23a9-4b4d-aa9a-8b9838cad1d6 - Service Leader : 1 + UUID : 0b8a7b40-25d6-44a6-9c1b-23cea0aa3ea2 + Service Leader : 0 Service Ranks : [0-2,4-5] Storage Ranks : [0-5] - Total Size : 7.1 TB - Metadata Storage : 838 GB (140 GB / rank) - Data Storage : 6.2 TB (1.0 TB / rank) - Memory File Size : 419 GB (70 GB / rank) + Total Size : 2.8 TB + Metadata Storage : 386 GB (64 GB / rank) + Data Storage : 2.4 TB (400 GB / rank) + Memory File Size : 193 GB (32 GB / rank) +``` + +Pool size is now much smaller because DATA is confined to a single SSD on +single tier and META is limited by Memory-File-Size. + +`dmg storage query usage` can be used to show available capacity on each rank +and tier after the first pool has been created: +```bash +$ dmg -i storage query usage -l wolf-[310-312] +Tier Roles +---- ----- +T1 meta,wal +T2 data + +Rank T1-Total T1-Free T1-Usage T2-Total T2-Free T2-Usage +---- -------- ------- -------- -------- ------- -------- +0 800 GB 629 GB 21 % 800 GB 400 GB 49 % +1 800 GB 629 GB 21 % 800 GB 400 GB 49 % +2 800 GB 629 GB 21 % 800 GB 400 GB 49 % +3 800 GB 632 GB 20 % 800 GB 400 GB 49 % +4 800 GB 629 GB 21 % 800 GB 400 GB 49 % +5 800 GB 629 GB 21 % 800 GB 400 GB 49 % ``` -Rough calculations: 1177 GB of usable space is returned from storage scan and -because roles are shared required META (140 GB) is reserved so only 1037 GB is -provided for data (per-rank). +Second create a pool using 100% of remaining capacity: -Logging shows: ```bash -DEBUG 2024/09/24 16:40:41.570331 pool.go:1139: added smd device c921c7b9-5f5c-4332-a878-0ebb8191c160 (rank 1, ctrlr 0000:d8:00.0, roles "data,meta,wal") as usable: device state="NORMAL", smd-size 623 GB (623307128832), ctrlr-total-free 623 GB (623307128832) -DEBUG 2024/09/24 16:40:41.570447 pool.go:1139: added smd device a071c3cf-5de1-4911-8549-8c5e8f550554 (rank 1, ctrlr 0000:d9:00.0, roles "data,meta,wal") as usable: device state="NORMAL", smd-size 554 GB (554050781184), ctrlr-total-free 1.2 TB (1177357910016) -DEBUG 2024/09/24 16:40:41.570549 pool.go:1246: based on minimum available ramdisk capacity of 70 GB and mem-ratio 0.50 with 140 GB of reserved metadata capacity, the maximum per-rank sizes for a pool are META=140 GB (139584339968 B) DATA=1.0 TB (1037773570048 B) +$ dmg -i pool create ben -z 100% --mem-ratio 50% + +Creating DAOS pool with 100% of all storage +Pool created with 13.47%,86.53% storage tier ratio +-------------------------------------------------- + UUID : ac90de4b-522e-40de-ab50-da4c85b1000d + Service Leader : 5 + Service Ranks : [0-2,4-5] + Storage Ranks : [0-5] + Total Size : 2.8 TB + Metadata Storage : 374 GB (62 GB / rank) + Data Storage : 2.4 TB (400 GB / rank) + Memory File Size : 187 GB (31 GB / rank) ``` +It should be noted that this example is only with one SSD per tier and is not +necessarily representative of a production environment. The purpose is to give +some idea of how to use the tool commands. + +Capacity can be best utilized by understanding assignment of roles and SSDs +across tiers and the tuning of the mem-ratio pool create option. 
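+As a rough rule of thumb implied by the examples above (an approximation that
+ignores WAL and other per-pool overheads): the per-rank Memory File Size is
+bounded by the per-engine RAM-disk capacity, the per-rank Metadata Storage is
+the Memory File Size divided by the mem-ratio, and whatever remains on the
+tier(s) carrying the DATA role is available for data. A minimal sketch of the
+META arithmetic for example no. 5, using the rounded values reported by dmg:
+```bash
+# Approximate per-rank META sizing for example no. 5 (--mem-ratio 50%).
+mem_file_gb=64   # bounded by the per-engine RAM-disk capacity
+mem_ratio=0.50
+# META per rank ~= Memory File Size / mem-ratio
+echo "META per rank ~ $(echo "$mem_file_gb / $mem_ratio" | bc) GB"  # ~128 GB (dmg reports 129 GB)
+```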
+ ### Listing Pools diff --git a/src/control/cmd/dmg/pretty/pool.go b/src/control/cmd/dmg/pretty/pool.go index d28cc2f8061..987aa1e98d7 100644 --- a/src/control/cmd/dmg/pretty/pool.go +++ b/src/control/cmd/dmg/pretty/pool.go @@ -57,32 +57,45 @@ func printTierBytesRow(fmtName string, tierBytes uint64, numRanks int) txtfmt.Ta } } -func getPoolCreateRespRows(mdOnSSD bool, tierBytes []uint64, tierRatios []float64, numRanks int) (title string, rows []txtfmt.TableRow) { +func getPoolCreateRespRows(tierBytes []uint64, tierRatios []float64, numRanks int) (title string, rows []txtfmt.TableRow) { title = "Pool created with " tierName := "SCM" - if mdOnSSD { - tierName = "Metadata" - } for tierIdx, tierRatio := range tierRatios { if tierIdx > 0 { title += "," tierName = "NVMe" - if mdOnSSD { - tierName = "Data" - } } title += PrintTierRatio(tierRatio) fmtName := fmt.Sprintf("Storage tier %d (%s)", tierIdx, tierName) - if mdOnSSD { - fmtName = tierName + " Storage" + rows = append(rows, printTierBytesRow(fmtName, tierBytes[tierIdx], numRanks)) + } + title += " storage tier ratio" + + return +} + +func getPoolCreateRespRowsMDOnSSD(tierBytes []uint64, tierRatios []float64, numRanks int, memFileBytes uint64) (title string, rows []txtfmt.TableRow) { + title = "Pool created with " + tierName := "Metadata" + + for tierIdx, tierRatio := range tierRatios { + if tierIdx > 0 { + title += "," + tierName = "Data" } + + title += PrintTierRatio(tierRatio) + fmtName := tierName + " Storage" rows = append(rows, printTierBytesRow(fmtName, tierBytes[tierIdx], numRanks)) } title += " storage tier ratio" - return title, rows + // Print memory-file size for MD-on-SSD. + rows = append(rows, printTierBytesRow("Memory File Size", memFileBytes, numRanks)) + + return } // PrintPoolCreateResponse generates a human-readable representation of the pool create @@ -122,17 +135,14 @@ func PrintPoolCreateResponse(pcr *control.PoolCreateResp, out io.Writer, opts .. "Total Size": humanize.Bytes(totalSize * uint64(numRanks)), }) - mdOnSsdEnabled := pcr.MemFileBytes > 0 - - title, tierRows := getPoolCreateRespRows(mdOnSsdEnabled, pcr.TierBytes, tierRatios, - numRanks) - - // Print memory-file to meta-blob ratio for MD-on-SSD. - if mdOnSsdEnabled { - tierRows = append(tierRows, printTierBytesRow("Memory File Size", - pcr.MemFileBytes, numRanks)) + var title string + var tierRows []txtfmt.TableRow + if pcr.MemFileBytes > 0 { + title, tierRows = getPoolCreateRespRowsMDOnSSD(pcr.TierBytes, tierRatios, numRanks, + pcr.MemFileBytes) + } else { + title, tierRows = getPoolCreateRespRows(pcr.TierBytes, tierRatios, numRanks) } - fmtArgs = append(fmtArgs, tierRows...) 
_, err := fmt.Fprintln(out, txtfmt.FormatEntity(title, fmtArgs)) diff --git a/src/control/cmd/dmg/pretty/storage_nvme_test.go b/src/control/cmd/dmg/pretty/storage_nvme_test.go index 39dc8b6c17c..e99c72ac17d 100644 --- a/src/control/cmd/dmg/pretty/storage_nvme_test.go +++ b/src/control/cmd/dmg/pretty/storage_nvme_test.go @@ -25,7 +25,7 @@ func TestPretty_PrintNVMeController(t *testing.T) { ctrlrWithSmd := func(idx int32, roleBits int) *storage.NvmeController { c := storage.MockNvmeController(idx) sd := storage.MockSmdDevice(nil, idx) - sd.Roles = storage.BdevRoles{storage.OptionBits(roleBits)} + sd.Roles = storage.BdevRolesFromBits(roleBits) sd.Rank = ranklist.Rank(idx) c.SmdDevices = []*storage.SmdDevice{sd} return c diff --git a/src/control/cmd/dmg/pretty/storage_query.go b/src/control/cmd/dmg/pretty/storage_query.go index 894cbe8aaa4..87cd73eb530 100644 --- a/src/control/cmd/dmg/pretty/storage_query.go +++ b/src/control/cmd/dmg/pretty/storage_query.go @@ -9,21 +9,29 @@ package pretty import ( "fmt" "io" + "sort" "strings" "github.com/dustin/go-humanize" "github.com/pkg/errors" + "github.com/daos-stack/daos/src/control/common" "github.com/daos-stack/daos/src/control/lib/control" "github.com/daos-stack/daos/src/control/lib/txtfmt" "github.com/daos-stack/daos/src/control/server/storage" ) +var ( + errNoMetaRole = errors.New("no meta role detected") + errInconsistentRoles = errors.New("roles inconsistent between hosts") + errInsufficientScan = errors.New("insufficient info in scan response") +) + // PrintHostStorageUsageMap generates a human-readable representation of the supplied // HostStorageMap struct and writes utilization info to the supplied io.Writer. -func PrintHostStorageUsageMap(hsm control.HostStorageMap, out io.Writer) error { +func PrintHostStorageUsageMap(hsm control.HostStorageMap, out io.Writer) { if len(hsm) == 0 { - return nil + return } hostsTitle := "Hosts" @@ -44,19 +52,243 @@ func PrintHostStorageUsageMap(hsm control.HostStorageMap, out io.Writer) error { hosts := getPrintHosts(hss.HostSet.RangedString()) row := txtfmt.TableRow{hostsTitle: hosts} storage := hss.HostStorage - row[scmTitle] = humanize.Bytes(storage.ScmNamespaces.Total()) - row[scmFreeTitle] = humanize.Bytes(storage.ScmNamespaces.Free()) - row[scmUsageTitle] = storage.ScmNamespaces.PercentUsage() - row[nvmeTitle] = humanize.Bytes(storage.NvmeDevices.Total()) - row[nvmeFreeTitle] = humanize.Bytes(storage.NvmeDevices.Free()) - row[nvmeUsageTitle] = storage.NvmeDevices.PercentUsage() + + sns := storage.ScmNamespaces + row[scmTitle] = humanize.Bytes(sns.Total()) + scmFree := sns.Free() + row[scmFreeTitle] = humanize.Bytes(scmFree) + row[scmUsageTitle] = common.PercentageString(sns.Total()-scmFree, sns.Total()) + + ncs := storage.NvmeDevices + row[nvmeTitle] = humanize.Bytes(ncs.Total()) + nvmeFree := ncs.Free() + row[nvmeFreeTitle] = humanize.Bytes(nvmeFree) + row[nvmeUsageTitle] = common.PercentageString(ncs.Total()-nvmeFree, ncs.Total()) + table = append(table, row) } + tablePrint.Format(table) +} + +const ( + metaRole = storage.BdevRoleMeta + dataRole = storage.BdevRoleData + rankTitle = "Rank" +) + +// Return role combinations for each tier that contains either a meta or data role. 
+func getTierRolesForHost(nvme storage.NvmeControllers, metaRolesLast, dataRolesLast *storage.BdevRoles) error { + roles := make(map[int]*storage.BdevRoles) + for _, c := range nvme { + if c.Roles().HasMeta() { + if _, exists := roles[metaRole]; !exists { + roles[metaRole] = c.Roles() + } + } else if c.Roles().HasData() { + if _, exists := roles[dataRole]; !exists { + roles[dataRole] = c.Roles() + } + } + } + + if roles[metaRole].IsEmpty() { + return errNoMetaRole + } + + if !metaRolesLast.IsEmpty() { + // Indicates valid "last" values exist so check consistency. + if *roles[metaRole] != *metaRolesLast { + return errInconsistentRoles + } + if roles[dataRole].IsEmpty() { + if !dataRolesLast.IsEmpty() { + return errInconsistentRoles + } + } else { + if *roles[dataRole] != *dataRolesLast { + return errInconsistentRoles + } + } + } else { + *metaRolesLast = *roles[metaRole] + if !roles[dataRole].IsEmpty() { + *dataRolesLast = *roles[dataRole] + } + } + + return nil +} + +// Print which roles each tier is assigned, only print tiers with meta or data roles. +// Currently tier-list hardcoded to (META/DATA) but this can be extended. +func printTierRolesTable(hsm control.HostStorageMap, out, dbg io.Writer) ([]storage.BdevRoles, error) { + tierTitle := "Tier" + rolesTitle := "Roles" + + tablePrint := txtfmt.NewTableFormatter(tierTitle, rolesTitle) + tablePrint.InitWriter(out) + table := []txtfmt.TableRow{} + + // Currently only tiers with meta and data are of interest so select implicitly. + var metaRoles, dataRoles storage.BdevRoles + for _, key := range hsm.Keys() { + err := getTierRolesForHost(hsm[key].HostStorage.NvmeDevices, &metaRoles, &dataRoles) + if err != nil { + hSet := hsm[key].HostSet + fmt.Fprintf(dbg, "scan resp for hosts %q: %+v\n", hSet, hsm[key].HostStorage) + return nil, errors.Wrapf(err, "hosts %q", hSet) + } + } + + if metaRoles.IsEmpty() { + fmt.Fprintf(dbg, "scan resp: %+v\n", hsm) + return nil, errInsufficientScan + } + + rolesToShow := []storage.BdevRoles{metaRoles} + if !dataRoles.IsEmpty() { + // Print data role row if assigned to a separate tier from meta role. + rolesToShow = append(rolesToShow, dataRoles) + } + for i, roles := range rolesToShow { + table = append(table, txtfmt.TableRow{ + // Starting tier index of 1. 
+ tierTitle: fmt.Sprintf("T%d", i+1), + rolesTitle: roles.String(), + }) + } + + tablePrint.Format(table) + return rolesToShow, nil +} + +func getRowTierTitles(i int, showUsable bool) []string { + totalTitle := fmt.Sprintf("T%d-Total", i) + freeTitle := fmt.Sprintf("T%d-Free", i) + if showUsable { + freeTitle = fmt.Sprintf("T%d-Usable", i) + } + usageTitle := fmt.Sprintf("T%d-Usage", i) + + return []string{totalTitle, freeTitle, usageTitle} +} + +type roleDevsMap map[storage.BdevRoles]storage.NvmeControllers +type rankRoleDevsMap map[int]roleDevsMap + +func iterRankRoleDevs(nvme storage.NvmeControllers, tierRoles []storage.BdevRoles, dbg io.Writer, rankRoleDevs rankRoleDevsMap) error { + for _, nd := range nvme { + if len(nd.SmdDevices) == 0 || nd.SmdDevices[0] == nil { + fmt.Fprintf(dbg, "no smd for %s\n", nd.PciAddr) + continue + } + rank := int(nd.Rank()) + if _, exists := rankRoleDevs[rank]; !exists { + rankRoleDevs[rank] = make(roleDevsMap) + } + roles := nd.Roles() + if roles == nil { + return errors.New("unexpected nil roles") + } + for _, rolesWant := range tierRoles { + if *roles != rolesWant { + continue + } + fmt.Fprintf(dbg, "add r%d-%s roles %q tot/avail/usabl %d/%d/%d\n", rank, + nd.PciAddr, roles, nd.Total(), nd.Free(), nd.Usable()) + rankRoleDevs[rank][rolesWant] = append( + rankRoleDevs[rank][rolesWant], nd) + break + } + } + + return nil +} + +func getRankRolesRow(rank int, tierRoles []storage.BdevRoles, roleDevs roleDevsMap, showUsable bool) txtfmt.TableRow { + row := txtfmt.TableRow{rankTitle: fmt.Sprintf("%d", rank)} + for i, roles := range tierRoles { + titles := getRowTierTitles(i+1, showUsable) + totalTitle, freeTitle, usageTitle := titles[0], titles[1], titles[2] + devs, exists := roleDevs[roles] + if !exists { + row[totalTitle] = "-" + row[freeTitle] = "-" + row[usageTitle] = "-" + continue + } + row[totalTitle] = humanize.Bytes(devs.Total()) + free := devs.Free() + // Handle special case where SSDs with META but without DATA should show usable + // space as bytes available in regards to META space. Usable bytes is only + // calculated for SSDs with DATA role. + if showUsable && !(roles.HasMeta() && !roles.HasData()) { + free = devs.Usable() + } + row[freeTitle] = humanize.Bytes(free) + row[usageTitle] = common.PercentageString(devs.Total()-free, devs.Total()) + } + + return row +} + +// Print usage table with row for each rank and column for each tier. +func printTierUsageTable(hsm control.HostStorageMap, tierRoles []storage.BdevRoles, out, dbg io.Writer, showUsable bool) error { + if len(tierRoles) == 0 { + return errors.New("no table role data to show") + } + titles := []string{rankTitle} + for i := range tierRoles { + titles = append(titles, getRowTierTitles(i+1, showUsable)...) + } + + tablePrint := txtfmt.NewTableFormatter(titles...) + tablePrint.InitWriter(out) + table := []txtfmt.TableRow{} + + // Build controllers-to-roles-to-rank map. 
+ rankRoleDevs := make(rankRoleDevsMap) + for _, key := range hsm.Keys() { + err := iterRankRoleDevs(hsm[key].HostStorage.NvmeDevices, tierRoles, dbg, + rankRoleDevs) + if err != nil { + return errors.Wrapf(err, "host %q", hsm[key].HostSet) + } + } + + var ranks []int + for rank := range rankRoleDevs { + ranks = append(ranks, rank) + } + sort.Ints(ranks) + + for _, rank := range ranks { + table = append(table, + getRankRolesRow(rank, tierRoles, rankRoleDevs[rank], showUsable)) + } + tablePrint.Format(table) return nil } +// PrintHostStorageUsageMapMdOnSsd generates a human-readable representation of the supplied +// HostStorageMap struct and writes utilization info to the supplied io.Writer in a format +// relevant to MD-on-SSD mode. +func PrintHostStorageUsageMapMdOnSsd(hsm control.HostStorageMap, out, dbg io.Writer, showUsable bool) error { + if len(hsm) == 0 { + return nil + } + + tierRoles, err := printTierRolesTable(hsm, out, dbg) + if err != nil { + return err + } + fmt.Fprintf(out, "\n") + + return printTierUsageTable(hsm, tierRoles, out, dbg, showUsable) +} + // NVMe controller namespace ID (NSID) should only be displayed if >= 1. Zero value should be // ignored in display output. func printSmdDevice(dev *storage.SmdDevice, iw io.Writer, opts ...PrintConfigOption) error { diff --git a/src/control/cmd/dmg/pretty/storage_query_test.go b/src/control/cmd/dmg/pretty/storage_query_test.go index 1dfb41fbebb..cd20dccdae8 100644 --- a/src/control/cmd/dmg/pretty/storage_query_test.go +++ b/src/control/cmd/dmg/pretty/storage_query_test.go @@ -7,13 +7,13 @@ package pretty import ( - "errors" "fmt" "strings" "testing" "time" "github.com/google/go-cmp/cmp" + "github.com/pkg/errors" "github.com/daos-stack/daos/src/control/common/test" "github.com/daos-stack/daos/src/control/lib/control" @@ -134,9 +134,7 @@ host1 3.0 TB 750 GB 75 % 36 TB 27 TB 25 % if err := PrintResponseErrors(resp, &bld); err != nil { t.Fatal(err) } - if err := PrintHostStorageUsageMap(resp.HostStorage, &bld); err != nil { - t.Fatal(err) - } + PrintHostStorageUsageMap(resp.HostStorage, &bld) if diff := cmp.Diff(strings.TrimLeft(tc.expPrintStr, "\n"), bld.String()); diff != "" { t.Fatalf("unexpected format string (-want, +got):\n%s\n", diff) @@ -145,6 +143,296 @@ host1 3.0 TB 750 GB 75 % 36 TB 27 TB 25 % } } +func TestPretty_getTierRolesForHost(t *testing.T) { + for name, tc := range map[string]struct { + nvme storage.NvmeControllers + metaRolesLast storage.BdevRoles + dataRolesLast storage.BdevRoles + expErr error + expMetaRoles string + expDataRoles string + }{ + "no roles": { + expErr: errNoMetaRole, + }, + "no smd on controller": { + nvme: storage.NvmeControllers{storage.MockNvmeController(1)}, + expErr: errNoMetaRole, + }, + "shared roles": { + nvme: storage.NvmeControllers{ + func() *storage.NvmeController { + c := storage.MockNvmeController(1) + c.SmdDevices = []*storage.SmdDevice{storage.MockSmdDevice(nil, 1)} + return c + }(), + }, + expMetaRoles: "data,meta,wal", + }, + "separate meta,data roles": { + nvme: storage.NvmeControllers{ + func() *storage.NvmeController { + c := storage.MockNvmeController(1) + sd := storage.MockSmdDevice(nil, 1) + sd.Roles = storage.BdevRolesFromBits( + storage.BdevRoleMeta | storage.BdevRoleWAL) + c.SmdDevices = []*storage.SmdDevice{sd} + return c + }(), + func() *storage.NvmeController { + c := storage.MockNvmeController(2) + sd := storage.MockSmdDevice(nil, 2) + sd.Roles = storage.BdevRolesFromBits(storage.BdevRoleData) + c.SmdDevices = []*storage.SmdDevice{sd} + return c + }(), + }, + 
expMetaRoles: "meta,wal", + expDataRoles: "data", + }, + "separate wal,meta,data roles": { + nvme: storage.NvmeControllers{ + func() *storage.NvmeController { + c := storage.MockNvmeController(1) + sd := storage.MockSmdDevice(nil, 1) + sd.Roles = storage.BdevRolesFromBits(storage.BdevRoleWAL) + c.SmdDevices = []*storage.SmdDevice{sd} + return c + }(), + func() *storage.NvmeController { + c := storage.MockNvmeController(2) + sd := storage.MockSmdDevice(nil, 2) + sd.Roles = storage.BdevRolesFromBits(storage.BdevRoleMeta) + c.SmdDevices = []*storage.SmdDevice{sd} + return c + }(), + func() *storage.NvmeController { + c := storage.MockNvmeController(3) + sd := storage.MockSmdDevice(nil, 3) + sd.Roles = storage.BdevRolesFromBits(storage.BdevRoleData) + c.SmdDevices = []*storage.SmdDevice{sd} + return c + }(), + }, + expMetaRoles: "meta", + expDataRoles: "data", + }, + "different meta roles from last seen": { + nvme: storage.NvmeControllers{ + func() *storage.NvmeController { + c := storage.MockNvmeController(1) + c.SmdDevices = []*storage.SmdDevice{storage.MockSmdDevice(nil, 1)} + return c + }(), + }, + metaRolesLast: storage.BdevRolesFromBits( + storage.BdevRoleMeta | storage.BdevRoleWAL), + expErr: errInconsistentRoles, + }, + "different data roles from last seen": { + nvme: storage.NvmeControllers{ + func() *storage.NvmeController { + c := storage.MockNvmeController(1) + c.SmdDevices = []*storage.SmdDevice{storage.MockSmdDevice(nil, 1)} + return c + }(), + }, + metaRolesLast: storage.BdevRolesFromBits(storage.BdevRoleAll), + dataRolesLast: storage.BdevRolesFromBits(storage.BdevRoleData), + expErr: errInconsistentRoles, + }, + } { + t.Run(name, func(t *testing.T) { + gotErr := getTierRolesForHost(tc.nvme, &tc.metaRolesLast, &tc.dataRolesLast) + test.CmpErr(t, tc.expErr, gotErr) + if tc.expErr != nil { + return + } + + if tc.expMetaRoles == "" { + tc.expMetaRoles = "NA" + } + if tc.expDataRoles == "" { + tc.expDataRoles = "NA" + } + + if diff := cmp.Diff(tc.expMetaRoles, tc.metaRolesLast.String()); diff != "" { + t.Fatalf("unexpected output meta roles (-want, +got):\n%s\n", diff) + } + if diff := cmp.Diff(tc.expDataRoles, tc.dataRolesLast.String()); diff != "" { + t.Fatalf("unexpected output data roles (-want, +got):\n%s\n", diff) + } + }) + } +} + +func TestPretty_PrintHostStorageUsageMapMdOnSsd(t *testing.T) { + var ( + noStorage = control.MockServerScanResp(t, "noStorage") + withSpaceUsage = control.MockServerScanResp(t, "withSpaceUsage") + withSpaceUsageRolesSeparate1 = control.MockServerScanResp(t, + "withSpaceUsageRolesSeparate1") + withSpaceUsageRolesSeparate2 = control.MockServerScanResp(t, + "withSpaceUsageRolesSeparate2") + bothFailed = control.MockServerScanResp(t, "bothFailed") + ) + + for name, tc := range map[string]struct { + mic *control.MockInvokerConfig + showUsable bool + expErr error + expPrintStr string + }{ + "failed scans": { + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ + { + Addr: "host1", + Message: bothFailed, + }, + }, + }, + }, + expErr: errors.Errorf("hosts \"host1\": %s", errNoMetaRole.Error()), + }, + "no storage": { + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ + { + Addr: "host1", + Message: noStorage, + }, + }, + }, + }, + expErr: errNoMetaRole, + }, + "single host with space usage; shared roles so single tier": { + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ 
+ { + Addr: "host1", + Message: withSpaceUsage, + }, + }, + }, + }, + expPrintStr: ` +Tier Roles +---- ----- +T1 data,meta,wal + +Rank T1-Total T1-Free T1-Usage +---- -------- ------- -------- +0 36 TB 27 TB 25 % +`, + }, + "multiple hosts with space usage; inconsistent roles between hosts": { + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ + { + Addr: "host1", + Message: withSpaceUsage, + }, + { + Addr: "host2", + Message: withSpaceUsageRolesSeparate1, + }, + }, + }, + }, + expErr: errInconsistentRoles, + }, + "multiple hosts with space available; separate roles; two-tiers per rank": { + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ + { + Addr: "host1", + Message: withSpaceUsageRolesSeparate1, + }, + { + Addr: "host2", + Message: withSpaceUsageRolesSeparate2, + }, + }, + }, + }, + expPrintStr: ` +Tier Roles +---- ----- +T1 meta,wal +T2 data + +Rank T1-Total T1-Free T1-Usage T2-Total T2-Free T2-Usage +---- -------- ------- -------- -------- ------- -------- +0 2.0 TB 1.0 TB 50 % 2.0 TB 1.5 TB 25 % +1 2.0 TB 500 GB 75 % 2.0 TB 1.0 TB 50 % +2 1.0 TB 500 GB 50 % 1.0 TB 750 GB 25 % +3 1.0 TB 250 GB 75 % 1.0 TB 500 GB 50 % +`, + }, + // META tier separate so print available rather than usable (which would be zero). + "multiple hosts with space usable; separate roles; two-tiers per rank": { + showUsable: true, + mic: &control.MockInvokerConfig{ + UnaryResponse: &control.UnaryResponse{ + Responses: []*control.HostResponse{ + { + Addr: "host1", + Message: withSpaceUsageRolesSeparate1, + }, + { + Addr: "host2", + Message: withSpaceUsageRolesSeparate2, + }, + }, + }, + }, + expPrintStr: ` +Tier Roles +---- ----- +T1 meta,wal +T2 data + +Rank T1-Total T1-Usable T1-Usage T2-Total T2-Usable T2-Usage +---- -------- --------- -------- -------- --------- -------- +0 2.0 TB 1.0 TB 50 % 2.0 TB 1.0 TB 50 % +1 2.0 TB 500 GB 75 % 2.0 TB 500 GB 75 % +2 1.0 TB 500 GB 50 % 1.0 TB 500 GB 50 % +3 1.0 TB 250 GB 75 % 1.0 TB 250 GB 75 % +`, + }, + } { + t.Run(name, func(t *testing.T) { + log, buf := logging.NewTestLogger(t.Name()) + defer test.ShowBufferOnFailure(t, buf) + + ctx := test.Context(t) + mi := control.NewMockInvoker(log, tc.mic) + + resp, err := control.StorageScan(ctx, mi, &control.StorageScanReq{}) + if err != nil { + t.Fatal(err) + } + + var out, dbg strings.Builder + gotErr := PrintHostStorageUsageMapMdOnSsd(resp.HostStorage, &out, &dbg, tc.showUsable) + test.CmpErr(t, tc.expErr, gotErr) + + t.Logf("DBG: %s", dbg.String()) + + if diff := cmp.Diff(strings.TrimLeft(tc.expPrintStr, "\n"), out.String()); diff != "" { + t.Fatalf("unexpected output string (-want, +got):\n%s\n", diff) + } + }) + } +} + func TestPretty_PrintSmdInfoMap(t *testing.T) { mockController := storage.MockNvmeController(1) newCtrlr := storage.NvmeController{ diff --git a/src/control/cmd/dmg/storage_query.go b/src/control/cmd/dmg/storage_query.go index 8604bb81100..358188c3e83 100644 --- a/src/control/cmd/dmg/storage_query.go +++ b/src/control/cmd/dmg/storage_query.go @@ -123,14 +123,28 @@ type usageQueryCmd struct { ctlInvokerCmd hostListCmd cmdutil.JSONOutputCmd + ShowUsable bool `short:"u" long:"show-usable" description:"Set to display potential data capacity of future pools by factoring in a new pool's metadata overhead. 
This can include the use of MD-on-SSD mem-ratio if specified to calculate meta-blob size when adjusting NVMe free capacity"` + MemRatio tierRatioFlag `long:"mem-ratio" description:"Set the percentage of the pool metadata storage size (on SSD) that should be used as the memory file size (on ram-disk). Used to calculate data size for new MD-on-SSD phase-2 pools. Only valid with --show-usable flag"` } // Execute is run when usageQueryCmd activates. // -// Queries NVMe and SCM usage on hosts. +// Queries storage usage on hosts. func (cmd *usageQueryCmd) Execute(_ []string) error { ctx := cmd.MustLogCtx() - req := &control.StorageScanReq{Usage: true} + req := &control.StorageScanReq{ + Usage: true, + } + if cmd.MemRatio.IsSet() { + if !cmd.ShowUsable { + return errors.New("--mem-ratio is only supported with --show-usable flag") + } + f, err := ratiosToSingleFraction(cmd.MemRatio.Ratios()) + if err != nil { + return errors.Wrap(err, "md-on-ssd mode query usage unexpected mem-ratio") + } + req.MemRatio = f + } req.SetHostList(cmd.getHostList()) resp, err := control.StorageScan(ctx, cmd.ctlInvoker, req) @@ -142,16 +156,33 @@ func (cmd *usageQueryCmd) Execute(_ []string) error { return err } - var bld strings.Builder - if err := pretty.PrintResponseErrors(resp, &bld); err != nil { + var outErr strings.Builder + if err := pretty.PrintResponseErrors(resp, &outErr); err != nil { return err } - if err := pretty.PrintHostStorageUsageMap(resp.HostStorage, &bld); err != nil { - return err + if outErr.Len() > 0 { + cmd.Error(outErr.String()) + } + + var out, dbg strings.Builder + if resp.HostStorage.IsMdOnSsdEnabled() { + if err := pretty.PrintHostStorageUsageMapMdOnSsd(resp.HostStorage, &out, &dbg, cmd.ShowUsable); err != nil { + cmd.Error(err.Error()) + } + } else { + if cmd.ShowUsable { + cmd.Notice("--show-usable flag ignored when MD-on-SSD is not enabled") + } + pretty.PrintHostStorageUsageMap(resp.HostStorage, &out) + } + if dbg.Len() > 0 { + cmd.Debugf("%s", dbg.String()) + } + if out.Len() > 0 { + // Infof prints raw string and doesn't try to expand "%" + // preserving column formatting in txtfmt table + cmd.Infof("%s", out.String()) } - // Infof prints raw string and doesn't try to expand "%" - // preserving column formatting in txtfmt table - cmd.Infof("%s", bld.String()) return resp.Errors() } diff --git a/src/control/cmd/dmg/storage_query_test.go b/src/control/cmd/dmg/storage_query_test.go index 190ddaee217..f8d8af31409 100644 --- a/src/control/cmd/dmg/storage_query_test.go +++ b/src/control/cmd/dmg/storage_query_test.go @@ -116,11 +116,47 @@ func TestStorageQueryCommands(t *testing.T) { nil, }, { - "per-server storage space utilization query", + "per-server storage space query", "storage query usage", printRequest(t, &control.StorageScanReq{Usage: true}), nil, }, + { + "per-server storage space query (with custom mem-ratio but no show-usable)", + "storage query usage --mem-ratio 25.5", + "", + errors.New("only supported with --show-usable"), + }, + { + "per-server storage space query (with custom mem-ratio)", + "storage query usage --show-usable --mem-ratio 25.5", + printRequest(t, &control.StorageScanReq{Usage: true, MemRatio: 0.255}), + nil, + }, + { + "per-server storage space query (with two-tier mem-ratio)", + "storage query usage -u --mem-ratio 20,80", + printRequest(t, &control.StorageScanReq{Usage: true, MemRatio: 0.2}), + nil, + }, + { + "per-server storage space query (with 100% mem-ratio)", + "storage query usage -u --mem-ratio 100%", + printRequest(t, &control.StorageScanReq{Usage: 
true, MemRatio: 1}), + nil, + }, + { + "per-server storage space query (with three-tier mem-ratio)", + "storage query usage -u --mem-ratio 10,20,70", + "", + errors.New("want 2 ratio values got 3"), + }, + { + "per-server storage space query (with --show-usable flag)", + "storage query usage --show-usable", + printRequest(t, &control.StorageScanReq{Usage: true}), + nil, + }, { "Set FAULTY device status (missing host)", "storage set nvme-faulty --uuid 842c739b-86b5-462f-a7ba-b4a91b674f3d -f", diff --git a/src/control/lib/control/auto_test.go b/src/control/lib/control/auto_test.go index 2e45d641daf..724bed72180 100644 --- a/src/control/lib/control/auto_test.go +++ b/src/control/lib/control/auto_test.go @@ -523,7 +523,7 @@ func TestControl_AutoConfig_getStorageSet(t *testing.T) { HostSet: hostlist.MustCreateSet("host[1-2]"), HostStorage: &HostStorage{ NvmeDevices: storage.NvmeControllers{ - mockNvmeCtrlrWithSmd(storage.OptionBits(0)), + mockNvmeCtrlrWithSmd(0), }, ScmModules: storage.ScmModules{storage.MockScmModule()}, ScmNamespaces: storage.ScmNamespaces{ diff --git a/src/control/lib/control/mocks.go b/src/control/lib/control/mocks.go index 63b55837a1c..7da646fb904 100644 --- a/src/control/lib/control/mocks.go +++ b/src/control/lib/control/mocks.go @@ -317,11 +317,11 @@ func MockMemInfo() *common.MemInfo { } } -func mockNvmeCtrlrWithSmd(bdevRoles storage.OptionBits, varIdx ...int32) *storage.NvmeController { +func mockNvmeCtrlrWithSmd(roleBits int, varIdx ...int32) *storage.NvmeController { idx := test.GetIndex(varIdx...) nc := storage.MockNvmeController(idx) sd := storage.MockSmdDevice(nil, idx) - sd.Roles = storage.BdevRoles{bdevRoles} + sd.Roles = storage.BdevRolesFromBits(roleBits) nc.SmdDevices = []*storage.SmdDevice{sd} return nc } @@ -334,7 +334,7 @@ func standardServerScanResponse(t *testing.T) *ctlpb.StorageScanResp { } nvmeControllers := storage.NvmeControllers{ - mockNvmeCtrlrWithSmd(storage.OptionBits(0)), + mockNvmeCtrlrWithSmd(0), } if err := convert.Types(nvmeControllers, &pbSsr.Nvme.Ctrlrs); err != nil { t.Fatal(err) @@ -368,12 +368,50 @@ func MockServerScanResp(t *testing.T, variant string) *ctlpb.StorageScanResp { } return ncs } + ctrlrWithUsage := func(i, rank, roleBits int, tot, avail, usabl uint64) *storage.NvmeController { + nc := storage.MockNvmeController(int32(i)) + nc.SocketID = int32(i % 2) + sd := storage.MockSmdDevice(nil, int32(i)) + sd.TotalBytes = tot + sd.AvailBytes = avail + sd.UsableBytes = usabl + sd.Rank = ranklist.Rank(rank) + sd.Roles = storage.BdevRolesFromBits(roleBits) + nc.SmdDevices = append(nc.SmdDevices, sd) + return nc + } + ctrlrsWithUsageSepRoles := func(firstRank, secondRank int, baseBytes uint64) storage.NvmeControllers { + ncs := make(storage.NvmeControllers, 0) + for _, i := range []int{1, 2} { + ncs = append(ncs, ctrlrWithUsage(i, firstRank, + storage.BdevRoleWAL|storage.BdevRoleMeta, baseBytes, + baseBytes/2, baseBytes/4)) + } + for _, i := range []int{3, 4} { + ncs = append(ncs, ctrlrWithUsage(i, firstRank, storage.BdevRoleData, + baseBytes, (baseBytes/4)*3, // 75% available + (baseBytes/4)*2)) // 50% usable + } + for _, i := range []int{5, 6} { + ncs = append(ncs, ctrlrWithUsage(i, secondRank, + storage.BdevRoleWAL|storage.BdevRoleMeta, baseBytes, + baseBytes/4, baseBytes/8)) + } + for _, i := range []int{7, 8} { + ncs = append(ncs, ctrlrWithUsage(i, secondRank, storage.BdevRoleData, + baseBytes, (baseBytes/4)*2, // 50% available + baseBytes/4)) // 25% usable + } + return ncs + } switch variant { case "withSpaceUsage": snss := 
make(storage.ScmNamespaces, 0) for _, i := range []int{0, 1} { sm := storage.MockScmMountPoint(int32(i)) + sm.AvailBytes = uint64((humanize.TByte/4)*3) * uint64(i) // 75% available + sm.UsableBytes = uint64((humanize.TByte/4)*2) * uint64(i) // 50% usable sns := storage.MockScmNamespace(int32(i)) sns.Mount = sm snss = append(snss, sns) @@ -383,15 +421,21 @@ func MockServerScanResp(t *testing.T, variant string) *ctlpb.StorageScanResp { } ncs := make(storage.NvmeControllers, 0) for _, i := range []int{1, 2, 3, 4, 5, 6, 7, 8} { - nc := storage.MockNvmeController(int32(i)) - nc.SocketID = int32(i % 2) - sd := storage.MockSmdDevice(nc, int32(i)) - sd.TotalBytes = uint64(humanize.TByte) * uint64(i) - sd.AvailBytes = uint64((humanize.TByte/4)*3) * uint64(i) // 25% used - sd.UsableBytes = uint64((humanize.TByte/4)*3) * uint64(i) // 25% used - nc.SmdDevices = append(nc.SmdDevices, sd) - ncs = append(ncs, nc) + ncs = append(ncs, ctrlrWithUsage(i, 0, storage.BdevRoleAll, + uint64(humanize.TByte)*uint64(i), + uint64((humanize.TByte/4)*3)*uint64(i), // 75% available + uint64((humanize.TByte/4)*2)*uint64(i))) // 50% usable + } + if err := convert.Types(ncs, &ssr.Nvme.Ctrlrs); err != nil { + t.Fatal(err) + } + case "withSpaceUsageRolesSeparate1": + ncs := ctrlrsWithUsageSepRoles(0, 1, humanize.TByte) + if err := convert.Types(ncs, &ssr.Nvme.Ctrlrs); err != nil { + t.Fatal(err) } + case "withSpaceUsageRolesSeparate2": + ncs := ctrlrsWithUsageSepRoles(2, 3, 0.5*humanize.TByte) if err := convert.Types(ncs, &ssr.Nvme.Ctrlrs); err != nil { t.Fatal(err) } @@ -594,9 +638,7 @@ func MockFormatResp(t *testing.T, mfc MockFormatConf) *StorageFormatResp { PciAddr: fmt.Sprintf("%d", j+1), SmdDevices: []*storage.SmdDevice{ { - Roles: storage.BdevRoles{ - storage.OptionBits(mfc.NvmeRoleBits), - }, + Roles: storage.BdevRolesFromBits(mfc.NvmeRoleBits), }, }, }) diff --git a/src/control/lib/control/pool.go b/src/control/lib/control/pool.go index 65b042ad406..62578d0676f 100644 --- a/src/control/lib/control/pool.go +++ b/src/control/lib/control/pool.go @@ -1129,22 +1129,6 @@ func processNVMeSpaceStats(log debugLogger, filterRank filterRankFn, nvmeControl // Return the maximal SCM and NVMe size of a pool which could be created with all the storage nodes. func getMaxPoolSize(ctx context.Context, rpcClient UnaryInvoker, createReq *PoolCreateReq) (uint64, uint64, error) { - isMdOnSsdEnabled := func(log debugLogger, hsm HostStorageMap) bool { - for _, hss := range hsm { - hs := hss.HostStorage - if hs == nil { - continue - } - nvme := hs.NvmeDevices - if nvme.Len() > 0 && !nvme[0].Roles().IsEmpty() { - log.Debugf("fetch max pool size in md-on-size mode") - return true - } - } - - return false - } - if createReq.MemRatio < 0 { return 0, 0, errors.New("invalid mem-ratio, should be greater than zero") } @@ -1209,12 +1193,13 @@ func getMaxPoolSize(ctx context.Context, rpcClient UnaryInvoker, createReq *Pool } } - if !isMdOnSsdEnabled(rpcClient, scanResp.HostStorage) { + if !scanResp.HostStorage.IsMdOnSsdEnabled() { rpcClient.Debugf("Maximal size of a pool: scmBytes=%s (%d B) nvmeBytes=%s (%d B)", humanize.Bytes(scmBytes), scmBytes, humanize.Bytes(nvmeBytes), nvmeBytes) return scmBytes, nvmeBytes, nil } + rpcClient.Debugf("md-on-ssd mode detected") // In MD-on-SSD mode calculate metaBytes based on the minimum ramdisk (called scm here) // availability across ranks. 
NVMe sizes returned in StorageScan response at the beginning diff --git a/src/control/lib/control/server_meta_test.go b/src/control/lib/control/server_meta_test.go index affc59bd210..89fdbb34e9d 100644 --- a/src/control/lib/control/server_meta_test.go +++ b/src/control/lib/control/server_meta_test.go @@ -212,9 +212,7 @@ func TestControl_SmdQuery(t *testing.T) { NvmeState: storage.NvmeStateNormal, LedState: storage.LedStateNormal, }, - Roles: storage.BdevRoles{ - storage.OptionBits(storage.BdevRoleAll), - }, + Roles: storage.BdevRolesFromBits(storage.BdevRoleAll), HasSysXS: true, }, { @@ -226,9 +224,7 @@ func TestControl_SmdQuery(t *testing.T) { NvmeState: storage.NvmeStateFaulty, LedState: storage.LedStateFaulty, }, - Roles: storage.BdevRoles{ - storage.OptionBits(storage.BdevRoleData), - }, + Roles: storage.BdevRolesFromBits(storage.BdevRoleData), }, }, Pools: make(map[string][]*SmdPool), @@ -771,10 +767,8 @@ func TestControl_SmdManage(t *testing.T) { Rank: ranklist.Rank(0), TargetIDs: []int32{1, 2, 3}, Ctrlr: defMockCtrlr, - Roles: storage.BdevRoles{ - storage.OptionBits(storage.BdevRoleAll), - }, - HasSysXS: true, + Roles: storage.BdevRolesFromBits(storage.BdevRoleAll), + HasSysXS: true, }, }, }, diff --git a/src/control/lib/control/storage.go b/src/control/lib/control/storage.go index fb649a06ce1..d84f837c791 100644 --- a/src/control/lib/control/storage.go +++ b/src/control/lib/control/storage.go @@ -153,14 +153,31 @@ func (hsm HostStorageMap) HostCount() (nrHosts int) { return nrHosts } +// IsMdOnSsdEnabled returns true when bdev MD-on-SSD roles have been set on a NVMe SSD within a +// HostStorageMap's host storage set's NvmeDevices. Assumes that no roles exist if mode is PMem. +func (hsm HostStorageMap) IsMdOnSsdEnabled() bool { + for _, hss := range hsm { + hs := hss.HostStorage + if hs == nil { + continue + } + if hs.NvmeDevices.HaveMdOnSsdRoles() { + return true + } + break + } + + return false +} + type ( // StorageScanReq contains the parameters for a storage scan request. StorageScanReq struct { unaryRequest - Usage bool - NvmeHealth bool - NvmeBasic bool - MemRatio float32 + Usage bool `json:"usage"` + NvmeHealth bool `json:"nvme_health"` + NvmeBasic bool `json:"nvme_basic"` + MemRatio float32 `json:"mem_ratio"` } // StorageScanResp contains the response from a storage scan request. @@ -292,7 +309,7 @@ type ( // StorageFormatReq contains the parameters for a storage format request. StorageFormatReq struct { unaryRequest - Reformat bool + Reformat bool `json:"reformat"` } // StorageFormatResp contains the response from a storage format request. @@ -328,11 +345,7 @@ func (sfr *StorageFormatResp) addHostResponse(hr *HostResponse) (err error) { Info: info, PciAddr: nr.GetPciAddr(), SmdDevices: []*storage.SmdDevice{ - { - Roles: storage.BdevRoles{ - storage.OptionBits(nr.RoleBits), - }, - }, + {Roles: storage.BdevRolesFromBits(int(nr.RoleBits))}, }, }) default: diff --git a/src/control/lib/control/storage_test.go b/src/control/lib/control/storage_test.go index f5d1f607a95..da2da871158 100644 --- a/src/control/lib/control/storage_test.go +++ b/src/control/lib/control/storage_test.go @@ -1,5 +1,5 @@ // -// (C) Copyright 2020-2022 Intel Corporation. +// (C) Copyright 2020-2024 Intel Corporation. 
// // SPDX-License-Identifier: BSD-2-Clause-Patent // @@ -17,6 +17,7 @@ import ( ctlpb "github.com/daos-stack/daos/src/control/common/proto/ctl" mgmtpb "github.com/daos-stack/daos/src/control/common/proto/mgmt" "github.com/daos-stack/daos/src/control/common/test" + "github.com/daos-stack/daos/src/control/lib/ranklist" "github.com/daos-stack/daos/src/control/logging" "github.com/daos-stack/daos/src/control/server/storage" "github.com/daos-stack/daos/src/control/system" @@ -120,6 +121,35 @@ func TestControl_StorageMap(t *testing.T) { }, expHsmLen: 2, }, + "mismatch smd rank": { + hss: []*HostStorage{ + { + NvmeDevices: storage.NvmeControllers{ + &storage.NvmeController{ + SmdDevices: []*storage.SmdDevice{ + {Rank: ranklist.Rank(1)}, + }, + }, + }, + ScmNamespaces: storage.ScmNamespaces{ + storage.MockScmNamespace(0), + }, + }, + { + NvmeDevices: storage.NvmeControllers{ + &storage.NvmeController{ + SmdDevices: []*storage.SmdDevice{ + {Rank: ranklist.Rank(2)}, + }, + }, + }, + ScmNamespaces: storage.ScmNamespaces{ + storage.MockScmNamespace(0), + }, + }, + }, + expHsmLen: 2, + }, } { t.Run(name, func(t *testing.T) { hsm := make(HostStorageMap) diff --git a/src/control/server/ctl_storage_rpc.go b/src/control/server/ctl_storage_rpc.go index 90a46495ae0..8076487f272 100644 --- a/src/control/server/ctl_storage_rpc.go +++ b/src/control/server/ctl_storage_rpc.go @@ -398,7 +398,7 @@ func (cs *ControlService) getRdbSize(engineCfg *engine.Config) (uint64, error) { mdCapStr, err := engineCfg.GetEnvVar(daos.DaosMdCapEnv) if err != nil { cs.log.Debugf("using default RDB file size with engine %d: %s (%d Bytes)", - engineCfg.Index, humanize.Bytes(daos.DefaultDaosMdCapSize), + engineCfg.Index, humanize.IBytes(daos.DefaultDaosMdCapSize), daos.DefaultDaosMdCapSize) return uint64(daos.DefaultDaosMdCapSize), nil } @@ -410,7 +410,7 @@ func (cs *ControlService) getRdbSize(engineCfg *engine.Config) (uint64, error) { } rdbSize = rdbSize << 20 cs.log.Debugf("using custom RDB size with engine %d: %s (%d Bytes)", - engineCfg.Index, humanize.Bytes(rdbSize), rdbSize) + engineCfg.Index, humanize.IBytes(rdbSize), rdbSize) return rdbSize, nil } @@ -591,8 +591,6 @@ func (cs *ControlService) adjustNvmeSize(resp *ctlpb.ScanNvmeResp) { if dev.GetRoleBits() != 0 && (dev.GetRoleBits()&storage.BdevRoleData) == 0 { cs.log.Debugf("SMD device %s (rank %d, ctlr %s) not used to store data (Role bits 0x%X)", dev.GetUuid(), rank, ctlr.GetPciAddr(), dev.GetRoleBits()) - dev.TotalBytes = 0 - dev.AvailBytes = 0 dev.UsableBytes = 0 continue } @@ -614,15 +612,15 @@ func (cs *ControlService) adjustNvmeSize(resp *ctlpb.ScanNvmeResp) { } cs.log.Tracef("Initial available size of SMD device %s (rank %d, ctlr %s): %s (%d bytes)", - dev.GetUuid(), rank, ctlr.GetPciAddr(), humanize.Bytes(dev.GetAvailBytes()), dev.GetAvailBytes()) + dev.GetUuid(), rank, ctlr.GetPciAddr(), humanize.IBytes(dev.GetAvailBytes()), dev.GetAvailBytes()) clusterSize := uint64(dev.GetClusterSize()) availBytes := (dev.GetAvailBytes() / clusterSize) * clusterSize if dev.GetAvailBytes() != availBytes { cs.log.Tracef("Rounding available size of SMD device %s based on cluster size (rank %d, ctlr %s): from %s (%d Bytes) to %s (%d bytes)", dev.GetUuid(), rank, ctlr.GetPciAddr(), - humanize.Bytes(dev.GetAvailBytes()), dev.GetAvailBytes(), - humanize.Bytes(availBytes), availBytes) + humanize.IBytes(dev.GetAvailBytes()), dev.GetAvailBytes(), + humanize.IBytes(availBytes), availBytes) dev.AvailBytes = availBytes } @@ -661,7 +659,7 @@ func (cs *ControlService) adjustNvmeSize(resp 
*ctlpb.ScanNvmeResp) { smdDev.UsableBytes = clusters * smdDev.GetClusterSize() cs.log.Debugf("Defining usable size of the SMD device %s (rank %d, ctlr %s) as %s (%d bytes)", smdDev.GetUuid(), rank, dev.ctlr.GetPciAddr(), - humanize.Bytes(smdDev.GetUsableBytes()), smdDev.GetUsableBytes()) + humanize.IBytes(smdDev.GetUsableBytes()), smdDev.GetUsableBytes()) } } } @@ -673,7 +671,7 @@ func (cs *ControlService) adjustScmSize(resp *ctlpb.ScanScmResp) { mountPath := mnt.GetPath() mnt.UsableBytes = mnt.GetAvailBytes() cs.log.Debugf("Initial usable size of SCM %s: %s (%d bytes)", mountPath, - humanize.Bytes(mnt.GetUsableBytes()), mnt.GetUsableBytes()) + humanize.IBytes(mnt.GetUsableBytes()), mnt.GetUsableBytes()) engineCfg, err := cs.getEngineCfgFromScmNsp(scmNamespace) if err != nil { @@ -691,7 +689,7 @@ func (cs *ControlService) adjustScmSize(resp *ctlpb.ScanScmResp) { continue } cs.log.Tracef("Removing RDB (%s, %d bytes) from the usable size of the SCM device %q", - humanize.Bytes(mdBytes), mdBytes, mountPath) + humanize.IBytes(mdBytes), mdBytes, mountPath) if mdBytes >= mnt.GetUsableBytes() { cs.log.Debugf("No more usable space in SCM device %s", mountPath) mnt.UsableBytes = 0 @@ -703,7 +701,7 @@ func (cs *ControlService) adjustScmSize(resp *ctlpb.ScanScmResp) { mountPath := m.GetPath() cs.log.Tracef("Removing control plane metadata (%s, %d bytes) from the usable size of the SCM device %q", - humanize.Bytes(mdDaosScmBytes), mdDaosScmBytes, mountPath) + humanize.IBytes(mdDaosScmBytes), mdDaosScmBytes, mountPath) if mdDaosScmBytes >= m.GetUsableBytes() { cs.log.Debugf("No more usable space in SCM device %s", mountPath) m.UsableBytes = 0 @@ -731,12 +729,12 @@ func (cs *ControlService) adjustScmSize(resp *ctlpb.ScanScmResp) { } cs.log.Tracef("Removing (%s, %d bytes) of usable size from the SCM device %q: space used by the file system metadata", - humanize.Bytes(mdFsScmBytes), mdFsScmBytes, mountPath) + humanize.IBytes(mdFsScmBytes), mdFsScmBytes, mountPath) mnt.UsableBytes -= mdFsScmBytes usableBytes := scmNamespace.Mount.GetUsableBytes() cs.log.Debugf("Usable size of SCM device %q: %s (%d bytes)", - scmNamespace.Mount.GetPath(), humanize.Bytes(usableBytes), usableBytes) + scmNamespace.Mount.GetPath(), humanize.IBytes(usableBytes), usableBytes) } } diff --git a/src/control/server/ctl_storage_rpc_test.go b/src/control/server/ctl_storage_rpc_test.go index c1b795b8551..f6fe2bd7eb1 100644 --- a/src/control/server/ctl_storage_rpc_test.go +++ b/src/control/server/ctl_storage_rpc_test.go @@ -652,7 +652,6 @@ func TestServer_bdevScan(t *testing.T) { func() *ctlpb.NvmeController { nc := proto.MockNvmeController(1) sd := mockSmd(storage.BdevRoleWAL | storage.BdevRoleMeta) - sd.AvailBytes = 0 nc.SmdDevices = []*ctlpb.SmdDevice{sd} return nc }(), @@ -3574,17 +3573,17 @@ func TestServer_CtlSvc_adjustNvmeSize(t *testing.T) { 320 * clusterSize, 320 * clusterSize, 320 * clusterSize, - 0 * humanize.GiByte, - 0 * humanize.GiByte, - 0 * humanize.GiByte, + 320 * clusterSize, + 320 * clusterSize, + 320 * clusterSize, }, availableBytes: []uint64{ 320 * clusterSize, 320 * clusterSize, 320 * clusterSize, - 0 * humanize.GiByte, - 0 * humanize.GiByte, - 0 * humanize.GiByte, + 320 * clusterSize, + 320 * clusterSize, + 320 * clusterSize, }, usableBytes: []uint64{ // 5tgts * 64mib = 320mib of meta on SSD (10 clusters) @@ -3958,13 +3957,13 @@ func TestServer_CtlSvc_adjustScmSize(t *testing.T) { test.AssertEqual(t, tc.output.availableBytes[index], namespace.GetMount().GetAvailBytes(), fmt.Sprintf("Invalid SCM available bytes: 
nsp=%s, want=%s (%d bytes), got=%s (%d bytes)", namespace.GetMount().GetPath(), - humanize.Bytes(tc.output.availableBytes[index]), tc.output.availableBytes[index], - humanize.Bytes(namespace.GetMount().GetAvailBytes()), namespace.GetMount().GetAvailBytes())) + humanize.IBytes(tc.output.availableBytes[index]), tc.output.availableBytes[index], + humanize.IBytes(namespace.GetMount().GetAvailBytes()), namespace.GetMount().GetAvailBytes())) test.AssertEqual(t, tc.output.usableBytes[index], namespace.GetMount().GetUsableBytes(), fmt.Sprintf("Invalid SCM usable bytes: nsp=%s, want=%s (%d bytes), got=%s (%d bytes)", namespace.GetMount().GetPath(), - humanize.Bytes(tc.output.usableBytes[index]), tc.output.usableBytes[index], - humanize.Bytes(namespace.GetMount().GetUsableBytes()), namespace.GetMount().GetUsableBytes())) + humanize.IBytes(tc.output.usableBytes[index]), tc.output.usableBytes[index], + humanize.IBytes(namespace.GetMount().GetUsableBytes()), namespace.GetMount().GetUsableBytes())) } if tc.output.message != "" { test.AssertTrue(t, diff --git a/src/control/server/faults.go b/src/control/server/faults.go index ad70f202c1c..e32d2e6c728 100644 --- a/src/control/server/faults.go +++ b/src/control/server/faults.go @@ -85,7 +85,7 @@ func FaultPoolNvmeTooSmall(minTotal, minNVMe uint64) *fault.Fault { fmt.Sprintf("requested NVMe capacity too small (min %s per target)", humanize.IBytes(engine.NvmeMinBytesPerTarget)), fmt.Sprintf("retry the request with a pool size of at least %s, with at least %s NVMe", - humanize.Bytes(minTotal+humanize.MiByte), humanize.Bytes(minNVMe+humanize.MiByte), + humanize.IBytes(minTotal+humanize.MiByte), humanize.IBytes(minNVMe+humanize.MiByte), ), ) } @@ -96,7 +96,7 @@ func FaultPoolScmTooSmall(minTotal, minSCM uint64) *fault.Fault { fmt.Sprintf("requested SCM capacity is too small (min %s per target)", humanize.IBytes(engine.ScmMinBytesPerTarget)), fmt.Sprintf("retry the request with a pool size of at least %s, with at least %s SCM", - humanize.Bytes(minTotal+humanize.MiByte), humanize.Bytes(minSCM+humanize.MiByte), + humanize.IBytes(minTotal+humanize.MiByte), humanize.IBytes(minSCM+humanize.MiByte), ), ) } diff --git a/src/control/server/mgmt_pool.go b/src/control/server/mgmt_pool.go index 4ad98ccd1bf..46cfd9943c6 100644 --- a/src/control/server/mgmt_pool.go +++ b/src/control/server/mgmt_pool.go @@ -200,7 +200,7 @@ func (svc *mgmtSvc) calculateCreateStorage(req *mgmtpb.PoolCreateReq) error { nvmeBytes := req.TierBytes[1] if nvmeMissing && nvmeBytes > 0 { return errors.Errorf("%s NVMe requested for pool but config has zero bdevs", - humanize.Bytes(nvmeBytes)) + humanize.IBytes(nvmeBytes)) } // Pool tier sizes to be populated based on total-size and ratio. 
@@ -216,8 +216,8 @@ func (svc *mgmtSvc) calculateCreateStorage(req *mgmtpb.PoolCreateReq) error { req.TierBytes[tierIdx] = uint64(float64(req.TotalBytes)*req.TierRatio[tierIdx]) / uint64(len(req.GetRanks())) - svc.log.Infof("%s = (%s*%f) / %d", humanize.Bytes(req.TierBytes[tierIdx]), - humanize.Bytes(req.TotalBytes), req.TierRatio[tierIdx], + svc.log.Infof("%s = (%s*%f) / %d", humanize.IBytes(req.TierBytes[tierIdx]), + humanize.IBytes(req.TotalBytes), req.TierRatio[tierIdx], len(req.GetRanks())) } diff --git a/src/control/server/mgmt_pool_test.go b/src/control/server/mgmt_pool_test.go index 24f109cf196..f0cf3f86d43 100644 --- a/src/control/server/mgmt_pool_test.go +++ b/src/control/server/mgmt_pool_test.go @@ -196,7 +196,7 @@ func TestServer_MgmtSvc_calculateCreateStorage(t *testing.T) { scmTooSmallRatio := 0.01 scmTooSmallTotal := uint64(testTargetCount * engine.NvmeMinBytesPerTarget) scmTooSmallReq := uint64(float64(scmTooSmallTotal) * scmTooSmallRatio) - nvmeTooSmallTotal := uint64(3 * humanize.GByte) + nvmeTooSmallTotal := uint64(3 * humanize.GiByte) nvmeTooSmallReq := nvmeTooSmallTotal for name, tc := range map[string]struct { @@ -579,7 +579,7 @@ func TestServer_MgmtSvc_PoolCreate(t *testing.T) { memberCount: MaxPoolServiceReps + 2, req: &mgmtpb.PoolCreateReq{ Uuid: test.MockUUID(1), - TotalBytes: 100 * humanize.GByte, + TotalBytes: 100 * humanize.GiByte, TierRatio: defaultTierRatios, NumSvcReps: MaxPoolServiceReps + 2, Properties: testPoolLabelProp(), @@ -591,7 +591,7 @@ func TestServer_MgmtSvc_PoolCreate(t *testing.T) { memberCount: MaxPoolServiceReps - 2, req: &mgmtpb.PoolCreateReq{ Uuid: test.MockUUID(1), - TotalBytes: 100 * humanize.GByte, + TotalBytes: 100 * humanize.GiByte, TierRatio: defaultTierRatios, NumSvcReps: MaxPoolServiceReps - 1, Properties: testPoolLabelProp(), diff --git a/src/control/server/storage/bdev.go b/src/control/server/storage/bdev.go index 96286052fcf..d9f8ba24255 100644 --- a/src/control/server/storage/bdev.go +++ b/src/control/server/storage/bdev.go @@ -396,6 +396,15 @@ func (nc NvmeController) Free() (tb uint64) { return } +// Usable returns the cumulative usable bytes of blobstore clusters. This is a projected data +// capacity calculated whilst taking into account future pool metadata overheads. +func (nc NvmeController) Usable() (tb uint64) { + for _, d := range nc.SmdDevices { + tb += d.UsableBytes + } + return +} + // Roles returns bdev_roles for NVMe controller being used in MD-on-SSD mode. Assume that all SMD // devices on a controller have the same roles. func (nc *NvmeController) Roles() *BdevRoles { @@ -439,7 +448,7 @@ func (ncs NvmeControllers) Len() int { // Capacity returns the cumulative total bytes of all controller capacities. func (ncs NvmeControllers) Capacity() (tb uint64) { for _, c := range ncs { - tb += (*NvmeController)(c).Capacity() + tb += c.Capacity() } return } @@ -447,7 +456,7 @@ func (ncs NvmeControllers) Capacity() (tb uint64) { // Total returns the cumulative total bytes of all controller blobstores. func (ncs NvmeControllers) Total() (tb uint64) { for _, c := range ncs { - tb += (*NvmeController)(c).Total() + tb += c.Total() } return } @@ -455,17 +464,22 @@ func (ncs NvmeControllers) Total() (tb uint64) { // Free returns the cumulative available bytes of all blobstore clusters. func (ncs NvmeControllers) Free() (tb uint64) { for _, c := range ncs { - tb += (*NvmeController)(c).Free() + tb += c.Free() } return } -// PercentUsage returns the percentage of used storage space. 
-func (ncs NvmeControllers) PercentUsage() string { - return common.PercentageString(ncs.Total()-ncs.Free(), ncs.Total()) +// Usable returns the cumulative usable bytes of all blobstore clusters. This is a projected data +// capacity calculated whilst taking into account future pool metadata overheads. +func (ncs NvmeControllers) Usable() (tb uint64) { + for _, c := range ncs { + tb += c.Usable() + } + return } // Summary reports accumulated storage space and the number of controllers. +// Storage capacity printed with SI (decimal representation) units. func (ncs NvmeControllers) Summary() string { return fmt.Sprintf("%s (%d %s)", humanize.Bytes(ncs.Capacity()), len(ncs), common.Pluralise("controller", len(ncs))) @@ -504,6 +518,14 @@ func (ncs NvmeControllers) Addresses() (*hardware.PCIAddressSet, error) { return pas, nil } +// HaveMdOnSsdRoles returns true if bdev MD-on-SSD roles are configured on NVMe SSDs. +func (ncs NvmeControllers) HaveMdOnSsdRoles() bool { + if ncs.Len() > 0 && !ncs[0].Roles().IsEmpty() { + return true + } + return false +} + // NvmeAioDevice returns struct representing an emulated NVMe AIO device (file or kdev). type NvmeAioDevice struct { Path string `json:"path"` diff --git a/src/control/server/storage/bdev/backend_class.go b/src/control/server/storage/bdev/backend_class.go index 97b37fa8419..3d5060ff0f4 100644 --- a/src/control/server/storage/bdev/backend_class.go +++ b/src/control/server/storage/bdev/backend_class.go @@ -1,5 +1,5 @@ // -// (C) Copyright 2021-2023 Intel Corporation. +// (C) Copyright 2021-2024 Intel Corporation. // // SPDX-License-Identifier: BSD-2-Clause-Patent // @@ -41,7 +41,7 @@ func createEmptyFile(log logging.Logger, path string, size uint64) error { // adjust file size to align with block size size = (size / aioBlockSize) * aioBlockSize - log.Debugf("allocating blank file %s of size %s", path, humanize.Bytes(size)) + log.Debugf("allocating blank file %s of size %s", path, humanize.IBytes(size)) file, err := common.TruncFile(path) if err != nil { return errors.Wrapf(err, "open %q for truncate", path) diff --git a/src/control/server/storage/bdev/backend_class_test.go b/src/control/server/storage/bdev/backend_class_test.go index b15e116a6e9..0eec7e38980 100644 --- a/src/control/server/storage/bdev/backend_class_test.go +++ b/src/control/server/storage/bdev/backend_class_test.go @@ -180,10 +180,8 @@ func TestBackend_writeJSONFile(t *testing.T) { Tier: tierID, Class: storage.ClassNvme, Bdev: storage.BdevConfig{ - DeviceList: storage.MustNewBdevDeviceList(test.MockPCIAddrs(1, 2)...), - DeviceRoles: storage.BdevRoles{ - OptionBits: storage.OptionBits(storage.BdevRoleAll), - }, + DeviceList: storage.MustNewBdevDeviceList(test.MockPCIAddrs(1, 2)...), + DeviceRoles: storage.BdevRolesFromBits(storage.BdevRoleAll), }, }), expOut: ` diff --git a/src/control/server/storage/bdev/backend_json_test.go b/src/control/server/storage/bdev/backend_json_test.go index 77147423ad2..1285e8a0fee 100644 --- a/src/control/server/storage/bdev/backend_json_test.go +++ b/src/control/server/storage/bdev/backend_json_test.go @@ -240,12 +240,10 @@ func TestBackend_newSpdkConfig(t *testing.T) { Tier: tierID, Class: storage.ClassNvme, Bdev: storage.BdevConfig{ - DeviceList: storage.MustNewBdevDeviceList(tc.devList...), - FileSize: tc.fileSizeGB, - BusidRange: storage.MustNewBdevBusRange(tc.busidRange), - DeviceRoles: storage.BdevRoles{ - storage.OptionBits(tc.devRoles), - }, + DeviceList: storage.MustNewBdevDeviceList(tc.devList...), + FileSize: tc.fileSizeGB, + BusidRange: 
storage.MustNewBdevBusRange(tc.busidRange), + DeviceRoles: storage.BdevRolesFromBits(tc.devRoles), }, } if tc.class != "" { diff --git a/src/control/server/storage/config.go b/src/control/server/storage/config.go index 402bee1c400..6346d422dbc 100644 --- a/src/control/server/storage/config.go +++ b/src/control/server/storage/config.go @@ -10,6 +10,7 @@ import ( "encoding/json" "fmt" "path/filepath" + "sort" "strconv" "strings" @@ -228,7 +229,7 @@ func (tc *TierConfig) WithBdevBusidRange(rangeStr string) *TierConfig { // WithBdevDeviceRoles sets the role assignments for the bdev tier. func (tc *TierConfig) WithBdevDeviceRoles(bits int) *TierConfig { - tc.Bdev.DeviceRoles = BdevRoles{OptionBits(bits)} + tc.Bdev.DeviceRoles = BdevRolesFromBits(bits) return tc } @@ -846,7 +847,7 @@ func (ofm optFlagMap) keys() []string { return keys.ToSlice() } -// toStrings returns a slice of option names that have been set. +// toStrings returns a sorted slice of option names that have been set. func (obs OptionBits) toStrings(optStr2Flag optFlagMap) []string { opts := common.NewStringSet() for str, flag := range optStr2Flag { @@ -855,7 +856,9 @@ func (obs OptionBits) toStrings(optStr2Flag optFlagMap) []string { } } - return opts.ToSlice() + outOpts := opts.ToSlice() + sort.Strings(outOpts) + return outOpts } // toString returns a comma separated list of option names that have been set. @@ -967,6 +970,20 @@ func (bdr *BdevRoles) HasWAL() bool { return bdr.OptionBits&BdevRoleWAL != 0 } +// IsEmpty returns true if no options have been set. +func (bdr *BdevRoles) IsEmpty() bool { + return bdr == nil || bdr.OptionBits.IsEmpty() +} + +// BdevRolesFromBits returns BdevRoles initialized with supplied option bitset. +func BdevRolesFromBits(bits int) BdevRoles { + if bits <= 0 { + return BdevRoles{} + } + + return BdevRoles{OptionBits(bits)} +} + // BdevConfig represents a Block Device (NVMe, etc.) configuration entry. type BdevConfig struct { DeviceList *BdevDeviceList `yaml:"bdev_list,omitempty"` diff --git a/src/control/server/storage/mocks.go b/src/control/server/storage/mocks.go index dc54f7e68cb..b83ada8f79c 100644 --- a/src/control/server/storage/mocks.go +++ b/src/control/server/storage/mocks.go @@ -107,7 +107,7 @@ func MockSmdDevice(c *NvmeController, varIdx ...int32) *SmdDevice { sd := SmdDevice{ UUID: test.MockUUID(idx), TargetIDs: []int32{startTgt, startTgt + 1, startTgt + 2, startTgt + 3}, - Roles: BdevRoles{OptionBits(BdevRoleAll)}, + Roles: BdevRolesFromBits(BdevRoleAll), } if c != nil { sd.Ctrlr = *c @@ -203,7 +203,7 @@ func MockScmMountPoint(varIdx ...int32) *ScmMountPoint { Path: fmt.Sprintf("/mnt/daos%d", idx), DeviceList: []string{fmt.Sprintf("pmem%d", idx)}, TotalBytes: uint64(humanize.TByte) * uint64(idx+1), - AvailBytes: uint64(humanize.TByte/4) * uint64(idx+1), // 75% used + AvailBytes: uint64(humanize.TByte/4) * uint64(idx+1), // 25% available Rank: ranklist.Rank(uint32(idx)), } } diff --git a/src/control/server/storage/scm.go b/src/control/server/storage/scm.go index 432410f802d..be24bf5cf4d 100644 --- a/src/control/server/storage/scm.go +++ b/src/control/server/storage/scm.go @@ -199,9 +199,8 @@ func (sms ScmModules) Capacity() (tb uint64) { return } -// Summary reports total storage space and the number of modules. -// -// Capacity given in IEC standard units. +// Summary reports total storage space and the number of modules. Memory capacity printed with IEC +// (binary representation) units. 
func (sms ScmModules) Summary() string { return fmt.Sprintf("%s (%d %s)", humanize.IBytes(sms.Capacity()), len(sms), common.Pluralise("module", len(sms))) @@ -295,14 +294,9 @@ func (sns ScmNamespaces) Usable() (tb uint64) { return } -// PercentUsage returns the percentage of used storage space. -func (sns ScmNamespaces) PercentUsage() string { - return common.PercentageString(sns.Total()-sns.Free(), sns.Total()) -} - -// Summary reports total storage space and the number of namespaces. -// -// Capacity given in IEC standard units. +// Summary reports total storage space and the number of namespaces. Although the underlying +// hardware is memory the PMem namespaces will be presented as block storage devices so print +// capacity with SI (decimal representation) units. func (sns ScmNamespaces) Summary() string { return fmt.Sprintf("%s (%d %s)", humanize.Bytes(sns.Capacity()), len(sns), common.Pluralise("namespace", len(sns))) diff --git a/src/control/server/storage/scm/ipmctl_region.go b/src/control/server/storage/scm/ipmctl_region.go index a22774a3d6f..3cf6a6c24b3 100644 --- a/src/control/server/storage/scm/ipmctl_region.go +++ b/src/control/server/storage/scm/ipmctl_region.go @@ -293,7 +293,7 @@ func getPMemState(log logging.Logger, regions Regions) (*storage.ScmSocketState, return resp, nil case storage.ScmFreeCap: log.Debugf("socket %d app-direct region has %s free", r.SocketID, - humanize.Bytes(uint64(r.FreeCapacity))) + humanize.IBytes(uint64(r.FreeCapacity))) hasFreeCap = true case storage.ScmNoFreeCap: // Fall-through diff --git a/src/control/server/storage/scm/ndctl.go b/src/control/server/storage/scm/ndctl.go index 72f30b3b008..cca6bb1ede5 100644 --- a/src/control/server/storage/scm/ndctl.go +++ b/src/control/server/storage/scm/ndctl.go @@ -1,5 +1,5 @@ // -// (C) Copyright 2022-2023 Intel Corporation. +// (C) Copyright 2022-2024 Intel Corporation. // // SPDX-License-Identifier: BSD-2-Clause-Patent // @@ -147,8 +147,8 @@ func (cr *cmdRunner) createNamespaces(regionPerSocket socketRegionMap, nrNsPerSo if pmemBytes%alignmentBoundaryBytes != 0 { return nil, errors.Errorf("%s: available size (%s) is not %s aligned", - region.Dev, humanize.Bytes(pmemBytes), - humanize.Bytes(alignmentBoundaryBytes)) + region.Dev, humanize.IBytes(pmemBytes), + humanize.IBytes(alignmentBoundaryBytes)) } // Create specified number of namespaces on a single region (NUMA node). @@ -160,7 +160,7 @@ func (cr *cmdRunner) createNamespaces(regionPerSocket socketRegionMap, nrNsPerSo return nil, errors.WithMessagef(err, "%s", region.Dev) } cr.log.Debugf("created namespace on %s size %s", region.Dev, - humanize.Bytes(pmemBytes)) + humanize.IBytes(pmemBytes)) } numaNodesPrepped = append(numaNodesPrepped, int(region.NumaNode)) diff --git a/utils/config/examples/daos_server_mdonssd.yml b/utils/config/examples/daos_server_mdonssd.yml index 043288a59df..090df281a6b 100644 --- a/utils/config/examples/daos_server_mdonssd.yml +++ b/utils/config/examples/daos_server_mdonssd.yml @@ -80,11 +80,11 @@ engines: # - # class: nvme # bdev_list: ["0000:81:00.0"] - # bdev_roles: [wal] + # bdev_roles: [wal,meta] # - # class: nvme # bdev_list: ["0000:82:00.0"] - # bdev_roles: [meta,data] + # bdev_roles: [data] # To use emulated NVMe, use `class: file` instead of `class: nvme`, see # utils/config/daos_server.yml for details on AIO file class.
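
For reviewers, here is a minimal illustrative sketch (not part of the patch) showing how the helpers added in this change fit together: `BdevRolesFromBits` replaces the older `BdevRoles{OptionBits(...)}` literal form, `NvmeControllers.HaveMdOnSsdRoles` detects MD-on-SSD mode from assigned bdev roles, and `NvmeControllers.Usable` sums the projected data capacity across controllers. The `main` wrapper, the example byte counts, and the import paths are assumptions for illustration only.

```go
// Illustrative sketch only; not part of the patch. Assumes the storage and
// go-humanize import paths used elsewhere in this tree.
package main

import (
	"fmt"

	"github.com/dustin/go-humanize"

	"github.com/daos-stack/daos/src/control/server/storage"
)

func main() {
	// Two controllers as the scan path might report them in MD-on-SSD mode:
	// one carrying WAL+META roles, one carrying the DATA role.
	ctrlrs := storage.NvmeControllers{
		&storage.NvmeController{
			SmdDevices: []*storage.SmdDevice{
				{
					// BdevRolesFromBits builds the roles set from an OR of role bits.
					Roles: storage.BdevRolesFromBits(storage.BdevRoleWAL | storage.BdevRoleMeta),
					// Non-data device: adjustNvmeSize projects zero usable data capacity.
					UsableBytes: 0,
				},
			},
		},
		&storage.NvmeController{
			SmdDevices: []*storage.SmdDevice{
				{
					Roles:       storage.BdevRolesFromBits(storage.BdevRoleData),
					UsableBytes: 800 * humanize.GByte, // hypothetical projected data capacity
				},
			},
		},
	}

	// HaveMdOnSsdRoles reports whether bdev roles are assigned (MD-on-SSD mode);
	// Usable sums the projected data capacity over all controllers' SMD devices.
	fmt.Println("md-on-ssd mode:", ctrlrs.HaveMdOnSsdRoles())
	fmt.Println("projected usable data capacity:", humanize.IBytes(ctrlrs.Usable()))
}
```

The same role-detection logic is what `HostStorageMap.IsMdOnSsdEnabled` relies on per host storage set, so callers on the client side can branch on MD-on-SSD behaviour without inspecting individual SMD devices themselves.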