NAS-130426 / Cherry-pick multi-gen LRU patches for DF #187

usaleem-ix · 2024-08-06T12:31:02Z

This PR contains 2 commits, that are cherry-picked from v6.6.44 release. These commits contain fixes for multi-gen LRU.

Changelog is also updated.

API tests run can be found here.

Manually tested by creating an ISO and installing on disk. No new warnings or failures found in dmesg logs.

commit 3f74e6bd3b84a8b6bb3cc51609c89e5b9d58eed7 upstream. set_initial_priority() tries to jump-start global reclaim by estimating the priority based on cold/hot LRU pages. The estimation does not account for shrinker objects, and it cannot do so because their sizes can be in different units other than page. If shrinker objects are the majority, e.g., on TrueNAS SCALE 24.04.0 where ZFS ARC can use almost all system memory, set_initial_priority() can vastly underestimate how much memory ARC shrinker can evict and assign extreme low values to scan_control->priority, resulting in overshoots of shrinker objects. To reproduce the problem, using TrueNAS SCALE 24.04.0 with 32GB DRAM, a test ZFS pool and the following commands: fio --name=mglru.file --numjobs=36 --ioengine=io_uring \ --directory=/root/test-zfs-pool/ --size=1024m --buffered=1 \ --rw=randread --random_distribution=random \ --time_based --runtime=1h & for ((i = 0; i < 20; i++)) do sleep 120 fio --name=mglru.anon --numjobs=16 --ioengine=mmap \ --filename=/dev/zero --size=1024m --fadvise_hint=0 \ --rw=randrw --random_distribution=random \ --time_based --runtime=1m done To fix the problem: 1. Cap scan_control->priority at or above DEF_PRIORITY/2, to prevent the jump-start from being overly aggressive. 2. Account for the progress from mm_account_reclaimed_pages(), to prevent kswapd_shrink_node() from raising the priority unnecessarily. Link: https://lkml.kernel.org/r/[email protected] Fixes: e4dde56 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <[email protected]> Reported-by: Alexander Motin <[email protected]> Cc: Wei Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8b671fe1a879923ecfb72dda6caf01460dd885ef upstream. evict_folios() uses a second pass to reclaim folios that have gone through page writeback and become clean before it finishes the first pass, since folio_rotate_reclaimable() cannot handle those folios due to the isolation. The second pass tries to avoid potential double counting by deducting scan_control->nr_scanned. However, this can result in underflow of nr_scanned, under a condition where shrink_folio_list() does not increment nr_scanned, i.e., when folio_trylock() fails. The underflow can cause the divisor, i.e., scale=scanned+reclaimed in vmpressure_calc_level(), to become zero, resulting in the following crash: [exception RIP: vmpressure_work_fn+101] process_one_work at ffffffffa3313f2b Since scan_control->nr_scanned has no established semantics, the potential double counting has minimal risks. Therefore, fix the problem by not deducting scan_control->nr_scanned in evict_folios(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 359a5e1 ("mm: multi-gen LRU: retry folios written back while isolated") Reported-by: Wei Xu <[email protected]> Signed-off-by: Yu Zhao <[email protected]> Cc: Alexander Motin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Umer Saleem <[email protected]>

bugclerk · 2024-08-06T12:31:38Z

Jira URL: https://ixsystems.atlassian.net/browse/NAS-130426

ixhamza · 2024-08-06T12:39:05Z

Are we also planning to enable CONFIG_LRU_GEN_ENABLED in dragonfish? I am wondering whether it's worth maintaining this patch with CONFIG_LRU_GEN_ENABLED disabled, or if there is a plan to enable it in the upcoming dragonfish release.

usaleem-ix · 2024-08-06T13:19:03Z

CONFIG_LRU_GEN_ENABLED option enables the multi-gen LRU by default. Currently, it is disabled for both EE and DF (this is coming from Debian). I am not sure we are planning to enable this by default.

Since, these patches have landed in v6.6.44, I don't think this would introduce extra maintainability overhead. Moving to or merging a newer release will already contain these patches.

bugclerk · 2024-08-06T13:44:18Z

JIRA ticket https://ixsystems.atlassian.net/browse/NAS-130426 is targeted to the following versions which have not received their corresponding PRs: 24.10

bugclerk · 2024-08-06T13:44:19Z

This PR has been merged and conversations have been locked.
If you would like to discuss more about this issue please use our forums or raise a Jira ticket.

yuzhaogoogle and others added 3 commits August 6, 2024 11:43

Update changelog after cherry-pick multi-gen LRU pactches

7705ef9

Signed-off-by: Umer Saleem <[email protected]>

usaleem-ix requested review from amotin and ixhamza August 6, 2024 12:31

ixhamza approved these changes Aug 6, 2024

View reviewed changes

amotin approved these changes Aug 6, 2024

View reviewed changes

amotin merged commit ef7d464 into stable/dragonfish Aug 6, 2024
6 checks passed

amotin deleted the NAS-130426-df branch August 6, 2024 13:44

bugclerk added the no-time-tracked label Aug 6, 2024

truenas locked as resolved and limited conversation to collaborators Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NAS-130426 / Cherry-pick multi-gen LRU patches for DF #187

NAS-130426 / Cherry-pick multi-gen LRU patches for DF #187

usaleem-ix commented Aug 6, 2024

bugclerk commented Aug 6, 2024

ixhamza commented Aug 6, 2024

usaleem-ix commented Aug 6, 2024 •

edited

Loading

bugclerk commented Aug 6, 2024

bugclerk commented Aug 6, 2024

NAS-130426 / Cherry-pick multi-gen LRU patches for DF #187

NAS-130426 / Cherry-pick multi-gen LRU patches for DF #187

Conversation

usaleem-ix commented Aug 6, 2024

bugclerk commented Aug 6, 2024

ixhamza commented Aug 6, 2024

usaleem-ix commented Aug 6, 2024 • edited Loading

bugclerk commented Aug 6, 2024

bugclerk commented Aug 6, 2024

usaleem-ix commented Aug 6, 2024 •

edited

Loading