Merge upstream/release/2.6 into upstream/google/2.6 #15460

Merged 35 commits on Nov 7, 2024
35 commits
76cfb41
DAOS-16556 client: call fstat() before mmap() to update file status i…
wiliamhuang Oct 15, 2024
b4eb689
DAOS-16446 test: HDF5-VOL test - Set object class and container prope…
shimizukko Oct 16, 2024
6e16c8e
DAOS-16673 common: ignore Hadoop 3.4.0 related CVE (#15320)
grom72 Oct 16, 2024
d9f16a1
DAOS-14408 common: ensure NDCTL not used for storage class `ram` (#15…
grom72 Oct 16, 2024
60d4b5d
DAOS-16653 pool: Batch crt events (#15230) (#15302)
liw Oct 18, 2024
e0f5883
DAOS-16720 cq: pin isort to v1.1.0 (#15338) (#15339)
daltonbohning Oct 18, 2024
cb9d278
DAOS-15852 test: more timing samples for co_op_dup_timing() (#14497) …
kccain Oct 19, 2024
f8682fb
DAOS-16572 rebuild: properly assign global_dtx_resync_version in IV -…
Nasf-Fan Oct 20, 2024
81e57d0
DAOS-16716 ci: Set reference build for PRs (#15337)
jolivier23 Oct 21, 2024
c821379
DAOS-16329 chk: maintenance mode after checking pool with dryrun - b2…
Nasf-Fan Oct 21, 2024
b913d3e
DAOS-16265 test: Fix erasurecode/rebuild_fio.py out of space (#15020)…
phender Oct 21, 2024
ffa1c9d
DAOS-16693 telemetry: Avoid race between init/read (#15306) (#15322)
mjmac Oct 22, 2024
42a0d35
DAOS-16696 cart: Fix rc in error path (#15313) (#15357)
frostedcmos Oct 22, 2024
1ae3f29
DAOS-16574 vos: shrink DTX table blob size - b26 (#15220) (#15221)
Nasf-Fan Oct 22, 2024
dcf8419
DAOS-16653 doc: Fix CRT_EVENT_DELAY description (#15351) (#15371)
liw Oct 23, 2024
4c49f36
DAOS-16650 control: dmg system exclude, update group version (#15288)…
kccain Oct 23, 2024
2819d45
DAOS-16488 chk: take sd_lock before accessing VOS sys_db - b26 (#15269)
Nasf-Fan Oct 24, 2024
ec3aa1c
DAOS-16469 dtx: optimize DTX CoS cache - b26 (#15085)
Nasf-Fan Oct 24, 2024
2b5620b
DAOS-14262 cart: add ability to select traffic class for SWIM context…
soumagne Oct 24, 2024
70b12e3
DAOS-16469 container: Lower log level for cont_aggregate_interval (#1…
Nasf-Fan Oct 24, 2024
2a1892f
DAOS-16716 ci: Set reference build for PRs (#15379)
jolivier23 Oct 24, 2024
23f0787
DAOS-15914: crt_reply_send_input_free() (#14817)
frostedcmos Oct 25, 2024
67da3b9
DAOS-16721 object: fix coll RPC for obj with sparse layout - b26 (#15…
Nasf-Fan Oct 25, 2024
c4cf4f7
DAOS-16687 control: Handle missing PCIe caps in storage query usage (…
tanabarr Oct 28, 2024
eb95b55
DAOS-16722 client: to intercept PMPI_Init() in libpil4dfs (#15387)
wiliamhuang Oct 28, 2024
bde13c3
DAOS-15943 test: Remove server logging from pre-teardown (#15282) (#1…
mjean308 Oct 28, 2024
4637c00
DAOS-16685 dfuse: Change eq poll to use NOWAIT. (#15425)
jgmoore-or Oct 31, 2024
c10428a
DAOS-16508 csum: retry a few times on checksum mismatch on update (#1…
johannlombardi Nov 5, 2024
d17d3d6
DAOS-16211 vos: Avoid race condition with discard (#15370) (#15432)
jolivier23 Nov 5, 2024
9ab3200
DAOS-15162 build: update to libfabric 1.22.0 (#15441)
soumagne Nov 5, 2024
f7d12a4
DAOS-16721 dtx: handle potential DTX ID reusing trouble - b26 (#15409)
Nasf-Fan Nov 5, 2024
cf2a8b9
DAOS-16365 client: intercept dlsym() and zeInit() to avoid nested cal…
wiliamhuang Nov 6, 2024
d1af13a
DAOS-16752 build: update mercury to 2.4.0 (#15443)
soumagne Nov 6, 2024
12bd641
DAOS-16784 build: Tag 2.6.2 tb1 (#15455)
phender Nov 6, 2024
b53cc5b
Merge remote-tracking branch 'upstream/release/2.6' into mjmac/google…
mjmac Nov 6, 2024
2 changes: 1 addition & 1 deletion .github/workflows/linting.yml
@@ -26,7 +26,7 @@ jobs:
- uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5.1.0
with:
python-version: '3'
- uses: isort/isort-action@master
- uses: isort/isort-action@f14e57e1d457956c45a19c05a89cccdf087846e5 # v1.1.0
with:
requirementsFiles: "requirements.txt"
- name: Run on SConstruct file.
70 changes: 70 additions & 0 deletions .github/workflows/trivy.yml
@@ -0,0 +1,70 @@
name: Trivy scan

on:
  workflow_dispatch:
  push:
    branches: ["master", "release/**"]
  pull_request:
    branches: ["master", "release/**"]

# Declare default permissions as nothing.
permissions: {}

jobs:
  build:
    name: Build
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout code
        uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

      - name: Run Trivy vulnerability scanner in repo mode
        uses: aquasecurity/trivy-action@6e7b7d1fd3e4fef0c5fa8cce1229c54b2c9bd0d8 # 0.24.0
        with:
          scan-type: 'fs'
          scan-ref: '.'
          trivy-config: 'utils/trivy/trivy.yaml'

      - name: Prepare the report to be uploaded to the GitHub artifact store
        run: |
          mkdir report
          cp trivy-report-daos.txt report
          cp utils/trivy/.trivyignore report/trivyignore.txt

      - name: Upload the report to the GitHub artifact store
        uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3
        with:
          path: report/*
          name: trivy-report-daos

      - name: Adjust config file to use sarif format
        run: |
          sed -i 's/output: "trivy-report-daos.txt"/output: "trivy-results.sarif"/g' \
            utils/trivy/trivy.yaml
          sed -i 's/format: template/format: sarif/g' utils/trivy/trivy.yaml

      - name: Run Trivy vulnerability scanner in repo mode
        uses: aquasecurity/trivy-action@6e7b7d1fd3e4fef0c5fa8cce1229c54b2c9bd0d8 # 0.24.0
        with:
          scan-type: 'fs'
          scan-ref: '.'
          trivy-config: 'utils/trivy/trivy.yaml'

      - name: Upload Trivy scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@afb54ba388a7dca6ecae48f608c4ff05ff4cc77a
        # 3.25.15 (v3)
        with:
          sarif_file: 'trivy-results.sarif'

      - name: Adjust config file to show and validate scan results
        run: |
          sed -i 's/output: "trivy-results.sarif"//g' utils/trivy/trivy.yaml
          sed -i 's/format: sarif/format: table/g' utils/trivy/trivy.yaml
          sed -i 's/exit-code: 0/exit-code: 1/g' utils/trivy/trivy.yaml

      - name: Run Trivy vulnerability scanner in repo mode
        uses: aquasecurity/trivy-action@6e7b7d1fd3e4fef0c5fa8cce1229c54b2c9bd0d8 # 0.24.0
        with:
          scan-type: 'fs'
          scan-ref: '.'
          trivy-config: 'utils/trivy/trivy.yaml'
2 changes: 1 addition & 1 deletion TAG
@@ -1 +1 @@
2.6.1-rc3
2.6.2-tb1
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.6.1
2.6.2
6 changes: 6 additions & 0 deletions debian/changelog
@@ -1,3 +1,9 @@
daos (2.6.2-1) unstable; urgency=medium
[ Phillip Henderson ]
* First test build for 2.6.2

-- Phillip Henderson <[email protected]> Tue, 05 Nov 2024 23:25:00 -0500

daos (2.6.1-4) unstable; urgency=medium
[ Tomasz Gromadzki ]
* Add support of the PMDK package 2.1.0 with NDCTL enabled.
1 change: 1 addition & 0 deletions docs/admin/env_variables.md
@@ -44,6 +44,7 @@ Environment variables in this section only apply to the server side.
|DAOS\_MD\_CAP |Size of a metadata pmem pool/file in MBs. INTEGER. Default to 128 MB.|
|DAOS\_START\_POOL\_SVC|Determines whether to start existing pool services when starting a daos\_server. BOOL. Default to true.|
|CRT\_DISABLE\_MEM\_PIN|Disable memory pinning workaround on a server side. BOOL. Default to 0.|
|CRT\_EVENT\_DELAY|Delay in seconds before handling a set of CaRT events. INTEGER. Default to 10 s. A longer delay enables batching of successive CaRT events, leading to fewer pool map changes when multiple engines become unavailable at around the same time.|
|DAOS\_SCHED\_PRIO\_DISABLED|Disable server ULT prioritizing. BOOL. Default to 0.|
|DAOS\_SCHED\_RELAX\_MODE|The mode of CPU relaxing on idle. "disabled":disable relaxing; "net":wait on network request for INTVL; "sleep":sleep for INTVL. STRING. Default to "net"|
|DAOS\_SCHED\_RELAX\_INTVL|CPU relax interval in milliseconds. INTEGER. Default to 1 ms.|
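The CRT_EVENT_DELAY row above describes an integer number of seconds with a default of 10. A minimal sketch of reading such a setting follows; it is not how DAOS itself reads the variable, and the helper name `event_delay_from_env` and the constant `EVENT_DELAY_DEFAULT` are hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Default delay in seconds, taken from the table row above. */
#define EVENT_DELAY_DEFAULT 10U

/*
 * Hypothetical helper: convert the raw environment string into a delay in
 * seconds, falling back to the default when the value is unset, empty,
 * not a plain decimal number, or out of range.
 */
unsigned int
event_delay_from_env(const char *value)
{
	char          *end;
	unsigned long  v;

	/* Require a leading digit so that signs and whitespace are rejected. */
	if (value == NULL || !isdigit((unsigned char)value[0]))
		return EVENT_DELAY_DEFAULT;

	errno = 0;
	v     = strtoul(value, &end, 10);
	if (errno != 0 || *end != '\0' || v > UINT_MAX)
		return EVENT_DELAY_DEFAULT;

	return (unsigned int)v;
}
```

A caller would typically pass the result of `getenv("CRT_EVENT_DELAY")`. Falling back to the default on malformed input is a design choice of this sketch, not documented DAOS behavior.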
12 changes: 7 additions & 5 deletions src/cart/README.env
@@ -1,13 +1,10 @@
This file lists the environment variables used in CaRT.

. D_PROVIDER (Deprecated: CRT_PHY_ADDR_STR)
It determines which mercury NA plugin to be used:
It determines which mercury NA plugin and transport to be used:
- set it as "ofi+verbs;ofi_rxm" to use OFI verbs;ofi_rxm provider
- set it as "ofi+gni" to use OFI gni provider
- set it as "sm" to use SM plugin which only works within single node
- set it as "ofi+tcp;ofi_rxm" to use OFI tcp;ofi_rxm provider.
- set it as "ofi+sockets" to use OFI sockets provider
NOTE: This provider is deprecated in favor of "ofi+tcp;ofi_rxm"
- set it as "ofi+tcp" to use OFI tcp provider.
- by default (not set or set as any other value) it will use ofi tcp
provider.

@@ -205,3 +202,8 @@ This file lists the environment variables used in CaRT.
start copying data in an effort to release multi-recv buffers. Copy will occur when at
most D_MRECV_BUF_COPY buffers remain.

SWIM_TRAFFIC_CLASS
(server only) Select a traffic class for the SWIM protocol to use and prevent potential
traffic congestion. Available options are: "unspec" (default), "best_effort",
"low_latency", "bulk_data".

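The SWIM_TRAFFIC_CLASS entry above maps a string option to an enumerated class, with "unspec" as the default. A minimal standalone sketch of that mapping, using a single X-macro list to keep the enum and its names in sync, could look like the following. The identifiers (`TC_LIST`, `enum tc`, `tc_from_str`) are hypothetical and are not the CaRT names; treating NULL or unrecognized input as "unspec" is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* One list drives both the enum and the name table (hypothetical names). */
#define TC_LIST                          \
	X(TC_UNSPEC, "unspec")           \
	X(TC_BEST_EFFORT, "best_effort") \
	X(TC_LOW_LATENCY, "low_latency") \
	X(TC_BULK_DATA, "bulk_data")

/* Expand the list into enum constants; TC_COUNT is the number of classes. */
#define X(id, name) id,
enum tc { TC_LIST TC_COUNT };
#undef X

/* Expand the same list into the matching option strings. */
#define X(id, name) name,
static const char *const tc_names[] = {TC_LIST};
#undef X

/* Return the class named by str, or TC_UNSPEC for NULL or unknown input. */
enum tc
tc_from_str(const char *str)
{
	int i;

	if (str == NULL)
		return TC_UNSPEC;

	for (i = 0; i < TC_COUNT; i++) {
		if (strcmp(tc_names[i], str) == 0)
			return (enum tc)i;
	}

	return TC_UNSPEC;
}
```

Checking the index bound before each `strcmp()` keeps the loop from reading past the end of the name table, whatever the length of the list.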
13 changes: 13 additions & 0 deletions src/cart/crt_hg.c
@@ -863,6 +863,9 @@ crt_hg_class_init(crt_provider_t provider, int ctx_idx, bool primary, int iface_
init_info.request_post_incr = crt_gdata.cg_post_incr;
init_info.multi_recv_op_max = crt_gdata.cg_mrecv_buf;
init_info.multi_recv_copy_threshold = crt_gdata.cg_mrecv_buf_copy;
/* Separate SWIM traffic in an effort to prevent potential congestion. */
if (crt_is_service() && ctx_idx == crt_gdata.cg_swim_crt_idx)
init_info.traffic_class = (enum na_traffic_class)crt_gdata.cg_swim_tc;

hg_class = HG_Init_opt2(info_string, crt_is_service(), HG_VERSION(2, 4), &init_info);
if (hg_class == NULL) {
@@ -1479,6 +1482,16 @@ crt_hg_reply_send(struct crt_rpc_priv *rpc_priv)
rc = crt_hgret_2_der(hg_ret);
}

/* Release input buffer */
if (rpc_priv->crp_release_input_early && !rpc_priv->crp_forward) {
hg_ret = HG_Release_input_buf(rpc_priv->crp_hg_hdl);
if (hg_ret != HG_SUCCESS) {
RPC_ERROR(rpc_priv, "HG_Release_input_buf failed, hg_ret: " DF_HG_RC "\n",
DP_HG_RC(hg_ret));
/* Fall through */
}
}

return rc;
}

36 changes: 27 additions & 9 deletions src/cart/crt_init.c
@@ -18,6 +18,10 @@ static volatile int gdata_init_flag;
struct crt_plugin_gdata crt_plugin_gdata;
static bool g_prov_settings_applied[CRT_PROV_COUNT];

#define X(a, b) b,
static const char *const crt_tc_name[] = {CRT_TRAFFIC_CLASSES};
#undef X

static void
crt_lib_init(void) __attribute__((__constructor__));

@@ -237,18 +241,30 @@ crt_gdata_dump(void)
DUMP_GDATA_FIELD("%d", cg_rpc_quota);
}

static enum crt_traffic_class
crt_str_to_tc(const char *str)
{
enum crt_traffic_class i = 0;

while (str != NULL && strcmp(crt_tc_name[i], str) != 0 && i < CRT_TC_UNKNOWN)
i++;

return i == CRT_TC_UNKNOWN ? CRT_TC_UNSPEC : i;
}

/* first step init - for initializing crt_gdata */
static int data_init(int server, crt_init_options_t *opt)
{
uint32_t timeout = 0;
uint32_t credits;
uint32_t fi_univ_size = 0;
uint32_t mem_pin_enable = 0;
uint32_t is_secondary;
uint32_t post_init = CRT_HG_POST_INIT, post_incr = CRT_HG_POST_INCR;
unsigned int mrecv_buf = CRT_HG_MRECV_BUF;
unsigned int mrecv_buf_copy = 0; /* buf copy disabled by default */
int rc = 0;
uint32_t timeout = 0;
uint32_t credits;
uint32_t fi_univ_size = 0;
uint32_t mem_pin_enable = 0;
uint32_t is_secondary;
uint32_t post_init = CRT_HG_POST_INIT, post_incr = CRT_HG_POST_INCR;
unsigned int mrecv_buf = CRT_HG_MRECV_BUF;
unsigned int mrecv_buf_copy = 0; /* buf copy disabled by default */
char *swim_traffic_class = NULL;
int rc = 0;

crt_env_dump();

@@ -261,6 +277,8 @@ static int data_init(int server, crt_init_options_t *opt)
crt_gdata.cg_mrecv_buf = mrecv_buf;
crt_env_get(D_MRECV_BUF_COPY, &mrecv_buf_copy);
crt_gdata.cg_mrecv_buf_copy = mrecv_buf_copy;
crt_env_get(SWIM_TRAFFIC_CLASS, &swim_traffic_class);
crt_gdata.cg_swim_tc = crt_str_to_tc(swim_traffic_class);

is_secondary = 0;
/* Apply CART-890 workaround for server side only */
15 changes: 15 additions & 0 deletions src/cart/crt_internal_types.h
@@ -42,6 +42,17 @@ struct crt_na_config {
char **noc_domain_str; /* Array of domains */
};

#define CRT_TRAFFIC_CLASSES \
X(CRT_TC_UNSPEC, "unspec") /* Leave it upon plugin to choose */ \
X(CRT_TC_BEST_EFFORT, "best_effort") /* Best effort */ \
X(CRT_TC_LOW_LATENCY, "low_latency") /* Low latency */ \
X(CRT_TC_BULK_DATA, "bulk_data") /* Bulk data */ \
X(CRT_TC_UNKNOWN, "unknown") /* Unknown */

#define X(a, b) a,
enum crt_traffic_class { CRT_TRAFFIC_CLASSES };
#undef X

struct crt_prov_gdata {
/** NA plugin type */
int cpg_provider;
@@ -105,6 +116,9 @@ struct crt_gdata {
/** global swim index for all servers */
int32_t cg_swim_crt_idx;

/** traffic class used by SWIM */
enum crt_traffic_class cg_swim_tc;

/** credits limitation for #in-flight RPCs per target EP CTX */
uint32_t cg_credit_ep_ctx;

@@ -220,6 +234,7 @@ struct crt_event_cb_priv {
ENV(SWIM_PING_TIMEOUT) \
ENV(SWIM_PROTOCOL_PERIOD_LEN) \
ENV(SWIM_SUSPECT_TIMEOUT) \
ENV_STR(SWIM_TRAFFIC_CLASS) \
ENV_STR(UCX_IB_FORK_INIT)

/* uint env */
8 changes: 6 additions & 2 deletions src/cart/crt_iv.c
@@ -1,5 +1,5 @@
/*
* (C) Copyright 2016-2023 Intel Corporation.
* (C) Copyright 2016-2024 Intel Corporation.
*
* SPDX-License-Identifier: BSD-2-Clause-Patent
*/
@@ -2911,8 +2911,12 @@ bulk_update_transfer_done_aux(const struct crt_bulk_cb_info *info)
return rc;

send_error:
rc = crt_bulk_free(cb_info->buc_bulk_hdl);
/* send back whatever error got us here */
output->rc = rc;
rc = crt_bulk_free(cb_info->buc_bulk_hdl);
if (rc != 0)
DL_ERROR(rc, "crt_bulk_free() failed");

iv_ops->ivo_on_put(ivns_internal, &cb_info->buc_iv_value,
cb_info->buc_user_priv);

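As this hunk reads, the error path now stores the original error in the reply before calling the cleanup routine, so the cleanup's return code can no longer overwrite it. A minimal standalone sketch of that ordering follows; `struct reply`, `release_handle()` and `finish_with_error()` are hypothetical stand-ins, not CaRT functions.

```c
#include <stdio.h>

/* Hypothetical reply structure, for illustration only. */
struct reply {
	int rc;
};

/* Hypothetical cleanup routine; here it always succeeds. */
static int
release_handle(void)
{
	return 0;
}

/*
 * Report err to the peer, then clean up. The original error is recorded
 * before the cleanup call so that the cleanup result cannot replace it.
 */
int
finish_with_error(struct reply *out, int err)
{
	int rc;

	/* Send back whatever error got us here. */
	out->rc = err;

	/* A cleanup failure is logged, but does not change the reported error. */
	rc = release_handle();
	if (rc != 0)
		fprintf(stderr, "release_handle() failed: %d\n", rc);

	return err;
}
```

With the opposite order, a successful cleanup would set the return code to zero and the peer would see success even though the operation failed.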
20 changes: 20 additions & 0 deletions src/cart/crt_rpc.c
@@ -1550,6 +1550,26 @@ crt_req_send(crt_rpc_t *req, crt_cb_t complete_cb, void *arg)
return rc;
}

int
crt_reply_send_input_free(crt_rpc_t *req)
{
struct crt_rpc_priv *rpc_priv = NULL;
int rc = 0;

if (req == NULL) {
D_ERROR("invalid parameter (NULL req).\n");
D_GOTO(out, rc = -DER_INVAL);
}

rpc_priv = container_of(req, struct crt_rpc_priv, crp_pub);
rpc_priv->crp_release_input_early = 1;

return crt_reply_send(req);

out:
return rc;
}

int
crt_reply_send(crt_rpc_t *req)
{
47 changes: 24 additions & 23 deletions src/cart/crt_rpc.h
@@ -166,29 +166,30 @@ struct crt_rpc_priv {
* match with crp_req_hdr.cch_flags.
*/
uint32_t crp_flags;
uint32_t crp_srv:1, /* flag of server received request */
crp_output_got:1,
crp_input_got:1,
/* flag of collective RPC request */
crp_coll:1,
/* flag of crp_tgt_uri need to be freed */
crp_uri_free:1,
/* flag of forwarded rpc for corpc */
crp_forward:1,
/* flag of in timeout binheap */
crp_in_binheap:1,
/* set if a call to crt_req_reply pending */
crp_reply_pending:1,
/* set to 1 if target ep is set */
crp_have_ep:1,
/* RPC is tracked by the context */
crp_ctx_tracked:1,
/* 1 if RPC fails HLC epsilon check */
crp_fail_hlc:1,
/* RPC completed flag */
crp_completed:1,
/* RPC originated from a primary provider */
crp_src_is_primary:1;
uint32_t crp_srv : 1, /* flag of server received request */
crp_output_got : 1, crp_input_got : 1,
/* flag of collective RPC request */
crp_coll : 1,
/* flag of crp_tgt_uri need to be freed */
crp_uri_free : 1,
/* flag of forwarded rpc for corpc */
crp_forward : 1,
/* flag of in timeout binheap */
crp_in_binheap : 1,
/* set if a call to crt_req_reply pending */
crp_reply_pending : 1,
/* set to 1 if target ep is set */
crp_have_ep : 1,
/* RPC is tracked by the context */
crp_ctx_tracked : 1,
/* 1 if RPC fails HLC epsilon check */
crp_fail_hlc : 1,
/* RPC completed flag */
crp_completed : 1,
/* RPC originated from a primary provider */
crp_src_is_primary : 1,
/* release input buffer early */
crp_release_input_early : 1;

struct crt_opc_info *crp_opc_info;
/* corpc info, only valid when (crp_coll == 1) */
2 changes: 1 addition & 1 deletion src/chk/chk_common.c
@@ -403,7 +403,7 @@ chk_pool_restart_svc(struct chk_pool_rec *cpr)
if (cpr->cpr_started)
chk_pool_shutdown(cpr, true);

rc = ds_pool_start_after_check(cpr->cpr_uuid);
rc = ds_pool_start_after_check(cpr->cpr_uuid, cpr->cpr_immutable);
if (rc != 0) {
D_WARN("Cannot start full PS for "DF_UUIDF" after CR check: "DF_RC"\n",
DP_UUID(cpr->cpr_uuid), DP_RC(rc));
13 changes: 8 additions & 5 deletions src/chk/chk_engine.c
@@ -1797,10 +1797,8 @@ chk_engine_pool_ult(void *args)
}

rc = chk_engine_cont_cleanup(cpr, svc, &aggregator);
if (rc != 0)
goto out;

rc = ds_pool_svc_schedule_reconf(svc);
if (rc == 0 && !cpr->cpr_immutable)
rc = ds_pool_svc_schedule_reconf(svc);

out:
chk_engine_cont_list_fini(&aggregator);
@@ -2113,6 +2111,11 @@ chk_engine_start_post(struct chk_instance *ins)
if (pool_cbk->cb_phase == CHK__CHECK_SCAN_PHASE__CSP_DONE)
continue;

if (ins->ci_prop.cp_flags & CHK__CHECK_FLAG__CF_DRYRUN)
cpr->cpr_immutable = 1;
else
cpr->cpr_immutable = 0;

if (phase > pool_cbk->cb_phase)
phase = pool_cbk->cb_phase;

@@ -2950,7 +2953,7 @@ chk_engine_pool_start(uint64_t gen, uuid_t uuid, uint32_t phase, uint32_t flags)
cbk = &cpr->cpr_bk;
chk_pool_get(cpr);

rc = ds_pool_start(uuid, false);
rc = ds_pool_start(uuid, false, cpr->cpr_immutable);
if (rc != 0)
D_GOTO(put, rc = (rc == -DER_NONEXIST ? 1 : rc));
