[FEA] Add third-party library integration testing of cudf.pandas to cudf #16645

Merged
33 commits
4fddb30
Move tests and ci files to cudf
Matt711 Aug 23, 2024
c2c88a9
Add missing ci
Matt711 Aug 23, 2024
88789eb
Combine jobs
Matt711 Aug 23, 2024
9c76ec3
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
galipremsagar Aug 23, 2024
f3cccea
Address review: mv nightly.yml to pr.yml and test.yml
Matt711 Aug 24, 2024
a36fa9d
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 24, 2024
8bd1378
add job to pr bnuilder
Matt711 Aug 24, 2024
0d0d268
Merge branch 'feat/cudf-pandas-integration-tests' of github.com:Matt7…
Matt711 Aug 24, 2024
d2a6fc8
preprocess test names
Matt711 Aug 26, 2024
de531e2
Add --config to rdfg
Matt711 Aug 26, 2024
aba7509
Change --config arg in rdfg
Matt711 Aug 26, 2024
555ffd6
continue --output on next line
Matt711 Aug 26, 2024
72c806a
Point to ci script
Matt711 Aug 26, 2024
54e01ff
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 27, 2024
e11bff6
preprend pythonpath to pytest
Matt711 Aug 27, 2024
8443f55
set the test_dir
Matt711 Aug 27, 2024
8ffbc2f
xfail pytorch test and move integration tests out of cudf_pandas_tests
Matt711 Aug 27, 2024
ef773c5
mrefactor
Matt711 Aug 27, 2024
57372d4
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 27, 2024
a92ef3f
change job name to match pr.ymal
Matt711 Aug 27, 2024
7245f76
Update pr.yaml
Matt711 Aug 27, 2024
9a045b6
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 28, 2024
9855a0c
merge extract_lib.sh and test.sh
Matt711 Aug 28, 2024
89ac9e0
Merge branch 'feat/cudf-pandas-integration-tests' of github.com:Matt7…
Matt711 Aug 28, 2024
560eb82
chmod test.sh
Matt711 Aug 28, 2024
609313c
remove extract_lib.sh
Matt711 Aug 28, 2024
0922bfe
default to 11.8 and 12.5
Matt711 Aug 28, 2024
9561ea2
remove some pytest flags
Matt711 Aug 28, 2024
495f66f
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 28, 2024
ef3dc72
remove test keys
Matt711 Aug 29, 2024
7792e2b
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 29, 2024
9c10dae
address review
Matt711 Aug 29, 2024
1afea54
remove job completeyly from pr.yaml
Matt711 Aug 29, 2024
11 changes: 11 additions & 0 deletions .github/workflows/test.yaml
@@ -124,3 +124,14 @@ jobs:
      date: ${{ inputs.date }}
      sha: ${{ inputs.sha }}
      script: ci/cudf_pandas_scripts/run_tests.sh
  third-party-integration-tests-cudf-pandas:
    secrets: inherit
    uses: rapidsai/shared-workflows/.github/workflows/[email protected]
    with:
      build_type: nightly
      branch: ${{ inputs.branch }}
      date: ${{ inputs.date }}
      sha: ${{ inputs.sha }}
      container_image: "rapidsai/ci-conda:latest"
      run_script: |
        ci/cudf_pandas_scripts/third-party-integration/test.sh python/cudf/cudf_pandas_tests/third_party_integration_tests/dependencies.yaml
3 changes: 3 additions & 0 deletions ci/cudf_pandas_scripts/run_tests.sh
@@ -64,7 +64,9 @@ fi
python -m pip install ipykernel
python -m ipykernel install --user --name python3

# The third-party integration tests are ignored here because they run nightly in a separate CI job
python -m pytest -p cudf.pandas \
--ignore=./python/cudf/cudf_pandas_tests/third_party_integration_tests/ \
--cov-config=./python/cudf/.coveragerc \
--cov=cudf \
--cov-report=xml:"${RAPIDS_COVERAGE_DIR}/cudf-pandas-coverage.xml" \
@@ -80,6 +82,7 @@ for version in "${versions[@]}"; do
echo "Installing pandas version: ${version}"
python -m pip install "numpy>=1.23,<2.0a0" "pandas==${version}"
python -m pytest -p cudf.pandas \
--ignore=./python/cudf/cudf_pandas_tests/third_party_integration_tests/ \
--cov-config=./python/cudf/.coveragerc \
--cov=cudf \
--cov-report=xml:"${RAPIDS_COVERAGE_DIR}/cudf-pandas-coverage.xml" \
56 changes: 56 additions & 0 deletions ci/cudf_pandas_scripts/third-party-integration/ci_run_library_tests.sh
@@ -0,0 +1,56 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES.
# All rights reserved.
# SPDX-License-Identifier: Apache-2.0

cleanup() {
rm -f "${TEST_DIR}"/results-*.pickle
}

trap cleanup EXIT

runtest_gold() {
local lib=$1

pytest \
-v \
--continue-on-collection-errors \
--cache-clear \
--numprocesses=${NUM_PROCESSES} \
--dist=worksteal \
${TEST_DIR}/test_${lib}*.py
}

runtest_cudf_pandas() {
local lib=$1

pytest \
-p cudf.pandas \
-v \
--continue-on-collection-errors \
--cache-clear \
--numprocesses=${NUM_PROCESSES} \
--dist=worksteal \
${TEST_DIR}/test_${lib}*.py
}

main() {
local lib=$1

# generation phase
runtest_gold ${lib}
runtest_cudf_pandas ${lib}

# assertion phase
pytest \
--compare \
-p cudf.pandas \
-v \
--continue-on-collection-errors \
--cache-clear \
--numprocesses=${NUM_PROCESSES} \
--dist=worksteal \
${TEST_DIR}/test_${lib}*.py
}

main "$@"
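
The three pytest invocations in this script differ only in their leading flags: the gold run uses none, the cudf.pandas run adds `-p cudf.pandas`, and the assertion phase adds `--compare -p cudf.pandas`. A minimal sketch of how that difference could be isolated in one place follows. The helper name `pytest_flags_for_phase` and the phase labels are hypothetical and not part of this PR; `--compare` is copied from the script as-is, and the fallback of one worker when `NUM_PROCESSES` is unset is an assumption.

```shell
#!/bin/bash
# Sketch only: pytest_flags_for_phase is a hypothetical helper, not part of the PR.
# It prints one pytest flag per line for the requested phase.
pytest_flags_for_phase() {
    local phase=$1
    local flags=()
    case "${phase}" in
        gold) ;;                                       # run without the cudf.pandas plugin
        cudf_pandas) flags+=(-p cudf.pandas) ;;        # same tests with the plugin loaded
        compare) flags+=(--compare -p cudf.pandas) ;;  # assertion phase
        *) echo "unknown phase: ${phase}" >&2; return 1 ;;
    esac
    # Flags shared by all three invocations in the script above.
    flags+=(
        -v
        --continue-on-collection-errors
        --cache-clear
        "--numprocesses=${NUM_PROCESSES:-1}"   # assumed fallback of 1 when unset
        --dist=worksteal
    )
    printf '%s\n' "${flags[@]}"
}

# Possible use inside main (mapfile needs bash 4 or newer):
#   mapfile -t flags < <(pytest_flags_for_phase compare)
#   pytest "${flags[@]}" "${TEST_DIR}"/test_"${lib}"*.py
```

Keeping the shared flags in a single array means a later change, such as dropping `--cache-clear`, would touch one line rather than three.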
83 changes: 83 additions & 0 deletions ci/cudf_pandas_scripts/third-party-integration/test.sh
@@ -0,0 +1,83 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.

# Common setup steps shared by Python test jobs

set -euo pipefail

write_output() {
local key="$1"
local value="$2"
echo "$key=$value" | tee --append "${GITHUB_OUTPUT:-/dev/null}"
}

extract_lib_from_dependencies_yaml() {
local file=$1
# Parse all keys in dependencies.yaml under the "files" section,
# extract all the keys that start with "test_", and extract the rest
local extracted_libs="$(yq -o json "$file" | jq -rc '.files | with_entries(select(.key | startswith("test_"))) | keys | map(sub("^test_"; ""))')"
echo "$extracted_libs"
}

main() {
local dependencies_yaml="$1"

LIBS=$(extract_lib_from_dependencies_yaml "$dependencies_yaml")
LIBS=${LIBS#[}
LIBS=${LIBS%]}

for lib in ${LIBS//,/ }; do
lib=$(echo "$lib" | tr -d '"')
echo "Running tests for library $lib"

CUDA_MAJOR=$(if [ "$lib" = "tensorflow" ]; then echo "11"; else echo "12"; fi)

. /opt/conda/etc/profile.d/conda.sh

rapids-logger "Generate Python testing dependencies"
rapids-dependency-file-generator \
--config "$dependencies_yaml" \
--output conda \
--file-key test_${lib} \
--matrix "cuda=${CUDA_MAJOR};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml

rapids-mamba-retry env create --yes -f env.yaml -n test

# Temporarily allow unbound variables for conda activation.
set +u
conda activate test
set -u

repo_root=$(git rev-parse --show-toplevel)
TEST_DIR=${repo_root}/python/cudf/cudf_pandas_tests/third_party_integration_tests/tests

rapids-print-env

rapids-logger "Check GPU usage"
nvidia-smi

EXITCODE=${EXITCODE:-0}  # do not reset a failure recorded for an earlier library
trap "EXITCODE=1" ERR
set +e

rapids-logger "pytest ${lib}"

NUM_PROCESSES=8
serial_libraries=(
"tensorflow"
)
for serial_library in "${serial_libraries[@]}"; do
if [ "${lib}" = "${serial_library}" ]; then
NUM_PROCESSES=1
fi
done

TEST_DIR=${TEST_DIR} NUM_PROCESSES=${NUM_PROCESSES} ci/cudf_pandas_scripts/third-party-integration/ci_run_library_tests.sh ${lib}

rapids-logger "Test script exiting with value: ${EXITCODE}"
done

exit ${EXITCODE}
}

main "$@"
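
The loop in test.sh makes two per-library choices (CUDA 11 and a single pytest worker for tensorflow, CUDA 12 and eight workers otherwise) and reports one exit code for the whole run. A minimal sketch of those two ideas in isolation follows, assuming nothing beyond bash itself. The function names are hypothetical and not part of this PR, and `run_library_tests` is a stand-in for the real call to ci_run_library_tests.sh.

```shell
#!/bin/bash
# Sketch only: all function names here are hypothetical, not part of the PR.

# CUDA major version to request for a library's test environment.
cuda_major_for() {
    case "$1" in
        tensorflow) echo "11" ;;
        *) echo "12" ;;
    esac
}

# Number of pytest-xdist workers; libraries listed here run serially.
num_processes_for() {
    case "$1" in
        tensorflow) echo "1" ;;
        *) echo "8" ;;
    esac
}

# Stand-in for the real per-library test command.
# For demonstration, only the made-up name "failing_lib" fails.
run_library_tests() {
    [ "$1" != "failing_lib" ]
}

# Run every library and return non-zero if any of them failed.
run_all() {
    local status=0   # initialised once, before the loop
    local lib
    for lib in "$@"; do
        echo "lib=${lib} cuda=$(cuda_major_for "${lib}") workers=$(num_processes_for "${lib}")"
        if ! run_library_tests "${lib}"; then
            echo "tests failed for ${lib}" >&2
            status=1   # remembered even if later libraries pass
        fi
    done
    return "${status}"
}
```

Calling `run_all failing_lib numpy` returns 1 even though the last library passes, because the status variable is only ever raised inside the loop and never reset.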