[FEA] Add third-party library integration testing of cudf.pandas to cudf #16645

Merged
Changes from 4 commits
33 commits
4fddb30
Move tests and ci files to cudf
Matt711 Aug 23, 2024
c2c88a9
Add missing ci
Matt711 Aug 23, 2024
88789eb
Combine jobs
Matt711 Aug 23, 2024
9c76ec3
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
galipremsagar Aug 23, 2024
f3cccea
Address review: mv nightly.yml to pr.yml and test.yml
Matt711 Aug 24, 2024
a36fa9d
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 24, 2024
8bd1378
add job to pr bnuilder
Matt711 Aug 24, 2024
0d0d268
Merge branch 'feat/cudf-pandas-integration-tests' of github.com:Matt7…
Matt711 Aug 24, 2024
d2a6fc8
preprocess test names
Matt711 Aug 26, 2024
de531e2
Add --config to rdfg
Matt711 Aug 26, 2024
aba7509
Change --config arg in rdfg
Matt711 Aug 26, 2024
555ffd6
continue --output on next line
Matt711 Aug 26, 2024
72c806a
Point to ci script
Matt711 Aug 26, 2024
54e01ff
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 27, 2024
e11bff6
preprend pythonpath to pytest
Matt711 Aug 27, 2024
8443f55
set the test_dir
Matt711 Aug 27, 2024
8ffbc2f
xfail pytorch test and move integration tests out of cudf_pandas_tests
Matt711 Aug 27, 2024
ef773c5
mrefactor
Matt711 Aug 27, 2024
57372d4
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 27, 2024
a92ef3f
change job name to match pr.ymal
Matt711 Aug 27, 2024
7245f76
Update pr.yaml
Matt711 Aug 27, 2024
9a045b6
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 28, 2024
9855a0c
merge extract_lib.sh and test.sh
Matt711 Aug 28, 2024
89ac9e0
Merge branch 'feat/cudf-pandas-integration-tests' of github.com:Matt7…
Matt711 Aug 28, 2024
560eb82
chmod test.sh
Matt711 Aug 28, 2024
609313c
remove extract_lib.sh
Matt711 Aug 28, 2024
0922bfe
default to 11.8 and 12.5
Matt711 Aug 28, 2024
9561ea2
remove some pytest flags
Matt711 Aug 28, 2024
495f66f
Merge branch 'branch-24.10' into feat/cudf-pandas-integration-tests
Matt711 Aug 28, 2024
ef3dc72
remove test keys
Matt711 Aug 29, 2024
7792e2b
Merge branch 'branch-24.10' of github.com:rapidsai/cudf into feat/cud…
Matt711 Aug 29, 2024
9c10dae
address review
Matt711 Aug 29, 2024
1afea54
remove job completeyly from pr.yaml
Matt711 Aug 29, 2024
44 changes: 44 additions & 0 deletions .github/workflows/nightly.yaml
@@ -0,0 +1,44 @@
# Copyright (c) 2023-2024, NVIDIA CORPORATION.
name: cudf-pandas-integration test on default branch (nightly / manually)

on:
  workflow_dispatch:
    # The inputs below exist to align with the rest of the RAPIDS nightly pipeline. They are currently unused.
    inputs:
      branch:
        required: true
        type: string
      date:
        required: true
        type: string
      sha:
        required: true
        type: string

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4

      - name: Extract libraries from dependencies.yaml
        id: extractlib
        run: |
          LIBS=$(python/cudf/cudf_pandas_tests/third_party_integration_tests/ci/extract_lib.sh python/cudf/cudf_pandas_tests/third_party_integration_tests/dependencies.yaml)
          echo "LIBS=${LIBS}" >> $GITHUB_ENV

      - name: Run integration tests
        run: |
          for lib in ${{ env.LIBS }}; do
            echo "Running tests for $lib"
            CUDA_MAJOR=$(if [ "$lib" = "tensorflow" ]; then echo "11"; else echo "12"; fi)
            python/cudf/cudf_pandas_tests/third_party_integration_tests/ci/test.sh $lib $CUDA_MAJOR
          done
        env:
          LIBS: ${{ env.LIBS }}
    secrets: inherit
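The per-library CUDA selection in the `Run integration tests` step can be tried locally without GitHub Actions. The following is a minimal sketch, assuming a hypothetical helper `cuda_major_for_lib` (not part of this PR) and a whitespace-separated library list. `libA` and `libB` are placeholder names, and the loop only prints what it would run.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): mirrors the inline CUDA_MAJOR logic
# from the workflow step. tensorflow maps to CUDA 11, everything else to 12.
cuda_major_for_lib() {
    local lib=$1
    if [ "$lib" = "tensorflow" ]; then
        echo "11"
    else
        echo "12"
    fi
}

# Placeholder list; in the workflow this value comes from extract_lib.sh.
LIBS="tensorflow libA libB"
for lib in $LIBS; do
    # Dry run: print the command instead of invoking test.sh.
    echo "Would run: test.sh $lib $(cuda_major_for_lib "$lib")"
done
```

Note that the loop relies on word splitting, so it expects a plain space-separated list rather than a JSON array.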
@@ -0,0 +1,65 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES.
# All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# Remove result pickles left behind by the test runs.
cleanup() {
    rm -f tests/results-*.pickle
}

trap cleanup EXIT

# Run the tests without the cudf.pandas plugin to produce the reference ("gold") results.
runtest_gold() {
    local lib=$1
    local test_keys=("${@:2}")

    pytest \
        -v \
        --continue-on-collection-errors \
        --cache-clear \
        --junitxml="${RAPIDS_TESTS_DIR}/junit-${lib}-gold.xml" \
        --numprocesses="${NUM_PROCESSES}" \
        --dist=worksteal \
        "${TEST_DIR}"/test_"${lib}"*.py \
        "${test_keys[@]}"
}

# Run the same tests with the cudf.pandas plugin loaded.
runtest_cudf_pandas() {
    local lib=$1
    local test_keys=("${@:2}")

    pytest \
        -p cudf.pandas \
        -v \
        --continue-on-collection-errors \
        --cache-clear \
        --junitxml="${RAPIDS_TESTS_DIR}/junit-${lib}-cudf-pandas.xml" \
        --numprocesses="${NUM_PROCESSES}" \
        --dist=worksteal \
        "${TEST_DIR}"/test_"${lib}"*.py \
        "${test_keys[@]}"
}

main() {
    local lib=$1
    local test_keys=("${@:2}")

    # generation phase
    runtest_gold "${lib}" "${test_keys[@]}"
    runtest_cudf_pandas "${lib}" "${test_keys[@]}"

    # assertion phase
    pytest \
        --compare \
        -p cudf.pandas \
        -v \
        --continue-on-collection-errors \
        --cache-clear \
        --junitxml="${RAPIDS_TESTS_DIR}/junit-${lib}-assertion.xml" \
        --numprocesses="${NUM_PROCESSES}" \
        --dist=worksteal \
        "${TEST_DIR}"/test_"${lib}"*.py \
        "${test_keys[@]}"
}

main "$@"
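The script above runs pytest three times per library: once without the plugin, once with `-p cudf.pandas`, and once more with `--compare`. The sketch below shows how the three invocations differ, using a hypothetical dry-run helper `print_phase_command` (not part of this PR) that prints a trimmed command line instead of running pytest. The `--compare` flag is copied as-is from the script; it is presumably a custom option registered by the test suite, which is an assumption here.

```shell
#!/bin/bash
# Hypothetical dry-run helper (not in the PR): prints a shortened pytest
# command for one phase so the differences between phases are easy to see.
print_phase_command() {
    local phase=$1 lib=$2
    local args=()
    case "$phase" in
        gold)        ;;                                    # plugin not loaded
        cudf-pandas) args+=(-p cudf.pandas) ;;             # plugin loaded
        assertion)   args+=(--compare -p cudf.pandas) ;;   # compare the two result sets
        *) echo "unknown phase: $phase" >&2; return 1 ;;
    esac
    args+=("--junitxml=junit-${lib}-${phase}.xml" "test_${lib}*.py")
    echo "pytest ${args[*]}"
}

# libA is a placeholder library name.
for phase in gold cudf-pandas assertion; do
    print_phase_command "$phase" libA
done
```

Each phase writes its own JUnit XML file, so results from the three runs do not overwrite each other.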
@@ -0,0 +1,28 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.

set -euo pipefail

write_output() {
    local key="$1"
    local value="$2"
    echo "$key=$value" | tee --append "${GITHUB_OUTPUT:-/dev/null}"
}

extract_lib_from_dependencies_yaml() {
    local file=$1
    # Parse the keys under the "files" section of dependencies.yaml,
    # keep those that start with "test_", and strip that prefix.
    local extracted_libs
    extracted_libs="$(yq -o json "$file" | jq -rc '.files | with_entries( select(.key | startswith("test_")) ) | keys | map(sub("^test_"; ""))')"
    echo "$extracted_libs"
    write_output "LIBS" "$extracted_libs"
}


main() {
    local dependencies_yaml="$1"
    extract_lib_from_dependencies_yaml "$dependencies_yaml"
}

main "$@"
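The key selection performed by the `jq` filter (keep keys with the `test_` prefix, then strip it) can be illustrated without `yq` or `jq`. The following is a rough pure-bash stand-in, assuming a hypothetical helper `strip_test_prefix` (not part of this PR) that reads one key per line on stdin. The key names are placeholders, and the output is one name per line rather than the JSON array the real script emits.

```shell
#!/bin/bash
# Hypothetical stand-in for the jq filter (not in the PR): reads one key per
# line, keeps keys that start with "test_", and prints them without the prefix.
strip_test_prefix() {
    local key
    while IFS= read -r key; do
        case "$key" in
            test_*) echo "${key#test_}" ;;   # drop the leading "test_"
        esac
    done
}

# Placeholder keys standing in for the "files" section of dependencies.yaml.
printf '%s\n' all test_libA test_libB checks | strip_test_prefix
```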
@@ -0,0 +1,55 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.

# Common setup steps shared by Python test jobs

LIB=$1

set -euo pipefail

. /opt/conda/etc/profile.d/conda.sh

rapids-logger "Generate Python testing dependencies"
rapids-dependency-file-generator \
    --output conda \
    --file-key test_${LIB} \
    --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml

rapids-mamba-retry env create --yes -f env.yaml -n test

# Temporarily allow unbound variables for conda activation.
set +u
conda activate test
set -u

RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}
mkdir -p "${RAPIDS_TESTS_DIR}"

repo_root=$(git rev-parse --show-toplevel)
TEST_DIR=${repo_root}/tests

rapids-print-env

rapids-logger "Check GPU usage"
nvidia-smi

EXITCODE=0
trap "EXITCODE=1" ERR
set +e

rapids-logger "pytest ${LIB}"

NUM_PROCESSES=8
serial_libraries=(
    "tensorflow"
)
for serial_library in "${serial_libraries[@]}"; do
    if [ "${LIB}" = "${serial_library}" ]; then
        NUM_PROCESSES=1
    fi
done

RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR} TEST_DIR=${TEST_DIR} NUM_PROCESSES=${NUM_PROCESSES} ci/ci_run_library_tests.sh ${LIB}

rapids-logger "Test script exiting with value: ${EXITCODE}"
exit ${EXITCODE}
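Two small pieces of the script above are easy to check in isolation: the worker-count selection for serial libraries, and the `${RAPIDS_CUDA_VERSION%.*}` expansion used to build the `--matrix` argument. The sketch below assumes a hypothetical helper `num_processes_for_lib` (not part of this PR) and an example version string chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): same selection as the loop in the
# script. Libraries listed as serial get 1 worker, all others get 8.
num_processes_for_lib() {
    local lib=$1
    local serial_libraries=("tensorflow")
    local serial_library
    for serial_library in "${serial_libraries[@]}"; do
        if [ "$lib" = "$serial_library" ]; then
            echo 1
            return
        fi
    done
    echo 8
}

# ${var%.*} removes the shortest trailing ".something" from the value.
example_version="12.5.1"          # example value only
echo "cuda=${example_version%.*}"                         # prints cuda=12.5
echo "tensorflow -> $(num_processes_for_lib tensorflow)"  # prints tensorflow -> 1
echo "libA -> $(num_processes_for_lib libA)"              # prints libA -> 8
```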