merge conflict

Matt711 committed Aug 28, 2024
2 parents 2761f76 + 925530a commit c794216
Showing 48 changed files with 713 additions and 373 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -178,3 +178,7 @@ jupyter_execute
# clang tooling
compile_commands.json
.clangd/

# pytest artifacts
rmm_log.txt
python/cudf/cudf_pandas_tests/data/rmm_log.txt
3 changes: 3 additions & 0 deletions ci/cudf_pandas_scripts/run_tests.sh
@@ -61,6 +61,9 @@ else
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"
fi

python -m pip install ipykernel
python -m ipykernel install --user --name python3

# We're ignoring third-party library tests because they are run nightly in a separate CI job
python -m pytest -p cudf.pandas \
--ignore=./python/cudf/cudf_pandas_tests/third_party_integration_tests/ \
5 changes: 5 additions & 0 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -37,6 +37,7 @@ dependencies:
- hypothesis
- identify>=2.5.20
- ipython
- jupyter_client
- libcufile-dev=1.4.0.31
- libcufile=1.4.0.31
- libcurand-dev=10.3.0.86
@@ -48,6 +49,8 @@ dependencies:
- moto>=4.0.8
- msgpack-python
- myst-nb
- nbconvert
- nbformat
- nbsphinx
- ninja
- notebook
@@ -57,12 +60,14 @@ dependencies:
- nvcc_linux-64=11.8
- nvcomp==3.0.6
- nvtx>=0.2.1
- openpyxl
- packaging
- pandas
- pandas>=2.0,<2.2.3dev0
- pandoc
- pre-commit
- ptxcompiler
- pyarrow>=14.0.0,<18.0.0a0
- pydata-sphinx-theme!=0.14.2
- pytest-benchmark
- pytest-cases>=3.8.2
5 changes: 5 additions & 0 deletions conda/environments/all_cuda-125_arch-x86_64.yaml
@@ -38,6 +38,7 @@ dependencies:
- hypothesis
- identify>=2.5.20
- ipython
- jupyter_client
- libcufile-dev
- libcurand-dev
- libkvikio==24.10.*,>=0.0.0a0
@@ -47,6 +48,8 @@ dependencies:
- moto>=4.0.8
- msgpack-python
- myst-nb
- nbconvert
- nbformat
- nbsphinx
- ninja
- notebook
@@ -55,11 +58,13 @@ dependencies:
- numpydoc
- nvcomp==3.0.6
- nvtx>=0.2.1
- openpyxl
- packaging
- pandas
- pandas>=2.0,<2.2.3dev0
- pandoc
- pre-commit
- pyarrow>=14.0.0,<18.0.0a0
- pydata-sphinx-theme!=0.14.2
- pynvjitlink>=0.0.0a0
- pytest-benchmark
2 changes: 1 addition & 1 deletion conda/recipes/cudf/meta.yaml
@@ -82,7 +82,7 @@ requirements:
- cupy >=12.0.0
- numba >=0.57
- numpy >=1.23,<3.0a0
- - pyarrow ==16.1.0.*
+ - pyarrow>=14.0.0,<18.0.0a0
- libcudf ={{ version }}
- pylibcudf ={{ version }}
- {{ pin_compatible('rmm', max_pin='x.x') }}
2 changes: 1 addition & 1 deletion conda/recipes/pylibcudf/meta.yaml
@@ -79,7 +79,7 @@ requirements:
- typing_extensions >=4.0.0
- pandas >=2.0,<2.2.3dev0
- numpy >=1.23,<3.0a0
- - pyarrow ==16.1.0.*
+ - pyarrow>=14.0.0,<18.0.0a0
- {{ pin_compatible('rmm', max_pin='x.x') }}
- fsspec >=0.6.0
{% if cuda_major == "11" %}
20 changes: 9 additions & 11 deletions dependencies.yaml
@@ -19,6 +19,7 @@ files:
- docs
- notebooks
- py_version
- pyarrow_run
- rapids_build_skbuild
- rapids_build_setuptools
- run_common
@@ -31,6 +32,7 @@ files:
- test_python_cudf
- test_python_dask_cudf
- test_python_pylibcudf
- test_python_cudf_pandas
test_static_build:
output: none
includes:
@@ -45,10 +47,10 @@ files:
includes:
- cuda_version
- py_version
- pyarrow_run
- test_python_common
- test_python_cudf
- test_python_dask_cudf
- test_python_cudf_pandas
test_java:
output: none
includes:
@@ -134,13 +136,6 @@ files:
- build_base
- build_cpp
- depends_on_librmm
py_run_libcudf:
output: pyproject
pyproject_dir: python/libcudf
extras:
table: project
includes:
- pyarrow_run
py_build_pylibcudf:
output: pyproject
pyproject_dir: python/pylibcudf
@@ -388,8 +383,7 @@ dependencies:
common:
- output_types: [conda, requirements, pyproject]
packages:
- # Allow runtime version to float up to patch version
- - pyarrow>=16.1.0,<16.2.0a0
+ - pyarrow>=14.0.0,<18.0.0a0
cuda_version:
specific:
- output_types: conda
@@ -934,9 +928,13 @@ dependencies:
# installation issues with `psycopg2`.
- pandas[test, pyarrow, performance, computation, fss, excel, parquet, feather, hdf5, spss, html, xml, plot, output-formatting, clipboard, compression]
- pytest-reportlog
- ipython
test_python_cudf_pandas:
common:
- output_types: [requirements, pyproject]
- output_types: [conda, requirements, pyproject]
packages:
- ipython
- jupyter_client
- nbconvert
- nbformat
- openpyxl
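
The loosened pin above replaces the exact `pyarrow ==16.1.0.*` requirement with the range `>=14.0.0,<18.0.0a0`. A rough stdlib sketch of that range check (the helper name is hypothetical, and this naive tuple comparison ignores PEP 440 pre-release semantics, which real resolvers handle via the `packaging` library):

```python
# Approximate check for the loosened pyarrow constraint ">=14.0.0,<18.0.0a0".
def satisfies_pyarrow_pin(version: str) -> bool:
    parts = tuple(int(p) for p in version.split("."))
    return (14, 0, 0) <= parts < (18, 0, 0)

print(satisfies_pyarrow_pin("16.1.0"))  # True  (the old exact pin)
print(satisfies_pyarrow_pin("17.0.0"))  # True  (newly allowed)
print(satisfies_pyarrow_pin("13.0.0"))  # False (below the floor)
```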
6 changes: 5 additions & 1 deletion docs/cudf/source/user_guide/10min.ipynb
@@ -15,7 +15,11 @@
"\n",
"[Dask](https://dask.org/) is a flexible library for parallel computing in Python that makes scaling out your workflow smooth and simple. On the CPU, Dask uses Pandas to execute operations in parallel on DataFrame partitions.\n",
"\n",
- "[Dask-cuDF](https://github.com/rapidsai/cudf/tree/main/python/dask_cudf) extends Dask where necessary to allow its DataFrame partitions to be processed using cuDF GPU DataFrames instead of Pandas DataFrames. For instance, when you call `dask_cudf.read_csv(...)`, your cluster's GPUs do the work of parsing the CSV file(s) by calling [`cudf.read_csv()`](https://docs.rapids.ai/api/cudf/stable/api_docs/api/cudf.read_csv.html).\n",
+ "[Dask cuDF](https://github.com/rapidsai/cudf/tree/main/python/dask_cudf) extends Dask where necessary to allow its DataFrame partitions to be processed using cuDF GPU DataFrames instead of Pandas DataFrames. For instance, when you call `dask_cudf.read_csv(...)`, your cluster's GPUs do the work of parsing the CSV file(s) by calling [`cudf.read_csv()`](https://docs.rapids.ai/api/cudf/stable/api_docs/api/cudf.read_csv.html).\n",
"\n",
"\n",
"> [!NOTE] \n",
"> This notebook uses the explicit Dask cuDF API (`dask_cudf`) for clarity. However, we strongly recommend that you use Dask's [configuration infrastructure](https://docs.dask.org/en/latest/configuration.html) to set the `\"dataframe.backend\"` to `\"cudf\"`, and work with the `dask.dataframe` API directly. Please see the [Dask cuDF documentation](https://github.com/rapidsai/cudf/tree/main/python/dask_cudf) for more information.\n",
"\n",
"\n",
"## When to use cuDF and Dask-cuDF\n",
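
The configuration-based approach recommended in that notebook note can be sketched as follows (a minimal sketch: setting the option does not itself require a GPU, and the snippet falls back quietly here if `dask` is not importable):

```python
# Select the cuDF backend for dask.dataframe via Dask's config system,
# instead of importing dask_cudf explicitly.
try:
    import dask

    dask.config.set({"dataframe.backend": "cudf"})
    backend = dask.config.get("dataframe.backend")
except ImportError:  # dask not installed in this environment
    backend = "cudf"

print(backend)
```

With this option set, calls like `dask.dataframe.read_csv(...)` would dispatch to cuDF-backed partitions on a GPU cluster.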
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -87,7 +87,9 @@ select = [
# non-pep585-annotation
"UP006",
# non-pep604-annotation
- "UP007"
+ "UP007",
+ # Import from `collections.abc` instead: `Callable`
+ "UP035",
]
ignore = [
# whitespace before :
3 changes: 2 additions & 1 deletion python/cudf/cudf/_typing.py
@@ -1,7 +1,8 @@
# Copyright (c) 2021-2024, NVIDIA CORPORATION.

import sys
- from typing import TYPE_CHECKING, Any, Callable, Dict, Iterable, TypeVar, Union
+ from collections.abc import Callable
+ from typing import TYPE_CHECKING, Any, Dict, Iterable, TypeVar, Union

import numpy as np
from pandas import Period, Timedelta, Timestamp
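
The import swap above follows PEP 585, which deprecates `typing.Callable` in favor of `collections.abc.Callable` (subscriptable in annotations since Python 3.9). A small sketch, with a hypothetical alias name:

```python
from collections.abc import Callable

# A callable type alias built from collections.abc rather than typing.
IntTransform = Callable[[int], int]

def apply(func: IntTransform, value: int) -> int:
    return func(value)

print(apply(abs, -5))  # 5
```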
13 changes: 0 additions & 13 deletions python/cudf/cudf/core/_internals/where.py
@@ -106,19 +106,6 @@ def _check_and_cast_columns_with_other(
return _normalize_categorical(source_col.astype(common_dtype), other)


def _make_categorical_like(result, column):
if isinstance(column, cudf.core.column.CategoricalColumn):
result = cudf.core.column.build_categorical_column(
categories=column.categories,
codes=result,
mask=result.base_mask,
size=result.size,
offset=result.offset,
ordered=column.ordered,
)
return result


def _can_cast(from_dtype, to_dtype):
"""
Utility function to determine if we can cast
1 change: 0 additions & 1 deletion python/cudf/cudf/core/column/__init__.py
@@ -8,7 +8,6 @@
from cudf.core.column.column import (
ColumnBase,
as_column,
build_categorical_column,
build_column,
column_empty,
column_empty_like,