Doc: resolve GL08 for pandas.core.groupby.SeriesGroupBy.value_counts #57609

Merged
merged 4 commits into from
Feb 26, 2024
1 change: 0 additions & 1 deletion ci/code_checks.sh
@@ -179,7 +179,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
pandas.Timestamp.tzinfo\
pandas.Timestamp.value\
pandas.Timestamp.year\
-pandas.core.groupby.SeriesGroupBy.value_counts\
pandas.tseries.offsets.BQuarterBegin.is_anchored\
pandas.tseries.offsets.BQuarterBegin.is_on_offset\
pandas.tseries.offsets.BQuarterBegin.n\
79 changes: 79 additions & 0 deletions pandas/core/groupby/generic.py
@@ -801,6 +801,85 @@ def value_counts(
bins=None,
dropna: bool = True,
) -> Series | DataFrame:
"""
Return a Series or DataFrame containing counts of unique rows.

.. versionadded:: 1.4.0

Parameters
----------
normalize : bool, default False
Return proportions rather than frequencies.
sort : bool, default True
Sort by frequencies.
ascending : bool, default False
Sort in ascending order.
bins : int or list of ints, optional
Rather than count values, group them into half-open bins,
a convenience for pd.cut, only works with numeric data.
dropna : bool, default True
Don't include counts of rows that contain NA values.

Returns
-------
Series or DataFrame
Series if the groupby as_index is True, otherwise DataFrame.

See Also
--------
Series.value_counts: Equivalent method on Series.
DataFrame.value_counts: Equivalent method on DataFrame.
DataFrameGroupBy.value_counts: Equivalent method on DataFrameGroupBy.

Notes
-----
- If the groupby as_index is True then the returned Series will have a
MultiIndex with one level per input column.
- If the groupby as_index is False then the returned DataFrame will have an
additional column with the value_counts. The column is labelled 'count' or
'proportion', depending on the ``normalize`` parameter.

By default, rows that contain any NA values are omitted from
the result.

By default, the result will be in descending order so that the
first element of each group is the most frequently-occurring row.

Examples
--------
>>> s = pd.Series(
... [1, 1, 2, 3, 2, 3, 3, 1, 1, 3, 3, 3],
... index=["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
... )
>>> s
A 1
A 1
A 2
A 3
A 2
A 3
B 3
B 1
B 1
B 3
B 3
B 3
dtype: int64
>>> g1 = s.groupby(s.index)
>>> g1.value_counts(bins=2)
A (0.997, 2.0] 4
(2.0, 3.0] 2
B (2.0, 3.0] 4
(0.997, 2.0] 2
Name: count, dtype: int64
>>> g1.value_counts(normalize=True)
A 1 0.333333
2 0.333333
3 0.333333
B 3 0.666667
1 0.333333
Name: proportion, dtype: float64
"""
name = "proportion" if normalize else "count"

if bins is None:
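
For reviewers who want to check the documented behaviour outside the doctest run, here is a minimal sketch that reuses the docstring's example data. It assumes a pandas installation with the default `as_index=True` grouping; the variable names (`grouped`, `counts`, `proportions`, `binned`, `per_group_total`) are illustrative and not part of the PR.

```python
import pandas as pd

# Same data as the docstring example: twelve values labelled "A" or "B".
s = pd.Series(
    [1, 1, 2, 3, 2, 3, 3, 1, 1, 3, 3, 3],
    index=["A"] * 6 + ["B"] * 6,
)
grouped = s.groupby(s.index)

# Plain counts: one row per (group, value) pair, most frequent first
# within each group.
counts = grouped.value_counts()

# normalize=True returns within-group proportions instead of counts.
proportions = grouped.value_counts(normalize=True)

# bins=2 groups the values into two half-open intervals before counting.
binned = grouped.value_counts(bins=2)

# Proportions are computed per group, so each group should sum to 1.
per_group_total = proportions.groupby(level=0).sum()

print(counts)
print(per_group_total)
print(binned)
```

The result is indexed by the group key plus the counted value (or interval), which is why `groupby(level=0)` recovers the per-group totals.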