Dynamic Humanize and describe_multi Bug Fix #997

Open
wants to merge 7 commits into
base: master
Changes from 4 commits
23 changes: 21 additions & 2 deletions arrow/arrow.py
@@ -1122,6 +1122,7 @@ def humanize(
         locale: str = DEFAULT_LOCALE,
         only_distance: bool = False,
         granularity: Union[_GRANULARITY, List[_GRANULARITY]] = "auto",
+        dynamic: bool = False,

Is this False by default to avoid a breaking change? That's understandable, but also disappointing since to me, the dynamic behaviour seems much more useful than outputting a bunch of zeros for units.

Member Author

@MarkKoz, yes, we left it as False by default to avoid drastically changing the behaviour of humanize. I agree, however, that True would make more sense as the default. @jadchaar @krisfremen @systemcatch what are your thoughts?

At the least, I hope changing this can be considered for the next major release (assuming you're following SemVer).

Unrelated: dynamic isn't a good name — it's vague and non-self-descriptive. omit_zeros or something similar would be clearer.

Member

I think we should leave it as False for the time being, emit a warning about the upcoming behaviour change, and switch the default to True after a few versions.
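
A minimal sketch of how such a transition could look. It assumes a hypothetical helper `resolve_dynamic` (not part of arrow) and a `None` sentinel to detect callers who did not pass the flag explicitly:

```python
import warnings
from typing import Optional


def resolve_dynamic(dynamic: Optional[bool] = None) -> bool:
    """Hypothetical helper: resolve the flag while its default is in transition."""
    if dynamic is None:
        # The caller relied on the default, so announce the planned change.
        warnings.warn(
            "The default of 'dynamic' is planned to change from False to True; "
            "pass it explicitly to keep the current behaviour.",
            FutureWarning,
            stacklevel=2,
        )
        return False  # the current default stays in effect for now
    return dynamic
```

Callers who pass `dynamic=True` or `dynamic=False` explicitly would see no warning; only code that relies on the default is notified.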

That sounds reasonable. Any thoughts on my name suggestion?

Collaborator

Throwing out a few ideas for the name: only_natural, minimal, drop_zeros. omit_zeros is fine as well.

Member Author

I think omit_zeros is probably the best name

Member

Personally, I think the name dynamic is fine. IMO it indicates that humanize will dynamically use whichever of the specified granularity fields apply as the time progresses or is shifted.

omit_zeros is a nice alternative name, but I would prefer dynamic.

     ) -> str:
         """Returns a localized, humanized representation of a relative difference in time.

@@ -1264,7 +1265,11 @@ def gather_timeframes(_delta: float, _frame: TimeFrameLiteral) -> float:
                     if _frame in granularity:
                         value = sign * _delta / self._SECS_MAP[_frame]
                         _delta %= self._SECS_MAP[_frame]
-                        if trunc(abs(value)) != 1:
+
+                        # If user chooses dynamic and the display value is 0 don't subtract
+                        if dynamic and trunc(abs(value)) == 0:
+                            pass
+                        elif trunc(abs(value)) != 1:
Comment on lines +1282 to +1284
@MarkKoz, Aug 8, 2021

Would it make any significant difference to save the value of trunc(abs(value)) rather than calculating it twice? This could also be said for the other parts of the diff that use trunc.

Member

Decent catch.

Profiling the code, you'd need to run about 10k of the trunc(abs()) calls to even come close to seeing a 1ms difference.
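
For reference, a sketch of what caching the truncated value could look like. `gather` and `SECS_MAP` here are simplified stand-ins (assumptions for illustration), not the code in this diff:

```python
from math import trunc
from typing import List, Tuple

# Assumed unit sizes in seconds, for illustration only.
SECS_MAP = {"hour": 3600, "minute": 60, "second": 1}


def gather(
    delta: float, frame: str, dynamic: bool, timeframes: List[Tuple[str, float]]
) -> float:
    """Record one unit in `timeframes` and return the remaining seconds."""
    value = delta / SECS_MAP[frame]
    remainder = delta % SECS_MAP[frame]
    display = trunc(abs(value))  # computed once, reused in both checks below

    if dynamic and display == 0:
        return remainder  # zero-valued unit is omitted
    name = frame if display == 1 else frame + "s"
    timeframes.append((name, value))
    return remainder


frames: List[Tuple[str, float]] = []
rest = 3630.0
for unit in ("hour", "minute", "second"):
    rest = gather(rest, unit, True, frames)

print([name for name, _ in frames])  # ['hour', 'seconds']
```

As noted above, the gain is readability rather than speed.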

                             timeframes.append(
                                 (cast(TimeFrameLiteral, _frame + "s"), value)
                             )
@@ -1273,6 +1278,7 @@ def gather_timeframes(_delta: float, _frame: TimeFrameLiteral) -> float:
                     return _delta

                 delta = float(delta_second)
+
                 frames: Tuple[TimeFrameLiteral, ...] = (
                     "year",
                     "month",
@@ -1285,12 +1291,25 @@ def gather_timeframes(_delta: float, _frame: TimeFrameLiteral) -> float:
                 for frame in frames:
                     delta = gather_timeframes(delta, frame)

-                if len(timeframes) < len(granularity):
+                if len(timeframes) < len(granularity) and not dynamic:
                     raise ValueError(
                         "Invalid level of granularity. "
                         "Please select between 'second', 'minute', 'hour', 'day', 'week', 'month' or 'year'."
                     )

+                # Needed to see if there are no units output an error
+                if not timeframes and dynamic:
+                    raise ValueError(
+                        "All provided granulairty values produced an output of zero. "
+                        "Consider using smaller granularities, or set the dynamic flag to False. "
+                    )
anishnya marked this conversation as resolved.

How about defaulting to "just now" rather than raising an exception? If you imagine this being used with user input, it would pretty much be a requirement to always wrap it in a try-except due to this being raised.

Also, you misspelled granularity.

Member Author

A user could have a delta of, say, 2 days but only a granularity of ["year", "month", "week"]. With dynamic on, it would output "just now". I think it would be better to raise an error than to give an inaccurate answer in that scenario.
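
The arithmetic behind this scenario, as a sketch. The week length below is a plain constant (an assumption for illustration), not a value read from arrow:

```python
from math import trunc

SECONDS_PER_DAY = 86400
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY  # assumed unit size

delta = 2 * SECONDS_PER_DAY
weeks = delta / SECONDS_PER_WEEK

print(round(weeks, 3))  # 0.286, i.e. 2/7 of a week
print(trunc(weeks))     # 0, so "week" would be reported as zero
```

Years and months are larger than a week, so they truncate to zero as well, leaving no non-zero unit to display.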

Ah, so it doesn't convert in that case, e.g. 2 days = 2/7ths of a week? That's a good point then.

Will this still raise the exception if all granularities are provided, but all values are 0, or will it actually display "just now" in that case? I think it should be able to do that.

@MarkKoz, Aug 8, 2021

In fact, it could show "just now" if all units evaluate to zero, regardless of which granularities are provided, not just if all are provided.

Though that would be inconsistent with the behaviour of e.g. granularity="year" returning '0 years' rather than 'just now'. On the other hand, it seems like it raising an error in those cases should be avoidable somehow.

Member Author

Our main goal is that if you provide a granularity, you expect the output to contain said granularity (with the exception of the omit zeros/dynamic functionality). Trying to figure out whether we should or shouldn't adapt the output to include some other unit seems unnecessary when we already have the auto function in humanize.

Raising an error absolves arrow of having to make the tough decision of how to handle this. Letting the user handle the error gives them flexibility, but not all users might prefer that at the cost of having to practically always handle this error if they're dealing with unknown inputs.

It's a matter of which use case is more common: wanting custom behaviour to handle this edge case, or wanting to not have to think about it. Either way, the user can anticipate what the result will be by subtracting the times and manually inspecting the delta before calling humanize. If they see the delta will result in all zeros, they can handle it instead of relying on the default behaviour proposed below. Of course, that's not as convenient as just catching an exception, but I don't see a way to make both sides happy.


The most consistent solution may be to return zero in the smallest unit of the given granularity. This would ensure that while dynamic=True may omit some units in the given granularity, it will never introduce new units. Consider:

>>> a = arrow.get(2021, 8, 8)
>>> b = arrow.get(2021, 8, 10)
>>> a.humanize(b, granularity=["year", "month", "week"])
'0 years 0 months and 0 weeks ago'

It has no problem omitting the "2 days" even though it's the only non-zero unit. This is arguably not very useful, but it's what the current behaviour is. There are probably use cases that need to strictly follow the granularity, and those users appreciate this behaviour. Following from this behaviour, it should then also be acceptable for this to happen:

>>> a = arrow.get(2021, 8, 8)
>>> b = arrow.get(2021, 8, 10)
>>> a.humanize(b, granularity=["year", "month", "week"], dynamic=True)
'0 weeks ago'

If the user has dynamic on, that is an expression of an intent to cut down on the zeros in the output. I'd say it's more practical to make a compromise to return 1 zero than to take a strict stance of "must have no zeros" and be forced to raise an exception.
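
A sketch of the fallback proposed above, under stated assumptions: `omit_zeros_with_fallback` and the `UNITS` table are hypothetical and use simplified unit sizes, and the function returns (unit, value) pairs rather than a formatted string:

```python
from math import trunc
from typing import List, Sequence, Tuple

# Assumed unit sizes in seconds, largest first; illustrative values only.
UNITS: Tuple[Tuple[str, int], ...] = (
    ("year", 365 * 86400),
    ("month", 30 * 86400),
    ("week", 7 * 86400),
    ("day", 86400),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
)


def omit_zeros_with_fallback(
    delta_seconds: float, granularity: Sequence[str]
) -> List[Tuple[str, int]]:
    """Drop zero-valued units, but keep the smallest requested unit if all are zero."""
    rest = abs(float(delta_seconds))
    values: List[Tuple[str, int]] = []
    for unit, size in UNITS:
        if unit in granularity:
            values.append((unit, trunc(rest / size)))
            rest %= size
    if not values:
        raise ValueError("granularity contains no known unit")
    nonzero = [(unit, value) for unit, value in values if value != 0]
    # Never introduce a unit that was not requested: fall back to the smallest one.
    return nonzero or [values[-1]]


print(omit_zeros_with_fallback(2 * 86400, ["year", "month", "week"]))
# [('week', 0)]
print(omit_zeros_with_fallback(3630, ["second", "hour", "day", "month", "year"]))
# [('hour', 1), ('second', 30)]
```

With this shape, the all-zero case yields a single zero-valued unit instead of an exception, which matches the '0 weeks ago' output suggested above.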

Member

We can only do so much. In the end, arrow is a library meant to work together with the dev. It shouldn't think for the dev or do things behind the scenes that the dev isn't aware of, and it shouldn't swallow an exception that the dev might be expecting to see raised.


+                # Needed for the case of dynamic usage (could end up with only one frame unit)
+                if len(timeframes) == 1:
+                    return locale.describe(
+                        timeframes[0][0], delta, only_distance=only_distance
+                    )

                 return locale.describe_multi(timeframes, only_distance=only_distance)

         except KeyError as e:
21 changes: 19 additions & 2 deletions arrow/locales.py
@@ -173,7 +173,17 @@ def describe_multi(
         humanized = " ".join(parts)

         if not only_distance:
-            humanized = self._format_relative(humanized, *timeframes[-1])
+
+            # Needed to determine the correct relative string to use
+            timeframe_value = 0
+
+            for _unit_name, unit_value in timeframes:
+                if trunc(unit_value) != 0:
+                    timeframe_value = trunc(unit_value)
+                    break
+
+            # Note it doesn't matter the timeframe unit we use on the call, only the value
+            humanized = self._format_relative(humanized, "seconds", timeframe_value)

         return humanized
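
To illustrate why the value passed to the relative formatter matters, here is a sketch under one assumption: that the formatter picks the past form only when the value is negative, so a trailing zero falls through to the future form. `relative_value` and `direction` are hypothetical names, not arrow functions:

```python
from math import trunc
from typing import Sequence, Tuple

Timeframes = Sequence[Tuple[str, float]]


def relative_value(timeframes: Timeframes) -> int:
    """Return the first non-zero truncated value, or 0 if every unit is zero."""
    for _unit, value in timeframes:
        if trunc(value) != 0:
            return trunc(value)
    return 0


def direction(value: float) -> str:
    # Assumed convention: negative means past, anything else means future.
    return "past" if value < 0 else "future"


frames = [("hour", -1.0), ("seconds", 0.0)]
print(direction(frames[-1][1]))           # future (uses the trailing zero)
print(direction(relative_value(frames)))  # past (uses the first non-zero value)
```

Under that assumption, taking the sign from the last timeframe gives the wrong direction whenever the last unit is zero, which is the case the loop in the hunk above guards against.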

@@ -3408,8 +3418,15 @@ def describe_multi(
         """

         humanized = ""
+        relative_delta = 0
+
         for index, (timeframe, delta) in enumerate(timeframes):
             last_humanized = self._format_timeframe(timeframe, trunc(delta))
+
+            # A check for the relative timeframe unit
+            if trunc(delta) != 0:
+                relative_delta = trunc(delta)
+
             if index == 0:
                 humanized = last_humanized
             elif index == len(timeframes) - 1:  # Must have at least 2 items
@@ -3421,7 +3438,7 @@ def describe_multi(
                 humanized += ", " + last_humanized

         if not only_distance:
-            humanized = self._format_relative(humanized, timeframe, delta)
+            humanized = self._format_relative(humanized, timeframe, relative_delta)

         return humanized

35 changes: 35 additions & 0 deletions tests/test_arrow.py
@@ -2285,6 +2285,41 @@ def test_no_floats_multi_gran(self):
         )
         assert humanize_string == "916 минути 40 няколко секунди назад"

+    # Dynamic Humanize Tests
+    def test_dynamic_on(self):
+        arw = arrow.Arrow(2013, 1, 1, 0, 0, 0)
+        later = arw.shift(seconds=3630)
+        humanize_string = arw.humanize(
+            later, granularity=["second", "hour", "day", "month", "year"], dynamic=True
+        )
+
+        assert humanize_string == "an hour and 30 seconds ago"
+
+    def test_dynamic_on_one_granularity(self):
+        arw = arrow.Arrow(2013, 1, 1, 0, 0, 0)
+        later = arw.shift(seconds=3600)
+        humanize_string = arw.humanize(
+            later, granularity=["hour", "second"], dynamic=True
+        )
+
+        assert humanize_string == "in an hour"
+
+    def test_dynamic_on_zero_output(self):
+        arw = arrow.Arrow(2013, 1, 1, 0, 0, 0)
+        later = arw.shift(seconds=0)
+
+        with pytest.raises(ValueError):
+            arw.humanize(later, granularity=["hour", "second"], dynamic=True)
+
+    def test_dynamic_off(self):
+        arw = arrow.Arrow(2013, 1, 1, 0, 0, 0)
+        later = arw.shift(seconds=3600)
+        humanize_string = arw.humanize(
+            later,
+            granularity=["second", "hour", "day", "month", "year"],
+        )
+        assert humanize_string == "0 years 0 months 0 days an hour and 0 seconds ago"
+

 @pytest.mark.usefixtures("time_2013_01_01")
 class TestArrowHumanizeTestsWithLocale:
3 changes: 3 additions & 0 deletions tests/test_locales.py
@@ -732,6 +732,9 @@ def test_describe_multi(self):
         assert describe(seconds60) == "בעוד דקה"
         assert describe(seconds60, only_distance=True) == "דקה"

+        fulltestend0 = [("years", 5), ("weeks", 1), ("hours", 1), ("minutes", 0)]
+        assert describe(fulltestend0) == "בעוד 5 שנים, שבוע, שעה ו־0 דקות"
+

 @pytest.mark.usefixtures("lang_locale")
 class TestMarathiLocale: