Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ActiveSupport::Notifications and LogSubscriber for customizing log output #338

Merged
merged 1 commit into from
Mar 22, 2023

Conversation

bdewater
Copy link
Contributor

@bdewater bdewater commented Feb 14, 2023

This is the same mechanism Rails uses to output logs from events. This allows for more instrumentation and easier customization of log output. The default log output is completely unchanged.

Since a bunch of events need cursor_position to keep the existing logging, I made it a default tag for all events.

Edit: closes #40 by including it in an updated way.

@bdewater bdewater force-pushed the improve-instrumentation branch 2 times, most recently from 4a6f90c to 37b1301 Compare February 14, 2023 03:46
@bdewater
Copy link
Contributor Author

bdewater commented Feb 14, 2023

Test failure on Ruby head/Rails head, insofar other combinations ran it succeeded.

  1) Failure:
ResqueIntegrationTest#test_interrupts_the_job [/home/runner/work/job-iteration/job-iteration/test/integration/integration_behaviour.rb:22]:
Expected: 0
  Actual: nil

Copy link
Contributor

@Mangara Mangara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach! Just some small comments.

lib/job-iteration/iteration.rb Outdated Show resolved Hide resolved
found_record = true
each_iteration(object_from_enumerator, *arguments)
self.cursor_position = index
tags.replace(instrumentation_tags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want the updated cursor_position here? The notification describes the iteration that just happened, so it makes most sense to me to tag it with the cursor position at the start of that iteration. If we do want to include both, we can give the one after a different name?

Copy link
Contributor Author

@bdewater bdewater Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so, yes. I would like to know how long the thing (object_from_enumerator) being iterated on here takes. Wrote a quick test:

class SlowJob < ActiveJob::Base
  include JobIteration::Iteration

  def build_enumerator(cursor:)
    enumerator_builder.active_record_on_records(
      Product.where("id <= 5"),
      cursor: cursor,
    )
  end

  def each_iteration(product)
    sleep 0.1 * product.id
  end
end

def test_notification_cursor_position
  collector = []
  ActiveSupport::Notifications.monotonic_subscribe("each_iteration.iteration") do |_, started, finished, _, tags|
    collector << { cursor_position: tags[:cursor_position], elapsed: finished - started }
  end

  push(SlowJob)
  work_one_job
  pp collector
end

output:

[{:cursor_position=>1, :elapsed=>0.1011069999949541},
 {:cursor_position=>2, :elapsed=>0.20117599997320212},
 {:cursor_position=>3, :elapsed=>0.30098699999507517},
 {:cursor_position=>4, :elapsed=>0.40121400001225993},
 {:cursor_position=>5, :elapsed=>0.5012379999971017}]

This looks useful to me, I know exactly the ID of the thing that is slow.

If I'm understanding your suggestion correctly, it would be this implementation:

ActiveSupport::Notifications.instrument("each_iteration.iteration", instrumentation_tags) do
  found_record = true
  each_iteration(object_from_enumerator, *arguments)
  self.cursor_position = index
end

which outputs:

[{:cursor_position=>nil, :elapsed=>0.10110400000121444},
 {:cursor_position=>1, :elapsed=>0.2011929999862332},
 {:cursor_position=>2, :elapsed=>0.3052169999864418},
 {:cursor_position=>3, :elapsed=>0.4006680000165943},
 {:cursor_position=>4, :elapsed=>0.5009349999891128}]

Which doesn't seem as useful. The mental overhead for this example is still manageable to answer the question what thing was slow but it quickly becomes impossible to do from the top of your head, eg when using ULIDs as database IDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that makes sense. I think my confusion comes from the fact that when you enqueue the job with cursor_position: 1, it doesn't process id 1, but only 2 and up, so the cursor position and item processed don't quite line up. But this is indeed the cursor_position of the thing we just processed, so it makes sense to use this as the tag 👍

lib/job-iteration/log_subscriber.rb Show resolved Hide resolved
This is the same mechanism Rails uses to output logs from events. This allows for more instrumentation and easier customization of log output. The default log output is completely unchanged.

Since a bunch of events need cursor_position to keep the existing logging, I made it a default tag for all events.

allow to setup logger for JobIteration
@Mangara Mangara merged commit fcf832a into Shopify:main Mar 22, 2023
@bdewater bdewater deleted the improve-instrumentation branch March 23, 2023 01:50
@lavoiesl lavoiesl mentioned this pull request Aug 15, 2023
@shopify-shipit shopify-shipit bot temporarily deployed to rubygems August 23, 2023 19:41 Inactive
@@ -11,6 +13,10 @@ module JobIteration

extend self

attr_accessor :logger

self.logger = ActiveJob::Base.logger
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does change something. Previously we ended up logging to Rails.logger, because when ActiveJob initializes itself in the context of a Rails application, it sets its logger to the Rails one:
https://github.com/rails/rails/blob/v7.0.7.2/activejob/lib/active_job/railtie.rb#L13-L15

Whereas now it will use the default Active Job logger which logs to stdout: https://github.com/rails/rails/blob/v7.0.7.2/activejob/lib/active_job/logging.rb#L11

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL. Thanks for also providing a fix - linking #427 here in case others run into this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants