Skip to content

Conversation

@casperisfine
Copy link
Contributor

@casperisfine casperisfine commented Mar 26, 2024

Ref: #26103
Ref: #51474

A fairly common mistake with Rails is to enqueue a job from inside a transaction, with a record as argumemnt, which then lead to a RecordNotFound error when picked up by the queue.

This is even one of the arguments advanced for job runners backed by the database such as solid_queue, delayed_job or good_job.

But relying on this is undesirable in my opinion as it makes the Active Job abstraction leaky, and if in the future you need to migrate to another backend or even just move the queue to a separate database, you may experience a lot of race conditions of the sort.

To resolve this problem globally, we can make Active Job optionally transaction aware, and automatically defer job queueing to after_commit.

Copy link
Member

@rafaelfranca rafaelfranca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should not require the user to always explicitly create the transaction.

I find the complexity to find the right transaction too much and all usage that I know of can easily build the transaction.

Something like:

Topic.transaction do |transaction|
  transaction.after_commit { called += 1 }
end

@matthewd
Copy link
Member

@rafaelfranca I think job queueing is the base example of a use case where that doesn't work: we want to defer until any currently-in-flight transactions have committed, not set up a local transaction to wrap our work.

If the transaction was expected to be as local as in your example, the caller could just move the after_commit body to be physically after the transaction block.

@rafaelfranca
Copy link
Member

Yeah, Jean explained to me

@casperisfine casperisfine force-pushed the transaction-callbacks-2 branch from 2fc2e5e to 9e61d74 Compare March 27, 2024 11:40
@casperisfine casperisfine force-pushed the transaction-callbacks-2 branch from 9e61d74 to e3676b1 Compare March 27, 2024 12:08
@rails-bot rails-bot bot added the docs label Mar 27, 2024
@casperisfine casperisfine force-pushed the transaction-callbacks-2 branch from e3676b1 to ba8c36b Compare March 27, 2024 12:10
@casperisfine
Copy link
Contributor Author

Alright, I implemented the Active Job integration, in the end I think it's not too invasive, but I can rework it if it's seen as too tightly coupled.

I think with a small bit of polish this could move to a proper PR instead of being a PoC.

@rafaelfranca @matthewd I'd appreciate some second-opinion on the naming and implementation, etc.

@casperisfine casperisfine force-pushed the transaction-callbacks-2 branch 3 times, most recently from db0589d to 3f4ae43 Compare March 27, 2024 13:19
elsif open_transactions.size == 1
open_transactions.first.after_commit(&block)
else
callback = count_down_callback(open_transactions.size, &block)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly tying in with my comment above 😅 -- I'm not sure what I think should happen, but I think this will behave surprisingly if any of the transactions get rolled back (and thus count never reaches zero).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how I see it though. If any transaction is rolled back, we don't enqueue the job at all. I think that's what we want most of the time, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO no: while we can make a best effort at having the job run at a convenient time by introducing delay (waiting for concurrent transactions to get out the way), if perform_later didn't raise, we've claimed that we have successfully enqueued the job, so we have to follow through.

(Also having a potentially completely-unrelated DB transaction slow down the enqueuing is a defensible curiosity, having it's rollback result in your job being silently dropped seems more complicated.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that we'll probably need an escape hatch, but e.g.:

transaction do
  user = User.create!
  WelcomeEmailJob.perform_later(user)
  # something that fails and rollback the transaction
end

If the transaction is rolled back, you don't want this job enqueued.

Now maybe perform_later should be left unchanged, and it's another API that should have this behavior.

But the goal here is to reduce the behavioral difference between queues that use the same DB as Active Record, hence are transaction aware, and those that aren't.

In the above example, if you are using SolidQueue / GoodJob etc, the perform_later is acked but will be dropped. So I think we're consistent here.

Copy link
Contributor

@bensheldon bensheldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As GoodJob maintainer, I think this is a good change. It's easy enough for me to add configuration to the adapter, and easy enough for an application developer to override that on ActiveJob::Base 👍🏻

composerinteralia

This comment was marked as outdated.

@casperisfine casperisfine force-pushed the transaction-callbacks-2 branch 5 times, most recently from d65638f to 7983982 Compare March 29, 2024 15:36
@casperisfine casperisfine changed the title PoC: register transaction callbacks outside of a record Automatically delay Active Job enqueues to after commit Mar 29, 2024
@casperisfine casperisfine marked this pull request as ready for review March 29, 2024 15:38
@bensheldon
Copy link
Contributor

bensheldon commented Apr 8, 2024

@casperisfine to help with context:

  • I think that's probably not great to return nil (it would be nice to get back some info that job enqueue was deferred for this specific reason).
  • Usually the value that's needed by the application is the job_id or the provider_job_id so its status can be monitored (though this is fairly advanced usage, and one could use the explicit after-transaction lambdas if they need to do something subsequent; though that does make me wonder if some kind of ignore-config-and-enqueue-immediately-despite-any-transaction behavior would be necessary). If you got back the job, it could be checked later to see if it has a provider_job_id and check...
  • There is successfully_enqueued and enqueue_error attributes on the job: Communicate enqueue failures to callers of perform_later #41191 (though it's not widely used afaik; the log messages were broken for 2 years and no one noticed)

@byroot
Copy link
Member

byroot commented Apr 8, 2024

Yeah, I'm also thinking returning the job makes sense. I'm just a bit unclear on a few things, so I need to investigate all the consequences. And it's definitely a bit of a breaking change I didn't foresee.

casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 9, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 9, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 9, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 9, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 10, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
notapatch pushed a commit to notapatch/rails that referenced this pull request Apr 10, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
@rosa
Copy link
Member

rosa commented Apr 14, 2024

I love this change, it's something we've discussed a few times at 37signals, to introduce in Active Job 🤩 Thanks a lot!

This is even one of the arguments advanced for job runners backed by the database such as solid_queue, delayed_job or good_job.

FWIW, I think the good argument for DB-backed active job backends is slightly more subtle than simply not caring about when you're enqueuing the job. The really useful case for enqueuing a job within a transaction and having it run after the transaction commits, taking advantage of jobs living in the same DB as your app records, is that you get transactional integrity, in the sense that the transaction is rolled back if enqueuing the job fails.

But relying on this is undesirable in my opinion as it makes the Active Job abstraction leaky, and if in the future you need to migrate to another backend or even just move the queue to a separate database, you may experience a lot of race conditions of the sort.

Totally, we have a note/warning about this in the Solid Queue README, but there are some valid cases where you might want to have this behaviour instead of some complicated, ad-hoc manual "rollback" of a committed transaction if enqueuing your job after_commit fails, which is something we have to do in some cases in our apps.

@rafaelfranca rafaelfranca deleted the transaction-callbacks-2 branch April 18, 2024 21:14
smudge added a commit to Betterment/delayed that referenced this pull request Apr 29, 2024
…commit) (#40)

This anticipates the release of a new feature in Rails 7.2 introduced here: rails/rails#51426

In Rails 7.2, ActiveJob will default `enqueue_after_transaction_commit` to `true`, and we would like to explicitly set this to `false` (retaining transactional enqueues, since `delayed` is DB-backed).

This PR goes a step further and also raises a `UnsafeEnqueueError` if, during enqueue, it encounters any job type that has overridden `enqueue_after_transaction_commit` to `true`. Worth noting that this exception may still be raised _after transaction commit_ (based on the way the lifecycle hooks are handled), so it is intended first and foremost as a guard during development & testing.

As part of this change I also:
- Added a new Appraisal for the latest Rails main build on GitHub.
- Re-generated the rubocop todo file. There are changes we can pull out & autocorrect, but we can do them in separate PRs.
fractaledmind pushed a commit to fractaledmind/rails that referenced this pull request May 13, 2024
Ref: rails#26103
Ref: rails#51426

A fairly common mistake with Rails is to enqueue a job from inside a
transaction, and a record as argumemnt, which then lead to a RecordNotFound
error when picked up by the queue.

This is even one of the arguments advanced for job runners backed by the
database such as `solid_queue`, `delayed_job` or `good_job`. But relying
on this is undesirable iin my opinion as it makes the Active Job abstraction
leaky, and if in the future you need to migrate to another backend or even
just move the queue to a separate database, you may experience a lot of race
conditions of the sort.

But more generally, being able to defer work to after the current transaction
has been a missing feature of Active Record. Right now the only way to do it
is from a model callback, and this forces moving things in Active Record
models that sometimes are better done elsewhere. Even as a self-proclaimed
"service object skeptic", I often wanted this capability over the last decade,
and I'm sure it got asked or desired by many more people.

Also there's some 3rd party gems adding this capability using monkey patches.
It's not a reason to upstream the capability, but it's a proof that there is
demand for it.

Implementation wise, this proof of concept shows that it's not really hard to
implement, even with nested multi-db transactions support.

Co-Authored-By: Cristian Bica <cristian.bica@gmail.com>
fractaledmind pushed a commit to fractaledmind/rails that referenced this pull request May 13, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
@urubatan
Copy link

urubatan commented May 31, 2024

This is completely wrong, for example if I'm publishing a news story and part of that process is to enqueue the job that will deliver that story, the publishing transaction should fail if the job is not enqueued.

also, RecordNotFound will only happen for perform_now, job queue will only notice the job after transaction is committed (if talking about solid_queue, delayed_job or good_job), for redis backed ones might happen if the commit is slow and queue empty but for business reasons is better to retry a job than have one not enqueued when it needed to be.

@rafaelfranca
Copy link
Member

rafaelfranca commented May 31, 2024

This is completely wrong, for example if I'm publishing a news story and part of that process is to enqueue the job that will deliver that story, the publishing transaction should fail if the job is not enqueued.

You can still do that. This is a config that you disable if it doesn't work for you. You can even disable per-job

also, RecordNotFound will only happen for perform_now, job queue will only notice the job after transaction is committed (if talking about solid_queue, delayed_job or good_job), for redis backed ones might happen if the commit is slow and queue empty but for business reasons is better to retry a job than have one not enqueued when it needed to be.

You are right. This change is mostly to match the behavior of non-relational queues with relational queues. But it is more common than you think. Just Google about jobs and after_create on Google and you are going to see many issues or blog posts telling people to change to after_commit.

Example:

sidekiq/sidekiq#322
https://www.justinweiss.com/articles/a-couple-callback-gotchas-and-a-rails-5-fix/
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88632

Now, let's talk about your feedback. You could have said "This doesn't work for me" or "That is not my experience" would be fine. But Saying "completely wrong" when a simple google search shows that it is you that got it wrong is just disrespectful. I got what you mean, and I'm sure you didn't tried to be disrespectful, but please try to avoid that.

@rafaelfranca
Copy link
Member

rafaelfranca commented May 31, 2024

Ah, you also have a point about the chance of the job not being enqueued, let's say if the redis server is down when enqueuing the job. This does happen, and it is worse than retying a job when RecordNotFound happens. Agree! Business critical jobs should probably no use this config, that is why we made a per-job config.

Now you could argue which is the better default, and we do believe enqueuing the job in after_commit is a better default because it is the most common case. This default doesn't suites everyone, and that is fine, that is why it is also a global config.

@urubatan
Copy link

Thanks for the feedback. Sorry if my last message came across the wrong way. I'll try to be more professional in the future.

xjunior pushed a commit to xjunior/rails that referenced this pull request Jun 9, 2024
Ref: rails#26103
Ref: rails#51426

A fairly common mistake with Rails is to enqueue a job from inside a
transaction, and a record as argumemnt, which then lead to a RecordNotFound
error when picked up by the queue.

This is even one of the arguments advanced for job runners backed by the
database such as `solid_queue`, `delayed_job` or `good_job`. But relying
on this is undesirable iin my opinion as it makes the Active Job abstraction
leaky, and if in the future you need to migrate to another backend or even
just move the queue to a separate database, you may experience a lot of race
conditions of the sort.

But more generally, being able to defer work to after the current transaction
has been a missing feature of Active Record. Right now the only way to do it
is from a model callback, and this forces moving things in Active Record
models that sometimes are better done elsewhere. Even as a self-proclaimed
"service object skeptic", I often wanted this capability over the last decade,
and I'm sure it got asked or desired by many more people.

Also there's some 3rd party gems adding this capability using monkey patches.
It's not a reason to upstream the capability, but it's a proof that there is
demand for it.

Implementation wise, this proof of concept shows that it's not really hard to
implement, even with nested multi-db transactions support.

Co-Authored-By: Cristian Bica <cristian.bica@gmail.com>
xjunior pushed a commit to xjunior/rails that referenced this pull request Jun 9, 2024
Fix: rails#51426 (comment)

`perform_later` is supposed to return the Job instance on success,
and `false` on error.

When the `enqueue` is automatically delayed, it's of course impossible
to predict if the actual queueing will succeed, but for backward compatibility
reasons, it's best to assume it will.

If necessary, you can hold onto the job instance and check for
`#successfully_enqueued?` after the transaction has completed.
@viralpraxis
Copy link
Contributor

So, it's still unclear to me if this feature replaces https://github.com/Envek/after_commit_everywhere ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.