-
Notifications
You must be signed in to change notification settings - Fork 535
Add Snapshotter #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Snapshotter #20
Conversation
a0fbdd2 to
3e703c2
Compare
5d6a540 to
aa93706
Compare
59566e5 to
b96b1b0
Compare
46f625b to
3519dcb
Compare
3519dcb to
68464c0
Compare
dd3c770 to
87a1daf
Compare
| snapshotter._snapshot_cpu(event_system_data_info) | ||
| assert len(snapshotter._cpu_snapshots) == 1 | ||
| assert snapshotter._cpu_snapshots[0].used_ratio == event_system_data_info.cpu_info.used_ratio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kinda sad from seeing all those protected member accesses. Is there any way we could refactor the snapshotter to avoid this? Is this test really valuable? From what I see here, we're really just verifying some internal details, not the behavior of the Snapshotter class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you test otherwise that the snapshotter correctly read and stored the used ratio of the CPU? I don't know, it seems valuable to me to have these checks there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that it would be better to set up an EventManager and have it emit a SystemInfo event. And then, instead of looking through the snapshots property, to use one of those get_*_system_info methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but we would need to either directly invoke the private method _emit_system_info_event or wait for the system_info_interval to trigger its execution by RecurringTask.
And we would have to update it for all other test_snapshot_*.
Also, isn't this a better unit test approach? I mean, to have it isolated. When a bug in the RT occurs we would see that only unit tests for RT are failing and will be able to easily identify where the problem is. Rather than failing everything and looking for the cause.
I understand your point of touching just the public interface in the tests, but I'd prefer to stay with the current implementation in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but we would need to either directly invoke the private method
_emit_system_info_eventor wait for thesystem_info_intervalto trigger its execution byRecurringTask.
Or we could make a testing implementation of EventManager where emitting events could be done from the outside (I mean from the test).
And we would have to update it for all other
test_snapshot_*.
Yes, but I'd consider that a benefit 🙂
Also, isn't this a better unit test approach? I mean, to have it isolated. When a bug in the RT occurs we would see that only unit tests for RT are failing and will be able to easily identify where the problem is. Rather than failing everything and looking for the cause.
Well, depends what you consider to be an unit, but for me, this is a sub-unit test. You're right that using a non-trivial event manager would make it closer to an integration test. With a fake (not sure if that's correct TDD terminology) implementation, I don't see an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, agreed. Taking into account the current situation, I opened a new issue for that - #73. Let's improve the testing of the Snapshotter later and merge this now.
Description
Snapshottercomponent;SystemStatus,LocalEventManager, and relevant data classes;crawlee/_utils/system.py;Additional notes
crawlee.configmodule for now since we do not need it. Let's add it again in the future and move some fields to it if needed.cast(list[Snapshot], self._memory_snapshots)) in Snapshotter. It is there because of this https://mypy.readthedocs.io/en/stable/common_issues.html#variance, I am not sure if we can address it better.Testing
Unit tests
Manual testing / execution