-
Notifications
You must be signed in to change notification settings - Fork 53
Add support for tracing #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Hi Skoudmar. Thanks for the contribution. I just wanted you to know that I have seen this, but I haven't had time to test it (on mac right now). I am not familiar with the ros2 tracing features but I think this looks solid and non-intrusive so definitely open to accepting. What do you think about the slight API change when you want to trace (i.e changing from stream to callback style)? I realize that you need to call I hope to be able to test it a bit soon. |
|
Hi @m-dahl, thanks for your feedback! The callback approach is necessary because ROS's existing tracing API was designed specifically with callbacks in mind. An async function can pause execution via While I did consider wrapping the stream as you suggested, there's a fundamental mismatch between stream semantics and callback semantics. Treating However, I agree that wrapping the stream is better for maintainability. So, using a combined approach of using a wrapped stream and also having callbacks offers benefits in maintainability and correctness. This PR already includes something similar for timers. What about something like this: // Subscribe as usual
let stream = node.subscribe("/topic", QosProfile::default())?;
// Option 1 (untraced): Use the stream normally like now
let untraced_future = stream.for_each(|msg| {
println!("Untraced: {msg:?}");
async {
// some async code
}
});
// Option 2 (traced): Convert to traced callback style
// `traced_callback` is a new function on the stream wrapper.
let traced_future = stream.traced_callback(|msg| {
println!("Traced: {msg:?}");
// some sync code
});What do you think? I'd like to hear your feedback before implementing this. |
|
Hi, again sorry for not being so responsive. Yes I can see that there is a fundamental difference in the stream vs cb style here. I have wondered when the stream design would come back to bite us and it seems the time has come :) But its not so bad, and I think the wrapping approach you present is fine. If you would like to implement it I will merge this PR. Another thing, should we gate the feature also on |
This change is necessary because rcl uses address of the rcl handle in tracepoints. If the location changes, further publish events will not be correctly assigned to the publisher.
This change is necessary because rcl uses address of the rcl handle in tracepoints. If the location changes further `take` events will not be correctly assigned to the subscriber.
This prevents another service initializing at the same location. Therefore preventing conflicts in a trace.
This prevents another client initializing at the same location. Therefore preventing conflicts in a trace.
The crate contains tracepoints; either defined directly or imported from ros2_tracing project (tracetools package). The crate define a `tracing` feature which should only be used on Linux because it uses LTTng only available on Linux.
The TracingId is used as a wrapper for the pointer to allow implementing Send and Sync traits. The pointer is used only as an ID and is never dereferenced, therefore, it is safe.
Location and arguments of tracepoints is consistent with rclcpp
The StreamWithTracingData wrapper implements both Stream and FusedStream similarly to Receiver
|
Hi, I have updated the implementation (and rebased it on the current master) to implement the Stream wrapper. However, I am not sure about its name: I have also added a compile error when the feature is used on non-Linux inside build.rs. |
This PR adds support for tracing by importing tracepoints used by rclcpp and creating new ones.
ROS 2 tracing is built on the LTTng tracing framework.
Motivation
Tracing ROS applications can help identify performance bottlenecks or measure and identify causes of latencies.
Available tools to analyze traces:
tracetools_analysisOverview
Traces of R2R application without this PR already include tracepoints
from
rclandrmw, but these are not sufficient for extractinguseful information. Therefore, this PR extends R2R in the following
areas:
r2r_tracingmodule, whichtracetools(a part ofros2_tracing).Node::spin_once).tracingfeature flag. Without it nothing is imported/linked, and tracepoints are no-ops.r2r_tracingfromr2rcode.Node::subscribe_traced,Node::create_service_traced,and
Timer::on_tick. The first two are similar toNode::subscribeand
Node::create_service, but add tracepoints around user-suppliedcallback.
Timer::on_tickadds tracepoints to timer callback.stable address.
Stable memory address
Without this PR, ROS entities are first created on the stack, so initialization events like
ros2::rcl_publisher_inituse the stack address as their tracepoint event. Besides the address, these initialization events usually contain helpful information such astopicorperiod. R2R then moves the entity to the heap usingBoxorArc. When a traced function, for example,rcl_publish, is later called on the entity, the tracepoint event references the heap memory address instead. Since the heap address differs from the original stack address, the traced function cannot be correctly matched with the corresponding initialization event.This PR solves the problem for publishers (cb10d28), subscriptions (3f24514), services (e9dece1), and clients (c926fa6). For timers, the problem was already solved in PR #88.
Importing tracepoints from
ros2_tracingTo add tracepoints compatible with
rclcppto the R2R code, we first need to import them. This is handled by dynamically loadingtracetoolslibrary fromros2_tracing.tracetoolsis a part of ROS, and is depended on by, e.g.,rclorrclcpp.The following tracepoints are imported:
ros2:rclcpp_subscription_initros2:rclcpp_timer_link_noderos2:rclcpp_subscription_callback_addedros2:rclcpp_service_callback_addedros2:rclcpp_timer_callback_addedros2:rclcpp_callback_registerros2:rclcpp_publishros2:rclcpp_takeros2:callback_startros2:callback_endReference: https://github.com/ros2/ros2_tracing/blob/rolling/doc/design_ros_2.md
Defining new tracepoints
In
rclcpp, sampling of ready entities is announced by tracepointrclcpp_executor_wait_for_work. However, this tracepoint is tied to the C++ executor lifecycle, and R2R does not include executors. In R2R, sampling is done byNode::spin_once. Therefore, to be able to get information about sampling, we define new tracepoints:r2r:spin_startr2r:spin_waker2r:spin_timeoutr2r:spin_endThese are executed according to the following table:
The tracepoints are generated by crate
lttng-ust-generate. Its release version is old and does not supportusize, so I have created a PR (sapphire-arches/lttng-ust-rs#5) which fixes it, but the maintainer is not active, so my fork is used instead.Additionally, the generated code needs to link with
lttng-ustlibrary which is only available on Linux. This is the main reason the tracing support is feature-gated.tracingfeature flagGeneration, inclusion, and linking are controlled by the
tracingfeature flag inr2r_tracing, which is also exported byr2r.Without the feature flag
tracing, nothing is imported/linked, and tracepoints are no-ops.New callback tracing functions
In order to support tracing of callback execution, new functions
Node::subscribe_traced,Node::create_service_traced, andTimer::on_tickare added.Example usage is provided in
r2r/examples/callback_tracing.rs.CC: @wentasah