Organization: Open Robotics
Mentor: Christophe Bedard
Student: Deva Shravan Kumar (GitHub, LinkedIn)
Link to GSoC Project: https://summerofcode.withgoogle.com/programs/2025/projects/quKpaaVA
Hello everyone,
I am Shravan, a third-year Electronics Engineering student at the Indian Institute of Technology Roorkee and a member of the ROS development vertical at MaRS, our student robotics society. This summer, I was selected as a Google Summer of Code 2025 student developer to work on ros2_tracing with Open Robotics.
Over the past four months, I worked on implementing new features, adding tests, and updating documentation under the guidance of my mentor Christophe.
Project Overview
ros2_tracing provides developers with tracing tools built on the LTTng tracer to help them understand what is happening inside their applications and ROS 2 itself. This visibility helps uncover bugs and performance bottlenecks in complex, resource-constrained robotics systems.
However, ros2_tracing had a significant limitation: meaningful trace data could only be captured if recording started before the application was launched. Because of how ROS 2 tracepoints are designed, starting to record after launch meant missing the initialization-phase data, and without that data the runtime trace data was essentially useless (see issue #44). This constraint created workflow problems for developers. When unsure whether they would need trace data later, they had to start recording before launching the application, which often left unused trace files accumulating on disk. Moreover, if developers noticed an issue on an already running (possibly long-running) system, it was too late to get useful trace data, whether from that point forward or from some time before the problem started.
The main goal of this project was to solve this problem. With some pre-configuration, developers can now decide at runtime whether or not to record trace data, significantly improving their debugging workflow and eliminating unnecessary disk usage.
Repository link: https://github.com/ros2/ros2_tracing
LTTng documentation: https://lttng.org/docs/v2.13/
Implementation and Challenges
We identified a few potential solutions to enable runtime tracing:
Solution 1: ROS 2 State Dump
We explored implementing a ROS 2 state dump mechanism similar to LTTng’s state dump feature, which would collect initialization information about currently-active objects at the time tracing starts, rather than at application startup.
However, this approach faced a significant obstacle: ros2_tracing had no mechanism to trigger a state dump when recording starts, unlike LTTng, which can trigger its own state dump internally.
Due to this fundamental limitation, we rejected this approach.
Solution 2: Add new tracepoints
Instrument the ROS 2 source code with new tracepoints that collect initialization information from within runtime code, alongside existing runtime tracepoints. This would ensure initialization data is captured even when tracing starts after application launch.
However, this approach presented significant challenges:
- Due to ROS 2’s layered architecture (rclcpp, rcl, rmw), some initialization data exists in one layer while runtime code executes in another, making it difficult to capture all necessary initialization information
- Even if we managed to get all the required initialization data, these new tracepoints would trigger repeatedly throughout the runtime phase, significantly increasing overhead
Due to these challenges, we decided to implement the third solution.
Solution 3: Dual Session Tracing
This approach uses two separate tracing sessions: an initialization session configured in LTTng’s snapshot mode, which stores initialization data in memory and writes it to disk on demand, and a normal runtime session for ongoing events.
Users configure their launch files to automatically start the snapshot session. When they decide to record at runtime, the snapshot session dumps its contents to disk while the runtime session begins recording new events. This guarantees the final trace contains both initialization and runtime data. One major challenge in implementing this solution was to determine how users would interact with this feature throughout their entire workflow and to ensure it supported all their use cases.
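To make the flow concrete, here is a minimal, self-contained Python sketch of the idea. It is only a toy model and does not use the ros2_tracing or LTTng APIs: the class name, the event names, and the assumption that events are routed to a session by whether they are initialization events are all hypothetical.

```python
from collections import deque


class DualSessionSketch:
    """Toy model of dual session tracing; not the ros2_tracing API."""

    def __init__(self, snapshot_capacity=1024):
        # Snapshot session: initialization events are kept in memory only.
        self._snapshot_buffer = deque(maxlen=snapshot_capacity)
        # Stand-in for the trace that ends up on disk.
        self.trace_on_disk = []
        self._runtime_recording = False

    def emit(self, event, is_initialization):
        if is_initialization:
            self._snapshot_buffer.append(event)
        elif self._runtime_recording:
            self.trace_on_disk.append(event)
        # Runtime events emitted before recording starts are dropped.

    def start_recording(self):
        # Dump the snapshot session, then let the runtime session record.
        self.trace_on_disk.extend(self._snapshot_buffer)
        self._runtime_recording = True


# Hypothetical event names, for illustration only.
tracer = DualSessionSketch()
tracer.emit("node_init", is_initialization=True)
tracer.emit("publisher_init", is_initialization=True)
tracer.emit("publish_1", is_initialization=False)  # not recorded yet
tracer.start_recording()  # the user decides to record at runtime
tracer.emit("publish_2", is_initialization=False)
print(tracer.trace_on_disk)
```

In this model, the trace contains both initialization events followed by the runtime event emitted after recording started, while the earlier runtime event costs no disk space.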

During the process, we also added the ability to configure any tracing session in snapshot mode, enabling a “flight recorder” capability that maintains a rolling history of events and can dump data when something interesting occurs.
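The “flight recorder” behaviour can be pictured as a fixed-size ring buffer. The following is a small Python sketch of that concept only; the class and event names are hypothetical, and LTTng’s actual snapshot buffers are sized in bytes and sub-buffers rather than in numbers of events.

```python
from collections import deque


class FlightRecorderSketch:
    """Toy rolling history of events; not the LTTng implementation."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest entry when full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def snapshot(self):
        # Return a copy of the current history, oldest event first.
        return list(self._events)


recorder = FlightRecorderSketch(capacity=3)
for i in range(5):
    recorder.record(f"event_{i}")

# Something interesting happened: dump the most recent history.
dump = recorder.snapshot()
print(dump)
```

Only the three most recent events survive in the dump, which is the trade-off of this mode: memory use stays bounded, but older events are lost.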
Links to pull requests made during the project:
- Add support for starting tracing at runtime (PR #191, PR #196)
- Allow creating snapshot sessions (PR #195, PR #206)
- Add dual session tests (PR #205)
- Add documentation for snapshot mode and dual session tracing (PR #207)
Other miscellaneous contributions:
- Fix Clang warnings by using proper function prototypes in macros (PR #179)
- Make trace action parameters substitutable (PR #187, PR #188)
Future Work
The implemented features can be further improved by addressing the following issues:
- Allow creating multiple channels in a tracing session (issue #199)
  - Multiple channels would make it possible to separate initialization and runtime events into two different channels, reducing the risk of high-frequency runtime events overwriting the initialization events
- Allow pre-configuring a dual session using the ros2 trace command (issue #198)
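To illustrate why multiple channels would help, here is a small Python sketch that compares a single shared buffer with two separate ones. It is a simplified model under stated assumptions: buffers are counted in events rather than bytes, and the event names are hypothetical.

```python
from collections import deque

# Hypothetical events: a few initialization events, many runtime events.
init_events = ["init_node", "init_publisher"]
runtime_events = [f"callback_{i}" for i in range(10)]

# Single channel: both kinds of events share one overwrite-mode buffer.
shared = deque(maxlen=4)
for event in init_events + runtime_events:
    shared.append(event)

# Two channels: each kind of event has its own buffer.
init_channel = deque(maxlen=4)
runtime_channel = deque(maxlen=4)
for event in init_events:
    init_channel.append(event)
for event in runtime_events:
    runtime_channel.append(event)

shared_keeps_init = any(e.startswith("init_") for e in shared)
split_keeps_init = list(init_channel) == init_events
print(shared_keeps_init, split_keeps_init)
```

With the shared buffer, the high-frequency runtime events push out the initialization events, whereas the dedicated initialization channel keeps them intact.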
Conclusion
This project successfully addressed the “runtime tracing” limitation of ros2_tracing, improving the developer experience and laying the groundwork for future development.
This summer was truly a remarkable learning experience for me. Being part of the Open Robotics community and contributing to ROS 2 has been incredibly rewarding. I am looking forward to continuing my contributions to the open source robotics ecosystem.
I would like to express my sincere gratitude to my mentor @christophebedard for his invaluable guidance and support throughout the project and to Open Robotics for giving me this opportunity.