Cyclone DDS and Fast RTPS exhibiting vastly different CPU usage when running same ROS2 application on Jetson Xavier AGX SoC #793

Open
nat-tsang opened this issue Dec 15, 2024 · 5 comments

Comments

@nat-tsang

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • From source
  • DDS implementation:
    • rmw_fastrtps_cpp
  • Client library (if applicable):
    • rclcpp

Expected behavior

We compared the same processes running on rmw_fastrtps_cpp and rmw_cyclonedds_cpp. With Fast RTPS, all 8 CPU threads on the Jetson Xavier AGX run at 95-100% usage, which results in no topics being listed and prevents the processes from publishing or subscribing. With Cyclone DDS, the exact same processes run as expected, with all CPU threads at around 15-20% usage. We have tested both middlewares for up to 48 hours: after 48 hours the Cyclone DDS processes still run normally, whereas the Fast RTPS processes reach close to 100% CPU usage within the hour. Has this been seen before, and what might be the root cause of the difference in behavior between Fast RTPS and Cyclone DDS?

@cferreiragonz

Hello @nat-tsang! Thank you for your report!

Could you please provide more details about the ROS 2 version you are using (or the Fast RTPS version)?

Additionally, could you check if the issue persists when Shared Memory is disabled? You can disable it by using the following XML profile:

<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <transport_descriptors>
        <transport_descriptor>
            <transport_id>UDP_transport</transport_id>
            <type>UDPv4</type>
        </transport_descriptor>
    </transport_descriptors>

    <participant profile_name="participant_profile_ros2" is_default_profile="true">
        <rtps>
            <name>profile_for_ros2_context</name>
            <userTransports>
                <transport_id>UDP_transport</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
        </rtps>
    </participant>
</profiles>

Please let us know whether this has a noticeable impact on your CPU usage.
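For anyone applying the profile above, here is a minimal Python sketch of one way to sanity-check it and hand it to a ROS 2 process. It assumes the profile is loaded through the `FASTRTPS_DEFAULT_PROFILES_FILE` environment variable and that `RMW_IMPLEMENTATION` selects `rmw_fastrtps_cpp`. The helper names `shm_disabled` and `env_with_profile`, and the file name `no_shm_profile.xml`, are made up for illustration and are not part of Fast DDS or ROS 2.

```python
import os
import xml.etree.ElementTree as ET

# XML namespace used by the profile posted in this thread.
NS = {"fd": "http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles"}


def shm_disabled(profile_xml: str) -> bool:
    """Return True if some participant profile disables the builtin
    transports and registers only non-SHM user transports.

    Illustrative check only; it does not validate against the Fast DDS schema.
    """
    root = ET.fromstring(profile_xml.encode("utf-8"))

    # Map every declared transport_id to its transport type (e.g. UDPv4).
    declared = {}
    for desc in root.iterfind(".//fd:transport_descriptor", NS):
        tid = (desc.findtext("fd:transport_id", default="", namespaces=NS) or "").strip()
        ttype = (desc.findtext("fd:type", default="", namespaces=NS) or "").strip()
        declared[tid] = ttype

    for participant in root.iterfind(".//fd:participant", NS):
        rtps = participant.find("fd:rtps", NS)
        if rtps is None:
            continue
        builtin = (rtps.findtext("fd:useBuiltinTransports", default="true", namespaces=NS) or "")
        used = [(t.text or "").strip()
                for t in rtps.iterfind("fd:userTransports/fd:transport_id", NS)]
        only_known_non_shm = bool(used) and all(
            t in declared and declared[t] != "SHM" for t in used
        )
        if builtin.strip().lower() == "false" and only_known_non_shm:
            return True
    return False


def env_with_profile(profile_path: str, base_env=None) -> dict:
    """Copy an environment and point Fast DDS at the given profile file."""
    env = dict(os.environ if base_env is None else base_env)
    env["FASTRTPS_DEFAULT_PROFILES_FILE"] = os.path.abspath(profile_path)
    env["RMW_IMPLEMENTATION"] = "rmw_fastrtps_cpp"
    return env


if __name__ == "__main__":
    # Hypothetical file name; save the XML profile from this thread under it.
    path = "no_shm_profile.xml"
    with open(path, encoding="utf-8") as fh:
        print("SHM disabled:", shm_disabled(fh.read()))
    print(env_with_profile(path)["FASTRTPS_DEFAULT_PROFILES_FILE"])
```

The returned environment can then be passed to whatever launches the node, for example through the `env` argument of `subprocess.run`. The variable has to be set in the environment of every process that should use the profile.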

@nat-tsang
Author

Hi! We are using ROS 2 Foxy, which comes with Fast RTPS 2.1.0. We were using the default version, so I will try disabling Shared Memory to see if the issue persists.

@cferreiragonz

Hi @nat-tsang, please note that ROS 2 Foxy reached its End of Life (EOL) more than a year ago, and the same applies to Fast RTPS 2.1. If disabling Shared Memory does not resolve the issue, could you consider upgrading to ROS 2 Humble, which uses Fast RTPS 2.6? Testing with these newer versions may help resolve the issue if it persists.

@nat-tsang
Author

Hi, I have found that disabling Shared Memory has been effective so far in preventing the CPU from throttling: the application has been running for the last 48 hours without any throttling. Why does disabling Shared Memory produce better CPU efficiency? What exactly happens when Shared Memory is disabled?

@cferreiragonz

Shared Memory (SHM) is one of the default transports used by Fast DDS, alongside UDP. It is automatically employed whenever two entities are on the same host, as it typically provides better performance for data transmission in terms of memory usage and latency. However, the architecture of SHM differs from that of UDP, and this difference can impact CPU thread behavior. For example, using SHM generally results in fewer blocking calls on the receiver side, which can affect CPU usage. The extent of this impact varies depending on factors such as the kernel, operating system, and hardware configuration.

By disabling SHM, you rely exclusively on UDP for data transmission. While this may be less resource-intensive in some scenarios, it can result in slightly lower throughput. The best option depends on your specific use case and setup requirements.
