Reduce RM initialization timeout from 2.0s to 1.5s #831
Vanjoseluis wants to merge 1 commit into ros-controls:rolling
If needed, we could explore using different RM initialization timeouts depending on the test or hardware. Long‑term, it might be interesting to explore whether Gazebo could be kept alive across tests with a proper reset mechanism.
It may also be that physics starts advancing before the RM is fully initialized, so a larger timeout is needed to let the system settle.
christophfroehlich left a comment
I suppose this depends on the system load (CPU) and can get flaky on the CI runners. Have you tested the same with higher CPU load? For example, I use this to max out 15 of my 16 cores: `stress-ng --cpu 15 --vm 1 --vm-bytes 3G --vm-keep`
I tried running the pendulum test under extreme load using stress-ng (15 CPU hogs + 3 GB VM pressure). Under these conditions Gazebo becomes systematically unstable: most runs fail due to the assertion, and a couple of them due to missing joint_state messages. Update: This shows that stress‑ng overload breaks Gazebo initialization regardless of the timeout value, so it’s not a meaningful criterion for choosing the timeout. Under normal and CI‑like load, the 1.5 s timeout behaves reliably.
This PR replaces the previous conservative 2.0 s Resource Manager initialization timeout with a measured and reproducible value of 1.5 seconds.
The new value is based on an extensive experimental study designed to identify the minimum stable timeout under Gazebo.
This change shortens the initialization wait by 25% (from 2.0 s to 1.5 s) while maintaining full stability.
Motivation
Issue #801 showed that too small a timeout (0.2 s) leads to incorrect controller behavior and large deviations from the expected joint positions. The goal of this study was to:
Methodology
The pendulum_effort_test was used as the primary benchmark because it is the most timing‑sensitive test in the suite.
All other tests (pendulum_position_test, gripper_mimic_joint_position_test, gripper_mimic_joint_effort_test, ...) were also validated with the final value (1.5 s).
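The repeated-run procedure can be sketched as a small harness. This is a minimal illustration, not the script used for the study: the function name `count_passes` is hypothetical, and it assumes the test can be started as a single command whose exit code is 0 on success.

```python
import subprocess
import sys


def count_passes(cmd, runs):
    """Run `cmd` `runs` times and return how many runs exited with code 0."""
    passes = 0
    for _ in range(runs):
        # A zero exit code counts as a pass, anything else as a failure.
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            passes += 1
    return passes


# Placeholder command (hypothetical): replace it with the real test invocation.
placeholder_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
print(f"{count_passes(placeholder_cmd, 3)}/3 PASS")
```

Reporting the result as `passes/runs` matches the notation used for the measurements in this PR.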
Initial Bisection Results
Extended jitter analysis
Further testing revealed that Gazebo introduces significant initialization jitter.
Values that initially appeared stable (e.g., 1.1 s) were not consistently reproducible.
Additional runs:
1.0 s → 8/9 (not stable)
1.2 s → unstable
1.5 s → 30/30 PASS, fully stable with all tests
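A bisection over the timeout can be sketched as follows. This is an illustrative sketch only: `min_stable_timeout` and the `is_stable` callback are hypothetical names, and the search assumes that stability is monotone in the timeout, which the jitter described above shows is only approximately true. In practice `is_stable` would have to run the test many times per candidate value rather than once.

```python
def min_stable_timeout(is_stable, low, high, resolution=0.1):
    """Bisect [low, high] for the smallest timeout that `is_stable` accepts.

    Assumes is_stable(low) is False, is_stable(high) is True, and that
    stability is monotone in the timeout (an assumption, not a guarantee).
    """
    while high - low > resolution:
        mid = (low + high) / 2.0
        if is_stable(mid):
            high = mid  # mid is stable: the minimum lies at or below mid
        else:
            low = mid   # mid is unstable: the minimum lies above mid
    return high


# Toy predicate with a made-up threshold, for illustration only.
estimate = min_stable_timeout(lambda t: t >= 1.3, low=0.2, high=2.0)
print(f"smallest stable timeout is at most {estimate:.3f} s")
```

Because single runs are noisy, a value returned by such a search is best treated as a candidate and confirmed with a longer series of runs, as was done here for 1.5 s.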
Conclusion
The RM initialization timeout is updated to 1.5 seconds, which:
This value is supported by reproducible experimental evidence (30/30 runs).
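One way to read the 30/30 figure is as a bound on the remaining failure probability. The sketch below is not part of the original study; it assumes the runs are independent and asks which per-run failure rate would still produce 30 consecutive passes with 5% probability. The function name is hypothetical.

```python
def failure_rate_upper_bound(runs, confidence=0.95):
    """Upper bound on the per-run failure probability after `runs` passes
    and zero failures.

    Solves (1 - p) ** runs = 1 - confidence for p, assuming independent runs.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / runs)


bound = failure_rate_upper_bound(30)
print(f"95% upper bound on failure rate after 30/30 passes: {bound:.1%}")
```

Under these assumptions, 30 clean runs are compatible with a per-run failure rate of just under 10%, so a longer series would be needed to support a tighter claim.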