I am trying to use the robot_localization library to fuse the pose information from a camera fixed to a wall and an IMU mounted on the robot itself. The robot has a base_link and attached to it are some aruco tags. These tags are detected from the camera and their pose is known. So on one side i have the pose of 4 aruco tags that are rigidly attached to the base_link and their transform is known. Additionally I have the IMU sensor that is also rigidly attached to the base_link and the transform is known.
I have been getting very confused by how the urdf and frame_ids must be set. Should I create a URDF with a odom_link to which the camera frame and the base_link are children. And should I transform as well all the poses of the aruco tags detected by the fixed transform (base_link->aruco_tag_n)⁻1 to get the estimated base_link according to the aruco tags?
Finally, should the world_frame=odom_frame="odom" and the base_link= "base_link".
Thanks in advance!