You can try letterboxing.
Letterboxing resizes each frame while keeping its aspect ratio, then pads the sides or the top and bottom to reach the target shape. Because nothing is cropped out, the whole human-object interaction stays in the frame, which is especially useful for wide frames such as 1280x720 (16:9).
Here is a letterboxing implementation in Python using OpenCV:
import cv2


def class_letterbox(im, new_shape=(640, 640), color=(0, 0, 0), scaleup=True):
    """
    Letterbox an image: resize it with its aspect ratio preserved, then pad to new_shape.

    Args:
        im (np.ndarray): Input image of shape (height, width[, channels]).
        new_shape (int or tuple, optional): Target (height, width). Defaults to (640, 640).
        color (tuple, optional): Padding color. Defaults to (0, 0, 0).
        scaleup (bool, optional): Allow scaling the image up. Defaults to True.

    Returns:
        im (np.ndarray): Processed image.
    """
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Nothing to do if the image already has the target shape
    if im.shape[0] == new_shape[0] and im.shape[1] == new_shape[1]:
        return im

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # (width, height)
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)

    # The -0.1 / +0.1 offsets split an odd padding total between the two sides
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im
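If your frames also have bounding-box annotations for the human and the object, those boxes need the same scale and offset that the letterbox applies to the pixels. The sketch below recomputes the scale and padding with the same arithmetic as the function above. The helper names letterbox_params and map_boxes are hypothetical, and boxes are assumed to be in (x1, y1, x2, y2) pixel format.

```python
import numpy as np


def letterbox_params(shape, new_shape=(640, 640), scaleup=True):
    """Return (r, left, top): the scale and left/top padding for an image of shape (h, w)."""
    h, w = shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)
    if not scaleup:
        r = min(r, 1.0)
    new_w, new_h = int(round(w * r)), int(round(h * r))
    dw = (new_shape[1] - new_w) / 2
    dh = (new_shape[0] - new_h) / 2
    # Same rounding as the letterbox function for the left and top borders
    left, top = int(round(dw - 0.1)), int(round(dh - 0.1))
    return r, left, top


def map_boxes(boxes_xyxy, r, left, top):
    """Map (x1, y1, x2, y2) boxes from the original frame into the letterboxed frame."""
    boxes = np.asarray(boxes_xyxy, dtype=np.float32).reshape(-1, 4).copy()
    boxes[:, [0, 2]] = boxes[:, [0, 2]] * r + left  # x coordinates
    boxes[:, [1, 3]] = boxes[:, [1, 3]] * r + top   # y coordinates
    return boxes


# A 1280x720 frame letterboxed to 640x640 is scaled by 0.5 and padded top and bottom
r, left, top = letterbox_params((720, 1280))
new_boxes = map_boxes([[100, 200, 300, 400]], r, left, top)
```

For a 1280x720 frame and a 640x640 target, the frame is scaled by 0.5 to 640x360 and 140 rows of padding are added above and below, so only the y coordinates are shifted.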
Alternatively, you can try the following.
To stop random cropping from removing the object of interest during augmentation, you could try a few strategies:
Center Cropping:
Instead of random cropping, you could use center cropping, ensuring that the middle of the frame (where the human-object interaction likely occurs) is always preserved.
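A minimal center-crop sketch with NumPy slicing, assuming the frame is an array of shape (height, width[, channels]). The function name center_crop is only illustrative.

```python
import numpy as np


def center_crop(im, crop_h, crop_w):
    """Return the central crop_h x crop_w window of im (clamped to the frame size)."""
    h, w = im.shape[:2]
    crop_h, crop_w = min(crop_h, h), min(crop_w, w)
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return im[top:top + crop_h, left:left + crop_w]


frame = np.zeros((720, 1280, 3), dtype=np.uint8)
square = center_crop(frame, 720, 720)  # keeps columns 280..999 of the frame
```

Note that on a 1280x720 frame a 720x720 center crop still discards 280 columns on each side, so this only helps when the interaction really is near the middle.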
Object-aware Cropping:
Use a bounding box or region proposal to locate the human and object before cropping. You can crop around this region to ensure the object of interest remains in the frame.
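One way to sketch object-aware cropping: take the union of the human and object boxes, then sample a crop window at random among the positions that fully contain that union. This assumes boxes in (x1, y1, x2, y2) pixel format that lie inside the frame, coming from annotations or a detector of your choice. The function name object_aware_crop is hypothetical.

```python
import numpy as np


def object_aware_crop(im, boxes_xyxy, crop_h, crop_w, rng=None):
    """Random crop guaranteed to contain the union of the given boxes.

    Returns the crop and its (left, top) offset in the original frame.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = im.shape[:2]
    boxes = np.asarray(boxes_xyxy, dtype=np.float64).reshape(-1, 4)

    # Union of all boxes, clipped to the frame
    x1 = max(int(np.floor(boxes[:, 0].min())), 0)
    y1 = max(int(np.floor(boxes[:, 1].min())), 0)
    x2 = min(int(np.ceil(boxes[:, 2].max())), w)
    y2 = min(int(np.ceil(boxes[:, 3].max())), h)

    # Grow the crop if the union is larger than the requested window
    crop_w = min(max(crop_w, x2 - x1), w)
    crop_h = min(max(crop_h, y2 - y1), h)

    # Valid offsets: the window starts before the union and ends after it
    left_lo, left_hi = max(0, x2 - crop_w), min(x1, w - crop_w)
    top_lo, top_hi = max(0, y2 - crop_h), min(y1, h - crop_h)
    left = int(rng.integers(left_lo, left_hi + 1))
    top = int(rng.integers(top_lo, top_hi + 1))
    return im[top:top + crop_h, left:left + crop_w], (left, top)


frame = np.zeros((720, 1280, 3), dtype=np.uint8)
boxes = [[600, 300, 700, 400], [650, 350, 900, 500]]  # e.g. human box and object box
crop, (left, top) = object_aware_crop(frame, boxes, 448, 448)
```

The crop is still random, so you keep the augmentation effect, but the offset range is restricted so that the interaction region cannot be cut off.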
Random Resizing with Padding:
Instead of cropping, resize the frames while keeping the aspect ratio, and pad the remaining area with a solid color (usually black or white). This would retain the object of interest and help the model generalize to different sizes.
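A rough sketch of random resizing with padding: pick a random scale, resize with the aspect ratio preserved, and paste the result at a random position on a solid-color canvas. To keep the example dependency-free it uses a nearest-neighbour index lookup as a stand-in for cv2.resize. The scale range, the random placement, and the function name random_resize_pad are all assumptions for illustration.

```python
import numpy as np


def random_resize_pad(im, out_size=640, scale_range=(0.5, 1.0), color=0, rng=None):
    """Resize im by a random aspect-preserving scale and pad it to out_size x out_size.

    Returns the padded canvas and (r, left, top) so annotations can be mapped too.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = im.shape[:2]

    # Largest scale that still fits the canvas, shrunk by a random factor
    fit = min(out_size / h, out_size / w)
    r = fit * rng.uniform(*scale_range)
    new_h = min(max(1, int(round(h * r))), out_size)
    new_w = min(max(1, int(round(w * r))), out_size)

    # Nearest-neighbour resize via index lookup (stand-in for cv2.resize)
    rows = np.minimum((np.arange(new_h) / r).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / r).astype(int), w - 1)
    resized = im[rows][:, cols]

    # Solid-color canvas with the resized frame pasted at a random offset
    canvas = np.full((out_size, out_size) + im.shape[2:], color, dtype=im.dtype)
    top = int(rng.integers(0, out_size - new_h + 1))
    left = int(rng.integers(0, out_size - new_w + 1))
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, (r, left, top)


frame = np.full((720, 1280, 3), 255, dtype=np.uint8)
padded, (r, left, top) = random_resize_pad(frame, out_size=640)
```

Since the whole frame is always pasted onto the canvas, the object of interest is never removed; only its size and position change.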
Custom Cropping Policies:
You can define a set of policies that guide cropping. For example, limit the crop to a central area or the region where humans and objects tend to appear, rather than performing purely random crops.
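As one example of such a policy, the sketch below keeps the crop random but limits how far the crop window may drift from the centered position. The max_offset parameter (a fraction of the frame size) and the function name constrained_random_crop are hypothetical; you would tune the limit to where interactions tend to appear in your data.

```python
import numpy as np


def constrained_random_crop(im, crop_h, crop_w, max_offset=0.2, rng=None):
    """Random crop whose window stays within max_offset of the centered position.

    max_offset is a fraction of the frame height/width; 0 gives a plain center crop.
    Returns the crop and its (left, top) offset in the original frame.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = im.shape[:2]
    crop_h, crop_w = min(crop_h, h), min(crop_w, w)

    # Top-left corner of the centered window
    top_c, left_c = (h - crop_h) // 2, (w - crop_w) // 2

    # Allowed drift in pixels, clamped so the window stays inside the frame
    dy, dx = int(max_offset * h), int(max_offset * w)
    top_lo, top_hi = max(0, top_c - dy), min(h - crop_h, top_c + dy)
    left_lo, left_hi = max(0, left_c - dx), min(w - crop_w, left_c + dx)

    top = int(rng.integers(top_lo, top_hi + 1))
    left = int(rng.integers(left_lo, left_hi + 1))
    return im[top:top + crop_h, left:left + crop_w], (left, top)


frame = np.zeros((720, 1280, 3), dtype=np.uint8)
crop, (left, top) = constrained_random_crop(frame, 448, 448, max_offset=0.1)
```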