Questions tagged [computer-vision]
Computer vision is a subfield of computer science that deals with analyzing and understanding images. This includes detecting objects such as faces in images and segmenting images.
609 questions
0 votes
1 answer
122 views
YOLO knowledge distillation (11x to 11n) yields poorer performance than native training
I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...
2 votes
0 answers
62 views
DenseNet169 model accuracy not increasing on medical classification dataset
I am training a DenseNet model on a medical dataset which has gold standards as per annotation. After training, I noticed the accuracy is just 60%. Later I made the following changes, but still no luck. ...
5 votes
1 answer
94 views
How to normalize bounding box sizes in perspective transform for objects at different distances from the camera
I’m working on an object detection system and I'm new to this field. Here I'm talking with respect to the camera's point of view. When an object that is far from the camera is detected, it appears small and ...
0 votes
0 answers
36 views
How to properly implement and debug RPN anchors in ResNet-18 for multi-object detection?
I am working on my first object detection project and need to implement multi-object detection using ResNet-18 (I am restricted to using this architecture). My dataset follows the COCO format and ...
0 votes
0 answers
51 views
How can I convert a one-line substation schema image into XML/JSON with all components and connections preserved?
I have an image of a one-line substation schema diagram that includes various components (like transformers, circuit breakers, etc.) and the connections between them. I’m looking for a way to convert ...
1 vote
0 answers
42 views
CNN for gaze regression predicts near the mean
I am currently building my first CNN on my own for a regression task in which the network must predict the coordinates I am looking at on my screen, based on an input image taken through my ...
0 votes
1 answer
29 views
Looking for an image dataset with multiple images per instance
I'm looking for an image dataset that has multiple images per instance. For example, a healthcare dataset where each person is classified with a diagnosis and has several images describing them.
1 vote
0 answers
36 views
How to make a correct prompt for "gpt-4o" vision API to find letters in an image?
I have an example of a generated image containing words, as well as several red arrows pointing to certain characters. I need to get these characters from GPT, but when I ask "what characters do ...