Questions tagged [computer-vision]

Computer vision is a subfield of computer science that deals with analyzing and understanding images. This includes tasks such as detecting objects (for example, faces) and segmenting images.

609 questions

0 votes

1 answer

122 views

YOLO knowledge distillation (11x to 11n) yields poorer performance than native training

I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...

Simon Hergott

asked Jul 11 at 11:30

2 votes

0 answers

62 views

DenseNet169 model accuracy not increasing on medical classification dataset

I am training a DenseNet model on a medical dataset that has gold-standard annotations. After training, I noticed the accuracy is just 60%. I later made the following changes, but still no luck. ...

NIrbhay Mathur

asked May 22 at 4:15

5 votes

1 answer

94 views

How to normalize bounding box sizes in perspective transform for objects at different distances from the camera

I'm working on an object detection system and I'm new to this field. Here I'm talking with respect to the camera's point of view. When an object that is far from the camera is detected, it appears small and ...

Basavaraj Kittali

asked May 14 at 12:42

0 votes

0 answers

36 views

How to properly implement and debug RPN anchors in ResNet-18 for multi-object detection?

I am working on my first object detection project and need to implement multi-object detection using ResNet-18 (I am restricted to using this architecture). My dataset follows the COCO format and ...

Daniel

asked Mar 17 at 10:50

0 votes

0 answers

51 views

How can I convert a one-line substation schema image into XML/JSON with all components and connections preserved?

I have an image of a one-line substation schema diagram that includes various components (like transformers, circuit breakers, etc.) and the connections between them. I’m looking for a way to convert ...

Necrosis

asked Mar 9 at 10:51

1 vote

0 answers

42 views

CNN for gaze regression predicts near the mean

I am currently building my first CNN on my own for a regression task in which the network must predict the screen coordinates I am looking at, based on an input image taken through my ...

bebel

asked Mar 1 at 13:24

0 votes

1 answer

29 views

Looking for an image dataset with multiple images per instance

I'm looking for an image dataset that has multiple images per instance. For example, a healthcare dataset where each person is classified with a diagnosis and has several images describing them.

J. Doe

asked Feb 6 at 19:24

1 vote

0 answers

36 views

How to make a correct prompt for "gpt-4o" vision API to find letters in an image?

I have an example of a generated image containing words, as well as several red arrows pointing to certain characters. I need to get these characters from GPT, but when I ask "what characters do ...

user175111

asked Dec 17, 2024 at 21:13
