What Are the Applications of Computer Vision?

Computer Vision - An Overview

What exactly is computer vision?

Computer vision is a branch of artificial intelligence that deals with extracting information from visual data. This data can take the form of photos, scans, video sequences, or multidimensional data such as the output of medical scanners. Computer vision is modeled on the human ability to capture, process and analyze images. In some areas, machines have even outperformed humans. This article is the first in a series intended to provide an overview of the tasks, areas of application and technology of this broad field.

What are typical tasks of computer vision?

Computer vision systems are designed to identify, localize and classify objects in images. One of the most important tasks is object recognition, which can be divided into several sub-areas:

Object classification

Classification, or categorization, assigns discrete categories such as “dog”, “human” or “car” to images. The granularity can be increased to the point where a system not only differentiates between dogs and cats but also identifies individual dog breeds or specific people (face recognition). Several different objects can be recognized and classified within one image.

Image Classification: The Assistant in the Google Photos app uses computer vision to assign vacation pictures to certain classes.
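To make the idea of assigning a discrete category more concrete, here is a minimal sketch of the final step of a classifier. It assumes that some model has already produced one raw score per class; the label names and score values below are hypothetical and serve only as an illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical class names and raw scores, e.g. from a model's final layer.
LABELS = ["dog", "human", "car"]
label, confidence = classify([2.0, 0.5, -1.0], LABELS)
print(label, round(confidence, 2))
```

Finer granularity, such as individual dog breeds, would in this sketch simply mean a longer list of labels and scores.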

Object localization

Localization describes the position of an object in the picture, for example by placing a “bounding box” as tightly as possible around the object. Segmentation, on the other hand, is used when images are to be divided pixel by pixel into regions with symbolic meaning, for example into “foreground” and “background”. Object recognition can also be used to estimate the position of an object relative to the camera. This is used in robotics, for example, when a robot arm has to pick up certain objects from an assembly line.

Localization and Classification: The system tries to recognize people and to mark their position with a bounding box.
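The relationship between a segmentation mask and a bounding box can be sketched in a few lines. The example below assumes a binary foreground mask given as a list of rows (1 for object pixels, 0 for background) and derives the tightest box around the object. The `iou` helper computes intersection over union, a common way to compare a predicted box with a reference box. The function names and the toy mask are illustrative assumptions, not part of any particular library.

```python
def bounding_box(mask):
    """Return the tightest (top, left, bottom, right) box around nonzero pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None  # the mask contains no foreground pixels
    return (rows[0], cols[0], rows[-1], cols[-1])

def area(box):
    """Number of pixels covered by an inclusive pixel box."""
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)

def iou(a, b):
    """Intersection over union of two inclusive pixel boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, bottom - top + 1) * max(0, right - left + 1)
    return inter / (area(a) + area(b) - inter)

# Toy foreground/background mask (1 = object, 0 = background).
MASK = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(MASK)
print(box, iou(box, (1, 1, 2, 2)))
```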

Content-Based Image Retrieval

In content-based image retrieval, object recognition is used to search large amounts of data for specific content. Here, “content” does not refer to metadata, such as the file name or the image description, but to the information contained in the images themselves. The advantage over conventional search engines is that the search is no longer limited by the quality of the metadata, which is often time-consuming and costly to create.
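As a rough illustration of searching by content rather than by metadata, the sketch below ranks a small collection by how similar each image's intensity histogram is to that of a query image. This is a deliberately simple stand-in for object recognition: real systems use much richer features. The file names and pixel values are hypothetical, and each “image” is just a flat list of gray values between 0 and 255.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a flat list of grayscale values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def search(query, collection):
    """Rank image names by similarity of their content to the query image."""
    q = histogram(query)
    scored = [(similarity(q, histogram(px)), name) for name, px in collection.items()]
    return [name for score, name in sorted(scored, reverse=True)]

# Hypothetical toy collection; the names are never used for matching.
COLLECTION = {
    "night.png": [10, 20, 30, 40, 15, 25],
    "snow.png": [250, 240, 230, 245, 235, 225],
    "dusk.png": [30, 60, 90, 120, 40, 70],
}
ranking = search([12, 22, 35, 45, 18, 28], COLLECTION)
print(ranking)
```

Note that the file names only label the results. The ranking itself depends solely on the pixel values.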

Motion analysis

Another task of computer vision is motion analysis. An important sub-area is egomotion, the estimation of the camera's movement relative to a three-dimensional environment. This technology is crucial for controlling autonomous cars, for example. Tracking, on the other hand, follows the movement of specific objects over time within a larger scene.

Motion Analysis: This is what motion tracking of a person can look like.
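One simple way to think about tracking is as a matching problem between consecutive frames. The sketch below assumes that a detector has already produced object centroids for each frame and greedily links each known track to its nearest new detection. The track ids, coordinates and the `max_distance` threshold are illustrative assumptions. Production trackers typically add motion models and handle new and lost objects.

```python
import math

def track(previous, detections, max_distance=20.0):
    """Match each tracked object to its nearest new detection.

    previous: dict of track id -> (x, y) centroid in the last frame.
    detections: list of (x, y) centroids in the current frame.
    Returns a dict of track id -> (x, y) for the tracks that were matched.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(pos, det), track_id, i)
        for track_id, pos in previous.items()
        for i, det in enumerate(detections)
    )
    matched, used = {}, set()
    for distance, track_id, i in pairs:
        if distance > max_distance:
            break  # remaining pairs are even farther apart
        if track_id in matched or i in used:
            continue  # each track and each detection is used at most once
        matched[track_id] = detections[i]
        used.add(i)
    return matched

# Hypothetical centroids from two consecutive frames.
frame1 = {"a": (10, 10), "b": (50, 50)}
frame2 = [(52, 49), (13, 12), (200, 200)]
updated = track(frame1, frame2)
print(updated)
```

The detection at (200, 200) stays unmatched because it is farther than `max_distance` from every existing track.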

Reconstruction, Image Captioning, and more

However, this does not exhaust the potential of computer vision. Other problems for which computer vision offers successful solutions include the creation of 3D models of scenes from individual images, the removal of background noise from images, the reconstruction of missing image elements, the description of image content (image captioning) and much more.
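Of the tasks listed above, noise removal is the easiest to sketch briefly. The example below applies a 3x3 median filter, a classic technique for suppressing isolated outlier pixels, to a tiny grayscale image stored as a list of rows. It is meant only as an illustration under these simplifying assumptions and is not the method of any specific product.

```python
from statistics import median_low

def median_filter(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Pixels at the border use only the neighbors that exist.
    """
    height, width = len(image), len(image[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            # median_low always returns a value that occurs in the window.
            row.append(median_low(window))
        result.append(row)
    return result

# Toy image with a single bright outlier in the center.
NOISY = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
clean = median_filter(NOISY)
print(clean)
```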