Computer vision, on its way to taking the fiction out of science fiction

What’s new and what’s next?


A multidisciplinary field, computer vision was revolutionized over the last decade by advances in hardware and cloud computing, which took deep learning and deep neural networks out of the sci-fi realm and into the world of practical applications. With a multitude of real-life applications, the computer vision hardware market is expected to reach $48.6 billion by 2022.

Considered a subfield of artificial intelligence and machine learning, computer vision has taken great leaps in recent years, even surpassing humans in some tasks related to detecting and labeling objects, thanks to innovations in deep learning and neural networks. The amount of data we generate today is one of the driving factors behind this growth: that data is what trains computer vision systems and makes them better.

In the 1960s, making computers see was regarded as a trivially simple problem, one that any student could solve in a summer project by connecting a camera to a computer. But things were not as simple as computer scientists initially thought, and after five decades of research CV remains something of a mystery, at least in terms of matching the capabilities of human vision. That is because we don't actually have a strong grasp of how human vision works. Beyond understanding how our eyes function, interpreting how perception happens within the brain raises problems of its own; as in any study that involves the brain, there is a long way to go…


And to make things even more complicated, there is the complexity inherent in the visual world itself. As we know from everyday experience, any given object may be seen from any angle, in different lighting conditions, with any type of occlusion from other objects, and so on. A genuine vision system must be able to "see" in any of an infinite number of scenes and extract relevant information. But computers don't work well with open, unbounded problems like visual perception; they need a tightly constrained environment in order to produce usable results.

Early experiments in computer vision started in the 1950s, and the technology was first put to commercial use in the 1970s, to distinguish between typed and handwritten text. The field has progressed greatly in recent years, with applications for computer vision growing exponentially. The computing power required to analyze the data is now accessible and more affordable, and over the last 10 years the accuracy of these systems has climbed from 50 percent to 99 percent, making them better suited than humans for reacting quickly to visual inputs.

The arrival of deep learning

Before the emergence of deep learning, the tasks computer vision could perform were very limited and required a great deal of manual coding and effort from developers and human operators.

There was very little automation, with most of the work done manually, and the error margin was unacceptably large. The arrival of machine learning meant that developers no longer needed to manually code every single rule into their vision applications. Instead, they programmed "features," smaller applications that could detect specific patterns in images. A statistical learning algorithm (linear regression, logistic regression, a support vector machine) was then used to detect patterns or classify images.
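This classical pipeline can be sketched in a few lines. The sketch below is a minimal, illustrative example, not any particular historical system: the hand-crafted "feature" program here is simply per-row and per-column mean intensities, fed to a support vector machine from scikit-learn on its bundled 8×8 digit images.

```python
# Classical CV pipeline sketch: hand-crafted features + a statistical
# learning algorithm (here an SVM). The feature extractor is a toy:
# real systems used richer features such as edge histograms.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()

def extract_features(images):
    """Toy 'feature' program: per-row and per-column mean intensities."""
    rows = images.mean(axis=2)      # (n, 8) row averages
    cols = images.mean(axis=1)      # (n, 8) column averages
    return np.hstack([rows, cols])  # (n, 16) feature vectors

X = extract_features(digits.images)
X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, test_size=0.25, random_state=0)

clf = SVC().fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"accuracy: {accuracy:.2f}")
```

Note how the intelligence lives in the hand-written `extract_features`: a weak feature program caps the accuracy no matter how good the classifier is, which is exactly the limitation deep learning later removed.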

To take just one example from the field, machine learning engineers, working alongside cancer experts, were able to create software that could predict breast cancer survival windows better than human experts.

Deep learning was a breakthrough, bringing a fundamentally different approach to machine learning. It relies on neural networks which, given enough labeled examples of a specific kind of data, can extract common patterns between those examples and use them to classify future information.

Compared to previous types of machine learning, deep learning is easier to develop and faster to deploy.

Computer vision applications such as cancer detection, self-driving cars and facial recognition all make use of deep learning.

Computer vision at work

Many popular computer vision applications involve trying to recognize things, for example:

· Object Classification: What broad category of object is in this photograph?

· Object Identification: Which type of a given object is in this photograph?

· Object Verification: Is the object in the photograph?

· Object Detection: Where are the objects in the photograph?

· Object Landmark Detection: What are the key points for the object in the photograph?

· Object Segmentation: What pixels belong to the object in the image?

· Object Recognition: What objects are in this photograph and where are they?
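The detection question above ("where are the objects?") has a very simple baseline worth seeing in code. The sketch below is an illustrative, NumPy-only sliding-window template match on a synthetic image; modern detectors use deep networks, but they answer the same question.

```python
# "Where is the object?" — the simplest possible detector:
# slide a template over the image and keep the best-matching window.
import numpy as np

def detect(image, template):
    """Return (row, col) of the window most similar to the template,
    measured by sum of squared differences."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = np.sum((window - template) ** 2)
            if score < best:
                best, best_pos = score, (r, c)
    return best_pos

# Synthetic example: a bright 3x3 "object" placed at (5, 7) in a 16x16 image.
image = np.zeros((16, 16))
template = np.ones((3, 3))
image[5:8, 7:10] = 1.0
print(detect(image, template))  # → (5, 7), the object's top-left corner
```

This brute-force approach breaks as soon as the object changes angle, scale or lighting, which is precisely the variability discussed earlier and the reason learned detectors took over.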

Other tasks relate to information retrieval, such as finding the image or images that contain a given object.

Besides recognizing elements in an image, other methods of analysis include:

· Video motion analysis: CV is used to estimate the velocity of objects in a video;

· Image segmentation: algorithms partition an image into multiple sets of regions;

· Scene reconstruction: a 3D model of a scene is built from input images or video;

· Image restoration: noise or blurring is removed from photos using machine-learning-based filters.
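Of the tasks above, segmentation is the easiest to show concretely. The sketch below is a deliberately minimal example, assuming nothing beyond NumPy: it partitions pixels into foreground and background by intensity thresholding. Real pipelines use learned models, but they produce the same kind of per-pixel label map.

```python
# Minimal image segmentation sketch: partition pixels into two sets
# (foreground vs background) by thresholding their intensity.
import numpy as np

def segment(image, threshold=0.5):
    """Label each pixel 1 (foreground) or 0 (background)."""
    return (image > threshold).astype(np.uint8)

# Synthetic image: a bright 4x4 square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 0.9
mask = segment(image)
print(mask.sum(), "foreground pixels")  # → 16 foreground pixels
```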

To conclude, any application that involves understanding pixels through software can be labeled as computer vision.

The last decade’s progress in computer vision has been impressive, but we’re still far from solving all its mysteries.

However, there are already multiple areas where core computer vision concepts from machine learning are being integrated into everyday products:

· Face recognition

CV enables computers to match images of people's faces to their identities. Algorithms detect facial features in images and compare them against databases of face profiles. Consumer devices use facial recognition to identify their owners, social media apps use it to detect and tag users, and law enforcement agencies rely on it to identify wanted individuals.
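The matching step described above can be illustrated with a small sketch. The numbers here are entirely made up for illustration: real systems reduce a detected face to an embedding vector of 128 or more dimensions produced by a deep network, then compare it against enrolled profiles, much as the cosine-similarity lookup below does with tiny 4-dimensional vectors.

```python
# Hedged sketch of face matching: compare a probe face's embedding
# against a database of enrolled profiles by cosine similarity.
# The 4-dimensional vectors are fabricated for illustration only.
import numpy as np

database = {
    "alice": np.array([0.9, 0.1, 0.3, 0.2]),
    "bob":   np.array([0.1, 0.8, 0.2, 0.7]),
}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(embedding, db, min_similarity=0.8):
    """Return the best-matching identity, or None if nothing is close."""
    name, score = max(((n, cosine(embedding, v)) for n, v in db.items()),
                      key=lambda item: item[1])
    return name if score >= min_similarity else None

query = np.array([0.85, 0.15, 0.25, 0.25])  # embedding of a probe face
print(identify(query, database))            # → alice
```

The `min_similarity` cutoff is what lets the system say "unknown person" instead of always forcing a match, a design choice that matters for the law-enforcement use case mentioned above.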

· Self-driving cars

Using the technology, self-driving cars make sense of their surroundings. Cameras capture video from different angles and feed it to computer vision software, which detects the edges of roads, reads traffic signs, and "sees" other cars, objects and pedestrians.

· Health care

Computer vision algorithms can help automate tasks such as detecting cancerous moles in skin images or finding symptoms in X-ray and MRI scans.

· Augmented & Mixed reality

The technology also plays an important role in augmented and mixed reality, enabling smartphones, tablets and smart glasses to overlay and embed virtual objects on real-world imagery.

In conclusion, computer vision remains an elusive field of computer science: in spite of huge progress over the last decade, there is still much to explore, and breakthrough innovations are still to be made.
