
AI, a real boost for Computer Vision
Computer vision (CV) is a multidisciplinary branch at the crossroads of mathematics and artificial intelligence (AI). It is about how computers can gain an understanding of the world around them from digital images or videos. Its main objective is to allow a machine, through algorithms and models, to interpret visual content the way a human does. But will computer vision be able to replicate human capabilities, or even exceed them? Will the progress of AI allow it to automate, and thus make obsolete, certain professions?
Computer vision is a set of technologies that allows a machine to see and, above all, to understand what it sees and draw conclusions from it. The tools and methods involved can be both hardware and software. During acquisition, the principle is to imitate human or animal vision by building complex devices that can offer wide-field vision (as in certain birds) or even night vision (e.g. infrared, ultraviolet, X-rays, etc.), thus exceeding the wavelengths that the human eye perceives. Sometimes the image is acquired by special sensors suited to extreme environments, such as the interior of nuclear facilities. Beyond visual perception, however, computer vision aims to automate and integrate a wide range of processes to enable better analysis and understanding of the perceived scene.
But why give eyes to an application? How could a computer that “sees” be useful? And what are vision’s areas of application?
The fields of application of vision are very wide, both in still imaging and in video processing. As a scientific discipline, vision deals with the recognition of shapes, faces or people, and with the detection of areas of interest and characteristic points in a scene. In industry, this research work has led to a range of use cases in security, biometrics, sport (Foxtenn and Hawk-Eye) and medical imaging, which represents one of computer vision’s greatest promises: a radiologist assisted by anomaly or tumor detection software saves time and improves patient safety. In video processing, we find methods for tracking objects, targets or people, as well as techniques for reconstructing 3D objects or scenes.
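To make this concrete, here is a minimal sketch of two of these building blocks, assuming the OpenCV library (cv2) is installed and that a local image file named scene.jpg exists (the file name is purely illustrative): it detects faces with a pre-trained Haar cascade shipped with OpenCV and extracts characteristic points with the ORB detector.

```python
import cv2

# Load an image in grayscale ("scene.jpg" is an illustrative placeholder).
image = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

# Face detection with a pre-trained Haar cascade shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
faces = face_cascade.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)

# Characteristic point (keypoint) detection with the ORB detector.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(image, None)

print(f"{len(faces)} face(s) and {len(keypoints)} keypoint(s) detected")
```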
However, despite this variety of applications, identifying shapes or tracking a target in a video is no longer enough. The mathematical complexity of designing more capable algorithms long held back the advancement of vision techniques, and applications such as the recognition of emotions or events remained out of reach for many years. Even today, vision faces major difficulties in interpreting complex scenes on video: detecting activities, recognizing feelings, suggesting actions, or issuing alerts in the event of danger.
Having seen little significant progress until the 2000s, how did vision clear this hurdle?
The field of artificial intelligence (AI), and more particularly Machine Learning (ML) and Deep Learning (DL) techniques, has enabled computers to perform many processes long considered impossible. ML is progressing at an impressive pace and its impact on industry is undeniable. It must be recognized that ML techniques have been decisive in lifting several scientific barriers on subjects that were until then inaccessible, and computer vision is obviously one of them. Indeed, it is experiencing spectacular development, thanks to hardware advances and to the momentum given by ML, and in turn brings strong added value to multiple higher-level application fields such as biometrics, medicine, security, surveillance, etc. An algorithm designed to identify cars and people as distinct objects is now able, thanks to AI, to predict what those objects are likely to do in the next instant. This obviously requires analyzing the interactions between objects and building statistical models that describe these interactions.
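As an illustration of the first step of that pipeline, the following sketch loads an object detector pre-trained on the COCO dataset and reports the people and cars it finds in a street scene; predicting their future behavior would then build on such detections. It assumes PyTorch and a recent torchvision are installed, and street.jpg is a purely illustrative file name.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Detector pre-trained on COCO, where label 1 is "person" and label 3 is "car".
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# "street.jpg" is an illustrative placeholder for a street-scene photograph.
image = Image.open("street.jpg").convert("RGB")

with torch.no_grad():
    prediction = model([to_tensor(image)])[0]

# Keep confident detections and report people and cars separately.
for label, score in zip(prediction["labels"], prediction["scores"]):
    if score < 0.7:
        continue
    if label.item() == 1:
        print(f"person detected (confidence {score:.2f})")
    elif label.item() == 3:
        print(f"car detected (confidence {score:.2f})")
```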
With this boost given by ML techniques, can vision overtake humans?
Today, thanks to vision, a robot (or drone) can see, analyze, avoid obstacles, understand, learn, and even explain. Once all of this is acquired, the robot can perform many useful tasks that humans cannot or do not want to do. Beyond daily chores, a robot can visually recognize the person ringing your doorbell and carry out follow-up tasks, much as a human would. Cowa-Robot is a suitcase robot which, thanks to a camera, can follow you everywhere while avoiding obstacles. Another similar prototype provides strong assistance to people in need, particularly the visually impaired. Autonomous vehicles are another of those technological advances that arouse both awe and apprehension, and they raise important legal and ethical questions. Smart, connected autonomous cars can communicate with each other and drive safely on their own. For example, an intersection could be directed by a machine connected to the vehicles: equipped with one or more cameras, it would count the number of vehicles coming from each lane and adapt the vehicles’ behavior to optimize traffic flow.
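To illustrate the intersection scenario, here is a minimal, hypothetical sketch: given vehicle bounding boxes produced by a detector such as the one above, it counts vehicles per lane using fixed, made-up pixel boundaries for each lane. A real system would of course calibrate the lane geometry from the camera and feed the counts into a traffic-control policy.

```python
# Hypothetical lane boundaries expressed as x-ranges in the camera image (pixels).
LANES = {"lane_1": (0, 400), "lane_2": (400, 800), "lane_3": (800, 1200)}

def count_vehicles_per_lane(detections):
    """Assign each detected vehicle to a lane using the box's horizontal centre.

    `detections` is a list of bounding boxes (x_min, y_min, x_max, y_max)
    produced by a vehicle detector.
    """
    counts = {name: 0 for name in LANES}
    for x_min, _, x_max, _ in detections:
        centre_x = (x_min + x_max) / 2
        for name, (left, right) in LANES.items():
            if left <= centre_x < right:
                counts[name] += 1
                break
    return counts

# Example: three boxes, two in the first lane and one in the third.
print(count_vehicles_per_lane([(50, 10, 200, 120), (210, 30, 380, 150), (900, 20, 1100, 140)]))
# -> {'lane_1': 2, 'lane_2': 0, 'lane_3': 1}
```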
From a scientific perspective, vision seeks to automate tasks that the human visual system can perform, from acquisition to understanding and analysis. As a technological discipline, by contrast, vision refers to a combination of image acquisition and analysis techniques used to inspect objects or places automatically. The rapid evolution of AI techniques has enabled this multidisciplinary branch to give unprecedented impetus to several higher-level application areas such as robotics and smart cars. The ability to combine HD video, image processing, analytics, and connectivity has become an essential feature for building next-generation smart applications. In industry, innovative projects require mastery of CV and ML techniques to develop applications that meet today’s needs.
However, no matter how far vision and artificial intelligence advance in the future, a machine will never equal the perception and thinking of a human. After all, was it not Albert Einstein who said that “machines one day will be able to solve all problems, but none of them will ever be able to pose one”?
INFO +: ILYEUM INSIGHTS is working on a technical assistance application for people in different fields such as DIY, maintenance, and assembly, based on computer vision and Machine Learning. Are you interested in computer vision and the ILYEUM Inno Lab projects? Contact us: labinno@ilyeum.com
To go further:
Links
https://vision.i2s.fr/fr/57-objectif-vision-360
https://www.scriptol.fr/robotique/vision.php
https://www.scriptol.fr/robotique/robots-humanoides.php
References
N. Kavitha, D. N. Chandrappa, “Comparative Review on Video Based Vehicular Traffic Data Collection for Intelligent Transport System”, International Conference on Recent Advances in Electronics and Communication Technology (ICRAECT), pp. 260-264, 2017.
Shanghang Zhang, Guanhang Wu, Joao P. Costeira, Jose M. F. Moura, “FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras”, The IEEE International Conference on Computer Vision (ICCV), pp. 3667-3676, 2017.
Rajalingappaa Shanmugamani, “Deep Learning for Computer Vision: Expert Techniques to Train Advanced Neural Networks Using TensorFlow and Keras”, Packt Publishing, January 23, 2018.
Roland Siegwart, Illah Reza Nourbakhsh, Davide Scaramuzza, “Introduction to Autonomous Mobile Robots”, The MIT Press, 2nd Edition, February 18, 2011.
Richard Szeliski, “Computer Vision: Algorithms and Applications”, Springer, 2011 edition.
