The evolution of machine vision

By Jon Gabay, Mouser Electronics
Friday, 09 August, 2024

Since the first weather satellites, machines have been endowed with “vision”, setting the stage for the evolution of machine vision applications. In the early stages, these applications depended heavily on humans to analyse the images and extract the crucial information. With continual advancements, however, modern image-processing techniques now surpass human capabilities in many tasks, detecting details beyond our perception. Digitising image data has played a significant role in achieving these capabilities, and integrating artificial intelligence (AI) into these digitised systems has opened a new realm of possibilities, transforming machines into more sophisticated tools. As we stand at the cusp of this revolution, more devices are incorporating vision into their core functionalities.

Introduction of the digital darkroom

News services and television broadcasts accelerated the widespread adoption of digital images. The first digital image, credited to Russell Kirsch in 1957, marked a pivotal moment in the history of image processing. A major stride was made in the late 1970s when news bureaus began using a software-based approach that leveraged the power of early microprocessors to create the first digital darkroom. In 1987, the early Macintosh program “Digital Darkroom” made history as the first tool available to the public that could edit and manipulate images. The advent of high-resolution scanners subsequently brought the power of image digitisation to everyday users. The digitised images allowed a variety of manipulations, such as cropping, adjusting contrast and brightness, and resizing, as well as basic enhancements like edge convolution. As technology advanced, these capabilities evolved and extended, calling for superior hardware and more refined software solutions.
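As a rough illustration of the manipulations described above, the following Python sketch adjusts brightness and contrast and applies a 3 × 3 edge kernel to a small 8-bit greyscale image stored as a list of rows. It is not taken from any historical darkroom software; the function names and the choice of a Laplacian kernel are assumptions made for illustration.

```python
def adjust(image, contrast=1.0, brightness=0):
    """Scale each 8-bit pixel by `contrast`, add `brightness`, clamp to 0..255."""
    return [[max(0, min(255, round(p * contrast + brightness))) for p in row]
            for row in image]


# A common 3x3 Laplacian kernel (an assumed choice): it responds where a
# pixel differs from its four direct neighbours. The kernel is symmetric,
# so convolution and correlation give the same result.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def convolve3x3(image, kernel=LAPLACIAN):
    """Apply a 3x3 kernel to interior pixels; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            # Keep the magnitude of the response and clamp to the 8-bit range.
            out[y][x] = max(0, min(255, abs(acc)))
    return out


# A 4x4 image with a vertical step from black to white.
step = [[0, 0, 255, 255] for _ in range(4)]
edges = convolve3x3(step)
```

On the step image, the interior pixels on either side of the black-to-white boundary produce a strong response, while a uniform image produces none.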

Integrating machine vision into society

In our modern era, image sensors and machine vision have become integral to our lives. These technologies are present everywhere, from personal devices such as phones, tablets and laptops to industrial imaging systems in factories, vehicles and highways. The cost-to-performance ratio of these systems has dramatically improved over the years, and the ubiquity of digital cameras extends to our homes and public buildings, where they provide security. The advancements in image-sensing technology have significantly eased the burden on engineers, who now have access to a range of modular cameras with built-in processing and communication capabilities.

Machine vision has greatly benefited society. In factories, it is used to inspect products, control robotic manipulators, detect flaws, and ensure safety. For example, modern high-frame-rate auto-focusing cameras can inspect thousands of solder joints on a circuit board within seconds, a task that would take humans several minutes (Figure 1).

Figure 1: Machine vision inspection of unstuffed and stuffed PC boards can detect manufacturing flaws much more quickly and accurately than any human. Image credit: iStock.com/kynny.

We also see machine vision in vehicle cameras and image sensors that provide backup views, presence detection, and collision avoidance. In the medical field, high-resolution image sensors are used for everything from diagnosing fractures to identifying individual cells in a tissue sample. In some of these tasks, the combination of machine vision and AI has outperformed experienced doctors, at times discovering indicators that were previously unknown. In cities, street and building cameras use real-time facial recognition for security or targeted advertising. Both law enforcement and the military have also adopted machine vision technology for use in drones, missiles, aircraft, and satellites.

Modern capabilities of machine vision

While there has been an exponential increase in resolution and frame rate, the most notable advancement in machine vision is stereoscopic vision, which enables machines to perceive in three dimensions. This development has had profound implications in the industrial and manufacturing sectors, aiding in the swift identification and classification of objects and enhanced flaw detection.
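One common way to recover depth from a stereoscopic pair can be sketched in a few lines. The Python example below assumes an idealised, rectified two-camera rig under the pinhole model, where depth equals focal length times baseline divided by disparity. The function name and the sample numbers are hypothetical and are not drawn from any particular camera.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by a rectified stereo pair.

    focal_px     -- focal length expressed in pixels
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 800 px focal length, cameras 10 cm apart.
near = depth_from_disparity(800, 0.10, 40)  # large shift, close object
far = depth_from_disparity(800, 0.10, 20)   # small shift, distant object
```

Halving the disparity doubles the estimated depth, which is why depth resolution falls off for distant objects.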

3D perception has revolutionised manufacturing processes and self-driving vehicle operations. Coupled with robust processors and deep memory pools, 3D machine vision systems can swiftly identify an object, track its movement, discern its speed and direction, and predict its future location.
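The tracking and prediction step can be illustrated with the simplest possible motion model. The Python sketch below assumes constant velocity between two 3D observations; production systems typically use more robust estimators, and all names and values here are hypothetical.

```python
import math


def estimate_velocity(p0, p1, dt):
    """Velocity vector (units per second) from two positions taken dt seconds apart."""
    return tuple((b - a) / dt for a, b in zip(p0, p1))


def speed(velocity):
    """Magnitude of the velocity vector."""
    return math.sqrt(sum(c * c for c in velocity))


def predict_position(position, velocity, horizon):
    """Extrapolate the position `horizon` seconds ahead, assuming constant velocity."""
    return tuple(p + v * horizon for p, v in zip(position, velocity))


# Two hypothetical observations of the same object, one second apart (metres).
first = (0.0, 0.0, 10.0)
second = (3.0, 0.0, 6.0)

v = estimate_velocity(first, second, dt=1.0)
future = predict_position(second, v, horizon=2.0)
```

The velocity vector gives direction, its magnitude gives speed, and extrapolating along it gives the predicted location.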

Evolving design challenges for next-gen machine vision systems

As camera capabilities increase, so does the need for faster, more advanced electronics. For example, an old 256 × 256 px, 8-bit greyscale camera needs 65,536 bytes to capture a single image. At a moderate 30 frames per second (FPS), the rate needed to exceed the flicker fusion rate of the eye, one second of video takes 1,966,080 bytes. By contrast, a high-end, high-resolution, fast-frame-rate camera can capture 4,000 FPS at 1080 × 800 px with 24-bit colour. This translates to 2,592,000 bytes per image, and an astounding 10,368,000,000 bytes for one second of video.
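The arithmetic behind these figures is easy to reproduce. The short Python sketch below computes uncompressed frame and video sizes, taking the greyscale camera as 8 bits per pixel; the helper names are illustrative only.

```python
def bytes_per_frame(width, height, bits_per_pixel):
    """Uncompressed size of one frame in bytes."""
    return width * height * bits_per_pixel // 8


def bytes_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed size of one second of video in bytes."""
    return bytes_per_frame(width, height, bits_per_pixel) * fps


# Older greyscale camera: 256 x 256 px, 8 bits per pixel, 30 FPS.
old_frame = bytes_per_frame(256, 256, 8)
old_second = bytes_per_second(256, 256, 8, 30)

# High-end camera: 1080 x 800 px, 24 bits per pixel, 4,000 FPS.
new_frame = bytes_per_frame(1080, 800, 24)
new_second = bytes_per_second(1080, 800, 24, 4000)

print(f"{old_frame:,} B/frame, {old_second:,} B/s")
print(f"{new_frame:,} B/frame, {new_second:,} B/s")
```

The ratio between the two per-second figures is a little over 5,000, which gives a sense of how much harder the supporting memory and buses must work.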

To meet these data demands, the processing and communications speeds of the supporting hardware have skyrocketed. So too has the need for massive pools of very-high-speed memory (typically DDR4, which transfers data on both clock edges), since a write accompanies each read. Fortunately, multi-core processors and dedicated FPGA hardware can pipeline data streams and perform rudimentary image enhancements such as bit-plane separation in real time. Bit-plane separation splits a monochrome image into one binary image per bit, and examining only the most significant bit planes is a simple way to detect edges.
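Bit-plane separation is simple enough to sketch directly. The Python example below extracts a single bit plane from an 8-bit monochrome image and then marks where the most significant bit changes between horizontal neighbours. Treating those transitions as edges is one plausible reading of the technique, not a description of any specific FPGA implementation, and the function names are hypothetical.

```python
def bit_plane(image, bit):
    """Binary image (0 or 1) holding one bit of every 8-bit pixel (0 = LSB, 7 = MSB)."""
    return [[(pixel >> bit) & 1 for pixel in row] for row in image]


def msb_transitions(image):
    """Mark positions where the MSB differs from the pixel to the right.

    Each output row is one element shorter than the input row.
    """
    plane = bit_plane(image, 7)
    return [[row[x] ^ row[x + 1] for x in range(len(row) - 1)] for row in plane]


# One row that moves from dark pixels to bright pixels.
row_image = [[10, 20, 200, 210]]

msb = bit_plane(row_image, 7)
edges = msb_transitions(row_image)
```

Because only shifts, masks, and XORs are involved, this kind of operation maps naturally onto pipelined hardware.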

Perhaps the most significant development in machine vision has been the pairing of the graphics processing unit (GPU) with AI processors and neural networks. GPUs use internal parallel processing techniques and dedicated image manipulation hardware to simplify the design engineer’s job significantly. And because machine learning thrives on large data sets, the marriage of GPUs with AI is taking these capabilities to the next level.

Design engineers for modern machine-driven applications must also consider communications systems fast enough to transport image data from point to point. For example, modern cars use 100 Mbps Ethernet networks to transport medium-speed, medium-resolution images to displays and the cars’ supercomputers. Additionally, these computers need a large amount of non-volatile flash storage for their event data recorders (commonly known as black box recorders), which continuously record data for accident reconstruction and criminal investigations.
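A quick budget calculation shows why link speed matters. The Python sketch below estimates how many uncompressed frames per second fit on a link and what raw bit rate a given video stream would need. The 640 × 480 px, 8-bit stream is a hypothetical example, protocol overhead is folded into an assumed `efficiency` factor, and real automotive systems may compress video before transport.

```python
def max_fps_on_link(link_bits_per_s, width, height, bits_per_pixel, efficiency=1.0):
    """Upper bound on uncompressed frames per second that a link can carry."""
    frame_bits = width * height * bits_per_pixel
    return link_bits_per_s * efficiency / frame_bits


def required_bits_per_s(width, height, bits_per_pixel, fps):
    """Raw bit rate needed to stream uncompressed video."""
    return width * height * bits_per_pixel * fps


# Hypothetical medium-resolution stream on 100 Mbps Ethernet.
fps_limit = max_fps_on_link(100_000_000, 640, 480, 8)

# Raw rate for a 1080 x 800 px, 24-bit, 4,000 FPS camera.
high_end_rate = required_bits_per_s(1080, 800, 24, 4000)

print(f"100 Mbps carries about {fps_limit:.1f} FPS of 640 x 480, 8-bit video")
print(f"High-end camera needs {high_end_rate / 1e9:.3f} Gbps uncompressed")
```

Under these assumptions, the 100 Mbps link carries roughly 40 uncompressed frames per second, while the raw stream from a 1080 × 800 px, 24-bit, 4,000 FPS camera would need about 83 Gbps.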

Modern machine vision also often requires night vision. For example, in automotive applications, a forward-looking camera with night vision can alert the driver to potential hazards before the driver can see them.

Modules make it easier

The imposing learning curve of AI vision design can slow a good idea’s time to market. Fortunately, modules such as the Advantech ICAM-520 industrial AI camera (Figure 2) flatten that curve. Based on an industrial-grade 1.6MP Sony image sensor, the ICAM-520 features a programmable variable-focus lens system and multiple Arm processors for cloud-to-edge vision AI applications.

Figure 2: Advantech ICAM-520 industrial AI camera. Image credit: Mouser Electronics.

The ICAM-520 incorporates an NVIDIA Jetson Xavier NX system-on-module (SoM) with a 64-bit NVIDIA Carmel Armv8.2 CPU. The 70 × 45 mm Jetson Xavier NX, touted as the world’s smallest supercomputer, is a full-featured multi-modal AI engine designed specifically for autonomous machine designs. Machine learning capabilities are built right into the ICAM-520, which comes with an HTML5 web-based utility for integration into cloud services and supports V4L2 and RTSP interfaces.

The 60FPS ICAM-520 includes a USB Type-C port for high-speed data transfers and a 10/100/1000 auto-negotiating Ethernet port. An integrated HDMI 2.0 port allows easy and direct connection to a local monitor or display. An RS-485 port is also available for peripheral control, command, or status. Integrated digital I/O allows customising a user interface.

The module provides 6MB of level 2 and 4MB of level 3 internal cache memory, 8GB of system memory, plus 16GB of eMMC storage.

Conclusions

The concept of machine vision has evolved far beyond a simple camera and display. Today, machine vision describes a comprehensive image-processing system with diverse capabilities tailored to specific requirements. High-performance processing and memory are essential components for any advanced application.

Luckily, not all machine vision designs are overly complex. With the aid of image modules and processing engines, machine vision technology is becoming more accessible to a broader group of designers. Advanced and faster memory technologies are meeting the challenge of manipulating, storing, and transmitting captured data.

Design practices like wider buses, pipelined data streams, parallel processing, and AI are providing invaluable assistance to engineers. By using camera modules with embedded AI, designers can concentrate on the application, not the video source. High-quality modular camera assemblies like the Advantech ICAM-520 ease the task of custom design and expedite the market launch of products with advanced capabilities.

Top image credit: iStock.com/warat42
