In computer vision (CV), the overall image processing pipeline consists of many steps that can be roughly split into image pre-processing and recognition. Convolutional neural networks (CNNs) have become a popular choice for recognition, but the pre-processing pipeline consists of many CV algorithms. These algorithms require the manual tuning of many parameters to achieve best-quality recognition. One startup, Algolux, has developed a novel way to solve the problem of image tuning by applying machine learning. Tractica sat down with chief executive officer Allan Benchetrit to understand more about its technology and market traction. Below is the transcript from that interview.
Tractica: What industry problem are you trying to solve?
Benchetrit: [The] automotive industry is going through a transformation driven by safety and autonomy features. Cameras are being used as the primary automotive sensor for these systems. Vehicles today have anywhere from one camera in lower-end vehicles to five for the more sophisticated cars with surround view and advanced CV capabilities, like the Ford F-150 or BMW 7 Series. We’ve heard from vehicle [original equipment manufacturers] (OEMs) and Tier Ones that this number is expected to grow to 8 to 12 cameras per vehicle as levels of autonomy increase.
Tractica: Why is image quality important for these cameras?
Benchetrit: Image quality, or IQ, is critical both for driver viewing in driver assistance systems [and] for CV tasks, such as collision avoidance, driver monitoring, or lane keeping. For the CV tasks, having the image quality expertly tuned for the most accurate detection results is a fundamental requirement. The better the image quality, the better the results.
Tractica: What are the stages in an image signal processor (ISP)?
Benchetrit: ISPs can have approximately 6 to 14 processing blocks, depending on the application. Typical ISPs include processing stages such as demosaicing (debayering), defective pixel correction, color and tone correction, deblurring, and denoising, to name a few. More advanced ISPs have 3A (auto white balance, auto exposure, and auto focus) and high-dynamic range (HDR) processing. Each stage can have tens or even hundreds of parameters.
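The stage chain described above can be pictured as a sequence of parameterized transforms applied in order. Below is a minimal, purely illustrative sketch, not any vendor's actual ISP: the "image" is just a list of pixel values, the stage names echo the interview, and the `strength`/`gamma` parameters are made-up stand-ins for the tens or hundreds of real parameters each stage carries.

```python
# Toy ISP sketch: an ordered chain of parameterized stages.
# These are stand-in transforms for illustration, not real ISP algorithms.

def denoise(pixels, strength=0.5):
    # Toy denoise: blend each pixel toward the frame average.
    avg = sum(pixels) / len(pixels)
    return [p + strength * (avg - p) for p in pixels]

def tone_correct(pixels, gamma=2.2):
    # Toy tone correction: apply a gamma curve to normalized values.
    return [p ** (1.0 / gamma) for p in pixels]

# Each entry pairs a stage with its tunable parameters; a real ISP chains
# many more stages, each with far more parameters.
PIPELINE = [
    (denoise, {"strength": 0.3}),
    (tone_correct, {"gamma": 2.0}),
]

def run_isp(pixels, pipeline):
    for stage, params in pipeline:
        pixels = stage(pixels, **params)
    return pixels

out = run_isp([0.1, 0.5, 0.9], PIPELINE)
```

The point of the structure is that every `(stage, params)` pair contributes to one combined parameter space, which is what makes exhaustive manual tuning impractical.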
Tractica: What parameters need tuning?
Benchetrit: As tuning today is done manually, it would take a team years to work through all the parameter combinations to find the settings a particular expert judges optimal for the application, which is very subjective. The team usually picks a subset of the parameters to tune based on experience and [does] the best it can within the project schedule and budget.
Tractica: How do you apply machine learning for the image quality tuning process?
Benchetrit: Algolux approaches the image quality challenge uniquely by treating it as an optimization problem, rather than a visual tuning problem. Our CRISP-ML platform integrates with the ISP and CV hyperparameters and applies our machine learning solvers to intelligently find the right parameters against defined metrics, or [key performance indicators] (KPIs), for visual image quality or CV accuracy. This allows us to achieve results in hours or days, rather than the months it takes teams using the traditional approach. As this type of extreme multi-dimensional optimization problem is non-linear and non-convex, we had to develop novel technology to ensure we converge to the best-case solution for the ISP or CV parameters based on the target KPIs. We’ve also been developing a solution for CV to significantly improve accuracy by leveraging neural networks in a very innovative way, which we plan to share in a future announcement.
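To make the "tuning as optimization" framing concrete, here is a hedged sketch of the general idea, not CRISP-ML itself: a black-box solver searches the ISP parameter space to minimize a KPI loss, with no human visual inspection in the loop. The parameter names, bounds, and quadratic toy KPI are all invented for illustration; a real KPI would score image quality or detector accuracy on test scenes, and a production solver would be far more sample-efficient than the random search shown here.

```python
# Illustrative black-box tuning loop (hypothetical parameters and KPI).
import random

PARAM_BOUNDS = {               # made-up ISP parameters and ranges
    "denoise_strength": (0.0, 1.0),
    "sharpen_amount": (0.0, 4.0),
    "gamma": (1.0, 3.0),
}

# Hidden "ideal" tuning, standing in for whatever setting maximizes the KPI.
TARGET = {"denoise_strength": 0.3, "sharpen_amount": 1.5, "gamma": 2.2}

def kpi_loss(params):
    # Toy KPI: squared distance from the hidden ideal (lower is better).
    # A real KPI would render images through the ISP and score them.
    return sum((params[k] - TARGET[k]) ** 2 for k in params)

def random_search(loss, bounds, iters=5000, seed=0):
    # Simplest possible black-box solver: sample, evaluate, keep the best.
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(iters):
        candidate = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:
            best_params, best_loss = candidate, candidate_loss
    return best_params, best_loss

best, score = random_search(kpi_loss, PARAM_BOUNDS)
```

Because the real objective is non-linear and non-convex, as noted above, naive search like this scales poorly as parameter counts grow into the hundreds, which is the gap smarter solvers are built to close.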
A CV algorithm looks for different things than the human eye, so manual tuning and visual inspection can never achieve the best possible result. The ISP settings that deliver the most pleasing visual result (denoised, etc.) actually remove valuable pixel data from the image or video stream, giving the classifier or object detector less to work with (see Figure 2 below).
Note: The left-hand image shows classification results when image quality is tuned for human vision (visually pleasing but incorrect results); the right-hand image shows results when tuned for CV (less pleasing but correct results).
Tractica: How was Algolux founded?
Benchetrit: Algolux is headquartered in Montreal and was founded in April 2014 after graduating from TandemLaunch, a tech-transfer incubator also based in Montreal. We received seed funding to pioneer a new algorithmic optimization approach to image processing. During the development and early customer pilot process, we recognized that customers had a growing need to improve their image quality tuning process. We tailored our technology and found it could be very well suited to optimizing the customers’ existing ISP and CV implementations, and CRISP-ML was born. More recently, we were selected out of over 1,000 companies to participate in the renowned Plug and Play accelerator’s latest Mobility startup batch. It’s great to get that recognition and validation of our leadership from their corporate partners, many of whom we’re already engaged with.
Tractica: What are some of the success stories?
Benchetrit: While we can’t share details regarding our current pilots, we are demonstrating a great example where we optimized a camera platform based on the Intel Movidius Myriad2 processor and OmniVision OV2740 sensor against Microsoft’s public Skype Standard and Premium KPIs for image quality. We were able to achieve visually impressive image quality results after only about 6 hours with CRISP-ML, orders of magnitude faster than comparable manual methods for this particular case.