A Non-Deep Learning Approach to Computer Vision

A World of Computer Vision Outside of Deep Learning

Mwanikii
Heartbeat

--

Photo by Museums Victoria on Unsplash

IBM defines computer vision as “a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs [1].”

Discussions of computer vision (CV) are often limited to deep learning methods. It may surprise many people that deep learning is only one of many approaches to CV, despite how ubiquitous it is in modern applications.

I stumbled upon a simple question: what was there before neural networks could be run efficiently on the average computer? Surely research in this field yielded results that could be used without requiring neural networks. If your curiosity brought you here, stick around to see what CV looks like without deep learning.

Scale-Invariant Feature Transform (SIFT)

This is an algorithm created by David Lowe in 1999. It is a potent force for many applications and has widespread use in fields such as robotics and object detection.

It is a general-purpose algorithm known as a feature descriptor. Its features are computed the same way for any image, so it does not share a deep neural network’s limitation of learning features specifically from a training set.

It has a relatively straightforward method of operation. After picking the set of images you desire to use, the algorithm detects the keypoints of those images and stores them in a database. Object detection then occurs by comparing a new image’s features against those stored in the database.

This is an oversimplification, as many steps occur between keypoint storage and feature comparison to detect things in images.
Another valuable feature descriptor algorithm is the Speeded Up Robust Features algorithm.

Speeded Up Robust Features (SURF)

This algorithm performs many of the same functions as the Scale-Invariant Feature Transform Algorithm. Its applications include 3D reconstruction and object recognition.

Discussing this algorithm without mentioning the previous one is difficult, as SIFT inspired SURF. Many of the significant steps taken to perform tasks are similar to those in the SIFT algorithm, albeit with some changes.

One example of this can be seen in the first step of the SURF algorithm. Instead of the cascaded Gaussian filters SIFT uses to find keypoints, SURF uses square-shaped (box) filters, which can be evaluated very cheaply with an integral image. Incremental changes like this across all the steps add up, making SURF significantly faster than SIFT.
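Much of SURF’s speed comes from evaluating those box filters over an integral image, where the sum of any rectangle costs four array lookups regardless of the rectangle’s size. Here is a minimal NumPy sketch of that one idea (not the full SURF detector):

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.integers(0, 256, (100, 100)).astype(np.int64)

# Integral image: ii[y, x] holds the sum of img[:y, :x]. A zero row and
# column of padding means rectangle sums need no edge-case handling.
ii = np.zeros((101, 101), dtype=np.int64)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def box_sum(y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in four lookups, whatever the box size."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

# Any box-filter response is now O(1), e.g. a 15x15 box centered at (50, 50):
resp = box_sum(43, 43, 58, 58)
```

Because a 9×9 box costs exactly as much as a 99×99 one, SURF can probe many filter scales without rescaling the image, which is where much of its speed advantage over SIFT’s Gaussian pyramid comes from.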

Features from Accelerated Segment Test (FAST)

As its name suggests, this is a corner detector. Corners are important features to detect because they make it possible to pinpoint and track the position of solid objects or regions.

It includes only two essential steps, unlike the previous methods we highlighted. First, the algorithm examines a circle of pixels around a candidate point; if enough contiguous pixels on that circle are all brighter or all darker than the candidate by some threshold, the point is classified as a corner. A “high-speed test” that checks only a few of those circle pixels is used to quickly reject points that cannot be corners.

Even with this simplification, the importance of the algorithm is clear: it lets us tell a point that is a corner apart from one that is not in a given problem.

It is also valuable to note that it is faster than the keypoint detectors used in the SIFT and SURF algorithms, such as SIFT’s difference of Gaussians (DoG).

Advantages

One of the primary advantages of falling back on traditional computer vision methods is that you can save time and resources on a given task. In some scenarios, it is possible to completely avoid using deep learning methods such as deep neural networks and settle for a simpler algorithm.

Another advantage is that these algorithms are not limited to working independently. It is possible to improve the performance of these algorithms with machine learning algorithms such as Support Vector Machines. This is a good way of improving their performance and still not expending computing resources using deep learning.

One critical factor when using a model is whether it can easily be interpreted and tweaked accordingly. Traditional CV methods rely on parameters that can easily be adjusted whenever there is an issue. On the other hand, deep learning methods become more complicated when you consider the possible existence of millions of parameters for a given problem.

An additional advantage these traditional computer vision methods offer is easy deployability on microcontrollers and similar devices on the edge. Powerful machines have become dramatically cheaper, but that is no reason to spend more compute than a problem needs; it means we now have a lot of variety in the tools we use and should be considerate about using them appropriately.

Finally, and probably most importantly, these traditional computer vision algorithms could be used to improve the performance of deep learning algorithms. Features extracted from them could be fed into neural networks for enhanced performance.

There is a strong case for using traditional computer vision algorithms over deep learning algorithms. In industry, academia, and even personal projects, there is often an opportunity to use a simpler solution that consumes fewer resources and is more efficient.

Deep learning is powerful and has made many strides in the right direction through painstaking development, but it is not a solution that will fit everywhere. The most appropriate way forward for a given problem is the most relevant one, not the most ubiquitous.

Sources

[1] IBM. “What is Computer Vision?” IBM — United States. Accessed April 23, 2023. https://www.ibm.com/topics/computer-vision.

[2] O’Mahony, Niall, Sean Campbell, Anderson Carvalho, Suman Harapanahalli, Gustavo Velasco Hernandez, Lenka Krpalkova, Daniel Riordan, and Joseph Walsh. “Deep Learning vs. Traditional Computer Vision.” In Advances in Computer Vision: Proceedings of the 2019 Computer Vision Conference (CVC), Volume 1, pp. 128–144. Springer International Publishing, 2020.

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletter (Deep Learning Weekly), check out the Comet blog, join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.

