Comprehensive Guide: Top Computer Vision Resources All in One Blog

Save this blog for comprehensive resources for computer vision

7 min read · Jan 27, 2023

Working in computer vision and deep learning is fantastic because, every few months, someone comes up with something crazy that completely changes your perspective on what is feasible.

After spending 2+ years in this field, I have found many interesting and helpful resources that will help you in your computer vision work. They will also show you how huge this domain is. I have arranged the topics in the order you would meet them while working on a computer vision project. So, without further ado, let's start with,

Dataset generation

To train and evaluate computer vision models, we need data. A dataset is a collection of samples (in this case, photos or videos), usually drawn from a specific topic or domain. Open datasets are those that anybody may access, download, and use for any purpose. The dataset should include images that contain the target classes you want to label. You can use the resources below to create your data.

★ Kaggle image datasets: Link
Kaggle lets users find and share datasets, explore and build models in a web-based data science environment, and collaborate with other data scientists and computer vision experts.
★ Datagen: Link

There are other sites where you can download basic images and then augment or process them. The sites below are free to use and host millions of images.

★ Unsplash
★ Pexels
★ Pixabay

I found these websites especially helpful because the images are easy to download and process. I also describe two techniques for dataset creation in my blog, which you can refer to for further resources.

Annotation:

I purposefully placed annotation before augmentation because many annotation tools now include augmentation features.

Annotation is the process of marking a mask or bounding box around each target in an image so that the model can learn the features of each category.

★ Roboflow: Built-in features for augmentation and annotation
★ makesense.ai: Very helpful for annotation, with collaboration features as well
★ VGG Image Annotator (VIA): Useful for faster masking
★ LabelImg
★ V7
★ Labelbox
★ Scale AI
★ SuperAnnotate
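
As a rough illustration of what a bounding-box annotation looks like once it is exported, the sketch below converts a box given as pixel corners into the normalized `class x_center y_center width height` line used by YOLO-style label files. The function name `to_yolo_line`, the image size, and the class id are assumptions made up for this example; the annotation tools above normally handle this export for you.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-corner box to a YOLO-style label line (hypothetical helper)."""
    # Box center and size, normalized by the image dimensions to the 0-1 range.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a 200x100 px box with top-left corner (100, 50) in a 640x480 image, class 0.
print(to_yolo_line(0, 100, 50, 300, 150, 640, 480))
# prints: 0 0.312500 0.208333 0.312500 0.208333
```

Because the values are normalized, the same label line stays valid if the image is later resized.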

Augmentation of images:

Image data augmentation is the process of generating new transformed versions of images from the given image dataset to increase its diversity.

★ Roboflow: [Again!] One of the best augmentation tools for large datasets
★ image_augmentor: Helps you apply a variety of augmentations quickly
★ My blog: I wrote code that uses a library to augment images; you can check it if you need customization.
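
To make the idea of "new transformed versions" concrete, here is a minimal sketch of two common augmentations, a horizontal flip and a brightness shift, written in plain Python on a tiny grayscale image stored as nested lists. The function names and the random policy in `augment` are assumptions for this example; in a real project you would more likely use a library such as Albumentations or OpenCV.

```python
import random


def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Add delta to every pixel and clamp the result to the 0-255 range."""
    return [[max(0, min(255, px + delta)) for px in row] for row in image]


def augment(image, rng):
    """Return a randomly transformed copy of the image (illustrative policy)."""
    out = image
    if rng.random() < 0.5:  # flip about half of the time
        out = horizontal_flip(out)
    return adjust_brightness(out, rng.randint(-30, 30))


# A tiny 2x3 grayscale "image" with pixel values in 0-255.
image = [[10, 20, 30],
         [40, 50, 60]]

rng = random.Random(0)  # fixed seed so the run is repeatable
new_versions = [augment(image, rng) for _ in range(3)]

print(horizontal_flip(image))         # [[30, 20, 10], [60, 50, 40]]
print(adjust_brightness(image, 200))  # [[210, 220, 230], [240, 250, 255]]
```

Each call to `augment` leaves the original image untouched and returns a new variant, so one source image can yield many training samples.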

Training of model:

Training is the process of teaching a model to learn patterns from data and make predictions from them so that it can accurately carry out a task.

There are many models, but I am providing links to papers and repos for a few famous ones. The papers will help you understand each model's structure and layer-wise arrangement.

* Object detection

Object detection is a computer vision technique for locating instances of objects in images or videos.

An incredible explanation of how YOLO works, by Dhaval Patel

★ YOLO v1: Paper and repo
★ YOLO v2: Paper and repo
★ YOLO v3: Paper and repo
★ YOLO v4: Paper and repo
★ YOLO v5: Paper and repo
★ YOLO v6: Paper and code
★ YOLO v7: Paper and repo
★ YOLO v8: [Paper had not been released at the time of writing] and repo
★ SSD: Paper and repo
★ Faster-RCNN: Paper and repo
★ Fast-RCNN: Paper and repo
★ Spatial Pyramid Pooling (SPP-net): Paper and repo
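
Since object detection is about locating objects, a common way to check how well a predicted box matches a labeled one is intersection over union (IoU). The papers above do not all use it in the same way, so treat this as a generic sketch; the `(x_min, y_min, x_max, y_max)` box format and the example boxes are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
prediction = (5, 0, 15, 10)  # shifted right by half of its width
print(round(iou(ground_truth, prediction), 3))  # 0.333
```

An IoU of 1.0 means the boxes coincide exactly, and 0.0 means they do not overlap at all.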

* Segmentation

Segmentation is the process of dividing an image into regions based on the characteristics of its pixels, in order to identify objects or boundaries, simplify the image, and analyze it more efficiently.

★ YOLOv8 Instance Segmentation
★ YOLOv7 Instance Segmentation
★ OneFormer
★ Mask RCNN
★ YOLOv5 Instance Segmentation
★ SegFormer
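
The models above learn their masks from data, but the basic idea of splitting an image into regions by pixel characteristics can be shown with the simplest possible method, a fixed brightness threshold. This is only a toy sketch and not how those models work; the function name `threshold_segment`, the sample image, and the threshold of 128 are assumptions for this example.

```python
def threshold_segment(gray, threshold):
    """Label each pixel 1 (foreground) if it is brighter than threshold, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]


# A tiny grayscale image: a bright blob on a dark background.
gray = [
    [12, 15, 200, 210],
    [10, 220, 230, 14],
    [11, 13, 16, 12],
]

mask = threshold_segment(gray, 128)
for row in mask:
    print(row)
# [0, 0, 1, 1]
# [0, 1, 1, 0]
# [0, 0, 0, 0]
```

The resulting mask has the same shape as the image, which is also the form that segmentation annotations and model outputs usually take.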

Libraries to know

A few libraries are helpful across all stages of a computer vision project.

OpenCV, SimpleCV, TensorFlow, Keras, MATLAB, PCL, DeepFace, NVIDIA CUDA-X, NVIDIA Performance Primitives, BoofCV, OpenVINO, PyTorch, Albumentations, Caffe, Detectron2, CUDA, YOLO

Interesting and simple projects that I found to improve your computer vision thinking:

image-net — computer vision challenge
1. How to read an image in Python using OpenCV — 2023
2. Sketchy — Sketch making Flask App — Interesting Project — 2023
3. How to detect shapes using cv2- with source code — easy project — 2023
4. Rotating and Scaling Images using cv2 — a fun Python application — 2023
5. How to use mouse clicks to draw circles in Python using OpenCV — easy project — 2023
6. How to perform Morphological Operations like Erosion, Dilation, and Gradient in Python using OpenCV — easiest explanation –2023
7. Object Detection using SSD — with source code — easiest way — fun project –2023
8. Face Recognition Based Attendance System with source code — Flask App — With GUI — 2023
9. Face Recognition — GitHub Link 1, GitHub Link 2, Video Tutorial
10. Easiest way to Train yolov7 on the custom dataset — 2023
11. Template Matching — Video Tutorial, Written Tutorial
12. Semantic and Instance Segmentation on Videos using PixelLib in Python — Video Tutorial, Code
13. Object Detection using Deep Learning — Video Tutorial, Written Tutorial
14. Drowsiness Detection using cv2 in Python — interesting project — 2023
15. Realtime Number Plate Detection using Yolov7 — Easiest Explanation — 2023

Simultaneous localization and mapping [SLAM] systems:

Simultaneous localization and mapping (SLAM) is the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent’s location within it.
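
To show the two halves of that definition side by side, here is a toy 2D sketch: `move` updates the agent's pose (localization) and `observe` places a landmark on the map using that pose (mapping). It assumes perfect odometry and measurements, which real SLAM systems never have; dealing with noise and drift is the hard part that the systems below address. All names and numbers are made up for this example.

```python
import math


def move(pose, distance, turn):
    """Localization step: turn by `turn` radians, then drive `distance` forward."""
    x, y, heading = pose
    heading += turn
    return (x + distance * math.cos(heading),
            y + distance * math.sin(heading),
            heading)


def observe(pose, rel_range, rel_bearing):
    """Mapping step: convert a range/bearing sighting into world coordinates."""
    x, y, heading = pose
    angle = heading + rel_bearing
    return (x + rel_range * math.cos(angle),
            y + rel_range * math.sin(angle))


pose = (0.0, 0.0, 0.0)  # x, y, heading in radians
landmarks = {}          # the "map": landmark name -> world position

pose = move(pose, 2.0, 0.0)                          # drive 2 m along +x
landmarks["door"] = observe(pose, 1.0, math.pi / 2)  # door seen 1 m to the left
pose = move(pose, 1.0, math.pi / 2)                  # turn left and drive 1 m

print(pose)
print(landmarks)
```

After the last move the agent's estimated position coincides with the mapped door, because both were computed from the same error-free pose estimates.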

A video by Daniel DeTone explaining SLAM systems

★ DROID-SLAM: DROID-SLAM consists of recurrent iterative updates of camera pose and pixelwise depth through a Dense Bundle Adjustment layer.
★ DynaSLAM: DynaSLAM is a visual SLAM system that is robust in dynamic scenarios for monocular, stereo and RGB-D configurations. Having a static map of the scene allows inpainting the frame background that has been occluded by dynamic objects.

* RGB (Monocular):

★ ORB-SLAM: ORB-SLAM is a versatile and accurate SLAM solution for Monocular, Stereo and RGB-D cameras.
★ Kimera: An Open-Source Library for Real-Time Metric-Semantic Localization and Mapping.
★ PTAM: PTAM (Parallel Tracking and Mapping) is a camera tracking system for augmented reality.
★ LSD-SLAM: LSD-SLAM is a novel, direct monocular SLAM technique. Instead of using keypoints, it directly operates on image intensities both for tracking and mapping.
★ SVO-SLAM: SVO uses a semi-direct paradigm to estimate the 6-DOF motion of a camera system from both pixel intensities and features.

Neural Radiance Field (NeRF)

A neural radiance field (NeRF) is a fully-connected neural network that can generate novel views of complex 3D scenes, based on a partial set of 2D images.

★ Bmild
★ nerf-pytorch
★ nerf_pl
★ MobileNeRF
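
One ingredient described in the original NeRF paper is a positional encoding: each input coordinate is mapped to sines and cosines at doubling frequencies before it enters the fully-connected network. The sketch below shows that mapping for a single coordinate; the names `positional_encoding` and `num_freqs` are chosen for this example, and the repos above may implement the encoding with small differences.

```python
import math


def positional_encoding(p, num_freqs):
    """Encode scalar p as [sin(2^k * pi * p), cos(2^k * pi * p)] for k = 0..num_freqs-1."""
    encoded = []
    for k in range(num_freqs):
        freq = (2 ** k) * math.pi
        encoded.append(math.sin(freq * p))
        encoded.append(math.cos(freq * p))
    return encoded


features = positional_encoding(0.25, 10)
print(len(features))  # 20 values for one coordinate
```

The network then sees these values instead of the raw coordinate, which gives it an easier way to represent fine detail.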

Interesting blogs related to computer vision:

★ Introduction to object detection by Analytics Vidhya: Part 1, Part 2, and Part 3
★ Instance segmentation: To understand all about instance segmentation
★ Semantic segmentation: To understand all about semantic segmentation
★ Ultimate Guide to Object Detection Using Deep Learning: A step-by-step approach to understanding deep learning
★ Image processing: To understand the basics of image processing
★ All CNN architectures: To understand the basic CNN architectures

Informative videos related to computer vision:

☆ MIT 6.S094: Computer Vision by Lex Fridman
☆ CNN Architectures by Michigan Online
☆ TensorFlow Object Detection by Nicholas Renotte
☆ Detection and Segmentation by Stanford
☆ CNN by Andrej Karpathy (2016)
☆ CNN by Stanford University School of Engineering (2017)
☆ Introduction to Deep Learning and Self-Driving Cars by Lex Fridman [MIT 6.S094]
☆ Deep Learning State of the Art by Lex Fridman
☆ Stanford Machine Learning Course — Andrew Ng

Research Papers

These are a few research paper sources where you can easily find papers for any model or method you need.

★ arXiv.org
★ ICLR
★ Awesome — Most Cited Deep Learning Papers

Other Resources

★ GitHub — A famous host of open-source software projects.
★ Quora — Seek help and ask any questions here if you have any difficulties!
★ nducthang/deep_learning_object_detection
★ nducthang/Active-learning-for-object-detection

I will be updating this blog frequently because there are many things that are not covered in this blog. You can follow me for new updates. Also, you can suggest topics to add to make it more useful for newbies as well as for all computer vision engineers. Let’s embrace AI!

If you have found this article insightful

Give the article some claps if you liked it.

If you found this article insightful, follow me on LinkedIn and Medium. You can also subscribe to get notified when I publish articles. Let’s create a community! Thanks for your support!

If you want to support me:

Your follows and claps are the most important thing, but you can also support me by buying me a coffee. COFFEE.

You can read my other blogs:

Working on a Computer Vision project? These code chunks will help you !!!

An introduction to a few “used to” methods in a computer vision project

YOLO v8! The real state-of-the-art?

My experience & experiment related to YOLO v8

10 AI Websites That Will Excite You to The Core!

interesting artificial intelligence-based websites and their working

Simultaneous Localization And Mapping [SLAM] systems

Introduction to outperforming DROID-SLAM system

Signing off,

Chinmay

Written by Chinmay Bhalerao