Going through the Stanford lectures on computer vision
1st video ( Introduction ) https://www.youtube.com/watch?v=yp9rwI_LZX8&t=3897s
Summary :-
- About 90% of the data on the internet is visual (pictures, videos, etc.), and it is impossible to make use of all of it manually.
- Big Bang of evolution: about 543 million years ago, researchers believe, something unusual happened and animals started to diversify rapidly across the earth. There are many theories, but one zoologist from Australia had a convincing one: it was the onset of eyes. The logic is simple: once eyes developed, animals started moving around in search of food.
- In the 1900s scientists discovered that the primary visual cortex (the part of the brain that processes visual data) first makes sense of visual input by looking at edges, simple bars, etc. Its neurons are organized in columns (loosely like a hidden layer in an ANN), and each column responds to a specific orientation of the stimulus (visual data). Visual processing does not begin with the whole picture; it begins with simple structures.
Two insights :-
- Vision starts with simple structures.
- Vision is hierarchical.
Learning the key features of an object matters because it helps in recognizing the object from a totally different angle.
The focus is on neural networks (CNNs) for classification, localization, object detection and instance detection.
2nd Video ( classification in images ) https://www.youtube.com/watch?v=t2IwlUtbCFE
Summary :-
- The challenge in classification is to identify an object across different scenarios: it could be brightly illuminated, in dark surroundings, partially hidden, or deformed in shape. We as humans can still identify the object, so the challenge in computer vision is to identify it in all of these situations too.
- A data-driven approach is used, as a huge number of labelled images is available for classification.
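- The lecture showed a Nearest Neighbour (NN) implementation for classification. Drawback: it gets slower as the training set grows, because every prediction compares the test image against all stored training images.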
- Options are available to speed up Nearest Neighbour, e.g. approximate nearest neighbour search with the FLANN implementation.
- Nearest Neighbour has fast training time but slow test time.
- A CNN has slow training time but fast test time. This is what we need.
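
To make the train-time vs test-time trade-off of Nearest Neighbour concrete, here is a minimal sketch in plain Python. It is not the code from the lecture: the class name `NearestNeighbor`, the choice of L1 (sum of absolute differences) distance, and the toy 4-pixel "images" are all assumptions made for illustration.

```python
from typing import List, Sequence


class NearestNeighbor:
    """1-nearest-neighbour classifier (hypothetical sketch, L1 distance assumed)."""

    def __init__(self) -> None:
        self.images: List[List[float]] = []
        self.labels: List[str] = []

    def train(self, images: Sequence[Sequence[float]], labels: Sequence[str]) -> None:
        # "Training" only memorises the data, so it is very fast.
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = [list(img) for img in images]
        self.labels = list(labels)

    def predict(self, image: Sequence[float]) -> str:
        # Prediction compares the input against every stored image,
        # so test time grows linearly with the size of the training set.
        if not self.images:
            raise ValueError("call train() before predict()")
        best_label = self.labels[0]
        best_dist = float("inf")
        for stored, label in zip(self.images, self.labels):
            # L1 distance: sum of absolute pixel differences.
            dist = sum(abs(a - b) for a, b in zip(stored, image))
            if dist < best_dist:
                best_dist = dist
                best_label = label
        return best_label


# Toy data: each "image" is a flattened list of 4 pixel intensities (made up).
train_images = [[0, 0, 0, 0], [255, 255, 255, 255], [0, 0, 255, 255]]
train_labels = ["dark", "bright", "half"]

clf = NearestNeighbor()
clf.train(train_images, train_labels)
prediction = clf.predict([10, 5, 240, 250])
print(prediction)
```

Note that `train` does almost no work while `predict` loops over the whole training set. That is the opposite of what we want at deployment time, and it is why a CNN (slow to train, fast to test) is the better fit.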