Going through lectures from Stanford on computer vision

1st video ( Introduction ) https://www.youtube.com/watch?v=yp9rwI_LZX8&t=3897s

Summary :-

  • About 90% of the data on the internet is visual data (pictures, videos, etc.); it is impossible to make use of this data manually.
  • Big Bang of evolution - about 543 million years ago, researchers believe, something unusual happened and animals started to diversify rapidly across the earth. There are many theories, but one zoologist from Australia had a convincing one: it was the onset of eyes. The logic is simple: once eyes had developed, animals started moving around in search of food.
  • In the 1900s scientists discovered that the primary visual cortex (the part of the brain that processes visual data) first starts to process or understand visual data by looking at edges, simple bars, etc. Neurons in the primary visual cortex are organized in columns (loosely like a hidden layer in an ANN), and every column of neurons responds to a specific orientation of a stimulus (visual data). Visual processing does not begin with the whole picture; it begins with simple structures.

Two insights

  • Vision starts with simple structures
  • Vision is hierarchical.
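
As a rough illustration of "vision starts with simple structures", the sketch below slides two small oriented filters over a tiny image. It is only an analogy and is not taken from the lecture: the helper name filter_response, the two kernels and the 4x4 toy image are all assumptions made for this example. Each kernel reacts to one edge orientation, much as each column of neurons prefers one orientation of a stimulus.

```python
def filter_response(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            # Weighted sum of the patch under the kernel.
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


# Hypothetical toy image: dark on the left, bright on the right (a vertical edge).
image = [[0, 0, 1, 1] for _ in range(4)]

# Responds to vertical edges (intensity changing from left to right).
vertical_kernel = [[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]]

# Responds to horizontal edges (intensity changing from top to bottom).
horizontal_kernel = [[-1, -1, -1],
                     [0, 0, 0],
                     [1, 1, 1]]

vertical_response = filter_response(image, vertical_kernel)
horizontal_response = filter_response(image, horizontal_kernel)
```

Only the vertical filter gives a non-zero response here, because the image contains a vertical edge and no horizontal one. Stacking filters like these, so that later ones combine the outputs of earlier ones, is one way to picture the "hierarchical" insight.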

  • Learning the important features of an object matters, because it helps in recognizing the object from a completely different angle.

  • The focus is on neural networks, specifically CNNs, for classification, localization, object detection and instance detection.


2nd Video ( classification in images ) https://www.youtube.com/watch?v=t2IwlUtbCFE

Summary :-

  • The challenge in classification is to identify an object in different scenarios: it could be brightly illuminated, in dark surroundings, partially hidden, or deformed in shape. We as humans can still identify the object, so the challenge in computer vision is to be able to identify it in all of these situations.
  • A data-driven approach is used, as a huge number of images is available for classification.
  • Showed a Nearest Neighbour implementation for classification. Drawback: it gets slow as the data scales.

  • Options are available to speed up Nearest Neighbour, such as approximate nearest neighbour implementations like FLANN.

  • Nearest Neighbour has a fast training time but a slow test time.

  • A CNN has a slow training time but a fast test time - this is what we need.
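
To make the "fast training, slow test" point concrete, here is a minimal sketch of a nearest neighbour classifier in plain Python. It is not the code shown in the lecture: the class name NearestNeighbour, the L1 (sum of absolute differences) distance and the tiny 4-pixel "images" are assumptions for illustration. Training only stores the data, while every prediction compares the input against all stored images.

```python
class NearestNeighbour:
    def train(self, images, labels):
        # "Training" just memorises the data, so it is very cheap.
        self.images = list(images)
        self.labels = list(labels)

    def predict(self, image):
        # Every prediction scans the whole training set, so test time
        # grows linearly with the number of stored images.
        best_label, best_dist = None, float("inf")
        for stored, label in zip(self.images, self.labels):
            # L1 distance: sum of absolute pixel differences (assumed metric).
            dist = sum(abs(a - b) for a, b in zip(stored, image))
            if dist < best_dist:
                best_dist, best_label = dist, label
        return best_label


# Hypothetical toy data: each "image" is a flat list of 4 pixel values.
train_images = [[0, 0, 0, 0], [255, 255, 255, 255], [0, 255, 0, 255]]
train_labels = ["dark", "bright", "striped"]

clf = NearestNeighbour()
clf.train(train_images, train_labels)
prediction = clf.predict([10, 240, 5, 250])  # closest to the "striped" image
```

A CNN flips this trade-off: the expensive work happens once during training, and a prediction is a fixed amount of computation no matter how many training images were used.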


https://medium.com/ilenze-com/object-detection-using-deep-learning-for-advanced-users-part-1-183bbbb08b19
