Computer vision

Computer vision studies how computers can be programmed to understand digital videos or images. Computer vision is vital in areas such as object recognition and autonomous driving. In both NLP and computer vision, deep learning approaches have proved to be extremely powerful. 

Visual perception is one of the most complex processes we perform. What comes so easily and naturally to us, without our even thinking about it, requires the constant processing of enormous amounts of data. We use our eyes in almost everything we do, and vision is arguably our most important sense: an extremely sophisticated one that took a long time to develop.

Teaching a computer to see as humans do has long been on the minds of researchers. Although the field had seen advances and successes before, things changed when neural networks entered the scene. Thanks to their ability to model complex relationships, they led to a leap in performance in object recognition and in computer vision more broadly. Suddenly, AI systems that automate extremely complex tasks which seemed impossible some 20 years ago, such as autonomous driving, came within the realm of possibility sooner rather than later.

For us, an image is the interplay of millions of visual inputs and objects whose interrelationships make up the whole, which eventually gives us meaning. For a computer, an image is just another array of 0s and 1s put together in a strange way. 
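To make the "array of 0s and 1s" concrete, here is a minimal sketch in plain Python. The 4x4 grayscale image and the variable names are made up for illustration. Real images are far larger and usually carry several color channels, but the principle is the same.

```python
# A tiny, made-up 4x4 grayscale "image": each number is one pixel's
# brightness, from 0 (black) to 255 (white). The left half is dark,
# the right half is bright.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

height = len(image)
width = len(image[0])

# Flattened row by row into raw bytes, which is roughly how an
# uncompressed image sits in memory.
raw = bytes(pixel for row in image for pixel in row)

# The same data written out as individual bits.
bits = "".join(f"{byte:08b}" for byte in raw)

print(f"{height}x{width} image, {len(raw)} bytes, {len(bits)} bits")
print(bits[:32])  # the first row of pixels as 0s and 1s
```

Nothing in these bits says "dark region next to a bright region". That meaning has to be worked out from how neighboring values relate to one another.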

It should not come as a surprise, then, that if we were to train, say, a generalized linear model (GLM) to detect an object in an image, we would not get very far. There are far too many relationships between pixels, and they interact in ways that are too complex for a linear model to capture. But deep neural networks, with their millions of neurons and billions of connections between them, are able to make sense of such a mess. Through their ability to build up increasingly abstract features from the data, they can pick out edges, then traits, and ultimately whole objects in an image. This has enabled significant advancements in computer vision. Today, neural networks have been trained to perform incredible tasks, e.g. to recognize all sorts of objects in a picture or, more controversially, to infer the sexual orientation of a person from facial traits.
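As a rough illustration of what "discerning edges" can mean, the sketch below slides a small 3x3 filter over a made-up image and measures how strongly brightness changes from left to right. The helper `apply_kernel`, the sample image, and the hand-picked Sobel-style filter are assumptions made for this example. In a trained convolutional network, filters like this are learned from data rather than written by hand, and many layers of them are stacked.

```python
def apply_kernel(image, kernel):
    """Slide a small kernel over the image (no padding) and, at each
    position, sum the element-wise products. Deep learning libraries
    call this operation a convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Made-up 5x6 grayscale image: dark on the left, bright on the right,
# so there is a vertical edge down the middle.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]

# Sobel-style filter that responds to left-to-right brightness changes.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

response = apply_kernel(image, vertical_edge_kernel)
for row in response:
    print(row)
```

The response is zero where the image is uniform and large where the filter straddles the boundary between dark and bright pixels. Later layers of a network can combine many such responses into detectors for traits and, eventually, objects.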

Data Navigator Newsletter

