Computers cannot analyse unstructured data such as text, images, audio, or geospatial data out of the box. To perform any sort of analysis, such as building an image-based recommendation system or running text topic analysis, we first need to convert the data into a numerical format that a computer can work with. Using machine learning, we can convert unstructured data into a numerical format and feed it into specialised vector algorithms for a variety of use cases.
This conversion from unstructured data to numerical data is called vectorizing or encoding. Each item, whether a single word, an image, or an audio file, is encoded into a list of numbers: this list is called a vector. The encoding can be produced by different kinds of algorithms; when it is produced by a deep neural network, the result is often called a deep learning vector embedding. In general, the longer the vector, the more information it can represent about the data it encodes.
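To make the idea concrete, here is a toy sketch of encoding. It uses simple letter-frequency counts rather than a trained model (real embedding models learn their representations from data), but the output has the same shape as a learned embedding: every item becomes a fixed-length list of numbers, and similar items end up with similar vectors. The function names here are illustrative, not from any particular library.

```python
from math import sqrt


def encode(text: str) -> list[float]:
    # Toy encoder: a 26-dimensional vector of letter counts (a-z).
    # A real embedding model would produce a learned, dense vector,
    # but the result is the same kind of object: a list of numbers.
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


v1 = encode("vector")
v2 = encode("vectors")
v3 = encode("audio")

print(len(v1))  # every item maps to a vector of the same length: 26
# Similar words produce more similar vectors than unrelated ones:
print(cosine_similarity(v1, v2) > cosine_similarity(v1, v3))  # True
```

Once items are represented this way, vector algorithms such as nearest-neighbour search can compare them numerically, which is exactly what recommendation and topic-analysis systems build on.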