SPT’s Sandesh Singh: "Data Annotation is the future of ML and AI systems"

Published by Rhys Taylor-Brown on June 26th 2021, 7:09am

Sandesh Singh, a consultant at global boutique information solutions and services provider Systems Plus Transformations, discusses the importance of Data Annotation to Machine Learning [ML] and Artificial Intelligence [AI] systems in helping machines to learn.

Established in 2012 as an arm of the Systems Plus parent group, Systems Plus Transformations is a global boutique information solutions and services provider. Based in Harrow, with offices in Mumbai and Pune in India, the organisation’s primary objective is to give its clients a competitive advantage in business intelligence, analytics, artificial intelligence and machine learning, and cloud services.

AI and ML have taken on an increasingly integral role in the technology we use in our day-to-day lives. What is less well known, however, is how AI and ML systems function and how they are able to identify and understand things.

Although these technologies are, as Systems Plus Transformations consultant Sandesh Singh puts it, an “intense science”, grasping the root of AI and ML processes helps us understand them far better.

Sandesh writes: “When you learn a new language, you repeat it everywhere and get familiar with your thoughts. When you want to identify something, you create an image of that in your mind multiple times to make it a part of your subconscious mind. So, when you see the object the next time, you know what it is and what it looks like.

“As you learn things in life through repetition, machines do the same through AI and ML. That is, they learn through repetition, fetching the same data again and trying to make predictions based on past data.”

The data a machine needs to make this possible is prepared through a process called ‘Data Annotation’, as Sandesh explains.

“Data Annotation involves adding metadata to a dataset. This metadata usually takes the form of tags, which can be added to any data, including text, images, and video. Adding comprehensive and consistent tags is a key part of developing a training dataset for machine learning.

“The Google definition of ‘Data Annotation’ is ‘the process which trains the machine to recognise an object’. This is a repetitive process. Machines do not understand anything in the first instance. So, large numbers of samples are fed into the machines for better understanding and/or identification of the objects. This data being fed to the machines can be in different formats such as text, image, video, and audio.”
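
To make the idea of "adding metadata" concrete, here is a minimal Python sketch of what a single annotated item might look like. It is an illustration only, not the tooling used by Systems Plus Transformations: the `AnnotatedItem` class, the `annotate` helper and the file name are all hypothetical, and the only details taken from the article are the four data formats and the emphasis on consistent tags.

```python
from dataclasses import dataclass, field

# The four data formats mentioned in the article.
ALLOWED_FORMATS = {"text", "image", "video", "audio"}


@dataclass
class AnnotatedItem:
    """One piece of raw data plus the tags (metadata) an annotator attached."""
    source: str                      # hypothetical file name or identifier
    data_format: str                 # one of ALLOWED_FORMATS
    tags: list = field(default_factory=list)


def annotate(item, *new_tags):
    """Attach tags to an item, keeping them consistent (lower case, no duplicates)."""
    if item.data_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported data format: {item.data_format!r}")
    for tag in new_tags:
        normalised = tag.strip().lower()
        if normalised and normalised not in item.tags:
            item.tags.append(normalised)
    return item


# Hypothetical example: "Car" and " car " collapse into a single tag.
frame = annotate(AnnotatedItem("frame_0001.jpg", "image"), "Car", " car ", "Pedestrian")
print(frame.tags)
```

Normalising each tag before storing it is one simple way to get the consistency the quote asks for, since two annotators who write "Car" and "car" end up producing the same label.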

Of course, this raises the question: how can data be captured and then fed into a machine to enable it to learn?

As Sandesh describes, data capture and feeding can take numerous forms, across formats including text, images, video and audio. Capturing the data, however, requires an element of human intervention.

Sandesh says: “For a better understanding, let us consider Data Annotation using images. First, for the most part, image annotation is used to identify the objects present in the image. So, several videos are captured using various high-end sensors or cameras. That video is converted into images, frame by frame, as per requirement. Then those images are stored and sent for the annotation process. During the annotation process, the teams working on it are responsible for identifying the objects present in the images properly. This process of annotating the images is also known as ‘Image Labelling’.

“The labelling is therefore done as per the project requirement. The more accurate the labelling, the better the machine learns. Once the images [data] are properly labelled, the data is fed into the special algorithm where the machine fetches the data and starts learning about the objects that you label.

“Now, again this is a repetitive process. Machines take a lot of time to learn. Therefore, the machine is fed with hundreds and thousands of samples from various angles and various labelling standards. So, this is how the process of Data Annotation works.”
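
The steps Sandesh describes (pick frames from a video, label them, then feed the labelled samples to an algorithm repeatedly) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used in any real project: the article does not name the "special algorithm", so a simple nearest-centroid learner stands in for it, and the `frame_indices` helper, the `CentroidLearner` class and the two-number "features" for each image are all hypothetical.

```python
from collections import defaultdict


def frame_indices(total_frames, every_nth):
    """Choose which frames of a video to keep as still images for annotation."""
    return list(range(0, total_frames, every_nth))


class CentroidLearner:
    """Stand-in learner: remembers the average feature vector seen for each label."""

    def __init__(self):
        self._sums = {}                     # label -> running sum of features
        self._counts = defaultdict(int)     # label -> number of samples seen

    def feed(self, features, label):
        """Feed one labelled sample; every extra sample refines that label's average."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
        self._sums[label] = [s + f for s, f in zip(self._sums[label], features)]
        self._counts[label] += 1

    def predict(self, features):
        """Return the label whose average is closest to the given features."""
        def squared_distance(label):
            n = self._counts[label]
            return sum((s / n - f) ** 2 for s, f in zip(self._sums[label], features))
        return min(self._sums, key=squared_distance)


# Keep every 4th frame of a hypothetical 10-frame clip.
print(frame_indices(10, 4))

# Hypothetical labelled samples: (features extracted from an image, human label).
labelled = [
    ([0.9, 0.1], "car"),
    ([0.8, 0.2], "car"),
    ([0.1, 0.9], "pedestrian"),
    ([0.2, 0.7], "pedestrian"),
]
learner = CentroidLearner()
for features, label in labelled:
    learner.feed(features, label)

print(learner.predict([0.85, 0.15]))
```

The point of the toy is the dependency it exposes: the learner knows nothing until labelled samples arrive, and a mislabelled sample would pull a label's average in the wrong direction, which is why accurate labelling matters so much.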

The reason why these processes of Data Annotation are so important to AI and ML systems is simple: they enable machines to learn. As Sandesh highlights, in the very same way a human cannot read something in a particular language if they cannot understand that language, ML and AI cannot perform if the machine does not know how to identify and understand things. These processes are, therefore, necessary for the technology that is becoming increasingly vital in our daily lives to work.

Sandesh concludes: “Whether it is in humans or in machines, training is essential. This is what Data Annotation effectively is. There are various formats one can use to train machines, such as image, text, videos, and audio. In a way, we can call Data Annotation the future of AI and ML, for without data, machines cannot learn.”

Authored By

Rhys Taylor-Brown
Junior Editor