Brett Grentell

This Week in AI: Let us not forget the humble data annotator

Written by Kyle Wiggers 31/3/2024

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

This week in AI, I’d like to turn the spotlight on labeling and annotation startups — startups like Scale AI, which is reportedly in talks to raise new funds at a $13 billion valuation. Labeling and annotation platforms might not get the attention flashy new generative AI models like OpenAI’s Sora do. But they’re essential. Without them, modern AI models arguably wouldn’t exist.

The data on which many models train has to be labeled. Why? Labels, or tags, help the models understand and interpret data during the training process. For example, labels to train an image recognition model might take the form of markings around objects (“bounding boxes”) or captions referring to each person, place or object depicted in an image.
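
To make that concrete, here is a rough sketch of what one labeled image might look like as data. The field names, file path and pixel coordinates are all made up for illustration; real labeling platforms each define their own formats.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    """A rectangular marking around one object, in pixel coordinates."""
    label: str   # the tag an annotator assigned, e.g. "dog"
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        # Clamp at zero so a malformed box never reports a negative area.
        return max(0, self.x_max - self.x_min) * max(0, self.y_max - self.y_min)


@dataclass
class ImageAnnotation:
    """Everything an annotator attached to a single image."""
    image_path: str
    caption: str
    boxes: list[BoundingBox] = field(default_factory=list)

    def labels(self) -> list[str]:
        # Distinct object labels present in the image, in alphabetical order.
        return sorted({box.label for box in self.boxes})


# Hypothetical example record; the path and numbers are invented.
example = ImageAnnotation(
    image_path="images/park_0001.jpg",
    caption="A person walking a dog in a park",
    boxes=[
        BoundingBox("person", 40, 30, 180, 400),
        BoundingBox("dog", 200, 250, 340, 400),
    ],
)

print(example.labels())
```

A training pipeline would consume many thousands of records like this one, which is where the sheer volume of annotation work comes from.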

The accuracy and quality of labels significantly impact the performance — and reliability — of the trained models. And annotation is a vast undertaking, requiring thousands to millions of labels for the larger and more sophisticated data sets in use.

So you’d think data annotators would be treated well, paid living wages and given the same benefits that the engineers building the models themselves enjoy. But often, the opposite is true — a product of the brutal working conditions that many annotation and labeling startups foster.

Companies with billions in the bank, like OpenAI, have relied on annotators in developing countries paid only a few dollars per hour. Some of these annotators are exposed to highly disturbing content, like graphic imagery, yet aren’t given time off (as they’re usually contractors) or access to mental health resources.

An excellent piece in NY Mag pulls back the curtain on Scale AI in particular, which recruits annotators in places as far-flung as Nairobi, Kenya. Some of the tasks on Scale AI take labelers multiple eight-hour workdays, with no breaks, and pay as little as $10. And these workers are beholden to the whims of the platform. Annotators sometimes go long stretches without receiving work, or they’re unceremoniously booted off Scale AI, as happened to contractors in Thailand, Vietnam, Poland and Pakistan recently.