Data-Centric, Human-Powered: Exploring the double meaning of ‘humans-in-the-loop’
In the last issue of The Batch, DeepLearning.AI founder Andrew Ng opened up about his late grandfather and newborn child. He briefly described his grandfather’s storied and admirable life, and how he was only days away from meeting his first great-grandson. Expressing a mix of loss and celebration, Ng emphasized how time is scarce and, therefore, precious.
Why did Ng, the foremost figure in the AI movement, begin a newsletter like this, and in the midst of DeepLearning.AI’s Data-Centric AI Competition? What do mourning, childbirth, and timing have to do with machine learning? In a field focused on teaching machines, how human are we allowed to be?
The Data-Centric AI Competition
A collaboration between DeepLearning.AI and Landing AI, the Data-Centric AI Competition aims to improve the performance of machine learning models through new, data-centric approaches. Where most machine learning competitions focus on designing high-quality models, this competition challenges participants to improve the dataset instead. Through myriad enrichment methods, anyone has the chance to put their money where their ‘data-centric’ mouth is (which is something Innotescus co-founder Shashank Deshpande is doing quite well; just check the leaderboard…).
The data-centric movement is all about enhancing the data we use. Most AI research focuses on improving models, even though, as the movement argues, improving data quality can have a greater impact on model performance. It is a persuasive proposal; the work many innovators are doing now is figuring out how to act on it. Which strategies succeed in improving data, and how do we best implement them?
Data enhancement depends on a tried and tested approach to machine learning, in which experts use what they know to shape and curate datasets and, ultimately, enhance model performance. The human-in-the-loop (HITL) approach is typically a good way to keep your model progressing. Humans intervene to alter the course of operations by finessing a model’s output, creating more and better information for the model to use next time. It can be useful when data is scarce or being interpreted incorrectly, and when models (or the humans who created them) need to be more efficient and accurate.
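To make the loop concrete, here is a minimal sketch in Python of one common way to decide where a human steps in: predictions the model is unsure about go to a reviewer, and the corrected labels feed the next round of training. Everything in it is an assumption made for illustration, not something from the competition or from any particular tool: the names `Prediction`, `route_predictions`, and `build_training_labels` are hypothetical, and so is the 0.8 confidence threshold.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    """One model output for one data item (hypothetical structure)."""
    item_id: str
    label: str
    confidence: float  # model's confidence in `label`, between 0 and 1


def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted ones and ones needing review.

    The threshold is an illustrative assumption; in practice it would be
    tuned to the reviewers' capacity and the cost of a wrong label.
    """
    accepted, review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


def build_training_labels(
    predictions,
    ask_human: Callable[[Prediction], str],
    threshold=0.8,
):
    """Return item_id -> label, using human labels for uncertain items."""
    accepted, review = route_predictions(predictions, threshold)
    # Confident predictions keep the model's own label.
    labels = {p.item_id: p.label for p in accepted}
    # Uncertain predictions are replaced by whatever the reviewer decides.
    for p in review:
        labels[p.item_id] = ask_human(p)
    return labels
```

In use, `ask_human` would be whatever shows the item to a person and returns their answer, for example a labeling interface or a simple `input()` prompt. The resulting dictionary is the "more and better information" the model trains on next time.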
The human/AI dialectic
But HITL design is more than mere dependence on humans; it is strategic about where humans intervene. Automation does the work, and we craft it further. What results is a dialectical process between man and machine: an AI that is intrinsically human.
However, these machines are only as capable as their designers are humble. We are as much a liability to our models (and each other) as they are to us. “It seems that both good data science, and good citizenship, require epistemic humility.” Put another way, both ends of HITL design can be wrong, and may try to convince you to the contrary.
A non-technical definition
But how else can we look at HITL processes? One idea: focus instead on humanness-in-the-loop. We’ve already defined the “humanity” intrinsic to HITL AI, but where else can we find it? Maybe humanness-in-the-loop is not something we have to discover, but something we have to admit.
ML Leaders Remembering Out Loud
Eulogizing is one of the oldest ways to address the public. Through it, communities come together and deal with one of the hardest parts of being human. Like most epideictic speech, the eulogy appeals to values we share. These ceremonial addresses are all about blame or acclaim, and about the here and now. (As Socrates says, “It’s not hard to praise Athenians when you’re in Athens.”)
Ng is clearly acclaiming his grandfather, but here (a newsletter) and now (the last week of the Data-Centric AI Competition). The next headline in Ng’s newsletter reads:
The juxtaposition of life lost and the precarity of health in data science creates something that a research paper usually can’t: sympathy. By reflecting on the deep impact one life had on him, Ng walks his reader to the question, “So, who is this statistic?” It’s subtle (and maybe even unintentional), but it illustrates the inevitability of humanness-in-the-loop. Numbers become humans the same way humans become numbers.
In our moment of campy dancing robots and crypto volatility, prioritizing people may seem too simple or too vague an idea for such complex issues, and I don’t disagree. But I think that’s what we need to do: make our work data-centric by also making it human-centric. The data-centric movement sprang from the belief that our models can only do so much. Humanness, intrinsic to us and to the AI we create, is the thing that comes next. Even if AI is the new electricity, we still have to flip the switch.
We are a group of scientists, engineers, and entrepreneurs with a vision for better AI. With backgrounds primarily in Machine Learning and Computer Vision, the Innotescus team understands the importance of having full control over and insight into data used to train Machine Learning models.