Data-centric AI is all about taking control of not just our “data,” but all of the processes that bring that data together. Everything from class balancing and ambiguity analysis to simple data clean-up serves the same goal: recognizing that models can do so much more when the camera is in focus.

So the data-centric approach is much more than just polishing your finished product. Effective implementation can, and should, occur at every point in the MLOps cycle. The phase that stands to benefit most from these methods is the actual labeling of data, and possibly the best way to improve it is to focus on in-app usability.

New to MLOps? Read about it here or listen to how Innotescus gets it done in our previous webinar.

Analyzing usability, or ease of use, is a quality-forward approach to the user-facing operations we design. Our products are more usable when they prioritize effectiveness, efficiency, engagement, error tolerance, and ease of learning. The easier a platform is to learn and use, the higher the quality of the dataset it creates. Designing and redesigning with use in mind is a key data-centric approach.

Measuring usability forces us to clearly define our goals and honestly assess whether or not we accomplish them.

Here are four (of many) tips to help you improve your work.

1 — Make sure your platform knows something—fully manual platforms are tedious, but fully automatic platforms prove unreliable. When optimizing performance, designers and developers know that there is always a semi-automatic “sweet spot” they need to find. This challenge has two clear sides: leaving space for users and preventing potential errors.

But automation is more than just skipping a few steps. Effective UI makes sure that the platform has built-in knowledge a user can effortlessly combine with their own. But what does this look like?

Imagine a project that asks users to tag every instance of “bear” and “animal” in a photo, but ignores that every bear is, in fact, an animal itself.

Or instead, imagine a user who has domain-specific knowledge of what they’re annotating, but who can do little more than draw a box and click “next.”

In both cases, the task being asked of users cannot be fully completed in the given platform. These kinds of small issues, when repeated across a dataset, limit what your final model can actually do. Simply put, a great way to improve output is by implementing tools that better understand annotator input.
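For instance, “built-in knowledge” can be as simple as a label hierarchy that the platform expands on the annotator’s behalf. Here is a minimal sketch, assuming a hypothetical `CLASS_HIERARCHY` mapping defined by the project owner; it is illustrative only, not any particular platform’s API.

```python
# Hypothetical parent-child relationships defined by the project owner.
CLASS_HIERARCHY = {
    "bear": "animal",
    "dog": "animal",
    "animal": None,  # top-level class
}

def expand_labels(label: str) -> list[str]:
    """Return the label plus every ancestor in the hierarchy."""
    labels = []
    current = label
    while current is not None:
        labels.append(current)
        current = CLASS_HIERARCHY.get(current)
    return labels

print(expand_labels("bear"))  # ['bear', 'animal']
```

With this one piece of knowledge built in, tagging “bear” also records “animal,” and the annotator never has to make the same obvious click twice.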

2 — Don’t Reinvent the Wheel (or Metaphor)—You finish creating a feature that lets users trim pixels off the edges of a selection, and it comes time to name it. When you suggest calling it “the peeler tool,” your idea meets a surprising amount of pushback (clearly they don’t get the literary reference to your favorite poem…). But what’s the harm in picking a stand-out name?

Unfortunately, neither of your test users has encountered a “peeler tool” before. User 1 will read “peeler” and assume the tool does something with layers, like peeling a banana. User 2 will get closer, but will be surprised when clicking in the middle of a selection doesn’t remove the outermost row of pixels (like a vegetable peeler would, right?) but instead puts a tiny hole right where they clicked. In both cases, time is lost to a slightly atypical feature taxonomy.

When users enter your platform, they bring with them a myriad of previous associations. If your design uses familiar metaphors, users will quickly learn and memorize the corresponding tools. In the above case, both users have almost certainly encountered an “eraser tool” before, and would be better served by a design that uses previous knowledge.

3 — Detours, not stop signs—However thorough and heuristically sound your UI is, a user is bound to fall through the cracks of your design. Annotators will encounter issues, so what can they do next?

And let’s think a little bigger than the “Help” menu…

Users are most productive when they know what they can do after an issue occurs. When they remember a mistake they made three examples prior, or they place a bounding box the platform doesn’t accept, there needs to be a way to be wrong while still moving forward. It may be as simple as including a “skip” button, or letting the user leave a quick comment on their work. Make sure that, when something goes wrong, your user is met with a detour, not a stop sign.
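One way to picture this is an annotation record that can be skipped or flagged with a comment instead of only accepted or rejected. The sketch below is purely illustrative; the `AnnotationRecord` class and its fields are assumptions, not a real platform’s data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AnnotationRecord:
    image_id: str
    status: str = "pending"        # "submitted", "skipped", or "flagged"
    comment: Optional[str] = None  # free-text note left by the annotator

    def skip(self, reason: str) -> None:
        """Move on without a label, but keep the reason for reviewers."""
        self.status = "skipped"
        self.comment = reason

record = AnnotationRecord(image_id="img_0042")
record.skip("Bounding box rejected; object is partially out of frame.")
```

The annotator keeps working, and the reviewer inherits a clear trail of what went wrong and where.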

4 — Make mistakes useful—So we’ve established that humans make mistakes. But what do we do with that? In the data-centric AI movement, what do you do with hiccups in the process? The seemingly too-simple approach is to make mistakes useful.

This could mean creating learning opportunities for annotators, or flagging an image that is consistently mislabeled. In general, instead of dumping all mistakes into the waste bin, create features that ask errors “why?”
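As a rough illustration of flagging consistently mislabeled images, the snippet below tallies reviewer corrections per image and surfaces the repeat offenders. The `review_log` data and the threshold are hypothetical, but the idea is that repeated corrections usually point to ambiguous data or unclear instructions rather than careless annotators.

```python
from collections import Counter

# Hypothetical review log: (image_id, was_corrected_by_reviewer)
review_log = [
    ("img_001", True),
    ("img_001", True),
    ("img_002", False),
    ("img_001", True),
    ("img_003", True),
]

corrections = Counter(img for img, corrected in review_log if corrected)

# Flag any image corrected more than twice for a closer look.
flagged = [img for img, count in corrections.items() if count > 2]
print(flagged)  # ['img_001']
```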

For a good example, look at Innotescus’s bi-directional communication system. By tagging problems and leaving short messages, supervisors can point out annotator mistakes and ensure that users have a good understanding of what went wrong.

Conclusion

So to recap:

  • Make sure your platform knows something
  • Don’t Reinvent the Wheel (or Metaphor)
  • Detours, not stop signs
  • Make mistakes useful

These tips remind me that, despite our assumptions or our ignorance, users always have something to tell us. We can best listen by finding simple ways to understand user logic and to build adaptability into what we design.