Face It, We’re in Awe: Meta’s AI Trains Its Gaze on Us with 300 Million Lessons

Who did ‘always watching’ better – Big Brother or Meta? Judging by the recent unveiling of its latest suite of AI models, Sapiens, Meta is taking a shot at the crown – and let’s hope it’s just for the betterment of AI (none of that creepy 1984 stuff).

First things first: how does Sapiens work? Think of it as a baby learning to identify shapes, only with pixels. These models were fed a super-sized menu of 300 million human images before their debut, helping them become mini-Van Goghs at tasks like 2D pose estimation, body segmentation, depth estimation, and surface normal estimation. Don’t worry, ‘surface normal estimation’ does not refer to rating your average Joe (though that would be some gossip-worthy AI). It means working out which way each surface faces in 3D space, a vital skill for making photorealistic 3D models.
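For the curious, here is a minimal sketch of what a "surface normal" actually is. It is not how Sapiens does it (Sapiens predicts normals with a trained model); this is plain textbook geometry that assumes you already have a grid of depth values. The function name `normals_from_depth` and the tiny sample grids are made up for illustration.

```python
import math

def normals_from_depth(depth):
    """Estimate a unit surface normal per pixel from a 2D depth grid.

    Toy geometric approach using finite differences; NOT the learned
    method used by Sapiens. `depth` is a list of equal-length rows.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences in the interior, one-sided at the borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            # A surface z = f(x, y) has a normal proportional to (-dz/dx, -dz/dy, 1).
            nx, ny, nz = -dzdx, -dzdy, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            row.append((nx / length, ny / length, nz / length))
        normals.append(row)
    return normals

# Hypothetical inputs: a flat wall, and a ramp whose depth grows left to right.
flat = [[2.0] * 3 for _ in range(3)]
ramp = [[float(x) for x in range(3)] for _ in range(3)]

print(normals_from_depth(flat)[1][1])  # flat wall: normal points straight along z
print(normals_from_depth(ramp)[1][1])  # ramp: normal tilts away from the slope
```

The flat wall gets normals pointing straight out, while the ramp's normals lean to one side. That "which way is this bit of surface facing" signal is what a 3D renderer needs to light a model convincingly.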

Meta claims that Sapiens models outperform the previous class valedictorians among AI models on these tasks. Specifically, it says the Sapiens-2B model improves body segmentation accuracy over earlier models by a whopping 17 percentage points. In layman’s terms, it’s like moving from a stick-figure drawing to a decent caricature.
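A quick aside on "percentage points", since they are easy to confuse with plain percent. In the sketch below, the baseline score is a made-up number used purely for illustration; only the 17-point gain comes from the claim above.

```python
def relative_gain(baseline, points_gained):
    """Relative improvement (in percent) for a gain given in percentage points."""
    return 100.0 * points_gained / baseline

HYPOTHETICAL_BASELINE = 60.0  # assumed score for illustration, NOT a reported figure
POINTS_GAINED = 17.0          # the gain cited in the article

new_score = HYPOTHETICAL_BASELINE + POINTS_GAINED
rel = relative_gain(HYPOTHETICAL_BASELINE, POINTS_GAINED)
print(f"{HYPOTHETICAL_BASELINE:.0f} -> {new_score:.0f}: "
      f"+{POINTS_GAINED:.0f} points, about {rel:.1f}% better in relative terms")
```

The lower the starting score, the bigger a 17-point jump looks in relative terms, which is why the two ways of counting should not be mixed up.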

One might think that size doesn’t matter, but in the realm of AI, it does (at least in Meta’s view). The largest model, Sapiens-2B, is a 2-billion-parameter beast, trained on images at a resolution of 1024 by 1024 pixels. Training at this resolution lets the model pick up finer detail than models trained on lower-resolution images. But don’t let the jargon intimidate you! It’s like going from an old TV to a fancy 4K set – absolutely no comparison, and we would know if we could afford 4K TVs (wink wink!).
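To put the resolution in numbers, here is a back-of-the-envelope sketch. The 224-pixel comparison size and the 16-pixel patch size are assumptions picked for illustration (both are common in vision models), not figures from Meta's announcement.

```python
def pixel_count(side):
    """Number of pixels in a square image with the given side length."""
    return side * side

def patch_count(side, patch):
    """Number of non-overlapping square patches that tile a square image."""
    return (side // patch) ** 2

HIGH_RES = 1024     # training resolution mentioned in the article
LOW_RES = 224       # assumed lower-res comparison point, not from Meta
ASSUMED_PATCH = 16  # assumed patch size, for illustration only

high = pixel_count(HIGH_RES)
low = pixel_count(LOW_RES)
print(high)                                  # 1048576 pixels per image
print(round(high / low, 1))                  # how many times more pixels than the low-res image
print(patch_count(HIGH_RES, ASSUMED_PATCH))  # 4096 patches under the assumed patch size
```

Under these assumptions, each training image carries roughly twenty times the pixels of the low-res one, which is the "old TV versus 4K" gap in plain arithmetic.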

However, like a toddler learning to navigate messy mealtimes, Sapiens still struggles with complex poses, crowded scenes, and significant occlusions, where parts of a person are hidden from view like a pro hide-and-seek player. But hey, just as humans do, it learns from its… PHD (Pretty Huge Datasets).

These new models are also philanthropic (in a cyber manner, of course). Meta is opening up the Sapiens secret sauce to the research community on GitHub. It sees the models as potential helping hands for anyone who needs to label a humongous heap of real-world data. So, in the future, we could see even more advanced systems for understanding human-centric images, built with some help from the goodwill of Sapiens.

So, the next time you post a picture, remember you’re potentially training some AI prodigy out there. Only this time, its name is Sapiens.

Let’s keep a watchful eye on the future so that Big Brother doesn’t do it for us first. With so much expected of AI, do you think it’s heading in the right direction? Share your thoughts and let’s get this discussion rolling!