Zach Winn reporting in MIT News:
MIT alumnus-founded Netra is using artificial intelligence to improve video analysis at scale. The company’s system can identify activities, objects, emotions, locations, and more to organize and provide context to videos in new ways.
Netra’s solution analyzes video content to identify meaningful constructs in service of more accurate organization. This improves searchability and the pairing of video content with relevant ads. How does this work?
Netra can quickly analyze videos and organize the content based on what’s going on in different clips, including scenes where people are doing similar things, expressing similar emotions, using similar products, and more. Netra’s analysis generates metadata for different scenes, but [Netra CTO Shashi Kant] says Netra’s system provides much more than keyword tagging.
“What we work with are embeddings,” Kant explains, referring to how his system classifies content. “If there’s a scene of someone hitting a home run, there’s a certain signature to that, and we generate an embedding for that. An embedding is a sequence of numbers, or a ‘vector,’ that captures the essence of a piece of content. Tags are just human readable representations of that. So, we’ll train a model that detects all the home runs, but underneath the cover there’s a neural network, and it’s creating an embedding of that video, and that differentiates the scene in other ways from an out or a walk.”
This notion of ‘vectors’ is intriguing — and it sounds like an approach that might be applicable beyond videos. I imagine analyzing the evolution of such vectors over time is essential to deriving relevant contextual information from timeline-based media like video and audio. But I expect such meaningful relationships could also be derived from text.
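To make the idea concrete, here is a minimal sketch of how embeddings can be compared. It is not Netra's method: the four-dimensional vectors and the clip names are made up for illustration, and cosine similarity is just one common way to measure how close two vectors are. Real embeddings would come from a trained model and usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example only.
scenes = {
    "home_run_clip_1": [0.9, 0.1, 0.8, 0.2],
    "home_run_clip_2": [0.8, 0.2, 0.9, 0.1],
    "walk_clip": [0.1, 0.9, 0.2, 0.7],
}


def most_similar(query_name, scenes):
    """Return (name, similarity) for the scene closest to the query scene."""
    query = scenes[query_name]
    others = [
        (name, cosine_similarity(query, vector))
        for name, vector in scenes.items()
        if name != query_name
    ]
    return max(others, key=lambda pair: pair[1])


name, score = most_similar("home_run_clip_1", scenes)
print(name, round(score, 3))
```

With these toy numbers, the two home run clips land much closer to each other than either does to the walk. The same comparison would apply to embeddings of text passages, since nothing in it depends on where the vectors came from.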
Systems that do this type of analysis could supplement (or eventually replace) the more granular aspects of information architecture (IA) work. Given the pace of progress in ML modeling, “big” IA (especially high-level conceptual modeling) represents the future of the discipline.
_[Improving the way videos are organized | MIT News | Massachusetts Institute of Technology](https://news.mit.edu/2021/netra-video-ai-0520)_