By T Shobhana, VP Global Head Marketing and Communications, Prime Focus Technologies
Advances in artificial intelligence (AI) are seeping into our lives every day, affecting the way we live and the choices we make. From driverless cars to voice-powered smart assistants to curated social media feeds, AI is already in daily use across the world.
Combined with machine learning (ML), AI can now speed up data analysis with reasonably high accuracy, giving decision makers faster and better information and opening up new possibilities for revenue generation. In the media and entertainment (M&E) industry, AI is starting to make deep inroads into media asset management by automating tasks that are otherwise time-consuming or repetitive.
A number of ready-made AI tools are already available off the shelf to recognize different aspects of video content, such as objects, spoken dialog, on-screen text, celebrity faces and emotions. These tools support automated metadata tagging, caption generation and similar functions.
The problem with (artificial) intelligence
Many companies have eagerly invested in general-purpose AI tools, hoping to reduce costs and save time. While they have seen some business benefits, they have also run into several challenges. The underlying reason is that most AI tools are trained to solve generic use cases across multiple industries, which limits their effectiveness on media-specific tasks. Also, while standalone AI engines can recognize specific elements within individual video frames, they cannot combine these findings in a contextual manner to identify complex sequences (like a particular character in a car chase).
This greatly reduces the accuracy of search results, and users are often left wading through a large number of irrelevant clips before they arrive at the content they were looking for.
Here’s where wisdom, as opposed to mere intelligence, can add tremendous value. There now exist native media recognition AI platforms that build machine wisdom from intelligence to transform business operations. Such platforms synthesize and prioritize the most important annotations to deliver search results that are far more meaningful and contextual. They draw on the collective intelligence of the industry’s most sophisticated AI solutions, together with homegrown engines, to identify complex elements such as compound objects, sounds, commentary, actions and the best-suited (contextual) thumbnails.
AI tools powered by wisdom go well beyond recognizing a limited set of keywords and are designed to meet the specific needs of content owners that off-the-shelf models cannot address. With a rich feature set based on natural language processing (NLP), fuzzy logic and a thesaurus, these platforms can be used to search digital media and easily extract relevant content for multiple use cases across the content lifecycle, from creation and post-production to distribution and marketing.
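To make the idea of going beyond exact keywords concrete, here is a minimal Python sketch of a tag search that tolerates typos and expands query terms through a thesaurus. It is an illustration only, not how any particular platform works: the `THESAURUS` and `CLIPS` data, the function names and the 0.8 similarity threshold are all assumptions, and fuzzy string matching from the standard library stands in for the fuzzy logic mentioned above.

```python
from difflib import SequenceMatcher

# Hypothetical thesaurus: query term -> related terms (illustrative data only).
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "chase": {"pursuit"},
}

# Hypothetical clip catalogue: clip id -> metadata tags (illustrative data only).
CLIPS = {
    "clip_001": ["vehicle", "pursuit", "highway"],
    "clip_002": ["kitchen", "dialogue"],
    "clip_003": ["car", "parking"],
}


def expand(term):
    """Return the term together with any thesaurus entries for it."""
    return {term} | THESAURUS.get(term, set())


def fuzzy_match(a, b, threshold=0.8):
    """Treat two strings as equal when their similarity ratio reaches the threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def search(query, clips=CLIPS):
    """Rank clips by how many query terms match a tag, directly or via the thesaurus."""
    terms = query.lower().split()
    scores = {}
    for clip_id, tags in clips.items():
        hits = sum(
            1
            for term in terms
            if any(fuzzy_match(c, tag) for c in expand(term) for tag in tags)
        )
        if hits:
            scores[clip_id] = hits
    # Highest score first; ties broken by clip id for a stable order.
    return sorted(scores, key=lambda clip_id: (-scores[clip_id], clip_id))
```

In this toy catalogue, a query for "car chase" ranks `clip_001` first even though neither word appears in its tags, because the thesaurus links "car" to "vehicle" and "chase" to "pursuit". A misspelled query such as "vehical" still finds the same clip through the similarity threshold.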
Intelligent search and recommendations, promo creation, segment analysis, automatic ad placement and video interactivity are just a few areas where wisdom can add business value for M&E companies.
The path to wisdom
A closer look at the nature of audio-visual content helps explain why wisdom is a giant leap ahead of intelligence. Much of the communicative power of media assets lies in the combined effect of the audio and the video. Capturing the essence of a scene, image or line therefore often depends on capturing not only the dialog, but also the imagery, the action and the context, and then putting them together to understand the true meaning of the scene. In other words, it means going beyond the facts of the description, which we can call intelligence, to the true meaning, which we can call wisdom.
Standalone AI engines may be able to recognize specific elements within individual video frames. But no matter how good each individual recognition tool is, if the tools cannot combine their findings in a contextual manner, they cannot provide the wisdom needed to identify complex sequences (like a particular character in a car chase) or the true meaning of a scene (like the line “As you wish” in The Princess Bride, which in context means “I love you”). By yielding logical, contextual search results, wisdom-driven platforms can facilitate faster decision making, improve scalability and open up new monetization opportunities.
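The gap between isolated detections and a contextual result can be sketched in a few lines of Python. The example below assumes several hypothetical engines (face, object, action) that each emit timestamped labels, and it reports the time spans in which all the required labels occur together. The `Detection` type, the `find_sequences` function, the sample data and the five-second window are illustrative assumptions, not a description of any real product.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    time_s: float  # timestamp of the analysed frame, in seconds
    engine: str    # hypothetical engine that produced the label
    label: str     # what the engine recognized


def find_sequences(detections, required, window_s=5.0):
    """Return (start, end) spans in which every required label co-occurs.

    Detections are grouped into fixed windows of window_s seconds, and
    adjacent windows that each contain all required labels are merged.
    """
    required = set(required)
    buckets = {}
    for d in detections:
        buckets.setdefault(int(d.time_s // window_s), set()).add(d.label)

    matching = sorted(i for i, labels in buckets.items() if required <= labels)

    spans = []
    for i in matching:
        start, end = i * window_s, (i + 1) * window_s
        if spans and spans[-1][1] == start:
            # Extend the previous span when this window follows it directly.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


# Illustrative output from three separate engines.
SAMPLE = [
    Detection(1.0, "face", "character_a"),
    Detection(2.0, "object", "car"),
    Detection(3.5, "action", "chase"),
    Detection(6.0, "face", "character_a"),
    Detection(7.0, "object", "car"),
    Detection(8.0, "action", "chase"),
    Detection(21.0, "object", "car"),
    Detection(22.0, "face", "character_a"),
]

car_chase = find_sequences(SAMPLE, {"character_a", "car", "chase"})
```

With this sample data, a search for the car alone returns two spans, while requiring the character, the car and the chase together narrows the result to the first ten seconds. Fixed windows are the simplest possible way to combine signals; a production system would likely use shot boundaries or learned models instead.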
Recently, advances in AI have been taking place at a furious pace. As user experience personalization becomes more important across the board, companies are using AI to create personalized services for billions of customers. The stage is set for leveraging wisdom to greatly speed up metadata tagging, media localization, review of existing content libraries and many other tasks that help create new revenue opportunities, while reducing costs.
Of course, machine wisdom does have limitations, and human quality control remains paramount. For example, while AI voice recognition capabilities have improved over time, challenges remain with content that has considerable cross-talk, background noise, heavy accents or a lot of high-context material (like sarcasm or humor). But with extensive research and development yielding new advances every day, wisdom (on the back of AI and ML) clearly has the promise to help M&E companies enhance operational efficiencies and achieve faster time-to-market.