M&E Journal: The Evolution – and Applications – of the Video-Based Search Engine

By Eyal Arad, CEO, Co-Founder, Videocites

Text-based search engines are the standard tool for discovering and managing video by title, tag, or any other available metadata. Their limitation is their dependence on that metadata. When metadata is missing, inaccurate, in a different language, or misleading (e.g. altered titles), a search returns either too few results or too many irrelevant ones.

Audio-based search engines (“Shazam-like”) were the next important evolutionary step. They use audio information to identify songs and audio tracks. In some cases, they are also employed to search for videos by their soundtracks (e.g. for IP protection). However, audio information lacks the properties required to handle many real-life scenarios in the visual content domain.

Background noise (e.g. crowds at sports events), simple audio manipulation, different dubbing, unrelated videos that contain only the desired audio, or missing audio all lead to very poor video search performance.

The next generation came in the form of image-based search engines, which made it possible to use the visual properties of an image (captured in a unique, rich and large image fingerprint) to track similar images of different sizes and shapes, regardless of any metadata. These engines contributed significantly to the photo rights management industry.

Two contradicting factors constrain the building of a fast, accurate video-based search engine:

* The video fingerprint has to be very rich with all the unique visual properties of the frames.

* It has to be ultra-lightweight in order to allow for a reasonably sized, searchable index that will comprise an enormous number of videos and frames.
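The tension between the two constraints can be illustrated with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the workload (100 million videos, five minutes each, one sampled frame per second) and both fingerprint sizes are assumed figures chosen for the example, not numbers from any real engine.

```python
def index_size_bytes(num_videos, avg_seconds, sampled_fps, bytes_per_frame):
    """Total fingerprint storage for a frame-level index, in bytes."""
    frames = num_videos * avg_seconds * sampled_fps
    return frames * bytes_per_frame


# Hypothetical workload (assumed values, for illustration only).
NUM_VIDEOS = 100_000_000   # videos in the index
AVG_SECONDS = 300          # average video length: 5 minutes
SAMPLED_FPS = 1            # frames fingerprinted per second of video

# A "rich" fingerprint: assumed 4,096 float32 values per frame (16,384 bytes).
rich = index_size_bytes(NUM_VIDEOS, AVG_SECONDS, SAMPLED_FPS, bytes_per_frame=16_384)

# An "ultra-lightweight" fingerprint: assumed 64 bits per frame (8 bytes).
compact = index_size_bytes(NUM_VIDEOS, AVG_SECONDS, SAMPLED_FPS, bytes_per_frame=8)

print(f"rich fingerprints:    {rich / 1e12:.1f} TB")
print(f"compact fingerprints: {compact / 1e12:.2f} TB")
print(f"ratio: {rich // compact}x")
```

Under these assumptions the rich index is roughly 2,000 times larger than the compact one, which is why a fingerprint must be both descriptive and small before a frame-level index becomes practical.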

Before the age of machine learning, these contradicting constraints simply did not allow for a robust solution.

Although some attempts were made to use image fingerprints to build a video-based search, they could neither cope with videos at such a large scale nor withstand manipulations; they were therefore effective only for very limited video databases and basic search scenarios.

The video-based search engine

With the appearance of machine learning technology, these constraints could finally co-exist, positioning the video-based search engine as an infrastructure for various new video management applications and tools that had not been possible until then.

The key features of this engine include:

* The ability to index the visual information in every frame of every video in a multitude of databases and platforms, making it searchable on a frame-level basis.

* Immediate search retrieval.

* Search accuracy that returns all the matching videos, and only those.

* Search scalability, i.e. the ability to run many parallel searches at the same time, on huge video databases and at a high frequency.
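To make frame-level indexing concrete, here is a minimal sketch. It does not reproduce any vendor's method: it swaps in a classic "average hash" (one bit per pixel, set when the pixel is brighter than the frame's mean) as a stand-in for a learned fingerprint, and it scans the index linearly, which would not scale to the database sizes described above. The names `average_hash`, `FrameIndex` and `max_distance` are hypothetical.

```python
def average_hash(frame):
    """Pack a small grayscale frame (rows of 0-255 ints) into an integer.

    Each bit is 1 when the corresponding pixel is brighter than the mean.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class FrameIndex:
    """Toy frame-level index: one fingerprint per frame of every video."""

    def __init__(self):
        self.entries = []  # (fingerprint, video_id, frame_number)

    def add_video(self, video_id, frames):
        for number, frame in enumerate(frames):
            self.entries.append((average_hash(frame), video_id, number))

    def search(self, frame, max_distance=2):
        """Return (video_id, frame_number) for every near-matching frame."""
        query = average_hash(frame)
        return [(vid, n) for fp, vid, n in self.entries
                if hamming(fp, query) <= max_distance]


# Two tiny 4x4 "frames": left-dark/right-bright, and top-bright/bottom-dark.
frame_a = [[10, 10, 200, 200]] * 4
frame_b = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

index = FrameIndex()
index.add_video("clip-1", [frame_a, frame_b])
index.add_video("clip-2", [frame_b])

# A uniformly brightened copy of frame_a still matches, with no metadata used.
brightened = [[p + 30 for p in row] for row in frame_a]
print(index.search(brightened))
```

The query finds the matching frame in `clip-1` even though the pixel values changed, because the fingerprint encodes relative brightness rather than exact values. A production engine would need a far more robust fingerprint and a sub-linear lookup structure.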

What is the video-based search engine good for?

Video-based search is relevant for many industries and domains. The points below describe interesting use cases for multiple business units in the M&E industry.

* Video tracking within huge databases. When searching for a certain video and its copies, a video-based engine has a huge advantage over a text-based one. It does not depend on any metadata, language or audio, and therefore filters out many irrelevant results, saving precious human review time. In addition, video-based search provides a complete solution by also detecting manipulated and partial copies.

Some of these relevant use cases are:

* IP protection and video monetization, where what matters most is finding all illegal copies online as soon as they appear.

* Detection of videos leaked to social platforms before their official release date.

* Detection of live streaming and rebroadcasts.

* Internal catalog search that, for example, finds all the different versions of a movie within a large, old and untagged inventory (and tags them all at once).

* A video validation tool that verifies whether a video already exists (fully or partially), for example within the inventory of a specific distributor.

* Storage cost reduction by video deduping and clustering.

* Fast talent acquisition by immediately tracking the first uploader of, for example, a viral video on social platforms.
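The deduping and clustering use case in the list above can be sketched as follows. This is an illustration under simplifying assumptions: each video is represented by a single hypothetical integer fingerprint (a real engine would compare many frame-level fingerprints), near-duplicates are defined as fingerprints within `max_distance` differing bits, and grouping is done with a small union-find over all pairs. The file names and fingerprint values are invented for the example.

```python
def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def cluster_duplicates(fingerprints, max_distance=3):
    """Group video ids whose fingerprints differ by at most max_distance bits.

    fingerprints: dict mapping video_id -> integer fingerprint (assumed input).
    Returns a sorted list of clusters, each a sorted list of video ids.
    """
    parent = {vid: vid for vid in fingerprints}

    def find(v):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    ids = list(fingerprints)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if hamming(fingerprints[a], fingerprints[b]) <= max_distance:
                parent[find(b)] = find(a)  # merge the two clusters

    clusters = {}
    for vid in ids:
        clusters.setdefault(find(vid), []).append(vid)
    return sorted(sorted(group) for group in clusters.values())


# Invented catalog: three near-identical versions and one unrelated video.
catalog = {
    "master.mov": 0b1111000011110000,
    "copy_720p.mp4": 0b1111000011110001,
    "recut.mp4": 0b1111000011110011,
    "other.mov": 0b0000111100001111,
}

for group in cluster_duplicates(catalog):
    print(group)
```

Each cluster with more than one member is a candidate for keeping a single stored copy, or for tagging all versions at once. The all-pairs comparison is quadratic, so it is only suitable for small catalogs.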

* Advanced analytics. This new technology also provides another layer of information and analytics around videos and their copies. Many of the patterns and insights in the examples above can only be revealed through the lens of video-based search. Analytics that were previously based on official video metrics alone, such as the viewership of a promo or an ad, can now be enriched with the engagement metrics of unofficial copies, which usually surpass those of the official version by far.

Insights can be related to:

* Real traction of content over time and across platforms.

* Brand awareness, audience preferences and discovery of potentially relevant influencers.

* Lost revenue, monetization opportunities, competition and high-volume infringers.

To summarize, as video content proliferates and plays a major role in every aspect of our lives, better and more efficient ways of retrieving content must be developed and adopted. Boosted by state-of-the-art machine learning, the video-based search engine is a new addition to the M&E industry toolbox that simplifies video management and makes it much more controlled, transparent and efficient.


This article appeared in the Spring/Summer 2018 M&E Journal.