M&E Connections

M&E Journal: Q&A: Putting Sony Pictures Entertainment’s Best Footage Forward With AI

By Jason Lambert, Executive Director, Content Licensing and Metadata Services, Sony Pictures Entertainment, and Daniel Mayer, CMO, Expert System

With a deep library of clips from blockbuster franchises and acclaimed cinematographic masterpieces, Sony Pictures Entertainment runs a thriving licensing business. But for customers to find clips, they need a rich, taxonomy-driven metadata schema capturing the essence of each clip.

That’s why SPE launched its taxonomy and metadata enrichment initiative, scaling the project with AI. Expert System CMO Daniel Mayer discussed the initiative with Jason Lambert, Executive Director of Content Licensing at SPE.

Mayer: Jason, will you tell us a bit about the stock footage business at Sony Pictures?

Lambert: Our third-party stock footage business was launched about 15 years ago. There had been an archive of stock footage assets prior to that, but it was mainly serving the needs of internal production teams. The studio collected stock footage assets, and when an internal team needed, say, footage involving police cars, the stock footage librarian would go into the vault and search through print reels, paperwork or card catalog files to find relevant material.

When we set about to launch our third-party business, we digitized and re-cataloged the entire archive, which made searching and reviewing those assets much easier. As a result, the demand for stock footage grew exponentially. Today, we’re serving a wide audience that includes ad agencies, documentary producers and the other studios. We have approximately 180,000 stock footage clips at present, and the archive is growing by 5,000 to 10,000 clips every year.

Mayer: Why did you get started on your taxonomy project?

Lambert: The stock footage business has become more competitive over the past fifteen years. Because the cost of producing high-quality digital content has gone down dramatically, there’s much more content available in the marketplace. I still believe that having the best content is the first success factor, but a close second is simply finding the shot the customer needs. You’ve got to find the shot to license the shot. For this reason, an effective search experience has become critical to the business, and the foundation of an effective search experience is good metadata, with taxonomy as its organizing principle.

The background for this effort was the launch of our stock footage web site — a site designed to marry intuitive, taxonomy-based searching for our customers with the white-glove research services we already provide.

Mayer: What impact do you anticipate with this taxonomy?

Lambert: Our customers don’t necessarily search using the same words that our indexers pull from our controlled vocabularies when cataloging. To use a simplified example, a customer might look for footage involving a “dog” whereas we might have indexed a series of shots more specifically with the term “dachshund.” There are all sorts of variants of this situation of course, but the basic principle is that you need to make sure it’s possible to search by category rather than by instance, by part rather than by whole, by related term (and so on), so users can search however they want.

And that’s when you realize that what you need is actually a taxonomy: a schema that codifies the relationships between searchable objects in your assets.

From the user’s standpoint, the benefit of a taxonomy is that it will become easier to find what they’re looking for. The taxonomy helps funnel users back to the object they are searching for even if they don’t use the same term as the indexer. It also enables us to build guided search experiences where users can follow the taxonomy to drill down into the content rather than poking around in the dark. To go back to the previous example, the term “dog” can point towards different dog breeds (like “dachshund”), cascade further towards breed variants (“smooth hair,” “longhaired,” “wirehaired”), or incorporate alternative terms (like “canine” or “pooch”). From the business standpoint, we think increased discoverability will lead to increased licensing.
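The query-expansion behavior described above can be sketched in a few lines of code. This is a minimal illustration, not SPE's actual system or vocabulary: the taxonomy here is a toy dictionary built from the "dog"/"dachshund" example in the interview, with narrower terms and alternative labels as hypothetical entries.

```python
# Toy taxonomy: each term maps to its narrower (more specific) terms
# and its alternative labels. Entries are illustrative only.
TAXONOMY = {
    "dog": {
        "narrower": ["dachshund"],
        "alt_labels": ["canine", "pooch"],
    },
    "dachshund": {
        "narrower": [
            "smooth hair dachshund",
            "longhaired dachshund",
            "wirehaired dachshund",
        ],
        "alt_labels": [],
    },
}


def expand_query(term):
    """Return the term plus all narrower terms (recursively) and alt labels.

    A search for "dog" then also matches assets indexed under the more
    specific term "dachshund" or any of its breed variants.
    """
    terms = {term}
    entry = TAXONOMY.get(term, {})
    terms.update(entry.get("alt_labels", []))
    for narrower in entry.get("narrower", []):
        terms |= expand_query(narrower)
    return terms
```

For example, `expand_query("dog")` includes "dachshund", "wirehaired dachshund", and "canine", so a customer searching for "dog" finds clips an indexer cataloged at any level of specificity.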

Mayer: But did you already have this taxonomy, or did you have to build it?

Lambert: We had to build it, and it took some time and effort. We started by reviewing our existing vocabularies, as well as historical logs of customer requests. That gave us a starting point, after which we went through the process of grouping all of our existing terms under subcategories. We drew inspiration from existing third-party thesauri, including the Library of Congress’s Thesaurus for Graphic Materials, which overlaps with the taxonomy we felt was needed for our business.

Today we’ve settled on a taxonomy of around 11,000 terms, which we feel is ready for prime time. That said, as we move forward, we expect to continue enriching and pruning the taxonomy based on input from our subject matter experts, whether they are the catalogers themselves or elsewhere in operations or sales.

Mayer: How does AI help with the process?

Lambert: Well, taking a step back from the taxonomy itself, the tricky part is that if we had to apply the taxonomy to our asset metadata manually, it wouldn’t be possible given the volumes we’re dealing with. The answer to this problem is AI.

Natural Language Processing, the variety of AI that we are using on this project, does two things: It hosts our taxonomy and it processes our content. When it recognizes the concepts of our taxonomy in our assets, it applies the correct metadata automatically. This makes the entire process painless, and when we update our taxonomy — which is bound to happen — we can update our metadata through the same process.
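The concept-recognition step described here can be sketched as simple label matching. This is only an illustration of the principle under assumed data: a real NLP platform such as the one described would also handle morphology, disambiguation, and context, and the labels and terms below are hypothetical, not SPE's or Expert System's actual vocabulary.

```python
# Minimal sketch of concept-based auto-tagging: scan an asset's free-text
# description for taxonomy labels (including alternative labels) and
# attach the canonical taxonomy term as metadata. All entries are
# illustrative.
CONCEPT_LABELS = {
    "dachshund": "dachshund",
    "wiener dog": "dachshund",   # alternative label -> canonical term
    "police car": "police car",
    "patrol car": "police car",
}


def tag_asset(description):
    """Return the canonical taxonomy terms recognized in a description."""
    text = description.lower()
    return {
        canonical
        for label, canonical in CONCEPT_LABELS.items()
        if label in text
    }
```

Because tagging is driven by the taxonomy rather than hand-entered keywords, re-running the same process after a taxonomy update refreshes the metadata across the whole archive, which is the workflow described above.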

Mayer: Given your experience so far, how do you view the role AI can play in media and entertainment?

Lambert: I think AI has transformative potential for the business. Today we’re using it as the means to give our stock footage customers a great search experience and to grow our licensing business, but I see it applying to other types of assets in our licensing inventory — film clips, photos, posters, etc. We’ve already taken steps to index the scenes in our top 500 feature films with a combination of human tagging and AI services (facial recognition, phonetic indexing).

As AI services continue to improve, our ability to scale that tagging effort across the entire SPE catalog will continue to improve as well.

Then there’s the range of analytics that we could apply to movies using AI services: What’s the theme, what are the story points? What’s the character’s behavior in any particular scene? Can we then connect the dots between those data points and our audience’s reactions? That kind of approach opens a breadth of insights on the creative side as well.

