Enriching the BBC Archive: IBM Watson Media Video Enrichment and Natural Language Understanding

Using IBM Watson Media Video Enrichment and advanced Natural Language Understanding to add value to the BBC archive so that content can be harvested for re-use. (Paper presented at the FIAT/IFTA 2018 World Conference in Venice - "The Archive’s Renaissance: Navigating the Future, Channelling the Past")

jose_velazquez_on_ai-transforming_bbc_media_archives__the_blue_room_project_.2018._audio_64k.mp4

[Jose Velazquez shares insights about the project]

Abstract

The BBC, which manages a vast archive, used IBM Watson Media Video Enrichment to enhance metadata and enable intuitive content retrieval. The project applied AI, particularly advanced Natural Language Understanding (NLU), to analyse and interpret content, so that users could ask questions in plain English and retrieve relevant information, transforming how the archive is accessed and used.

Introduction

In the face of exponentially growing media collections, effective tools for search, discovery, and accessibility are essential. The BBC's innovation hub, "The Blue Room," implemented IBM Watson Media Video Enrichment, focusing on NLU, to unlock the potential of its video archive. This initiative aimed to move beyond traditional metadata tagging and enable dynamic, human-like interaction with archived content.

Objectives

Enhance the searchability and accessibility of the BBC's video archive through advanced NLU.
Enable users to retrieve relevant information by asking questions in plain English.
Automate the enrichment of metadata to improve content retrieval.
Demonstrate the application of AI in managing and leveraging large media archives.
Improve content discovery for researchers and journalists.

Methodology

The project employed IBM Watson Media Video Enrichment, which includes:

Content Ingest: Assets were ingested into the Watson Video Enrichment platform.
Automated Metadata Extraction: Watson analysed video and audio content to detect scenes, keywords, objects, and emotions.
Natural Language Understanding (NLU): NLU extracted concepts, entities, and themes from the content and interpreted plain English queries, so the system could understand the context of a request and retrieve the most relevant sections of the archive.
Tone and Sentiment Analysis: Watson identified dominant emotions within the video.
Speech to Text: Audio was transcribed into text for analysis.
Visual Recognition: Objects, faces, and scenes were automatically recognised and tagged.
API Integration: The Watson Video Enrichment API was integrated to create searchable JSON metadata objects.
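To make the final step more concrete, the following is a minimal sketch of what a searchable JSON metadata object could look like once the outputs of the steps above are combined. It does not call the Watson Video Enrichment API; the schema, the field names (`asset_id`, `segments`, `keywords`, `entities`, `emotion`), and the helper functions are all assumptions made for illustration, and the sample values are invented.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Segment:
    """One enriched slice of a video. All field names are hypothetical."""
    start: float            # seconds from the start of the asset
    end: float
    transcript: str         # stands in for speech-to-text output
    keywords: list = field(default_factory=list)
    entities: list = field(default_factory=list)
    emotion: str = "neutral"


def build_metadata(asset_id, segments):
    """Bundle enriched segments into one JSON-serialisable object."""
    return {"asset_id": asset_id, "segments": [asdict(s) for s in segments]}


def search(metadata, term):
    """Return segments whose transcript, keywords, or entities mention term."""
    needle = term.lower()
    hits = []
    for seg in metadata["segments"]:
        haystack = [seg["transcript"]] + seg["keywords"] + seg["entities"]
        if any(needle in text.lower() for text in haystack):
            hits.append(seg)
    return hits


# Invented sample data, not taken from the BBC archive.
metadata = build_metadata(
    "newsnight-sample",
    [
        Segment(0.0, 42.5, "Tonight we look at the housing market.",
                keywords=["housing", "economy"], entities=["London"]),
        Segment(42.5, 90.0, "The minister defended the new policy.",
                keywords=["policy"], entities=["Westminster"], emotion="anger"),
    ],
)

print(json.dumps(metadata, indent=2))
print([(s["start"], s["end"]) for s in search(metadata, "housing")])
```

Keeping time codes on every segment is the design choice that matters here: a search hit can then point to a specific section of a programme rather than to the whole asset.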

Results

Enhanced metadata for archived video content, enabling more precise and intuitive searches.
Ability to retrieve relevant content by asking questions in plain English, greatly improving user experience.
Automated creation of playlists based on complex, natural language queries.
Increased efficiency in content discovery for researchers and journalists.
Demonstrated the potential of AI, especially NLU, for dynamic archive management.
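As an illustration of how a plain English question could be turned into a playlist, the sketch below ranks enriched segments by simple keyword overlap with the question. This is a deliberately simplified stand-in for NLU, not the method used by Watson or by the project; the stopword list, the segment fields, and the function names are hypothetical, and the sample segments are invented.

```python
import re

# A tiny, hand-picked stopword list; a real system would use something richer.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "about", "what", "who",
             "did", "say", "show", "me", "clips", "is", "was", "and", "to"}


def query_terms(question):
    """Lowercase the question and keep the words that are not stopwords."""
    return {w for w in re.findall(r"[a-z0-9']+", question.lower())
            if w not in STOPWORDS}


def build_playlist(question, segments, limit=5):
    """Rank segments by how many query terms appear among their tags."""
    terms = query_terms(question)
    scored = []
    for seg in segments:
        tags = {t.lower() for t in seg["keywords"] + seg["entities"]}
        score = len(terms & tags)
        if score > 0:
            scored.append((score, seg))
    # Highest score first; ties fall back to (asset, start time) order.
    scored.sort(key=lambda p: (-p[0], p[1]["asset_id"], p[1]["start"]))
    return [(s["asset_id"], s["start"], s["end"]) for _, s in scored[:limit]]


# Invented sample segments, not taken from the BBC archive.
segments = [
    {"asset_id": "prog-a", "start": 0.0, "end": 30.0,
     "keywords": ["housing", "economy"], "entities": ["London"]},
    {"asset_id": "prog-a", "start": 30.0, "end": 75.0,
     "keywords": ["policy"], "entities": ["Westminster"]},
    {"asset_id": "prog-b", "start": 10.0, "end": 50.0,
     "keywords": ["housing"], "entities": ["Manchester"]},
]

playlist = build_playlist("Show me clips about housing in London", segments)
print(playlist)
```

Each playlist entry is an (asset, start, end) triple, so a player could cue the matching sections in order without a researcher having to scrub through full programmes.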

Watson.-.BBC.Archive.demo.Newsnight.sample.mp4

[Demo walkthrough. Newsnight Sample Video (c) BBC Archive]

Discussion

The implementation of Watson Video Enrichment, with a focus on NLU, addressed the challenge of managing a massive archive by automating metadata enrichment and enabling intuitive queries. This significantly reduced the time and cost associated with content retrieval. By allowing users to interact with the archive in a natural, conversational manner, the system transformed static archives into dynamic knowledge resources.

Future Work

Exploration of further applications of NLU in archive management.
Integration of Watson Video Enrichment with other BBC systems.
Continued improvement of natural language processing capabilities.
Expanding the use of the system to other archived materials.

Contribution

This project demonstrates the transformative potential of NLU in media archive management. It showcases how AI can enable intuitive, human-like interaction with archived content, moving beyond traditional search methods. By presenting the BBC's use case, it provides a valuable example for institutions seeking to enhance their archive accessibility and usability.
