Metadata – Rhizome

Let’s democratise content discovery, starting with a different metadata model

The so-called streaming wars came and went, and did not leave us (the users) with better tools for content discovery. Over the last few years, the movie and series offering grew enormously and became very fragmented at the same time. New streaming services tried to carve out a piece of the market by differentiating themselves with unique and exclusive content. Even though they strive to be different through this content offering, it’s clear that most streaming services aren’t unique in the way they present their content to the user. Options for navigating content libraries are very homogeneous. Every service offers a combination of landing page banners and rows, navigational categories, search and AI-powered personalisation. Apart from all the (much talked about) advances in AI, there seems to be no innovation in other methods of content discovery.

For me, the personalisation algorithm never provided any value. It did not help me find new and unexpected content. I’m sure many people will recognise this in some way, since research has shown that users are getting more frustrated trying to find something to watch. The business model of most streaming services focuses on unique content to acquire users and, after that, on personalising the sh*t out of them to prevent them from leaving. I doubt that increased search frustration helps to prevent churn. So maybe we should look beyond personalisation for an answer. Giving users more freedom to navigate content, as an additional feature next to personalisation, can reduce search frustration by making content easier to find. It can provide value for the user and, as a consequence, also for the streaming service by preventing churn.

Increasing freedom means providing informed access to content: on the one hand by adding more options for filtering and searching the content library, and on the other by explaining the logic behind those options (informed as opposed to a black box).

To achieve this, it is crucial to establish a high-quality metadata model. Metadata expert John Horodyski addresses this necessity in his book Metadata Matters. Metadata ‘matters because it is both identification and discovery; it’s about access.’ Creating more freedom for the user to find the right content on a streaming service is all about navigational access, and at the moment that quality of access is lacking. Horodyski argues that a shift in thinking is needed. ‘Content is no longer “queen” … there are many in the realm, not the least of which is the user and the user experience. If you have great content and your users cannot find it, the value of the content is diminished or lost altogether.’

Of course, streaming services provide some metadata for navigating, but I want to argue that the value of that metadata is so low that it does the opposite of creating freedom. It doesn’t help in finding content. It diminishes the value of the content and traps us within a limited view of how movies can be categorised. I will explain my idea to remedy this by zooming in on a specific category that is ubiquitous in everyday movie discussions and included in the navigational menus of all streaming services in one way or another: genres.

Genre is a French word with origins in Latin, meaning ‘kind’ or ‘sort’. In other words, a movie genre represents a specific kind of movie. The term suggests that there’s a generally agreed-upon definition of the shared elements that shape a genre. There isn’t, and as such the current application of genres is inconsistent at best. Giving access to inconsistent data will not benefit identification or discovery, since one of the worst things that can happen to a data model is poor-quality data. Anything built on that data, and any conclusion drawn from it, will suffer from that lack of quality: ‘garbage in, garbage out’.

This data value problem can only be solved by approaching movie classification from a different perspective. We should stop treating genre as something universal that can be used to tag movies. Instead, we should look at the ‘hidden’ data that describes movies: a bottom-up strategy that doesn’t work towards a specific output variable. This way, it’s possible to deconstruct movies and create a structured data model with clearly defined components and relations between these components: an ontology of movie metadata.
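To make the idea of clearly defined components and relations a little more concrete, here is a minimal sketch in Python. The component names (`Setting`, `themes`) and all values are assumptions chosen for illustration; they are not the final model, which still has to be worked out. Note that the record deliberately has no genre field.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Setting:
    """Where and when (part of) a story takes place. Hypothetical component."""
    region: str
    year_from: int
    year_to: int


@dataclass
class Movie:
    """A movie described by observable components, without a genre tag."""
    title: str
    settings: list[Setting] = field(default_factory=list)  # relation: movie -> settings
    themes: set[str] = field(default_factory=set)          # relation: movie -> themes


# Invented record, for illustration only.
movie = Movie(
    title="Example Movie",
    settings=[Setting(region="north-west of the Americas", year_from=1865, year_to=1870)],
    themes={"frontier", "revenge"},
)
```

Which components belong in the model, and how fine-grained they should be, is exactly what the iterative process has to determine.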

It isn’t possible to define this in one go. It will be a continuous process of development, in which defining the model and using the model to create data run parallel to each other. If the first version of the logic is consistently applied to a sufficient number of movies, the resulting data can provide new insights to further refine the model, and so on.

The above doesn’t mean that genres no longer have value. If used consistently, they can help in positioning movies. As such, they still have cultural value. The new model can be used as a baseline for capturing that cultural value while also creating transparency about the underlying logic. That means that at some point the data generated by applying the model must be analysed, and from that analysis specific clusters of data may become apparent. These clusters can be positioned as genres: a specific kind of movie based on metadata. A simplified example is a cluster of movies that take place in the north-west of the Americas during the 1849 – 1890 period, which can be subjectively defined as westerns.
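The simplified western example can be sketched as a named rule over metadata that carries its own explanation, so the logic can be shown to the user instead of hidden in a black box. This is only an illustration under assumptions: `GenreRule`, `overlaps`, the flat `Movie` record and the catalogue entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Movie:
    title: str
    region: str
    year_from: int
    year_to: int


@dataclass(frozen=True)
class GenreRule:
    name: str
    explanation: str                  # human-readable logic, shown to the user
    matches: Callable[[Movie], bool]  # the same logic, applied to metadata


def overlaps(movie: Movie, start: int, end: int) -> bool:
    """True if the movie's setting period overlaps the range start..end."""
    return movie.year_from <= end and movie.year_to >= start


western = GenreRule(
    name="western",
    explanation="Set in the north-west of the Americas between 1849 and 1890",
    matches=lambda m: m.region == "north-west of the Americas"
    and overlaps(m, 1849, 1890),
)

# Invented catalogue entries, for illustration only.
catalogue = [
    Movie("Example A", "north-west of the Americas", 1866, 1868),
    Movie("Example B", "north-west of the Americas", 1995, 1995),
    Movie("Example C", "northern Europe", 1870, 1871),
]

westerns = [m.title for m in catalogue if western.matches(m)]
print(westerns)  # ['Example A']
```

The genre label lives outside the movie records. It is derived from the metadata, not stored as a tag.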

The meaning attributed to the clusters isn’t fixed. One of the reasons that genres are hard to define is that the categorisation of movies is affected by societal changes and that genres evolve over time. That subjectivity still applies. The meaning attributed to the data in the model can be adapted over time to keep up with the zeitgeist. The underlying data, however, stays the same, although the model can be enriched with additional components.
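A small sketch of that separation, assuming a hypothetical set of versioned definitions for one label: the movie records never change, only the interpretation applied to them. The second period is an invented reinterpretation, not a claim about how westerns should be defined.

```python
# Underlying metadata: invented records that stay the same over time.
movies = [
    {"title": "Example A", "region": "north-west of the Americas", "year": 1866},
    {"title": "Example B", "region": "north-west of the Americas", "year": 1905},
]

# Subjective, versioned meanings attributed to the same data.
western_periods = {
    "v1": (1849, 1890),
    "v2": (1849, 1910),  # hypothetical later reinterpretation
}


def label_westerns(records, period):
    """Return titles whose setting falls inside the given period."""
    start, end = period
    return [
        r["title"]
        for r in records
        if r["region"] == "north-west of the Americas" and start <= r["year"] <= end
    ]


print(label_westerns(movies, western_periods["v1"]))  # ['Example A']
print(label_westerns(movies, western_periods["v2"]))  # ['Example A', 'Example B']
```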

This is a model that doesn’t attempt to capture an objective truth. Even if a universal truth in art can be found, it’s not necessary to know it to build a data model. It doesn’t matter if grass is classified as green or as red, as long as the classification is applied consistently. In other words, this article is the first leg of a journey to formulate a consistent subjective method. The next step is to describe how to deconstruct movies to generate consistent metadata.

WHAT IS THE DATA VALUE OF A MOVIE GENRE?
