Internationale Kurzfilmtage Oberhausen

YouTube Smash Up
by
Parag K. Mital

YouTube Smash Up:
Exploring Copyright Through Generative Art

 

 

YouTube Smash Up is a generative virus that attempts to reproduce the #1 video on YouTube each week using other Top 10 content. It ran for 5 weeks in 2012, faced numerous copyright infringement claims, and was shut down. The algorithm essentially learns to generate audiovisual collages whose contents come from fragments of the #2 - 10 videos. A number of ideas were explored as part of the process: the internal representations of a generative algorithm, whether popular cultural patterns could be encoded with an A.I., and how copyright applies to generative digital media.

 

The process of learning from the #2 - 10 videos worked by attempting to discover cultural fragments. These fragments are blurbs of sound or prototypes of an object, existing at a level of representation not as advanced as objects yet not as simple as pixels. Further, the process is entirely automated. The result: Miley Cyrus's lips collaged against the background of a troupe of dancing animals, or Psy's forehead dancing without the remaining pieces of Psy.
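As a rough illustration only: the sketch below (in Python, using numpy) rebuilds each frame of a target video from a pool of fragments cut out of other videos, matching patches by nearest neighbour in raw pixel space. The patch size, features, and matching rule are placeholders chosen for the example; the actual Smash Up algorithm learns its own audiovisual fragment representations and operates on sound as well as image.

    import numpy as np

    PATCH = 16  # fragment size in pixels, chosen arbitrarily for this sketch

    def extract_patches(frame):
        # Cut a frame (H x W x 3 array) into non-overlapping PATCH x PATCH fragments.
        h, w, _ = frame.shape
        patches, coords = [], []
        for y in range(0, h - PATCH + 1, PATCH):
            for x in range(0, w - PATCH + 1, PATCH):
                patches.append(frame[y:y + PATCH, x:x + PATCH].astype(np.float32))
                coords.append((y, x))
        return patches, coords

    def collage(target_frame, source_frames):
        # Rebuild the target frame using only fragments taken from the source frames.
        pool = []
        for frame in source_frames:
            pool.extend(extract_patches(frame)[0])
        pool = np.stack(pool)                    # (N, PATCH, PATCH, 3)
        pool_flat = pool.reshape(len(pool), -1)

        out = np.zeros_like(target_frame)
        targets, coords = extract_patches(target_frame)
        for patch, (y, x) in zip(targets, coords):
            # Nearest neighbour in raw pixel space stands in for the learned
            # "cultural fragment" representation described above.
            d = np.linalg.norm(pool_flat - patch.reshape(1, -1), axis=1)
            best = pool[np.argmin(d)]
            out[y:y + PATCH, x:x + PATCH] = best.astype(out.dtype)
        return out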

 

Using YouTube's interface, the video is also textually tagged using text from the Top 10 YouTube videos. Labelling the video this way essentially injects it into the populace alongside innocuous tribute videos, masking its true nature as an experimental abstract film: "Miley Cyrus - Wrecking Ball (YouTube Smash Up)". The smash ups are, however, mostly viewed negatively by the YouTube audience, with comments such as "now im [sic] blind" (https://www.youtube.com/watch?v=Cw5Fo03a6lc), "Will someone kill me in my sleep because I watched this video?", and another commenter's reply to the previous comment, "me 2 [sic]" (https://www.youtube.com/watch?v=YGnX2MNKEFw).

 

Despite the negative reception online, the work has been invited for exhibition internationally numerous times. However, situating the work within a gallery or screening changes it significantly: it is no longer mistakenly viewed by unsuspecting YouTubers who were likely searching for Miley Cyrus rather than experimental abstract film.

 

The process lasted 5 weeks starting in late September of 2012, and the idea was to continue until one of the synthesized videos ended up in the Top 10, thus resynthesizing itself. However, after 5 weeks of producing smash ups, all of the videos were the subject of numerous copyright infringement notices. Further, I was locked out of my YouTube account and redirected to a page entitled "YouTube Copyright School" (http://www.youtube.com/copyright_school), with an animation of "Happy Tree Friends" explaining what copyright is and how to avoid risking copyright infringement.

 

Accompanying the animation was a quiz testing my knowledge of copyright, which I had to complete correctly in order to regain use of my account. After disputing all of the copyright claims, most infringement notices were withdrawn after a month and the videos were placed back online. However, to this day, some videos still receive additional infringement notices, as the publishers of certain videos may change or additional content holders may hold a stake in a video.

 

What was perhaps most surprising was that the allegedly infringed content was never the #2 - 10 videos, but instead the #1 video. That is surprising given that content from the #2 - 10 videos is literally copied into the resulting collage, while no content from the #1 video appears in the supposedly infringing smash up. If anyone should claim infringement, it should really be the content rights holders of the #2 - 10 videos.

 

It is not as if the content were at the level of simple pixels, which are the basis functions of every possible video and would make for a meaningless generative collage. But it is also not quite composed of objects, which might more plausibly qualify as copyright infringement or perhaps even slander. Instead, the content taps into a level in between the two and reveals an interesting level of representation that does not necessarily carry the original copyright along with it. It therefore seems that it is not the fragments of the content that are the source of infringement, but only the semblance of the content. It is the gist, in other words, that infringes on the #1 video's copyright. To the actors enforcing the copyright, it is as if the original video had been copied and run through a Photoshop mosaic filter.

 

The infringement notices were likely sent as part of an automated process offered by YouTube called ContentID. This system attempts to automatically discover copyrighted content amongst YouTube's billions of videos. It is fairly likely that the algorithms detecting the similarity of the Smash Up to the #1 video are quite similar to the ones used to create the video; they are both a sort of robot perception, after all, defined by computational networks that encode shape, texture, color, and other parameters within their intrinsic representations. Once enough content passes some threshold of similarity, likely defined in relation to all the content the system has access to, ContentID presumably informs the original content holders. It is also likely that the content rights holders never viewed the smash ups before sending notices demanding that I cease and desist. Luckily, they were assuaged when I convinced them the work was in the name of art and nothing more (e.g. http://pkmital.com/home/2012/11/09/an-open-letter-to-sony-atv-and-umpg/). Or, perhaps more likely, they finally viewed the smash ups and realized no one in their right mind would consider them a copy of their video.
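Since the matching process is proprietary, the following is only a toy illustration of the general idea described above: reduce each video to a coarse signature and flag a match when the signatures are closer than some threshold. The colour-histogram features and the threshold value here are arbitrary stand-ins, not ContentID's actual features.

    import numpy as np

    def fingerprint(frames, bins=32):
        # Reduce a sequence of frames to a coarse colour-histogram signature.
        # This stands in for whatever shape/texture/colour features a real
        # matching system might compute.
        hists = []
        for frame in frames:
            h, _ = np.histogram(frame, bins=bins, range=(0, 255), density=True)
            hists.append(h)
        return np.mean(hists, axis=0)

    def looks_like_a_match(query_frames, reference_frames, threshold=0.9):
        # Flag the query as "matching" the reference when the cosine similarity
        # of their signatures exceeds an arbitrary threshold.
        a, b = fingerprint(query_frames), fingerprint(reference_frames)
        cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
        return cosine >= threshold, cosine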

 

Of course all of this is mere speculation, since the algorithms are proprietary and not described anywhere.

 

Since 2012, the work has continued to live on YouTube, and the cease and desist notices have mostly stopped. Instead, the content rights holders of the #1 video are now profiting from my video: they are able to place ads beside it allowing viewers to purchase the music being resynthesized, despite any objections I might have.

 

More than a fun story, this artwork introduces some interesting questions for a society with ever-increasing amounts of A.I.-generated content. Media is continually shifting from framed and packaged assets to an onslaught of grams, snaps, tweets, posts, and a myriad of other potential blurbs of content. A recent BBC article suggests that networks of as many as 500,000 Twitter accounts are run by algorithms (http://www.bbc.com/news/technology-38724082). Add that to the 270 million fake or duplicate Facebook accounts, and it is clear that media is shifting to an increasingly generative landscape (http://www.telegraph.co.uk/technology/2017/11/02/facebook-admits-270m-users-fake-duplicate-accounts/).

 

In a society that continues to define content procedurally and infinitely, there may eventually be a need to better understand content's sources rather than simply its outcomes. Artificial Intelligence (A.I.), and Machine Learning in particular, requires massive troves of data in order to be successful and is built on the data of others. In fact, the algorithms that power speech recognition in Google's Voice or Amazon's Alexa, the image recognition software that turns your face into a cat, or the tech behind self-driving cars are all almost certainly built on copyrighted data. Yet we may never know or understand what that data is, how it was licensed, or even in what form it now exists as some "internal representation" of an A.I.

 

A.I. may already be partly responsible for the composition of a #1 YouTube music video. The process of producing a #1 video is far more opaque than a simple generative algorithm, however: it is filtered through people whose choices are increasingly driven by data. For instance, platforms such as Netflix or Amazon integrate rich amounts of data science into their production pipelines in order to decide what content to produce or even how a plot may evolve.

 

Of course, a Netflix production is a lot more involved than a YouTube Smash Up. But it raises the question: what happens when content creation is as simple as running a generative algorithm, like YouTube Smash Up? Aren't we already falling for the hundreds of millions of fake Facebook accounts, sharing their stories as if they were real people's thoughts? If other media types reach this level of realism, and they almost certainly eventually will, what role will copyright play? Will copyright then consider the #1 video to be infringing on the copyrights of the #2 - 10 videos? More likely, copyright's role seems to be shifting from owning exclusive rights to content, to allowing one to take advantage of the growing amount of content, procedural or not.

 

Should an A.I. be responsible for describing the data from which it forms its actions? Consider what happens in an environment that has transformed the way we ingest news: Facebook. Facebook's filter bubbles can create a feedback loop of self-reinforcing behavior. It is a toxic environment responsible for swaying political elections, as the news driving your feed can be bought and sold to the highest bidder. If we understood the data used to build the decisions of networks that dictate election outcomes, we might be a better-informed public. We might understand what forms the basis of the content we see, and whether it is based in something that already exists or is pure fiction. We might even find that we can bring some meaning back to society. But by the same token, we might also fall victim to copyright, which currently depends on an advertising model that will no doubt place a scheme for monetization atop our content.

 

I've considered restarting the YouTube Smash Up project numerous times. But each time, I am faced with the reality that I will no doubt receive additional copyright infringement claims and potentially face legal repercussions. More disturbing, though, is that the legal action will most likely be taken by the wrong content rights holders. And that is because most copyrights are not some holy grail of truth with clear lines; they are completely subjective. Even more absurd is the fact that fair use is an argument which you can only invoke once you have been sued. That means your use of content is not necessarily fair use until you have argued your case in a courtroom, and you cannot argue your case in a courtroom unless you have been sued.

 

There is a lot wrong with copyright, and it is only going to get harder to navigate unless the ways in which we understand data also change. One potential set of technologies that may offer some hope for generative artworks, and a path for copyrighted content, is decentralized data stores built on peer-to-peer, content-addressed protocols such as Dat (https://github.com/datproject/docs/tree/master/papers) and IPFS (https://ipfs.io). These models re-envision the internet as one giant torrent network. Rather than having infrastructure dictate what content you see, content is addressed and distributed by your peers: a giant peer-to-peer network where data can never disappear and, as a result of the chain of events created, can never lose the history of events which allowed it to come to your own computer.
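As a minimal sketch of the underlying idea: content addressing derives an identifier from the bytes of the work itself, and a simple append-only log can record which works a new piece was derived from. The log structure, field names, and authorship model below are placeholders of my own; they do not correspond to Dat's or IPFS's actual formats.

    import hashlib
    import json
    import time

    def content_address(data):
        # The address is derived from the content itself, so the same bytes
        # always resolve to the same identifier, wherever they are stored.
        return hashlib.sha256(data).hexdigest()

    class ProvenanceLog:
        # A toy append-only log: every entry can point back at the address of
        # the work it was derived from, preserving a history of events.
        def __init__(self):
            self.entries = []

        def append(self, data, author, derived_from=None):
            addr = content_address(data)
            self.entries.append({
                "address": addr,
                "author": author,
                "derived_from": derived_from,  # address of the source work, if any
                "timestamp": time.time(),
            })
            return addr

    # Usage: a smash up recorded as a derivative of a source video.
    log = ProvenanceLog()
    src = log.append(b"<bytes of the source video>", author="original rights holder")
    log.append(b"<bytes of the smash up>", author="artist", derived_from=src)
    print(json.dumps(log.entries, indent=2))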

 

Copyright has already gone through many evolutions, from Disney's protection of Mickey Mouse to recent extensions that ensure copyrighted content can persist even after the death of the person it aims to protect. Copyright has even evolved to protect a culture of mash up, the least likely of actors to benefit from copyright. For instance, Creative Commons licenses explicitly permit derivative works and, in some of their variants, encourage sharing. But when content is transformed, re-appropriated, mashed up, and re-sampled to the point of unrecognizability, how can we, let alone algorithms, understand what is copyrighted?

 

This is an important question to consider as algorithms become capable of ingesting more data and transforming it in ways we may never be able to understand. Algorithms will continue to gain greater expressivity and be used to create an ever wider range of generative works of art, perhaps even #1 video hits on YouTube. But these algorithms are almost certainly going to be built on the data of others. Will we require that an algorithm's internal representations not be built on copyrighted data, or not be capable of expressing copyrighted data? Or will we discover new ways of accessing and describing data which maintain authorship and, with it, some semblance of a historical record?