I was discussing on a private mailing list whether or not this number is
for real. Someone complained it wasn't. Here's my response:

————-

Your utter certainty regarding the flawed methodology of their results
is somewhat ironic given that you've done no apparent research into how
they came to those results, nor offered any suggestion of what a more
accurate number might be.

A real critique would go to the source of their data and show why it's
bad. I'll help you with that. I found this year's report by simply
updating the year in the link from last year:

http://www.ifpi.org/content/section_resources/dmr2010.html

The actual quote from there is:

> Estimates on the impact of internet piracy vary but are consistently huge in scale. IFPI, collating separate studies in 16 countries over a four-year period, estimated unauthorised file-sharing at over 40 billion files in 2008. This means that globally around 95 per cent of music tracks are downloaded without payment to the artist or the music company that produced them.

When read in the actual context, they seem to admit it's a hard thing,
and acknowledge they're estimating. Furthermore, contrary to your claim
that it was pulled out of thin air, they kindly cite a variety of other
sources. I haven't dug into them all, but one is:

http://www.ipoque.com/resources/internet-studies/internet-study-2008_2009

The paper is 14 pages long with lots of interesting data, as well as a
complete description of its methodology. A weakness of their data is
they didn't cover North America or Western Europe as a whole (maybe due
to data privacy laws?). But focusing on Germany alone (which seems
middle of the road in terms of its data compared to other regions), it
says 53% of traffic was P2P downloads (separately from VoIP), to 26% web.

So right off the bat, P2P downloads account for double the traffic of
*all of HTTP* across their not insignificant sample set.

It doesn't seem unreasonable to assume that, say, 90% of that P2P
traffic was pirate. (It's what it was designed to do, after all.) So
let's say 47% of all traffic was pirate content downloads.

As for legitimate content downloads, we could probably back into that by
estimating bandwidth consumed by iTunes using public sales numbers and
average file sizes. But let's say iTunes accounts for 10% of *all HTTP*
(which seems astronomically high). That would mean 2.6% of the internet
is legitimate content downloads (separate from streaming).

47 + 2.6 = 49.6% of the internet devoted to content downloads
47/49.6 = 94.76% of content downloads pirated.

Huh, I didn't even plan that out.
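For anyone who wants to poke at the assumptions, here's that back-of-envelope math as a small Python sketch. The 90% and 10% figures are just my guesses from above, not measured values, and the function name is made up for illustration. It doesn't round 47.7 down to 47 like I did, so it lands a hair higher.

```python
def pirated_share(p2p_pct, web_pct, p2p_pirate_frac, web_legit_frac):
    """Return (pirate % of traffic, legit % of traffic, pirated share of downloads)."""
    pirate = p2p_pct * p2p_pirate_frac   # P2P traffic assumed to be pirate downloads
    legit = web_pct * web_legit_frac     # HTTP traffic assumed to be paid downloads
    return pirate, legit, pirate / (pirate + legit)


# 53% P2P and 26% web come from the ipoque numbers for Germany;
# 0.90 and 0.10 are guesses, so try other values.
pirate, legit, share = pirated_share(53.0, 26.0, 0.90, 0.10)
print(f"pirate: {pirate:.1f}% of traffic, legit: {legit:.1f}% of traffic")
print(f"pirated share of content downloads: {share:.1%}")
```

Dropping the pirate fraction of P2P to 50% still leaves the pirated share above 90% under these inputs, which is the real point.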

Anyway, there are obviously problems with all that data, how it's
collected, how it's analyzed, etc. But to categorically assert that the
data and results are flawed and everyone involved in the process is
either actively lying or allowing themselves to be misled, without
providing any evidence or analysis to the contrary… let's just say
your own methodology could use a bit more scientific rigor.

-david

Really interesting NY Times chart here showing the relative value of all units sold across different media (CD, vinyl, digital, etc) since 1978.

Dollars by media

Does anyone know a similar breakdown by number of units sold (independent of value)?  It’d be interesting to see a breakdown of units (or even by tracks) to compare the total volume of legitimate songs being purchased over time.  I wonder if it’s constant, or gradually increasing, or quickly increasing with the advent of the internet?

For example, if we could see that the number of songs both being produced and consumed is steadily increasing — while simultaneously revenue is decreasing — it seems hard to argue that revenue is correlated with increased song production/consumption.

(At best it could be argued that revenue leads to higher quality songs, but that’s a tricky one given that “quality” is entirely subjective.)

And if revenue bears no correlation with song production/consumption, then policies that protect and maximize revenue (eg, Copyright) become rather difficult to defend given the stated purpose of music Copyright to — essentially — increase song production/consumption.

-david

I was having the increasingly tired conversation about whether the music industry stands a chance (it doesn’t), and wrote up this response on why the problems that do exist in pirate tools are pretty insignificant compared to the alternatives:

“All those problems [incomplete catalogue, fake files, bad tagging, variable sound quality, porn ads, ISP throttling, the risk of being sued, guilt, etc.] are real, no doubt.  But let’s not forget: despite those problems, pirates still own 95% of the music download market.  Don’t you think if pirates really cared to deal with any of those problems (better than they already do), they could?

Indeed, don’t you think those problems are pretty trivial compared to the vastly more substantial problems (that you fail to list) of dealing with the cartels?  Namely: crippling fees for companies, unreasonably high fees for users, and the occasional licensing turf war that wipes out all users outside of the United States?

Thanks to technology: music is now bits, and bits are now infinite.  These are revolutionary facts.  Why is it so hard to accept that this revolution is like any other: those in power suffer while a new power emerges.  This is such obvious stuff, I don’t know how we’re still debating it so many years *after* the revolution ended.

I mean, for a moment imagine the pirates are the good guys in this battle.  Imagine you’re one of them, and you’ve already secured 95% of your terrain (eg, the global music download market), and are only dealing with the occasional isolated incident from rogue terrorist outfits (eg, RIAA vs Jammie Thomas-Rasset) or settling regional disputes (three strikes laws, lawsuits against TPB).  Sure the pirates could improve their interfaces.  And surely they are improving.  But where’s the rush?  Wouldn’t you think the war had been won long ago?

If anything, I bet they’re more interested in preparing an offensive push into new terrain: the global music streaming market.  And if stupid things keep happening, like Pandora needing to shut down its international userbase (creating a global demand for something that there is no legitimate way to buy), then they’ll have no harder time winning and holding that terrain than they have music downloads.

-david”

Somebody came up with a brilliant idea: let’s make a new audio file format! After all, people have tons of complaints about MP3s, right? Like… oh, wait, actually there are very few complaints. Undaunted, and with a $2.5M war chest ($2.5M to create a file format!?), MXP4’s advanced technology is poised to “revolutionize the music experience”… uh, what? That full quote:

So what makes MXP4 so advanced? The file format, beta-released in September, contains multiple tracks, allows users to mix the music, and incorporates video. On the mixing side, different track elements can be suppressed and recombined, allowing remixes, karaoke versions, or others creative combinations. “This are clear signs that the music industry is beginning to see the potential for MXP4 to revolutionize the music experience for consumers by allowing them to play with the music, whilst opening up new promotional and revenue possibilities for artists and labels alike,” Serviant commented.

Riiight… I think this’ll fail. Not (just) because it brings insignificant value. But for a reason that sounds incredibly trivial but is actually really significant: MXP4 has too many letters.

All successful file formats have TLAs. It’s just how things work. Yes, technically you can have a four-letter file extension. Just nobody ever does it, so it looks really weird.

Even “jpeg” eventually dropped the “e” to become “jpg” — and “jpeg” sounded fine when you said it out loud. MXP4 sounds incredibly awkward spelled out, and doesn’t sound like anything when pronounced like a word. That means every website, tool, story, and mention of this abysmal product will be tainted with an awkward, unpronounceable tinge.

I think had they called it MP5, or even just MPX, they’d be in a far better position. But MXP4 is this weird bastard name — it’s not the clear successor to MP3 that MP4 connotes, nor is it even in the MP family (it’s in some new MXP family). But rather than being the first of a new family (MXP1), it’s spontaneously the fourth generation — in an obvious ploy to sound better than MP3.

It’s a name only a high-paid marketing team could come up with.

David Barrett
Follow me at http://twitter.com/quinthar

(This is in response to an email from “Anton” suggesting that we’re at the dawn of a new type of “dynamic recording.”)

I actually really agree with you, if not on the specifics, then on the potential for a genuinely new type of music originating on the internet that is structurally unlike anything before, and that is intrinsically incompatible with and stifled by copyright.

For example, remember that even the concept of “notes” was once an innovation.  Prior to that, music was a collection of sounds at various frequencies, without an awareness that certain frequencies just sound “better” (nor a mathematical understanding of why that’s the case).  When “notes” were invented/discovered — along with the technology to produce them reliably — music itself fundamentally changed.

Similarly, some might have thought notes were the end of the line, but then came along chords.  Again, it was a real discovery that could only be enabled through technology: you simply can’t do chords until you have the technical ability to generate multiple “notes” reliably and simultaneously.

(And if you haven’t yet invented notes, then chords are simply impossible.)

Then the pianoforte comes along — again, a technical innovation — that opens up an entirely new type of music that simply couldn’t be done prior.  I’m sure we could come up with a thousand examples (including the use of distortion as an instrument, which gave rise to heavy metal) of how technology not merely extended music, but genuinely changed it.

I think computers and the internet present another innovation in that sense.  Prior to the digital age, it simply wasn’t possible to — for example — mash up hundreds of videos or thousands of songs to make a new song.  But that’s now possible, and its core “building block” isn’t frequencies, notes, chords, or even instruments.  Its building block is whole songs/videos.  It’s an entirely new building block that couldn’t technically be considered before.  It’s an entirely new type of music — sampling taken to the extreme — enabled through an entirely new technology.

And next?  As Anton suggests, prior to the internet, making globally interactive music — whatever that might mean — simply wasn’t an option.  We can’t even imagine what the consequence of that will be, nor what new type of music that might enable after.

But what I *can* imagine is that all of it might be fundamentally incompatible with today’s notion of copyright.  Indeed, we might look back on the attempt to copyright individual songs as being as silly as trying to copyright individual notes or chords.

Indeed, maybe the reason all music seems to sound the same today is because we’re discovering there are certain classes of songs that actually *are* the same, and sound better, for reasons we don’t quite understand now but someday will.  This might be the same process early musicians grappled with when first discovering the core notes and chords that we now view as so fundamental to music.

Maybe far from witnessing the death of music as the industry would have us believe, we’re seeing the birth of a whole new generation?

After all, those prior building blocks were perceived as innovative, radical, or even threatening back in their day.  Why should our day be any different?

David Barrett

As I wrote about previously, Expensify is doing (what I believe to be) some pretty innovative Twitter marketing.  However, from the very start we realized there’s a delicate line between marketing and spam, so we set out some early rules to ensure we’re on the right side of the line:

1) Keep it personal.  Only send messages from real people, to real people.  Leave the faceless boxes on Google and maintain the social foundation of Twitter.

2) Keep it timely.  A huge benefit of Twitter is you can go straight to the people who are experiencing the problem at that exact moment.  Leave the huge backlog of past posters alone and stay focused on the present.

3) Keep it relevant.  The temptation is overwhelming to just blast this out to everybody.  But resist that temptation and focus on the people who are actually calling out for your thing.

That said, we were afraid then that others would cross the line, and it appears that’s happening with increasing frequency.  Alas.

Unfortunately, I’m not sure what Twitter could do to thwart it.  Perhaps the easiest way would be to just add a “Spam” button to the Twitter interface and then kick off users who get too many relative to their post volume.  In Expensify’s case, we get 4x more compliments than complaints (the above rules appear to work!), so I think we’d do just fine under such a scheme.
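To make the “too many relative to their post volume” idea concrete, here is a minimal Python sketch. It is entirely hypothetical: Twitter has no such rule that I know of, and the function name, the 5% threshold, and the 20-post minimum are numbers I made up for illustration.

```python
def should_suspend(spam_reports, posts, max_report_rate=0.05, min_posts=20):
    """Flag an account when spam reports are high relative to post volume.

    All thresholds are hypothetical; a real system would need tuning
    and protection against people abusing the Spam button.
    """
    if posts < min_posts:
        # Too little history to judge fairly.
        return False
    return spam_reports / posts > max_report_rate


print(should_suspend(spam_reports=1, posts=100))   # well-behaved account
print(should_suspend(spam_reports=30, posts=100))  # heavily reported account
```

Using a rate instead of a raw count is the important part: a busy account with a handful of complaints shouldn’t be treated like a spammer.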

But it’s still too early to predict how the Twittersphere will react.  What do you think?

David Barrett

Piracy raw data update

March 21, 2009

Here’s a big data dump of stats (followed by analysis), for those who care about this sort of thing, from a March 2009 ars technica article:

– 17M people stopped buying CDs in 2008

– 8M people started buying digital music in 2008
– There are now 36M digital music customers
– 1.5B songs were sold “digitally” (ie, online) in 2008  
– 33% “of all music tracks” purchased in the US were digital

– Pandora use doubled in 2008, to “18 percent of Internet users”
– “Social network music streaming” rose from 15 to 19 percent usage

A January 2009 ars technica article rounds out these stats with:

– “unit purchases” increased by 10.5% in 2008
– 428M albums (LPs + CDs + online) were sold in 2008, down 14%
– 65.6M online albums sold in 2008, up 32% over 2007
– 1.5B songs sold online in 2008, up 27% over 2007
– 1.88M vinyl sales in 2008, up 89% over 2007

So all that looks pretty rosy for the music industry, in absolute terms.  But how did it do relative to piracy?  According to this slightly more pessimistic January 2009 IFPI report:

– Digital music sales grew 25% in 2008 to $3.7B worldwide
– Digital music sales account for 20% of recorded music sales, up from 15% in 2007

– 40B songs were “illegally file-shared” in 2008
– 72% of UK music consumers would stop pirating if told to do so by their ISP
– 74% of French consumers agree internet disconnection is preferable to fines

A linked “key facts” PDF has a boatload of additional statistics, including:

– 16% of European internet users “regularly swapped infringing music” in 2008
– 13.7M films were distributed via P2P in France in May 2008, compared to 12.2M cinema tickets
– “free music” was given as the primary reason for piracy
– P2P file sharing accounts for up to 80% of traffic on ISP networks

So pirated downloads still utterly dominate legit downloads, to the tune of 26:1.  If anything, it seems like piracy is accelerating, even faster than legal download services.
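For the record, the 26:1 comes from taking the 40B and 1.5B figures above at face value. A quick Python sketch (the variable names are mine):

```python
pirated_songs_2008 = 40e9  # IFPI estimate of "illegally file-shared" songs
paid_songs_2008 = 1.5e9    # songs sold online in 2008, per the ars technica stats

ratio = pirated_songs_2008 / paid_songs_2008
pirated_fraction = pirated_songs_2008 / (pirated_songs_2008 + paid_songs_2008)

print(f"{ratio:.1f} pirated downloads per paid download")
print(f"{pirated_fraction:.1%} of all downloads pirated")
```

That second number is a useful cross-check, since it lands in the same neighborhood as the 95% figure the IFPI quotes.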

What about legit streaming?  In July 2008 I estimated that MySpace users legally streamed about 110M songs per day.  Turns out I was off by a lot: they streamed 1B songs after “only a few days”, and this September 2008 TechCrunch article tosses out 20B streams initiated *per day*.  That’s an amazing number.

But it’s also an incredibly vague number, as stream initiation isn’t nearly as interesting as stream completion.  For example, the average user spends under 10 minutes on the site per visit, meaning there’s barely time for two full-length songs.  I’m having a surprisingly hard time finding recent data, but this 2007 article shows MySpace had like 29M daily visitors, so even doubling that for 60M daily visitors today suggests at most time for 120M full-length songs per day — roughly 43B per year — and this ignores the large subset of international users (who can’t get newly-released music).
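That estimate is easy to redo with different guesses. A rough Python sketch, where the 60M daily visitors and two songs per visit are my assumptions from above rather than measured data:

```python
daily_visitors = 60e6  # assumed: roughly double the ~29M reported in 2007
songs_per_visit = 2    # under 10 minutes per visit leaves time for about two songs

songs_per_day = daily_visitors * songs_per_visit
songs_per_year = songs_per_day * 365

print(f"{songs_per_day / 1e6:.0f}M full-length songs per day, at most")
print(f"{songs_per_year / 1e9:.1f}B full-length songs per year, at most")
```

This is an upper bound, since it assumes every visitor spends the whole visit listening to music.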

Similarly, YouTube had 5B views in July 2008, and 6B views in December 2008, so let’s just assume something like 66B total videos in 2008.  As for what fraction of those equate to “songs” I have no idea; I’d say this is more about “intent” than anything (ie, people who play the video in the background like a radio, rather than watching it like a music video), and I have no data at all on that.  But I wager it’s not the common case, so let’s say 25% of YouTube videos are actually just played as songs — and even that seems high.  (Also, this assumes all YouTube music is licensed, when in fact the opposite is probably more often true.  Details, details…)

Adding to MySpace’s 43B and YouTube’s 16.5B would be all of Pandora’s streams, which should be considerable given the claim that 18% of all Internet users use it, but I can’t find any data on it.  One reason for that is probably because Pandora actually has nowhere near that userbase: this Dec 19, 2008 TechCrunch article reports they only just hit 20M users, while in that same month the internet was estimated to comprise 248M North-American users (1.4B global).  This puts Pandora’s penetration at a much more conservative 8% of North-American users (assuming 100% are North American), or 1% global.  Still significant, but 20M *total* users is nowhere near MySpace’s 100M *active* users.
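The penetration numbers are simple division, sketched here in Python with the figures quoted above (variable names are mine, and treating every Pandora user as North American is the same simplification I made in the text):

```python
pandora_users = 20e6          # total users reported in December 2008
north_american_users = 248e6  # estimated North American internet users
global_users = 1.4e9          # estimated global internet users

na_penetration = pandora_users / north_american_users
global_penetration = pandora_users / global_users

print(f"North America: {na_penetration:.1%}")
print(f"Global: {global_penetration:.1%}")
```

Either way it is well short of the claimed 18%.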

So for the sake of argument, let’s say there are about 60B legit streams, against 40B pirated downloads — meaning piracy utterly dominates in the download market, whereas legit streaming utterly dominates in the streaming market.  Indeed, there is essentially no such thing as a meaningful “legitimate” download market, or a meaningful “pirate” streaming market.
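Here is how the roughly 60B adds up, as a Python sketch. Every input is one of the guesses above (the 25% “played as a song” fraction especially), so treat the output as an order-of-magnitude figure, not a measurement. Pandora is left out because I have no stream counts for it.

```python
# MySpace: at most 120M full-length songs per day (estimate from above)
myspace_streams = 120e6 * 365

# YouTube: average the July (5B) and December (6B) monthly view counts
youtube_views_2008 = (5e9 + 6e9) / 2 * 12
song_fraction = 0.25  # guess: share of views that are really "listens"
youtube_song_streams = youtube_views_2008 * song_fraction

legit_streams = myspace_streams + youtube_song_streams
pirated_downloads = 40e9  # IFPI estimate

print(f"YouTube views in 2008: {youtube_views_2008 / 1e9:.0f}B")
print(f"Legit streams: {legit_streams / 1e9:.1f}B")
print(f"Pirated downloads: {pirated_downloads / 1e9:.0f}B")
```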

As for which accounts for more total “listens” and thus ultimately controls more users’ ears, that’s an open question: on the one hand, streamed songs are only heard at most once, whereas downloaded songs can be listened to multiple times.  But streamed songs are probably more likely to be heard at all, with a lot of pirated songs probably just going into vast personal libraries having never been played.

Who’s winning?  Who knows, and as piracy goes dark, it’s harder and harder to tell.  Personally, I’d still put my money on piracy having a strong lead on users’ ears, both right now and for the foreseeable future.  If the average pirated song is listened to just 1.5 times (which seems reasonable), then piracy is still winning.
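The 1.5 figure is really a break-even point, which a short Python sketch makes clearer. It assumes my rough totals of 60B legit streams and 40B pirated downloads, and that each stream counts as exactly one listen; the function names are mine.

```python
def breakeven_plays(legit_streams, pirated_downloads):
    """Plays per pirated download at which total pirate listens equal legit streams."""
    return legit_streams / pirated_downloads


def pirate_listens(pirated_downloads, plays_per_download):
    """Total listens from pirated downloads at a given average play count."""
    return pirated_downloads * plays_per_download


legit = 60e9    # rough legit stream total from above
pirated = 40e9  # IFPI pirated download estimate

print(f"break-even: {breakeven_plays(legit, pirated):.2f} plays per download")
for plays in (1.0, 1.5, 2.0, 5.0):
    total = pirate_listens(pirated, plays)
    print(f"{plays:.1f} plays/download -> {total / 1e9:.0f}B pirate listens")
```

At exactly 1.5 plays per download the two sides tie under these numbers, so piracy only pulls ahead if the real average is above that or if my stream total is too generous (which it probably is, since it’s an upper bound).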

So in conclusion, it seems to me that the battle for downloads is utterly and irretrievably lost to piracy, but the battle for pirate streaming is only just beginning.

As it stands, streaming is overwhelmingly in favor of legitimate content owners.  But I really wonder how long that will last. 

After all, the list of streaming P2P applications is long and always growing (now over encrypted onionskin darknets).  Basically, P2P streaming is a hard problem, but it’s also largely a solved problem.  So if there’s no technical reason why pirates don’t stream, maybe they don’t simply because they don’t want to? 

The most obvious reason why this might be true is because people turn to piracy primarily to avoid paying.  (Please excuse the alliteration.)  So long as MySpace and YouTube continue to give it out for free, there’s little incentive to build a pirate streaming site.  But the real test will come if something in that calculation changes, by one or more of the major parties.

For example, let’s say MySpace decides they don’t like paying to stream content from central servers, and then paying again for licensing fees.  Maybe they find their ad revenue sagging and decide to integrate a streaming P2P plugin (I’m betting on Littleshoot for now) to offer the same exact experience as today — but by tapping into the pirate networks.  So no bandwidth costs, no licensing fees.

Alternatively, let’s say the powers that be do something incredibly stupid like pulling their music from MySpace, or jacking up the price such that MySpace is forced to charge for it.  At this point there’s an opening for someone like The Pirate Bay to offer a first-class pirate station, and then it’s game on.

Either party would use an argument like “we don’t host any data, we just enable user sharing.  Any illegal behavior they do is their business and we don’t encourage it (we merely profit from it).” 

And unlike the small P2P outfits who have tried this in the past, the next wave of defendants will have substantial legal resources and astonishing revenue incentive.  And unlike the tiny, outgunned P2P outfits of yore, MySpace’s or The Pirate Bay’s victory won’t be quite so Pyrrhic.

Anyway, just wanted to do a quick review of the available data and update my predictions.  Can anyone provide more recent or accurate data to correct the above analysis, or see holes in the logic?  I’m as eager as anyone to get a firm grasp on reality; let me know if you think my grip is slipping.

Fun times, I can’t wait to see where this goes.  Thankfully, it’s going there really fast, so there’s little time to wait.

David Barrett