
PDF.js, Sausages, and Numeracy

"How fast is PDF.js?" purports to tell you how fast Mozilla's PDF renderer, PDF.js, is - and therefore presumably also why Opera will be choosing it. What it actually tells me is that someone thinks it's important to get an "Opera and Mozilla, open web together!" puff piece on a Mozilla development blog.

Let's say you're a software engineer from San Francisco visiting Berlin. You're hungry, and stop to grab lunch. Berlin's street specialty is currywurst, which they're offering with fries and a drink for €7. Is that a good deal? How can you tell?

Opera and Mozilla say, look at the other items on the menu. If you see a €10 kebab - well, that's more expensive, so €7 must be a good deal. Maybe do a quick conversion in your head - that's about $10, which is comparable to ever-gentrifying SoMa prices, and you were told to expect things to be a bit more expensive in Europe anyway.

How cheap is this restaurant? Well, 80% of its menu items cost less than what's in my wallet, which is cheaper than the other 20%.

Of course you don't do that. You check the prices at the place next door, and see that things should cost at least 30% less. You'd be a sucker to buy lunch at the first place!

You can't benchmark something against itself. The post describes largely unspecified PDFs (guessing a target of "about 4x as bad as this one PDF is OK" as if the measurement were linearizable to begin with), on an unspecified machine, with no points of comparison to the half dozen other PDF readers doing the same thing. Then it shows off a graph that doesn't tell me anything beyond the fact that the complexity of PDFs approximates a Zipfian distribution, which, no shit.[1]
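To make the self-comparison problem concrete, here is a minimal sketch in Python. The render times are made up (drawn from a heavy-tailed distribution as a stand-in for real PDFs, not taken from the post), and `share_within_own_percentile` is a hypothetical helper. It shows that "80% of documents are at or below our own 80th percentile" holds for any renderer, including one that is 50x slower.

```python
import math
import random


def share_within_own_percentile(times, pct=80):
    """Fraction of samples at or below the sample's own pct-th percentile."""
    ordered = sorted(times)
    # Nearest-rank percentile: the value at rank ceil(pct/100 * n).
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    cutoff = ordered[rank - 1]
    return sum(t <= cutoff for t in ordered) / len(ordered)


random.seed(0)
# Hypothetical render times in seconds, heavy-tailed (assumption, not real data).
fast = [random.paretovariate(1.5) * 0.1 for _ in range(1000)]
# The same distribution shape, but every document takes 50x longer.
slow = [t * 50 for t in fast]

fast_share = share_within_own_percentile(fast)
slow_share = share_within_own_percentile(slow)
print(f"fast renderer: {fast_share:.0%} of documents within its own 80th percentile")
print(f"slow renderer: {slow_share:.0%} of documents within its own 80th percentile")
```

Both shares come out at roughly 80%, because the statistic is defined relative to the data it describes. It says nothing about whether either renderer is fast.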

How fast is PDF.js? Well, 80% of the time, it's not terrible, which is faster than PDF.js the other 20% of the time.

This answer is great if you're trying to befuddle Opera users into not worrying about their new PDF reader, developers into not worrying about platform consolidation and ever-taller stacks of leaky abstractions, and investors into remembering Opera exists at all.

But how much of my phone's battery (née milliliters of dead dinosaur and cm³ of greenhouse gases) does it take to render it? How many weeks/months/years of median salary does it take to afford a computing system that lets me read it "fast enough"? How far can we lower those numbers? These are the kinds of questions you need to answer if you want to show you care about high-quality software for sustainable and broadly-available computing.[2] To answer them, you need to do actual benchmarks. In the absolute, and relative to other software, but never against yourself - after all, that's cheating.
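A rough sketch of what "in the absolute, and relative to other software" could look like, in Python. Everything named here is an assumption for illustration: `renderer_a` and `renderer_b` are stand-in functions rather than real PDF readers, the "documents" are just lists of numbers, and the harness only measures wall-clock time, which is a proxy at best for battery or hardware cost.

```python
import statistics
import time


def benchmark(renderers, documents, repeats=5):
    """Median wall-clock seconds for each renderer over the same documents."""
    results = {}
    for name, render in renderers.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            for doc in documents:
                render(doc)
            samples.append(time.perf_counter() - start)
        results[name] = statistics.median(samples)
    return results


def relative_to(results, baseline):
    """Express every result as a multiple of the chosen baseline's time."""
    return {name: secs / results[baseline] for name, secs in results.items()}


# Stand-in workloads (hypothetical); a real run would call actual PDF readers.
def renderer_a(doc):
    return sum(doc)


def renderer_b(doc):
    return sum(x for x in sorted(doc))


# One shared corpus, so every renderer does the same work.
documents = [list(range(n)) for n in (1000, 5000, 20000)]
results = benchmark({"a": renderer_a, "b": renderer_b}, documents)
ratios = relative_to(results, "a")

for name in results:
    print(f"{name}: {results[name]:.6f} s absolute, {ratios[name]:.2f}x relative to a")
```

The point of the design is that the corpus and the machine are fixed and stated, and each number is reported both as seconds and as a ratio to a different program, not to the same program's other runs.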


  1. I'm not saying PDF.js is too slow. It feels slower than other readers to me, but all that means is that someone should run a benchmark. A real benchmark. 

  2. Or in the rhetoric of Mozilla, which obfuscates the actual concerns so they feel good publishing worthless articles, "the open web."