Just as Reddit is celebrating Opera reaching 100/100, with the misleading headline Opera the first browser to pass the Acid3 test (hey, submitter: it wouldn't hurt to read the Opera blog post before submitting it to Reddit), the Apple guys track me down and point out that there's yet another bug in the test. With heycam's help, we have now fixed the test. Again. This presumably means Opera is now at 99/100... the race continues!
I have to say, by the way, that the relevant parts of the SVG spec are truly worthless. Where are the UA conformance criteria? You'd think a spec that was so verbose and detailed would actually tell you stuff, instead of just rambling on without actually saying what the requirements were...
The Acid3 test has a rendering subtest that checks the positioning of text in particular conditions (absolutely-positioned generated content with embedded fonts). To get precise results, I used a single glyph from the Ahem font, which has well-known metrics. My plan was to have the glyph set up so that a perfect white 20×20 pixel square glyph from the font would be overlaid exactly on a 20×20 pixel square red background, thus hiding everything when things lined up. I positioned this test in the upper right hand corner, snug against the black border of the test.
The problem with this test is that on some platforms, specifically Mac platforms with LCD antialiasing, the font rendering system actually renders the glyph using sub-pixel effects, which ends up overlapping the border and makes the test not look the same as the reference rendering.
This would affect any browser, but only on Mac. Unfortunately, the WebKit team "fixed" this problem by simply hard-coding the Ahem font and making it not antialias.
Now, I argue that this is a bug in the antialiasing, but sadly there's no real spec for the antialiasing and so other people argue that it shouldn't be in the test in the first place, whether I'm right or wrong. What we all agree on is that the font-specific hack is lame. (It's especially bad with this font because Ahem is supposed to be a testing font and we specifically don't want it going down different codepaths!)
So Hyatt and I came to a deal. I would move the test down and to the left one pixel, so it doesn't affect the border anymore; he would agree to remove the hack and to fix one additional bug (a background-position rounding bug).
The test will probably have a few more minor changes as people track down the last few remaining problems, in particular in the SVG subtests and in the performance part of test 26. (Test 26 is supposed to track the incremental speed of computer hardware, but it hasn't really been calibrated well yet; I just estimated what the numbers should be.)
It's great to see WebKit and Opera work so hard on interoperability issues such as those brought up by Acid3. The Microsoft and Mozilla teams are currently in the "crunch time" of their respective browsers' releases, so it's expected that they wouldn't be working on this at this time: at the end of a release cycle, stability, performance, and user experience are usually much more critical. Hopefully once IE8 and Firefox3 are released they will be able to turn once more to the world of standards and we'll see big improvements to the Web platform again.
The Acid3 test says "To pass the test, a browser must use its default settings, the animation has to be smooth, the score has to end on 100/100, and the final page has to look exactly, pixel for pixel, like this reference rendering". (Emphasis mine.)
There has been some question as to what "the animation has to be smooth" means.
The idea is to make sure that browsers focus on performance as well as standards. Performance isn't a standards-compliance issue, but it is something that affects all Web authors and users. If a browser passes all 100/100 subtests and gets the rendering pixel-for-pixel correct (including the favicon!), then it has passed the standards-compliance parts of the Acid3 test. The rest is just a competition for who can be the fastest.
To determine the "score" for performance in a browser that gets 100/100, click on the "A" of "Acid3" on the test after having run the test twice (so that the test uses the browser's cache). An alert should pop up, giving the total time elapsed and reporting any tests that took longer than 33ms. Test 26 is the only one that should take any significant amount of time, as it contains a tight loop doing some common DOM and JS operations. The test has "passed", for the purposes of the "smoothness" criterion, if all the tests took less than 33ms (it'll give you a message saying "No JS errors and no timing issues." if this happens). Then the only issue is the total time: is it faster than all the other browsers?
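As a rough illustration of that check, here is a minimal Python sketch. It is not the code the test itself runs; the function names, the input format (a mapping from test number to duration in milliseconds), and the wording of the failure message are all made up for the example. Only the 33ms threshold and the success message come from the description above. Treating a test that takes exactly 33ms as slow is also an assumption.

```python
from typing import Dict, List, Tuple

# Threshold quoted above; a test at or over this is treated as "slow" here
# (the handling of exactly 33ms is an assumption of this sketch).
FRAME_BUDGET_MS = 33.0


def slow_tests(durations_ms: Dict[int, float],
               budget_ms: float = FRAME_BUDGET_MS) -> List[Tuple[int, float]]:
    """Return (test number, duration) pairs for tests that did not finish
    in less than budget_ms, sorted by test number."""
    return sorted((n, d) for n, d in durations_ms.items() if d >= budget_ms)


def smoothness_report(durations_ms: Dict[int, float],
                      js_errors: int = 0) -> Tuple[float, str]:
    """Return the total elapsed time and a one-line verdict."""
    total = sum(durations_ms.values())
    slow = slow_tests(durations_ms)
    if not slow and js_errors == 0:
        # Success message as quoted in the post.
        verdict = "No JS errors and no timing issues."
    else:
        # Hypothetical wording; the real alert text is not reproduced here.
        slow_list = ", ".join(f"test {n} ({d:g}ms)" for n, d in slow) or "none"
        verdict = f"JS errors: {js_errors}; slow tests: {slow_list}"
    return total, verdict


if __name__ == "__main__":
    # Made-up timings, purely to show the call shape.
    print(smoothness_report({1: 2.0, 26: 40.0, 27: 5.0}))
```

Once every test is under the threshold, the total is the only number left to compare between browsers, which is why the sketch returns it separately from the verdict.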
An important question is "using what hardware?". Performance tests vary depending on the hardware, so some "reference platform" has to be picked to make a decision. Since "computer" browsers are the first priority with Acid3, as opposed to browsers for phones or other small devices, and since we want the hardware to be able to run the three major platforms of today, I have decided that the "reference hardware" is whatever the top-of-the-line Apple laptop is at the time the test is run.
As hardware improves, performance improves too, so to take this into account test 26 is set up to take longer and longer over time. Today I calibrated the test so that the performance it expects is plausible and will remain so for the next few years, based on results that browsers get on the past few years of Mac laptops.
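To make the idea of a workload that grows over time concrete, here is a minimal Python sketch. The actual calibration isn't spelled out in this post, so everything specific below is an assumption chosen for illustration: the base date, the base iteration count, the linear growth rate, and the function name.

```python
from datetime import datetime, timezone

# Hypothetical calibration constants; none of these values come from the test.
BASE_DATE = datetime(2008, 1, 1, tzinfo=timezone.utc)
BASE_ITERATIONS = 1000        # loop count on the base date (assumed)
GROWTH_PER_YEAR = 0.5         # extra work per year, as a fraction of base (assumed)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def iterations_for(run_date: datetime) -> int:
    """Return how many loop iterations to run on run_date.

    The count grows linearly with the time elapsed since BASE_DATE, so the
    same time budget stays demanding as hardware gets faster. Dates before
    BASE_DATE are clamped to the base count.
    """
    years = (run_date - BASE_DATE).total_seconds() / SECONDS_PER_YEAR
    scaled = BASE_ITERATIONS * (1 + GROWTH_PER_YEAR * years)
    return max(BASE_ITERATIONS, round(scaled))


if __name__ == "__main__":
    print(iterations_for(datetime.now(timezone.utc)))
```

Linear growth is used here only because it is the simplest choice; whatever curve is used would have to be fitted to how browsers actually perform on successive generations of the reference hardware.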
In my last post I discussed what "the animation has to be smooth" means for Web browsers on computers. Of course, embedded devices also have Web browsers, and Acid3 is as important for those as it is for a regular computer.
On embedded devices with smaller displays, the visual rendering may end up extending outside of the screen, but it should still look pixel-for-pixel identical to the reference rendering.
On devices with low power CPUs, the animation may end up being jerky, especially around test 26. This is because test 26 is significantly more CPU-intensive than the other tests.
On devices without a cache, the animation may be very jerky throughout. This is because without a cache, the test is affected by network conditions.
However, ideally, even embedded devices will have a cache and a CPU powerful enough to handle intense JavaScript. After all, they won't be able to handle big Web applications if they don't have the power to handle even a page like Acid3!
2008-04-22 02:46 UTC
Media queries and performance in Acid3 (and an error on my part)
David Baron of Mozilla discovered some errors in the Acid3 test. It turns out that the Media Queries draft changed between 2002 and 2007, and I was testing things as they stood in the 2002 version of the specification! This has now been fixed. He also found a logic error that I'd made, which I have also fixed.
I'm really glad to see the Acid3 test being reviewed so carefully. With something so complex it's always possible that there are more errors, though, so if you find any more please let me know!
On a related note, the last two posts I made discussed the performance aspect of Acid3, and how to determine if a browser has passed the "smoothness" criterion. While I covered how to test browsers that run on regular laptop computers, there are of course a lot of browsers out there that run on computers that aren't, and never will be, as powerful as high-end laptops. So for the record: if the test is run on a slow computer or device, it may run slowly or not smoothly, and this does not imply non-conformance.