Draft whitepaper about improvements to functions in Excel 2010

I don’t often write posts that simply say “hey, did you see this post over here?”. These echoes in the blogosphere don’t really add much value, and are sometimes symptomatic of people being measured by how many blog posts they write to meet some arbitrary marketing activity metric, rather than adding quality.

(Aside: the same applies to a series of posts about 6 related features of some software, or comparing 5 alternative products, that would have made much more sense written as a single cohesive article but then would not have ticked the box for 10 blog posts per month. You get what you measure, or WYMIWYG.)

But today, I thought this was important enough to just say – have you seen this post on the Excel Team blog about improvements to functions in Excel 2010? That one is a week or so old now, but today there was an even more important post with a link to a draft whitepaper with more information.

I have linked to the post, not the pdf file directly, as it is only a draft, and hopefully the post will get an edit or at least a comment when a newer or final version is available. This document is for those who like to understand the details, and for any sceptics who might say “well, they said it was accurate last time, how do we know it’s any better now?” – it does sound a bit like the washing powder ads that always tell us this time round it will get things even whiter and brighter and cleaner than ever (just like they said before).

There are a couple of typos in the draft (the floor.precise function, for example, has an obvious chunk copied and pasted from the ceiling.precise function), and some things are not made very clear (in most cases it only describes the new behaviour, not the old, so there is no comparison to explain what the difference is or why the new way is more accurate).

It lists the MOD function in the section on functions whose accuracy has been improved. Did you know that MOD gave inaccurate results in older versions of Excel? I didn’t. In fact, it doesn’t – it gives a completely accurate result, or fails with a #NUM! error if the divisor goes into the number more than 2^27 times, as described here. I think that is distinct from being inaccurate in the way the statistical functions have often been criticised.
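
For anyone who has hit that old limit, here is a minimal sketch in Python – not Excel’s actual code, just my reading of the documented behaviour – of the pre-2010 failure mode and the classic worksheet workaround =number - divisor*INT(number/divisor):

```python
import math

# A minimal sketch, not Excel's implementation: it mimics the pre-2010 MOD
# behaviour described above (an exact result, or #NUM! once the divisor goes
# into the number more than about 2**27 times) and expresses the classic
# worksheet workaround =number - divisor*INT(number/divisor) in Python.
# Excel's INT rounds towards negative infinity, which is math.floor here.

def legacy_mod(number, divisor):
    if abs(number / divisor) > 2**27:
        return "#NUM!"                              # the old failure mode
    return number - divisor * math.floor(number / divisor)

def mod_workaround(number, divisor):
    # the same expression as the worksheet workaround, with no quotient limit
    return number - divisor * math.floor(number / divisor)

print(legacy_mod(10, 3))          # 1
print(legacy_mod(2**30, 3))       # '#NUM!' under the old limit
print(mod_workaround(2**30, 3))   # 1 -- the workaround still returns a result
```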

Similarly, the RAND function is listed but its problems are not described in any detail – I can only assume that it does not give a perfectly even distribution of results and is therefore “weighted” to some extent. Since I only ever use it to produce dummy data for examples in my Excel training courses, it does not really matter to me if it is not truly statistically pure, but I am sure that to others the algorithm used to generate the results is vitally important (it is now the Mersenne Twister, for those who care – not a fact from the whitepaper, but from a comment on the original post made by Jessica Liu).
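
As it happens, Python’s random module uses the same Mersenne Twister (MT19937) generator mentioned in that comment, so here is a quick sketch of the kind of output the new RAND is meant to produce; it illustrates the algorithm family, not Excel’s actual implementation:

```python
import random

# Python's random module happens to use the Mersenne Twister (MT19937), the
# same algorithm the comment on the Excel Team post says RAND now uses, so
# this only illustrates the algorithm family, not Excel's actual code.

rng = random.Random(42)                           # seeded so the run repeats
sample = [rng.random() for _ in range(100_000)]   # uniform on [0, 1), like RAND()

# crude check that the values fall evenly across ten equal-width bins
bins = [0] * 10
for x in sample:
    bins[int(x * 10)] += 1
print(bins)   # each bin should hold roughly 10,000 values
```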

Anyway, the bugs and inaccuracies discussed in the whitepaper are all now fixed (the whitepaper does not say so explicitly, merely leaves it implied), and the fixes are already in the Technical Preview. Some of the other changes came too late for the TP but should be in the public Beta when that gets released.

I expect the changes to naming conventions will also help people who use the statistical functions a lot (I’m not one of those) or have to make sense of others’ work. The .precise suffix added to a function name seems to mean “according to a precise definition” rather than “inherently more accurate”. I would have thought .strict might be less ambiguous, and closer to the way the term is used in other fields (e.g. XHTML Transitional versus XHTML Strict in web design).
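
To make that concrete, here is a rough Python sketch of how I read the difference between the legacy FLOOR and the new FLOOR.PRECISE – the edge cases are my interpretation of the documentation, not something I have verified against Excel itself:

```python
import math

# A rough sketch of my reading of the naming change, not verified against
# Excel: the legacy FLOOR rejects arguments with mixed signs, while
# FLOOR.PRECISE follows one strict definition -- round down towards negative
# infinity to a multiple of |significance| -- for all inputs. In other words,
# ".precise" is about the definition, not about extra numerical accuracy.

def floor_legacy(number, significance):
    if number * significance < 0:
        return "#NUM!"                    # mixed signs: the old error case
    return significance * math.floor(number / significance)

def floor_precise(number, significance=1):
    step = abs(significance)
    return step * math.floor(number / step)

print(floor_legacy(-3.2, 2))    # '#NUM!' under the old rules (as I read them)
print(floor_precise(-3.2, 2))   # -4.0: rounded down by the strict definition
print(floor_precise(3.2, -2))   # 2.0: the sign of significance no longer matters
```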

Has anybody had real issues with these inaccuracies in the past? (and had you even noticed?)

Are you using other software tools to avoid the problem in Excel, and will these changes allow or encourage you to switch back?