Numbers, precision and PostgreSQL

It is interesting how numbers can create problems while developing systems.

While we can’t write one third (1/3) in decimal without resorting to infinite representation of the number through periodic repetitions (I don’t know if the term “periodic number” is correct in English) there is the same issue with other base systems, for example, it is impossible to write one tenth (0.1 or 1/10) in binary without periodic numbers.

A second issue is not the numeric conversion itself, but the manner in which this number will be stored at the database server. There are different standards that try to solve the conversion issue (BCD representation, for example) but even then, people (mainly developers and DBAs) need to be aware of them and what are the implications on their usage.

Other problems happen when making comparisons. Due to the innacuracy of the float type and the hardware where the software is run, assuming that you can get x = 0.05 to work all the time is something problematic (specially when 0.05 becomes 0.0500000001…). The recommended approach, then, is using intervals, so instead of x = 0.05 we would write the equivalent to 0.4999 < x < 0.5001 (or any other variant that takes into account the precision you can get on your numeric system and platform(s)).

Do you want a whole new beast to fight with? Name it rounding. Yes. It might appear to be a simple operation, but then ask for someone on the Maths field to round some numbers and ask the same for someone in the Physics field. You will get different results. And different explanation for the processes. (And if you start including error coefficients, error propagation, etc. then this beast will really become a nightmare…)

Back to my small tiny world (with not so many beasts to fight…), the reason that led me to write about this was some discussion on numbers, their representation on screen and their storage on database systems. It is amazing how a lot of developers lack some formality on their understanding of what they do. It is like they are only code replicators or writers. There is, of course, a vast number of them that understand what they do, but I feel that the majority don’t know what they are doing. Specially with more high level languages.

On the database side, PostgreSQL’s numeric types is a manual page that is largely ignored (forgotten?) by its users. It is nice that it also brings up attention and some awareness about the numeric issues and their representation /storage on systems.

I have submitted a comment on that page asking for more clarification on how to calculate NUMERIC and DECIMAL storage requirements… The current information is just:

The actual storage requirement is two bytes for each group of four
decimal digits, plus five to eight bytes overhead.

Which is helpful, but could be improved. Maybe some examples could make it a lot clearer.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s