Which number takes more space in an Oracle row?

So, which number takes more bytes inside an Oracle row?

A: 123

B:  1000000000000000000000000000000000000

And the correct answer is … (drumroll) … A! The “big” number 1000000000000000000000000000000000000 actually takes less space than the “small” 123!

Let’s verify this:

SQL> select vsize(123) A, vsize(1000000000000000000000000000000000000) B from dual;

         A          B
---------- ----------
         3          2

WTF? Why does such a small number 123 take more space than  1000000000000000000000000000000000000 ?

Well, the answer lies in how Oracle stores numbers. Oracle NUMBER datatype doesn’t store numbers in their platform-native integer format. Oracle uses it’s own format which stores numbers in scientific notation, in exponent-mantissa form. More details about this here.

You can use the DUMP sql function to see the actual binary value of the number data stored:

select dump(123) from dual;

Typ=2 Len=3: 194,2,24

SQL> select dump(1000000000000000000000000000000000000) from dual;

Typ=2 Len=2: Typ=2 Len=2: 211,2

So, although the number 1000000000000000000000000000000000000 is bigger than 123, when stored in base-10 exponent form, it really carries much less information in it than 123 (1 x 10^36 vs 123 x 10^0). Oracle doesn’t need many bits for keeping the precision of this large value as it happens to be a power of 10.

See what happens when I store a number only slightly bigger or smaller than the original large number, now the stored number requires much more storage for keeping the required precision:

SQL> select dump(1000000000000000000000000000000000000+1) from dual;

Typ=2 Len=20: 211,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2

SQL> select dump(1000000000000000000000000000000000000-1) from dual;

Typ=2 Len=19: 210,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100
This entry was posted in Oracle and tagged . Bookmark the permalink.

10 Responses to Which number takes more space in an Oracle row?

  1. Sokrates says:

    And the correct answer is … (drumroll) … B

    A ?

  2. Tanel Poder says:

    @Sokrates Doh, of course :) As the question was about which one took “more” not “less”. Fixed. Thanks!

  3. It’s a fun thought that because of this, one can reduce the storage requirements for something like a data warehouse fact table by allocating to the most common dimensional values synthetic keys with the smallest storage requirements — 0, 1, 100, 1000, 100000 etc..

    It would be a shame if the value that occurs 95% of the time was represented by 4 bytes instead of 2.

    I don’t have a database handy at the moment, but am i right in remembering that 0 requires one byte and 1 requires 2? I seem to remember using 0 and 1 as keys for a two-value dimension instead of 1 and 2 for that reason … or maybe I was just considering it and then thought “over engineering!”.

  4. Tanel Poder says:

    @David Aldridge
    Interesting point, David :)

    Although I think I’d be lazy enough (and optimize my own time and reduce complexity) instead of trying to save 0.1% of the total table size :)

    Yep, Value 0 is stored as 0x80, 1 is stored in 2 bytes (0xC1, 0x2)..

    Of course every column which is actually stored in the row structure also has a length byte, so a NUMBER 0 would be stored as 0x1 (length) and 0x80 (value).

    There used to be a Data Server Internals class by Oracle (DSI – updated until 9i only) which explained these low level structures, including datatype storage formats in good detail…

  5. null says:

    Nice post! Thanks Tanel.

  6. maclean says:

    interesting post. how about gooooogle ….?

  7. joel garry says:

    Always great fun:

    TTST> select dump(.999999999999999999999999999999999999999999999999999999999999) from dual;

    Typ=2 Len=2: 193,2

    TTST> select dump(1) from dual;

    Typ=2 Len=2: 193,2

    TTST> select dump(1/3*3) from dual;

    Typ=2 Len=21: 192,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100

    TTST> select dump(3*1/3) from dual;

    Typ=2 Len=2: 193,2

    TTST> select dump(3*(1/3)) from dual;

    Typ=2 Len=21: 192,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100,100

    Good ol’ precedence combined with division precision repeals commutative law, makes up information.

  8. Ahmed says:

    Hi Tanel,

    We’re getting an ORA-03113 error when one try to apply a to_char function on a FLOAT column:

    SQL> select to_char(col1),col1 from aaaa;
    ORA-03113: end-of-file on communication channel
    Process ID: 0
    Session ID: 416 Serial number: 2314

    The table contains 4 rows and the issue is due to the fourth row.
    I don’t know yet how the data is inserted into this table.
    I’m waiting for our developpers to provide me with this information.

    SQL> select col1,dump(col1) dump_col1 from aaaa;

    ———- ——————————
    0.0002 Typ=2 Len=2: 191,3
    0.002 Typ=2 Len=2: 191,21
    0.0002 Typ=2 Len=2: 191,3
    0.0002 Typ=2 Len=3: 191,3,1

    The visible value of the fourth row seems to be the same as the 1st and the 3rd row but the dump column shows that the length is not the same.
    How can I find out what is the actual value stored inside the database for the 4th row? How interpret the 3rd part of the dump column: “191,3,1” ?

    It would be nice if you could help me on this.

  9. Ahmed says:

    I forgot to indicate the database version:

    SQL> select * from v$version;

    Oracle Database 10g Enterprise Edition Release – 64bi
    PL/SQL Release – Production
    CORE Production
    TNS for Linux: Version – Production
    NLSRTL Version – Production

Leave a Reply

Your email address will not be published. Required fields are marked *