Re: Office 2007 is fine
>>"First, this is wrong, as I'm pretty sure the numbers after "row" and "col" are variable (one entry for each combination)"
True, but modern compression techniques handle that. If you have a sequence like the following:
<row 1><col 1>{empty}</col 1></row 1>
<row 2><col 1>{empty}</col 1></row 2>
<row 3><col 1>{empty}</col 1></row 3>
...
<row 999999><col 1>{empty}</col 1></row 999999>
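Then compression will pull out the repeated parts and store only the row numbers and where they fit. The row numbers themselves cost very little too. The DEFLATE compression used in zip containers doesn't literally spot "incrementing by 1", but consecutive numbers share most of their digits and each number appears twice in its own line, so most of those digits get replaced by short back-references as well.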
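If you want to see this for yourself, here's a rough Python sketch. It assumes zlib's DEFLATE as a stand-in for what a zip container does, and `build_rows` and `compressed_ratio` are just names I made up for illustration.

```python
import zlib


def build_rows(n):
    # Same shape as the example above: one empty cell per numbered row.
    lines = [f"<row {i}><col 1>{{empty}}</col 1></row {i}>" for i in range(1, n + 1)]
    return "\n".join(lines).encode("utf-8")


def compressed_ratio(n, level=9):
    raw = build_rows(n)
    packed = zlib.compress(raw, level)
    # Round-trip to confirm the compression is lossless.
    assert zlib.decompress(packed) == raw
    return len(raw), len(packed)


if __name__ == "__main__":
    raw_len, packed_len = compressed_ratio(100_000)
    print(f"raw: {raw_len} bytes, packed: {packed_len} bytes "
          f"({packed_len / raw_len:.1%} of original)")
```

The packed size should come out as a small fraction of the raw size, even though every row number is different.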
The people who write compression algorithms are very, very smart. Both you and I could probably write something that does what I just described. So why expect someone who does it professionally not to? It takes a modern processor almost no time at all to decompress data packed the way I just described. Compared to image compression, it's child's play.
The OP was very wrong to suggest that this was "bloated" because they'd completely forgotten that the Office Open XML formats (.docx, .xlsx and so on) are zip containers that are compressed as standard.
>>"Second, compression is not an excuse for something that could be solved by a less crappy format (Keeping the XML and adding a simple rule like "Saving : Empty cells are not be saved. Loading : If a cell is not defined in the file then it's considered empty" would do the trick)"
Have you actually tried this? I just created a workbook in Excel 2013. I put data in rows 1, 2, 3 and 5 and saved it. I then unpacked the file using 7-Zip and had a look.
Within the <sheetData> element are <row> elements, each with an "r" attribute which is clearly the row number. I have entries for 1, 2, 3 and 5 but no row element for 4. So it seems it actually does do what you suggest. Probably there is something that the OP omitted to mention, such as special formatting or references or similar. Or, come to think of it, they're talking about Office 2007, which is eight years old and uses the very crap version of the format that was rushed through ISO for marketing reasons. At any rate, modern versions of .xlsx omit the elements where possible - I've just checked.
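For anyone who'd rather script the check than poke around in an archive tool, here's a minimal Python sketch of the same inspection. It assumes the first sheet lives at `xl/worksheets/sheet1.xml`, which is where Excel normally puts it but isn't guaranteed. `row_numbers` and `fake_workbook` are hypothetical helper names, and `fake_workbook` only builds a bare-bones zip holding one sheet part so the example runs without a real file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def row_numbers(xlsx, sheet_path="xl/worksheets/sheet1.xml"):
    """Return the "r" attribute of every <row> stored in the given sheet."""
    with zipfile.ZipFile(xlsx) as zf:
        root = ET.fromstring(zf.read(sheet_path))
    return [int(row.get("r")) for row in root.findall("m:sheetData/m:row", NS)]


def fake_workbook(rows):
    # Stand-in for a real file: a zip holding only one sheet part
    # (assumption: enough for this demo, not a complete workbook).
    body = "".join(f'<row r="{r}"><c r="A{r}"><v>1</v></c></row>' for r in rows)
    xml = f'<worksheet xmlns="{NS["m"]}"><sheetData>{body}</sheetData></worksheet>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("xl/worksheets/sheet1.xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(row_numbers(fake_workbook([1, 2, 3, 5])))
```

Point `row_numbers` at a real saved workbook (pass the file path instead of the fake one) and any row you left empty should simply be missing from the list.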
But that "where possible" is important. If you add custom rules as you suggest, then you can quickly reach the point where it is no longer valid XML, and then you create interoperability problems for third parties. And one of the big deals with the new formats, unlike their old proprietary ones, is that they are an open standard that third parties can use. Having your formats be valid XML is a MAJOR boost to that. You can't just decide you don't want to represent some of the XML elements because you feel like it. And as pointed out, they have minimal effect on file size due to modern compression techniques.