* Posts by penthes

1 post • joined 5 Jul 2008

ISO certifies Adobe's PDF

OK for reading, no good for analysing

PDF has one major problem: no document structural markup, i.e. nothing to indicate what's a heading, where a paragraph starts and ends, etc.

Fine for viewing on screen and printing, but if you want a machine to read it, analyse it, extract information for search & discovery, then it's awful. Even worse if you want to extract tables or other structured information.

HTML's actually a much better format for this.
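
To illustrate the point, here's a rough sketch of how little it takes to pull headings and paragraphs out of HTML using only Python's standard library. The names `StructureExtractor`, `extract_structure` and the sample markup are made up for this example, and it assumes simple, well-formed input with no nested paragraphs or headings.

```python
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Collect (tag, text) pairs for headings and paragraphs."""

    # Hypothetical choice of tags to treat as "structure" for this sketch.
    WANTED = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.items = []    # finished (tag, text) pairs, in document order
        self._tag = None   # the wanted tag currently open, if any
        self._buf = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.WANTED:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        # Text inside inline tags such as <b> is still collected.
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            self.items.append((tag, "".join(self._buf).strip()))
            self._tag = None


def extract_structure(html):
    """Return the headings and paragraphs found in an HTML string."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample document.
sample = "<h1>Annual report</h1><p>Sales rose in <b>2007</b>.</p>"
print(extract_structure(sample))
# → [('h1', 'Annual report'), ('p', 'Sales rose in 2007.')]
```

The markup itself says which bit is the heading and which is body text, so the machine doesn't have to guess from font sizes and page positions.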

Unfortunately a lot of documents are getting published (and archived) in PDF only.