An Alternative Perspective -- Scientific Accountability
I've been following Dan Goodin's articles on PDF vulnerabilities
quite closely for some time now and I would like to offer an
alternative perspective. Specifically, instead of bashing PDF
for its security weaknesses, I believe the format should be
appreciated for the many scientific possibilities it offers, especially
in light of the ongoing climategate scandal.
Before I elaborate on my viewpoint, I should note that I do take
PDF vulnerabilities seriously. For instance, the recent security advisory:
http://www.adobe.com/support/security/bulletins/apsb10-09.html
lists me as having identified CVE-2010-0197. And Quirk2003,
http://www.amrita-cfd.org/doc/amr2003, is an example of a PDF
that includes a built-in security FAQ. It does so because the document
includes /Launch actions, which currently have Stevens in a froth.
However, for added security, its /Launch actions are only active
when the document is viewed using a custom PDF pre-processor.
Therefore, while I would not bill myself as a security expert,
I like to think I have a grasp of the main issues.
My real interest lies with the concept of self-substantiating
journal articles for injecting rigour into the practical aspects
of computational science. Imagine electronic documents that
preserve the look-and-feel of a traditional scholarly publication,
while containing embedded examples that allow the interested reader
to sample the reported work, first hand, right down to its smallest detail.
Well, Quirk2003 shows that such documents can be prototyped using PDF.
Why bother? Some of you may have read a recent story in The Guardian,
by Darrel Ince: ``If you're going to do good science,
release the computer code too'' see:
http://www.guardian.co.uk/technology/2010/feb/05/science-climate-emails-code-release
It is an open secret in scientific computing that programming standards
are extremely poor and desperately need improving.
As luck would have it, I posted the last comment on Ince's article
and so it has its own URL,
http://www.guardian.co.uk/technology/2010/feb/05/science-climate-emails-code-release?showallcomments=true#end-of-comments
But the downside to posting the last comment is that I got no feedback.
Undeterred, I contacted Ince directly. Then, following an exchange of
e-mails and a phone call, he pointed me in the direction of
``The Fourth Paradigm'' -- scientific discovery
through data intensive processing, see http://www.fourthparadigm.org .
And it was while annotating this Microsoft-sponsored
book that I stumbled upon CVE-2010-0197 .
Now as an undergraduate, I lived in Fitzwilliam St, Cambridge,
directly opposite to where Darwin once lived, and so I'm fully aware
of my own limitations as a scientist. However, given my document
dabblings, it pains me to see the advocacy of a new scientific paradigm,
distributed as a PDF, in which the authors cannot provide the
critical reader with worked examples to show their ``computational thinking.''
The situation is analogous to a mathematician claiming to have a wonderful proof
but only being prepared to discuss the proof in vague generalities.
It jars, because as my annotated version shows:
http://www.amrita-ebook.org/draft/jjq-on-4th-Paradigm.pdf
PDF allows for a much richer dialogue between technical author and
technical reader.
Here I need to make it very clear that my document dabblings are just that,
dabblings, and anyone who downloads jjq-on-4th-Paradigm.pdf will soon
see the limitations of my work. The scientific question, however,
is not whether I'm right or wrong. Nor has it anything to do with
format wars, PDF vs XML vs A.N.OTHER. It has to do with accountability,
scholarship, and maintaining standards of critical thinking.
This week the US Library of Congress announced a project to
archive Twitter which, while I have grave misgivings about the target
material, shows that society takes its archiving duties seriously.
For me, a much more exciting initiative is the Federal Research
Public Access Act (FRPAA):
http://www.taxpayeraccess.org/issues/frpaa/index.shtml
which today, after a number of false starts, was introduced
in the US House of Representatives, not six miles from where
I'm composing this message.
The challenge I would like to leave readers of Dan Goodin's articles
is that the next time you are tempted to bash PDF for having unnecessary
and dangerous features: stop, and imagine a world where taxpayers,
educators, and students could download and try out ``computational classics.''
These are entities that rival literary ones in terms of their cultural significance and
would help inspire generation after generation to the intellectual joys
of computer-based science. Then imagine what document features would be needed
to support said computational classics.
This year, The Royal Society (the world's oldest scientific organization still
in existence) is celebrating its 350th anniversary.
So do your imagining in the year 2360 when the society celebrates
its 700th anniversary. By then, the inherent weaknesses of current
journal papers, for reporting computational work, notwithstanding
their strengths, will be apparent to even a kindergarten student.
It is also to be hoped that by then society will have a better
handle on how to deal with computer security.
Yes, security is important, but it is also important for document formats
to evolve so as to support rigorous computer-based science.
James J. Quirk
16th April 2010
Alexandria, VA