Re: Compulsory casts.
What compiler doesn't know the types of arguments?
A C compiler :-)
Well it knows what they are, but it won't save you when you decide you know better.
There has been much sniggering into sleeves after wags found they could upset iOS 6 iPhones and iPads, and Macs running OS X 10.8, by sending a simple rogue text message or email. A bug is triggered when the CoreText component in vulnerable Apple operating systems tries to render on screen a particular sequence of Unicode …
In my experience, it's typical of modern programming idiom in some areas, where people are sloppy about typing and sloppy about algorithm design, or just plain cut and paste bad code from elsewhere. K&R and other programming texts actually encourage this by using signed types (char, int etc.) as the default, but signed types should only be used where actually needed, i.e. for arithmetic ops, with unsigned the default everywhere else. It's the same story with the original C standard library, which tended to use (signed) int or char for everything. There, the function typing is a real mess, though it may have the excuse of historical context.
Back in the days of assembler, it was quite common to find bugs like a compare or increment/decrement op followed by a signed branch when processing unsigned values. The bug might only be discovered much later, when the value flipped the overflow or sign bit. If you do a lot of embedded work, failure to use the correct type is something you ignore at your peril, and standards like MISRA specifically warn against this sort of thing.
Programming != Software engineering :-)...
Chris
"Programming != Software engineering"
True, but when I was a COBOL database programmer we knew enough to guard against this stupidity when zipping in and out of separately compiled subroutines via the Linkage Section.
Tsk! Baby out with the bathwater again, youth of today, general lack of wherewithal, three world wars, shrapnel in the head, wouldn't've happened in my day, flogging too good for them etc etc.
Talking of VMS, and VT100 terminals, reminds me of much fun in the old days. I remember a certain PAD that would drop the connection if sent something like ZZZZZ (as well as certain control sequences). People used to put them everywhere - including nicknames and in finger IDs on the unix boxes. If you hacked/appropriated someone's account, you might set it as their prompt too.
Oh, it's good to see how far we have come.
VT132s were fun, they had a "readback screen" escape sequence. If you crafted the right string in mail you could write something like
DELETE<CR>EXIT<CR> SYS$SYSTEM:SHUTDOWN<CR> to the system admin's terminal, and instruct the terminal to playback the string. By the time he had rebooted the system all the evidence had gone...
Right, killer sequences... "ZZZZZ as well as certain control sequences", you mention... Well, there was this Hayes patent on inserting a "two second pause" between parts of a command to make their modems not hang up on the (otherwise valid) +++ATH0\n sequence. Of course, most manufacturers did license this, and several manufacturers found ways not to... But I do remember seeing a modem that could be pushed to hang up this way. Very fun for us BBS (ab)users! :)
Given all that "Premium product, because it just works" BS, I'd expect them to a) issue a bug fix, b) scan all their code to find similar code sequences, and c) update those as well.
IOW not just the bug, but the pattern of the bug.
Let's see if we revisit this gag in the future: different string, different function, same epic fail.
Should not be possible, and yet......
Excellent article and very detailed low level stuff.
It's actually a frustratingly easy mistake to make with Apple's APIs — those CFIndexes pop up in quite a few places — and Xcode ships with the implicit signedness conversions warning disabled. It's one of the things I always enable when I'm starting a new project. Just enabling that would probably help them catch stuff like this.
That said, if it's a latent problem in initial table setup, then the true diagnosis is probably that whatever was supposed to guarantee the signed value stays positive needs fixing, so they'd probably have just thrown in the explicit cast and forgotten about it.
Could have been the unicode bug in all versions of NT4/IIS up to service pack 6a.
I remember a friend telling me how awesome the BackOffice server he'd set up for his boss was. I told him to get that shit off the Internet now. He refused.
An hour later I came back to him with a printed directory listing of the server's hard drive and pointed out that I could execute a "format c: /y" just as easily.
Within an hour, the server had been firewalled.
quote: "An hour later I came back to him with a printed directory listing of the server's hard drive and pointed out that I could execute a "format c: /y" just as easily."
And if you did that today, you'd be arrested for unauthorised interference with a computer system, an offence under the Computer Misuse Act 1990. In fact, since you're talking about NT4, it would have been a criminal act at the time... hopefully you didn't get collared for the 12 to 24 months' custodial sentence you could have been liable to.
Interesting thought, that: using a flaw to point out the flaw is a criminal offence, yet a failure to act (failing to secure the system after being notified of the flaw) is deemed merely a civil matter...
This post has been deleted by its author
This is just such a fundamentally elementary bug that the fact it ended up in OSX and the iPhone product lines is just (to me at least) inconceivable! Truly incompetent! When dealing with buffer sizes/lengths, one NEVER uses signed variables, for just this reason... A true FAIL moment for the Apple software team!
FWIW, I have been writing software for large-scale systems for 30+ years. I am a senior systems engineer for a tier-one hardware/software manufacturer. And I was writing software to support Unicode back in the late 1980s when it was still in the development stages.
If I understood the article, it sounds like it's meant to be calculating something related to the string width, and getting that wrong because, when the value passes from one place to another, a glyph count of -1 gets interpreted as a glyph count of UINT64_MAX.
So if you're asking why floating point arithmetic is used, it's because fonts are designed with floating point arithmetic and rendered with floating point arithmetic. The OS X graphics system uses the same drawing primitives as PDF and Postscript.
I suspect the bug could be triggered by any uneven number of characters. I've not had a full chance to play, but my assumption is they're using the direction of the text to mess up the counter; two forwards, three back. Personally I'd compute LTR direction then char direction (i.e. for the backspace char).
No, because the function would appear to get the direction right for a single character or any group of characters with the same direction.
You'd need an initial set of characters going one direction, and then a wider reversing set of characters going the other direction.
If we're talking proportional fonts and actual pixel width rather than character width, it could be done with one narrow LTR character followed by one wider RTL character.
Otherwise, you'd need one more RTL character than LTR (or vice versa if you started with the RTL character).
If this were the case, the correct answer would be either
a) the absolute value of the result, or
b) the sum of the absolute values of the character widths in each direction
depending on whether the desired result is overlapping characters or each set of characters rendered side-by-side.
This little exploit rang a bell, so I searched Bruce Schneier's website. And, sure enough, on July 15, 2000, he observed "Unicode is just too complex to ever be secure." See https://www.schneier.com/crypto-gram-0007.html#9. Doesn't exactly warm the cockles of the paranoid's heart.