May 16, 2004

US intelligence exposed as student decodes Iraq memo

Armed with little more than an electronic dictionary and text-analysis software, Claire Whelan, a graduate student in computer science at Dublin City University in Ireland, has managed to decrypt words that had been blotted out from declassified documents to protect intelligence sources.


[IMAGE] It took less than a week to decipher the blotted out words.

She and one of her PhD supervisors, David Naccache, a cryptographer with Gemplus, which manufactures banking and security cards, tackled two high-profile documents. One was a memo to US President George Bush that had been declassified in April for an inquiry into the 11 September 2001 terrorist attacks. The other was a US Department of Defense memo about who helped Iraq to 'militarize' civilian Hughes helicopters.

It all started when Naccache saw the Bush memo on television over Easter. "I was bored, and I was looking for challenges for Claire to solve. She's a wild problem solver, so I thought that with this one I'd get peace for a week," Naccache says. Whelan produced a solution in slightly less than that.

Unmasking blotted out words was easy, Naccache told Nature. "Optical recognition easily identified the font type - in this case Arial - and its size," he says. "Knowing this, you can estimate the size of the word behind the blot. Then you just take every word in the dictionary and calculate whether or not, in that font, it is the right size to fit in the space, plus or minus 3 pixels."
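The width test Naccache describes can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual tool: the per-character widths in `CHAR_WIDTHS` are made-up numbers standing in for real Arial metrics, and the names `text_width` and `candidates` are hypothetical. A real run would take the widths from the font identified in the scan.

```python
# Toy per-character advance widths in pixels. These values are invented for
# illustration; real ones would come from the metrics of the detected font.
DEFAULT_WIDTH = 9
CHAR_WIDTHS = {"i": 4, "l": 4, "j": 4, "t": 5, "f": 5, "r": 6, "m": 14, "w": 13}


def text_width(word):
    """Sum the (hypothetical) advance width of every character in the word."""
    return sum(CHAR_WIDTHS.get(ch.lower(), DEFAULT_WIDTH) for ch in word)


def candidates(dictionary, blot_width, tolerance=3):
    """Keep the words whose width is within `tolerance` pixels of the blot."""
    return [w for w in dictionary if abs(text_width(w) - blot_width) <= tolerance]


words = ["Ugandan", "Egyptian", "indebted", "French", "will", "helicopter"]
blot_width = 63  # pretend this was measured from the scanned blot
print(candidates(words, blot_width))
```

With these toy widths, three of the six words survive the size test. Narrowing such a list further is the job of the grammatical and human filtering steps.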

A computerized dictionary search yielded 1,530 candidates for a blotted out word in this sentence of the Bush memo: "An Egyptian Islamic Jihad (EIJ) operative told an XXXXXXXX service at the same time that Bin Ladin was planning to exploit the operative's access to the US to mount a terrorist strike." A grammatical analyser yielded just 346 of these that would make sense in English.

A cursory human scan of the 346 removed unlikely contenders such as acetose, leaving just seven possibilities: Ugandan, Ukrainian, Egyptian, uninvited, incursive, indebted and unofficial. Egyptian seems most likely, says Naccache. A similar analysis of the defence department's memo identified South Korea as the most likely anonymous supplier of helicopter knowledge to Iraq.

Intelligence experts say the technique is cause for concern, and that they may think about changing procedures. One expert adds that rumour-mongering on probable fits might engender as much confusion and damage as just releasing the full, unadulterated text.

Naccache accepts the criticism that although the technique works reasonably well on single words, the number of candidates for more than two or three consecutively blotted out words would severely limit it. Many declassified documents contain whole paragraphs blotted out. "That's impossible to tackle," he says, adding that, "the most important conclusion of this work is that censoring text by blotting out words and re-scanning is not a secure practice".
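To see why runs of several blotted words are so much harder, here is a rough sketch that counts how many word sequences fit a blot of a given width. Everything in it is assumed for illustration: the function name `count_phrases`, the space width, and the toy histogram of word widths (100 dictionary words at each width from 20 to 100 pixels) are not taken from the researchers' work.

```python
from collections import Counter


def count_phrases(word_widths, k, blot_width, space_width=4, tolerance=3):
    """Count ordered k-word sequences whose total width fits the blot.

    word_widths maps a pixel width to the number of dictionary words
    that have exactly that width.
    """
    ways = Counter({0: 1})  # ways[w] = sequences so far with total width w
    for i in range(k):
        gap = space_width if i > 0 else 0  # a space precedes every word but the first
        nxt = Counter()
        for total, n in ways.items():
            for width, m in word_widths.items():
                t = total + gap + width
                if t <= blot_width + tolerance:
                    nxt[t] += n * m
        ways = nxt
    return sum(n for t, n in ways.items() if abs(t - blot_width) <= tolerance)


# Hypothetical histogram: 100 words at every width from 20 to 100 pixels.
toy = {w: 100 for w in range(20, 101)}
for k, blot in [(1, 60), (2, 124), (3, 188)]:
    print(k, "word(s):", count_phrases(toy, k, blot))
```

Under these assumptions, the number of fitting sequences multiplies with every extra word, because each additional word can trade width with its neighbours.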

Naccache and Whelan presented their results at Eurocrypt 2004, a meeting of security researchers held in Interlaken, Switzerland, in early May. They did not present at the formal sessions, but at a Tuesday evening informal 'rump session', where participants discuss work in progress. "We came away with the prize for the best rump-session talk - a huge cow-bell," says Naccache.

(c) Nature News Service / Macmillan Magazines Ltd 2004


Posted by iang at May 16, 2004 11:17 AM

ROTFL or something like that.

Especially the "Intelligence experts say the technique is cause for concern..." part. Aren't they aware that these techniques were probably already in use during WW2? Yes, even the daunting task of collecting the huge number of alternatives for each word. But as labor was relatively cheap and they already had an inkling of distributed processing, it is too obvious.

I like French food, but I like this even better ;-)

Posted by: Twan at May 17, 2004 08:41 PM

[for references] you only have to look at the descriptions of how the large-scale cryptographic services of that time worked (David Kahn and others). And don't forget that not all intercepted traffic was as sophisticated as the PURPLE and Enigma stuff.

Deciphering "blotted out" texts was probably only used to make sense of censored letters found on fugitive civilians and captive or killed soldiers, and was likely only to yield information on the movement of troops.

And this, of course, was a less glamorous part of intelligence work than the intelligence that could be gathered from high-end crypto such as the Enigma and PURPLE codes. Showing off the technological prowess of cracking the latter codes added to the fame, but also obscured the "simpler" techniques in the process.

Posted by: Twan at May 18, 2004 03:59 AM

You make a good point - that the sex appeal of some parts of the crypto business overshadows some drier parts, and this distortion has always been with us. Cf. the emerging QC hype cycle.

Posted by: Iang at May 18, 2004 04:00 AM