Armed with little more than an electronic dictionary and text-analysis software, Claire Whelan, a graduate student in computer science at Dublin City University in Ireland, has managed to decrypt words that had been blotted out from declassified documents to protect intelligence sources.
13 May 2004 DECLAN BUTLER
[IMAGE] It took less than a week to decipher the blotted out words.
She and one of her PhD supervisors, David Naccache, a cryptographer with Gemplus, which manufactures banking and security cards, tackled two high-profile documents. One was a memo to US President George Bush that had been declassified in April for an inquiry into the 11 September 2001 terrorist attacks. The other was a US Department of Defense memo about who helped Iraq to 'militarize' civilian Hughes helicopters.
It all started when Naccache saw the Bush memo on television over Easter. "I was bored, and I was looking for challenges for Claire to solve. She's a wild problem solver, so I thought that with this one I'd get peace for a week," Naccache says. Whelan produced a solution in slightly less than that.
Demasking blotted out words was easy, Naccache told Nature. "Optical recognition easily identified the font type - in this case Arial - and its size," he says. "Knowing this, you can estimate the size of the word behind the blot. Then you just take every word in the dictionary and calculate whether or not, in that font, it is the right size to fit in the space, plus or minus 3 pixels."
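The width test Naccache describes can be sketched in a few lines of Python. This is only an illustration, not the researchers' software: the per-character widths below are invented placeholders rather than real Arial metrics, and the names char_width, word_width and candidates_for_blot are hypothetical. A real attempt would measure each word with the actual font at the size recovered from the scan.

```python
# Toy per-character widths in pixels. These numbers are invented for
# illustration and are NOT real Arial metrics.
NARROW = set("fijlrt")
WIDE = set("mw")


def char_width(ch: str) -> int:
    """Return the assumed width of one character, in pixels."""
    lower = ch.lower()
    if lower in NARROW:
        base = 4
    elif lower in WIDE:
        base = 12
    else:
        base = 8
    # Assumption: a capital is 2 px wider than its lowercase form.
    return base + 2 if ch.isupper() else base


def word_width(word: str) -> int:
    """Rendered width of a word: the sum of its character widths."""
    return sum(char_width(ch) for ch in word)


def candidates_for_blot(dictionary, blot_width, tolerance=3):
    """Keep the words whose width is within `tolerance` pixels of the blot."""
    return [w for w in dictionary if abs(word_width(w) - blot_width) <= tolerance]


if __name__ == "__main__":
    sample = ["Egyptian", "Ugandan", "Ukrainian", "French", "intelligence"]
    print(candidates_for_blot(sample, blot_width=60))
```

With a full dictionary and real font metrics, a filter of this kind produces the long candidate list, which then has to be pruned by grammar and by a human reader.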
A computerized dictionary search yielded 1,530 candidates for a blotted out word in this sentence of the Bush memo: "An Egyptian Islamic Jihad (EIJ) operative told an XXXXXXXX service at the same time that Bin Ladin was planning to exploit the operative's access to the US to mount a terrorist strike." A grammatical analyser narrowed these to just 346 that would make sense in English.
A cursory human scan of the 346 removed unlikely contenders such as acetose, leaving just seven possibilities: Ugandan, Ukrainian, Egyptian, uninvited, incursive, indebted and unofficial. Egyptian seems most likely, says Naccache. A similar analysis of the defence department's memo identified South Korea as the most likely anonymous supplier of helicopter knowledge to Iraq.
Intelligence experts say the technique is cause for concern, and that they may think about changing procedures. One expert adds that rumour-mongering on probable fits might engender as much confusion and damage as just releasing the full, unadulterated text.
Naccache accepts the criticism that although the technique works reasonably well on single words, the number of candidates for more than two or three consecutively blotted out words would severely limit it. Many declassified documents contain whole paragraphs blotted out. "That's impossible to tackle," he says, adding that, "the most important conclusion of this work is that censoring text by blotting out words and re-scanning is not a secure practice".
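Why longer blots defeat the method can be illustrated with a small counting sketch. It is an assumption-laden toy, not part of the reported work: the function count_phrases, the 4-pixel space and the six word widths in the usage example are all hypothetical. It counts how many ordered sequences of k words would fill a blot of a given width, to within the same 3-pixel tolerance.

```python
from collections import Counter


def count_phrases(widths, k, blot_width, space=4, tolerance=3):
    """Count ordered k-word sequences whose total width fits the blot.

    `widths` holds one pixel width per dictionary word; `space` is the
    assumed width of the gap between consecutive words.
    """
    per_width = Counter(widths)   # number of dictionary words at each width
    totals = Counter({0: 1})      # total width so far -> number of sequences
    for i in range(k):
        gap = space if i > 0 else 0
        extended = Counter()
        for total, n in totals.items():
            for w, m in per_width.items():
                extended[total + gap + w] += n * m
        totals = extended
    return sum(n for total, n in totals.items()
               if abs(total - blot_width) <= tolerance)


if __name__ == "__main__":
    toy_widths = [40, 44, 48, 52, 56, 60]   # six made-up word widths
    for k, blot in [(1, 50), (2, 104), (3, 158)]:
        print(k, count_phrases(toy_widths, k, blot))
```

Even with only six toy words, the number of fitting sequences climbs quickly as the blot grows from one word to three. With a real dictionary the growth is far steeper, which is consistent with Naccache's point that whole blotted out paragraphs are out of reach.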
Naccache and Whelan presented their results at Eurocrypt 2004, a meeting of security researchers held in Interlaken, Switzerland, in early May. They did not present at the formal sessions, but at a Tuesday evening informal 'rump session', where participants discuss work in progress. "We came away with the prize for the best rump-session talk - a huge cow-bell," says Naccache.
(c) Nature News Service / Macmillan Magazines Ltd 2004
Posted by iang at May 16, 2004 11:17 AM