Bruce Schneier
Schneier on Security

A blog covering security and security technology.

October 26, 2011

Cracking the Copiale Cipher

I don't follow historical cryptography, so all of this comes as a surprise to me. But something called the Copiale Cipher from the 18th Century has been cracked.

EDITED TO ADD (11/14): Here's the academic website.

Posted on October 26, 2011 at 6:02 AM • 20 Comments

Danny Moules • October 26, 2011 6:30 AM

It's interesting how many different skillsets were required to break it. Makes me wonder what a modern 'cryptoanalyst' is. Deciphering it doesn't look too difficult to me (a native German). The hardest (or luckiest) part is finding such a document in the first place.

AlanS • October 26, 2011 8:44 AM

"It has been more than six decades since Warren Weaver, a pioneer in automated language translation, suggested applying code-breaking techniques to the challenge of interpreting a foreign language."

He also did this for biology. As a program officer at the Rockefeller Foundation he promoted the application of ideas taken from mathematics, physics, linguistics, information theory and cryptography to biology to form a field we now call molecular biology, a term he coined. The scientific approach that led to 'cracking the genetic code' and beyond comes from Weaver. Of course, it's a metaphor; not a real code.

Ron • October 26, 2011 8:50 AM

I couldn't really figure out why this is a big deal. It's basically a classic substitution cipher with a couple of neat twists (fake letters and colons). Beyond that, the techniques were no different than doing the cryptoquote in the Sunday paper. This is still incredibly cool, don't get me wrong, but I don't understand why they're making a big deal about applying a 'new technique'.
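As a rough illustration of the "cryptoquote" style of attack Ron describes, a first step is usually to count how often each cipher symbol occurs. The sketch below is only that counting step, not the researchers' method; the function name `symbol_frequencies` and the sample transcription are made up for illustration, and a cipher with homophones and nulls would need more than this.

```python
from collections import Counter

def symbol_frequencies(symbols):
    """Return (symbol, count, share) tuples, most frequent symbol first."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return [(sym, n, n / total) for sym, n in counts.most_common()]

# Hypothetical transcription: one token per cipher glyph (invented data).
sample = "x y z x q x y".split()
for sym, n, share in symbol_frequencies(sample):
    print(f"{sym:4} {n:3} {share:.2f}")
```

In a simple substitution cipher the most frequent symbols are candidates for the most frequent plaintext letters; homophones flatten that signal, which is why the count is only a starting point.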
Bill Ricker • October 26, 2011 9:15 AM

The academic website: http://stp.lingfil.uu.se/~bea/copiale/

Interesting features:
* normal alphabetic characters as nulls (homophones for word-break).

The logotypes for words make this a Nomenclator mixed code/cipher. This matches what Kahn would have us expect of practical use in the era, given European diplomatic practices in the Enlightenment. Many such were broken by hand in the 20th century; what is novel here is that the computer helped classify the graphic symbols, and the subject matter, which is not diplomacy.

Preparing a KWIC index facilitates recovering the logotype codewords, to view their word contacts. This is classic trench-code / nomenclator breaking.

Note that two, *lip* and *bigX*, take suffixes of 'rey', which would be 'ry' in modern English. *toe* almost always precedes *nee*.

My best guess so far *bigl* Big_L_Shape and it reads fairly well.

Encoder's errors exist. In one case, *lip* stands for itself, and in a few cases 'mason' is not logotyped, ditto 'society'. *o* might represent a Germanic compound for a specific society name.

Were I the encoder, I would have logotyped 'brother', 'apprentice', 'fellow', 'candidate', 'conduct', 'ceremony', 'cross', 'holy/sacred', 'Scottish' and 'St Andrews' as well. Those words gave context to the logotypes.

David • October 26, 2011 1:40 PM

Voynich? Nah, how about something more "interesting" and practical like the Beale Ciphers ( http://en.wikipedia.org/wiki/Beale_ciphers ), describing buried treasure here in Virginia? I remember going to a talk at one of the local ACM subcommittee meetings in the 80's, given by Dr. Carl Hammer (of UNIVAC), who was trying to break the ciphers and digging up the countryside to find the treasure as a hobby ( http://www.angelfire.com/pro/bealeciphers/... ), but I guess he never found anything (that he'd admit to anyway). One theory was that the two unsolved documents were actually bogus, but as I recall Dr. Hammer's analysis had concluded that their cipher text was not random, so they might actually contain encrypted text. Enjoy!

bill ricker • October 26, 2011 4:54 PM

Regarding the Beale ciphers, a productive member of the retro-cryptanalytic-computing community, J. Gillogly, has statistical evidence that at least one of the Beale documents (B1) is an elaborate hoax.

Jim Gillogly, "THE BEALE CIPHER: A DISSENTING OPINION", October 1980, Cryptologia.

bill ricker • October 26, 2011 5:42 PM

Re Voynich ms., the 'Copiale' technique of automatically classifying glyphs / graphics into equivalence classes to prepare digital text from images might actually be applicable.

The question remains whether the Voynich ms. is a known natural language in a novel script, or a lost language (unlikely?), or a novelty language, or nonsense.

Textual Statistics and Cipher breaking (with computer support) is only really useful if it is indeed a natural language in peculiar 'disguise', as with the Copiale ms.

If it is instead utter nonsense, a sufficient statistical attack (after a Copiale ms. style digital text preparation) might provide evidence for the Voynich ms. being non-structured and badly randomized by an intelligence trying to be random.

If it is a 'new' language (either a novelty language like Klingon or Elvish, or a lost natural language like Etruscan), it's more like code-book breaking, with rather more thought required.
What tools would one use? A combined KWIC and trial annotation script is what I hacked together in minutes for my Nomenclator discussion up-thread. Nothing fancy, quick little Perl scripts or AWK scripts if one is used to text munging. A full assault on language-as-codebook would require more. Adjacency matrix and affix discovery would also be useful 'preparation' to automate. Perhaps Machine Learning techniques can extract the Grammar automatically, albeit unlabelled.

One expects (hopes?) the fantastical diagrams / imagery interspersed with the text would provide known plaintext in the form of probable words, "cribs", for entry into the lexicon. If however the images are unrelated doodles, mere adornment, then the utter lack of context -- no diplomacy or war that can be assumed to be the topic of most encoded cablegrams that aren't pro forma monthly reports -- deprives one of the most usual entries into a code-book.

Sadly, it could be both nonsense and a novelty language, if the hoaxer developed a grammar (which might be recoverable by adjacency matrices) but not a semantics, leaving the grammar as a trap into which the would-be cryptanalyst may pour his own meaning.

Peter E Retep • October 26, 2011 9:20 PM

The problem with interlocking secret societies is: (with apologies to the Inklings of XX)

Bill Ricker • October 26, 2011 9:58 PM

Correction to self: on reading the full 'Copiale' academic paper, I see the 'science writer' summary I saw overstated the amount of automation in the Copiale ms. prep work. The same technique would work for Voynich, but no magic.

BF Skinner • October 27, 2011 7:49 AM

Does centuries-old information still have the same freshness, the same savor and flavor, say, of centuries-old wine?

Dirk Praet • October 27, 2011 10:19 AM

@ BF Skinner

Depends on the wine. I would not recommend coming anywhere near a 200-year-old Beaujolais Nouveau.

Bill Ricker • October 27, 2011 10:07 PM

@nessss "what excactly this book is telling? what is about?"
It's a ceremonial manual for some branch of Freemasonry, from when it was more secretive. Paper watermarks, they say, date it to the mid-17xx's; the 18xx end-paper date is presumed to be just a later acquisition date. It does not specify which Rite or branch it is from, but whatever it is recognizes the Scotch rite as a brotherly rite, so it is not that one. Might be York, might be an antecedent of Shriner, or, given the Germanic locale and ophthalmic imagery, it might be the germ of truth behind the legend of the Bavarian Illuminati. Or a dead-end branch with no modern descendants.

The academic paper notes that a second copy was found in a northern European library. It does not indicate how exact the match was.

The academic website has German and English translations. Up-thread here is a proposed key for the remaining logotypes or keywords in the Nomenclator (= a mixed small code and simple cipher with homophones and nulls, used by Courtiers in pre-modern periods, precursor to the Trench codes of WW1).

(While the code for one logotype is *lip*, given the ophthalmic theme of the initiation ordeal, I would be sure it is an *eye* that codes for their more illuminated branch of Freemasonry, possibly Illuminated Freemasonry?)

TRX • October 29, 2011 8:12 AM

Interesting. After perusing the paper, the cipher appears to be very similar in concept to the one used by the Zodiac Killer a few decades ago.

Sam Peds • October 29, 2011 12:06 PM

After reading the manuscript translation into English, several points seem to stand out.

(2) The obsessive nature of those seeking Secret Ritual Pornography, and its Omerta, is reinforced by the statement that if the initiate "breaks the rules against disclosure" he, or she, will be subject to sexual humiliation by the society, which is reminiscent of the Story of O.
(3) The real business of the Secret Society is revealed in the gang sign and street communications rules, taught to rocker novices, and also taught and practiced differently to full patch members, which conform to the practice of current street gangs, and the pre-cautioned use of controlled Q & A, like that of Con Men. Further, they describe a store devoid of any marking, a precursor to the Big Store a century later.

(4) Then there is the only true test of a novice, in their ability to assess and spot other native cons, as opposed to bringing in either an agent of a ruler, or a cop, or another gang member, or someone who simply blabs. Then both are shut out forever, as incommunicados.

Conclusion: These sound like German mowhawks and street cons.

Peter E Retep • October 29, 2011 3:16 PM

Ah, one of the seven grails of cryptography:

Bill Ricker • November 16, 2012 3:55 PM

Wired's Danger Room has an update a year later, with a report on other related documents as well as backstory. (I'm rather happy it confirms several of my speculations from a year ago, above.)
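Bill Ricker's comments up-thread mention preparing a KWIC (keyword-in-context) index to study the word contacts of each logotype, hacked together in Perl or AWK. His scripts are not shown, so what follows is only a minimal Python sketch of the general idea; the function name `kwic`, the `width` parameter, and the sample sentence are all invented for illustration.

```python
def kwic(tokens, keyword, width=3):
    """Return one (left, keyword, right) context triple per occurrence of keyword."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]      # up to `width` tokens before
            right = tokens[i + 1:i + 1 + width]     # up to `width` tokens after
            rows.append((" ".join(left), tok, " ".join(right)))
    return rows

# Invented sample: plaintext words mixed with an undeciphered *logotype* token.
text = "the master asks the *lip* to rise and the *lip* answers him".split()
for left, kw, right in kwic(text, "*lip*"):
    print(f"{left:>20} | {kw} | {right}")
```

Lining up every occurrence of a logotype with its neighbours makes recurring contexts easy to spot, which is the kind of evidence used to guess what the codeword stands for.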
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.