Uncle Sam, Cipher Wizard (1917 article)
"Uncle Sam, Cipher Wizard", Literary Digest 55: 46-51, Nov. 3, 1917
UNCLE SAM, CIPHER WIZARD
THE detective work accomplished by the United States Government since its entry into the war has been worthy of a Sherlock Holmes, and yet few persons, reading only the results of this remarkably developed system, have realized that a Government heretofore finding it unnecessary to match wits with foreign spy bureaus has suddenly taken a high rank in this unpleasant but absolutely essential branch of war-making — as it has in all others. The public read of the intercepted dispatches from the Argentine to Germany by way of Sweden, and of the Bernstorff messages, but without a realization of the problem that a cipher dispatch presents to one who has not the key. And probably the average reader is unaware that, in both the Army and Navy, experts have been trained to decipher code messages, with the result that both the making and the reading of such dispatches have been reduced to an almost mathematical science. The Philadelphia Press, in outlining the instruction given in this important work at the Army Service schools, says:
What is taught the military will furnish an idea of the task of the code experts in the State Department, and of the basis of the science that has unmasked the German plans with respect to vessels to be spurlos versenkt and of legislators to be influenced through the power of German gold.
"It may as well be stated," says Capt. Parker Hitt (that is, he was a captain of infantry when he said it), "that no practicable military cipher is mathematically indecipherable if intercepted; the most that can be expected is to delay for a longer or a shorter time the deciphering of the message by the interceptor."
The young officer is warned that one doesn't have to rely in these times upon capturing messengers as they speed by horse from post to post. All radio messages may be picked up by every operator within the zone, and the interesting information is given that if one can run a fine wire within 100 feet of a buzzer-line or within thirty feet of a telegraph-line, whatever tidings may be going over those mediums may be copied by induction.
In order that the student may not lose heart, it is pointed out in the beginning that many European Powers use ciphers that vary from extreme simplicity to "a complexity which is more apparent than real." And as to amateurs, who make up ciphers for some special purpose, it's dollars to doughnuts that their messages will be read just as easily as tho they had printed them in box-car letters.
At every headquarters of an army the Intelligence Department of the General Staff stands ready to play checkers with any formidable-looking document that comes along in cipher, and there is mighty little matter in code that stands a ghost of a chance of getting by.
The scientific dissection of ciphers starts with the examination of the general system of language communication, which, with everybody excepting friend Chinaman, is an alphabet composed of letters that appear in conventional order.
It was early found by the keen-eyed gentlemen who analyzed ciphers that if one took ten thousand words of any language and counted the letters in them the number of times that any one letter would recur would be found practically identical with their recurrence in any other ten thousand words. From this discovery the experts made frequency tables, which show just how many times one may expect to find a letter e or any other letter in a given number of words or letters. These tables were made for 10,000 letters and for 200 letters so that one might get an idea how often to expect to find given letters in both long and short messages or documents.
Thus we find the following result:
|Letter||Per 10,000 letters||Per 200 letters|
|A||778||16|
|B||141||3|
|C||296||6|
|D||402||8|
|E||1277||26|
|F||197||4|
|G||174||3|
|H||595||12|
|I||667||13|
|J||51||1|
|K||74||2|
|L||372||7|
|M||288||6|
|N||686||14|
|O||807||16|
|P||223||4|
|Q||8||..|
|R||651||13|
|S||622||12|
|T||855||17|
|U||308||6|
|V||112||2|
|W||176||3|
|X||27||..|
|Y||196||4|
|Z||17||..|
It is found that in any text the vowels A E I O U represent 38.37 per cent.; the consonants L N R S T represent 31.86 per cent., and the consonants J K Q X Z stand for only 1.77 per cent. One doesn't want to shy away from these figures as being dry and dull, because they form part of a story as interesting as any detective narrative that was ever penned by a Conan Doyle.
For the usual purposes of figuring a cipher the first group is given the value of 40 per cent., the second 30 per cent., and the last 2 per cent. And then one is introduced to the order of frequency in which letters appear in ordinary text. It is:
E T O A N I R S H D L U C M P F Y W G B V K J X Z Q.
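The three group percentages and the order of frequency both follow directly from the counts per 10,000 letters. A minimal sketch in Python, taking the counts from the table above (with 51 read as the count for J); the names `COUNTS`, `group_percent`, and `frequency_order` are made up for this illustration:

```python
# Counts per 10,000 letters, copied from the frequency table in the article.
COUNTS = {
    "A": 778, "B": 141, "C": 296, "D": 402, "E": 1277, "F": 197,
    "G": 174, "H": 595, "I": 667, "J": 51, "K": 74, "L": 372,
    "M": 288, "N": 686, "O": 807, "P": 223, "Q": 8, "R": 651,
    "S": 622, "T": 855, "U": 308, "V": 112, "W": 176, "X": 27,
    "Y": 196, "Z": 17,
}


def group_percent(letters, counts=COUNTS):
    """Share of the listed letters as a percentage of all counted letters."""
    total = sum(counts.values())
    return 100 * sum(counts[c] for c in letters) / total


def frequency_order(counts=COUNTS):
    """Letters sorted from most frequent to least frequent."""
    return "".join(sorted(counts, key=counts.get, reverse=True))


print(round(group_percent("AEIOU"), 2))  # 38.37
print(round(group_percent("LNRST"), 2))  # 31.86
print(round(group_percent("JKQXZ"), 2))  # 1.77
print(frequency_order())                 # ETOANIRSHDLUCMPFYWGBVKJXZQ
```

Sorting the table by count reproduces the order of frequency quoted above letter for letter.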
Tables are then made for kinds of matter that is not ordinary, taken from various kinds of telegraphic and other documents, which will alter only slightly the percentage values of the letters as shown in a table from ordinary English.
Having gone along thus far, the expert figures how many times he can expect to find two letters occurring together. These are called digraphs, and one learns that AH will show up once in a thousand letters, while HA will be found twenty-six times. These double-letter combinations form a separate table all of their own, and the common ones are set aside, as TH, ER, ON, OR, etc., so they can be readily guessed or mathematically figured against any text.
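Counting digraphs is mechanical once a text is reduced to its letters. The sketch below is only an illustration: the function name `digraph_counts` is hypothetical, and it assumes that pairs are counted straight across word breaks, a detail the article does not specify.

```python
from collections import Counter


def digraph_counts(text):
    """Count adjacent letter pairs, ignoring case, spaces, and punctuation."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


counts = digraph_counts("The other hat")
print(counts["TH"], counts["HA"], counts["AH"])  # 2 1 0
```

Run over a long enough sample, a tally of this kind would be expected to show the lopsidedness the article describes, with HA far commoner than AH.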
Tables of frequency are figured out for the various languages — particularly German — and the ciphers are divided into two chief classes, substitution and transposition. The writer in The Press says:
Now you will remember those percentages of vowels and consonants. Here is where they come in. When a message is picked up the army expert counts the times that the vowels recur, and if they do not check with the 40 per cent. for the common vowels, with the consonant figures tallying within 5 per cent. of the key, he knows that he is up against a substitution cipher. The transposition kind will check to a gnat's heel.
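The check described here can be sketched as follows. The 40, 30, and 2 per cent. figures and the 5 per cent. margin come from the text; the function names, and the choice to apply the same margin to all three groups, are assumptions made for the illustration.

```python
def group_percentages(message):
    """Percentages of AEIOU, LNRST, and JKQXZ among the letters of a message."""
    letters = [c for c in message.upper() if "A" <= c <= "Z"]

    def pct(group):
        return 100 * sum(c in group for c in letters) / len(letters)

    return pct("AEIOU"), pct("LNRST"), pct("JKQXZ")


def looks_like_transposition(message, tolerance=5.0):
    """True when all three groups fall within the margin of 40, 30, and 2."""
    vowels, common, rare = group_percentages(message)
    return (abs(vowels - 40) <= tolerance
            and abs(common - 30) <= tolerance
            and abs(rare - 2) <= tolerance)
```

A message with no letters at all would raise a division error, and on very short messages the percentages are too coarse to be trusted, so this is a first sorting test rather than a proof.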
When the expert knows exactly what he is up against he is ready to apply the figures and patiently unravel the story. It may take him hours, and maybe days, but sooner or later he will get it to a certainty.
If he has picked up a transposition fellow he proceeds to examine it geometrically, placing the letters so that they form all sorts of squares and rectangles that come under the heads of simple horizontals, simple verticals, alternate horizontals, alternate verticals, simple diagonals, alternate diagonals, spirals reading clockwise and spirals reading counter-clockwise. Once one gets the arrangement of the letters the reading is simple.
For instance, ILVGIOIAEITSRNMANHMNG comes along the wire. It doesn't figure for a substitution cipher and you try the transposition plan. There are twenty-one letters in it, and the number at once suggests seven columns of three letters each. Try it on your piano:
I L V G I O I
A E I T S R N
M A N H M N G
And reading down each column in succession you get "I am leaving this morning." After passing over several simple ciphers as not "classy" enough to engage the reader's attention, the writer takes up one of a much more complicated nature which, however, did not get by Uncle Sam's code wizards. Follow the deciphering of this example by Captain Hitt:
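The reading of the twenty-one-letter example can be reproduced in a few lines. The sketch assumes the simplest of the arrangements listed earlier, with the message written in horizontally and read off vertically; `read_down_columns` is a hypothetical name.

```python
def read_down_columns(ciphertext, columns):
    """Write the text in rows of the given width, then read each column downward."""
    rows = [ciphertext[i:i + columns] for i in range(0, len(ciphertext), columns)]
    return "".join(row[c] for c in range(columns) for row in rows if c < len(row))


print(read_down_columns("ILVGIOIAEITSRNMANHMNG", 7))
# IAMLEAVINGTHISMORNING
```

Trying other widths that divide the length, such as three columns of seven, is how one would hunt for the arrangement when it is not known in advance.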
He began with an advertisement which appeared in a London newspaper, which read as follows:
"M. B. Will deposit £27 14s. 5d. to-morrow."
The next day this advertisement in cipher appeared:
"M. B. CT OSB UHGI TP IPEWF H CEWIL
NSTTLE FJNVX XTYLS FWKKHI BJLSI SQ
VOI BKSM XMKUL SK NVPONPN GSW OL
IEAG NPSI HYJISFZ CYY NPUXQG TPRJA
VXMXI AP EHVPPR TH WPPNEL. UVZUA
MMYVSF KNTS ZSZ UAJPQ DLMMJXL JR RA
PORTELOGJ CSULTWNI XMKUHW XGLN
ELCPOWY OL. ULJTL BVJ TLBWTPZ XLD K
ZISZNK OSY DL RYJUAJSSGK. TLFNS UVN
VV FQGCYL FJHVSI YJL NEXV PO WTOL
PYYYHSH GQBOH AGZTIQ EYFAX YPMP SQA
CI XEYVXNPPAII UV TLFTWMC FU WBWX-
GUHIWU. AIIWG HSI YJVTI BJV XMQN
SFX DQB LRTY TZ QTXLNISVZ. GIFT AII
UQSJGJ OHZ XFOWFV BKI CTWY DSWTL-
TTTPKFRHG IVX QCAFV TP DIIS JBF ESF JSC
MCCF HNGK ESBP DJPQ NLU CTW ROSB CSM."
Now just off-hand, the average man would shy away from this combination as a bit of news that he really did not care to read. But to the cipher fiend it was a thing of joy, and it illustrates one of the many cases that they are called upon to read, and the methods by which they work.
As a starting-point the cipher-man assumed that the text was in English because he got it out of an English newspaper, but he did not stop there. He checked it from a negative view-point by finding the letter w in it, which does not occur in the Latin languages, and by finding that the last fifteen words of the message had from two to four letters each, which would have been impossible in German.
Then he proceeds to analyze. The message has 108 groups that are presumably words, and there are 473 letters in it. This makes an average of 4.4 letters to the group, whereas one versed in the art normally expects about five. There are ninety vowels of the AEIOU group and seventy-eight letters JKQXZ. Harking back to that first statement of percentages, it is certain that this is a substitution cipher because the percentage does not check with the transposition averages.
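The arithmetic behind this verdict is short enough to set down. The sketch uses only the four counts given in the paragraph above; the variable names are illustrative.

```python
groups, letters = 108, 473   # groups and letters in the message
vowels, rare = 90, 78        # letters from AEIOU and from JKQXZ

average_group_length = letters / groups  # about 4.4 letters to the group
vowel_percent = 100 * vowels / letters   # about 19, against an expected 40
rare_percent = 100 * rare / letters      # about 16.5, against an expected 2

print(round(average_group_length, 1), round(vowel_percent), round(rare_percent, 1))
# 4.4 19 16.5
```

Both percentages lie far outside the 5 per cent. margin, which is why a transposition is ruled out.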
The canny man with the sharp pencil then looks for recurring groups and similar groups in his message and he finds that they are:
AIIWG AII BKSM BKAI CT CTWY
CTW DLMMJXL DL ESF ESBP FJNVX
FJHVSI NPSI NPUXQG OSB OSY
ROSB OL OL PORTELOGJ PO SQ SQA
TP TP TLBWTPZ TLFNS TLFTWMC
UVZUA UVD UV XMKUL XMKUHW
Passing along by the elimination route he refers to his frequency tables to see how often the same letters occur, and he finds that they are all out of proportion, and he can proceed to hunt the key for several alphabets.
He factors the recurring groups like a small boy doing a sum in arithmetic when he wants to find out how many numbers multiplied by each other will produce a larger one. The number of letters between recurring groups and words is counted and dissected in this wise:
|AII ... AII||45, which equals 3×3×5|
|BK ... BK||345, which equals 23×3×5|
|CT ... CT||403, no factors|
|CTW ... CTW||60, which equals 2×2×3×5|
|DL ... DL||75, which equals 3×5×5|
|ES ... ES||14, which equals 2×7|
|FJ ... FJ||187, no factors|
|NP ... NP||14, which equals 2×7|
|OL ... OL||120, which equals 2×2×2×3×5|
|OS ... OS||220, which equals 11×2×2×5|
|OSB ... OSB||465, which equals 31×3×5|
|PO ... PO||105, which equals 7×3×5|
|SQ ... SQ||250, which equals 2×5×5×5|
|TLF ... TLF||80, which equals 2×2×2×2×5|
|TP ... TP||405, which equals 3×3×3×3×5|
|UV ... UV||115, which equals 23×5|
|XMKU ... XMKU||120, which equals 2×2×2×3×5|
|UV ... UV||73, no factors|
|YJ ... YJ||85, which equals 17×5|
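The factoring in the table can be checked mechanically. The sketch below takes the nineteen intervals as printed and asks, for each candidate number of alphabets from 2 to 10, how many intervals it divides evenly; the names `INTERVALS`, `prime_factors`, and `divisor_votes` and the upper limit of 10 are assumptions of the illustration. Note that 403 and 187 are not prime (they are 13×31 and 11×17); the sketch reads the table's "no factors" as meaning no factor shared with the other intervals, which is an interpretation.

```python
# Intervals between recurring groups, in the order given in the table.
INTERVALS = [45, 345, 403, 60, 75, 14, 187, 14, 120, 220,
             465, 105, 250, 80, 405, 115, 120, 73, 85]


def prime_factors(n):
    """Prime factors of n in ascending order, found by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def divisor_votes(intervals, max_period=10):
    """For each candidate period, count the intervals it divides evenly."""
    return {p: sum(i % p == 0 for i in intervals)
            for p in range(2, max_period + 1)}


votes = divisor_votes(INTERVALS)
best = max(votes, key=votes.get)
print(best, votes[best])  # 5 14
```

Five divides fourteen of the nineteen intervals, well ahead of any other candidate in this range, which agrees with the conclusion drawn from the table.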
Now the man who is doing the studying takes a squint at this result and he sees that the dominant factor all through the case is the figure 5, so he is reasonably sure that five alphabets were used, and that the key-word had, therefore, five letters, so he writes the message in lines of five letters each and makes a frequency table for each one of the five columns he has formed, and he gets the following result:
|Col. 1.||Col. 2.||Col. 3.||Col. 4.||Col. 5.|
|A 2||A 9||A 1||A 1||A 2|
|B—||B 3||B 3||B—||B 7|
|C 7||C 1||C 3||C 4||C—|
|D 2||D 2||D 1||D—||D 3|
|E 4||E—||E 2||E 7||E—|
|F 3||F—||F 9||F 3||F 5|
|G 9||G—||G 3||G 2||G 2|
|H 3||H 5||H 3||H 3||H 2|
|I 2||I 2||I 7||I 17||I 2|
|J 5||J 1||J 6||J—||J 9|
|K 6||K 5||K—||K 1||K 1|
|L—||L 19||L 2||L 5||L 1|
|M—||M—||M 7||M 4||M 3|
|N 7||N 3||N 4||N—||N 5|
|O 5||O—||O 9||O 1||O—|
|P 7||P 7||P 8||P 4||P—|
|Q 5||Q—||Q—||Q 2||Q 6|
|R—||R 1||R 1||R 6||R 1|
|S—||S 8||S 6||S 12||S 7|
|T 7||T 3||T 5||T 1||T 14|
|U 7||U 3||U 6||U—||U 1|
|V 5||V—||V 2||V 5||V—|
|W 3||W 4||W—||W 5||W 7|
|X 2||X—||X 4||X 8||X 6|
|Y 4||Y 5||Y—||Y 3||Y 7|
|Z—||Z 5||Z 3||Z—||Z 3|
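Building the five tables amounts to dealing the letters of the message into five piles in turn and counting each pile. A minimal sketch, with the hypothetical name `column_frequencies`, shown here on the opening groups of the message only:

```python
from collections import Counter


def column_frequencies(ciphertext, period):
    """Deal the letters into `period` columns in turn and count each column."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    return [Counter(letters[start::period]) for start in range(period)]


tables = column_frequencies("CT OSB UHGI TP IPEWF", 5)
print(sorted(tables[0].items()))
# [('C', 1), ('F', 1), ('P', 1), ('U', 1)]
```

Applied to the whole message, a tally of this kind is what the five columns of the table record; counts taken from a transcription may differ slightly from the printed ones if any letter has been miscopied.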
Now having erected these five enigmatical columns, Captain Hitt juggles them until he uncovers the hidden message, thus:
"In the table for column 1 the letter G occurs 9 times," he says with an air of a man having found something that is perfectly plain. "Let us consider it tentatively as E.
"Then, if the cipher alphabet runs regularly and in the direction of the regular alphabet, C (7 times) is equal to A, and the cipher alphabet bears a close resemblance to the regular frequency table. Note that T, U, and V (equal to R, S, and T) occur respectively 7, 7, and 5 times, and note the non-occurrence of B, L, M, R, S, Z (equal to Z, J, K, P, Q, and X respectively).
"In the next table L occurs 19 times, and taking it for E with the alphabet running the same way, A is equal to H. The first word of our message, CT, thus becomes AM when deciphered with these two alphabets, and the first two letters of the key are CH.
"Similarly in the third table we may take either F or O for E, but a casual examination shows that the former is correct and A is equal to B.
"In the fourth table I is clearly E and A is equal to E.
"The fifth table shows that T occurs 14 times and J occurs 9 times. If we take J as equal to E then T is equal to O, and in view of the many Es already accounted for in the other columns this may be all right. It checks as correct if we apply the last three alphabets to the second word of our message, OSB, which deciphers NOW. Using these alphabets to decipher the whole message we find it to read:
"'M. B. Am now safe on board a barge moored below Tower Bridge, where no one will think of looking for me. Have good friends but little money owing to action of police. Trust, little girl, you still believe in my innocence altho things seem against me. There are reasons why I should not be questioned. Shall try to embark before the mast in some outward bound vessel. Crews will not be scrutinized as sharply as passengers. There are those who will let you know my movements. Fear the police may tamper with your correspondence, but later on, when hue and cry have died down, will let you know all.' "
It all seems simple to the man who follows the idea closely, but Captain Hitt proceeds to make further revelations of the art. He adds:
"The key to this message is CHBEF, which is not intelligible as a word, but if put into figures, indicating that the 2d, 7th, 1st, 4th, and 5th letter beyond the corresponding letter of the message has been used as a key, it becomes 27145, and we connect it with the personal which appeared in the same paper the day before reading:
"'M. B. Will deposit £27 14s. 5d. to-morrow.'"
This is only one of the many methods for getting under the hide of a coded message that our bright men of the Army and their cousins of the State and Navy departments have worked out through years of study and application.
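For the reader who wishes to retrace Captain Hitt's steps, the whole chain, from the five letters taken as E to the key CHBEF, the opening words of the message, and the figures 27145, can be sketched as follows. The function names are hypothetical, and the sketch assumes what the article describes: each cipher alphabet is the regular alphabet shifted forward, with the five shifts used in rotation.

```python
def key_letter(cipher_letter, plain_letter="E"):
    """Key letter that carries plain_letter to cipher_letter (A means no shift)."""
    shift = (ord(cipher_letter) - ord(plain_letter)) % 26
    return chr(ord("A") + shift)


def decipher(ciphertext, key):
    """Shift each letter back by the key letter for its position; keep other characters."""
    out, i = [], 0
    for c in ciphertext.upper():
        if "A" <= c <= "Z":
            shift = ord(key[i % len(key)]) - ord("A")
            out.append(chr((ord(c) - ord("A") - shift) % 26 + ord("A")))
            i += 1
        else:
            out.append(c)
    return "".join(out)


def key_to_digits(key):
    """Write each key letter as its shift; unambiguous only for shifts below 10."""
    return "".join(str(ord(c) - ord("A")) for c in key.upper())


key = "".join(key_letter(c) for c in "GLFIJ")  # the letters taken as E in columns 1 to 5
print(key)                                     # CHBEF
print(decipher("CT OSB UHGI TP IPEWF", key))   # AM NOW SAFE ON BOARD
print(key_to_digits(key))                      # 27145
```

The fifth column shows why judgment is still needed: its commonest letter is T, not J, so a rule that always took the commonest letter for E would have produced a wrong fifth key letter.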