Experimental Results on English Stegoscripts


As a first test, we applied WordSpy to deciphering a stegoscript (about 156K letters) in which the first ten chapters (about 112K letters) of the novel Moby Dick are embedded. The stegoscript was created by the Siggia Lab. The original text and the stegoscript can be downloaded from their website:

http://www.physics.rockefeller.edu/~siggia/projects/mobydick/Chap1-10.txt
http://www.physics.rockefeller.edu/~siggia/projects/mobydick/Randomized-chap1-10.txt

Our goal is to recover as much of the original text as possible, as accurately as possible. To evaluate prediction performance, we consider a word to be correctly predicted if it matches at least half of its original word. We measure performance by the true positive rate (TPR), which is the percentage of correctly predicted words among all predicted words, and the false positive rate (FPR), which is the percentage of false words among all predicted words. We ran WordSpy on the Moby Dick stegoscript with different Z-score thresholds. The results are available at the following links.
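The evaluation criterion can be illustrated with a minimal sketch. It is not the script used for the experiments, and it rests on assumptions the text does not spell out: words are represented as half-open character spans in the stegoscript, and "matches at least half" is read as "the overlap covers at least half the length of some original word". The names `Span`, `overlap`, `is_correct`, and `score` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a word is a half-open span [start, end) of character
# offsets in the stegoscript.
Span = Tuple[int, int]


def overlap(a: Span, b: Span) -> int:
    """Number of characters shared by two spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def is_correct(pred: Span, truth: List[Span]) -> bool:
    """Assumed reading of the criterion: a prediction is correct if it
    covers at least half of some original word."""
    return any(2 * overlap(pred, t) >= (t[1] - t[0]) for t in truth)


def score(predicted: List[Span], truth: List[Span]) -> Tuple[float, float]:
    """Return (TPR, FPR) as percentages of all predicted words,
    following the definitions given in the text."""
    if not predicted:
        return 0.0, 0.0
    correct = sum(is_correct(p, truth) for p in predicted)
    total = len(predicted)
    tpr = 100.0 * correct / total
    fpr = 100.0 * (total - correct) / total
    return tpr, fpr


# Toy example: two original words, four predictions.
truth = [(0, 4), (10, 15)]
predicted = [(0, 4), (11, 14), (20, 23), (13, 17)]
tpr, fpr = score(predicted, truth)
```

Because both rates are taken over the same set of predicted words, they always sum to 100% under these definitions. A quadratic scan over all original words is fine for a toy example; for a text of this size, sorting the spans first would be the natural refinement.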

Summarized results

The complete output of experiments.