Monday, 24 June 2013
Saturday, 22 June 2013
A couple of years ago, I came across a new technique for analysing documents, developed by Marcello Montemurro and Damian Zanette. It identifies the most significant words in a document by the entropy of their distribution in the text. I tried it out on subtitles at the BBC, and got promising early results.
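To give a flavour of the idea, here is a rough sketch in Python. It is a simplified illustration of my own, not Montemurro and Zanette's actual formulation: I assume the text is cut into a fixed number of equal parts, measure the entropy of each word's spread across those parts, and compare it with the average entropy the same word gets when the text is randomly shuffled. The names `part_entropy` and `significance`, and the parameters `parts` and `shuffles`, are made up for this example.

```python
import math
import random
from collections import Counter


def part_entropy(tokens, word, parts):
    """Shannon entropy (in bits) of how `word` is spread over equal-sized parts."""
    size = math.ceil(len(tokens) / parts)
    # Count the occurrences of the word in each part of the text.
    counts = Counter(i // size for i, t in enumerate(tokens) if t == word)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def significance(tokens, parts=4, shuffles=20, seed=0):
    """Score words by how much more clustered they are than in shuffled text.

    Simplified assumption: score = relative frequency multiplied by
    (mean entropy over shuffled texts - entropy in the real text).
    """
    rng = random.Random(seed)
    freq = Counter(tokens)
    # Words that occur only once carry no information about distribution.
    words = [w for w, n in freq.items() if n > 1]
    actual = {w: part_entropy(tokens, w, parts) for w in words}
    baseline = dict.fromkeys(words, 0.0)
    shuffled = list(tokens)
    for _ in range(shuffles):
        rng.shuffle(shuffled)
        for w in words:
            baseline[w] += part_entropy(shuffled, w, parts) / shuffles
    return {w: freq[w] / len(tokens) * (baseline[w] - actual[w]) for w in words}


# Toy text: "whale" is bunched up at the start, "the" is spread evenly.
tokens = ["whale", "the"] * 10 + ["sea", "the"] * 30
scores = significance(tokens)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

A word that turns up evenly throughout the text, like "the" in the toy example, looks much the same after shuffling and scores close to zero, while a word that clusters in particular sections scores higher. Weighting by frequency is my own shortcut to stop rare words dominating the ranking.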
Now Dr Montemurro has applied the technique to the infamous Voynich Manuscript, and discovered that it appears to contain a meaningful language, rather than gibberish. No news yet as to what any of it might mean, but hopefully my own efforts to uncover the syntax with a Hidden Markov Model might eventually bear fruit. I'm convinced it's a conlang.