Monday 24 June 2013

A Couple of my Fantastical Devices

With the recent news about the Voynich Manuscript, as mentioned in my last post, I thought it opportune to share a couple of pieces of code I'd written. First off, as I mentioned earlier, a couple of years ago I wrote a Python implementation of Montemurro and Zanette's algorithm for calculating the entropy of words in documents. If you're interested in using the technique yourself, you may want to have a look. Secondly, my own attempts to uncover the syntax use a Python library for Hidden Markov Models that I created. It probably still has a few bugs in it, but it's attracted a bit of interest online, and I'm hoping to develop it further. So, if you're at all interested in AI, computational linguistics, or analytics, please have a look at these. Feedback is welcome, as is anybody who wishes to contribute further to these projects.

Saturday 22 June 2013

Exciting Voynich Manuscript News

A couple of years ago, I came across a new technique for analysing documents, developed by Marcello Montemurro and Damian Zanette. It identifies the most significant words in a document by the entropy of their distribution in the text. I tried it out on subtitles at the BBC, and got promising early results.
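To give a feel for what "entropy of a word's distribution" means, here is a deliberately simplified sketch. It is not Montemurro and Zanette's published method, only the basic ingredient: split the text into equal-sized parts and measure how evenly each word is spread across them. The function name `word_entropies` and the `parts` parameter are made up for this illustration.

```python
import math
from collections import Counter, defaultdict


def word_entropies(words, parts=4):
    """Return {word: entropy in bits of its occurrences across the parts}.

    Hypothetical helper for illustration only. A word spread evenly over
    the text gets a high entropy; a word clustered in one part gets zero.
    """
    size = math.ceil(len(words) / parts)  # number of words per part
    counts = defaultdict(Counter)         # word -> {part index: occurrences}
    for position, word in enumerate(words):
        counts[word][position // size] += 1

    entropies = {}
    for word, per_part in counts.items():
        total = sum(per_part.values())
        # Shannon entropy: sum of p * log2(1/p) over the parts the word appears in
        entropies[word] = sum(
            (n / total) * math.log2(total / n) for n in per_part.values()
        )
    return entropies


text = "the rune the rune the herb the herb".split()
entropies = word_entropies(text, parts=2)
print(entropies["the"])   # 1.0, spread evenly over both halves
print(entropies["rune"])  # 0.0, found only in the first half
```

In this toy text "the" turns up everywhere and tells you nothing about where you are, while "rune" and "herb" each stick to one half. Words that cluster like that are the ones most likely to be tied to a topic.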

Now Dr Montemurro has applied the technique to the infamous Voynich Manuscript, and discovered that it appears to contain a meaningful language, rather than gibberish. No news yet as to what any of it might mean, but hopefully my own efforts to uncover the syntax with a Hidden Markov Model might eventually bear fruit. I'm convinced it's a conlang.

Monday 17 June 2013

Wherein I Announce my Candidacy

I've spent most of the last two weeks job hunting. There's plenty going in my line of work (artificial intelligence and big data), but I do wonder if it would be a good idea to make a new start and try something completely different. As it happens, a post's just become available that would be right up my street: Doctor Who. There seem to be a lot of reasons why I'd be a good fit for the part. For one thing, I've only ever met one person who loves the show more than me. Second, I've got the right sort of English Eccentric vibe to me, and could pull off what, in honour of the fourth Doctor, I call Tomfoolery like nobody's business. My looks fit as well - nearly good looking but a little odd, with angular features and deep-set, bright blue eyes that give me an intense air. At 39, I'm not too old for the part. I have two children, so I can relate to the younger viewers. With an academic background in physics and experience of artificial intelligence, I can do the best technobabble in the known universe. I can not only reverse the polarity of the neutron flow, I could tell you how to polarize your neutron flow to start with. I even have the title "Doctor". My old college's motto was eadem mutata resurgo, "I rise again, the same but changed", which is very appropriate for The Doctor. Indeed, I can only think of one drawback to me playing The Doctor, and that's that my last acting experience was as Loony Bergonzi in a college production of Bugsy Malone 20 years ago. If you're reading, Mr Moffat, get in touch...

Wednesday 3 April 2013

My Philosophy of Conlanging

A few weeks ago, on Twitter, Wm Annis of the Conlangery Podcast pondered whether he should add ideophones to an existing conlang or create a new one. I said "Let your conlang decide." This remark deserves a bit of explanation.

I think that a good conlang should feel like it has a mind of its own. Yes, Khangaþyagon is my creation, and I can technically do what I like with it, but whatever I do has to feel like it naturally belongs to the language. So, to start with, I don't use word generation software. I'm a committed handcrafter of vocabulary, because I have to feel I've got the right match of sound to meaning. Sometimes the word comes first, sometimes the meaning, but whichever way round it is, it has to mean what Khangaþyagon wants it to mean.

A good example is "oplen". When I first thought of it, I thought it was a verb. I'd worked out what the correct form and sense of the present participle were (Khangaþyagon verbal nouns have quirks), but I didn't have a suitable meaning (it was meant to be something to do with travel). So I slept on it, and in the morning I had the answer. "oplen" wasn't a verb at all. It's a noun, and it means "glade".

If you want to work this way (and it won't be to everybody's tastes) it's important to internalise your language's phonaesthetics. When I started work on Khangaþyagon, my wife and I were doing an evening class on History of Art. Under the influence of Wassily Kandinsky, I decided that all wizards should be synaesthetes. This set me an interesting challenge, as I'm not one myself. You should have seen me, when I was making up names for herbs and spices, going round my kitchen, sniffing at jars, trying to fit sounds to scents. I'm particularly proud of those words.

Another word I'm proud of is "dapt-" which means "be the weather". I had been thinking that Khangaþyagon would express the weather with phrases like "The sun shines", "the rain falls", "the wind blows" etc. However, during Lexember, I came up with this verb, which fits in much better with the character of Khangaþyagon. Given that Khangaþyagon is a magical language, just think of the possibilities of the first and second persons.

Khangaþyagon can reject words too. Early on, I coined egorigik and namassateus, but Khangaþyagon didn't want them. If your conlang does, it can have them.

It's not just about the words. A language's personality should pervade every aspect of its grammar. You remember I said that verbal nouns had quirks? I started Khangaþyagon by taking a runic inscription from an Anglo-Saxon ring, ærkriuflt kriariþon glæstæpontol, and parsing it as "Let the bleeding be healed by conjuration." This gave me two forms, on and ont, for what I loosely call the present participle. However, I later created words for which these forms acted more like agent nouns. Which form and which sense go with a particular verb are lexically determined, and they don't correlate. This started out as a mistake, but I liked it so I kept it.

The segunak "ut" means "at" or "exact location". "omb" means "around". So why does the combination "utomb" mean "made of"? Your guess is as good as mine.

Does the verb "to be" even have a passive in any language? The passive of "to be" is Khangaþyagon's way of expressing "there is". It seems to work.

You might say that all this talk of letting your language think for itself is all a flight of fancy. After all, aren't I the one making all these decisions in the end? But creating a language is a flight of fancy to start with, one that you have to be fully involved in to undertake successfully. This is a flight of fancy that I've been on for over 10 years. Khangaþyagon is part of me now.

Sunday 17 February 2013

Literally Vikings

One thing that quite a lot of people get annoyed about is using "literally" to mean "figuratively". Fortunately, these Vikings know how to use the word correctly.

Watch "Horrible Histories - Literally: The Viking Song" on YouTube

(Of course, they're not literally Vikings. They're really actors playing Vikings in Horrible Histories, but you knew that, didn't you?)

I'm off to York for a few days, and it just happens that the Viking Festival is on, so I might post some more Viking-related stuff.

Wednesday 13 February 2013

Whosoever shall draw…

My DVD player broke down, and I had to take it apart to retrieve a disc that was stuck in it.

It turned out to be The Sword In The Stone.

Saturday 9 February 2013

Custom Sorting For Conlangs again

I've just revised that Python code I posted a while back for sorting lists of strings in customized alphabetical orders. I realized that it would be more efficient to implement it as a key function (which is evaluated once per item) rather than a cmp function (which is evaluated for each pair of items). Fortunately Python compares lists element by element, in much the same way as it compares strings, which makes this possible.
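class CustomSorter(object):
    """Key function for sorting words by a custom alphabet.

    The alphabet is a list of letters in sort order. Letters may be
    more than one character long (digraphs and so on).
    """
    def __init__(self, alphabet):
        self.alphabet = alphabet

    def __call__(self, word):
        # The key is a list of alphabet indices, one per letter of the word
        head, tail = self.separate(word)
        if head is None:
            return []
        key = [self.alphabet.index(head)]
        if len(tail):
            key.extend(self(tail))
        return key

    def separate(self, word):
        # Skip leading characters that don't start any letter of the alphabet
        candidates = self.Candidates(word)
        while candidates == [] and word:
            word = word[1:]
            candidates = self.Candidates(word)
        if candidates == []:
            # Nothing left to match, so don't loop forever on an empty string
            return None, ''
        # Take the longest matching letter, so digraphs beat single characters
        candidates.sort(key=len)
        head = candidates.pop()
        tail = word[len(head):]
        return head, tail

    def Candidates(self, word):
        return [letter for letter in self.alphabet if word.startswith(letter)]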
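
To show the key-function idea in action, here is a compact standalone sketch of the same approach: turn each word into a list of alphabet indices, always taking the longest matching letter, and let Python's list comparison do the rest. The function name `custom_key` and the little alphabet are invented for the example, and it assumes every letter in the alphabet is a non-empty string.

```python
def custom_key(word, alphabet):
    """Turn a word into a list of alphabet indices (hypothetical helper)."""
    # Try longer letters first so that a digraph beats its first character
    letters = sorted(alphabet, key=len, reverse=True)
    key = []
    i = 0
    while i < len(word):
        for letter in letters:
            if word.startswith(letter, i):
                key.append(alphabet.index(letter))
                i += len(letter)
                break
        else:
            i += 1  # skip characters that are not in the alphabet
    return key


# Made-up alphabet in which "th" is a single letter sorting after "t"
alphabet = ["a", "e", "o", "s", "t", "th"]
words = ["toe", "the", "tea", "sat", "oath"]

print(custom_key("the", alphabet))  # [5, 1]
print(sorted(words, key=lambda w: custom_key(w, alphabet)))
# ['oath', 'sat', 'tea', 'toe', 'the']
```

Note that "the" comes after "toe", because its first letter is "th" rather than "t". An ordinary string sort would put them the other way round.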