Wednesday, 17 June 2015

The Bootstrap Problem

A post on Data Community DC discusses Why You Should Not Build a Recommendation Engine. The main point is that recommendation engines need a lot of data to work properly, and you're unlikely to have that when you start out.

I know the feeling. In a previous job I created a recommendation engine for a business communication system. It used tags on the content and user behaviour to infer the topics that the user was most likely to be interested in, and recommend content accordingly. Unfortunately, my testbed was my employer's own instance of the product, and the company was a start-up that was too small to need its own product. I never really got a handle on how well it worked.

This brings me to Emily. Emily isn't a product. It's a personal portfolio project. I had an idea for a recommendation system that would infer users' interests from content they posted in blogs, and recommend similar content. The problem is, the content it recommends comes from the other users, so at its current early stage of operation, it doesn't have much to recommend. The more people use it, the better it will become, but what's the incentive to be an early adopter?

What I seem to have at the moment is a recommendation engine that needs somebody to recommend it.

Tuesday, 9 June 2015

Emily Has Moved

As those of you who've tried out my semantic recommendation system, Emily, will have noticed, it didn't work. The reason was that I'd used the wrong cloud platform. Google App Engine isn't meant for anything that needs as much computation as Emily does, so I've ported Emily to OpenShift. This has the advantage that it gives me much more control over how I write the code, and I can use things like MongoDB and multiprocessing. Let's try this again!

Thursday, 4 June 2015

Developing Emily - Revision 24: Porting to OpenShift. AppEngine wasn't suitable for the computationally intense

Changed Paths:
    Modify    /trunk/Emily.py
    Modify    /trunk/EmilyBlogModel.py
    Modify    /trunk/EmilyTreeNode.py
    Modify    /trunk/emily.js

Porting to OpenShift. AppEngine wasn't suitable for the computationally intense parts of Emily.

from Subversion commits to project emily-found-a-thing on Google Code http://ift.tt/1G9GWoV
via IFTTT

Tuesday, 26 May 2015

Introducing Emily - my latest Fantastical Device

Emily is a semantic recommendation system for blogs that I've been working on. If you give it an Atom or RSS feed from a blog, it will create a feed of items from other blogs that hopefully match your interests.

It does this by using significant associations between words to infer your interests. Suppose a randomly-chosen sentence from your blog has a probability P(A) of containing word A, and a probability P(B) of containing word B. If there were no relationship between the words, we would expect the probability of a sentence containing both words to be P(AB) = P(A)P(B). If there is significant information contained in the relationship between the words, they will co-occur more frequently than this, and we can quantify this with an entropy, H = log2 P(AB) - log2 P(A) - log2 P(B). H is zero when the words are independent and positive when they co-occur more often than chance would predict.
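To make the formula concrete, here is a minimal sketch of how H could be estimated from a list of sentences. It is not Emily's actual code: the function name `association_strengths`, the whitespace tokenisation and the lower-casing are all assumptions made for illustration, and probabilities are estimated simply as the fraction of sentences containing a word or pair of words.

```python
import math
from collections import Counter
from itertools import combinations


def association_strengths(sentences):
    """Return {(word_a, word_b): H} for every word pair that shares a sentence.

    H = log2 P(AB) - log2 P(A) - log2 P(B), with each probability estimated
    as the fraction of sentences that contain the word (or both words).
    Hypothetical helper: tokenisation here is a plain whitespace split.
    """
    total = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        # Use a set so a word counts once per sentence; sort for stable pair keys.
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    strengths = {}
    for (a, b), both in pair_counts.items():
        p_ab = both / total
        p_a = word_counts[a] / total
        p_b = word_counts[b] / total
        strengths[(a, b)] = math.log2(p_ab) - math.log2(p_a) - math.log2(p_b)
    return strengths


sentences = ["cats chase mice", "cats sleep", "dogs chase cats", "dogs sleep"]
strengths = association_strengths(sentences)
```

Pairs that never share a sentence are left out altogether, since log2 of zero is undefined; a real system would presumably also want some test of significance before trusting an association seen only once or twice.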

Emily uses the strengths of these associations to calculate the similarity between two blogs. Then, if you post an article that makes your blog more similar to somebody else's blog than it was before, that article is recommended to them.
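The recommendation rule can be sketched in a few lines. The post doesn't say which similarity measure Emily uses, so the cosine similarity below is purely an assumption chosen for illustration, as are the names `cosine_similarity` and `should_recommend` and the representation of a blog as a dictionary mapping word pairs to association strengths.

```python
import math


def cosine_similarity(model_a, model_b):
    """Cosine similarity between two {word_pair: strength} dictionaries.

    Assumed measure, for illustration only; not necessarily what Emily uses.
    """
    shared = set(model_a) & set(model_b)
    dot = sum(model_a[key] * model_b[key] for key in shared)
    norm_a = math.sqrt(sum(value * value for value in model_a.values()))
    norm_b = math.sqrt(sum(value * value for value in model_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_recommend(author_before, author_after, reader):
    """True if the new article moved the author's blog closer to the reader's."""
    before = cosine_similarity(author_before, reader)
    after = cosine_similarity(author_after, reader)
    return after > before


# Hypothetical models: the new article adds an association the reader shares.
reader = {("recommendation", "system"): 1.0}
author_before = {("doctor", "tardis"): 1.0}
author_after = {("doctor", "tardis"): 1.0, ("recommendation", "system"): 1.0}
recommend = should_recommend(author_before, author_after, reader)
```

Comparing the similarity before and after a post, rather than looking at the absolute similarity, means an article is only recommended when it is the thing that brought the two blogs closer together.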

This has been an interesting project for me. I've learned about Google App Engine, PubSubHubbub and Atom. What I need now is for people to try it out. I'm looking forward to when Emily starts finding things for me.

Thursday, 21 May 2015

Developing Emily - Revision 23: Ready to launch

Changed Paths:
    Modify    /trunk/Emily.py
    Modify    /trunk/EmilyBlogModel.py
    Modify    /trunk/EmilyTreeNode.py
    Add    /trunk/emily.js

Ready to launch

from Subversion commits to project emily-found-a-thing on Google Code http://ift.tt/1IN7SNv
via IFTTT

Thursday, 15 January 2015

Alan Fridge

"From now on all rumours must be attributed to Alan Fridge!! BBC mole, Cardiff insider—Alan Fridge!!!"
—Steven Moffat (personal friend of Alan Fridge), Outpost Gallifrey Forums, 6 August 2007

Last year, a tabloid newspaper published a rumour that Jenna Coleman (who plays Clara) was leaving Doctor Who. It was, of course, complete rubbish, but Jenna was quick to make it clear that she wasn't going to answer the question either way, since it was a goldmine of free publicity - something that the rest of the cast, crew and publicity department got on board with. Just before Christmas, when the fact that Jenna was staying couldn't be kept secret any longer, the rumourmonger tried to save face by claiming that she'd had a last-minute change of heart, and that the ending of _Last Christmas_ had been hastily rewritten to accommodate this. However, the ending certainly didn't look tacked-on.

So who is Alan Fridge? My theory is that he's a low-ranking member of the production team, a runner or somebody like that. He's around a bit during filming, and picks up things like the row between Clara and The Doctor in _Kill the Moon_, or the old Clara scene in _Last Christmas_, but he doesn't have the big picture. He leaks information to the tabloids to make himself feel important, and probably for a kickback.

Monday, 3 November 2014

Orpheus in the TARDIS

As The Doctor noted in Dark Water, almost every culture has legends of an afterlife, and throughout this season we have seen Missy and her assistant Seb welcoming various characters to it. But which culture's afterlife is it? Despite Missy referring to it as "The Promised Land" or "Heaven", it's not any contemporary religion's paradise. "The Nethersphere" is a more apt name, as it seems to be based on the ancient Greek Underworld.

The plot of Dark Water parallels the Greek myth of Orpheus and Eurydice. Clara takes the role of Orpheus, trying to recover Danny from the Underworld. In some versions of the Orpheus myth, Orpheus is unwittingly responsible for Eurydice's death, as the fact that his music tames all wild beasts has left Eurydice unafraid of snakes. Clara is unwittingly responsible for Danny's death, since her phone call distracted him while he was crossing the road. Volcanoes are often portrayed as gateways to the underworld.

There were various rivers in the Underworld. The most famous was the Styx, which was notably murky (stygian) - it was dark water. Having been bathed in the Styx was what gave Achilles his famous invulnerability - a power he shares with the Cybermen. Another of them was the Lethe, the river of forgetfulness. Those who drank from the Lethe forgot their former lives, and could then be reincarnated. The chance to forget his former life is what Seb offers Danny, although he doesn't explain that he's planning to reincarnate Danny as a Cyberman.

In the Orpheus myth, it is scepticism that proves Orpheus' downfall. Unwilling to trust Hades (who has never given up one of his subjects before), Orpheus breaks his promise not to look back, and thus loses Eurydice forever. Clara, encouraged by the Doctor to demand proof of Danny's identity, allows him to goad her into cutting off the conversation (ironically by repeating what she was telling him when he died), because he does not want her to risk her own life. We end the episode with the threat that Clara may lose him forever…