Showing posts with label algorithm.

Tuesday, 13 December 2016

The Common Ground Algorithm - A Possible Remedy for Filter Bubbles

People have a tendency towards Confirmation Bias: they seek out things that confirm their existing opinions and avoid things that challenge them. On social networks and in recommendation systems, this can lead to the development of a filter bubble, in which a person's sources of information come to be structured around what they already believe. This, of course, acts as an obstacle to healthy discussion between people of differing opinions, and causes their positions to become ever more deeply entrenched and polarised. Instead of seeing those with whom they differ as decent people who have something of value to offer, and who may be persuadable on some of their differences, people start seeing their opponents as the enemy.

To prevent this, people need something that will put them in touch with those whose viewpoints generally oppose their own. Of course, we can't just confront people with contrary opinions - that risks provoking hostile reactions. What we need is to show people what they have in common with those whose opinions are different, so that they can build trust and begin to interact in a healthy way.

As an attempt to do this, I present The Common Ground Algorithm. It uses a combination of topic modelling and sentiment analysis to characterise each user's opinions. It then finds people whose opinions are generally opposed to theirs, identifies the topics on which they share common ground, and recommends posts where they agree on something with people they disagree with in general. I've coded up a reference implementation in Python, and am releasing it under the MIT Licence to encourage its use and further development.
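To give a flavour of the idea, here is a minimal sketch rather than the reference implementation itself. It assumes the topic-modelling and sentiment-analysis stages have already reduced each user to a mapping from topic to a sentiment score between -1 and 1; the user names, topic labels and thresholds below are purely illustrative.

def disagreement(a, b):
    # Mean absolute difference in sentiment over the topics both users discuss.
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(abs(a[t] - b[t]) for t in shared) / len(shared)

def common_ground(a, b, closeness=0.3):
    # Topics where both users' sentiments point the same way and are close.
    shared = set(a) & set(b)
    return [t for t in shared if a[t] * b[t] > 0 and abs(a[t] - b[t]) < closeness]

alice = {"immigration": -0.8, "healthcare": 0.7, "local parks": 0.9}
bob = {"immigration": 0.9, "healthcare": -0.6, "local parks": 0.8}

if disagreement(alice, bob) > 0.5:       # broadly opposed overall...
    print(common_ground(alice, bob))     # ...yet they agree on ['local parks']

Posts on those shared topics are the ones the algorithm then recommends across the divide.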

Wednesday, 17 June 2015

The Bootstrap Problem

A post on Data Community DC discusses Why You Should Not Build a Recommendation Engine. The main point is that recommendation engines need a lot of data to work properly, and you're unlikely to have that when you start out.

I know the feeling. In a previous job I created a recommendation engine for a business communication system. It used content tags and user behaviour to infer the topics each user was most likely to be interested in, and recommended content accordingly. Unfortunately, my testbed was my employer's own instance of the product, and the company was a start-up too small to really need its own product. I never got a handle on how well it worked.

This brings me to Emily. Emily isn't a product. It's a personal portfolio project. I had an idea for a recommendation system that would infer users' interests from content they posted in blogs, and recommend similar content. The problem is, the content it recommends comes from the other users, so at its current early stage of operation, it doesn't have much to recommend. The more people use it, the better it will become, but what's the incentive to be an early adopter?

What I seem to have at the moment is a recommendation engine that needs somebody to recommend it.

Tuesday, 9 June 2015

Emily Has Moved

As those of you who've tried out my semantic recommendation system, Emily, will have noticed, it didn't work. The reason was that I'd chosen the wrong cloud platform: Google App Engine isn't meant for anything that needs as much computation as Emily does, so I've ported Emily to OpenShift. This has the advantage of giving me much more control over how I write the code, and I can use things like MongoDB and multiprocessing. Let's try this again!

Wednesday, 5 March 2014

One of my Fantastical Devices is on PyPI

I've mentioned in previous posts that I've been working on a Python library for Hidden Markov Models. I've been encouraged to put this up on the Python Package Index, so, after a little while spent getting the hang of registering and uploading a project, here it is. It's alpha, of course, so there are probably plenty of bugs to be found in it, but if you want to play with something I've made, all you have to do is type

sudo pip install Markov

and try it out. If you feel you can help me improve it, contact me and I can add you to the Google Code project.

Monday, 24 June 2013

A Couple of my Fantastical Devices

With the recent news about the Voynich Manuscript, as mentioned in my last post, I thought it opportune to share a couple of pieces of code I'd written. First off, as I mentioned earlier, a couple of years ago I wrote a Python implementation of Montemurro and Zanette's algorithm for calculating the entropy of words in documents. If you're interested in using the technique yourself, you may want to have a look. Secondly, my own attempts to uncover the manuscript's syntax use a Python library for Hidden Markov Models that I created. It probably still has a few bugs in it, but it's attracted a bit of interest online, and I'm hoping to develop it further. So, if you're at all interested in AI, computational linguistics, or analytics, please have a look at these. Feedback is welcome, as is anybody who wishes to contribute further to these projects.
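To illustrate the core idea, here is a rough sketch of one ingredient of the technique, not my released implementation: split the text into equal-sized sections and measure how unevenly each word is spread across them. The section count and whitespace tokenisation here are arbitrary choices, and the full method goes further, comparing these figures against what a random shuffle of the text would give.

from collections import Counter
from math import log2

def word_section_entropies(text, n_sections=8):
    words = text.lower().split()
    size = max(1, len(words) // n_sections)
    sections = [words[i:i + size] for i in range(0, len(words), size)]
    counts = {}  # word -> one count per section
    for i, section in enumerate(sections):
        for word, c in Counter(section).items():
            counts.setdefault(word, [0] * len(sections))[i] = c
    entropies = {}
    for word, per_section in counts.items():
        total = sum(per_section)
        entropies[word] = -sum((c / total) * log2(c / total)
                               for c in per_section if c)
    return entropies

Words whose entropy is low relative to their overall frequency are concentrated in particular parts of the text, which is what makes them stand out as candidate keywords.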

Saturday, 9 February 2013

Custom Sorting For Conlangs again

I've just revised that Python code I posted a while back for sorting lists of strings in customized alphabetical orders. I realized that it would be more efficient to implement it as a key function (which is evaluated once per item) rather than a cmp function (which is evaluated for each pair of items). Fortunately, Python compares lists lexicographically, just as it does strings, which makes this possible.
class CustomSorter(object):
    """Key function for sorting words according to a custom alphabet,
    where a "letter" may be more than one character long (e.g. a digraph)."""

    def __init__(self, alphabet):
        self.alphabet = alphabet

    def __call__(self, word):
        # Build a list of alphabet indices; Python compares these lists
        # lexicographically, just as it does strings.
        head, tail = self.separate(word)
        key = [self.alphabet.index(head)]
        if len(tail):
            key.extend(self(tail))
        return key

    def separate(self, word):
        # Split off the longest alphabet letter the word starts with,
        # skipping any leading characters that aren't in the alphabet.
        candidates = self.Candidates(word)
        while candidates == []:
            word = word[1:]
            candidates = self.Candidates(word)
        candidates.sort(key=len)
        head = candidates.pop()
        tail = word[len(head):]
        return head, tail

    def Candidates(self, word):
        return [letter for letter in self.alphabet if word.startswith(letter)]
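
For example, with a hypothetical fragment of an alphabet that, following traditional Spanish practice, treats "ch" as a single letter sorting after "c":

alphabet = ["a", "c", "ch", "i", "n", "o"]
words = ["chico", "cocina"]
print(sorted(words))                              # default order: ['chico', 'cocina']
print(sorted(words, key=CustomSorter(alphabet)))  # custom order: ['cocina', 'chico']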