Made of String » api
Still not a very good programmer despite all that tea

Posterous API – changes to comments
Mon, 28 Feb 2011 14:47:59 +0000, Steve
http://madeofstring.co.uk/article/posterous-api-changes-to-comments/

Thought I was going mad over the weekend – one minute the Posterous API was dragging back comments from my blog, and the next minute not.

It turned out that it wasn’t my rubbish Python that was to blame but an undocumented change to the Posterous API. Before it was:

/api/2/users/me/sites/primary/posts/44379151/comments

and now it’s:

/api/2/users/me/sites/primary/posts/postID/responses

…note “responses” rather than “comments” on the end, as documented in the Posterous API.

You get back a bunch of JSON looking like this:

[{
    "name":"stevewoodward",
    "created_at":"2011/02/27 12:32:33 -0800",
    "body":"just a quick test while i fiddle with the posterous api.",
    "post_id":44379151,
    "id":7387696,
    "comment_type":"comment"
}]
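
For what it's worth, here is a minimal Python sketch of dealing with that payload. The helper names `responses_path` and `parse_responses` are made up for illustration, the sample is just the JSON shown above, and nothing here actually calls the Posterous API (authentication and the HTTP request are left out entirely).

```python
import json
from datetime import datetime


def responses_path(post_id):
    """Build the newer endpoint path, ending in 'responses' rather than 'comments'."""
    return "/api/2/users/me/sites/primary/posts/{0}/responses".format(post_id)


def parse_responses(raw_json):
    """Decode the JSON list and parse each created_at into a timezone-aware datetime."""
    parsed = []
    for item in json.loads(raw_json):
        item = dict(item)  # copy, so the decoded data is not modified in place
        item["created_at"] = datetime.strptime(
            item["created_at"], "%Y/%m/%d %H:%M:%S %z"
        )
        parsed.append(item)
    return parsed


# The sample response from the post, unchanged.
SAMPLE = """[{
    "name":"stevewoodward",
    "created_at":"2011/02/27 12:32:33 -0800",
    "body":"just a quick test while i fiddle with the posterous api.",
    "post_id":44379151,
    "id":7387696,
    "comment_type":"comment"
}]"""

if __name__ == "__main__":
    for response in parse_responses(SAMPLE):
        print(response["name"], response["created_at"].isoformat(), response["body"])
```

Keeping the path in one small function means that the next time the endpoint name changes there is only one string to fix.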

Automatically tagging news stories using OpenCalais
Mon, 22 Feb 2010 22:07:38 +0000, Steve
http://madeofstring.co.uk/article/automatically-tagging-news-opencalais/

I’ve been fiddling with this little tagging experiment, which I’m pretentiously calling the Warwickshire News Mine, for a couple of weeks now. Essentially the plan was to scrape a bunch of news stories from the Warwickshire County Council website, and see if they could be tagged up automatically.

Screenshot of Warwickshire News Mine

Initially it was just meant to be an excuse to fiddle with the Google Maps API, but I started having a play with the online automatic tagging service OpenCalais, which ended up being the most satisfying thing about it. I’ve left all the tags and types produced by OpenCalais in so you can compare the tags against the content.

OpenCalais is actually pretty good, despite my previous churlish Twitter whinging. It seems a bit petty to pick holes given that it’s a free service provided kindly by Thomson Reuters, but well, I’m going to anyway…

Most of my problems with it are to do with the categorisation of the tags. It seems to be pretty good at pulling out names, with some exceptions: Lea Marston and Leek Wootton are places rather than people, and Warwickshire is tagged as a City rather than a Province or State, although it correctly works out that North Warwickshire fits into the latter category.

It could do better with working out synonyms – for instance anti-virus software and antivirus software are the same thing, and I remember seeing a couple of places where the plural and the singular are included as tags.

For some reason I was impressed that it knows that Come Dine With Me is a TV show, and that the communications team write so much about programming languages. The latter is one case where you would possibly post-process the tags found, in this case by chucking them away.
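
The sort of post-processing hinted at above (merging near-duplicate tags and chucking away unwanted types) might look something like this rough Python sketch. It is not what the site does: the `(name, type)` tuple shape, the `UNWANTED_TYPES` set and the type label in it are assumptions for illustration, and the plural handling is deliberately crude.

```python
import re

# Assumed label for the tag type to discard; adjust to whatever the service returns.
UNWANTED_TYPES = {"ProgrammingLanguage"}


def normalise(tag_name):
    """Crude comparison key: lower-case, drop hyphens, strip one trailing 's'."""
    key = tag_name.lower().replace("-", "")
    key = re.sub(r"\s+", " ", key).strip()
    # Naive singularisation; it will mangle some words (for example "news").
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def clean_tags(tags):
    """Drop unwanted types and keep the first spelling seen for each normalised key."""
    seen = {}
    for name, tag_type in tags:
        if tag_type in UNWANTED_TYPES:
            continue
        seen.setdefault(normalise(name), (name, tag_type))
    return list(seen.values())


if __name__ == "__main__":
    raw = [
        ("anti-virus software", "Technology"),
        ("antivirus software", "Technology"),
        ("Python", "ProgrammingLanguage"),
    ]
    print(clean_tags(raw))
```

Keeping the first spelling seen is an arbitrary choice; picking the most frequent variant across the whole collection would probably read better.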

OpenCalais doesn’t seem so hot on working out a more general keyword behind a story – the tagging on the story Civil partnerships and marriage increase didn’t pull out the words marriage or wedding as tags.

(Update: See comment from Tom Tague of OpenCalais for clarification on the way that OC works).

Screenshot of a story from the News Mine

For me the best thing to come out of the tagging was the “possibly related stories” sidebar in the news story page, which I added late on. When you open up a news story, it searches the database for the top 5 stories with the most tags matching that story, and mostly this works pretty well – possibly because of the robotic tagging consistency of OpenCalais.
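
As a rough illustration of the idea, here is the tag-overlap ranking as an in-memory Python sketch. The real site does this with a database query, which isn't shown here; `related_stories` and the `tags_by_story` dictionary are hypothetical names, and breaking ties by story id is my own assumption.

```python
def related_stories(story_id, tags_by_story, limit=5):
    """Return up to `limit` other story ids, ordered by number of shared tags."""
    target = tags_by_story[story_id]
    scored = []
    for other_id, tags in tags_by_story.items():
        if other_id == story_id:
            continue
        shared = len(target & tags)  # set intersection counts the matching tags
        if shared:
            scored.append((shared, other_id))
    # Most shared tags first; ties broken by story id to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other_id for _, other_id in scored[:limit]]


if __name__ == "__main__":
    tags_by_story = {
        1: {"Warwickshire", "Recycling", "Council"},
        2: {"Warwickshire", "Council"},
        3: {"Recycling"},
        4: {"Come Dine With Me"},
    }
    print(related_stories(1, tags_by_story))
```

Stories with no tags in common are left out altogether, so the sidebar can come back with fewer than five entries.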

On the technical front, the site is based on the usual PHP/MySQL combination, and I used the open-source CodeIgniter framework, with the Simple DOM Parser for scraping the news stories, and Dan Grossman’s Open Calais Tags library to send the main body text off to OpenCalais for processing. The elapsed time to process each piece of content was generally about 3-4 seconds, sometimes slightly shorter, sometimes longer (up to around 10 seconds). I had to run the routine several times to get results for all four thousand indexed stories – there was a memory leak somewhere along the way.

As for the thing that I initially started out to do, that’s pretty dull really – I used the Google Maps API to geocode a list of towns and villages in Warwickshire, as well as a few other places, and then ran the main text of the stories through a simple regular expression search to tag them up with places.
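
The regular expression step could be sketched in Python along these lines (the site itself is PHP, so this is an illustration rather than the actual routine). The function names are made up, the geocoding is skipped, and matching longer names first is my own assumption, so that "North Warwickshire" is not tagged as plain "Warwickshire".

```python
import re


def build_place_pattern(places):
    """Compile one case-insensitive pattern matching any place name as whole words."""
    # Longest names first, so the alternation prefers the more specific place.
    ordered = sorted(places, key=len, reverse=True)
    alternatives = "|".join(re.escape(place) for place in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def tag_places(text, places):
    """Return the places mentioned in text, in order of first appearance."""
    if not places:
        return []
    pattern = build_place_pattern(places)
    canonical = {place.lower(): place for place in places}
    found = []
    for match in pattern.finditer(text):
        place = canonical[match.group(0).lower()]
        if place not in found:
            found.append(place)
    return found


if __name__ == "__main__":
    places = ["Warwickshire", "North Warwickshire", "Lea Marston", "Leek Wootton"]
    story = "Residents of Lea Marston in North Warwickshire met councillors from Leek Wootton."
    print(tag_places(story, places))
```

Adding more places is then just a matter of extending the list, which fits with the improvement mentioned further down.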

There are lots of improvements that could be made, but in the end it’s just a throwaway experiment. I’d like to improve the places tagging routine, which could be as simple as adding a few more places. The main thing would be to look into some way of fitting the tags around a pre-defined ontology. There’s no current method to suggest a list of categories and tags for OpenCalais to process content with, so it would have to be done after the results had been received.
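
Fitting the returned tags to a pre-defined ontology after the fact could start as simply as a lookup table. This is only a Python sketch of that idea: the categories and terms below are invented for illustration, and `categorise` is a hypothetical helper, not part of any OpenCalais library.

```python
# Invented example ontology: category name -> terms that belong to it.
ONTOLOGY = {
    "Family": {"civil partnerships", "marriage"},
    "Technology": {"antivirus software"},
}


def categorise(tags, ontology):
    """Map tags onto ontology categories; return (categories, unmatched tags)."""
    lookup = {}
    for category, terms in ontology.items():
        for term in terms:
            lookup[term.lower()] = category
    categories = []
    unmatched = []
    for tag in tags:
        category = lookup.get(tag.lower())
        if category is None:
            unmatched.append(tag)
        elif category not in categories:
            categories.append(category)
    return categories, unmatched


if __name__ == "__main__":
    tags = ["Civil partnerships", "Come Dine With Me", "antivirus software"]
    print(categorise(tags, ONTOLOGY))
```

The list of unmatched tags is worth keeping, since it shows which terms the ontology is still missing.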
