Programming Collective Intelligence

December 30, 2007 by ajmg

It’s easy to get so involved in your day to day work that you don’t find the time to read around the wider areas of your profession (or at least I find this to be the case). Because of this, one of my colleagues suggested that we have a “geek book club” where we read articles and books that are related to software development, and through this I’ve encountered books such as Object Thinking and Pragmatic Programmer that I otherwise wouldn’t have heard of. For holiday reading over Christmas one of my colleagues suggested that we read Programming Collective Intelligence.

Programming Collective Intelligence by Toby Segaran

This is a book about machine learning and AI in relation to developing Web 2.0 applications, so there are chapters about search engines, spam filtering and making recommendations à la Amazon. I haven’t read these chapters yet but, as I’d implemented a genetic algorithm at university, I immediately skipped to chapter 11, entitled Evolving Intelligence, which is about Genetic Programming.

Genetic Programming is a term I’d not heard of before but it is, apparently, an offshoot of Genetic Algorithms. The difference, as I understand it, is that Genetic Algorithms start with an initial population of data structures which represent the answers to a problem. These data structures are amended using the evolutionary concepts of crossover and mutation and a fitness function which chooses the fittest structures (answers) to go on to the next generation. However, as the author explains, Genetic Programming evolves the algorithm itself, not just the parameters or results of an algorithm. In Segaran’s example the algorithm is modelled as a parse tree, which is the way in which programs are often first broken down by a compiler or an interpreter. This tree representation of the algorithm is then subject to crossover and mutation to evolve “better” programs as defined by the fitness function.
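To make the parse tree idea a little more concrete, here is a minimal sketch in Java. It is not the book's code (which is in Python): every class and method name is hypothetical, the operator set is limited to +, - and *, and the mutation and crossover rates are arbitrary assumptions. Programs are trees of operations over an input x, mutation swaps a subtree for a freshly grown one, crossover splices branches of one parent into another, and the fitness function is the total error against some sample points.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: every class and method name here is hypothetical.
final class GpSketch {
    /** A parse-tree node: the input x, a constant, or a binary operation. */
    static abstract class Node { abstract double eval(double x); }
    static final class Var extends Node { @Override double eval(double x) { return x; } }
    static final class Const extends Node {
        final double value;
        Const(double value) { this.value = value; }
        @Override double eval(double x) { return value; }
    }
    static final class Op extends Node {
        final char symbol;
        final Node left, right;
        Op(char symbol, Node left, Node right) {
            this.symbol = symbol; this.left = left; this.right = right;
        }
        @Override double eval(double x) {
            double a = left.eval(x), b = right.eval(x);
            switch (symbol) {
                case '+': return a + b;
                case '-': return a - b;
                case '*': return a * b;
                default: throw new IllegalStateException("unknown operator " + symbol);
            }
        }
    }

    private static final char[] OPS = {'+', '-', '*'};

    /** Grows a random tree of at most the given depth. */
    static Node randomTree(Random rng, int depth) {
        if (depth <= 0 || rng.nextDouble() < 0.3) {
            return rng.nextBoolean() ? new Var() : new Const(rng.nextInt(10));
        }
        return new Op(OPS[rng.nextInt(OPS.length)], randomTree(rng, depth - 1), randomTree(rng, depth - 1));
    }

    /** Mutation: occasionally replace a subtree with a freshly grown one. */
    static Node mutate(Node t, Random rng, double rate) {
        if (rng.nextDouble() < rate) return randomTree(rng, 2);
        if (!(t instanceof Op)) return t;
        Op op = (Op) t;
        return new Op(op.symbol, mutate(op.left, rng, rate), mutate(op.right, rng, rate));
    }

    /** Crossover: walk both trees together, occasionally splicing in a branch of b. */
    static Node crossover(Node a, Node b, Random rng, double rate) {
        if (rng.nextDouble() < rate) return b;
        if (!(a instanceof Op) || !(b instanceof Op)) return a;
        Op x = (Op) a, y = (Op) b;
        return new Op(x.symbol,
                crossover(x.left, rng.nextBoolean() ? y.left : y.right, rng, rate),
                crossover(x.right, rng.nextBoolean() ? y.left : y.right, rng, rate));
    }

    /** Fitness: total absolute error over the sample points (lower is better). */
    static double fitness(Node t, double[] xs, double[] ys) {
        double error = 0;
        for (int i = 0; i < xs.length; i++) error += Math.abs(t.eval(xs[i]) - ys[i]);
        return error;
    }

    /** Each generation keeps the fittest quarter and breeds the rest from it. */
    static Node evolve(Random rng, double[] xs, double[] ys, int popSize, int generations) {
        List<Node> pop = new ArrayList<>();
        for (int i = 0; i < popSize; i++) pop.add(randomTree(rng, 4));
        Comparator<Node> byFitness = Comparator.comparingDouble(t -> fitness(t, xs, ys));
        int keep = Math.max(1, popSize / 4);
        for (int g = 0; g < generations; g++) {
            pop.sort(byFitness);
            List<Node> next = new ArrayList<>(pop.subList(0, keep));
            while (next.size() < popSize) {
                Node a = next.get(rng.nextInt(keep));
                Node b = next.get(rng.nextInt(keep));
                next.add(mutate(crossover(a, b, rng, 0.3), rng, 0.1));
            }
            pop = next;
        }
        pop.sort(byFitness);
        return pop.get(0);
    }

    public static void main(String[] args) {
        double[] xs = {-2, -1, 0, 1, 2, 3};
        double[] ys = {7, 4, 3, 4, 7, 12}; // samples of y = x*x + 3, an arbitrary target
        Node best = evolve(new Random(42), xs, ys, 200, 30);
        System.out.println("error of best program: " + fitness(best, xs, ys));
    }
}
```

The nodes are immutable, so parents and children can safely share subtrees, and because the fittest quarter is carried over unchanged the best error can never get worse from one generation to the next.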

This kind of programming, the author tells us, has been used in fields such as optics and gaming, in evolving scientific inventions such as antennas for NASA, and in designing a concert hall shape that gives the best acoustics. Though this is only one chapter in a book, it goes further than the basics: for example, it touches on how you can provide the algorithm with memory, and the algorithmic population with shared memory, to help it learn longer term strategies, and it points you in the direction of implementing this. I was most impressed and wished that I’d had this book to hand when first learning about the subject. I’ve only read chapter 11 and a bit of chapter 5, but these have already given me a good overview of genetic algorithms and genetic programming, refreshed my memory on things I’d already learned, taught me new things and helped me brush up on the Python language. If these chapters are anything to go by then the entire book is well worth reading.

The Singing Ringing Tree

December 30, 2007 by ajmg

I was delighted to receive, as a Christmas present from a friend, a copy of The Singing Ringing Tree which I remember watching as a very little girl – maybe around 5 or even younger.

The blurb on the DVD describes it as a fairy tale that “haunted a generation” and it must have made an impression on me because, although I was only very little when I watched it, I still have vivid memories of it. Watching it as an adult I can see why. For a start it’s in technicolour with glorious and strange scenery which reminds me of watching the Wizard of Oz, though of course the settings are on a smaller scale. The soundtrack is very evocative, especially in the scary bits. The acting is good for that sort of thing – the sort of theatrical, unsubtle acting which works well on stage – and the costumes are good too. For example, when the prince gets turned into a bear they were clever enough not to use a bear mask but to glue hairs onto his face so you can still see the pathos in his features (as my friend put it, he doesn’t look like Bungle out of Rainbow, which would have been really rubbish).

I chose to watch it with the English voice over, just as I watched it as a little kid. This is a narrator telling the story Jackanory style in line with the visuals. I don’t know in how many countries this was shown, but I can see how it could have wide appeal as the visuals and acting are compelling enough, and with a narrator telling the story in their own language it would work for children anywhere. There is one thing I misremember … instead of hearing the actors’ voices in the background (it was a German production and the actors speak in German) I remember a constant burbling in the background of what sounded like another narration in Czech or something? … this of course added to the overall weirdness … perhaps this is a false memory but the friend who gave it to me says that he remembers the same.

The wicked dwarf is very scary indeed and I am surprised that it didn’t scare me as a little kid. When he’s thwarting the prince and princess you can hear his cackling laughter as his face pops up out of a cloud, or out of the side of a tree, or out of the ground like an evil teletubby.

Philosophy – what’s the use?

December 17, 2007 by ajmg

There are a couple of people, including myself, in our team at work who have philosophy degrees. I went to university later on in life as what is euphemistically called a “mature student” so I had to go through the annoyance of putting up with such comments from friends and family as “Philosophy – you must be mad what the hell use is that?” Apart from my Mum, who always supports me in anything I choose to do, who said that she was sure studying philosophy would be “very nice”.

I was, therefore, glad to receive from a colleague of mine, whose son is doing maths and philosophy, this link to a paper which espouses the benefits of studying for a philosophy degree and the transferable skills it helps to develop which should be of benefit to any employer.

It’s a formidable list of benefits and I don’t pretend to have half of these transferable skills, but I’ve never regretted my decision to study philosophy. It raises the important question of what the use of any degree is. Maths and English can certainly be applied to a lot of practical and useful careers, but what about Music? Should we really allow youngsters to study music when, ultimately, “what the hell use is that”? But if you are of the opinion that every degree should be vocational and “useful” then you may end up like the man who learned to service the Betamax video recorder (as a philosophy student at my university pointed out).

And what is education for? Is it to produce worker drones or is it to produce people who can think for themselves?

In Britain I think there is a prejudice against intellectualism which, as a friend of mine pointed out, often manifests in the much overheard phrase “Oh these people, they may have degrees but they have no common sense”. And of course Socrates was murdered by the state for his wisdom!

What, after all, is the use of you? or of myself? Is Gont Mountain useful, or the Open Sea? … (Le Guin, A Wizard of Earthsea)

First Impressions of Stockholm

December 14, 2007 by ajmg

First impression was of course the weather, which was damp and cold, and the darkness: in December it gets dark at 3.00 pm. I’ve heard that there’s a high suicide rate in Sweden because they don’t get enough light; however, a bit of googling shows this to be a myth.

Another impression is that of the public transport system which seemed very complicated, until we got the hang of it and realised that it was very simple as long as you’re good at directions and can hold a map in your head. Stockholm central metro station is very big, very sprawled out and with lots of exits and escalators which makes it very confusing. One thing I did notice was the pleasing lack of advertising, instead it had painted walls with a few murals which gave an impression of a public space belonging to the people of the city, rather than corporations.

Another impression was of the people, who did not seem very warm and friendly (according to my two Greek companions, who are rather expressive); however, they were polite and helpful and spoke very good English (their education system must be excellent). We stayed at Kista, which is north west of Stockholm and is a bit like Milton Keynes, but we also explored Djurgarden, Norrmalm, the central bit by Stockholm Central Station, and of course Gamla Stan, the old medieval town. In all these places we saw no evidence of anti-social behaviour and we felt very safe. As it was December there were many Christmas (or Yuletide) markets, as well as a fairground, and in all of these places open fires were burning. Imagine an open bonfire in the German market in Victoria Square or the Bullring in Birmingham – you’d get people being silly and chucking it around and setting fire to things.

It was also nice to see children looking dignified and sensibly dressed as children instead of dressed up like slutty teenagers – apparently advertising aimed at children under 12 is banned and it certainly shows in what they are wearing.

Clothes and shoes seem no more expensive than average prices here; however, eating out and going to bars is very expensive (a single Baileys and ice cost £6).

Probably due to the weather, and the light, and being built on an archipelago with a lot of water around, Stockholm gave the impression of being a wee bit mournful, though not in a depressing or dreary way, but rather in a detached, calm way.

Bong Sau and Tan Sau notes

December 5, 2007 by ajmg

The wrist is on the centre line and when turning from bong sau to tan sau (as in the 3rd section of siu nim tao) the wrist should stay in the same place. Tan Sau stays roughly a fist and thumb distance away from the body with the elbow in line with the hip. Both bong sau and tan sau can be higher or lower depending on the height of your partner (tan sau should point roughly at your partner’s throat/chin as you can strike more efficiently from that position).

Bong sau is not necessarily a deflective technique (is it a deflective technique at all?) but is used to gain information. Once you’ve turned out of the way of the strike, bong sau is put into place to tell where your opponent’s arm/elbow is and then you can countermove from there. From the basic stance, bong sau moves through the centreline and then “corkscrews” up into position at the last moment to avoid clashing.

Chi Sao

December 3, 2007 by ajmg

Tonight during Wing Chun training we concentrated almost solely on Chi Sao which my Sifu describes as “play fighting” in that it is not an all out brawl, there are certain rules and niceties and the attitude should be that you are there to learn. Sifu’s useful analogy was that of tiger cubs playing – they need to learn to hunt but they don’t want to kill each other so they go through all the motions but with their claws in; so there’s an element of trust involved as well as assertiveness (and controlled aggression maybe?)

One of the more difficult things I find about Wing Chun is that you’ve got to get over your British reserve and get physically close to people. I find this difficult, especially with a training partner I’m only mildly acquainted with. During Chi Sao I feel stupid because I don’t know what move to make next. I feel silly and embarrassed when someone gets through. I feel guilty and embarrassed when I get through someone’s defences and usually end up apologising for it. Also I feel that the men I train with are just indulging me and fear that they think they would get more out of training with another male. I also suffer from what I call “body dyslexia”, for want of a better phrase, which means that I may be able to do the moves when practising on my own but that it all goes to pot and I forget my left from my right when paired up with a training partner.

I think it true to say that during Chi Sao I’m just a bundle of conflicting impulses with a cloud of free floating anxiety on top.

I wonder if this kind of contact is perhaps easier for men because as children they’ve usually played some sort of contact sport at school (it may have changed for girls now but netball was about the only contact sport I did at school).

Controlled aggression may play a part in it but one of the guys I consider to be very good at Chi Sao seems to have absolutely no ego – if you get through (not that I often do!) he doesn’t take offence but always wants to know why and asks you to repeat what you did more slowly so that he can learn from it.

There is definitely a lot of cognitive processing going on during Chi Sao and I’m not quite sure what kind of combined body and brain intelligence is involved, but it appears to me that there’s a great deal of skill involved in predicting your partner’s moves and coming up with skilful countermoves.

Wing Chun (trip to Hong Kong)

September 30, 2007 by ajmg

The first two weeks of August I went to Hong Kong with my Wing Chun kung fu class, a mixture of training and holiday. Our Sifu, Shaun Rawcliffe, is brilliant and takes real care of his students. He took around 40 of us and we all had a great time and the trip was extremely well organised. As someone pointed out it was incredible that for such a large party of people no one got lost, there were no accidents and no-one fell out with one another.

The purpose of the trip was to be trained by the Wing Chun grandmaster Ip Chun who is now in his 80s and still going strong. Our Si Gung, Ip Chun, did a seminar for us where one of the things he pointed out was the importance of relaxation. If you are relaxed and not worrying about getting hit then there is less chance of you freezing, getting hurt and forgetting to put what you have learned into practice. Also if you’re too tense you will not move with the speed that some of the techniques require.

The Wing Chun system is said to be one of the most efficient and effective systems of Chinese Kung Fu, and even though it’s a fighting system designed to be used at close range, it does not rely on bulldozing your way into an opponent using a lot of physical strength. One of the Chinese students that I train with said that he and his son took up Wing Chun because it was more suited to smaller framed people. Indeed, this is what made me take up the system: it was, allegedly, developed by a woman, so I reasoned that it would be suitable for smaller, less muscular people.

(The only martial art that I’ve ever practiced is Wing Chun but earlier on in the year I went with Nad to see the World Tae Kwon Do championships and was surprised at how many women were competing. I’ve heard that martial arts such as kick boxing and tae kwon do attract more female students quite possibly because these systems have a higher profile and more publicity. The majority of students in the Wing Chun classes I’ve attended are male.)

The techniques Wing Chun uses to deal with blows tend to be deflective, rather than blocking, techniques. Tan sau and bong sau are techniques used to deflect blows to the upper body and one of the questions I asked Sifu was why I kept on getting bruising on my arms when practicing these. He told me that if I got bruises that I needed to be looking at my technique and that I was clashing with my partner, as in using a block, rather than deflecting the force of the blow away from me. So I needed to look at my technique to find out where the clash was occurring.

The fact that the system is touted as one that does not rely so much on physical strength made me think that it was one of the easier martial arts to learn; however, I now think that it is one of the more difficult. (Though I may be wrong about this as I’ve only ever studied Wing Chun and therefore have nothing to compare it with.) A lot of what makes Wing Chun effective is the bodily structures that it uses to deal with incoming strikes, and getting these structures correct I have found to be no trivial matter. A subtle movement one way or the other can make the difference between the technique working well, working less effectively or failing completely. I think it is also easier and more natural to rely on physical strength, especially if you are strong anyway, and I’ve noted that the people who seem to do well at Wing Chun take an almost scientific approach, move very tidily and try hard to get their stance and structures correct. Thankfully, through practice with training partners I am finding that the structures do work for me when I get them right.

I’ve been practicing Wing Chun for about a year now and am still a bit miffed that I’m not like Bruce Lee :) But of course this is completely unrealistic. What I realise now is that Wing Chun will take me years of training and even then there’ll still be something to learn. It’s a lifetime’s work but perhaps, like yoga, it can be something that you get better at as you get older.

As for Hong Kong I found it to be a marvellous place and a very beautiful city. Considering it’s so built up and with so many skyscrapers their architects must be wonderful to have created a skyline that is a thing of beauty. It is also a very safe city and I had no problem walking back to the hotel on my own in the early hours of the morning. The only thing you have to watch out for is pickpockets. It’s also an easy city to get around with a very efficient metro and train system. If you go island hopping by ferry (I went to both Lantau and Lamma island as well as to Macao) you can see beautiful, sub-tropical countryside and lovely beaches.

My flickr stream gives an idea of what you can see if you visit Hong Kong.

hprof for diagnosing memory leaks

June 29, 2007 by ajmg

The last few days I’ve been trying to diagnose a possible memory leak in one of our Java web services and have been using hprof, the built-in Java profiler that comes with J2SE. hprof is easy to use; just enter

java -Xrunhprof:help

to see a list of options for usage. I wanted the heap profiling option so used

-Xrunhprof:heap=sites,depth=10

in the command to launch the web server.

heap=sites will break down memory usage according to the amount of memory allocated to particular objects and will also generate stack traces showing the methods which allocated this memory. The depth option sets the depth of the stack trace; I’ve set it to 10 but the default is 4.

hprof generates an output file, java.hprof.txt, on program exit which starts with the stack traces and finishes with a breakdown of memory usage. Here is a snippet of the memory usage part of the file showing the objects using the most memory:

SITES BEGIN (ordered by live bytes) Tue Jun 26 16:04:27 2007
        percent           live             alloc'ed       stack class
rank   self  accum      bytes    objs      bytes    objs  trace name
   1 30.27% 30.27%  123015224 1260358  123015224 1260358 331325 char[]
   2 11.11% 41.38%   45130056  667796   45130056  667796 331138 char[]
   3  7.97% 49.35%   32396784       1   32396784       1 331303 java.lang.String[]
   4  7.97% 57.32%   32396784       1   32396784       1 331302 int[]
   5  7.44% 64.77%   30248592 1260358   30248592 1260358 331324 java.lang.String
   6  5.98% 70.74%   24297624       3   24297624       3 331443 byte[]
   7  5.26% 76.00%   21369472  667796   21369472  667796 331140 org.apache.lucene.index.TermInfo
   8  3.94% 79.95%   16027104  667796   16027104  667796 331137 java.lang.String
   9  2.63% 82.58%   10684736  667796   10684736  667796 331139 org.apache.lucene.index.Term
  10  1.31% 83.89%    5342384       1    5342384       1 331134 long[]

There seems to be a fair amount of information about hprof on the web, but I haven’t managed to find a definitive explanation of exactly what all these columns mean. I have a fair idea by now, though, so here goes:

Sites – allocation sites, i.e. a class of object together with the stack trace that allocated it
rank – position in the table, which is ordered by live bytes (largest first)
self – the percentage of the total attributed to this site on its own
accum – the running total of the self column down to and including this row (for example, 30.27% + 11.11% = 41.38% at rank 2)
live bytes – number of bytes taken up by currently live objects
live objs – number of currently live objects
alloc’ed bytes – I think this is the number of bytes allocated so far for these objects, including any since garbage collected
alloc’ed objs – likewise, I think, the number of objects of this type so far allocated
stack trace – stack trace number
class name – class of object

As can be seen most memory is used by the char[] (live bytes = 123015224/objs = 1260358). Live bytes/objs will usually be less than alloc'ed bytes/objs due to garbage collection taking place, but as the web server was only running for a short time before this profile was taken it is likely that garbage collection had not happened by the time I stopped the server.

Where live bytes/objs = alloc'ed bytes/objs this could possibly signify a memory leak. One of the sources I looked at stated that low level objects such as char[] tend to float to the top and advised looking further down the ranking for heads of leaking data structures. (See here).
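The column layout and the "live equals alloc'ed" check can be sketched in a few lines of Java. This is only an illustration under my own assumptions: the class, field and method names are made up, the parser expects exactly the nine whitespace-separated columns shown in the snippet above, and it treats accum as the running total of self, which the sample figures bear out (30.27 + 11.11 = 41.38).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: the class, field and method names are my own labels, not hprof's.
final class HprofSites {
    /** One row of the SITES table. */
    static final class Site {
        final int rank;
        final double selfPct, accumPct;
        final long liveBytes, liveObjs, allocBytes, allocObjs;
        final int trace;
        final String className;

        Site(String[] f) {
            rank = Integer.parseInt(f[0]);
            selfPct = percent(f[1]);
            accumPct = percent(f[2]);
            liveBytes = Long.parseLong(f[3]);
            liveObjs = Long.parseLong(f[4]);
            allocBytes = Long.parseLong(f[5]);
            allocObjs = Long.parseLong(f[6]);
            trace = Integer.parseInt(f[7]);
            className = f[8];
        }

        /** True when nothing allocated at this site has been garbage collected yet. */
        boolean nothingCollected() {
            return liveBytes == allocBytes && liveObjs == allocObjs;
        }
    }

    private static double percent(String s) {
        return Double.parseDouble(s.replace("%", ""));
    }

    /** Parses one data row of the table (rank through class name). */
    static Site parse(String row) {
        String[] fields = row.trim().split("\\s+");
        if (fields.length != 9) {
            throw new IllegalArgumentException("expected 9 columns but got " + fields.length);
        }
        return new Site(fields);
    }

    /** Returns the sites whose live and alloc'ed figures are identical. */
    static List<Site> uncollectedSites(List<Site> sites) {
        List<Site> result = new ArrayList<>();
        for (Site s : sites) {
            if (s.nothingCollected()) result.add(s);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] rows = {
            "1 30.27% 30.27% 123015224 1260358 123015224 1260358 331325 char[]",
            "2 11.11% 41.38%  45130056  667796  45130056  667796 331138 char[]",
            "3  7.97% 49.35%  32396784       1  32396784       1 331303 java.lang.String[]",
        };
        double runningTotal = 0;
        for (String row : rows) {
            Site s = parse(row);
            runningTotal += s.selfPct;
            System.out.printf("rank %d %s: accum=%.2f%%, running total of self=%.2f%%, nothing collected=%b%n",
                    s.rank, s.className, s.accumPct, runningTotal, s.nothingCollected());
        }
    }
}
```

A site that shows up in uncollectedSites is only a candidate: in a short run like mine every row qualifies simply because garbage collection has not happened yet, so the check becomes more telling the longer the server has been running.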

Here is stack trace 331325 for our highest ranking char[] objects:

TRACE 331325:
java.lang.String.<init>(<Unknown Source>:Unknown line)
org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:104)
org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:155)
org.apache.lucene.search.FieldCacheImpl$6.createValue(FieldCacheImpl.java:282)
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:260)
org.apache.lucene.search.FieldCacheImpl$7.createValue(FieldCacheImpl.java:371)
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
org.apache.lucene.search.FieldCacheImpl.getAuto(FieldCacheImpl.java:334)
org.apache.lucene.search.FieldSortedHitQueue.comparatorAuto(FieldSortedHitQueue.java:338)

which were allocated by the section of the code that performs a Lucene search with a request to sort the result set.

For the moment it looks as though our “memory leak” issue is due to the sorting functionality in Lucene using up a lot of resources rather than a memory leak as such. We are sorting strings, which the Lucene docs state are the most expensive type to sort in terms of resources because each unique term is cached for each document. This could be a problem for us and we may have to rethink how we do our sorting; if so, that will be another topic.

hprof links I found useful:

http://java.sun.com/developer/technicalArticles/Programming/HPROF.html

http://www.skywayradio.com/tech/WAS51/appserver_hangs.php
http://www.skywayradio.com/tech/WAS51/HProf.php
http://publib.boulder.ibm.com/infocenter/javasdk/v1r4m2/index.jsp?topic=/com.ibm.java.doc.diagnostics.142/html/id1590.html

http://www.javalobby.org/java/forums/t19612.html
http://www.javaworld.com/javaworld/jw-12-2001/jw-1207-hprof.html

Notes on Two days at XTech 2007

May 21, 2007 by ajmg

I was lucky enough to be one of the developers at Talis to attend XTech 2007 (well the first 2 days of it) and what a great conference it was, as well as being in the centre of Paris – my 21st floor Novotel room displayed a wonderful view of the city overlooking the River Seine (perfect!).

I find it sometimes easy to develop tunnel vision, in that you’ve got a job to do, something to implement, and a deadline and you’re focussed on that instead of the wider issues around what you are doing and the technologies that you are using. So it was good to go to XTech and see what others are doing around the semantic web area.

Ubiquitous Computing

One of the themes on the first day was ubiquitous computing, with discussion around location aware devices, i.e., mobile devices with sensors that can interact with their physical environment. One of the talks on this theme was from Claus Dahl of Imity, who has developed a client Bluetooth application that can take a Bluetooth scan of your immediate environment to “see” what other Bluetooth objects are around. The Imity client can keep, tag and share a history of these objects with other people who have the Imity client installed. One of the interesting points Dahl made was that 3 months’ worth of personal location history is hard to fake, as physical data is “stickier” than online data; so this history is identity. The Imity client is open source and they are planning an API for it.

Location aware devices and software such as the Imity client bring about new opportunities for social networking because as you share your tags you can find people you’ve met or people who have attended the same events as you over time. Check out imity.com for the opportunities in social networking that this involves.

Though not mentioned in the Imity talk, these kinds of social networking opportunities also highlight privacy issues: if your phone is Bluetooth enabled then it can be discovered by anyone with an Imity client. This leads me on to one of the keynote speeches, by Adam Greenfield: “Everywhere: expectation, emergence, reality”. His theme was that networking technology is no longer a PC on our desk with an ethernet cable but is “in the woodwork” everywhere around us. He quoted Mark Weiser: “The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it”. Greenfield takes the view that all the components are now in place to enable this infiltration of devices into our everyday lives, and that this imperceptible and pervasive aspect of computer technology has worrying implications for privacy. The usual counterargument is that if you’re honest then you shouldn’t be afraid as you have nothing to hide. Greenfield argued, however, that all societies survive due to a veneer of hypocrisy which is the oil that keeps the social wheels turning (a good definition of politeness perhaps). It’s worrying enough that the average Londoner is apparently caught on camera up to 300 times a day (see the Liberty site); however, the ubiquity of these devices provides even more opportunities for surveillance and “reality mining”, i.e., harvesting facts about our behaviour. There are also issues around who controls this information. Some of the recommendations he makes as to how society should deal with these systems are that they should “default to harmlessness” and not unnecessarily embarrass, humiliate or shame users (for example, by being imprecise about users’ location). Systems should also be deniable, i.e., offer users the ability to opt out at any time. Greenfield is an excellent speaker and I was impressed enough with his talk to order his book.

Open Data

Another theme was Open Data. The talk that I attended by Alf Eaton discussed how Semantic Web technologies can facilitate the sharing of scientific data and experimental results. For example, searching across all scientific literature cannot currently be done, as all we have are PDFs. What would be useful would be machine readable documents and semantic browsing, for example, a link which says show me all papers about this gene. Semantic browsing around this data is only possible if things have identifiers, i.e., each paper should have a DOI (Digital Object Identifier) assigned to it. Tools need to be made available for people to collaborate and to publish their scientific data and experiments for others to see. There was also discussion around how scientists doing experiments need to use structured data, and the ontologies being created to enable this. An interesting issue brought up in the question and answer session concerned historical data, i.e., how do you generate a unique ID for, say, the population of Britain between 1957 and the present, and also what sort of things should be uniquely identified.

This links with a great keynote by Gavin Starks, who has a background in astrophysics. His talk was about climate change, which he started off by showing us some scary statistics regarding CO2 concentration and world temperature, as well as satellite images of how much the polar ice caps have shrunk in the last decade. There is, apparently, a proven correlation between how high CO2 levels are and the temperature of the planet. I haven’t yet seen “An Inconvenient Truth” but Starks recommends it as presenting the scientific data very well. Starks then went on to present what he and his colleagues were doing about climate change, which was the launch of AMEE, a semantic web platform and “generic algorithmic engine” surrounded by an API and consisting of data supplied by DEFRA, the Royal Society, Global Cool, 0c Climate group (he said that even Rupert Murdoch has apparently bought into this). This platform enables things like toolkits for schools; tradespeople can use it to give energy profiles to their businesses; campaigners can use it to collaborate with each other, and so on.

I feel depressed about climate change and I’ve always thought that getting something done about it would be nigh impossible, as that would mean world governments having to take the lead and collaborate to tackle the problem (fat chance!). However, thanks to semantic web technology new social networks are emerging, so perhaps we will see a more grass roots movement of concerned citizens who are willing to collaborate for the survival of our species and other species. I was going to say the survival of our planet but, as Starks with his astrophysics background pointed out, the planet doesn’t care; the planet will survive no matter what we do to it. It’s the living beings on our planet, including ourselves, that we need to be concerned about. By the way, AMEE stands for Avoiding Mass Extinction Engine.

Tutorials

XPath 2.0, XQuery 1.0 and XSLT 2.0 Explained was a detailed and useful tutorial by Priscilla Walmsley which I can’t do justice to as I only attended for a morning. What I did pick up was that there are about 110 built-in functions in XPath 2.0 and also that you can write your own functions. Priscilla took us through the XPath/XQuery data model (the document node is now the top level node; this was formerly the root node, the term root no longer being used). The new version is now strongly typed (which apparently has caused some heated argument from people who think it should be more scripting like). XPath 2.0 also has new comparison operators for comparing single values (eq, ne, lt, le, gt, ge), which it would be better to use in place of the old =, != etc. as they are optimized for performance. See her new book here.

Browser Technologies

Though I’ve done front end web development with old fashioned ASP and with ASP.net I’m not all that familiar with the W3C specifications for HTML and CSS (my excuse is that I haven’t had time to read and delve into them in detail as I just needed to get a job done quickly), so the talk by Molly Holzschlag was aimed at developers like me. One of the more frustrating aspects of web development is that browsers render the same thing differently, or that things work in one browser and not in another, so this was an interesting and useful talk on browser interoperability and why browsers work the way they do. One of the problems is fractured specifications (linked to this there was some heated debate on why two specifications, XHTML and HTML 5, are needed) and ambiguities in the specifications. I hope I’m not wrong but I got the impression that, for example, one browser may have been developed to comply with one version of a specification and another browser with a different version of the same specification, or that different browsers have historically been developed to comply with different specifications altogether. No wonder, therefore, that different browsers work, well, differently! As well as some interesting historical analysis of the way browser technology has developed, she also made some suggestions as to the way forward to improve browser interoperability, such as:

Evolve tools and get community feedback.
Work to common standards and clarify any ambiguities in the W3C specifications.
Have transparent and open development cycles, rather than closed, competitive and secretive development efforts, and work from common use cases.

Time to Leave

The last talk I attended before I left was RSS Remixing by Ian Davis, who did a demo of the APIs surrounding our Talis platform. He showed how different RSS results can be used to augment each other so that, for example, you can do a search of bibliographic data and augment this with book jacket images, reviews and data from Wikipedia.

I thoroughly enjoyed my 2 days at XTech and met some really nice people as well as hearing some useful stuff.

Lucene Sorting Again

May 14, 2007 by ajmg

Fixing various issues in this story I’m implementing which uses Lucene sorting functionality. Ran into the issue that if you specify a field to be sorted by, and that field is not indexed, then IndexSearcher throws a Java runtime exception. A bit annoying, as a typed exception could be caught and dealt with more easily. So what we’ve done is retrieve the indexed field names via

IndexSearcher.getIndexReader().getFieldNames(
IndexReader.FieldOption.INDEXED)

which returns a Collection, then checked that the fields specified in the sort criteria are in this Collection.
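The check itself might look something like the sketch below. It deliberately leaves Lucene out so that it runs on its own: the Collection of indexed field names is assumed to have been fetched already with the getFieldNames call above, and SortFieldCheck, findUnindexed, requireIndexed and UnindexedSortFieldException are all hypothetical names rather than anything in Lucene or our code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustrative sketch: none of these names come from Lucene.
final class SortFieldCheck {

    /** Hypothetical typed exception, so that callers can catch this specific problem. */
    static final class UnindexedSortFieldException extends Exception {
        final List<String> fields;

        UnindexedSortFieldException(List<String> fields) {
            super("cannot sort on fields that are not indexed: " + fields);
            this.fields = fields;
        }
    }

    /** Returns the requested sort fields that are missing from the indexed field names. */
    static List<String> findUnindexed(Collection<String> indexedFields, List<String> sortFields) {
        List<String> missing = new ArrayList<>();
        for (String field : sortFields) {
            if (!indexedFields.contains(field)) {
                missing.add(field);
            }
        }
        return missing;
    }

    /** Fails with the typed exception before the search is ever attempted. */
    static void requireIndexed(Collection<String> indexedFields, List<String> sortFields)
            throws UnindexedSortFieldException {
        List<String> missing = findUnindexed(indexedFields, sortFields);
        if (!missing.isEmpty()) {
            throw new UnindexedSortFieldException(missing);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the Collection returned by getFieldNames(IndexReader.FieldOption.INDEXED).
        Collection<String> indexed = Arrays.asList("title", "author");
        try {
            requireIndexed(indexed, Arrays.asList("title", "isbn"));
        } catch (UnindexedSortFieldException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Doing the check up front means the caller gets one exception naming every offending field, rather than a bare runtime exception from deep inside the search.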