visualising vocabulary

Fiona just pointed me to Visuwords, a lovely visualisation of word association using WordNet data. The image below is of the word ‘human’, and you can see two clusters: one corresponding to the noun human and one to the adjective meaning humane/caring.
Visuword visualisation of the word 'human'

Visuwords has a Flash front-end and a PHP backend. It appears to use some variant of spring and ball visualisation. You can download the source … so one could use it as the basis of visualisation for other kinds of data, such as web sites.
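For anyone wondering what a spring and ball layout involves, here is a minimal sketch in Python. It is not the Visuwords code (which I have not read); the function name `spring_layout` and all the constants are my own assumptions, chosen only to make the idea concrete. Every pair of nodes repels, every link acts as a spring, and the positions are nudged a little each round until things settle.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, rest_length=1.0,
                  spring_k=0.1, repulsion_k=0.5, step=0.1, max_move=0.5, seed=0):
    """Toy spring-and-ball layout. `nodes` is a list, `edges` a list of pairs.

    All parameter values are illustrative assumptions, not tuned settings.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Every pair of balls repels, more strongly the closer they are.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion_k / (dist * dist)
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Each edge is a spring pulling its ends towards the rest length.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = spring_k * (dist - rest_length)
            force[a][0] += pull * dx / dist
            force[a][1] += pull * dy / dist
            force[b][0] -= pull * dx / dist
            force[b][1] -= pull * dy / dist
        # Move each ball along its net force, capping the distance per round.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            scale = step
            if magnitude * step > max_move:
                scale = max_move / magnitude
            pos[n][0] += scale * fx
            pos[n][1] += scale * fy
    return pos


# Made-up word graph, loosely in the spirit of the 'human' picture.
words = ["human", "person", "humane", "caring"]
links = [("human", "person"), ("human", "humane"), ("humane", "caring")]
layout = spring_layout(words, links)
```

The cap on movement per round is there only to stop two balls that start very close together from being flung far apart; real layouts typically use some form of cooling schedule for the same purpose.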

multiple representations – many chairs in the mind

I have just started reading Andy Clark’s “Being There”1 (maybe more on that later), but early on he reflects on the MIT COG project, which is a human-like robot torso with decentralised computation – coherent action emerging through interactions not central control.

This reminded me of results of brain scans (sadly, I can’t recall the source), which showed that the areas in the brain where you store concepts like ‘chair’ are different from those where you store the sound of the word – and, I’m sure, the spelling of it too.

This makes sense of the “tip of the tongue” phenomenon: you know that there is a word for something, but can’t find the exact word. Even more remarkable is that if you know words in different languages you can know this separately for each language.

So, musing on this, there seem to be very good reasons why, even within our own mind, we hold multiple representations for the “same” thing, such as chair, which are connected, but loosely coupled.


  1. Andy Clark. Being There. MIT Press. 1997. ISBN 0-262-53156-9. book@MIT

I was just sent a link to an article in The Psychologist, “Sleep on a Problem… It works like a dream” by Josephine Ross.

The article gathers loads of anecdotal evidence of creativity in dreams … including, inevitably, those benzene rings!

Personally … while I’m sure that some things happen unconsciously and during sleep, my guess is that 90% of these creativity stories have simpler explanations, such as selective memory or semi-random inspiration.


tagging … I am not alone … or am I?

I’ve noticed that I reuse very few tags … and thought I was just a poor tag-user. However, the other day I read a reference to a paper at the CSCW conference last year; it reported that the average number of re-uses of a tag was just 1.311 . I thought this meant that most tags are never reused … I am not alone 🙂

Having downloaded and read the paper, it turns out that this is the average number of users who use a tag – that is, most tags are used by only one person. In fact, individuals reuse their own tags a lot more … so I am a no-good tagger after all 🙁
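The two readings of “re-use” are easy to confuse, so here is a small sketch that computes both from the same data. The tagging events are entirely made up and the function name `tag_reuse_stats` is my own; this is not how the paper computed its figure, just an illustration of why the two averages can differ so much.

```python
from collections import Counter, defaultdict


def tag_reuse_stats(applications):
    """applications: iterable of (user, tag) pairs, one per tagging event.

    Returns (average distinct users per tag,
             average times a user applies each of their own tags).
    """
    users_per_tag = defaultdict(set)
    uses_per_user_tag = Counter()
    for user, tag in applications:
        users_per_tag[tag].add(user)
        uses_per_user_tag[(user, tag)] += 1
    avg_users_per_tag = (
        sum(len(users) for users in users_per_tag.values()) / len(users_per_tag)
    )
    avg_uses_per_user_tag = (
        sum(uses_per_user_tag.values()) / len(uses_per_user_tag)
    )
    return avg_users_per_tag, avg_uses_per_user_tag


# Hypothetical data: two users, three tags, nine tagging events.
events = [
    ("ann", "hci"), ("ann", "hci"), ("ann", "hci"),
    ("ann", "web"), ("ann", "web"),
    ("bob", "travel"), ("bob", "travel"), ("bob", "travel"),
    ("bob", "web"),
]
users_per_tag, uses_per_own_tag = tag_reuse_stats(events)
```

In this toy data only ‘web’ is shared, so the users-per-tag average stays close to 1 even though each person comes back to their own tags more than twice on average.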

Incidentally, I use the Ultimate Tag Warrior plugin for WordPress and it seems OK. The only drawback is that if you want tags displayed with your post, they really get inserted into the post itself. This is not a problem for tags at the end, but would mess up an RSS feed if you like your tags above the post. I guess this is because WordPress does not have a handle for plugins to add things to display loops, so the only way to ensure the tags are displayed is to make them part of the post.

Also, Nad sent me a link to a neat tag visualisation by Moritz Stefaner.

  1. Sen, S., Lam, S. K., Rashid, A., Cosley, D., Frankowski, D., Osterhouse, J., Harper, F. M., and Riedl, J. 2006. tagging, communities, vocabulary, evolution. In Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work (Banff, Alberta, Canada, November 04 – 08, 2006). CSCW ’06. ACM Press, New York, NY, 181-190. DOI= http://doi.acm.org/10.1145/1180875.1180904

digital culture

I was at Futuresonic last Friday doing a panel keynote at the Social Technologies Summit. I talked about various things connected to imagination: bad ideas, regret modelling and firefly/fairylights technology. On the same panel was a guy from Saatchi & Saatchi who created television adverts for T-Mobile and a lady from Goldsmiths who described a project for Intel where they studied a London bus route. The chair, Eric, introduced the session with a little about blogging and other web-based technologies and in general we were immersed in the ways in which digital culture pervades the day to day world.

On my way home on the train I sat opposite a father and son who were playing hangman. The boy was about 6 or 7 and the father had to help him and sometimes correct him. Every so often I noticed the words they chose, but just before I got off the train there was obviously the father’s hardest challenge yet. I gradually noticed the heightened excitement in the voices … it was a word with ‘X’ and ‘Y’ in it.

As I stood to get up, the boy eventually got the last letters and completed the word …

F O X Y B I N G O . C O M

omnignorance, the future of the web

The dream of the web seems a form of omniscience, unlimited and universal knowledge available at the click of a button, or at least at the click of a Google ‘Search’ button. However, last night I was on a site that epitomised the problems of the web.

I was pointed to a blog entry about a presentation that the blog said was “Simply the best presentation I’VE EVER SEEN!“. I was intrigued and followed this to the Identity2.0 site and in particular a page about a keynote at OSCON 2005.

The entry about the talk mentioned ‘Identity2.0’ and ‘digital identity’ so I guessed this was something to do with single logins (like MS Passport) or open authentication, which has long been an open issue (with many ‘solutions’ but so far little success). However, was this person and this site talking about one of these, such as plain open authentication, or something deeper?

Well, it is fine for the page about the talk not to say clearly; it is written within the context of the Identity2.0 site. So I looked for an ‘about’ link or something like that … nothing. I stripped the URL back to plain identity2.0.com and of course simply got to a standard blog front page (I guess rather like this one), with the latest news, but nothing to give context or background.

In fact by chasing yet more links to other sites, by half guessing from various abstracts, blog entries, etc. I managed to pick out half a story about what this was about … but why so hard?

This reminds me of the problem I recorded a month or so back when trying to find last week’s (as opposed to yesterday’s) news. The web makes it really easy to find the latest item or even the hottest item, but is really bad at getting to the background that gives context and turns buzz words into meaning.

At the risk of sounding like a codger at a cafe table, the same is true of much software documentation. If you have seen the software grow and develop over the years it makes sense, but to students trying to make sense of Java packages, AJAX, Mac/Windows APIs, it is like fumbling in the dark. Good signposting at every street corner, but no roadmaps.

In all these cases there are real questions we want to ask; this is not like meandering around Flickr or YouTube, travelling just for the journey. However, definitive statements (I’ll not say answers) give way to half-overheard conversations in a coffee shop.

It is often said that experts know more and more about less and less until eventually they know everything about nothing. It seems we are turning into a generation who know less and less about more and more until we know nothing about everything – omnignorance rules.

Practicing security

Last week I was stuck a night in Frankfurt due to the high winds. Soon after I settled into the 17th floor of the Holiday Inn, gusts screaming round the building, I got a call on my mobile. I thought it would be my taxi driver confirming the time for the re-booked flight the next day, but instead there was an unfamiliar voice:

“This is HM Revenue and Customs, we have a message for you, but first can we confirm your name and address”

Actually name and address to phone number is not that secret, but still I asked:

“How do I know you are who you say you are?”

“If you prefer you can ring back on …. it is to your advantage”

I checked that it would be OK to ring the next day on my return and rang off … I didn’t say (life is too short), that being given a number to ring back does not increase my confidence unless I can verify it.

In fact the next day I checked on the HMR&C site and the number was their helpline. However, this call had many hallmarks of a fraudulent call: how could I, or a less technically aware citizen, know this was a good call? In this case the information requested was relatively innocuous (but of course could easily continue, “and date of birth … bank details …”) and the phone number given was an 0845 number which costs the recipient money … either genuine or high-value fraud. Of course, if I were fraudulently ringing up people pretending to be HMR&C, at any sign of trouble giving the genuine helpline number would be just what I would do to allay suspicion!

It is not just HMRC that make calls like this; banks and credit card companies are forever ringing up and asking you to confirm your identity … and that usually does include giving some form of security code. But they have rung you up, so they have more confidence in who you are than you do in them, yet they never offer any means to confirm their identity.

Email too: I have received various mails from banks which look very like phishing emails. In one case I received an email where the domain of the sender was different from the domain of the reply email and different again from the domain of the URL link. It goes without saying that none of these were the same as the standard domain of the bank. In this case the only reason I knew it was not phishing was that it offered information and did not request anything secure.
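The three-way mismatch I describe is simple enough to check mechanically. The sketch below uses only Python’s standard `email` and `urllib` modules; the function names, the sample message and the `.example` domains are all made up for illustration, and the ‘same domain or a subdomain of it’ rule is a deliberately naive assumption rather than how any real mail filter decides.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def message_domains(raw_message):
    """Collect the domains a plain-text message presents to its reader."""
    msg = message_from_string(raw_message)

    def header_domain(name):
        # parseaddr splits "Name <user@host>" into (name, address).
        _, address = parseaddr(msg.get(name, ""))
        return address.rpartition("@")[2].lower()

    body = msg.get_payload()
    if not isinstance(body, str):  # multipart messages are out of scope here
        body = ""
    link_domains = {urlparse(url).hostname or "" for url in URL_PATTERN.findall(body)}
    domains = {header_domain("From"), header_domain("Reply-To")} | link_domains
    domains.discard("")
    return domains


def unexpected_domains(raw_message, expected):
    """Domains that are neither `expected` nor one of its subdomains."""
    return {d for d in message_domains(raw_message)
            if d != expected and not d.endswith("." + expected)}


# Hypothetical message with three different domains, none of them the bank's.
sample = (
    "From: Your Bank <news@bank-mailings.example>\n"
    "Reply-To: service@bank-replies.example\n"
    "Subject: Your statement is ready\n"
    "\n"
    "View it at http://bank-offers.example/statement\n"
)
odd = unexpected_domains(sample, "bank.example")
```

A message that only ever mentions the expected domain gives an empty set. Of course a clean result proves very little, since headers are easy to forge; the point is only that the genuine mail I received would fail even this crude test.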

By sending emails and making phone calls that are virtually indistinguishable from fraudulent ones, the banks (and even HMR&C) are training us to be victims of fraud.

Literally we are encouraged to practice being insecure.

the power of sequential thinking

A short while ago I was mentioning to another computing academic at a meeting the curious fact that the computational power of the complete internet is now roughly similar to that of a single human brain [[see article here]]. While this little factoid is deliberately provocative, I did not expect the strength of the response.

“that’s impossible” he said.

“why” I asked, “I’m not saying they are similar, just that there is the same computational potential”

“Computers are sequential” he said, “brains are associative”.

Further attempts to reason, likening it to other forms of simulation or emulation, simply met with the same flat response, a complete unwillingness to entertain the concept.

Partly this is to do with the feeling that this somehow diminishes us as people: what for me was a form of play with numbers, for him was perhaps an assault on his integrity as a human. I guess as a Christian I’m used to the idea that the importance of a person is not that we are clever or anything else, but that we are loved and chosen. So, I guess, for me this is less of an insult to my idea of being who I am.

This aside it is interesting that the reason given was about the mode of computation: “computers are sequential” vs. the massively parallel associativity of the human brain.

Of course, if the computational substrate is all the PCs connected to the Internet then this is hardly purely sequential, and in fact one of the reasons that you could not ‘run’ a brain simulation on the Internet is that communication is too slow. Distributed computation over 100s of millions of PCs on the internet could not synchronise in the way that long-range synapses do within our brains.

Amongst other things it is suggested that our sense of consciousness is connected with the single track of synchronised activity enabled by the tight interconnections and rapid feedback loops within our brains1. In contrast, individual computers connected to the internet compute far faster than they can communicate; there could be no single thread of attention switching at the rate that our minds can.

If the internet were to think it would be schizophrenic.

Sequence is also important in other ways. As the man said, our brains are associative. When considering spreading activation mechanisms for intelligent internet interfaces, one of the problems is that associative stuff gets ‘mixed up’. If London has a high level of activation, why is that? In a designed computational framework it is possible to consider multiple ‘flavours’ of activations spreading through a network of concepts, but our brains do not do this, so how do they manage to separate things?

Now to some extent they don’t – we get an overall feel for things, not seeing the world as little pieces. However, it is also important to be able to more or less accurately ascribe feelings and associations to things. Consider one of those FBI training ranges where bank terrorists and hostages pop out from behind windows or doors. Your aim is to shoot the terrorists and save the hostages. But, if you see a robber holding a hostage, how do you manage to separate the ‘bad and kill’ feelings and properly ascribe them only to the terrorist and not the hostage?

The answer may well be due exactly to the switching of attention. Even when both terrorist and hostage are next to each other, as mental attention shifts momentarily to one and then the other, the mental associations also shift. Rodney Cotterill in Enchanted Looms describes two levels of attention switching2. One near conscious and taking around 500ms and one connected with more low-level visual attention (sometimes called a visual searchlight) at 20-50ms. It is probably the slower timescales that allow fuller webs of association to build and decay, but maybe there are other intermediate timescales of attention switching as well.

If this is right then the rapid sequential shifts of attention could be essential for maintaining the individual identity of percepts and concepts.

If we look at concepts on their own, another story of sequence unfolds.

There is a bit of a joke among neuroscientists about grandmother cells. This is the idea that there is a single neuron that in some way encodes or represents your grandmother3.

Looking at this purely from a computing science perspective, even if there were not neurological reasons for looking for more distributed representations, there are computational ones. If concepts were stored in small local assemblies of neurons (not single ones, to allow some redundancy and robustness) and even a reasonably large part of our brains were dedicated to concept memory, then there just seem too few ‘concept-slots’.

If we used 100 neurons per concept and 10% of the brain for concept memory, we would only have space for around 10 million concepts. A quick scan through the dictionary suggests I have a recognition vocabulary of around 35,000 words, so that means I’d have fewer than 300 other concepts per dictionary word. Taking into account memories of various kinds, it just seems a little small. If we take into account the interconnections then we have plenty of potential long-term storage capacity (1/2 petabyte or so), but not if we try to use individual groups of neurons to represent things. Grandmother cells are simply an inefficient use of neurons!
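The arithmetic above can be spelled out in a few lines. Note that the total neuron count is not stated in the text: the figure of 10 billion below is an assumption, picked because it is the value that reproduces the 10 million concepts quoted; a larger count scales the answer in proportion. The function name is likewise just for illustration.

```python
def concept_slots(total_neurons, percent_for_concepts, neurons_per_concept):
    """Number of local 'concept-slots' if each concept gets its own assembly."""
    neurons_for_concepts = total_neurons * percent_for_concepts // 100
    return neurons_for_concepts // neurons_per_concept


TOTAL_NEURONS = 10**10      # ASSUMPTION: not given in the text, see note above
PERCENT_FOR_CONCEPTS = 10   # from the text: 10% of the brain
NEURONS_PER_CONCEPT = 100   # from the text
VOCABULARY = 35_000         # from the text: recognition vocabulary

slots = concept_slots(TOTAL_NEURONS, PERCENT_FOR_CONCEPTS, NEURONS_PER_CONCEPT)
slots_per_word = slots / VOCABULARY

print(slots)                   # 10000000
print(round(slots_per_word))   # 286, i.e. fewer than 300 per dictionary word
```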

Now there is also plenty of neurological evidence for more distributed storage. Walter Freeman describes how he and his team lovingly chopped the tops off rabbits’ skulls, embedded electrodes into their olfactory bulbs and then gently nursed them back to health4. The rabbits were then presented with different smells and each smell produced a distinctive pattern of neuron firings, but these patterns extended across the bulb, not localised to a few neurons.

If neurons had ‘continuous’ levels of activation it would be possible to represent things like “1/2 think it is a dog 1/2 think it is a fox”, simply as an overlay of the activation of each. However, if this were the case, and one could have in mind any blend of concepts, then an assembly of N neurons would still only be able to encode up to N concepts, as the concept patterns would form a set of basis vectors for the N-dimensional vector space of possible activation levels (a bit of standard linear algebra).
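To make the linear algebra concrete, here is a small sketch with N = 3 ‘neurons’. The activation patterns and concept names are invented for illustration. The `rank` helper is ordinary Gaussian elimination using exact fractions: it counts how many of the patterns are linearly independent, which can never exceed N.

```python
from fractions import Fraction


def rank(vectors):
    """Number of linearly independent vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        # Look for a row (not yet used as a pivot) with a non-zero entry here.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        # Clear this column from every other row.
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


# Hypothetical activation patterns over three neurons.
dog = [1, 0, 0]
fox = [0, 1, 0]
cat = [0, 0, 1]
wolf = [1, 1, 0]  # a would-be fourth concept

print(rank([dog, fox, cat]))        # 3
print(rank([dog, fox, cat, wolf]))  # still 3
```

The fourth pattern adds nothing new: ‘wolf’ here is exactly the overlay of ‘dog’ and ‘fox’, so if blends are allowed it cannot be told apart from thinking of both at once.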

In fact, neurons tend to behave non-linearly and in many areas there are patterns of inhibition as well as mutual excitation and disinhibition, leading to winner-takes-all effects. If this is true of the places where we represent concepts for short term memory, conscious attention, etc., then this means instead of representations that ‘add up’, we have each pattern potentially completely different, similar to the way binary numbers are encoded in computer memory: 1010 is not a combination of 1000 and 0010 but completely different.

In principle this kind of representation allows 2^N (two to the power of N) rather than N different concepts using the same N neurons … In reality, almost certainly representations are less ‘precise’ allowing some levels of similarity in representations etc., so the real story will be more complex, but the basic principle holds that combinations of thresholding and winner-takes-all allow more distinct concepts than would be possible if combinations of concepts can occur more freely.
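The idealised counting argument can be sketched as follows, for N = 4 all-or-nothing ‘neurons’. The codebook and its concept labels are made up; the only point is that each on/off pattern can be given its own unrelated meaning, so there are 2^N entries rather than N.

```python
from itertools import product


def binary_patterns(n):
    """All on/off firing patterns of n thresholded units."""
    return list(product((0, 1), repeat=n))


N = 4
patterns = binary_patterns(N)
print(len(patterns))  # 16, that is 2**4, compared with 4 under free blending

# Hypothetical codebook: every pattern is its own entry with its own label.
codebook = {pattern: f"concept-{i}" for i, pattern in enumerate(patterns)}

# 1010 has its own entry; nothing ties it to the entries for 1000 and 0010.
separate = len({codebook[(1, 0, 1, 0)],
                codebook[(1, 0, 0, 0)],
                codebook[(0, 0, 1, 0)]}) == 3
```

The price is visible in the data structure itself: a lookup holds one pattern at a time, so two concepts can only be considered one after the other.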

However, notice again that higher capacity to deal with more concepts is potentially bought at the cost of being able to think of fewer things ‘at once’ – and the side effect is that we have to serialise.

Returning to the “computers are sequential, brains are associative” argument, whilst not denying the incredible parallel associativity of human memory, actually there seems as much to wonder about in the mechanisms that the brain ‘uses’ for sequentiality and the gains it gets because of this.

  1. see Gerald Edelman, Wider than the Sky, Yale University Press, 2004, ISBN 0-300-10229-1
  2. Rodney Cotterill, Enchanted Looms: Conscious Networks in Brains and Computers, Cambridge University Press, 1998, ISBN 0-521-62435-5. See p. 244 for 500ms switching and pp. 261 and 265 for 20-50ms spotlight/searchlight of attention
  3. Although the grandmother cell idea is generally derided as oversimplistic, there is evidence that there is more neuron specialisation than previously thought [[see Mind Hacks: evidence for ‘Grandmother Cells’]]. Also it is easier to encode relationships if there are single patches than configurations of neurons, so perhaps we have both mechanisms at work.
  4. Walter J. Freeman, How Brains Make Up Their Minds, Phoenix, 1999, ISBN 0-75381-068-9. See p. 95 onwards for rabbit olfactory bulb experiments.