Names, URIs and why the web discards 50 years of computing experience

Names and naming have always been a big issue both in computer science and philosophy, and a topic I have posted on before (see “names – a file by any other name”).

In computer science, and in particular programming languages, a whole vocabulary has arisen to talk about names: scope, binding, referential transparency. As in philosophy, it is typically the association between a name and its ‘meaning’ that is of interest. Names and words, whether in programming languages or day-to-day language, are what philosophers call ‘intentional’: they refer to something else. In computer science the ‘something else’ is typically some data or code, or a placeholder/variable containing data or code, and the key question of semantics or ‘meaning’ is about how to identify which variable, function or piece of data a name refers to in a particular context at a particular time.

The emphasis in computing has tended to be about:

(a) Making sure names have an unambiguous meaning when read locally within code. Concerns such as referential transparency, avoiding dynamic binding and the deprecation of global variables are all about this.

(b) Putting boundaries on where names can be seen/understood, both as a means to ensure (a) and also as part of encapsulation of semantics in object-based languages and abstract data types.

However, there has always been a tension between clarity of intention (in both the normal and philosophical sense) and abstraction/reuse. If names are totally unambiguous then it becomes impossible to say general things. Without a level of controlled ambiguity in language, a legal statement such as “if a driver exceeds the speed limit they will be fined” would need to be stated separately for every citizen. Similarly, in computing, when we write:

function f(x) { return (x+1)*(x-1); }

The meaning of ‘x’ is different when we use it in ‘f(2)’ or ‘f(3)’, and must be so to allow ‘f’ to be used generically. Crucially, there is no internal ambiguity: the two ‘x’s refer to the same thing within a particular invocation of ‘f’, but the precise meaning of ‘x’ for each invocation is supplied by external binding (the argument list ‘(2)’ or ‘(3)’).
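
To make the external binding concrete, here is a tiny sketch (plain JavaScript, in the same style as the snippet above; the console.log calls are just for illustration):

function f(x) { return (x+1)*(x-1); }

console.log( f(2) );   // x is bound to 2, so (2+1)*(2-1) = 3
console.log( f(3) );   // x is bound to 3, so (3+1)*(3-1) = 8

The body of ‘f’ never changes; only the binding of ‘x’, supplied from outside by each argument list, does.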

Come the web and URLs and URIs.

Fiona@lovefibre was recently making a test copy of a website built using WordPress. In a pure HTML website this is easy (so long as you have used relative or site-relative links within the site): you just copy the files, put them in the new location and they work 🙂 Occasionally a more dynamic site does need to know its global name (URL), for example if you want to send a link in an email, but this can usually be achieved using a configuration file. For example, there is a development version of Snip!t at cardiff.snipit.org (rather than www.snipit.org), and there is just one configuration file that needs to be changed between the test site and the live one.
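
To sketch what I mean (hypothetical names and paths, not the actual Snip!t code), the global site name lives in exactly one place and every absolute link is built from it:

// config.js -- the only place the site's global name appears
var SITE_URL = "http://www.snipit.org";        // on the test site this would be the test address

// absolute links (e.g. for emails) are built from the config;
// all other links within the site stay site-relative
function absoluteLink(path) { return SITE_URL + path; }

console.log( absoluteLink("/snip/1234") );     // prints the full URL for whichever site this is

Move the whole site and only that one value needs to change.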

Similarly in a pristine WordPress install there is just such a configuration file and one or two database entries. However, as soon as it has been used to create a site, the database content becomes filled with URLs. Some are in clear locations, but many are embedded within HTML fields or serialised plugin options. Copying and moving the database requires a series of SQL updates with string replacements matching the old site name and replacing it with the new — both tedious and needing extreme care not to corrupt the database in the process.
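
To give a flavour of what is involved, here is a rough sketch (in JavaScript, simply printing the SQL statements one would have to run; the table and column names follow the usual WordPress-style schema but are shown for illustration only):

var oldSite = "http://www.example.org";
var newSite = "http://test.example.org";

// each table/column pair that may hold the old site name
var targets = [
  [ "wp_options", "option_value" ],   // settings, including serialised plugin options
  [ "wp_posts",   "post_content" ],   // URLs embedded inside HTML fields
  [ "wp_posts",   "guid" ]
];

for (var i = 0; i < targets.length; i++) {
  var table = targets[i][0], column = targets[i][1];
  console.log("UPDATE " + table + " SET " + column +
              " = REPLACE(" + column + ", '" + oldSite + "', '" + newSite + "');");
}

A blind textual REPLACE like this is exactly where the danger lies: serialised options record string lengths, so swapping in a URL of a different length can corrupt them.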

Is this just a case of WordPress being poorly engineered?

In fact I feel it is more a problem endemic to the web, driven largely by the URL.

Recently I was experimenting with Firefox extensions. Being a good 21st-century programmer, I simply found an existing extension that was roughly similar to what I was after and started to alter it. First, of course, I changed its name, and then found I needed to make changes through pretty much every file in the extension, as knowledge of the extension name seemed to permeate to the lowest levels of the code. To be fair, XUL has mechanisms to achieve a level of encapsulation, introducing local URIs through the ‘chrome:’ naming scheme, and having been through the process once I perhaps understand a bit better how to design extensions to make them less reliant on the external name, and also which names need to be changed and which are more like the ‘x’ in the ‘f(x)’ example. However, despite this, the experience was very different from the levels of encapsulation I have learnt to take for granted in traditional programming.
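
To give a sense of how the name permeates, here is a made-up fragment (not from the real extension I was altering): the package name gets hard-wired into the chrome: URIs scattered through the code:

// the package name "myext" appears in every chrome: URI the code uses
var STYLESHEET = "chrome://myext/skin/toolbar.css";
var STRINGS    = "chrome://myext/locale/strings.properties";
var MAIN_XUL   = "chrome://myext/content/overlay.xul";

Renaming the extension means hunting down every one of these; quite unlike the ‘x’ in ‘f(x)’, each is fixed in the text rather than bound from outside.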

Much of the trouble resides with the URL. Going back to the two issues of naming, the URL focuses strongly on (a), making the name unambiguous by having a single universal namespace; URLs are a bit like saying “let’s not just refer to ‘Alan’, but ‘the person with UK National Insurance Number XXXX’, so we know precisely who we are talking about”. Of course this focus on uniqueness of naming has a consequential impact on generality and abstraction. There are many visitors on Tiree over the summer, and maybe one day I meet one at the shop and then a few days later pass the same person out walking; I don’t need to know the person’s NI number or URL in order to say it was the same person.

Back to Snip!t: over the summer I spent some time working on the XML-based extension mechanism. As soon as these extensions became even slightly complex I found URLs sneaking in, just like the WordPress database 🙁 The use of namespaces in the XML file can reduce this, by at least limiting full URLs to the XML header, but still, embedded in every XML file are un-abstracted references … and my pride in keeping the test site and live site near identical was severely dented1.

In the years when the web was coming into being, the Hypertext community had been reflecting on more than 30 years of practical experience, embodied particularly in the Dexter Model2. The Dexter model and some systems, such as Wendy Hall’s Microcosm3, incorporated external linkage; that is, the body of content had marked hot spots, but the association of these hot spots to other resources was in a separate external layer.

Sadly, HTML opted for internal links in anchor and image tags in order to make HTML files self-contained, a pattern replicated across web technologies such as XML and RDF. At a practical level this is (i) why it is hard to have a single anchor link to multiple things, as was common in early Hypertext systems such as Intermedia, and (ii), as Fiona found, a real pain for maintenance!
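
As a very rough sketch of the external-linkage idea (my own illustration in JavaScript, not the actual Dexter or Microcosm data model), an external link layer is simply a mapping from marked hot spots in the content to any number of target resources, held outside the content itself:

// the content only marks hot spots by id; the links live in a separate layer
var links = {
  "hotspot-42": [ "/papers/dexter.html", "/people/halasz.html" ],   // one anchor, several targets
  "hotspot-43": [ "/projects/microcosm.html" ]
};

function targetsFor(hotspotId) {
  return links[hotspotId] || [];   // the content itself never changes when links change
}

console.log( targetsFor("hotspot-42") );

Because the links sit in their own layer, one hot spot can point to several things, and moving a site means updating the link layer, not every document.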

  1. I actually resolved this by a nasty ‘hack’ of having internal functions alias the full site name when encountered and treating them as if they refer to the test site — very cludgy![back]
  2. Halasz, F. and Schwartz, M. 1994. The Dexter hypertext reference model. Commun. ACM 37, 2 (Feb. 1994), 30-39. DOI= http://doi.acm.org/10.1145/175235.175237[back]
  3. Hall, W., Davis, H., and Hutchings, G. 1996. Rethinking Hypermedia: The Microcosm Approach. Kluwer Academic Publishers.[back]

grammer aint wot it used two be

Fiona @ lovefibre and I have often discussed the worrying decline of language used in many comments and postings on the web. Sometimes people are using compressed txtng language or even leetspeak; both of these are reasonable alternative codes to ‘proper’ English, and potentially part of the natural growth of the language. However, it is often clear that the cause is ignorance, not choice. One of the reasons may be that many more people are getting a voice on the Internet; it is not just the journalists, academics and professional classes. If so, this could be a positive social sign, indicating that a public voice is no longer restricted to university graduates, who, of course, know their grammar perfectly …

Earlier today I was using Google to look up the author of a book I was reading and one of the top links was a listing on ratemyprofessors.com.  For interest I clicked through and saw:

“He sucks.. hes mean and way to demanding if u wanan work your ass off for a C+ take his class”1

Hmm I wonder what this student’s course assignment looked like?


  1. In case you think I’m a complete pedant: personally, I am happy with both the slang ‘sucks’ and ‘ass’ (instead of ‘arse’!), and the compressed speech ‘u’. These could be well-considered choices in language. The mistyped ‘wanna’ is also just a slip. It is the slightly more proper “hes mean and way to demanding” that seems to show a general lack of understanding. Happily, the other comments were not as bad as this one, but I did find the student who wanted a “descent grade” amusing 🙂 [back]

European working time directive 2012 – the end of the UK university?

Fiona @ lovefibre just forwarded me a link to a petition about retained firefighters, who evidently may be at risk as the right to opt out of the European working time directive is rescinded. Checking through to the Hansard record, it seems this is really a precautionary debate, as the crunch is not until 2012.

However, I was wondering how that was going to impact UK academia if, in 2012, the 48-hour maximum cuts in.

It may make no difference if academics are not required to work more than 48 hours, but just decide to do so voluntarily. However, this presumably has all sorts of insurance ramifications – if we do a reference or paper outside the ‘official hours’, would we be covered by the University’s professional indemnity? I guess also, in considering promotions and appointments, we would have to ‘downgrade’ someone’s publications etc. to include only those that were done during paid working hours; otherwise we would effectively be making the extra hours a requirement (as we currently do).

The university system has become totally dependent on these extra hours. In a survey in the early 1990s the average hours worked were over 55 per week, and in the 15 years since then this has gone up substantially. I would guess the average is now well over 60, with many academics getting close to double the 48-hour maximum. I recall one colleague, who had recently had a baby, mentioning how he had cut back on work; now he stops work at 5pm … and doesn’t start again until 7:30pm; his ‘cut back’ week was still way in excess of 60 hours, even with a young baby1. Worryingly, this has spread beyond the academics, and departmental administrators are often at their desks at 7 or 8 o’clock in the evening, taking piles of work home and answering email through the weekend. While I admire and appreciate their devotion, one has to wonder at the impact on their personal lives.

So, at a human level, enforcing limited working hours would be no bad thing; certainly many companies force this, forbidding work outside office hours. However, practically speaking, if the working time directive does become compulsory in 2012, I cannot imagine how the university system could continue to function.

And … if you are planning to do a 3 year course, start now; who knows what things will be like after 3 years!

  1. Yea, and I know I can’t talk: as an inveterate workaholic I ‘cut back’ from a high of averaging 95 hours a few years ago and now try to keep to around 80 max. I was, however, very fortunate in that I was doing a PhD and then personal fellowships when our children were small, so was able to spend time with them and only later got mired in the academic quicksands.[back]