details matter: infinite scrolling and feature interaction

Many sites now dynamically add content to a page as you scroll down; this includes both Facebook and Twitter feeds, which add content as you get near the bottom.  In many ways this is a good thing, if users have to click to get to another page, they often never bother1.  However there can be unfortunate side effects … sometimes making sites un-navigable on certain devices.  There are particular problems on MacOS, due to the removal of scrollbar arrows, a usability disaster anyway, but confounded by feature interactions with other effects.

A recent example was when I visited the SimoleonSense blog in order to find an article corresponding to an image about human sensory illusions.  The image had been shared in Facebook, and I found, when I tried to search for it, also widely pinned in Pinterst, but the Facebook shares only linked back to the image url and Pinterst to the overall site (why some artists hate Pintrest).  However, I wanted to find the actual post on the site that mentioned the image.

Happily, the image url, http://www.simoleonsense.com/wp-content/uploads/2009/02/hacking-your-brain1.jpg, made it clear that it was a WordPress blog and the image had been uploaded in February 2009, so I edited the url to http://www.simoleonsense.com/2009/02/ and started to browse.  The site is a basically a weekly digest and so the page returned was already long.  I must have missed it on my first scan down, so I hit the bottom of the page, it dynamically added more content, and I continued to scroll.  Before long the scrollbar handle looked very small, and the page very big and every time I tried to scroll up and down the page appeared to go crazy, randomly scrolling anywhere, but not where I wanted.

It took me a while to realise that the problem was that the scrollbar had been ‘enhanced’ by the website (using the WordPress infinite scroll plugin), which not only added infinite scrolling, but also ‘smart scrolling’, where a click on the scrollbar makes an animated jump to that location on the scrollbar.  Now many early scrollbars worked in this way, and the ‘smart scroll’ options is inspired by the fact that Apple rediscovered this in iOS for touch screen interaction.  The method gives rapid interaction, especially if the scrollbar is augmented by ‘tips’ on the scrollbar (see the jQuery smartscroll demo page).

Unfortunately, this is different from the Mac normal behaviour when you click above or below the handle on a scrollbar, which effectively does screen up/down.  So, I was trying to navigate up/down the web page a screen at a time to find the relevant post, and not caring where I clicked above the scroll handle, hence the apparently random movements.

This was compounded by two things.  The first is a slight bug in the scrolling extension which means that sometimes it doesn’t notice your mouse release and starts scrolling the page as you move your mouse around.  This is a bug I’ve seen in scrolling systems for many years, not taking into account all the combinations of mouse down/up, enter/leave region etc., and is present even in Google maps.

The second compounding factor is that since MacOS got rid of the scrollbar arrows (why? Why? WHY?!!), this is now the only way to reliably do small up/down movements if you don’t have a scroll wheel mouse or similar.

Now, in fact, my Air has a trackpad and I think Apple assumes you will use this for scrolling, but I have single-finger ‘Tap to click’ turned off to prevent accidental selections, and (I assume due to a persistent bug) this turns off the two finger scrolling gesture as well (even though it is shown as on in the preferences), so no scrolling from the touchpad.

Since near the beginning of my career I have been fascinated by these fine design decisions and have written previously about scrollbars, buttons, etc.  They are often overlooked as they form part of the backdrop to more significant applications and information.  However, the very fact that they are the persistent backdrop of interaction makes their fluid usability crucial, like the many mundane services, buses, rubbish collection, etc., that make cities work, but are often unseen and unnoticed until they fail.

Also note that this failure was not due to any single feature or bug, but the way these work together what the telephony industry originally named ‘feature interaction‘, but common across all technological systems  There is no easy fix, apart from (i) thinking of all possible scenarios (reach for your formal methods in HCI!) and (ii) testing across different devices.  And certainly (Apple please listen!) if it ain’t broke, don’t fix it.

Happily, I did manage to find the post in the end (I forget how, maybe random clicking) and it is “5 Ways To Hack Your Brain“.  The individual post page has no dynamic additions, so is only two screens big on my display (phew), but still scrolled all over the place as I tried to select the page title to paste above!

  1. To my mind, early web guidance, was always wrong about this as it usually suggested making pages fit a screen to improve download speed, whereas my feeling, when using a slow connection, was it was usually better to wait a little longer for one big screen (you were going to have to wait anyway!) and then be able to scroll up and down quickly.[back]

fixing hung iCal

iCal hung on a sync with Google calendars and kept hanging everytime I restarted it, even after restarting the whole machine.

I found some advice on this in a few posts.

One “Fix an iCal ‘application not responding’ occasional hang” was more about occasional long pauses and suggested selecting”Reset Sync History” in  “iSync » Preferences”.  Another  “Fix an iCal hang due to system date reset” suggested resetting the ‘lastHearBeatDate‘ in Library/Preferences/com.apple.iCal.plist. Neither worked, but prompted by the latter I used TimeMachine (yawn yawn how do they make it sooooo sloooow), to restore copies of all the iCal plist files in Library/Preferences/, but again to no avail.

So several good suggestions, but none worked.

Happily I saw a comment lower down on “Fix an iCal hang due to system date reset” which suggested moving the complete ~/Library/Calendars folder out to the desktop and then recopying the calendar files in one by one after restarting iCal. I didn’t do this as such, but instead in ~/Library/Calendars there are a number of ‘Calendar Cache‘ files and also a folder labelled Calendar Sync Changes. I removed these, restarted and … it works 🙂

Hardly easy for the end user though :-/

Time Machine – when it goes wrong and how to fix it

Unfortunately only fixing Mac OS X backup, not the Tardis 🙁 … but, nonetheless, critical.

What bit of software do you really need to be reliable?  If anything else goes really wrong you have the backup — but if the backup fails you really are lost.

And Mac OS X Time Machine, while it does have a very pretty  interface, is inclined to get stuck sometimes.

This is my own story of how it goes wrong … and how to put it right.

… and throughout I’ve dropped in a few lessons for anyone implementing critical system software — maybe the odd Apple engineer is reading

how to tell when things are wrong

Occasionally Time Machine seems to be stuck, but isn’t really.  When you first do a backup, or when you haven’t backed up to a particular disk for ages (perhaps if you have been away on a trip), it can spend several hours ‘preparing’.  You can tell it is ‘preparing’ because when you open the Time Machine preferences there is the little barbers pole saying ‘preparing’ 😉

This is when it is running over the disk working out what it needs to backup, and always seems to be the lengthiest operation, actually backing up the disk is often quite fast, and yet, for some reason there is no indication of how far through the ‘preparing’ process it has got.

Lesson 1: make sure you include progress indicators for anything that can take a while, not just the obvious ‘slow’ things.

So, when you see ‘preparing’, just be patient!

However, at least half-a-dozen times over the last year, my Time Machine has got completely stuck.  I have seen this happen in three ways:

(i)  it is still saying ‘preparing’ after leaving it overnight!

(ii)  it starts to transfer to disk, but then gets stuck part way:

(iii)  if you look in the Time Machine preferences it says the backup has failed

This last time in fact the first sign was (iii), but it doesn’t actually tell you (if you don’t look) until it has failed for ten days, by which time I was travelling.  In the days before Time Machine I always did a manual backup before travelling as I knew that was when things were most likely to go wrong, but now-a-days I have got used to relying on it and forget to check it is working OK … so if you are paranoid about your data, do peek occasionally at Time Machine to check it is still working!

When I got home and told Time Machine to backup to the Time Capsule here rather than my office disk (why can’t it remember that I have two backup disks??).  Then (after being very very patient while to was ‘preparing’ for four hours), I saw it got stuck in step (ii) at 1.4 GB or 4.2 GB.  Of course progress indicators are never very good for very slow operations, when transferring several GB of data there may be several minutes before the bar even moves a pixel … but I was very very patient and it definitely did not move!

Lesson 2: for very long processes supplement the progress indicator with some other indicator to show things are still working, in this case perhaps amount transferred in last minute

At this point I did the normal things, turn Time Machine On/Off,  restart machine a couple of times, etc., but when it persists then you know something is deeply wrong.

so why does it go wrong?

In fact Fiona@lovefibre has found Time Machine flawless for her desktop machine backing up to exactly the same Time Capsule.  I am guessing the problem I have is because I use a laptop so possible reasons:

  • it may go to sleep occasionally, breaking connection to the Time Capsule
  • maybe the WiFi aerial on a laptop is not as good as the desktop

However, if every laptop failed as often surely Apple would have fixed it by now.  So guessing there is an additional factor:

  • my disk has 196 Gb of data, much of it in smaller document files (word docs, code files, etc.), not just a few giant movies.

The software will be designed to withstand a certain amount of external failure, especially when connecting to disks over WiFi as the Time Capsule is designed to do.  However, I imagine that there are places in the code where there are race conditions, or critical portions where external failure really makes a difference.  If the external connections are reliable and the backup is quite fast the likelihood of hitting one of the nasty spots in the code is low.  However, if you have a lot of data to check and then transfer and the external failures more frequent, then the likelihood of hitting one increases and things start to go wrong.

I see similar problems with other software, Dreamweaver in particular, which has got better, but still can crash if the Internet connection is poor (see also “Why software need never hang“).  What happens is that during testing, the test machines often have minimal data, little software (maybe just the operating system and what is being tested), and operate in perfect situations.  In such circumstances these hidden flaws never become apparent.

Lesson 3: make sure your test machine is fully loaded with data and applications, and operates in an unreliable environment, so that testing is realistic

However, this is not like Word crashing and losing your most recent edits to one document.  When Time Machine fails it seems to occasionally leave something corrupt in the backup disk so that subsequent attempts to backup also fail.  There is no excuse for this, the techniques for dealing with potential disk-writing failures are well established in both databases and low-level disk management.  For example, one can save a timestamp file at the end of successful operations so that, when  returning to the data, if the timestamp file is not there the software knows something went wrong last time.

Maybe Time Machine is trying to be too clever, picking up where it left off when, for example, connection to the disk is broken.  If so it clearly needs some additional mechanism to notice “I’ve tried this several times and it keeps going wrong, maybe I need to back off to the last successful state”.  Perhaps not something to worry about in less critical software, but not difficult to get right when it is really needed … as in backups!

Lesson 4: build critical software defensively in layers so that errors in one part do not affect the whole; and if saving to disk ensure there is some sort of atomic transaction

The aim during testing should be what I call “fail-fast programming” trying to make sure that failures happen during testing not real use!

One thing I found particularly disturbing about my most recent Time Machine hang is that when I looked at the system console it had regular spats of “unknown SIGSEGV” several times a minute … in the kernel!  If you don’t know UNIX internals the ‘kernel’ is the heart of the operating system of the Mac, where all the lowest level work is done and where if something goes wrong everything fails.  SIGSEGV means that some bit of software is trying to access a memory location that doesn’t exist.  In fact while this is caught it is not so bad, the greater worry is that if it is trying to access non-existent memory, then it may corrupt other memory … and the kernel has access to everything – not good.

Please, please Apple if you cannot get Time Machine to work properly, do not let it affect the kernel!

how to put it right

One might hope that even if Time Machine cannot notice itself there is something wrong at least there would be an option to say “restart yourself”.  One might hope, but there is not.  However, you can do it yourself by digging a little into the backup disk itself.

First problem is to stop the Time Machine backup if it has hung.

In the Time Machine control panel, you can simply slide the OFF-ON button to OFF.  The status should change to ‘stopping’ and after a while stop.  Then you can restart the machine and try to fix things.

This is the ideal thing to do, but I find that when Time Machine is really hung this rarely works.  I do turn it to OFF, but either it never changes to ‘stopping’ and stays ‘preparing’, or it changes to ‘stopping’, but never does.  If this happens the system restart typically doesn’t restart the system as Time Machine won’t stop running.  Then, always with much trepidation, I reach for the on/off button on the Mac itself :-/

After doing a hard on/off like this, I usually do anther restart from the Apple menu … not sure if this is necessary, but just to be on the safe side!

Occasionally I skip to the next step before the hard restart.

Then you can start to fix the problem properly.

Find the backup disk.  If it is not obvious in the Finder use the ‘Go’ menu and select “Computer”; it shows all the locally connected disks (or it may simply appear in the left hand favourites pane in each Finder window).

If you skipped the restart stage (or of you just peek now to see what it is like when it hasn’t gone wrong), you will see something like “Backup of Alan Dix’s MacBook Pro” (obviously for you it will not be “Alan Dix’s MacBook Pro”!).  This is the Time Machine backup.  However, if you have restarted the machine with Time Machine off you will have to find the actual disk that you chose as your backup disk and on it look for a file called something like “Alan Dix’s MacBook  Pro_0039fc56f8a2.sparsebundle”.  This is some form of compressed disk image.  In the older versions of Time Machine there was simply a folder with all the backups in it — I felt much more secure.  Now this is a single opaque file and I worry that if one day it gets corrupted :-/

Having found the ‘sparsebundle’ double click it and it will display a little pop-up window that says ‘checking volumes’.  I keep meaning to see if this ever stops, but I am not patient enough and press the button that says to skip this state and then (after a while) it mounts the disk image and the disk “Backup of Alan Dix’s MacBook Pro” appears.

Double click “Backup of Alan Dix’s MacBook Pro” and look inside and then inside the folder “Backups_backupd” and you find loads of dated folders, which are the actual backups of your system that you can browse if you prefer instead of using the Time Machine interface.  In addition there may be one file ending “.inProgress”, which is some sort of internal file created while it is in the middle of doing the backup.

Delete the “.inProgress” file.

In addition, I usually delete the last of the dated folders (sort by “Date Modified” to get the last one).  However, if you don’t want to lose the last backup you can try just deleting the “inProgress” file and only delete the last dated backup if Time Machine still gets stuck.

Important: only delete the latest of the dated backup folders (e.g. “2010-06-09-225547” in the screen shot above), NOT the entire “Alan Dix’s Macbook Pro” folder.  If you do that you lose all your backups!

I recall doing this all with extreme trepidation the first time, but had got to the point when I couldn’t do backups or access them anyway so had nothing to lose.  Actually it seems pretty OK getting in here and doing this sort of thing, the nice thing about Time Machine is that it uses ordinary folder structures that you can peek around in and see are there all secure.  I am much happier with this than the kind of backup where you only know if it is working the day you try to restore something!  At least half the times I have used such backups over the years I’ve found the backup is in some way corrupt or incomplete. So actually one up for Time Machine 🙂

Now reboot again (for luck).  Turn Time Machine back on in the control panel and wait … a long time … it will start ‘preparing’ as if for the first backup … and several hours later hopefully all will be well.

But do remember to set the power save options not to go to sleep in the middle!

In fact the above has always worked for me except for this last time when, for some reason (maybe I missed something on the way?), it hung again and I had to go through the whole process again.  This time I waited until yesterday evening before turning Time Machine back on so that I could leave it to do the long 4 hour ‘preparing’ stage without me doing anything else.

And then:

Joy!

making life easier – quick filing in visible folders

It is one of those things that has bugged me for years … and if it was right I would probably not even notice it was there – such is the nature of good design, but …  when I am saving a file from an application and I already have a folder window open, why is it not easier to select the open folder as the destination.

A scenario: I have just been writing a reference for a student and have a folder for the references open on my desktop. I select “Save As …” from the Word menu and get a file selection dialogue, but I have to navigate through my hard disk to find the folder even though I can see it right in front of me (and I have over 11000 folders, so it does get annoying).

The solution to this is easy, some sort of virtual folder at the top level of the file tree labelled “Open Folders …” that contains a list of the curently open folder windows in the finder.  Indeed for years I instinctively clicked on the ‘Desktop’ folder expecting this to contain the open windows, but of course this just refers to the various aliases and files permamently on the desktop background, not the open windows I can see in front of me.

In fact as Mac OSX is built on top of UNIX there is an easy very UNIX-ish fix (or maybe hack), the Finder could simply maintain an actual folder (probably on the desktop) called “Finder Folders” and add aliases to folders as you navigate.  Although less in the spirit of Windows, this would certainly be possible there too and of course any of the LINUX based systems.  … so OS developers out there “fix it”, it is easy.

So why is it that this is a persistent and annoying problem and has an easy fix, and yet is still there in every system I have used after 30 years of windowing systems?

First, it is annoying and persistent, but does not stop you getting things done, it is about efficiency but not a ‘bug’ … and system designers love to say, “but it can do X”, and then send flying fingers over the keyboard to show you just how.  So it gets overshadowed by bigger issues and never appears in bug lists – and even though it has annoyed me for years, no, I have never sent a bug report to Apple either.

Second it is only a problem when you have sufficient files.  This means it is unlikely to be encountered during normal user testing.  There are a class of problems like this and ‘expert slips’1, that require very long term use before they become apparent.  Rigorous user testing is not sufficient to produse usable systems. To be fair many people have a relatively small number of files and folders (often just one enormous “My Documents” folder!), but at a time when PCs ship with hundreds of giga-bytes of disk it does seem slighty odd that so much software fails either in terms of user interface (as in this case) or in terms of functionality (Spotlight is seriously challenged by my disk) when you actually use the space!

Finally, and I think the real reason, is in the implementation architecture.  For all sorts of good software engineering reasons, the functional separation between applications is very strong.  Typically the only way they ‘talk’ is through cut-and-paste or drag-and-drop, with occasional scripting for real experts. In most windowing environments the ‘application’ that lets you navigate files (Finder on the Mac, File Explorer in Windows) is just another application like all the rest.  From a system point of view, the file selection dialogue is part of the lower level toolkit and has no link to the particular application called ‘Finder’.  However, to me as a user, the Finder is special; it appears to me (and I am sure most) as ‘the computer’ and certainly part of the ‘desktop’.  Implementation architecture has a major interface effect.

But even if the Finder is ‘just another application’, the same holds for all applications.  As a user I see them all and if I have selected a font in one application why is it not easier to select the same font in another?  In the semantic web world there is an increasing move towards open data / linked data / web of data2, all about moving data out of application silos.  However, this usually refers to persistent data more like the file system of the PC … which actually is shared, at least physically, between applications; what is also needed is that some of the ephemeral state of interaction is also shared on a moment-to-moment basis.

Maybe this will emerge anyway with increasing numbers of micro-applications such as widgets … although if anything they often sit in silos as much as larger applications, just smaller silos.  In fact, I think the opposite is true, micro-applications and desktop mash-ups require us to understand better and develop just these ways to allow applications to ‘open up’, so that they can see what the user sees.

  1. see “Causing Trouble with Buttons” for how Steve Brewster and I once forced infrequent expert slips to happen often enough to be user testable[back]
  2. For example the Web of Data Practitioners Days I blogged about a couple of months back and the core vision of Talis Platform that I’m on the advisory board of.[back]