intelligent transparent database access
Query-by-Browsing (QbB) is an intelligent database interface, originally envisaged more than 30 years ago to highlight the potential danger of social, gender and ethnic bias in black-box machine learning systems, and to demonstrate how these dangers could be ameliorated through transparency and effective user interaction.
The basic idea is that the user starts to select records that they are or are not interested in (marked (1) in the annotated screenshot below) and then, when asked (2), the system suggests a potential database query that agrees with the user's choices so far. Crucially, both the inferred query (3) and the records selected (4) are presented back to the user to make it easy to assess whether the inferred query is indeed what is wanted.
In the web demo you can experiment with one of the small example databases,
Under the bonnet, the positive and negative examples selected by the user are fed into a machine learning system that generates a decision tree, which is then translated into an SQL query, as this is the most common database query notation.
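As a rough illustration of this pipeline, and not the demo's actual code, the following Python sketch learns a small decision tree from wanted and unwanted records and turns it into an SQL query. It assumes a simplified ID3-style learner with binary tests chosen by entropy; the table name `employees`, the columns `department` and `salary`, and all function names are hypothetical.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of True/False labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def candidate_tests(records):
    """Yield (attribute, operator, value) tests taken from the data itself."""
    for attr in records[0]:
        for value in sorted({r[attr] for r in records}):
            yield (attr, ">=" if isinstance(value, (int, float)) else "=", value)

def passes(record, test):
    attr, op, value = test
    return record[attr] >= value if op == ">=" else record[attr] == value

def build_tree(records, labels):
    """Return True/False for a leaf, or (test, yes_subtree, no_subtree)."""
    if all(labels) or not any(labels):
        return bool(labels and labels[0])
    best = None
    for test in candidate_tests(records):
        yes = [l for r, l in zip(records, labels) if passes(r, test)]
        no = [l for r, l in zip(records, labels) if not passes(r, test)]
        if not yes or not no:
            continue  # this test does not separate the examples at all
        remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if best is None or remainder < best[0]:
            best = (remainder, test)
    if best is None:  # identical records with conflicting labels: majority vote
        return sum(labels) * 2 >= len(labels)
    test = best[1]
    split = [(r, l, passes(r, test)) for r, l in zip(records, labels)]
    yes = build_tree([r for r, l, p in split if p], [l for r, l, p in split if p])
    no = build_tree([r for r, l, p in split if not p], [l for r, l, p in split if not p])
    return (test, yes, no)

def sql_literal(value):
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def to_conditions(tree, path=()):
    """One AND-ed condition per path that ends in a 'wanted' leaf."""
    if tree is True:
        return [" AND ".join(path) or "1=1"]
    if tree is False:
        return []
    (attr, op, value), yes, no = tree
    cond = f"{attr} {op} {sql_literal(value)}"
    return (to_conditions(yes, path + (cond,))
            + to_conditions(no, path + (f"NOT ({cond})",)))

def to_sql(tree, table):
    branches = to_conditions(tree)
    where = " OR ".join(f"({b})" for b in branches) if branches else "1=0"
    return f"SELECT * FROM {table} WHERE {where}"

def matches(tree, record):
    """Apply the tree to one record (the 'list' side of the dual view)."""
    while isinstance(tree, tuple):
        test, yes, no = tree
        tree = yes if passes(record, test) else no
    return tree

# Hypothetical example data: True = record the user marked as wanted.
records = [
    {"department": "Accounts", "salary": 21000},
    {"department": "Accounts", "salary": 27000},
    {"department": "Accounts", "salary": 12000},
    {"department": "Cleaning", "salary": 25000},
    {"department": "Cleaning", "salary": 11000},
    {"department": "Cleaning", "salary": 30000},
]
labels = [True, True, False, False, False, False]

tree = build_tree(records, labels)
print(to_sql(tree, "employees"))
# SELECT * FROM employees WHERE (department = 'Accounts' AND salary >= 21000)
```

Because the learnt model is a tree, every path to a 'wanted' leaf reads directly as a conjunction of conditions, which is what makes the translation to a `WHERE` clause straightforward. Running `matches` over all records gives the corresponding list of selected records, so the query and its effect can be shown side by side.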
History and variants
There have been several variants over the years. The current web demo uses a variant of ID3, a machine learning algorithm that infers a decision tree, so it is intrinsically more ‘explainable’ than many types of ML. This was the form initially proposed in the 1992 paper that introduced QbB; however, one variant also used a Query-by-Example style tableau and rules learnt using a genetic algorithm. A semantic web variant was also produced in the 2010s, but never published. Query-by-Browsing was also used as a case study in the introductory AI textbook that Janet Finlay and I published in 1996, and it also features in the new edition, which is currently in press.
The first web version was created around New Year 2005, initially using a CGI backend written in C. This was later translated into PHP. The current web demo retains the look and feel of this 2000s web version, but as a single-page web app with the machine learning running in JavaScript. In addition, it allows the upload of the user’s own CSV file.
Bias and discrimination
As noted, the first envisionment of Query-by-Browsing was in a 1991 talk and later 1992 book chapter, ‘Human issues in the use of pattern recognition techniques’, which was probably the first publication to highlight the potential dangers of gender, ethnic and social bias in black-box machine learning algorithms. It was written at the height of excitement in AI around 1990, just before the great ‘AI Winter’. At the time I thought that discriminatory uses of AI would arise within the next five years. In fact it took more like 25 years before these became a problem, so in many ways the paper was far too early to be a useful warning.
The core idea of the dual list and query (extensional and intensional) representation in QbB was to help the user see whether the machine learning was using a potentially biased rule; this is what would now be considered explainable AI. The screenshot below shows an example of this using the current web demo.