Chapter 12 – Computer vision – Artificial Intelligence

12.1 Overview

12.2 Introduction

12.2.1 Why Computer Vision Is Difficult
12.2.2 Phases of Computer Vision

12.3 Digitisation and Signal Processing

12.3.1 Digitising Images

12.3.2 Thresholding

12.3.3 Digital Filters

12.3.3.1 Linear Filters
12.3.3.2 Smoothing
12.3.3.3 Gaussian Filters
12.3.3.4 Practical Considerations

12.4 Edge Detection

12.4.1 Identifying Edge Pixels

12.4.1.1 Gradient Operators
12.4.1.2 Robert's Operator
12.4.1.3 Sobel's Operator
12.4.1.4 Laplacian Operator
12.4.1.5 Successive Refinement and Marr's Primal Sketch

12.4.2 Edge Following

12.5 Region Detection

12.5.1 Region Growing
12.5.2 The Problem of Texture
12.5.3 Representing Regions -- Quadtrees
12.5.4 Computational Problems

12.6 Reconstructing Objects

12.6.1 Inferring Three-Dimensional Features

12.6.1.1 Problems with Labelling

12.6.2 Using Properties of Regions

12.7 Identifying Objects

12.7.1 Using Bitmaps
12.7.2 Using Summary Statistics
12.7.3 Using Outlines
12.7.4 Using Paths

12.8 Facial and Body Recognition

12.9 Neural Networks for Images

12.9.1 Convolutional Neural Networks
12.9.2 Autoencoders

12.10 Generative Adversarial Networks

12.10.1 Generated Data
12.10.2 Diffusion Models
12.10.3 Bottom-up and Top-down Processing

12.11 Multiple Images

12.11.1 Stereo Vision
12.11.2 Moving Pictures

12.12 Summary

Glossary items referenced in this chapter

accuracy, active vision, ambiguous image, aspect ratio, auto-associative memory, autoencoder, backpropagation, Bayes Theorem, binary image, bitmap image, Boltzmann machine, bottom-up reasoning, camera (pan), camera (zoom), clustering, connectionist model, constraint satisfaction, constraints, contour following, convolutional neural network, convolutions, correlation, crowdsourcing, data structure, database, deep fakes, deep neural network, differential (calculus), diffusion models, digital filtering, digital signal processing, digitisation, edge detection, edge following, emotion recognition, facial recognition, false positive, frame of video, game playing, Gaussian filter, generative adversarial network, geographic information system, geometric constraints, gesture recognition, Google, gradient descent, gradient operators, graphics processing unit, grey-scale image, ground truth, handwriting recognition, heuristic evaluation function, Hopfield networks, human perception, hybrid architecture, image thresholding, image understanding, labelling, Laplacian operator, Laplacian-of-Gaussian filter, line labelling, linear filter, machine learning, Marr's primal sketch, moving images, multiple images, neural network, neural-network architecture, Normal distribution, normalisation, object identification, object identification (bitmaps), object identification (outlines), object identification (paths), object identification (summary statistics), object recognition, OCR (optical character recognition), octree, optical flow, parallax, parallel processing, pattern matching, pen-based systems, position independent, pre-processing, privacy, quadtree, reasoning with uncertainty, receptive field, region detection, region growing, restricted Boltzmann machine, ridge, Robert's operator, robotics, segmentation, sensation, sensor fusion, sharpening filters, signal processing, similarity measure, Skynet, smoothing, Sobel's operator, standard deviation, stereo vision, successive refinement, template matching, texture, three-dimensional objects, threshold, time series, unsupervised learning, voxel, Waltz's algorithm, wavelet transform, zero-sum game

Prolog examples (from 1st ed.)

eximages.p	image processing utilities	images from examples in book
image.p	image processing utilities	simple representation of a pixel image
gimage.p	image processing utilities	the 'gimage' representation of a pixel image
filter.p	digital filters	gradient filters
threshold.p	thresholding
