Contents
- 12.1 Overview
- 12.2 Introduction
- 12.2.1 Why Computer Vision Is Difficult
- 12.2.2 Phases of Computer Vision
- 12.3 Digitisation and Signal Processing
- 12.3.1 Digitising Images
- 12.3.2 Thresholding
- 12.3.3 Digital Filters
- 12.3.3.1 Linear Filters
- 12.3.3.2 Smoothing
- 12.3.3.3 Gaussian Filters
- 12.3.3.4 Practical Considerations
- 12.4 Edge Detection
- 12.4.1 Identifying Edge Pixels
- 12.4.1.1 Gradient Operators
- 12.4.1.2 Robert's Operator
- 12.4.1.3 Sobel's Operator
- 12.4.1.4 Laplacian Operator
- 12.4.1.5 Successive Refinement and Marr's Primal Sketch
- 12.4.2 Edge Following
- 12.5 Region Detection
- 12.5.1 Region Growing
- 12.5.2 The Problem of Texture
- 12.5.3 Representing Regions -- Quadtrees
- 12.5.4 Computational Problems
- 12.6 Reconstructing Objects
- 12.6.1 Inferring Three-Dimensional Features
- 12.6.1.1 Problems with Labelling
- 12.6.2 Using Properties of Regions
- 12.7 Identifying Objects
- 12.7.1 Using Bitmaps
- 12.7.2 Using Summary Statistics
- 12.7.3 Using Outlines
- 12.7.4 Using Paths
- 12.8 Facial and Body Recognition
- 12.9 Neural Networks for Images
- 12.9.1 Convolutional Neural Networks
- 12.9.2 Autoencoders
- 12.10 Generative Adversarial Networks
- 12.10.1 Generated Data
- 12.10.2 Diffusion Models
- 12.10.3 Bottom-up and Top-down Processing
- 12.11 Multiple Images
- 12.11.1 Stereo Vision
- 12.11.2 Moving Pictures
- 12.12 Summary
Glossary items referenced in this chapter
accuracy, active vision, ambiguous image, aspect ratio, auto-associative memory, autoencoder, backpropagation, Bayes Theorem, binary image, bitmap image, Boltzmann machine, bottom-up reasoning, camera (pan), camera (zoom), clustering, connectionist model, constraint satisfaction, constraints, contour following, convolutional neural network, convolutions, correlation, crowdsourcing, data structure, database, deep fakes, deep neural network, differential (calculus), diffusion models, digital filtering, digital signal processing, digitisation, edge detection, edge following, emotion recognition, facial recognition, false positive, frame of video, game playing, Gaussian filter, generative adversarial network, geographic information system, geometric constraints, gesture recognition, GIS, Google, GPU, gradient descent, gradient operators, grey-scale image, ground truth, handwriting recognition, heuristic evaluation function, Hopfield networks, human perception, hybrid architecture, image thresholding, image understanding, labelling, Laplacian operator, Laplacian-of-Gaussian filter, line labelling, linear filter, machine learning, Marr's primal sketch, moving images, multiple images, neural network, neural-network architecture, Normal distribution, normalisation, object identification, object identification (bitmaps), object identification (outlines), object identification (paths), object identification (summary statistics), object recognition, OCR (optical character recognition), octree, optical flow, parallax, parallel processing, pattern matching, pen-based systems, position independent, pre-processing, privacy, quadtree, reasoning with uncertainty, receptive field, region detection, region growing, Restricted Boltzmann Machine, ridge, Robert's operator, robotics, segmentation, sensation, sensor fusion, sharpening filters, signal processing, similarity measure, Skynet, smoothing, Sobel's operator, standard deviation, stereo vision, successive refinement, template matching, texture, three-dimensional objects, threshold, time series, unsupervised learning, voxel, Waltz's algorithm, wavelet transform, zero-sum game
Prolog examples (from 1st ed.)
eximages.p | image processing utilities | images from examples in book |
image.p | image processing utilities |
gimage.p | image processing utilities |
filter.p | digital filters | gradient filters |
threshold.p | thresholding |