Chapter 8 – Going Large: Deep Learning and Big Data

Contents

8.1  Overview
8.2  Deep Learning
8.2.1  Why Are Many Layers So Difficult?
8.2.2  Architecture of the Layers
8.3  Growing the Data
8.3.1  Modifying Real Data
8.3.2  Virtual Worlds
8.3.3  Self-Learning
8.4  Data Reduction
8.4.1  Dimension Reduction
8.4.1.1  Vector Space Techniques
8.4.1.2  Non-numeric Features
8.4.2  Reduce Total Number of Data Items
8.4.2.1  Sampling
8.4.2.2  Aggregation
8.4.3  Segmentation
8.4.3.1  Class Segmentation
8.4.3.2  Result Recombination
8.4.3.3  Weakly Communicating Partial Analysis
8.5  Processing Big Data
8.5.1  Why It Is Hard – Distributed Storage and Computation
8.5.2  Principles behind MapReduce
8.5.3  MapReduce for the Cloud
8.5.4  If It Can Go Wrong – Resilience for Big Processing
8.6  Data and Algorithms at Scale
8.6.1  Big Graphs
8.6.2  Time Series and Event Streams
8.6.2.1  Multi-scale with Mega-windows
8.6.2.2  Untangling Streams
8.6.2.3  Real-time Processing
8.7  Summary

Glossary items referenced in this chapter

accuracy, adversarial learning, AlphaGo, Apache Hadoop, autonomous car, backpropagation, bias, big data, boosting, CERN, cloud computation, clustering, combinatorial explosion, computer chess, convolutional neural network, correlation matrix, data reduction, decision tree, Deep Blue, deep neural network, degrees of freedom (data), dimension reduction, domain-specific knowledge, ECG, eigenvector, emotion, ensemble methods, event, event stream, fault tolerant, fully connected, game playing, generalisation, generative adversarial network, genes, genetic algorithm, Go, Google, ground truth, higher-order function, IBM, image recognition, instabilities, Kasparov, Garry, kernel, least squares, Lee Sedol, linear regression, Lisp, local data access, locality, long-tail distribution, machine learning, map, MapReduce, multi-dimensional scaling, n-gram, natural selection, neural network, neural-network architecture, non-linear transformations, optimal, overfitting, PageRank, parallel processing, perceptron, pinch-point layer, poorly constrained, pre-processing, principal components analysis, Python, quartile, radial basis functions, random forest, random segmentation, RDF, recommender systems, reduce, Restricted Boltzmann Machine, robotics, robust to failure, search space, segmentation, segmentation rule, self-learning, semantic web, sharding, social media, sparse matrix, standard deviation, statistical techniques, supervised learning, support vector machine, synthetic data, time series, training phase, transpose, underdetermined, unsupervised learning, wavelet transform, windowing