ABSTRACT

Data augmentation means taking the data one has and modifying it, so as to force the learning algorithm to abstract away from properties that should not matter. This chapter introduces two popular variants of data augmentation: “classic” and “mixup”. Classic data augmentation, whatever it may do to the entities involved (moving them, distorting them, blurring them), leaves each entity intact. Mixup, on the other hand, takes two entities and “mixes them together”, blending their inputs and their labels in the same proportion. Mixup is an appealing technique that makes intuitive sense for much the same reason dropout does: if different connections between neurons can be dropped out at unforeseeable times, the network as a whole had better not become too dependent on cooperation between individual units, and it is just this kind of inter-unit cooperation that results in strong memorization of examples presented during training. Like dropout, mixup introduces randomness; looking for analogies in machine learning more broadly, both have something in common with ensemble modeling.
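As a concrete illustration of the “classic” variant, consider the following minimal Python sketch of an augmentation pass over a single image. The particular transformations, the helper name classic_augment, and the use of NumPy and SciPy are assumptions made here for illustration, not code from the chapter; the point is only that every operation leaves the example intact as a single image.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def classic_augment(img, rng=np.random.default_rng()):
        """Apply label-preserving transformations; the example stays one image."""
        if rng.random() < 0.5:
            img = np.fliplr(img)                      # mirror horizontally
        dy, dx = rng.integers(-2, 3, size=2)          # small random translation
        img = np.roll(img, (dy, dx), axis=(0, 1))     # move the image
        img = gaussian_filter(img, sigma=rng.uniform(0.0, 1.0))  # mild blur
        return img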
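Mixup admits an equally short sketch. The version below assumes a batch of inputs x and one-hot labels y as NumPy arrays; drawing the mixing coefficient from a Beta(alpha, alpha) distribution follows common practice, and alpha=0.2 is an illustrative value rather than one prescribed by the chapter.

    import numpy as np

    def mixup_batch(x, y, alpha=0.2, rng=np.random.default_rng()):
        """Return convex combinations of random pairs of examples and labels."""
        lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
        perm = rng.permutation(len(x))      # pick a random partner per example
        x_mixed = lam * x + (1 - lam) * x[perm]   # blend the inputs...
        y_mixed = lam * y + (1 - lam) * y[perm]   # ...and labels in the same proportion
        return x_mixed, y_mixed

Because the labels are blended along with the inputs, the network rarely sees any training example in pure form, which is one intuition for why mixup discourages memorization.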