LET THE MACHINE LEARN EVERYTHING! The network is designed as a traditional deep ConvNet: a succession of convolutional and pooling layers, finishing with two fully connected layers. Stacking convolutions makes it possible to learn very diverse sets of features, because the network learns convolutions of convolutions of convolutions (as if we looked at, say, the blurred, embossed image of the edges of an image, except that the actual operations are learned by the system rather than designed by hand).

We learn to read the same way! We store images in our brain, we learn the characters and associate them with sounds, and we keep practicing these convolutions of convolutions until we read with near-perfect accuracy. Babies start from images; then we rhyme with them, or say words and little phrases along with the images. We keep doing that over time until their brains learn to evaluate images of entire sentences, and those who train themselves in speed reading keep going even further.

Learning for Text Understanding from Scratch
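To make the "convolutions of convolutions" idea concrete, here is a minimal sketch of a character-level ConvNet in that spirit, written with PyTorch. The alphabet size, input length, channel counts, kernel sizes, and class count are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch of a character-level ConvNet (assumed PyTorch, illustrative sizes).
import torch
import torch.nn as nn

ALPHABET_SIZE = 70   # assumed one-hot alphabet (letters, digits, punctuation)
INPUT_LENGTH = 1014  # assumed fixed number of characters per sample
NUM_CLASSES = 4      # placeholder number of output labels

model = nn.Sequential(
    # Succession of convolutional and pooling layers:
    # each Conv1d learns filters over the output of the previous one,
    # i.e. convolutions of convolutions of convolutions.
    nn.Conv1d(ALPHABET_SIZE, 256, kernel_size=7), nn.ReLU(),
    nn.MaxPool1d(3),                               # length 1008 -> 336
    nn.Conv1d(256, 256, kernel_size=7), nn.ReLU(),
    nn.MaxPool1d(3),                               # length 330 -> 110
    nn.Conv1d(256, 256, kernel_size=3), nn.ReLU(),
    nn.MaxPool1d(3),                               # length 108 -> 36
    nn.Flatten(),                                  # 256 * 36 = 9216 features
    # Two fully connected layers finish the network.
    nn.Linear(256 * 36, 1024), nn.ReLU(),
    nn.Linear(1024, NUM_CLASSES),
)

# A batch of one-hot encoded text: (batch, alphabet, characters).
x = torch.zeros(8, ALPHABET_SIZE, INPUT_LENGTH)
logits = model(x)
print(logits.shape)  # torch.Size([8, 4])
```

Nothing here is hand-designed beyond the layer shapes: the filters themselves, the character patterns they respond to, are learned entirely from the raw text during training.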