From what I know about convolutional neural networks, you must feed the same training examples each epoch, but shuffled (so the network won’t remember some particular order while training).
However, in this article, they’re feeding the network 640000 random samples each epoch (so only some of the training examples were “seen” before):
Each training instance was a uniformly sampled set of 3 images, 2 of
which are of the same class (x and x+), and the third (x−) of a
different class. Each training epoch consisted of 640000 such
instances (randomly chosen each epoch), and a fixed set of 64000
instances used for test.
So, do I have to use the same training examples each epoch, and why?
Experimental results are poor when I use random samples – the accuracy varies a lot. But I want to know why.
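Most of the time you want to train on as much data as you can, and when the training set is a fixed list of examples, going through all of them each epoch in a new order is the natural way to do that. In the paper you cite, however, they train with a triplet loss, where each training instance is a triple of images rather than a single image. The number of possible triples grows roughly cubically with the size of the dataset, so there can be billions of them, far too many to visit in one epoch. Drawing a fresh random subset each epoch lets the network see as many different triples as possible over the course of training.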
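To make the scale concrete, here is a minimal sketch in plain Python that counts the possible triples for a labelled dataset and draws a fresh random batch of them. The function names `count_triplets` and `sample_triplets` are made up for illustration, and the sampling scheme (pick a class, then pick images inside it) is only one reasonable reading of "uniformly sampled", not necessarily what the paper's authors did.

```python
import random
from collections import defaultdict


def count_triplets(labels):
    """Count ordered (x, x+, x-) triples where x and x+ are distinct
    examples of one class and x- belongs to any other class."""
    counts = defaultdict(int)
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return sum(c * (c - 1) * (n - c) for c in counts.values())


def sample_triplets(labels, num_triplets, rng):
    """Draw random (x, x+, x-) index triples (illustrative scheme:
    choose the classes first, then the examples inside them)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Only classes with at least two examples can supply both x and x+.
    pos_classes = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    all_classes = list(by_class)
    triplets = []
    for _ in range(num_triplets):
        c_pos = rng.choice(pos_classes)
        c_neg = rng.choice([c for c in all_classes if c != c_pos])
        anchor, positive = rng.sample(by_class[c_pos], 2)
        negative = rng.choice(by_class[c_neg])
        triplets.append((anchor, positive, negative))
    return triplets
```

For a hypothetical dataset with 10 classes of 5000 images each, `count_triplets` gives a number on the order of 10^13, so an epoch of 640000 triples covers only a tiny fraction of them. Calling `sample_triplets` again at the start of every epoch is what "randomly chosen each epoch" amounts to.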
You might wonder why we introduce the notion of an epoch at all if we’re likely to get a different training set each time. The answer is technical: we’d like to evaluate the network on the validation data once in a while, and you might also want to decay the learning rate based on the number of completed epochs.
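In other words, the epoch here is just a bookkeeping unit. A rough sketch of such a loop follows. Everything in it is hypothetical: `run_training`, the callbacks `sample_epoch`, `train_step` and `evaluate`, and the step-decay schedule are placeholders for whatever your framework provides, not something taken from the paper.

```python
def run_training(sample_epoch, train_step, evaluate, num_epochs,
                 base_lr=0.1, decay=0.5, decay_every=10):
    """Illustrative loop where an epoch is a fixed-size batch of work
    on freshly sampled instances, followed by one evaluation."""
    history = []
    for epoch in range(num_epochs):
        # Step decay keyed on the number of completed epochs.
        lr = base_lr * decay ** (epoch // decay_every)
        # New random training instances are drawn every epoch.
        for instance in sample_epoch():
            train_step(instance, lr)
        # The validation set stays fixed, so scores are comparable across epochs.
        history.append((epoch, lr, evaluate()))
    return history
```

Keeping the validation set fixed while the training samples change is what makes the per-epoch scores comparable, which matters if you are trying to tell real improvement from noise.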
Answered By – Artem Sobolev
Answer Checked By – David Goodson (AngularFixing Volunteer)