Self-training (ST), or pseudo-labeling, has recently sparked significant interest in the automatic speech recognition (ASR) community because of its success in harnessing unlabeled data. Unlike prior semi-supervised learning approaches that relied on iteratively regenerating pseudo-labels (PLs) from a trained model and using them to train a new model, recent state-of-the-art methods perform 'continuous training', where PLs are generated using a very recent version of the model being trained. Nevertheless, these approaches still rely on bootstrapping ST with an initial supervised learning phase in which the model is trained on labeled data alone. We believe this risks over-fitting to the labeled dataset in low-resource settings, and that performing ST from the start of training should reduce over-fitting. In this paper we show how to do this by dynamically controlling the evolution of PLs during the ASR training process. To the best of our knowledge, this is the first study to demonstrate the feasibility of generating PLs from the very start of training. We achieve this using two techniques that avoid the instabilities which lead to degenerate models that do not generalize. First, we control the evolution of PLs through a curriculum that uses the online changes in PLs to adjust the membership of the PL cache and improve generalization. Second, we find that sampling transcriptions from the predictive distribution, rather than only using the best transcription, further stabilizes training. With these techniques, our ST models match prior works without an external language model.
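A minimal sketch of the two stabilization ideas above, under assumed simplifications: frame-wise pseudo-labels represented as integer arrays, a hypothetical drift threshold governing cache refreshes, and `sample_pseudo_label`/`PLCache` as illustrative names not taken from the paper.

```python
import numpy as np

def sample_pseudo_label(log_probs, rng):
    """Sample a frame-wise transcription from the predictive distribution,
    rather than committing to the argmax (the second technique above)."""
    # Convert per-frame log-probabilities to a normalized distribution.
    probs = np.exp(log_probs - log_probs.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return np.array([rng.choice(p.size, p=p) for p in probs])

class PLCache:
    """Cache of pseudo-labels refreshed only when the model's current PL has
    drifted enough from the cached one -- a stand-in for the curriculum over
    online PL changes; the threshold and drift metric are assumptions."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # hypothetical fraction-of-frames threshold
        self.cache = {}

    def update(self, utt_id, new_pl):
        old_pl = self.cache.get(utt_id)
        if old_pl is None:
            self.cache[utt_id] = new_pl  # first PL for this utterance
            return new_pl
        # Fraction of frames whose label changed since the cached PL.
        n = min(len(old_pl), len(new_pl))
        drift = np.mean(old_pl[:n] != new_pl[:n])
        if drift > self.threshold:
            self.cache[utt_id] = new_pl  # PL still evolving: refresh cache
        return self.cache[utt_id]
```

In this toy setting, a stable utterance keeps serving its cached PL as a training target, while one whose transcription is still changing rapidly gets its cache entry replaced.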