We develop an approach to reduce correspondence ambiguity in training data where data items are associated with sets of plausible labels. Our domain is images annotated with keywords where it is not known which part of the image a keyword refers to. In contrast to earlier approaches that build predictive models or classifiers despite the ambiguity, we argue that that it is better to first address the correspondence ambiguity, and then build more complex models from the improved training data. This addresses difficulties of fitting complex models in the face of ambiguity while exploiting all the constraints available from the training data. We contribute a simple and flexible formulation of the problem, and show results validated by a recently developed comprehensive evaluation data set and corresponding evaluation methodology.