(A) Four nonspeech auditory categories are defined across six exemplars (differentiated by the higher-frequency component shown as different colors on the same axes, with a common lower-frequency component shown as a dashed grey line). Categories A and B are characterized by a unidimensional acoustic attribute (offset rises or falls), whereas Categories C and D cannot be defined by a single acoustic attribute and are instead multidimensional, with distributional structure in higher-order perception space (see 15). In the Systematic Multimodal Association Reaction Time (SMART) task, each auditory category uniquely predicts the upcoming location of a visual target. Participants respond with a keypress to indicate the visual target location. (B) Each of three experiments involves three behavioral testing sessions (Day 1, Day 2, Day 10). The blocks labeled ‘train’ involve a consistent mapping from auditory category to visual target location (and visuomotor response), as shown in (A). Blocks 7, 10, and 13 destroy this relationship by randomizing the assignment of sounds to locations, to examine the impact on visuomotor response (measured as a response time cost). Performance on Day 2 and Day 10 informs offline gains (response time facilitation), consolidation of incidental category learning, and its retention. A final overt labeling task on Day 10 measures generalization of incidental category learning to novel category exemplars (not plotted in Panel A). (C) Exp 1 examines visuomotor task demands without auditory exemplars preceding the visual target to characterize putative visuomotor learning, consolidation, and retention. Exp 2 examines incidental auditory category learning when, on each trial, a single category exemplar is repeated five times and predicts the upcoming location of the visual target; exemplar variability is experienced across, not within, trials.
Exp 3 examines incidental learning when within-category variability is more tightly coupled to visuomotor task demands; five unique exemplars are sampled from a category on each trial and, as in Exp 2, the category identity predicts the location of the upcoming visual target.