Stimuli, paradigm and regions of interest.

(A) Top images represent the four categories used in the main experiment: body, car, house and cat. The stimulus set consisted of 10 exemplars from each category (here: cats), with exemplars differing in pose and 3D-orientation. (B) The experimental design comprised 16 task conditions (12 paired, 4 isolated). The 4×4 matrix on the left illustrates the 12 paired conditions, with the to-be-attended category (outlined in orange for illustration purposes only; the outline was not present in the experiment) on the y-axis and the to-be-ignored category on the x-axis. The right column illustrates the four isolated conditions. (C) Experimental paradigm. A paired block is depicted with superimposed body and house stimuli. In this example block, house stimuli were cued as the target, and the participant responded to the repetition of the exact same house in two consecutive trials, as marked here by the arrow. (D) Regions of interest for an example participant: the primary visual cortex V1, the object-selective regions LO and pFs, the body-selective region EBA, and the scene-selective region PPA.

Response vectors related to x and y stimuli in isolated and paired conditions.

The figure shows the response vectors for the isolated x and isolated y conditions, together with the response vectors for the two paired conditions, in which attention is directed to stimulus x or to stimulus y. Each paired response was projected onto the two isolated responses. a1 and b1 represent the weight of the isolated x response in the paired response when x is attended and when y is attended, respectively. a2 and b2 represent the weight of the isolated y response in the same two paired responses, respectively.
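The projection described in this caption can be sketched as an ordinary least-squares fit of each paired response vector onto the two isolated response vectors. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names, the toy voxel values, and the choice to fit without an intercept are all hypothetical, and a1/a2 are taken to belong to the attend-x pair and b1/b2 to the attend-y pair.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def projection_weights(v_x, v_y, v_pair):
    """Least-squares weights (w_x, w_y) with v_pair approximately w_x*v_x + w_y*v_y.

    Solves the 2x2 normal equations directly; no intercept term is fitted
    (an assumption made for this sketch).
    """
    xx, yy, xy = dot(v_x, v_x), dot(v_y, v_y), dot(v_x, v_y)
    xp, yp = dot(v_x, v_pair), dot(v_y, v_pair)
    det = xx * yy - xy * xy
    if det == 0:
        raise ValueError("isolated response vectors are collinear")
    w_x = (xp * yy - yp * xy) / det
    w_y = (yp * xx - xp * xy) / det
    return w_x, w_y


# Hypothetical multi-voxel responses to the isolated stimuli (4 voxels).
v_x = [1.0, 0.2, 0.5, 0.9]
v_y = [0.3, 1.1, 0.4, 0.1]

# Hypothetical paired responses, built as known mixtures so the fit can be checked.
pair_attend_x = [0.7 * x + 0.3 * y for x, y in zip(v_x, v_y)]
pair_attend_y = [0.4 * x + 0.6 * y for x, y in zip(v_x, v_y)]

a1, a2 = projection_weights(v_x, v_y, pair_attend_x)  # weights when x is attended
b1, b2 = projection_weights(v_x, v_y, pair_attend_y)  # weights when y is attended
print(f"a1={a1:.3f} a2={a2:.3f} b1={b1:.3f} b2={b2:.3f}")
```

In this toy case the weight of the isolated x response is larger when x is attended (a1) than when y is attended (b1), which is the kind of shift the weights are meant to capture.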

Attentional modulation by the response gain model and tuning sharpening model.

We illustrate the models here for the example of a neuron with high selectivity for cat stimuli. Solid curves denote the response to unattended stimuli and dashed curves denote the response to attended stimuli. (A) According to the response gain model, the response of the neuron to attended stimuli is scaled by a constant attention factor. Therefore, the response of the cat-selective neuron to an attended stimulus is enhanced to the same degree for all stimuli. (B) According to the tuning sharpening model, the response modulation by attention depends on the neuron’s tuning for the attended stimulus. Therefore, for optimal and near-optimal stimuli such as cat and body stimuli, the response is strongly enhanced, while for non-optimal stimuli such as houses, the response is suppressed.
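The contrast between the two models can be made concrete with a toy calculation for a single cat-selective neuron. The response values, the gain of 1.3, and in particular the functional form used for tuning sharpening are assumptions chosen only to reproduce the qualitative pattern described in this caption; they are not taken from the source.

```python
# Hypothetical unattended responses of a cat-selective neuron (arbitrary units).
unattended = {"cat": 1.0, "body": 0.7, "car": 0.3, "house": 0.1}


def response_gain(responses, gain=1.3):
    """Response gain: every attended response is scaled by the same factor."""
    return {stim: gain * r for stim, r in responses.items()}


def tuning_sharpening(responses, strength=0.5):
    """Tuning sharpening: the attention factor depends on the neuron's tuning.

    Assumed form for this sketch: factor = 1 + strength * (2 * r / r_max - 1).
    The factor exceeds 1 for stimuli that drive the neuron above half of its
    maximum response and falls below 1 for weaker stimuli.
    """
    r_max = max(responses.values())
    return {
        stim: r * (1 + strength * (2 * r / r_max - 1))
        for stim, r in responses.items()
    }


gain_attended = response_gain(unattended)
sharp_attended = tuning_sharpening(unattended)

for stim, r in unattended.items():
    print(
        f"{stim:>5}: unattended={r:.2f} "
        f"gain={gain_attended[stim]:.2f} sharpening={sharp_attended[stim]:.2f}"
    )
```

Under the gain model the ratio of attended to unattended response is identical for all four categories, whereas under this sharpening form the cat and body responses increase and the car and house responses decrease.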

Average voxel response in EBA for each pair of stimulus categories.

The x-axis labels represent the four conditions related to each category pair, M^at, M^atL, ML^at and L^at, with M and L denoting the presence of the more preferred and the less preferred category and the superscript at denoting the attended category. For instance, M^at refers to the condition in which the more preferred stimulus was presented in isolation (and automatically attended), and ML^at refers to the paired condition in which the less preferred stimulus was attended to. Red arrows in each panel illustrate the observed attentional modulation (AM) caused by the shift of attention from the more preferred to the less preferred stimulus. Green arrows in panels B and C illustrate the difference in the response to isolated stimuli. Error bars represent standard errors of the mean. N = 17 human participants.

Attentional modulation versus category distance in each ROI.

(A-E) The value of attentional modulation versus category distance. M^atL and ML^at denote the two paired conditions with attention directed to the more preferred (M) or less preferred (L) stimulus, respectively. M^at and L^at represent the isolated conditions, with the more preferred or the less preferred stimulus presented in isolation, respectively. Each circle represents the values related to one category pair. Note that the data illustrated here are averaged across subjects for illustration purposes only. R² was calculated for individual participants, and statistical significance was assessed using t-tests across participants, as illustrated in panel F. (F) R² for the correlation between attentional modulation and category distance in each ROI. Asterisks indicate that the correlation coefficients are significantly positive. Error bars represent standard errors of the mean. N = 17 human participants.
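The per-participant statistics mentioned in this caption can be sketched as follows: a correlation is computed across category pairs within each participant, and the resulting coefficients are then tested against zero across participants. This is only an illustration under assumptions. The data are invented, three participants stand in for the 17 in the study, and testing the correlation coefficients (rather than some transformed value) with a one-sample t-test is one plausible reading of the caption, not a confirmed description of the authors' procedure.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values, popmean=0.0):
    """t statistic and degrees of freedom for a one-sample t-test."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return (statistics.fmean(values) - popmean) / sem, n - 1


# Hypothetical category distances for the six pairs of four categories.
distance = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

# Hypothetical attentional modulation per participant, one value per pair.
modulation = [
    [0.05, 0.12, 0.14, 0.22, 0.24, 0.31],
    [0.02, 0.10, 0.08, 0.15, 0.21, 0.19],
    [0.11, 0.09, 0.18, 0.20, 0.30, 0.28],
]

r_values = [pearson_r(distance, am) for am in modulation]
r_squared = [r ** 2 for r in r_values]
t_stat, dof = one_sample_t(r_values)

print("r per participant:", [round(r, 3) for r in r_values])
print("R^2 per participant:", [round(v, 3) for v in r_squared])
print(f"t({dof}) = {t_stat:.2f}")
```

A p-value would follow from comparing the t statistic with a t distribution with the stated degrees of freedom; that step is omitted here to keep the sketch free of third-party dependencies.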

Weight shift versus category distance in each ROI.

(A-E) Attentional weight shift versus category distance. Each circle represents the values related to one category pair. Note that the data illustrated here are averaged across subjects for illustration purposes only. R² was calculated for individual participants, and statistical significance was assessed using t-tests across participants, as illustrated in panel F. (F) R² for the correlation between attentional weight shift and category distance in each ROI. Asterisks indicate that the correlations are significantly positive. Error bars represent standard errors of the mean. N = 17 human participants.

Attentional modulation as a function of category distance, as predicted by the two attentional mechanisms.

M^atL and ML^at denote the two paired conditions with attention directed to the more preferred (M) or the less preferred (L) stimulus, respectively. M^at and L^at represent the isolated conditions, with the M or the L stimulus presented in isolation, respectively. Top panels represent predictions in a region with strong preference for a specific category, and bottom panels illustrate predictions in an object-selective region. Each circle represents a pair of categories. (A) Predicted attentional modulation based on the gain model in a region with strong preference for a specific category. (B) Predicted attentional modulation based on the gain model in an object-selective region. (C) Predicted attentional modulation based on the tuning model in a region with strong preference for a specific category. (D) Predicted attentional modulation based on the tuning model in an object-selective region.

Predicted weight shift as a function of category distance.

Weight shift for each pair is calculated using Equation 5. Category distance represents the difference in multi-voxel representation between responses to the two isolated stimuli, calculated by Equation 3. Top panels show predictions in a region with strong preference for a specific category, and bottom panels show predictions in an object-selective region. (A) Weight shift predicted by the gain model in a region with strong preference for a specific category. (B) Weight shift predicted by the gain model in an object-selective region. (C) Weight shift predicted by the tuning model in a region with strong preference for a specific category. (D) Weight shift predicted by the tuning model in an object-selective region.