Introduction

The “Sniffin’ Sticks” test is a widely used tool for the assessment of olfactory performance, consisting of three subtests: olfactory threshold, odor discrimination and odor identification. It was introduced over 20 years ago by Kobal et al. [1]. Since the first publication, test–retest reliability and validity have been established [2, 3] and the test has been successfully adapted across cultures, e.g., [4,5,6]. Both extended [7, 8] and abridged versions, with satisfactory psychometric properties [9,10,11], have been proposed, along with modifications of the set of odors utilized [12,13,14].

The “Sniffin’ Sticks” battery is used in daily clinical practice as well as scientific research. Individual scores can be related to standard values for (a) normosmia (normal olfactory function), (b) hyposmia (impaired olfactory function) or (c) functional anosmia (residual or absent olfactory function). Additionally, there is the category of supersmellers, i.e., subjects with an extraordinary sense of smell. Although norms for the Sniffin’ Sticks test have already been published [15, 16], an update appeared advisable, based upon a large-scale sample comprising detailed age groups, as older subjects were underrepresented in previous studies. Furthermore, updated norms are necessary to monitor potential changes in olfactory performance caused by macro-scale environmental and social factors, e.g., pollution or dietary habits.

Here we present updated normative data for clinical and scientific quantitative assessment of olfactory performance in female and male subjects, based on three times as many participants as previous studies. This large sample allowed us to bin individual results into age groups of 10 years each, so that a subject’s olfactory performance can be referenced more accurately to that of their coevals. In addition, the narrow age categories resulted in more homogeneous groups and, although the data are cross-sectional, gave detailed insight into the dynamics of olfactory performance over the course of life.

Materials and methods

Data were obtained from 9139 subjects [4928 females aged 5–96 years (M = 31.8, SD = 18.9) and 4211 males aged 5–91 years (M = 30.7, SD = 17.7)]. Among them, 3432 (37.5%) had been included in a previous study to establish normative data [15]. According to the inclusion criteria of the respective studies, all subjects were healthy and none reported a history of olfactory disturbances.

Odors were delivered using felt-tip pens (“Sniffin’ Sticks”) approximately 14 cm long with an inner diameter of 1.3 cm. Each pen carries a tampon soaked with 4 ml of liquid odorant. For odor presentation, the cap was removed from the pen for approximately 3 s, and the pen’s tip was brought in front of the subject’s nose and carefully moved from the left to the right nostril and back [3].

The threshold was obtained in a three-alternative forced-choice (3 AFC) paradigm in which subjects were repeatedly presented with triplets of pens and had to discriminate the one pen containing an odorous solution from two blanks filled with the solvent. Either phenylethanol (dissolved in propylene glycol) or n-butanol (dissolved in water) was used; the two odorants have been found equivalent in olfactory sensitivity testing, as scores obtained with them are correlated [17]. The highest concentration was a 4% odor solution. Sixteen concentrations were created by stepwise 1:2 dilution of the preceding one. Starting with the lowest odor concentration, a staircase paradigm was used in which two consecutive correct identifications of the odorous pen or one incorrect answer marked a so-called turning point and resulted in a decrease or an increase, respectively, of the concentration in the next triplet. Triplets were presented at 20 s intervals. The threshold score was the mean of the last four turning points in the staircase, with the final score ranging between 1 and 16 points.
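The dilution series and the scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the function names are hypothetical, and the sketch assumes that turning points are recorded as dilution-step numbers from 1 to 16, a coding that the text does not spell out.

```python
from statistics import mean


def dilution_series(top_percent=4.0, steps=16):
    """Return concentrations in percent, strongest first, each half the previous."""
    return [top_percent / 2**k for k in range(steps)]


def threshold_score(turning_points):
    """Mean of the last four turning points, each given as a step number (1-16).

    Assumption: turning points are coded as dilution-step numbers.
    """
    if len(turning_points) < 4:
        raise ValueError("at least four turning points are required")
    if any(not 1 <= p <= 16 for p in turning_points):
        raise ValueError("turning points must lie between 1 and 16")
    return mean(turning_points[-4:])


series = dilution_series()
print(series[:3])                               # three strongest concentrations
print(threshold_score([9, 7, 8, 6, 8, 6, 7]))   # hypothetical staircase run
```

Because the score is a mean of four step numbers, it can take quarter-point values, which is why TDI cutoffs elsewhere in the text are given in steps of 0.25.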

The discrimination task used the same 3 AFC logic. Two pens of each triplet contained the same odorant, while the third pen smelled different. Subjects were asked to indicate the single pen with a different smell. Within-triplet intervals were approximately 3 s. As the odors used in this subtest were more intense, between-triplet intervals were 20–30 s. The score was the number of correctly identified odd pens; hence, scores in this task ranged from 0 to 16 points. Importantly, subjects were blindfolded for the threshold and discrimination tasks to avoid visual identification of target pens.

Odor identification comprised common and familiar odorants (recognized by at least 75% of the population). Subjects were presented with single pens and asked to identify the smell by choosing among four alternative descriptors for each pen. Between-pen intervals were approximately 20–30 s. The total score was the number of correctly identified pens, so subjects could score between 0 and 16 points.

The final “TDI score” was the sum of scores for Threshold, Discrimination and Identification subtests, with a range between 1 and 48 points.
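As a small sketch of this sum, the following Python function adds the three subtest scores and checks them against the ranges given above. The function name and the range checks are illustrative assumptions, not part of the published test protocol.

```python
def tdi_score(threshold, discrimination, identification):
    """Sum the three subtest scores after checking their stated ranges."""
    if not 1 <= threshold <= 16:
        raise ValueError("threshold must be between 1 and 16")
    for name, value in (("discrimination", discrimination),
                        ("identification", identification)):
        if not 0 <= value <= 16:
            raise ValueError(f"{name} must be between 0 and 16")
    return threshold + discrimination + identification


print(tdi_score(1, 0, 0))      # lowest possible TDI score
print(tdi_score(16, 16, 16))   # highest possible TDI score
```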

Statistical analyses

Data were analyzed with SPSS v. 25 (SPSS Inc., Chicago, Ill., USA). Subjects were divided into nine age groups: (A) 5–10 years (n = 889); (B) 11–20 years (n = 1750); (C) 21–30 years (n = 2995); (D) 31–40 years (n = 1102); (E) 41–50 years (n = 847); (F) 51–60 years (n = 737); (G) 61–70 years (n = 464); (H) 71–80 years (n = 212); and (I) over 80 years (n = 143). Descriptive statistics were computed to establish norms based on the extended sample (Table 1). We examined the effects of sex (female vs male) and age (groups A–I) on TDI scores by means of analysis of variance (ANOVA). Further, we modelled the effects of sex and age on the separate threshold, discrimination and identification scores, controlling for within-subject variance using repeated measures analysis of variance (rm-ANOVA). Pairwise comparisons between the nine age groups were Bonferroni-corrected for multiple comparisons. To provide guidance for assessing individual olfactory abilities in relation to specific age groups, we calculated the tenth percentile of the TDI score for each age group.
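The grouping and percentile steps can be illustrated with a short Python sketch. The analysis itself was run in SPSS; the code below is only a hypothetical re-implementation using the standard library. It assumes that group I starts at 81 years, and percentile conventions differ between packages, so values may deviate slightly from those in Table 1.

```python
from collections import defaultdict
from statistics import quantiles

# Upper age bound (inclusive) of groups A-H; anyone older falls into group I.
UPPER_BOUNDS = [(10, "A"), (20, "B"), (30, "C"), (40, "D"),
                (50, "E"), (60, "F"), (70, "G"), (80, "H")]


def age_group(age):
    """Map an age in years to the group letter used in this study."""
    if age < 5:
        raise ValueError("the sample starts at 5 years of age")
    for upper, label in UPPER_BOUNDS:
        if age <= upper:
            return label
    return "I"


def tenth_percentile_by_group(records):
    """records: iterable of (age, tdi) pairs -> {group: tenth percentile}."""
    scores = defaultdict(list)
    for age, tdi in records:
        scores[age_group(age)].append(tdi)
    # quantiles(..., n=10)[0] is the first decile; each group needs >= 2 scores.
    return {g: quantiles(s, n=10)[0] for g, s in sorted(scores.items())}


# Hypothetical scores for nine subjects aged 25 (all in group C).
demo = [(25, s) for s in range(10, 100, 10)]
print(tenth_percentile_by_group(demo))
```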

Table 1 Normative values for the Sniffin’ Sticks test

Results

Effects of sex and age on overall TDI score

We found a main effect of age on the overall TDI score, F(8, 3337) = 128.8, p < 0.001, η² = 0.24. Pairwise comparisons indicated that the most pronounced increase in overall olfactory abilities occurred between group A (5–10 years) and group B (11–20 years) and the most pronounced decrease at the age of 61–70 years (Figs. 1, 2). There was also a significant yet small main effect of sex, F(1, 3337) = 26.9, p < 0.001, η² = 0.008, indicating that on average females (M = 31.7 ± 0.18) outperformed males (M = 30.4 ± 0.19). The two factors of interest (sex and age) did not interact (p = 0.12).
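As a plausibility check, the reported effect sizes can be recovered from the F statistics and their degrees of freedom. The sketch below assumes that the reported values are partial eta squared, which the text does not state explicitly; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


age_effect = partial_eta_squared(128.8, 8, 3337)
sex_effect = partial_eta_squared(26.9, 1, 3337)
print(round(age_effect, 2), round(sex_effect, 3))  # prints: 0.24 0.008
```

Under this assumption both values agree with the effect sizes reported for age and sex.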

Fig. 1 Mean TDI scores obtained from female and male subjects across the nine age groups. Error bars represent SEM. The bottom table shows differences between mean scores of two groups (group in a column − group in a row) and the level of post-hoc test significance: ***p < 0.001; **p < 0.01; *p < 0.05

Fig. 2 TDI scores obtained from female and male subjects with polynomial trendlines for both sexes

Effects of sex and age on olfactory threshold, odor discrimination and odor identification

We observed a significant interaction between age group and subtest [F(16, 6646) = 7.8, p < 0.001, η² = 0.02], but not between sex and subtest (p = 0.23). The age-related decrease was present in each subtest (F = 128.3, p < 0.001, η² = 0.24); however, it was more pronounced in the threshold task than in discrimination and identification. Pairwise comparisons are displayed in Fig. 3.

Fig. 3 Changes of odor threshold, discrimination, and identification across the nine age groups. All data are expressed relative to the mean score of the age group with the highest scores (group C for threshold and discrimination; group D for the identification subtest). Tables in the right panel present differences between mean scores of two groups (group in a column − group in a row) and the level of post-hoc test significance: ***p < 0.001; **p < 0.01; *p < 0.05

With sexes pooled, the tenth percentile of the TDI score was 19.4 points for group A (5–10 years); 28.5 points for group B (11–20 years); 30.75 points for group C (21–30 years); 30.5 points for group D (31–40 years); 28.15 points for group E (41–50 years); 27.25 points for group F (51–60 years); 24.88 points for group G (61–70 years); 19.2 points for group H (71–80 years); and 13 points for group I (over 80 years). These data provide, on the one hand, guidelines for assessing individual olfactory abilities in relation to specific age groups. On the other hand, the final diagnosis of hyposmia versus normosmia depends on the reference group of young adults, with a cutoff value of 30.75 points.

The term “functional anosmia” refers to individuals who, as experienced in everyday life, have no or only a negligible sense of smell. To differentiate between “functional anosmia” and hyposmia, we established a TDI score of 16 points, which is equivalent to both identification and discrimination scores of 8, the maximum that 90% of patients with anosmia would achieve, as reported earlier [16].

“Supersmellers” are subjects who reach at least the 90th percentile of the group aged 21–30 years, i.e., 41.5 or more points. Table 2 presents the proportion of subjects with functional anosmia (scoring ≤ 16 points), hyposmia (scoring between 16.25 and 30.5 points), normosmia (scoring between 30.75 and 41.25 points), and supersmellers (scoring 41.5 points or above), across the nine age groups.
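The four diagnostic categories can be expressed as a simple lookup against the cutoffs given above. The following Python function is a minimal sketch with a hypothetical name. It assumes that TDI scores come in quarter-point steps, so no valid score falls strictly between 16 and 16.25; such values would be labelled hyposmia here.

```python
def classify_tdi(tdi):
    """Assign a TDI score to one of the four categories described in the text."""
    if not 1 <= tdi <= 48:
        raise ValueError("TDI score must be between 1 and 48")
    if tdi <= 16:
        return "functional anosmia"
    if tdi < 30.75:
        return "hyposmia"
    if tdi < 41.5:
        return "normosmia"
    return "supersmeller"


for score in (16, 16.25, 30.5, 30.75, 41.25, 41.5):
    print(score, classify_tdi(score))
```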

Table 2 Percentage of participants (all of whom identified themselves as having a normal sense of smell) with normosmia, hyposmia, and functional anosmia, separately for the nine age groups

Discussion

The current study provides updated norms for the “Sniffin’ Sticks” olfactory test based on a large sample. The present data obtained from 9139 subjects corroborate previous normative findings, which is noteworthy given that approximately two-thirds of the data are newly added to the database compared with the previous version from 2007 [15]. We observed similar values of the tenth percentile in all age groups, although an exact comparison cannot be made because of the narrower age categories in the present study; e.g., age group A previously spanned 5–15 years, whereas the current study presents data for age groups A (5–10 years) and B (11–20 years).

In the current investigation, the hyposmia cutoff was 30.75 points in the reference group aged 21–30 years. Hitherto, the hyposmia cutoff score was 30.5 points [15], and in the proposed normative dataset exactly this value of 30.5 points is the tenth percentile of age group D (31–40 years). This 0.25 point difference between age groups C and D is likely to result from the division of the previously investigated group aged 16–35 years into two decade-wide age groups C (21–30 years) and D (31–40 years). The diagnosis of hyposmia remains to some extent an arbitrary decision, as the cutoff point of 30.75 has been established with respect to group C aged 21–30 years, representing the overall best smelling subsample. Alternatively, individual scores may be regarded in relation to the corresponding sex and age groups. The following example shows how to interpret a patient’s score: A female subject aged 55 years obtained a threshold score of 4.5 points, an identification score of 14 points and a discrimination score of 13 points, resulting in a TDI score of 31.5 points. According to the tenth percentile of her age group, her outcome would be “hyposmia” for the threshold test and “normosmia” for identification and discrimination. As, by definition, the more general, overall TDI sumscore overrides separate subtest results, her final diagnosis would be “normosmia”.
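The TDI part of this worked example can be reproduced with a short Python sketch. The tenth percentiles are the pooled-sex values listed in the Results; the function and key names are hypothetical, and subtest-level norms are not included because they appear only in Table 1.

```python
# Tenth percentiles of the TDI score per age group (sexes pooled), from the Results.
TENTH_PERCENTILE = {"A": 19.4, "B": 28.5, "C": 30.75, "D": 30.5, "E": 28.15,
                    "F": 27.25, "G": 24.88, "H": 19.2, "I": 13.0}

# Hyposmia cutoff, defined by the reference group aged 21-30 years (group C).
REFERENCE_CUTOFF = TENTH_PERCENTILE["C"]


def interpret(threshold, discrimination, identification, group):
    """Compare a subject's TDI score with the reference and age-group norms."""
    tdi = threshold + discrimination + identification
    return {
        "tdi": tdi,
        "meets_reference_cutoff": tdi >= REFERENCE_CUTOFF,
        "below_age_norm": tdi < TENTH_PERCENTILE[group],
    }


# The 55-year-old subject from the example belongs to group F (51-60 years).
result = interpret(threshold=4.5, discrimination=13, identification=14, group="F")
print(result)
```

Her TDI score of 31.5 points lies above both the reference cutoff and the tenth percentile of her own age group, consistent with the final diagnosis of normosmia.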

Importantly, the changes of the tenth percentile TDI scores observed in the youngest and oldest age groups call for a deeper, updated analysis of changes in olfactory performance [18]. The current data indicate that the loss was most pronounced for olfactory threshold. Olfactory discrimination and identification, in contrast, are tested with suprathreshold concentrations of odors and are, in addition, largely determined by individual experience and conscious cognitive processes, which decline at a slower pace over time. The pronounced decrease of odor thresholds with age supports the idea that it reflects damage to the periphery of the olfactory system to a greater degree than diminished odor identification and discrimination do, as the latter are more strongly related to higher cognitive processes (for discussion see: [19,20,21]).

The relatively high percentage of children under 10 years with hyposmia is considered to be due to test difficulty rather than low olfactory function. Therefore, age-appropriate olfactory tests are necessary and have indeed been developed [9, 11, 22].

Our extended data further corroborated earlier reports on decreased olfactory abilities in age groups over 55 years [18, 23,24,25,26,27]. The apparent decrease in olfactory performance in seniors older than 60 raises the question of the dynamics of olfactory loss with age.

“Functional anosmia” (a residual ability to perceive odors with limited usefulness in daily life) was found in a total of 0.45% of the subjects and was most prevalent in the oldest age groups, with the most visible decrease of function from age 70 years upwards. These subjects either have no olfactory function left at all or exhibit a modest ability to perceive, discriminate or identify odors that is insufficient for an enjoyable experience of foods and drinks or for detecting environmental hazards such as gas, fire or spoiled food. However, age itself should not be considered a cause of olfactory loss but rather a factor accompanying neurodegenerative diseases, drug side-effects, etc. [28, 29].

The incidence of 0.45% of participants with functional anosmia is low compared to epidemiological studies (e.g., [30]). One reason may be that the TDI score is used as the basis to establish the diagnosis of “functional anosmia”. However, using an odor identification score below eight points to determine the fraction of this population yields a figure of 3.4%. It has to be considered that all subjects entering the study reported having a fully functional olfactory system. Yet, we found a meaningful proportion of subjects scoring in the range of hyposmia or functional anosmia, who seem either not to be aware of their olfactory dysfunction or not to be bothered by it. Finally, it must be kept in mind that the majority of the currently described population is young and healthy. Therefore, no conclusions regarding the epidemiology of olfactory loss in the general population can be drawn from this work.

We observed sex-related differences, with women outperforming men. Available empirical reports on this issue are inconclusive, with some studies pointing to a female advantage in olfactory tasks over males [15, 31, 32] but others failing to confirm this difference [33]; for review see: [34]. In our large sample, we observed a main effect of sex indicating that females obtained significantly higher scores than males; however, the difference in mean TDI scores between the two groups was rather small (1.3 points). With such a large sample, even very small absolute differences become significant. In any case, the current study confirmed that sex-related differences are present but may be small; in other words, if sex-related differences are observed at all, it is typically women outperforming men.

We present updated norms for “Sniffin’ Sticks” based on a large sample of 9139 subjects. With this extended sample we found hyposmia to be defined at less than 30.75 points of TDI score in the group aged 21–30 years. The observed effects of sex and age corroborate previous norms, showing that olfactory abilities change significantly with age, with the most pronounced increase between the ages of 5 and 20 years and the most pronounced decrease at the age of 61–70 years.