Language is a powerful vehicle for the expression of emotion, and can influence the emotional states of others. Even single words in isolation can evoke strong affective reactions. Generally speaking, affective responses are determined by at least two affective variables: valence and arousal (Russell, 2003). Valence describes the extent to which an emotion is pleasant or unpleasant, whereas arousal refers to the degree of physiological activation elicited by a stimulus, varying from calm to excited (Lang, Bradley, & Cuthbert, 1997). A growing body of research has indicated that valence and arousal modulate the speed of visual word processing in various cognitive tasks (e.g., Citron, 2012; Herbert, Junghofer, & Kissler, 2008; Hinojosa, Méndez-Bértolo, & Pozo, 2010; Kissler, Assadollahi, & Herbert, 2006). Additionally, a number of studies have suggested that lexico-semantic variables, such as concreteness, imageability, context availability, familiarity, frequency, and length, also influence emotional word processing (Citron, 2012; Kuchinke, Võ, Hofmann, & Jacobs, 2007; Paivio, 1986; Schwanenflugel, Harnishfeger, & Stowe, 1988). Although studies of affective variables (i.e., valence and arousal) and semantic variables (e.g., concreteness) have traditionally tended to run in different directions, in recent years interest has been growing in studying the relationship between the two. Specifically, recent studies have shown that, as an important semantic variable, word concreteness plays an important role in the processing of emotional words (Barber, Otten, Kousta, & Vigliocco, 2013; Kanske & Kotz, 2007; Kousta, Vigliocco, Vinson, Andrews, & Del Campo, 2011; Tse & Altarriba, 2009; Yao & Wang, 2013, 2014). However, little is known about the exact relationships between affective variables (i.e., valence and arousal) and word concreteness, after controlling for semantic variables such as familiarity, imageability, and context availability.

Recently, Vigliocco and co-authors (Kousta et al., 2011; Vigliocco, Meteyard, Andrews, & Kousta, 2009) proposed a new hypothesis regarding how concreteness influences the processing of emotional words. According to this hypothesis, concrete and abstract words are composed of both experiential information (sensory, motor, and affective) and linguistic information. The differences between concrete and abstract words arise because of a statistical preponderance of sensorimotor information underlying concrete words, versus a preponderance of affective information underlying abstract words (Kousta et al., 2011; Vigliocco et al., 2009). In short, the difference in affective information between concrete and abstract words may be the reason why concreteness affects the processing of emotional words.

Accumulating evidence has indicated that concreteness is a crucial variable in the processing of emotional words and is associated with affective variables (i.e., valence and arousal; Hinojosa et al., 2016; Kaltwasser, Ries, Sommer, Knight, & Willems, 2013; Kanske & Kotz, 2007; Kousta et al., 2011; Palazova, Sommer, & Schacht, 2013; Sheikh & Titone, 2013; Tse & Altarriba, 2009; Vigliocco et al., 2014; Yao & Wang, 2013). Altarriba, Bauer, and Benvenuto (1999) were the first to note the relationship between the two, proposing that abstract words more likely refer directly to emotional states. On the basis of this finding, Altarriba and Bauer (2004) found that emotion words were recalled better than both concrete and abstract words in a free-recall task, and that concrete words, abstract words, and words denoting emotional states consistently received different concreteness, imageability, and context availability ratings. A study by Barsalou and Wiemer-Hastings (2005) also showed that abstract words have more emotional features than do concrete words.

In addition, evidence from behavioral (Kousta et al., 2011) and fMRI (Vigliocco et al., 2014) studies has indicated that affective variables also play a critical role in the processing of concrete and abstract words. Concrete words are usually responded to faster than abstract words (Paivio, 1986, 1991; Schwanenflugel et al., 1988), a pattern known as the “concreteness effect” (Paivio, 1991). However, Kousta et al. (2011) employed more sophisticated regression methods for analyzing lexical-decision response times (RTs) for a large number of words (n = 2,330) from the English Lexicon Project (ELP; Balota et al., 2007), and found that when a large number of lexico-semantic variables (including familiarity, imageability, and context availability) were controlled, or partialed out in regression analyses, abstract words elicited faster RTs than concrete words in lexical-decision tasks. This result is at odds with a long tradition of literature that has shown a processing advantage for concrete over abstract words (Binder, Westbury, McKiernan, Possing, & Medler, 2005; Levy-Drori & Henik, 2006; Paivio, 1986, 1991; Schwanenflugel et al., 1988). Kousta et al. (2011) proposed that the processing advantage for abstract words was due to differences in emotional valence (i.e., whether the words were positive, negative, or neutral) between concrete and abstract words. In their Experiment 3, including 480 words spanning the entire range of concreteness and valence ratings, the effects of concreteness (i.e., faster responses for abstract than for concrete words) disappeared when valence was included as a predictor. Thus, the reversed concreteness effect indicates that differences in terms of linguistic information do not exhaust the differences between concrete and abstract words, given that these are still processed differently after controlling for variables such as imageability, context availability, and so on.
In other words, the traditional concreteness effect is mediated by words’ affective associations. In previous work, greater affective associations have been shown to facilitate word processing in lexical-decision tasks (Kanske & Kotz, 2007; Kousta, Vinson, & Vigliocco, 2009).

To make the link between abstract words and affective associations explicit, Vigliocco et al. (2014) provided evidence that abstract words tend to have more affective associations than do concrete words. They carried out regression analyses for 1,446 English words, spanning the concreteness and valence/arousal continua, and suggested that valence and arousal ratings significantly predicted concreteness ratings, even after imageability and context availability had been taken into account: More valenced and arousing words tended to be more abstract, whereas neutral words tended to be more concrete. With regard to abstract words, emotionality ratings predicted modulation of the blood oxygenation level-dependent signal in the rostral anterior cingulate cortex, an area associated with emotional processing. This result further indicated that the difference in the affective associations of concrete and abstract words may be the reason why the traditional concreteness effect on RTs was reversed when considering affective as well as other lexical variables.

So far, a few studies have collected psycholinguistic measures along with affective norms (Citron, Weekes, & Ferstl, 2014; Schmidtke, Schröder, Jacobs, & Conrad, 2014; Võ et al., 2009). For example, Võ et al. revised the list of affective German words, the Berlin Affective Word List Reloaded (BAWL-R), which was the first list to contain not only valence, arousal, and imageability, but also a large set of psycholinguistic factors that have been known to influence word perception. On the basis of the Sussex Affective Word List, Citron et al. reported positive linear correlations of imageability with affective variables (i.e., valence and arousal). Specifically, positive words and arousing words were more imageable than negative and nonarousing words, respectively. However, among previous studies in this field, very few have explored the relationship between affective variables and concreteness ratings. Altarriba and Bauer (2004) showed that emotional words were more imageable but less concrete than abstract words, and also less imageable and less concrete than the concrete words themselves. Montefinese, Ambrosini, Fairfield, and Mammarella (2014) collected a total of 1,121 Italian words taken from the Affective Norms for English Words (ANEW; Bradley & Lang, 1999), and subsequently correlated the affective variables and psycholinguistic ratings (concreteness, imageability, etc.). They found nonlinear effects of concreteness on the arousal ratings: Words that were very abstract or very concrete made people feel calm, whereas those in the middle of the concreteness range increased excitement.
Guasch, Ferré, and Fraga (2015) collected subjective ratings for 1,400 Spanish words for valence, arousal, concreteness, and other semantic variables, and found that concreteness showed significant negative correlations with both emotional load (positive and negative) and arousal, indicating that the more concrete a word is, the less emotionally loaded it is, and the lower its level of arousal. Contrary to these findings, Hinojosa et al. (2016) asked 660 native Spanish speakers to rate 875 Spanish words for valence, arousal, and concreteness, as well as collecting several objective psycholinguistic variables, and found a negative correlation between valence and concreteness, suggesting that words rated as more positive are also rated as more abstract. In contrast, a positive correlation between arousal and concreteness was observed, indicating that more arousing words are also rated as being more concrete.

Taken together, despite recent efforts to explore the relationship between the affective variables (i.e., valence and arousal) and concreteness, the results have remained inconclusive. The first aim of the present study was therefore to provide subjective ratings for a large set of Chinese words for variables related to concreteness (i.e., concreteness, imageability, context availability, and familiarity), as well as for the affective variables (i.e., valence and arousal), and to reveal whether affective variables are correlated with semantic ones (i.e., concreteness) or, rather, form a distinct cluster. We collected ratings of concreteness, imageability, and context availability because of the significant correlations between concreteness and imageability (Paivio, 1986) and between concreteness and context availability (Schwanenflugel et al., 1988). Furthermore, Altarriba et al. (1999) found that concrete words were assigned higher ratings of imageability and context availability than abstract words. The dual-coding theory (Paivio, 1986) and the context availability theory (Schwanenflugel et al., 1988) explained the experimental concreteness effects in terms of differences in either imageability or context availability, and these three variables are often confounded (Guasch et al., 2015). Thus, in order to explore the relationship between concreteness and the affective variables, it is clearly important to consider the influences of imageability and context availability on concreteness ratings. For familiarity, Levy-Drori and Henik (2006) found that, among the set of words that differed in context availability, there was a positive correlation between concreteness and familiarity, whereas among the set of words matched in terms of context availability, that correlation was negative.
Therefore, it is necessary to control for these semantic variables related to concreteness, which might have acted as confounding factors to influence the relationship between concreteness and the affective variables.

Research concerning the effects of affective variables on word processing has used words from normed lists. For instance, the ANEW database, which provides ratings for 1,034 words in the dimensions of valence, arousal, and dominance, is the most widely used corpus in English (Bradley & Lang, 1999). Concreteness, imageability, and familiarity ratings (on a scale from 100 to 700) were provided by the Medical Research Council Psycholinguistic Database (MRC; Coltheart, 1981). In Chinese, most studies of emotional words have been based on the Chinese Affective Word System (CAWS). The CAWS was published by Wang, Zhou, and Luo (2008), and contains 1,500 two-character words, which were rated in terms of valence, arousal, and dominance by 124 participants using a 9-point scale. However, extant databases in Chinese do not include ratings of concreteness, context availability, or familiarity. Thus, the second aim of the study was to generate a corpus of Chinese words suitable for experiments investigating the effects of emotion as well as semantic features on single-word processing.

In the present study, we provide subjective ratings for 1,100 Chinese words on six dimensions: valence, arousal, concreteness, familiarity, imageability, and context availability. The main value of the ratings is that they make it possible to examine the relationship between affective and semantic variables, and that they provide a general database of Chinese words suitable for experiments investigating the effects of emotion as well as semantic variables on word processing.

Method

Participants

Ratings for the six variables (valence, arousal, concreteness, familiarity, imageability, and context availability) were obtained from a sample of 960 university students (480 women and 480 men) from the departments of economics, biology, engineering, philology, and medicine. All of the participants were native Chinese speakers. The ages of the participants ranged from 18 to 21 years old (M = 18.52, SD = 1.49). The ratings were obtained from a public elective psychology course, and the participants received course credits for taking part in the present study.

Materials

The rated word set contained 1,100 Chinese words. Specifically, 399 two-character words were selected from CAWS (Wang et al., 2008) and 708 two-character words from the Modern Chinese Dictionary of Commonly Used Words (1,107 words in total, including seven that served only as examples in the instructions). The selection of words was guided by the idea that in addition to neutral words, we needed as many words as possible with marked values for the two affective variables and two concreteness levels. The set contained nouns, adjectives, and verbs.

Seven of the 1,107 words were used as examples in the instructions and were not rated; the remaining 1,100 words were randomly divided into ten lists of 110 words apiece. To collect ratings for the six variables, we constructed two versions of a questionnaire with the same words. One version aimed to collect the ratings of valence, arousal, concreteness, and familiarity for each word, and the other to collect the ratings of imageability and context availability, resulting in a total of 20 different list versions. Accordingly, the total of 960 participants were divided randomly and evenly by gender into 20 groups: 480 participants were given one version of the questionnaire, in which they were asked to rate the words in terms of their valence, arousal, concreteness, and familiarity. The other version of the questionnaire was completed by the other 480 participants, who gave ratings for imageability and context availability for the same words. In short, each of the 20 list versions contained 110 words and was completed by 48 participants. To reduce sequence effects, the order of the words was counterbalanced across lists.

Procedures

A paper-and-pencil test was used. Word ratings were collected in a quiet classroom over 20 different collective sessions. Each group of participants (48 in total) rated words simultaneously. The participants were told that they would be presented with a list of words and that their task was to rate each word for each dimension given (valence, arousal, concreteness, and familiarity, or instead imageability and context availability). All consent information and instructions for the tasks were provided in written Chinese. The instructions emphasized that there were no right or wrong answers and asked participants not to spend a lot of time thinking about their ratings, because their first impressions were of greatest interest. Moreover, the participants were permitted to stop rating at any time during the study and to restart after a short break, as long as they continued and handed in the list within an hour, or two hours at most.

The ratings for valence, arousal, concreteness, familiarity, imageability, and context availability were collected on a Likert scale ranging from 1 to 9, with 1 indicating very negative, very calm, highly abstract, unfamiliar, difficult to image, and difficult to think of a context, and 9 indicating very positive, very exciting, highly concrete, familiar, easy to image, and easy to think of a context, respectively.

The instructions for valence and arousal were translated into Chinese on the basis of the original English description: “Valence is the extent to which the word makes you feel negative (sad, scared) or positive (happy, contented), whereas arousal is the extent to which the word makes you feel calm (relaxed, bored) or excited (stimulated, agitated).” For each word, the participants were asked to choose one response among nine levels of valence (from extremely negative to extremely positive) and one response among nine levels of arousal (from extremely calming to extremely exciting; Bradley & Lang, 1999).

The instructions for concreteness, imageability, context availability, and familiarity were the Chinese translations of those used in previous studies among English-speaking populations. Concreteness was rated according to how concrete/tangible a concept was in the real world (Paivio, 1991; Della Rosa, Catricalà, Vigliocco, & Cappa, 2010). For imageability ratings, the participants were asked to rate how easy it was for each word to elicit a visual image of the concept that the word indicated (Della Rosa et al., 2010; Kousta et al., 2011). For context availability ratings, participants were asked to rate how easy it was to come up with a particular context or circumstance in which the word might appear (Schwanenflugel et al., 1988). For familiarity, we asked participants to rate their level of familiarity when they read each word (Della Rosa et al., 2010).

To ensure that the participants understood the instructions, we provided the following examples, using seven words that did not appear in the list. For instance, participants in the valence rating condition were asked to judge the extent to which the words made them feel negative/unpleasant or positive/pleasant. An example instruction is as follows:

If you think that “婚礼 (wedding)” has a very positive meaning, please choose 9. If you think that “尸体 (corpse)” has a very negative connotation, please choose 1. If you think that “苍蝇 (house fly)” refers to something that is fairly unpleasant, please choose 2. If you think that “事实 (fact)” refers to something that is neutral, please choose 5.

Similarly, participants in the arousal condition were asked to judge the extent to which the words made them feel active/aroused or passive/calm. The instructions for the other variables and the Chinese translations of the instructions are presented in the supplemental materials (in the file Instructions.pdf).

Results and discussion

Outlier analysis

Each participant’s response was coded and saved as a Microsoft Excel file. Before extracting the final ratings, we examined each participant’s ratings to ensure that each had adequately understood the instructions and completed the ratings for each dimension after having given his or her consent to participate. We used two different criteria to exclude data. The first was to exclude participants who used the same response (e.g., 6) for more than 85 % of the total responses for each list (i.e., cases in which a participant assigned the same value to almost all words). Twenty-one participants (13 in one list version, eight in the other list version, 2.4 % of the total) were excluded because of responses that formed a pattern with almost no variation. The second was to remove individual ratings that were more than 2.5 standard deviations away from their group’s average for each item. Of the total 309,320 registered responses, 7,600 outliers (about 2.5 % of the total) were removed from the analysis, leaving 97.5 % of the responses. Finally, for the remaining data, the mean and SD for each word on each of the six dimensions were calculated in SPSS.
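The two exclusion criteria can be illustrated with a minimal Python sketch. The function names, the use of the sample standard deviation, and the toy data are assumptions made for illustration; the original analysis was carried out in Excel and SPSS, and its exact settings are not reported here.

```python
from collections import Counter
from statistics import mean, stdev


def is_flat_responder(responses, threshold=0.85):
    """Return True if a single rating value makes up more than
    `threshold` of one participant's responses on a list."""
    most_common_count = Counter(responses).most_common(1)[0][1]
    return most_common_count / len(responses) > threshold


def trim_item_outliers(item_ratings, cutoff=2.5):
    """Drop ratings lying more than `cutoff` SDs from the group mean
    for one item (sample SD is an assumption of this sketch)."""
    m = mean(item_ratings)
    sd = stdev(item_ratings)
    if sd == 0:  # every rater gave the same value; nothing to trim
        return list(item_ratings)
    return [r for r in item_ratings if abs(r - m) <= cutoff * sd]


# Hypothetical data: one participant who answers 6 almost everywhere,
# and one item with a single extreme rating.
flat = [6] * 95 + [5] * 5
item = [5] * 10 + [6] * 10 + [1]
print(is_flat_responder(flat))
print(trim_item_outliers(item))
```

Applying the participant-level check first and the item-level trimming second mirrors the order in which the two criteria are described above.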

Descriptive statistics

The word list resulting from the rating procedure can be accessed in the supplemental materials. It contains the 1,100 words in alphabetical order, as well as their English translations, generated by https://translate.google.com/. This database provides the mean ratings and standard deviations for each word on valence, arousal, concreteness, familiarity, imageability, and context availability, as well as the response time and accuracy rate for each. Table 1 shows the means, standard deviations, minimums, and maximums for the different variables.

Table 1 Descriptive statistics for the independent variables

Reliability of the measures

To assess the interrater reliability of the ratings of the six variables that were included in the database, we calculated the split-half correlations, corrected with the Spearman–Brown formula. For each version of the questionnaire, participants were randomly divided into two subgroups of equal size. Since there were ten different questionnaires for valence, arousal, concreteness, and familiarity, and ten different questionnaires for imageability and context availability, we report here the data from their mean split-half correlation coefficients by variables. Overall, the interrater reliabilities were high for all of the variables that we examined.

Regarding the two affective variables, the mean correlation values were r = .93 for valence (ranging from r = .76 to .98) and r = .82 for arousal (ranging from r = .73 to .89). This finding showed that valence had a higher interrater reliability than arousal, which is a common pattern in affective databases: There is greater consensus regarding valence than regarding arousal (e.g., Monnier & Syssau, 2014; Montefinese et al., 2014; Moors et al., 2013; Redondo, Fraga, Padrón, & Comesaña, 2007). High correlations were also observed for other variables, with mean values of r = .94 for concreteness (ranging from r = .87 to .98), r = .87 for familiarity (ranging from r = .76 to .93), r = .97 for imageability (ranging from r = .92 to .99), and r = .86 for context availability (ranging from r = .72 to .92).
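The split-half procedure can be sketched as follows. Raters are randomly split into two halves, the mean rating per word is computed in each half, the two sets of means are correlated, and the correlation r is stepped up with the Spearman–Brown formula, 2r / (1 + r). The function names, the fixed random seed, and the data layout (one list of ratings per participant, in the same word order) are assumptions of this sketch, not details taken from the original analysis.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r):
    """Step a half-length correlation up to full-length reliability."""
    return 2 * r / (1 + r)


def split_half_reliability(ratings, seed=0):
    """ratings: one list per participant, one rating per word."""
    order = list(range(len(ratings)))
    random.Random(seed).shuffle(order)  # random assignment to halves
    half = len(order) // 2
    group_a = [ratings[i] for i in order[:half]]
    group_b = [ratings[i] for i in order[half:2 * half]]
    n_words = len(ratings[0])
    mean_a = [sum(p[w] for p in group_a) / half for w in range(n_words)]
    mean_b = [sum(p[w] for p in group_b) / half for w in range(n_words)]
    return spearman_brown(pearson(mean_a, mean_b))
```

Running `split_half_reliability` once per list version and averaging the resulting coefficients per variable would correspond to the mean values reported above.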

To further assess the reliability of our ratings of the affective variables (no prior ratings for semantic variables were available in Chinese), we correlated them with the ratings from CAWS (Wang et al., 2008). For the total samples mentioned previously, there were 399 words in common with CAWS. For valence and arousal, the correlation coefficients were .81 (p < .001) and .75 (p < .001), respectively.

Evaluation of the normed variables with lexical-decision times

To assess the capacity of our normed variables to predict word recognition performance, we first recruited 36 right-handed volunteers (18 men and 18 women, mean age = 19.3 years, SD = 2.31; all native Chinese speakers with normal or corrected-to-normal vision) from Xidian University and asked them to finish a lexical-decision task. Then we computed a linear regression analysis considering the RT values of the 1,100 words in our database. The dependent variable was the RTs in the lexical-decision task, whereas the predictors were the six variables rated in the present database.

A total of 2,200 stimuli were presented to the participants. Half of them were the 1,100 Chinese words from our database, and the other half were legal pseudowords (these pseudowords were based on the 1,100 original words and were generated by altering one random character within different real words). All stimuli were randomly divided into ten different blocks of 220 stimuli each (with equal numbers of words and pseudowords). Participants were tested individually and completed five blocks of stimuli in each of two sessions held on different days. The order of word presentation in each block and of the blocks in each session was randomized for each participant.

Each trial began with a fixation cross presented in the middle of the screen for 400 ms, followed by presentation of the string for 2,000 ms or until a response was given. The intertrial interval was 800–1,000 ms. Participants were instructed to respond as quickly and accurately as possible and to press one of two keys: “Z” if the stimulus was a real word, or “M” if it was a pseudoword. These two keys were counterbalanced across participants. Prior to the experiment trials, seven practice items were first presented. The task was presented by E-Prime 2.0 (Psychology Software Tools Inc., Sharpsburg, PA).

In the analysis of RTs, we excluded all responses either faster than 200 ms or slower than 2,000 ms, as well as wrong responses (1.04 % of the data). Table 2 shows the correlation analysis between the six variables and the RTs for each word. To explore the capacity of our normed variables to predict word recognition performance, a linear regression analysis was conducted. We tested multicollinearity among the variables and removed imageability (VIF = 6.70) in this regression analysis due to the high correlations with concreteness (.78) and context availability (.88). The R² of the model was .37, F(5, 1094) = 41.842, p < .001. The linear regressions showed that familiarity (β = –.12; t = –3.79, p < .05) and context availability (β = –.09; t = –2.83, p < .05) had significant facilitative effects on RTs, whereas concreteness (β = .41; t = 9.55, p < .001) had a significant inhibiting effect. The affective variables [i.e., valence (β = .05; t = 0.94), valence squared (β = –.03; t = –0.72), and arousal (β = –.04; t = –0.77)] were not significant predictors of RTs (all ps > .05). Taken together, these results indicate that the RTs in the lexical-decision task were predicted by familiarity, concreteness, and context availability, but not by valence and arousal, which is in line with prior research (Guasch et al., 2015; Levy-Drori & Henik, 2006). A relevant study here is that of Guasch et al., who explored the effects of both psycholinguistic and affective variables on word recognition, and found that the better predictors of RTs in a lexical-decision task were familiarity and word length, plus modest contributions from concreteness and context availability. The affective variables (i.e., valence and arousal) were not predictive at all.
As far as we know, very few studies have directly investigated whether RTs in lexical-decision tasks can be predicted by valence and arousal. This is because the relevant studies (including our study) aimed to provide subjective ratings for a large set of words for both semantic and affective variables, which have been shown to influence RTs in the lexical-decision task. Although this fact weakens the capacity of affective variables to predict word recognition performance, it does not undermine the validity of our norms, because there is a substantial consistency between our normative data and the findings in other similar databases (Guasch et al., 2015; Levy-Drori & Henik, 2006).
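The multicollinearity check mentioned above relies on the variance inflation factor (VIF). For predictor j, VIF = 1 / (1 − R²_j), where R²_j comes from regressing that predictor on all the other predictors. The following pure-Python sketch illustrates the computation; the helper names and the small Gaussian-elimination solver are assumptions made to keep the example self-contained, and the original analysis was presumably run in a statistics package. The cutoff that led to the removal of imageability is not stated in the text, so none is hard-coded here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least squares fit of y on the predictor
    columns plus an intercept, via the normal equations."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(predictors, j):
    """Variance inflation factor of predictor j given the others."""
    others = [p for k, p in enumerate(predictors) if k != j]
    return 1.0 / (1.0 - r_squared(predictors[j], others))
```

With only two predictors, the VIF reduces to 1 / (1 − r²), where r is their correlation, which gives a quick sanity check of the function on toy data.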

Table 2 Correlation coefficients between response times (RTs) and the six variables (r values)

Exploration of the relationships among word variables

Relationship between affective variables: Valence and arousal

First of all, we explored the relationship between valence and arousal, since previous studies focused on developing affective databases in different languages have commonly reported that these two variables are related (e.g., Bradley & Lang, 1999; Ferré, Guasch, Moldovan, & Sánchez-Casas, 2012; Kanske & Kotz, 2010; Montefinese et al., 2014; Redondo et al., 2007; Schmidtke et al., 2014; Võ et al., 2009). To this end, a quadratic regression with the mean valence and its square as independent variables and the mean arousal as a dependent variable was conducted: The effects of all other semantic variables were partialed out by entering them as predictors in a first step; valence and valence squared were then entered in a second step. All the semantic variables predicted 3.9 % of the variance in the first model [R² = .04; F(4, 1095) = 10.98, p < .001]. Valence and its square were additional significant predictors in the second model, accounting for an additional 35.2 % of the variance [R² = .39; F(2, 1093) = 315.03, p < .001]. The quadratic model outperformed the simpler linear model, which, although significant, explained only 5.6 % of the variance [R² = .06; F(1, 1094) = 25.09, p < .001]. This finding indicated that very positive and very negative words were rated as being the most arousing stimuli, whereas items with low positive and negative ratings were perceived as being the least arousing. That is in line with the findings of prior studies, which have shown that the relationship between valence and arousal is described by a U-shaped distribution in different languages (Bradley & Lang, 1999; Citron et al., 2014; Ferré et al., 2012; Montefinese et al., 2014; Redondo et al., 2007; Schmidtke et al., 2014; Võ et al., 2009). Figure 1 shows the locations of the ratings for the 1,100 words in the two-dimensional affective space.
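The two-step regression described above can be written out explicitly. The notation below is introduced here for illustration only (Aro, Val, Con, Ima, CA, and Fam stand for the word-level mean ratings of arousal, valence, concreteness, imageability, context availability, and familiarity). The F-change expression is the standard one for nested models, and reading the reported F(2, 1093) as an F-change statistic is an assumption.

```latex
\begin{align*}
\text{Step 1:}\quad
  \mathit{Aro} &= \beta_0 + \beta_1\,\mathit{Con} + \beta_2\,\mathit{Ima}
                + \beta_3\,\mathit{CA} + \beta_4\,\mathit{Fam} + \varepsilon,
  & R_1^2 &\approx .039 \\
\text{Step 2:}\quad
  \mathit{Aro} &= \beta_0 + \beta_1\,\mathit{Con} + \beta_2\,\mathit{Ima}
                + \beta_3\,\mathit{CA} + \beta_4\,\mathit{Fam}
                + \beta_5\,\mathit{Val} + \beta_6\,\mathit{Val}^2 + \varepsilon,
  & R_2^2 &\approx .039 + .352 = .391 \\
F_{\text{change}} &= \frac{(R_2^2 - R_1^2)/2}{(1 - R_2^2)/1093}
  \approx \frac{.352/2}{.609/1093} \approx 316
\end{align*}
```

With the rounded R² values, this gives roughly 316, close to the reported F(2, 1093) = 315.03; a gap of this size is what rounding of the R² values would be expected to produce. A positive coefficient on the squared valence term corresponds to the U-shaped relationship described above.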

Fig. 1 Distribution of the mean ratings for the 1,100 words in the valence and arousal variables

The association between valence and arousal was further examined by classifying each of the words in the database as being positive, negative, or neutral. Words were distributed according to the same criteria used in prior studies (Ferré et al., 2012; Hinojosa et al., 2016; Monnier & Syssau, 2014). We decided to consider words with valence values ranging from 1 to 4 as negative (M = 2.50, SD = 0.67), words with values ranging from 4 to 6 as neutral (M = 4.93, SD = 0.54), and words with values ranging from 6 to 9 as positive (M = 6.79, SD = 0.51). According to these criteria, we identified 361 negative (32.82 % of the whole database), 424 neutral (38.55 %), and 315 positive (28.64 %) words.

After separating the words into the three groups as mentioned, we computed the pairwise correlation between valence and arousal within each group (see Fig. 1). In the negative range, the correlation was moderate and negative, r = –.50, p < .001, whereas in the positive domain it was positive, but lower in magnitude than that for the negative words, r = .26, p < .001. This result indicated that the relationship between valence and arousal appears to be asymmetrical, in that the correlation between valence and arousal showed a steeper slope for negative than for positive words. This result agreed with previous reports (Bradley & Lang, 1999; Citron et al., 2014; Ferré et al., 2012; Guasch et al., 2015; Montefinese et al., 2014; Redondo et al., 2007; Schmidtke et al., 2014; Võ et al., 2009), which suggested that both the most negative and most positive words have higher ratings in the arousal dimension, but the increase in emotional arousal in relation to an increasing degree of negative valence seems to be stronger than that related to an increasing degree of positive valence. As has been suggested, positive stimuli are associated with feelings of safety, so they are not necessarily high in arousal, whereas negative stimuli may reflect a dangerous event that requires a quick response (Citron et al., 2014; Lang et al., 1997).

Relationship between semantic variables: Concreteness, imageability, context availability, and familiarity

We stated in the introduction that imageability, context availability, and familiarity relate to concreteness and may have acted as confounding factors in determining the relationship between concreteness and the affective variables. Thus, we computed the Pearson correlations among these variables in order to explore whether the usual pattern of relations was also observed in the present database (see Table 3).

Table 3 Correlations among the semantic variables (r values) and their significance levels

All correlations and their significance levels are reported in Table 3. Concreteness showed a high positive correlation with imageability (r = .78), indicating that as concreteness increased, so did the ease of forming a mental image depicting the meaning of the word. That is, concrete words can be easily imagined (e.g., 苹果, “apple”), whereas abstract words are difficult to capture with a mental picture (e.g., 信念, “belief”). This result is in line with other studies (Guasch et al., 2015; Paivio, 1986, 1991). Concreteness and context availability showed a high and positive correlation (r = .71), indicating that concrete words are more easily associated with a context than are abstract words. This result is in agreement with those observed in previous studies (Altarriba et al., 1999; Guasch et al., 2015; Schwanenflugel et al., 1988).

Next, imageability and context availability also showed a high and positive correlation (r = .88), suggesting that the easier it is to imagine the content of a word, the easier it is to access an appropriate context for its use. As was mentioned by Guasch et al. (2015), imageability and context availability are two deeply related variables from a theoretical point of view. Words that are higher in imageability can easily extract perceptual cues from contextual information, whereas for words that are lower in imageability, contextual information derived from perceptual cues is limited. In short, our results are in agreement with previous findings in English (Altarriba et al., 1999; Paivio, 1986; Schwanenflugel et al., 1988) and in Spanish (Guasch et al., 2015), and confirm the adequacy of our database for the study of processing differences between concrete and abstract words in Chinese.

Finally, we explored the relationships between familiarity and the concreteness, imageability, and context availability measures. Familiarity showed moderate positive correlations with concreteness (r = .54) and context availability (r = .46), suggesting that highly familiar words are more concrete and more easily associated with a context than are unfamiliar words. In addition, the correlation between familiarity and imageability (r = .34) was somewhat lower, similar to the values of .40 obtained by Citron et al. (2014) and of .31 obtained by Guasch et al. (2015), suggesting that highly imageable words tend to be more familiar than low-imageable ones. Thus, it is appropriate to control for familiarity in studies on the relationship between concreteness and the affective variables.
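A correlation table of the kind summarized in this subsection can be computed along the following lines. The sketch assumes one row per word and one column per semantic variable; the variable order, the helper names, and the six rows of ratings are invented for illustration and are not taken from the database.

```python
import math
import numpy as np

NAMES = ["concreteness", "imageability", "context_availability", "familiarity"]

def correlation_table(ratings):
    """Pearson correlations among the columns of `ratings` (rows = words)."""
    return np.corrcoef(np.asarray(ratings, dtype=float), rowvar=False)

def t_statistic(r, n):
    """t value for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Invented ratings for six words (columns follow NAMES); not data from the norms.
toy = [
    [6.5, 6.8, 6.1, 6.0],
    [6.1, 6.3, 5.9, 5.2],
    [5.0, 5.4, 5.5, 5.8],
    [3.2, 3.9, 4.2, 4.9],
    [2.4, 2.8, 3.9, 5.1],
    [1.9, 2.5, 3.1, 4.0],
]
table = correlation_table(toy)
for name, row in zip(NAMES, table):
    print(f"{name:22s}", np.round(row, 2))
```

With 1,100 words, even the smallest correlation in this subsection (r = .34) corresponds to a t value of about 12 under this formula, so all of the coefficients discussed here lie far from zero.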

Relationship between affective and semantic variables

Because one of the aims of the present work was to provide researchers with affective and concreteness variables that could be manipulated or controlled in experiments, we examined whether the affective variables were significant predictors of concreteness ratings after other semantic variables were taken into account.

We first investigated the quadratic effect of valence on concreteness by performing a hierarchical regression analysis with the mean valence and its square as independent variables, and the mean concreteness as a dependent variable. For the effect of valence on concreteness, all of the remaining semantic variables as well as arousal predicted 70.5 % of the variance in the first step [F(4, 1095) = 654.99, p < .001]. Valence and its square were additional significant predictors in the second model (see Fig. 2), accounting for an additional 2.2 % of the variance [R² = .727; F(2, 1093) = 44.61, p < .001].

Fig. 2 Relationship between valence and concreteness ratings

Then we carried out a linear regression analysis with arousal as the independent variable and concreteness as the dependent one. For the effect of arousal on concreteness, when the effects of all remaining semantic variables and both valence and valence squared were removed, the simpler linear model of arousal predicted 1.5 % of the variance [F(1, 1098) = 8.26, p = .002].
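The logic of the two-step (hierarchical) regression used for valence can be illustrated with a short sketch. It is not the original analysis code: the function names are hypothetical, the data are simulated, and ordinary least squares with an intercept is assumed. The F test for the change in R² uses the standard formula, with the number of added predictors and the residual degrees of freedom of the larger model.

```python
import numpy as np

def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit of y on predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def hierarchical_step(step1, added, y):
    """Compare a step-1 model with a step-2 model that adds extra predictors.

    Returns (R^2 step 1, R^2 step 2, F for the R^2 change, df1, df2).
    """
    n = len(y)
    full = np.column_stack([step1, added])
    r2_1 = r_squared(step1, y)
    r2_2 = r_squared(full, y)
    df1 = added.shape[1]            # number of predictors added in step 2
    df2 = n - full.shape[1] - 1     # residual df of the step-2 model
    f_change = ((r2_2 - r2_1) / df1) / ((1.0 - r2_2) / df2)
    return r2_1, r2_2, f_change, df1, df2

# Simulated data (assumption): 1,100 "words", four step-1 covariates, and a
# concreteness score that drops as valence moves away from the scale midpoint.
rng = np.random.default_rng(0)
n = 1100
covariates = rng.normal(size=(n, 4))
valence = rng.uniform(1, 9, size=n)
concreteness = (covariates @ np.array([0.5, 0.3, 0.2, 0.1])
                - 0.1 * (valence - 5.0) ** 2
                + rng.normal(scale=0.3, size=n))
added = np.column_stack([valence, valence ** 2])

r2_1, r2_2, f_change, df1, df2 = hierarchical_step(covariates, added, concreteness)
print(f"step 1 R2 = {r2_1:.3f}, step 2 R2 = {r2_2:.3f}, "
      f"F({df1}, {df2}) = {f_change:.2f}")
```

With four step-1 predictors and 1,100 words, the degrees of freedom come out as 2 and 1,093, matching those reported for the valence model. Entering the rounded R² values reported above (.705 and .727) into the same formula gives an F of about 44, which agrees with the reported 44.61 up to rounding.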

Our results showed a quadratic relation between valence and concreteness, and a linear relation between arousal and concreteness, which confirmed recent findings in English (Vigliocco et al., 2014) suggesting that highly valenced and highly arousing words tend to be more abstract, whereas neutral nonarousing words tend to be more concrete. Moreover, a similar result also was obtained using the Spanish affective norms of Guasch et al. (2015). They computed Pearson correlations between concreteness and emotional load (negative and positive)/arousal, and found that concreteness showed negative correlations with both emotional load and arousal, which suggested that the more emotionally loaded and the more arousing a word is, the less concrete it is.

The studies in different languages reviewed above have consistently reported a notable relation between the affective variables and concreteness, and suggest that abstract words have more affective associations than concrete words do. On the one hand, these results support Vigliocco et al.’s (2014; Kousta et al., 2011; Vigliocco et al., 2009) hypothesis, based on embodied theories of cognition, that words denoting concrete objects and actions develop from our experience with the external world, whereas words denoting abstract objects and actions (which refer to internal experience, not limited to emotions) develop from our internal affective experience. Importantly, although the emotional connotations of words have been shown to vary from one culture to another as well as between languages (Montefinese et al., 2014), the semantic representations of words seem similar in the English, Spanish, and Chinese languages. In short, our findings regarding the relation between the affective variables and concreteness extend the range of application of Vigliocco and colleagues’ hypothesis, confirming that abstract words also tend to have more affective associations than do concrete words in Chinese.

On the other hand, knowing exactly the relationship between the affective variables and concreteness would be informative regarding the question of the contributions of concreteness to the effects of valence and arousal on emotional word processing. Although research projects aimed at studying the effects of concreteness and the affective dimensions on word processing have traditionally run in different directions, in recent years interest has been growing in studying the relationship between the two. Because of the potential interaction between concreteness and the affective variables, recent studies have suggested that concreteness influences the effect of emotional meaning on word processing (Kanske & Kotz, 2007; Palazova et al., 2013; Sheikh & Titone, 2013; Tse & Altarriba, 2009; Wang & Yao, 2012; Yao & Wang, 2013, 2014). Our findings provide an instrument that can facilitate experimental research into the effects on word processing and memory of both types of variables simultaneously, as well as their potential overlap. In addition, our findings lend support to the hypothesis that some of the effects related to concreteness are explained by the emotionality of words. In particular, Kousta et al. (2011) considered that the reversed concreteness effect (i.e., abstract words being more quickly and accurately processed than concrete words) reported recently in several studies (Barber et al., 2013; Kousta et al., 2011) could be explained by the fact that abstract words have more affective associations than concrete ones.

Considering that some evidence suggests that imageability has a strong correlation with concreteness (Paivio, 1991; Guasch et al., 2015; Schwanenflugel et al., 1988), and that emotion words activate different levels of concreteness and imageability in the processing of emotional words (Altarriba & Bauer, 2004), we also computed the quadratic regression between valence and imageability: For the effect of valence on imageability, the second model outperformed the first-step model [respectively, F(2, 1093) = 25.99 and F(4, 1095) = 1,552.12; both ps < .001, R² change = .007]. However, the second model seems to be worse than the simpler linear model, which accounted for 2.2 % of the variance [F(1, 1098) = 24.6, p < .001]. For the effect of arousal on imageability, the simpler linear model failed to reach a significant level [F(1, 1098) = 0.52, p = .47]. These results showed clear differences between the relation of the affective variables to concreteness and their relation to imageability. Concreteness ratings were significantly predicted by valence and its square, as well as by arousal, whereas a linear function was best suited to represent the relationship between valence and imageability, and imageability ratings were not predicted by arousal. Our findings confirm the view that concreteness and imageability should not be used interchangeably, since the distributions of concreteness and imageability ratings are different (Kousta et al., 2011; Montefinese et al., 2014). This is a reminder to researchers to control and manipulate their experimental material with regard not only to concreteness, but also to imageability, especially in studies on emotional effects.

Because valence is a bipolar dimension (i.e., one extreme is positive, the other negative), whereas all other variables range from absence to full presence of a certain property, we then computed the quadratic relationships between valence and each of the other linear variables. For the effect of valence on familiarity, valence and its square predicted an additional 15.9 % of the variance in the second model [F(2, 1093) = 203.74, p < .0001]. For imageability, valence and its square predicted an additional 0.7 % of the variance in the second model [F(2, 1093) = 25.99, p < .0001]. These results show that valenced words are more familiar than neutral words, and that the more valenced a word is, the less imageable it is, indicating the importance of controlling for familiarity and imageability in further research on the effects of emotion, since they might have acted as confounding factors in previous studies (e.g., Altarriba & Bauer, 2004; Levy-Drori & Henik, 2006). Next, we computed the effect of valence on context availability and found no significant difference between the first-step model [R² = .805; F(5, 1094) = 905.30, p < .0001] and the second-step model [R² = .806; F(2, 1092) = 1.57, p = .21]. This result is in line with the Spanish affective norms, in which the correlation between emotional load and context availability also was not significant (Guasch et al., 2015).

Finally, we calculated the partial correlations between valence (or arousal) and each of the remaining semantic variables after the other semantic variables, as well as arousal and valence squared, were taken into account. For valence, only one partial correlation was significant, namely that with the familiarity ratings (r = .06, p = .03). This result confirms again the view that participants feel more familiar with positive words, corroborating the findings of Citron et al. (2014). A possible explanation for this is that participants are reluctant to admit that they are very familiar with negative words (Citron et al., 2014). A similar response bias was found in a self-referential task (Lewis, Critchley, Rotshtein, & Dolan, 2007) in which participants were asked to indicate whether each word could be used to describe themselves (i.e., “yes/no”), and where they showed a tendency to respond “yes” more often for positive words.
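A partial correlation of this kind can be obtained by regressing both variables of interest on the control variables and correlating the residuals. The sketch below illustrates that general procedure on simulated data; the function names and the data are assumptions and do not reproduce the analysis reported here.

```python
import numpy as np

def residualize(target, covariates):
    """Residuals of `target` after a least-squares fit on covariates plus intercept."""
    X = np.column_stack([np.ones(len(target)), covariates])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ beta

def partial_correlation(x, y, covariates):
    """Pearson r between x and y after removing the linear effect of covariates."""
    rx = residualize(x, covariates)
    ry = residualize(y, covariates)
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated example (assumption): x and y are related only through a shared
# control variable z, so the raw correlation is sizeable but the partial
# correlation should be close to zero.
rng = np.random.default_rng(1)
n = 1100
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

raw = float(np.corrcoef(x, y)[0, 1])
partial = partial_correlation(x, y, z.reshape(-1, 1))
print(f"raw r = {raw:.2f}, partial r = {partial:.2f}")
```

The contrast between the raw and the partial coefficient in the simulation shows why small partial correlations, such as those reported for familiarity, can coexist with larger zero-order relations among the same variables.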

Regarding the arousal dimension, the partial correlations were significant for both context availability (r = .10, p = .001) and imageability (r = –.06, p = .05). These results showed that words with a high level of arousal have more context availability and are harder to imagine. Although the correlation of arousal with imageability is low, it is in line with that observed in Guasch et al.’s (2015) study. In contrast, Citron et al. (2014) found that high-arousing words more easily evoke a mental image. Montefinese et al. (2014) investigated the nonlinear effects of concreteness on arousal ratings and suggested that words that are very hard or easy to imagine make people feel calm, whereas those in the middle of the imageability range increase excitement. To the best of our knowledge, few studies have explored the relationships between arousal and imageability ratings. In some cases, the discrepancy in the prior ratings studies may have arisen from high variability of the arousal ratings. Arousal might be associated with intense experiences in life, and the same word may represent different threats and require immediate action for different subsamples of the population. For example, the word “flower” elicits more excited feelings in people who may be in love, but it may elicit less excited feelings in people (e.g., florists) who may rely on flowers for their livelihood. The word “spider” can also elicit more intense displeasure for some people than for others because of differences in their sensorimotor experiences. In fact, the relationship between arousal and concreteness is still uncertain. Although some studies have shown a positive correlation between the two (Hinojosa et al., 2016; Vigliocco et al., 2014), another study reported a significant negative correlation between them (Guasch et al., 2015). A potential reason for these discrepancies may have been a difference in the ranges of the variables. For instance, imageability and concreteness ratings were obtained from the MRC Psycholinguistic Database in Vigliocco et al.’s (2014) study. Those ratings ranged from 100 to 700, whereas the scale ranged from 1 (not at all) to 7 (very high) in Guasch et al.’s (2015) study. In short, these findings regarding the relation of arousal to the semantic variables (including imageability and concreteness) should be taken with caution, and further research will be needed to confirm and clarify this relation.

Conclusions, limitations, and future directions

In summary, the present database provides subjective ratings for 1,100 Chinese words for both affective variables (i.e., valence and arousal) and various semantic variables (concreteness, imageability, context availability, and familiarity), and particularly focuses on the relationship between the affective variables and concreteness after controlling for the other semantic variables. Descriptive statistics for all variables are supplied in a PDF file as supplementary materials to this article. The correlation analyses confirmed the reliability and consistency of the present data. The hierarchical regression analyses suggest that the affective variable ratings can predict concreteness ratings, which supports the idea that abstract words might have more affective associations than do concrete words (Kousta et al., 2011; Vigliocco et al., 2009) and confirms the findings of recent behavioral and event-related potential studies (Barber et al., 2013; Kanske & Kotz, 2007; Kousta, Vigliocco, Vinson, Andrews, & Del Campo, 2011; Tse & Altarriba, 2009; Yao & Wang, 2013, 2014).

However, we did not include measures of the age of acquisition (AoA) and the mode of acquisition (MoA) of the words, which may be a limitation of this study. In fact, studies have shown that AoA and MoA can make important contributions to lexical processing and are related to the affective properties of words (Citron et al., 2014; Della Rosa et al., 2010; Moors et al., 2013). Therefore, a future study could expand the present database to include AoA and MoA values. In addition, we used a scale from unfamiliar to familiar to measure familiarity, which may have led to the meaning of familiarity being interpreted in different ways by participants (Montefinese et al., 2014). Thus, in future studies the familiarity index should be based on “subjective measures” of how often participants use or are exposed to a given word (e.g., very often, very rarely).

To conclude, the present study will be a valuable source of information for emotion research that makes use of Chinese words. This database enables researchers to use highly controlled Chinese verbal stimuli for the study of emotion and will allow them to investigate the relation between cognition and emotion more reliably.