For years, researchers have been examining the “persona effect”, which assumes that animated agents have a general positive affective impact on the learner that can boost both motivation and, as a consequence, performance (Lester et al., 1997). This effect, however, has not yet been convincingly demonstrated (Heidig & Clarebout, 2011). Only those studies that focus on specific features of agents yield some promising results. Baylor (2011) states in her literature review that motivation for learning can be enhanced by a “genuine” social interaction between agent and learner. According to the author, a genuine interaction is characterized by three factors: the appearance of the agents, advanced communication features such as gestures and emotional expression, and the dialogue itself, including motivational messages. In our study we focus on the appearance of the agents, but because our agents also provide feedback and therefore interact with the user, a few important findings related to the other two factors will also be addressed.
In terms of dialogue design, elaborative feedback (Lin, Atkinson, Christopherson, Joseph, & Harrison, 2013) and polite conversation (Wang et al., 2008) have been shown to have a positive influence on performance. Furthermore, compliments for correct answers can encourage intrinsic motivation by positively influencing feelings of competence, self-control, self-efficacy, and curiosity (Johnson et al., 2004). Advanced communication features of social agents, such as nonverbal cues, seem to be crucial in maintaining learning motivation in virtual learning environments (Allmendinger, 2010), probably because they inform the observer about states, involvement, responsiveness, and understanding (Bavelas et al., 1986; Fridlund et al., 1987). Emotional expressions displayed by virtual characters should therefore be recognizable, which is sufficiently ensured when posture and facial expression convey the same message simultaneously (Visschedijk, Lazonder, van der Hulst, Vink, & Leemkuil, 2012). Deictic gestures have no effect on retention (Craig, Gholson, & Driscoll, 2002), but have been shown to guide attention (Atkinson, 2002), especially when the agent is static (Baylor & Ryu, 2003b).
The third factor, the appearance of pedagogical agents, has been largely ignored in research on pedagogical agents (Gulz & Haake, 2006). The authors argue that the appearance of the agent influences the student’s interpretation of the role of the agent and, more importantly, the perceived similarity between the student and the agent, which in turn influences the pleasantness of the agent and the quality of the interaction with it. Appearance has also been shown to increase students’ motivation. For example, in a study with undergraduates, Baylor and Kim (2009) demonstrated that a visible and physically present agent leads to better motivational outcomes than a voice or simply a text box. Furthermore, it is important to design agents realistically, because cartoon figures, for example, have been shown to diminish the positive motivating effects in comparison to realistic figures (Baylor & Kim, 2004). However, it is also important not to fall into what is known as the “uncanny valley”. The uncanny valley hypothesis states that when the humanness of an agent is increased, the valence reported by the students also increases, until a point beyond which the reaction is reversed. However, when the agent is designed to resemble a human so closely that it is hardly distinguishable from a real person, the valence turns positive again (Mori, MacDorman, & Kageki, 2012). In addition, the similarity of an agent to the learner positively influences the learner’s motivation (Bailenson, Blascovich, & Guadagno, 2008, in a study with undergraduates). For example, computer-based female agents yielded better motivational outcomes for undergraduate women if they matched the students with respect to race and gender (Rosenberg-Kima, Plant, Doerr, & Baylor, 2010).
Another study, conducted with undergraduates by Rosenberg-Kima, Baylor, Plant, and Doerr (2008), revealed that a female agent rated as young, attractive and “cool” succeeded in enhancing young female students’ self-efficacy, which is believed to be a driving force behind motivation (Bandura, 1997). All these findings are theoretically supported by Bandura’s social cognitive learning theory, which states that people often learn behavior and norms by imitating people whom they perceive as similar (or superior: higher in rank or status) to them and who are therefore more readily accepted as social role models (Bandura, 1986). This finding is supported by another study by Gulz, Haake, and Tärning (2007), which demonstrated that participants prefer same-gender agents when they are asked to choose their preferred agent as presenter for a multimedia slideshow.
A study by Baylor and Kim (2005) demonstrated that manipulating the role of an agent by changing its appearance and its text in a dialogue enhanced learners’ motivation and increased their performance, depending on the role the agent was given (the learners in this case were undergraduate literacy students and pre-service teachers). Interestingly, in this study motivation and performance were influenced independently. An expert agent was shown to enhance performance, whereas a motivator agent enhanced motivation (self-efficacy).
In a recent article by Arroyo, Burleson, Tai, Muldner, and Woolf (2013), motivation was also boosted by agents that were used to tutor 7th to 10th graders learning mathematics. In their four studies, virtual learning companions elicited more interest (“less boredom”) in the subject. Importantly, they found a clear influence of gender on how students benefit from the tutoring.
The primary goal of the current study was to investigate the influence of virtual agents in a web-based tutoring program in a sample of undergraduate psychology students attending a statistics class. We expected to confirm the “similarity hypothesis” of Rosenberg-Kima et al. (2008). This hypothesis predicts that a young female agent has a more positive effect on the motivational outcome than an older male agent because of its similarity (young and female) or superiority (attractive) to the sample investigated (mostly female students). To test this hypothesis, we chose a young, female (and therefore similar) agent and contrasted it with an older, male agent, expecting the young female agent to boost motivation and performance. Assuming that both of the agents we used could act as motivators, we added a third group with no agent as a control group and expected to replicate the finding of Arroyo et al. (2013) that any agent positively influences students’ interest in a mathematical subject.
3. Results
3.1. Baseline measurements
Thirty-nine students in the Edgar group, 31 in the Minnie group, and 38 in the group without an agent attended the exam. No differences between the groups were detected regarding demographics, exam preparation tools (e.g. additional literature), or motivation (all ps > .151). Only the distribution of gender differed between groups: in the group without an agent, 14 (42%) were male; in the Edgar group, 3 (9%) were male; and in the Minnie group, 4 (15%) were male. Thus, there were more than three times as many male participants in the group without an agent as in either of the other two groups. Because of this difference, we also investigated gender differences in our dependent variables. We found highly significant differences for interest, t(92) = 3.73, p = .001, and pleasure in learning, t(92) = 4.55, p < .001, indicating that male participants were generally more interested in the subject at hand (statistics) and enjoyed studying it more. Notably, male participants did not perform any differently from female participants (p = .249 on the final exam grade). Since the data were collected over two successive terms, we also checked whether there were systematic differences between the terms. We found a difference in performance: the students performed better in the winter term than in the summer term (winter: M = 2.89, SD = 1.11; summer: M = 3.61, SD = 0.80; p < .001).
The API questionnaire revealed that the two agent groups did not perceive their agents differently on the subscale “facilitating learning” (p = .322). However, Edgar was perceived as significantly more credible than Minnie and tended to be perceived as more human-like and engaging (see Table 1).
Table 1.
API-ratings for both agents.
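| Subscale | M (Edgar) | M (Minnie) | SD (Edgar) | SD (Minnie) | p | d | n (Edgar) | n (Minnie) |
|---|---|---|---|---|---|---|---|---|
| Facilitating learning (10 items) | 23.68 | 22.74 | 8.21 | 7.58 | .322 | 0.12 | 37 | 27 |
| Credible (5 items) | 15.97 | 13.63 | 4.55 | 3.71 | .016∗ | 0.56 | 38 | 27 |
| Human-like (5 items) | 15.03 | 13.04 | 4.76 | 4.74 | .052† | 0.42 | 37 | 27 |
| Engaging (5 items) | 14.49 | 12.86 | 4.98 | 4.29 | .085† | 0.35 | 37 | 28 |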
Note. Means (M) and standard deviations (SD) on the four subscales of the Agent Persona Instrument (API) for the agents Edgar (Ed) and Minnie (Mi). Significant comparisons are indicated by ∗p < .05 and †p < .10; effect sizes (Cohen’s d) are displayed. All items were answered on a five-point Likert scale (1 = strongly agree, 5 = strongly disagree). The values of the items were summed to obtain the final score of each scale.
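As a plausibility check, the effect sizes in Table 1 can be recomputed from the reported means and standard deviations. The following is a minimal sketch that assumes Cohen’s d was computed with the unweighted root mean square of the two standard deviations; the text does not state which variant was used, and the function and variable names are purely illustrative. Under this assumption, the rounded values agree with those in the table.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b):
    """Standardized mean difference.

    Assumption: the denominator is the root mean square of the two
    standard deviations (unweighted); the paper does not specify this.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# (M Edgar, M Minnie, SD Edgar, SD Minnie), copied from Table 1
api_subscales = {
    "Facilitating learning": (23.68, 22.74, 8.21, 7.58),
    "Credible": (15.97, 13.63, 4.55, 3.71),
    "Human-like": (15.03, 13.04, 4.76, 4.74),
    "Engaging": (14.49, 12.86, 4.98, 4.29),
}

# Round to two decimals, as in the table
effect_sizes = {
    name: round(cohens_d(*values), 2) for name, values in api_subscales.items()
}

for name, d in effect_sizes.items():
    print(f"{name}: d = {d:.2f}")
```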
3.2. Case train use
In total, 3215 cases were used by the participants of this study. On average, a student finished 30 cases by the end of term. No between-group differences in the number of finished cases could be detected: no agent, M = 29.87 (SD = 6.63); Minnie, M = 30.56 (SD = 7.72); Edgar, M = 28.65 (SD = 6.19); F(2, 105) = 0.67, p = .515.
3.3. Performance
Because performance differed between the terms, we ran the analysis with z-standardized grades. The planned contrast of the agent conditions versus the no-agent condition did not reveal any effect (p = .982). The Minnie and Edgar groups tended to differ, t(68) = 1.682, p = .097, d = 0.41, indicating that the students who practiced with Edgar performed slightly better (lower grades are better) than those who practiced with Minnie. Means and standard errors are displayed in Table 2. Given the difference between the two terms, results are also displayed separately for each term in Table 3.
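The z-standardization mentioned above can be illustrated with a short sketch. It assumes that grades were standardized separately within each term, which is the natural way to remove a between-term difference but is not spelled out in the text. The grades below are hypothetical and the function name is illustrative.

```python
from statistics import mean, stdev


def z_standardize_by_term(records):
    """Return (term, z) pairs, standardizing each grade within its own term.

    Assumption: standardization is done per term using the sample SD.
    """
    grades_by_term = {}
    for term, grade in records:
        grades_by_term.setdefault(term, []).append(grade)
    stats = {term: (mean(g), stdev(g)) for term, g in grades_by_term.items()}
    return [
        (term, (grade - stats[term][0]) / stats[term][1])
        for term, grade in records
    ]


# Hypothetical exam grades; the winter grades are shifted by one full grade
records = [
    ("summer", 3.0), ("summer", 4.0), ("summer", 3.5), ("summer", 4.5),
    ("winter", 2.0), ("winter", 3.0), ("winter", 2.5), ("winter", 3.5),
]
z_scores = z_standardize_by_term(records)

for term, z in z_scores:
    print(f"{term}: z = {z:+.2f}")
```

After standardization, a student's score expresses the distance from the mean of his or her own term, so the groups can be compared across terms even though the raw grade levels differ.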
Table 2.
Comparisons.
| Variable | Edgar, M (SE) | Minnie, M (SE) | No agent, M (SE) | p (A vs. NA) | p (Ed vs. Mi) |
|---|---|---|---|---|---|
| Performance | −0.18 (0.15) | 0.21 (0.18) | 0.02 (0.17) | .982 | .097† |
| Interest | 3.10 (0.13) | 3.52 (0.13) | 3.53 (0.17) | .223 | .053† |
| Enjoyment | 5.65 (0.38) | 5.52 (0.38) | 5.58 (0.40) | .990 | .841 |
Note. Ed = Edgar (n = 39), Mi = Minnie (n = 31), NA = no agent (n = 38), A = Agent. Mean (M) and standard errors (SE) for the dependent variables of all three conditions. The examination grade (performance measure) has been z-standardized. Significant comparisons are indicated ∗p < .05, †p < .10.
Table 3.
Results separated for the terms.
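| Group | Performance su, M (SD) | Performance wi, M (SD) | Interest su, M (SD) | Interest wi, M (SD) | N (fe%) su | N (fe%) wi |
|---|---|---|---|---|---|---|
| Minnie | 3.65 (0.80) | 3.34 (1.20) | 3.59 (0.62) | 3.43 (0.83) | 17 (88) | 14 (78) |
| Edgar | 3.53 (0.85) | 2.61 (1.00) | 3.06 (0.75) | 3.14 (0.83) | 17 (94) | 22 (89) |
| No agent | 3.63 (0.79) | 2.88 (1.23) | 3.42 (1.18) | 3.43 (0.85) | 24 (58) | 14 (56) |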
Note. Mean (M) and standard deviation (SD) for performance (exam grade [1–6]) and interest separated for summer (su) and winter (wi) term. Number of females (fe) is displayed in %.
3.4. Motivation
For the variable interest, no effect in favor of the application of any agent compared to no agent was found (p > .282). A significant difference was found when we tested the second hypothesis, which predicted that the Minnie group would show more interest in the subject than the Edgar group, t(105) = 1.96, p = .027, d = 0.56 (see Table 2, which lists p = .053 for this comparison; the two values are consistent with a one-tailed and a two-tailed test, respectively). None of the tests were significant for enjoyment (all ps ≥ .841).
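The planned contrasts reported in Sections 3.3 and 3.4 can be sketched as follows. The contrast weights are not given in the text, so the weights used here (agents versus no agent: 1, 1, −2; Minnie versus Edgar: −1, 1, 0) are assumptions, as are the function name and the small hypothetical data set. The sketch uses the error term pooled over all three groups, which is consistent with the reported 105 degrees of freedom (108 students minus 3 groups).

```python
import math
from statistics import mean


def planned_contrast(groups, weights):
    """Return (t, df) for a linear contrast of group means.

    The standard error is based on the mean squared error pooled
    over all groups, so df = N - number of groups.
    """
    assert abs(sum(weights)) < 1e-12, "contrast weights must sum to zero"
    means = [mean(g) for g in groups]
    df = sum(len(g) for g in groups) - len(groups)
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, means)
    )
    mse = ss_within / df
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w ** 2 / len(g) for w, g in zip(weights, groups)))
    return estimate / se, df


# Hypothetical interest ratings (1-5), NOT the study data
edgar = [1, 2, 3]
minnie = [2, 3, 4]
no_agent = [3, 4, 5]
groups = [edgar, minnie, no_agent]

# Assumed weights: both agents versus the no-agent control group
t_agents, df_agents = planned_contrast(groups, [1, 1, -2])
# Assumed weights: Minnie minus Edgar
t_minnie, df_minnie = planned_contrast(groups, [-1, 1, 0])

print(f"agents vs. no agent: t({df_agents}) = {t_agents:.2f}")
print(f"Minnie vs. Edgar:    t({df_minnie}) = {t_minnie:.2f}")
```

The two weight vectors are orthogonal (their element-wise products sum to zero), so with equal group sizes they test independent questions; with unequal group sizes, as in the study, this independence holds only approximately.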
3.5. Explorative analysis
3.5.1. Correlations
Correlations between performance and the two subscales of motivation (interest and enjoyment) were r_SP = .22 (p < .001) for interest and r_SP = .42 (p < .001) for enjoyment.
3.5.2. Performance
Post-hoc, we also contrasted Edgar against the other two groups in terms of performance. The analysis revealed a significant effect in favor of Edgar, t(105) = 2.07, p = .040, d = 0.42.
3.5.3. Motivation
Because of the gender differences in the dependent variable “interest” between the groups, we also display the means of the female participants only (Fig. 4, right panel). When we exploratively calculated the test with only those female participants and contrasted Minnie against Edgar and the no-agent group, the Minnie group tended to report higher interest than the Edgar and no-agent groups (p = .074, d = 0.53).
Fig. 4. Between-group differences for interest in the subject for all students (left) and only for females (right): means and standard error bars (5 = very high interest, 1 = very low interest).