For which courses do students make the most progress?

Editor's note: This work was completed as part of Duolingo's summer internship program.

Everybody learns at their own pace, and Duolingo learners are no exception. Some Duolingo users study a few lessons per day on their way to work, right before bed, or whenever they have a free moment. Others complete a course in a few short days, like this motivated Duolingo user who completed the Esperanto course only six days after it was launched! But is there a pattern to the amount of progress Duolingo users make based on the course they’re taking? If so, for which courses do we see users make the most progress? Can this tell us anything about our users? And what other factors contribute to the amount of progress made?

About the data

We collected data from new users who created a Duolingo account in February 2016. At that time, Duolingo had launched a total of 44 courses, ranging from an Irish course for English speakers to a German course for Portuguese speakers (there are now 62 and counting). For each course, we looked at the progress users made over the following 90 days. All Duolingo courses are composed of rows of one to three skills each (see Figure 1). We measured progress by the number of rows in the course that a user had completed. In the analysis, we refer to the row a user reached by the end of the 90-day period as their “max row”.
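To make the “max row” measure concrete, here is a minimal Python sketch. It is not the code used for the analysis: the skill names, the `max_row` helper, and the rule that a row only counts once every skill in it is finished are all assumptions made for illustration.

```python
def max_row(course_rows, completed_skills):
    """Return the 1-based index of the last fully completed row, counting from the top.

    Assumption: a row counts as completed only when every skill in it is done,
    and rows are completed in order (the count stops at the first unfinished row).
    """
    reached = 0
    for index, row in enumerate(course_rows, start=1):
        if all(skill in completed_skills for skill in row):
            reached = index
        else:
            break
    return reached


# Hypothetical three-row course; each row holds one to three skills.
course = [["Basics 1"], ["Basics 2", "Phrases"], ["Food", "Animals", "Plurals"]]
done = {"Basics 1", "Basics 2", "Phrases", "Food"}

print(max_row(course, done))  # → 2 (the third row is only partly finished)
```

A user who has completed nothing gets a max row of 0 under this definition.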

Figure 1. A section of rows from the Spanish course for English speakers.

Does course length affect how much progress users make?

Because each Duolingo course is customized for the language it teaches, the length of each course varies. The number of rows ranges from 29 (English for Spanish speakers) to 60 (Norwegian for English speakers).

Does the number of rows affect how many users finish a course in 90 days or fewer? It is reasonable to think that, given the same amount of time, fewer users would finish a longer course than a shorter one. Indeed, Figure 2 shows that in the longest courses (German for English speakers and Norwegian for English speakers), a very low proportion of users finish the course within the first few months. However, there is large variability, and the number of rows is only moderately correlated with the proportion of users who finish the course. For example, both the English course for Greek speakers and the Spanish course for English speakers have 31 rows, but the former has the highest proportion of users who finished the course, whereas the latter has the lowest.
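For readers who want to check this kind of relationship on their own data, the sketch below computes a Pearson correlation between course length and completion rate. The post does not say which correlation measure was used, so Pearson is an assumption here, and the numbers in `toy_rows` and `toy_finished` are invented placeholders rather than Duolingo data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented placeholder values: one entry per (hypothetical) course.
toy_rows = [30, 35, 40, 50, 60]                # number of rows in the course
toy_finished = [0.05, 0.02, 0.04, 0.02, 0.01]  # share of users who finished

print(round(pearson(toy_rows, toy_finished), 2))
```

A value near -1 would mean that longer courses are finished less often almost without exception; a moderate value leaves room for other factors, which is what the rest of this post explores.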

Figure 2. Proportion of users who finished the course versus the number of rows in that course.
(Dot size reflects the number of users in the course.)

So what does this mean exactly? One hypothesis is that certain Duolingo users have outside factors motivating them to learn or enabling them to progress through the courses more quickly. Let’s dig deeper.

Which courses have users with the most previous knowledge?

Intuitively, one of the biggest factors affecting course progress might be the amount of prior knowledge in the language being learned. Can our data set shed light on this intuition? Conveniently, when users first begin a course on Duolingo, they choose to either take a placement test or start from the very beginning.

Figure 3. The user’s option to take the placement test or start from the beginning of the course.

The results show that, on average, users who completed the placement test finished more than one and a half times as many rows in the course as users who didn’t!

So which users completed the placement test? Figure 4 compares the proportion of users who completed the placement test for a certain course with the proportion of users who finished that course. Now, taking the placement test does not mean that you placed out of any rows, but — as shown in Figure 4 — a higher proportion of users who take the placement test is strongly correlated with a higher proportion of users who finish the course within 90 days.

Figure 4. Proportion of users who finished the course versus the proportion of users who completed the placement test. (Dot size reflects the number of users in the course.)

English skills are valued all around the world, so it is not surprising that the courses with the highest proportion of users completing the placement test all teach English. In some of these courses, more than half of the users complete the placement test! Additionally, none of the courses with the lowest proportion of users completing the placement test teach English. Interestingly, the majority of the courses with the highest placement-test proportions in Figure 4 have base languages spoken in Central/Eastern Europe. This may be because learning English in school is mandatory in that region, and English is a common skill needed to pursue higher-level careers. Whatever the reason, these users' dedication to learning English through Duolingo is impressive!

English speakers learning Irish and Turkish were the least likely to complete the placement test, which suggests that these courses attract first-time learners. This is hardly surprising, since in most countries it is difficult to find classes in either one of these languages.

Which courses have the most motivated users?

Globalization has increased the importance of being able to communicate in languages other than one’s native language. So which languages are Duolingo users most dedicated to learning? We compared all courses by looking at the average max row of the users in each course. Since completing the placement test is strongly correlated with the rate of finishing the course, we made the comparison separately for users who completed the placement test and those who did not.
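The split comparison described above amounts to a grouped average. The following Python sketch shows one way to compute it; the record layout `(course, took_placement_test, max_row)`, the course labels, and the values are hypothetical and only meant to illustrate the grouping.

```python
from collections import defaultdict


def average_max_row(records):
    """Average max row per (course, took_placement_test) group.

    Each record is a (course, took_placement_test, max_row) tuple;
    this layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of max rows, user count]
    for course, took_test, row in records:
        bucket = totals[(course, took_test)]
        bucket[0] += row
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Hypothetical users in two made-up courses.
records = [
    ("course A", True, 10),
    ("course A", True, 20),
    ("course A", False, 4),
    ("course B", False, 6),
    ("course B", False, 2),
]

averages = average_max_row(records)
print(averages[("course A", True)])   # → 15.0
print(averages[("course B", False)])  # → 4.0
```

Keeping the two groups separate prevents a course from looking "fast" simply because many of its users placed out of early rows.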

Figure 5. The average max row for users who completed the placement test versus the average max row for users who didn’t complete the placement test. (Dot size reflects the number of users in the course.)

Figure 5 shows that, whether or not the placement test was completed, the course with the lowest average max row within 90 days was Irish, while the course with the highest average max row was German. (The English course for Japanese speakers had a higher average max row than German among users who didn't complete the placement test, but more on this below!)

The slow progress in the bottom courses could be explained by a variety of factors. One hypothesis is that users might have difficulty learning a language with an entirely different writing system or entirely different grammar (English for Arabic or Hindi speakers comes to mind). Another is that users may simply not be as motivated to make fast progress in a course, as might be true of the less popular languages overall, like Dutch or Irish. The Spanish course for English speakers is a noteworthy outlier: Spanish is by far the most commonly studied language in the United States, but users show relatively slow progress on average.

As for the top courses, the fast progress might be due to higher proficiency in the language at the start. This is likely true of the English courses, since English is often mandatory in schools around the world. A high starting proficiency may also be a factor in courses where the course language and the user’s native language are very similar, such as the French course for Italian speakers. The rapid progress of users in the German course for English speakers is more of a mystery, but the recent overhaul of the course might have been a contributing factor.

One interesting statistic: although the English course for Japanese speakers was still only in beta during part of the period in which data were collected, that course still had the highest average max row for users who did not complete the placement test (and second highest for those who completed it). These users seem to be very committed to using Duolingo! Perhaps the upcoming 2020 Summer Olympics — which will be hosted by Japan — provides a partial explanation for this.

How much does the specific course matter?

As we saw, users who are learning English often seem to make faster progress, but just how much more motivated are they?

Let’s look at the courses in terms of the proportion of users at each course row. Figure 6 compares the courses in which users are learning English (in green) to the courses taken by English speakers (in red) and the courses in which users are neither learning English nor learning from English (in blue). We can see that the courses that teach English have a higher proportion of users in the later rows. The courses that don’t teach English, on the other hand, still have many users in the earlier rows.
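As a rough illustration of the quantity being compared, the sketch below turns a list of users' max rows into the proportion of users at each row of a single course. The `row_proportions` helper and the sample values are hypothetical; averaging these per-course proportions across all courses of the same type would be a further step, which is not shown.

```python
from collections import Counter


def row_proportions(max_rows, n_rows):
    """Share of users whose max row equals each row, from row 0 up to n_rows."""
    counts = Counter(max_rows)
    total = len(max_rows)
    return [counts[row] / total for row in range(n_rows + 1)]


# Four hypothetical users in a made-up three-row course.
print(row_proportions([0, 1, 1, 3], n_rows=3))  # → [0.25, 0.5, 0.0, 0.25]
```

Row 0 stands for users who have not completed any row yet.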

Figure 6. The average proportion of users per course type by row.

Users learning from English are, on average, not as far along in the course as users learning from a different language. But are there any Duolingo-related factors that could be contributing to this difference in motivation?

Of the 44 Duolingo courses available by the end of February, 19 taught English to speakers of another language! At the time, seven of those 19 courses were the only courses offered for their particular base language. On the other hand, if English was your base language, you had 12 courses to choose from. So rather than focusing on just one course, English speakers may be using Duolingo to learn multiple languages at the same time, and therefore may not make as much progress in each individual course as users who have only one course to choose from. We leave this question for future investigation.

Ultimately, the amount of progress a user makes in a course appears to be influenced by their amount of previous knowledge and their desire and motivation to learn. However, in the end, it is not the speed at which you finish a Duolingo course that matters, but the simple fact that you are pushing yourself to learn another language! Happy learning!