Comparing success of female students to their male counterparts in the STEM fields: an empirical analysis from enrollment until graduation using longitudinal register data
Vooren et al. International Journal of STEM Education (2022) 9:1

In this paper, we investigate the predictors of enrollment and success in Science, Technology, Engineering, and Mathematics (STEM) programs in higher education. We develop a sequential logit model in which students enroll in STEM education, may drop out from STEM higher education, or continue studying until they graduate in a STEM field. We use rich Dutch register data on student characteristics and high school exam grades to explain the differences in enrollment, success, and dropout rates. We find that females are less likely to enroll in STEM-related fields, while students with higher high school mathematics grades are more likely to enroll in STEM. Female students have lower first-year dropout rates in STEM programs at universities of applied sciences. With respect to study success, we find that, conditional on enrollment in STEM, women are less likely than men to graduate within the nominal duration or the nominal duration plus one additional year. However, female students perform as well as male students in terms of graduation within 10 years. We conclude that STEM programs are less popular among female students and that female students are less likely to graduate on time. However, females perform equally well in STEM higher education in the long run. For this reason, policy should be geared towards increasing study success in terms of nominal graduation rates among female STEM students.


Introduction
For some years now, the question of how to increase the number of graduates from higher education studies in Science, Technology, Engineering, and Mathematics (STEM) has featured prominently on many political agendas, including the European Union's Horizon 2020 strategy regarding science education. In addition, persistence in undergraduate STEM education is a heavily researched topic, as illustrated by Talking about Leaving Revisited (Seymour & Hunter, 2019). The interest from both policymakers and academics is fueled by the great demand in both the private and public sectors for STEM graduates. This demand comes from technology firms, governments, and research institutes (Giffi et al., 2018). However, the inflow of students in STEM fields is too low to satisfy demand. At the same time, concerns have been raised over the underrepresentation and relative underperformance of female students and students with a migration background in STEM fields, along with the significant dropout rates for those who do enroll. In this light, improving study success in higher education is a priority (European Commission, 2015). Study success is defined as graduating within the nominal duration plus one additional year. In the Netherlands, the nominal duration for university of applied sciences (UAS) programs is 4 years, whereas for research university (RU) programs it is 3 years. As an example, only 59% of the bachelor's degree graduates from UAS and 72% of the bachelor's degree graduates from RU in the Netherlands completed their studies within the nominal duration of the program plus one additional year in 2017 (Inspectie van het Onderwijs, 2018, p. 174). For STEM-related studies, these figures are even lower. In this paper, we identify the underlying factors that predict enrollment, dropout, and study success in STEM higher education programs in the Netherlands. In doing so, we identify how differences in performance of female students with respect to their male counterparts vary over time, and how this compares with the differences in performance of students with a migration background with respect to students without.
In the Western world, elevated dropout rates in STEM fields are a widespread problem (Chen, 2013). In analyzing this, Meyer and Marx (2014) describe the experiences of students who dropped out from engineering programs in the United States, for instance, by switching to a different major. A commonly given reason relates not only to their performance but also to their sense of belonging: students who drop out tend to experience difficulty fitting in with the field of engineering, in addition to difficulty with the program requirements. A lack of self-confidence and motivation is another frequently hypothesized reason for dropping out of STEM programs. Also in the American context, Litzler et al. (2014) break down the self-reported levels of STEM confidence by gender and ethnicity and find that, on average, women report lower self-confidence in their ability for STEM than men. Many studies, particularly of STEM programs, assess the effectiveness of interventions to enhance general study success by improving the fit between secondary education and higher education (see Brock (2010) for a review). Based on structural equation modeling and data from the Netherlands, Torenbeek et al. (2010) argue that a closer resemblance between the higher education program and the courses that students take in secondary education advances first-year study success significantly. This raises the question of whether good grades in mathematics and science in high school are reliable predictors of study success in STEM programs.
Better study success in higher education STEM programs has the potential to help reduce labor market shortages of STEM graduates. However, before directing attention towards improving study success and decreasing the time-to-degree, it is necessary to first appreciate the determinants that predict a student's choice of STEM study. Therefore, this paper explores the factors underlying students' decisions to enroll and pursue a course of study in STEM. These decisions are then used as a basis for estimating study success, using the rich register data from Statistics Netherlands for five cohorts enrolling in higher education from 2007 until 2011. The register data contain detailed information on: (i) the student's background characteristics, (ii) the student's high school grades, and (iii) the student's career in higher education. After combining these data, we develop a sequential logit model to model the educational careers from year to year. This spans the decision to enroll, the decision to drop out from STEM, and the moment when the student graduates. We model the dropout decisions for each year separately. By doing so, the sequential logit model should account for the low STEM enrollment rates among females, because it takes into account earlier enrollment and dropout decisions while estimating study success (see Arcidiacono et al. (2016), Hunt (2015), Reuben et al. (2014), Volman and Van Eck (2001) for more studies in the United States and elsewhere).
We contribute to the literature in a number of ways. First, the analyses in this paper follow individuals throughout their entire educational careers, starting with and including their high school exams. This allows us to analyze the factors that both underlie the decision to enroll in STEM higher education and predict dropout from STEM-related programs and the probability of graduation. Second, since the high school exams in the Netherlands are the same for each student from a specific cohort in the entire country, these grades are comparable for all students within a cohort. This facilitates a nationwide comparison of high school achievement, which enables us to give a robust answer to the question of to what extent high school exam grades predict enrollment and study success in STEM higher education. Furthermore, thanks to the longitudinal nature of the data, we are also able to address longer term educational outcomes. In doing so, we are able to answer the main research question: "How do differences in performance of female students with respect to their male counterparts in STEM fields vary over time, and how does this compare with differences in performance of students with a migration background with respect to students without?" The insights in this paper are useful for targeting underrepresented groups that have the potential for greater success, and for increasing societal returns to STEM education.
In the next section, we describe the literature on educational choice and study success, and describe the theoretical underpinnings for our hypotheses. We outline and model the Dutch educational system in section "The Dutch education system". In section "Methods", we describe the data from Statistics Netherlands and

Literature and theoretical underpinnings
There is copious literature on the determinants of study choice, including the determinants of choosing STEM. This strand of literature identifies different factors that influence a student's decision to enroll in a STEM program. A review study by Van Tuijl and Van der Molen (2015) focuses on the factors in early childhood that explain why certain students enroll in STEM and others do not. The review study argues that stereotypes can negatively affect pupils' belief in their own ability and influence the STEM enrollment decision of both males and females in later life, causing low enrollment rates. From a cohort study of 6,000 students in the United States, Sadler et al. (2012) conclude that the differences in STEM interest between males and females increase during the high school years. During high school, the percentage of females interested in STEM careers falls with every school year, while for males, this percentage remains stable. Jouini et al. (2018) analyze a conceptual model of gender stereotypes, confidence, and decision-making in mathematics. The authors observe that women are under-represented in STEM study programs and careers, and argue that this is due to lower self-confidence in their mathematical ability. This low self-confidence in math ability is driven by negative stereotypes. Assuming the same distribution of ability among boys and girls, the optimal belief formation mechanism implies that stereotypes are formed, survive, and are reinforced over time. In this paper, we develop a sequential logit model to analyze the performance of female students in STEM fields during several stages of the STEM educational career. Based on the development and reinforcement of stereotypes, we expect a difference in dropout rates between female and male students that increases over time.
Another reason for lower STEM enrollment rates among females could be that girls might simply perform worse in mathematics than boys in high school. Gender differences in math scores exist in the Program for International Student Assessment (PISA) data set (see OECD, 2018), with boys outperforming girls in many countries (see Guiso et al. (2008) and Nollenberger et al. (2016) for an analysis of these PISA data). This difference could, however, be driven by the competitive setting of test-taking: boys perform better in competitive environments than girls (see Niederle and Vesterlund (2010) and Wang and Degol (2017) in the context of the United States). In the Netherlands, girls are less likely to follow extensive math subjects in high school. Hidalgo Saá (2017) and Haan (2018) attribute this to gender bias in the Dutch math curriculum. In addition, in the Dutch context, Buser et al. (2014) argue that boys' more competitive nature makes them more likely to choose the prestigious high school specializations characterized by the inclusion of more mathematics and science subjects, which are important for STEM higher education. Based on an experiment in Switzerland, Buser et al. (2017) contend that this is due to the fact that boys are more willing to compete than girls. Students who are more competitive are more likely to choose a math-intensive specialization in high school. Therefore, we account for the fact that boys are more likely than girls to have the more technical mathematics course in their curriculum (see section "Secondary education") and include a dummy variable for it in our model.
Back in the context of the United States, Wang (2013) maintains that, after self-belief in math and science ability, the mathematics grade in the 12th grade is the best predictor for STEM enrollment. Moakler and Kim (2014) also find self-confidence in mathematics to be an important factor in the decision to enroll in STEM. In addition, they confirm that female students are less likely to enroll in STEM courses, although they find no difference in enrollment with respect to ethnic minorities. In conclusion, the scientific literature on study choice establishes gender and 12th grade math achievement as compelling determinants for STEM enrollment.
In addition to differences in STEM enrollment rates, there are also differences in terms of study success in the STEM fields conditional upon enrollment. Using national survey data from the United States, Griffith (2010) explores the factors that explain why students drop out from STEM and switch to different majors. Female students in particular tend to drop out from STEM fields and change subjects. According to Griffith (2010), women are more likely to persist in STEM study at institutions which have a higher percentage of female STEM graduate students, although a larger share of female STEM faculty members is not necessarily indicative of lower dropout rates among female STEM students. In a cohort study at a research university in the Midwestern United States, Whalen and Shelley (2010) investigate the predictors for study success in STEM majors. Those authors find that the previous grade point average is the strongest predictor for graduation in STEM programs.
Kokkelenberg and Sinha (2010) make use of student-level data from Binghamton University, in New York state, to investigate the factors associated with academic success in STEM study programs. The difference between male and female persistence in STEM fields at Binghamton University is most conspicuous in the field of engineering: female students drop out more frequently from engineering than from other STEM fields. According to the authors, the trouble with study success in the field of engineering, as opposed to other fields, stems from different levels of proficiency in high school mathematics. Still, from the existing literature, it is unclear whether the discrepancies in study success are due to gender or mathematical ability, because science and mathematics test scores diverge between genders, as shown in a randomized double-blind study at the University of Colorado (Miyake et al., 2010). Given this, it is important to adjust for mathematical ability while assessing STEM performance. In our analysis of gender differences in STEM performance, we utilize standardized high school grades to measure and account for differences in mathematical ability. In short, the two main theoretical mechanisms behind female underperformance in STEM are: (i) low self-confidence in math ability among female students, which is driven by negative stereotypes that become stronger over time, and (ii) lower female math performance in high school, driven by the competitive setting of test-taking, from which girls shy away. These two theoretical mechanisms both involve math ability and confidence. In addition to this, the former also involves a time element. Therefore, an appropriate measure of math ability that is comparable between students of different universities is required, as well as longitudinal data that allow for the investigation of how these differences in performance develop over time. To do so, this paper utilizes data including standardized math grades on high school exams that we have linked to longitudinal data of individual high school careers.

Secondary education
In the Dutch system, a school advisory from primary school determines track placement of students in secondary education, from grade 7 onwards. Dutch secondary education consists of three tracks: pre-vocational education, higher general education, and pre-academic education (known by the Dutch acronyms vmbo, havo, and vwo, respectively). Pre-vocational education takes 4 years, higher general education 5 years, and pre-academic education 6 years. Despite the early tracking of the Dutch system, it is possible to move up a track in secondary education, although this is less common than grade repetition or stepping back a track. To gain access to higher education directly from high school, a student needs to hold a degree from either the general (havo) or the academic (vwo) educational tracks. In any case, only students with a high school degree from the academic (vwo) track can enroll directly into RU bachelor's programs (see NVAO (2021)).
All students in the general and the academic high school tracks take the subjects Dutch, English, and mathematics. However, not all students take the same type of mathematics. In Dutch secondary education, two types of mathematics are offered: (i) mathematics A, which focuses more on statistics (e.g., diagrams, tables, formulas, and probabilities), and (ii) mathematics B, which is more technical, concentrating on such subjects as algebra, geometry, differentials, and functions. Mathematics B is more challenging and delves deeper into calculus. High school students who are more interested in math, as well as those who specialize in science, are obliged to follow mathematics B instead of mathematics A. The admittance prerequisites for most STEM fields require that high school students should have completed a class in either mathematics A or B. The student does not necessarily have to receive a passing grade in mathematics, as students in the Netherlands are considered to have passed the final exam and satisfied all the requirements of high school even if they have one failing grade. That failing grade may even have been in mathematics A or B, but this is not as important as having taken the course, as Dutch universities only stipulate that certain subjects must be followed and do not specify that a specific final grade must be attained.
All high school students must take a standardized national written exam in their final year of high school, which covers the high school subjects Dutch, English, and mathematics A or B. These exams are the same for each student within a cohort, and are marked by two different teachers: the student's own teacher and a randomly chosen teacher from a different school in the Netherlands. This makes the grades on these standardized national exams comparable and robust predictors in our analyses.

Higher education
After high school, graduates from the academic track can choose to go to either an RU (wo) or a UAS (hbo). Graduates from the general track can only apply to UAS. Bachelor's programs at RU have a duration of 3 years, whereas their counterparts at UAS have a duration of 4 years. Figure 1 shows a stylized diagram of the Dutch higher educational system. The present study considers a subsample of high school graduates who can enroll in higher education directly after graduating from high school. The first decision that both general and academic high school graduates have to make is whether or not they will study any of the STEM disciplines in higher education. This is also illustrated in Fig. 1. Assuming that the decision is made to follow a course of study in STEM, students with a degree from the academic high school track can decide to enroll in a STEM program at either an RU or a UAS. However, only approximately 10% of the students in our sample who enrolled in UAS STEM programs are pre-academic education graduates. For each year after the initial decision, students can choose to either drop out from the STEM program or to continue studying. For purposes of our model, not dropping out means that the student continues on to the next year of the current STEM program or switches to another program within STEM at the same type of institution, i.e., the RU or UAS.
During the first year, an academic dismissal policy is enforced when students do not perform well enough in the program. This dismissal policy is called bindend studieadvies, and students are forced to quit their studies if they do not accumulate sufficient credits. These policies have become increasingly common in Dutch universities. For a detailed description of these dismissal policies, see Cornelisz et al. (2019). After the first year, there is no enforced dismissal. After studying for a set number of years (at least 3 years in the case of RU and at least 4 years in the case of UAS), the students may graduate. In our analysis, we distinguish between (1) students who graduate at the first possible opportunity, and (2) students who graduate within 1 year thereafter.
Fig. 1 Sequential logit model: simplified version of the Dutch higher education system. Notes: After high school, graduates from the academic track can choose to either go to research universities, or universities of applied sciences. Bachelor's programs at research universities have a duration of 3 years, whereas their counterparts at universities of applied sciences have a duration of 4 years. For this reason, we model the students' choice sets differently depending on the type of higher education that they are enrolled into: research university students can graduate 1 year earlier than their colleagues at universities of applied sciences. Out of sample means that the student does not graduate within the nominal duration plus 1 year.

Statistics Netherlands
We use non-public microdata from Statistics Netherlands. Under certain conditions, these microdata are accessible for statistical and scientific research. The guiding principle here is safeguarding privacy and preventing disclosure of persons or companies. For further information, see Statistics Netherlands (2021). This microdata facility provides longitudinal register data on every Dutch citizen and inhabitant. The source of the educational data that we use for our analysis is the Dienst Uitvoering Onderwijs (DUO) of the Dutch Ministry of Education, which administers the educational data in the Netherlands. Their registers contain information about enrollments, degrees, high school exam subjects, and grades. For the analysis in this paper, these data have two main advantages. First, they allow us to follow an individual's educational career over multiple years. By this means, we can identify which students enroll in higher education in the Netherlands, in which program they enroll, whether they drop out and if so when, whether they switch to another program or institution within the Netherlands, and when they graduate from a Dutch institution. Second, the microdata facility of Statistics Netherlands allows us to link the two data sets on higher education and high school achievement at the individual level. In this way, we are able to incorporate high school grades in our predictions of dropout probabilities.

Sample
Our sample includes all the students who took their high school exams in the Netherlands between 2007 and 2011. The lower limit of this time period is constrained by the availability of high school grade data. The data on high school grades are only accessible for individuals who took their high school exam in the years from 2007 onward. The upper limit is set at 2011 for two reasons. The first is necessitated by the availability of data on higher education, since we need to follow the students for a sufficient number of years to estimate our model. The second reason is that the high school exam requirements were the same during the period from 2007 until 2011. The requirements to pass the high school exam have become gradually more stringent since 2011. These extra requirements include the introduction of minimum grades for core subjects and a mandatory arithmetic test. For the sake of comparison, we only include students who enroll in higher education directly after graduating from high school. Few Dutch students take a gap year between graduation from high school and enrollment in higher education (Warps, 2018), so this contingency is of little concern. Our final sample consists of 281,806 students spread out over five cohorts. Out of these, 51,948 enrolled in a STEM program. This represents about 18% of all enrollments. From an international perspective, this figure is low. In Germany, for instance, 37% of all students are enrolled in STEM programs (Freeman et al., 2019).

Variables
We have derived the variables in our sample from different source data sets. Table 1 gives an overview of the explanatory variables and their source tables within the microdata catalogue from Statistics Netherlands. The vwo variable is a dummy variable for the academic high school track. The math, dutch, and english variables are continuous variables for the high school math, Dutch, and English grades, ranging from value 1 to 10. The advmath variable is a dummy for whether the student has taken the math B instead of the math A subject in high school. These variables have been obtained from the registers containing information about high school exam grades. In addition to this, the female and migration background variables are dummy variables for female students and students with a migration background. These two variables have been obtained from the civil registers.
The enrolled variable is a dummy for whether a student is enrolled in a STEM program at a Dutch UAS or RU in a given academic year t. To determine whether a student has graduated or has dropped out from the STEM program, we take the following approach. We create two dummy variables: dropout and graduated. When the student's level of education changes from one year to the next, this implies that the student has graduated, in which case the graduated variable takes value one.
The dropout variable takes value one when we observe a change in the main program in which the student is enrolled compared with the year before, without observing a change in their level of education. This is the case when the student has either switched to a different major or to a lower level of post-secondary education, or has dropped out from the educational system altogether.
For instance, when a student switches from one STEM program to another within the higher educational system, the dropout variable takes value 0. When a student switches from a STEM program to a non-STEM program, or drops out from higher education altogether, the dropout variable takes value 1.
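The coding rules for the graduated and dropout dummies can be illustrated with a minimal Python sketch. The record layout (`YearRecord`), its field names, and the function `classify_transition` are hypothetical and chosen for illustration only; they are not the variable names used in the Statistics Netherlands registers. The sketch also ignores the distinction between switching within and across institution types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class YearRecord:
    """One student-year observation (hypothetical layout, not the register schema)."""
    program: Optional[str]  # main program code; None means not enrolled anywhere
    is_stem: bool           # whether that program counts as STEM
    attained_level: int     # highest level of education attained so far


def classify_transition(prev: YearRecord, curr: YearRecord) -> Tuple[int, int]:
    """Return (graduated, dropout) for the step from `prev` to `curr`.

    Assumes `prev` is a year in which the student was enrolled in STEM.
    """
    # A change in attained level relative to the year before signals graduation.
    if curr.attained_level != prev.attained_level:
        return 1, 0
    # Without graduation, the student counts as retained only if still in a STEM program.
    still_in_stem = curr.program is not None and curr.is_stem
    dropout = 0 if still_in_stem else 1
    return 0, dropout


first_year = YearRecord(program="physics", is_stem=True, attained_level=0)
switch_within_stem = YearRecord(program="chemistry", is_stem=True, attained_level=0)
switch_to_non_stem = YearRecord(program="law", is_stem=False, attained_level=0)

print(classify_transition(first_year, switch_within_stem))  # (0, 0)
print(classify_transition(first_year, switch_to_non_stem))  # (0, 1)
```

Under these assumptions, a switch between two STEM programs is not a dropout, whereas a switch to a non-STEM program or leaving higher education is.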

Background characteristics and descriptives
Tables 2 and 3 provide descriptive information on the variables that are included in our analysis. Table 2 shows the entire sample, and Table 3 shows the sample conditional on STEM enrollment. In both tables, the observations are equally divided over the five cohorts. Similarly, students from native and migration backgrounds are included in the entire sample and the sample of STEM enrollees. With respect to gender, we note that far fewer female students choose to go into STEM. The percentage of female students in RU STEM programs is slightly greater than in UAS.
As previously explained, successful completion of high school mathematics A or B is a prerequisite for enrollment in STEM higher education. Consequently, we are able to gather the high school mathematics grades for all students in our sample. Moreover, because all high school students are obliged to take an exam in Dutch and English, we have the grades for those subjects as well. We focus on the final grades in the standardized national exams. For our analysis, we further standardize all high school grades to mean zero and standard deviation one. This simplifies interpretation of the estimated coefficients later on and makes it easier to generalize the results. Hence, there are no additional descriptive statistics to report that provide more information than is shown in Tables 2 and 3.
Table 4 gives an overview of the distribution of the outcome variables. About 16% of the high school graduates in our sample enroll in a STEM-related bachelor's program, UAS and RU combined. A large share of students who drop out do so during the first year. It is also noteworthy that many students drop out during the final year of the program: the fourth year in UAS, and the third year in RU. It is possible that students who do not drop out during the first year, but start underperforming halfway through the program, drop out during the final year, because they have not earned sufficient credits to qualify for writing the bachelor's thesis. In practice, the student dropout rate is spread out across all years, but it peaks during the first and final years of the program. In the Netherlands, both RU and UAS programs require a bachelor's thesis in the final year. Among the students in our sample who graduate, the majority complete their studies within the nominal duration. It is noteworthy that the share of students who eventually graduate from RU is larger than the corresponding share at UAS. In RU, roughly half of the students who enroll in STEM-related programs eventually graduate on time, whereas in UAS, only 28% do, signalling a lower study success rate at UAS.
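The standardization of high school grades to mean zero and standard deviation one, mentioned above, can be sketched as follows. This is an illustration only: the grades are made up, and the text does not specify whether grades are standardized within each cohort or over the pooled sample, nor which standard deviation estimator is used. The function name `standardize` and the use of the population standard deviation are assumptions.

```python
from statistics import fmean, pstdev
from typing import List


def standardize(grades: List[float]) -> List[float]:
    """Rescale grades to mean zero and standard deviation one (z-scores)."""
    mean = fmean(grades)
    sd = pstdev(grades)  # population SD (an assumption; the sample SD is equally defensible)
    if sd == 0:
        raise ValueError("grades have no variation, so z-scores are undefined")
    return [(g - mean) / sd for g in grades]


# Made-up exam grades on the Dutch 1-10 scale, for illustration only.
math_grades = [5.5, 6.0, 7.2, 8.1, 6.7]
z_scores = standardize(math_grades)
```

After this transformation, a coefficient on a grade variable can be read as the change in the log-odds associated with a grade that is one standard deviation higher.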

Sequential logit model
We estimate a sequential logit model (McFadden & Domencich, 1975) to quantify the educational decisions of the students in our sample. As shown in Fig. 1, we assume that each year, students can decide to either continue studying for another year or drop out. After having studied for the nominal study duration (i.e., 3 years for RU and 4 years for UAS), students who have passed all courses with sufficient credits can also graduate. The outcome variable is a categorical variable that captures the final outcome state corresponding to the model in Fig. 1. In Fig. 1, the values of the outcome variable corresponding to the student's outcome state are shown in parentheses. We estimate the sequential logit model by performing a set of logistic regressions, one for each transition that students can make after each year. The first transition is the decision whether to enroll in STEM higher education. Conditional on the decision to enroll in STEM higher education, we consider the students' dropout decisions, followed by an estimation of the probability of graduation within either the nominal duration or the nominal duration plus one more year.
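To make the sequential structure concrete, the following minimal Python sketch shows how transition-specific logit probabilities combine into the probability of a complete educational path. It assumes that each transition has its own coefficient vector and is evaluated only for students who reached that transition. The function names and the coefficient values are hypothetical and are not estimates from this paper.

```python
import math
from typing import List, Sequence


def logistic(z: float) -> float:
    """Logistic function, mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def transition_probability(x: Sequence[float], beta: Sequence[float]) -> float:
    """Probability of 'passing' one transition, given covariates x and coefficients beta."""
    return logistic(sum(b * v for b, v in zip(beta, x)))


def path_probability(x: Sequence[float],
                     betas: List[Sequence[float]],
                     choices: Sequence[bool]) -> float:
    """Probability of a sequence of binary choices.

    Each transition is conditional on having passed all earlier ones,
    so the path probability is the product of the per-transition terms.
    """
    prob = 1.0
    for beta, passed in zip(betas, choices):
        p_t = transition_probability(x, beta)
        prob *= p_t if passed else (1.0 - p_t)
    return prob


# Hypothetical student: intercept term, standardized math grade, female dummy.
x = [1.0, 0.5, 1.0]
# Hypothetical coefficients for: enroll in STEM, continue after year 1, graduate on time.
betas = [[-1.5, 0.4, -0.9], [1.2, 0.3, 0.1], [0.2, 0.3, -0.2]]
p_enroll_continue_graduate = path_probability(x, betas, [True, True, True])
```

In estimation, each coefficient vector would come from a separate logistic regression fitted on the students at risk of that transition, which is how the model conditions study success on the earlier enrollment and dropout decisions.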

Results and discussion
The results of the estimation of the sequential logit model are presented in Table 5. In this table, we focus on STEM enrollment, first-year dropout, and study success. Since many students drop out during the first year, we focus on this transition in our analysis, in addition to the probabilities of graduation. The estimation results for the transitions that are not presented in Table 5 can be found in Appendix Table 8. Columns 1, 2, and 3 show the odds ratios, coefficients, and their standard errors for students at UAS, and columns 4, 5, and 6 show the results for students at RU.
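As a reading aid for the odds ratios and coefficients reported side by side, recall that in a logit model the odds ratio is the exponential of the coefficient. The sketch below illustrates this relationship. The function names are hypothetical, the numbers in the usage lines are made up rather than taken from Table 5, and the interval uses a normal approximation with a critical value of 1.96, which is an assumption and not a description of how the reported results were computed.

```python
import math
from typing import Tuple


def odds_ratio(coefficient: float) -> float:
    """Convert a logit coefficient (log-odds) into an odds ratio."""
    return math.exp(coefficient)


def odds_ratio_interval(coefficient: float,
                        std_error: float,
                        z: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval for the odds ratio.

    The interval is built on the log-odds scale and then exponentiated.
    """
    lower = math.exp(coefficient - z * std_error)
    upper = math.exp(coefficient + z * std_error)
    return lower, upper


# Made-up coefficient and standard error, for illustration only.
example_or = odds_ratio(-0.5)
example_ci = odds_ratio_interval(-0.5, 0.1)
```

An odds ratio below one corresponds to a negative coefficient and thus to lower odds of making the transition, while an odds ratio above one corresponds to higher odds.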

STEM enrollment
First, we estimate the probability of enrolling in STEM higher education. The first panel of Table 5 gives the results for this step. A higher grade for mathematics correlates with a higher probability of enrolling in STEM. For students at UAS, this correlation only applies when the student followed mathematics B in high school. This is an interesting and unexpected result. It could be that the correlation is driven by the fact that in the general high school track, mathematics is not a requirement for every specialization: students who do not like math have the option to skip the subject. In addition, the focus of mathematics B in high school is geared more towards STEM applications, whereas mathematics A is focused more on social sciences. In other words, students in the general track who are more interested in social sciences beforehand might select the mathematics A subject. Combined with the fact that 90% of the students in UAS have followed the general track in high school (see Table 3), this might explain why we find a negative relationship between the high school math grade and the probability of enrollment in STEM at UAS, but a positive coefficient for STEM programs at RU.
For both RU and UAS, a higher grade in English seems to increase the probability that a student will enroll in a STEM program. This is an expected result, since many Dutch universities advocate a thorough knowledge of the English language as a prerequisite for many higher educational programs, given the widespread use of English-language textbooks in Dutch universities. Conversely, a higher grade in the Dutch language seems to decrease the probability that a student will choose to enroll in STEM. The reason could be that students who are interested in a STEM career do not perform well on the high school Dutch exam. In accordance with the recent literature (Buser et al., 2014, 2017; Van Tuijl & Van der Molen, 2015), we find that female students are less likely to enroll in STEM higher education, at both UAS and RU. The same finding applies to students with a migration background when we examine UAS STEM enrollment. Students who graduated from the academic high school track are less likely to enroll in STEM programs at UAS when compared to those who graduated from the general track.
Table 4 Distribution of outcomes. Panel A shows the share of students from the total sample that choose to enroll in STEM higher education. Panel B shows the distribution of outcomes for the students that are enrolled in STEM higher education. Enrollment, dropout, and graduation data have been calculated based on the microdata register of educational attainment and enrollments, referred to by Statistics Netherlands as Onderwijsdeelnemerstab.

First year dropout
Once we have estimated the probability of enrollment in STEM higher education, we proceed to calculate the dropout probabilities during the first year of higher education. The results of this calculation are shown in the second panel of Table 5. The results show that students who followed the academic track in high school and enrolled in UAS are less likely to drop out in the first year. In both RU and UAS, higher high school mathematics grades go hand in hand with lower first-year dropout rates. This means that students with higher high school grades for mathematics perform better during the first year of STEM education. This relationship is stronger for students who followed mathematics B in high school. This is an expected outcome, since mathematics B in high school implies a better a priori understanding of calculus, which is useful in STEM programs. The high school grade for Dutch language does not seem to explain first year dropout rates; it only has a statistically significant effect for RU, but the coefficient is small. The relationship between mathematics performance in high school and the probability of dropping out in the first year might also help to explain the difference in performance between students from the two high school education tracks when they study in UAS. This correlation is most likely due to the differences in prior knowledge of mathematics between academic track students and general track students as described in the previous paragraph. In addition, the general high school track spans 5 years, while the academic track takes 6 years. The additional year of high school mathematics might further explain the variation in performance during the first year in higher education.
Students with a higher grade for English in high school seem to be more likely to drop out from STEM bachelor's programs at UAS. However, the coefficient is small and we do not observe this relation at RU STEM bachelor's programs. Interestingly, we find that female students are more likely to drop out from STEM programs in year one at RU, while they are less likely to drop out from STEM programs at UAS. We observe a similar disparity for students with a migration background. Using the Statistics Netherlands definition as a basis, we define students with a migration background in this paper as students with one or two foreign parents, i.e., parents born outside the Netherlands.¹ Students with a migration background are more likely to drop out from STEM programs at RU, whereas we do not find any difference in the first year dropout rate at UAS.

Study success
In our analysis, we measure study success in two different ways: graduation at the end of the nominal duration of the program and graduation at the end of the nominal duration plus a maximum of one additional year. In UAS, students who graduated from the academic track in high school perform worse on both outcomes. They perform even worse in terms of study success at year five. This finding seems counterintuitive. Since the 6-year academic high school track is more rigorous than the 5-year general track, the expectation is that academic track graduates leave high school better prepared for higher education compared to general track graduates. However, it is possible that selection effects play a role here. Less motivated academic high school graduates may be more likely than their more motivated peers to enroll in UAS instead of RU, which would explain their lower performance in UAS.
With respect to graduation within the nominal duration of the program, the predictive power of high school exam grades seems to diminish. At UAS, the effects of high school grades are negative for mathematics and Dutch language. The English language grade yields only a small positive effect on the probability of graduating after 4 years. For RU STEM programs, the coefficients are not significant, except for a small negative effect for the English language high school exam grade. When we consider the probability of graduating instead of dropping out within the nominal duration plus 1 year, the results do not change. For UAS STEM programs, the coefficients for high school grades are all negative. For RU programs, the coefficients for high school grades are again insignificant, except for a negative coefficient for the high school exam grade in English.
Female students appear to perform worse in terms of study success in both UAS and RU. This applies to the probability of graduating within both the nominal duration and the nominal duration plus 1 year. Notably, we do not observe female students performing worse in terms of first year dropout rates at UAS. For UAS STEM programs, we also observe that students with a migration background are less likely to graduate after 5 years. We do not observe differences in graduation probabilities for RU programs.
Our empirical findings support the theory laid out in Jouini et al. (2018) that females change their view of the STEM fields when they progress through the program, due to the reinforcement of stereotypes over time. In a questionnaire on the motivation and interest development of female STEM students in the US, Talley and Martinez Ortiz (2017) also find that women lose interest in STEM over time. A related explanation for this is that family and school decisions may initially motivate women for STEM. Over time, the reinforcement of the decision for STEM relies on family support. When family support weakens while stereotypes are enforced, this causes females to lose interest in STEM. As shown in a survey of nine US institutions (Verdín, 2021), women's interest and identity formation are crucial for their decision to persist in STEM. From interviews with students at US institutions conducted by Puccia et al. (2021), it appears that students rely heavily on parental reinforcement during the first year of the engineering major.
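The two study-success measures can be made concrete with a short sketch. It assumes the nominal durations stated in the paper (4 years for UAS, 3 years for RU); the function name, the dictionary keys, and the treatment of graduation "at or before" a given year are illustrative assumptions rather than the authors' coding of the register data.

```python
# Nominal program durations in years, as stated in the paper.
NOMINAL_YEARS = {"UAS": 4, "RU": 3}


def study_success(institution, years_to_degree):
    """Classify one student against both study-success measures.

    years_to_degree is None for a student who has not (yet) graduated.
    Graduating at or before the cut-off counts as success (an assumption).
    """
    nominal = NOMINAL_YEARS[institution]
    graduated = years_to_degree is not None
    return {
        "within_nominal": graduated and years_to_degree <= nominal,
        "within_nominal_plus_one": graduated and years_to_degree <= nominal + 1,
    }


# A UAS student graduating in year five misses the first measure
# but meets the second.
print(study_success("UAS", 5))
```

Every student who meets the first measure also meets the second, so the second measure can only show an equal or higher success rate for the same group.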

Long-run graduation rates
From the sequential logit model, we find that female students perform worse in terms of study success measured by nominal graduation rates in both UAS and RU. Because we estimate our sequential logit model for several cohorts, we only track students for the nominal duration of the program plus one additional year due to data availability. To investigate the performance of female students and students with a migration background in STEM higher education in the long run, we estimate a logit model for the probability of STEM graduation within 10 years for the cohort that started in 2007 alone. The reason for investigating a period of 10 years in this long-run analysis is that the Dutch law on financing higher education requires students to achieve a degree within 10 years from the date of their initial enrollment. When they achieve a degree within 10 years, their government subsidized student loans are converted into a gift. If they do not achieve a degree within 10 years, they must repay their accumulated student loans. This degree does not necessarily have to be a STEM degree, however. The results of this long-run cohort analysis are presented in Table 6.
Remarkably, we find that female students are more likely to graduate in STEM within 10 years' time than male students in UAS. In RU, the coefficient for females is not statistically significant, so female and male students perform equally well in terms of graduation within 10 years. In the 2007 cohort, students with a migration background perform worse than native Dutch students in RU, but not in UAS.
To assess whether this finding is driven by just one cohort, we would ideally run the 10 year analysis for the other cohorts as well. However, this is not possible due to data availability constraints. In Table 7, we compare the descriptive statistics of the regression variables from the 2007 cohort to the 2008-2011 cohorts. The shares of female students and of students with a migration background are comparable between the 2007 and the 2008-2011 cohorts. This implies that the composition of the 2007 cohort is comparable to the other cohorts. Therefore, it is unlikely that the findings from Table 6 are driven by cohort effects.

Conclusion
In this paper, we contribute to the literature by modelling study choices and study success in the Dutch higher educational system, from the transition from secondary school to graduation from STEM higher education. This comprises a time period of 4-5 years. A major contribution of this paper is that the data allow us to track a student's educational career over the course of multiple consecutive years. We focus on enrollment and study success in STEM programs. In different phases of the model, students can either drop out, continue studying, or graduate from a STEM program. We use longitudinal Dutch register data including high school exam grades. All students in the Netherlands take the same high school exam, which allows for a robust comparison between students from different schools. We account for the low STEM enrollment rates among females (Arcidiacono; Buser et al., 2014, 2017; Hunt, 2015; Reuben et al., 2014; Venkatesh et al., 2003; Volman & Van Eck, 2001) by first considering the STEM enrollment decision. This is important for a fair comparison between different groups and to control for the influence of high school achievement on STEM enrollment.
Table 6 Logit model for the probability of STEM graduation within 10 years, 2007 cohort only. ***, **, * denote 1%, 5%, and 10% significance levels, respectively. Columns (1) and (2) show the coefficients and their corresponding standard errors for universities of applied sciences, and columns (3) and (4) show those for research universities.
In STEM programs at RU, we find that students with a migration background most often drop out in the first year of study, but if they do not drop out, their study success is on a par with that of native Dutch students. In UAS, the first year dropout rate is no different between students with and without a migration background, but we do observe lower study success among students with a migration background. It seems that UAS are better at preventing students with a migration background from dropping out during the course of the program, but if they do not graduate by the extra year that is permitted after year five of the program, the students with a migration background are more likely to drop out than students without a migration background. In RU STEM programs, students with a migration background are more likely to drop out during the program, but if they have not dropped out before the final year, students with a migration background in RU are just as likely to graduate when compared to students without a migration background at the end of the program.
The results of our study show that not only do female students perform worse than male students in both RU and UAS, females are also less likely to enroll in STEM programs. Presumably, these results are caused by existing gender differences in math scores (Guiso et al., 2008; Nollenberger et al., 2016), which are in turn driven by disparate degrees of competitiveness between boys and girls (Buser et al., 2014, 2017; Niederle & Vesterlund, 2010; Wang & Degol, 2017) and gender bias in the Dutch math curriculum (Haan, 2018; Hidalgo Saá, 2017). Commensurate with the pattern of reduced STEM enrollment prospects for females, we find that women are also less likely to graduate on time, whether within the nominal duration or within a year after the end of the nominal duration of the program.
In addition to these differences in interest between male and female students when making the decision to enroll in STEM, we find that female students are also more likely to drop out over the course of the program. This suggests that female students are not only less interested in STEM when choosing to enroll, they also lose interest in STEM over time. A conceptual model by Jouini et al. (2018) attributes this to stereotypes that are enforced during the STEM educational career. A strand of literature on STEM persistence also highlights the importance of parental support for female STEM students (Puccia et al., 2021; Talley & Martinez Ortiz, 2017; Verdín, 2021). Our results suggest that either this positive impact of parental support on STEM persistence decreases over time, or the negative impact of the enforcement of stereotypes gets the upper hand, leading to higher dropout rates for female students in the final years of the program.
Based on such diminished study success, it is arguably a rational choice for many women to avoid STEM programs when making enrollment decisions. However, an interesting pattern emerges when we consider longer-term graduation and dropout rates for female students. We find that gender-based differences in choice of major and study success disappear when we consider long-run graduation rates in a separate, long-run analysis (i.e., within 10 years after initial enrollment). A deeper long-run analysis of one of our cohorts reveals that women do not perform worse than men when examining the graduation pattern over a term of 10 years. In that case, female students are equally likely to graduate as male students. Indeed, in scrutinizing the first-year dropout rate in UAS, we observe that females are actually less likely to drop out. Based on these observations, it would appear that the gender differences in study success within STEM higher education only exist in terms of nominal graduation rates, and disappear in the long run. Therefore, advanced labor practice policy should be geared to bolster study success and to reduce first-year dropout rates among both female students and students with a migration background. In short, it seems that when we control for high school mathematics and language achievement, female students and students with a migration background exhibit inferior study success conditional on STEM enrollment. From a randomized experiment, Russell (2017) concludes that female and minority students in STEM higher education might benefit from small, individualized learning communities. Indeed, smaller learning groups might explain the discrepancy in results between RU and UAS, since UAS tend to work with smaller bachelor's degree classes. Students in UAS also stay in the same small group during the entire course of the program.
Although this paper benefits from unique longitudinal Dutch registration data, the conclusions are drawn within the context of the Dutch higher educational system. While this could be seen as a threat to external validity, the division of higher education into bachelor's and master's programs has been common practice in the European Union since the signing of the Bologna Declaration in 1999. The system has also been found for many years in Anglo-Saxon countries, such as the United Kingdom and the United States. In the Netherlands, the higher education system has been divided into bachelor's and master's programs since 2002, although it still distinguishes between the two types of bachelor's programs: RU and UAS. This division is also common in many other European countries, such as Germany, Austria, Switzerland, Belgium, and several Scandinavian countries. Given such accepted practice, we argue that there are many similarities between the Dutch higher educational system and those in North America and Europe, making it plausible that the results from this paper may be generalized to other countries.
To recapitulate, we find that high school exam grades explain most of the variation for the dropout decision in the first year. Students with higher mathematics grades seem to be less likely to drop out of STEM higher education in the first year. This is especially true in the Netherlands for students who took mathematics B in high school. However, our results show that high school exam grades have little predictive power for study success. The literature points to a relation between high school grades and study success (Danilowicz-Gösele et al., 2017), but this correlation is probably due to the tendency of admission officers to select students based on their high school achievement and cognitive test scores (Akos & Kretchmar, 2017). The present study differs in that respect, because in the Netherlands, the selection of students at the admission stage is rare, especially in STEM fields. The majority of Dutch bachelor's programs accept all applicants who have earned a high school diploma; grades or test scores are irrelevant to admission in the Netherlands. For the present study, therefore, this systemic disregard of grades and exam scores for purposes of university enrollment eliminates any upward bias in relation to admission selection that might otherwise be attributed to the predictive power of high school performance. Our results show that selecting students based on high school grades might only improve upon first year dropout rates, but will not improve study success. Moreover, the predictive power of high school grades on first year dropout rates might be due to the bindend studieadvies, an academic dismissal program that is in effect in the first year of higher education in the Netherlands. If students do not earn enough credits during the first year, they are forced to quit the program, and switch to a different one at the same institution or a similar program at a different institution (Cornelisz et al., 2019).
In summary, while we find evidence that female students and students with a migration background perform worse in STEM higher education than native Dutch males, their performance is not uniformly poor in every aspect of enrollment, dropout rate, and study success. Female students are not as prone to enroll in a STEM program or to graduate in the nominal duration (or nominal duration plus one additional year), but they are more likely to survive the first year. Students with a migration background are less disposed to enroll in STEM in RU, and less apt to survive the first year. Students with a migration background are also unlikely to graduate on time in both RU and UAS. However, when we specifically examine the 2007 cohort, we are able to conclude that female students are equally, if not more likely than men to graduate within 10 years. For students with a migration background, this long-term analysis of the 2007 cohort does not show the same pattern: they are less apt to graduate within 10 years in RU, although this finding does not pertain to UAS. We therefore conclude that female students in general can perform well in STEM, but need support to help them graduate on time.
Improving on the underperformance of female students in STEM higher education can also contribute to relieving the labor market shortage of STEM workers. In other words, the STEM pipeline, which runs from the moment of enrollment to the point of graduation, is leaky. Policymakers should use our models to diagnose where exactly this STEM pipeline leaks by identifying where dropout and underperformance of female students is concentrated. In addition to having a lower probability of enrolling in STEM, female students have a higher probability of dropping out in the final years of the program. The literature on STEM persistence points to both increasing stereotypes and decreasing parental support over time as factors at play here. However, a limitation of the present study is that, based on our models, we can identify when females drop out but cannot conclude why they do so or how to prevent it. Further research should assess how these leaks can be fixed, or in other words, which interventions are effective in increasing interest and STEM persistence among females. This further research should focus on answering the question 'What works?', which should be addressed with randomized, evidence-based effect evaluations of interventions aimed at reducing the leaks in the STEM pipeline, where capable female students are lost because they lose interest or switch to a non-STEM subject in which they feel more welcome.
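The idea of locating leaks in the pipeline can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' model: the stage names and the conditional probabilities below are hypothetical and are not estimates from this paper. In practice, the conditional probabilities would come from the fitted transition-specific logit models, evaluated separately for each group of interest.

```python
def pipeline_leaks(transition_probs):
    """Locate where an enrolled cohort is lost along a sequence of stages.

    transition_probs is a list of (stage_name, p) pairs, where p is the
    probability of passing that stage conditional on having reached it.
    Returns (share of the initial cohort lost at each stage,
             share of the initial cohort that passes every stage).
    """
    remaining = 1.0
    lost = {}
    for stage, p in transition_probs:
        lost[stage] = remaining * (1.0 - p)  # share leaking out here
        remaining *= p                       # share moving on
    return lost, remaining


# Hypothetical conditional probabilities, for illustration only.
stages = [
    ("survive year 1", 0.80),
    ("continue to final year", 0.90),
    ("graduate within nominal + 1", 0.60),
]
lost, completed = pipeline_leaks(stages)
for stage, share in lost.items():
    print(f"{stage}: {share:.3f} of the enrolled cohort lost")
print(f"completed: {completed:.3f}")
```

Comparing the per-stage losses between groups, for example female and male students, shows at which transition the gap opens up, which is the diagnostic use suggested above.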