Perceived abilities or academic interests? Longitudinal high school science and mathematics effects on postsecondary STEM outcomes by gender and race

Previous literature has examined the relationship between high school students' postsecondary STEM major choices and their prior interest and perceived ability in mathematics. Yet we have limited understanding of whether and how perceived ability and interest in science and mathematics jointly affect students' STEM major choices. Using the most recent nationally representative longitudinal cohort of U.S. secondary school students, we examine the degree to which students' perceived mathematical and scientific abilities and interests predict their STEM major choices, employing logistic regression and a series of interaction analyses. We find that while perceived ability in both mathematics and science positively influences STEM major selection, academic interest in these subjects is a weaker predictor. Moreover, across a series of analyses, we observe a significant gender gap, whereby women are less than half as likely to select STEM majors, as well as nuanced distinctions by self-identified race. The relationships among perceived ability, interest, and STEM major choice are not found to vary meaningfully by race or consistently by gender. However, perceived ability has a more positive effect for men than for women who are pursuing Computing/Engineering majors and a more positive effect for women than for men who are pursuing other STEM majors, including less applied Social/Behavioral, Natural, and Other Sciences. These findings suggest potential opportunities to enhance students' perceived mathematical and scientific abilities in high school, positioning them to potentially enter STEM fields. School sites with more resources to support the ambitions of STEM students of all backgrounds may be better positioned to reduce postsecondary disparities in STEM fields. Given existing opportunity gaps and resource differentials among schools, corresponding recommendations are suggested.


Introduction
Zhao and Perez-Felkner International Journal of STEM Education (2022) 9:42
Science, technology, engineering, and mathematics (STEM) fields are growing in demand and pay twice as much as non-STEM occupations (Fayer et al., 2017). Accordingly, U.S. high schools offer more advanced mathematics and science opportunities to better position students for postsecondary STEM degrees. Notably, while we have decades of research on mathematical ability and interest as predictors of postsecondary STEM outcomes, limited research has directly assessed the interplay and distinctions between these predictors. Even less is known about science ability and interest, after decades of research focused particularly on mathematics. Early exposure to STEM-related courses can awaken students' interests and lead them to postsecondary STEM majors (Bottia et al., 2015). Advanced mathematics and science courses in high school have been found to be essential to students' opportunities to study STEM subjects in college (Dalton et al., 2007; Schneider et al., 2013). Psychological factors, including interest, have been found to influence students' pathways into these courses (Milesi et al., 2017; Perez-Felkner et al., 2017). Often regarded as a motivational variable and/or an affective component in educational psychology, interest has been widely studied as a mechanism that stimulates students' learning and academic achievement (Renninger & Hidi, 2011; Renninger et al., 1992). However, we know less about how academic interests influence students' postsecondary degree field. Self-assessed mathematics ability has also been found to affect students' STEM major choices, even when students' self-assessments of ability are biased by sociocultural norms around who belongs in these fields (Beyer, 1990; Correll, 2001). Interest is similarly socially conditioned (Watt et al., 2012).
Complicating these educational metrics of self-assessed ability and interest in science and mathematics are the complex socioeconomic, racial, and gender dynamics currently at play in secondary and postsecondary education. Socioeconomic inequality by race and gender remains a major social problem, even among college graduates (Doren & Lin, 2019; Pais, 2011). Since STEM degrees can lead to highly paid jobs, one would expect STEM professions to attract underrepresented students and women, who are increasingly the primary earners in their families. However, racial and gender disparities persist in these fields (Carter et al., 2019; Saw et al., 2018; Xie et al., 2015). Importantly, gender differentials in postcollege earnings appear to be meaningfully explained in part by postsecondary degree field (Bobbitt-Zeher, 2007; Xu, 2015; Zhang, 2008). Whether this is also the case for racial disparities remains unclear.
Given the persistent gender and race disparities in White- and male-dominant STEM fields, despite extensive investment in broadened access and a changing economy, we examine a contemporary cohort comprising a full range of socioeconomic, racial, and gender demographics with a focus on both mathematics and science predictors of STEM postsecondary education. Using the National Center for Education Statistics' most recent longitudinal data, the High School Longitudinal Study of 2009 (HSLS:09), this study investigates (1) the distinct effects of mathematical and scientific interests and ability beliefs on STEM major choice; (2) whether interests or abilities have stronger effects on students' STEM major choice; (3) specific effects by students' self-reported gender and racial identity; and (4) whether these patterns differ by STEM major cluster, with specific attention to Computing/Engineering majors.

Disparities in U.S. postsecondary STEM education and beyond
There are clear gender and racial disparities in STEM postsecondary educational outcomes (e.g., Griffith, 2010; Huang et al., 2000; Shapiro & Sax, 2011). Riegle-Crumb and Peng (2021) found that gender differences in postsecondary major choice can be shaped by societal stereotypes and self-beliefs about mathematics ability. Students' pathways to STEM degrees may differ by gender and race (Ireland et al., 2018). Using an intersectional approach, Nix and Perez-Felkner (2019) found that Black women and men experience especially strong gains from positive mathematics ability beliefs and, after controls, are the most likely to declare mathematics-intensive STEM majors and earn degrees in these fields.
Historically, Black, Latina/o/x, and Native American students have been characterized as underrepresented groups in STEM disciplines (Estrada et al., 2016; Maltese & Tai, 2011). Yet racially minoritized students such as Black and Latina/o/x youth were more likely to major in certain STEM disciplines than their White peers when multivariate analyses accounted for explanatory factors such as secondary school course preparation and postsecondary enrollment, and racial differences are effectively null when focusing on the population of students who enroll in college (e.g., Riegle-Crumb & King, 2010). This is not the case for gender, where disparities remain among the U.S. postsecondary student population. Such research supports the value of ongoing investigation into the mechanisms that might explain STEM postsecondary major selection, as it varies by gender and race (see Garrison, 2013; Riegle-Crumb, King, et al., 2019; Riegle-Crumb, Morton, et al., 2019; Xu, 2016).
Studies using nationally representative, longitudinal data on U.S. cohorts of students have advanced our understanding of these patterns; these data follow students from secondary school through postsecondary education and workforce entry. In Griffith's (2010) analysis of National Education Longitudinal Study of 1988 (NELS:88) and National Longitudinal Study of Freshmen data, White and male students comprised higher shares of those persisting in STEM disciplines when compared to underrepresented groups and women.
In recent years, a steady flow of research has highlighted factors that may affect students' STEM major selection (Shapiro & Sax, 2011; Wang, 2013a). In addition to the demographic characteristics discussed above, socioeconomic characteristics such as parental occupation and educational background also positively predict postsecondary STEM outcomes (Wagner et al., 2002). For instance, Oguzoglu and Ozbeklik (2016) found in the 1979 National Longitudinal Study of Youth that girls whose fathers were employed in a STEM profession were more likely to choose a STEM major in college if they had no male siblings. In addition, family income was one of the components of family socioeconomic status that was used in predicting students' STEM major choices (Niu, 2017). High-demand and high-earning fields such as Computing and Engineering may be especially attractive to students from less socioeconomically advantaged families, as compared to less applied and high-earning scientific fields.
Looking further into specific major fields such as Computing and Engineering vs. applied and non-applied Humanities and Social Science fields, Wiswall and Zafar (2015) found that variations in individual beliefs about ability in each major affect students' major intentions, exacerbating the gender gap in major choices. Different majors tend to have distinct associations with abilities. Focusing on mathematics-intensive majors, Nix and Perez-Felkner (2019) found that 12th grade girls' perceptions of their ability with difficult mathematics increased their likelihood of choosing mathematically intensive majors such as engineering and computer sciences over biology and social/behavioral science majors. Dika and D'Amico (2016) also found that students' perceived preparation in math could significantly predict students' persistence in these majors. Notably, in the fields of Computing and Engineering, women have remained distinctly underrepresented (Corbett & Hill, 2015). Thus, this study digs further into these fields specifically.
The demographic characteristics of students' schools are also commonly included as covariates in statistical models in prior research that examine students' STEM major choices, such as high school type (e.g., Wang, 2013a) and urbanicity (e.g., Legewie & DiPrete, 2014). Quadlin (2017) used the 1997 National Longitudinal Study of Youth cohort and found that students from less economically advantaged family backgrounds had a higher probability of majoring in applied non-STEM fields such as business, communications, and education disciplines as compared to higher-income families. Generally, private high schools offer more advanced mathematics and science courses and have a higher socioeconomic and college preparatory profile (Lee et al., 1998). As such, students in private schools are better positioned for postsecondary STEM majors and STEM careers (Ketenci et al., 2020). School urbanicity matters as well. Rural students have been found less likely to complete advanced science courses than those in urban schools (Perez-Felkner et al., 2014). Bottia et al. (2018) found that rural high schools were less likely to have STEM-focused programs, which could positively affect students' postsecondary major intentions.
In sum, it seems important to consider these contextual factors, which may affect students' pathways into STEM disciplines during and beyond high school. Wang (2013b) found a stronger effect of math self-efficacy on 2-year college students' STEM interests than on those of 4-year college students. Phelps et al. (2018) found that students' STEM major enrollments differed at 2-year and 4-year colleges (e.g., mechanical technologies vs. engineering and engineering technology, respectively), within if not between major clusters such as Computing/Engineering fields. Gender differences in U.S. students' attainment of Natural/Engineering Sciences and Life Sciences majors have also been found to vary by college type, after controlling for additional characteristics such as college STEM GPA and academic and social integration. Following these discussions of the value of explaining disparities in postsecondary STEM major selection and contextual explanations, we turn to developmental and social psychological frameworks, which are the focus of our study.

Theorizing postsecondary STEM major choice
Psychological theories have been adapted to explain students' pathways from earlier educational experiences through postsecondary STEM outcomes, including major selection. This manuscript focuses especially on students' perceived abilities and interests in mathematics and science. We also considered complementary theory that attends to the motivational relationships between students' (1) expectations for success and (2) subjective task values; expectancy-value theory postulates that individuals' achievement-related choices (such as postsecondary majors) are associated with their confidence in specific subject domains, such as mathematics and science (Eccles & Wigfield, 2002). Subjective task value has been further categorized into four subcomponents, including intrinsic value, the enjoyment obtained from participating in these tasks (Eccles & Wigfield, 2002).
In turn, college students may be more likely to choose the majors in which they feel more efficacious and interested. This may be especially true for students from underrepresented backgrounds, who typically receive less encouragement and support in their pursuit of STEM degrees. Using an older cohort of U.S. longitudinal data, Perez-Felkner et al. (2017) also found that perceived mathematical ability affects science course enrollment in secondary school, which in turn has consequences for postsecondary major choice. Measures for science ability were not available in their study. However, students with higher perceived mathematical abilities were found more likely to declare STEM postsecondary majors. Perceived mathematical abilities had particularly positive effects on the probability of Black girls' and boys' selection of mathematics-intensive STEM majors and completing degrees in these fields.
There has been more limited attention to how mathematics and science interest may affect students' STEM major choice, despite academic interest being an important concept in educational psychology (Renninger et al., 1992). Therefore, this study examines whether interest, in conjunction with other established factors, predicts the likelihood of majoring in STEM fields and demographic disparities in these outcomes. This study will also assess students' perceived ability in mathematics, already shown to affect postsecondary STEM pathways, and in science, a crucial but under-examined domain.

Conceptual framework
This study develops a framework that integrates Bandura's (1977) self-efficacy, Hidi and Renninger's (2006) interest development model, and Eccles et al.'s expectancy-value theory (Eccles, 1983; Eccles & Wigfield, 2002), while accounting for prior literature on factors predicting students' STEM major choices. In this model (see Fig. 1), students' intent to choose a STEM major is influenced by their perceived mathematical and/or scientific abilities, mathematical and/or scientific interests, parental education and occupations, and high school and classroom characteristics, which may stimulate students' STEM motivation. This study presents a detailed explanation of the model's theoretical grounding.

Ability-related perceptions and beliefs
Postsecondary STEM majors require strong foundational knowledge of mathematics and science. High school advanced mathematics and science courses can signal preparedness to enter college in general and to enter postsecondary gateway courses to STEM majors specifically (Schneider et al., 2013; Tyson et al., 2007). These courses also have implications for students' perceived mathematical ability, including how girls' and underrepresented students' self-assessments of their ability may be shaped by these gendered course and school contexts, with implications for entry to STEM majors (Correll, 2001; Perez-Felkner et al., 2014). Bandura's (1977) self-efficacy model emphasized an individual's belief in their innate capacity to achieve a particular goal. When individuals realize that their abilities may not enable them to accomplish certain goals, they may give up. The concept of self-efficacy has been widely used in education research. Zimmerman et al. (1992) found that students' perceived academic self-efficacy significantly affects their educational goal setting. Bandura (1993) illustrated that self-efficacy could function in the academic field through four major processes, including motivational and selection processes. Students' beliefs in their efficacy played an important role in regulating their academic activities and aspirations, as well as directly shaping their motivation. Later, Bandura et al. (2001) demonstrated that children's self-efficacy could be more decisive in shaping their occupational decisions than actual academic achievement, which in turn indirectly influences students' major choices. Therefore, self-efficacy is connected to goal setting as well as motivation and/or aspiration, thereby influencing students' STEM-related experiences in secondary school.
In turn, building on Bandura's concept of self-efficacy, perceived mathematical and scientific abilities may each shape high school students' pursuit of STEM degrees in college towards the particular goal of majoring in STEM fields. We assess students' perceived abilities holistically as described further in the methodology, encompassing self-assessments of ability on specific tasks (tests, assignments, difficult textbook material) and with their mathematics/science course and the discipline more generally, during their first high school year of study in these crucial subjects for future STEM majors. Hidi and Renninger's (2006) Four-Phase Interest Development Model illustrates how a person's interest influences their attentions, goals, and levels of learning, which may contribute to academic motivation and/or aspiration. Hidi and Harackiewicz (2000) identify interest as a crucial motivational variable that affects students' academic performance. Thus, if students have stronger interest in mathematics and science, they may be more likely to have better mathematics and science performance and/or ability, potentially influencing their STEM choices.

Interest development
According to Hidi and Renninger's model, interest is categorized as situational or individual. Situational interest is a temporary interest aroused by specific activities, while individual interest is a relatively stable interest (Schiefele, 2009). Situational interest can develop into individual interest, where a person's subsequent decisions/behaviors are associated with this well-developed interest. Notably, scholars have argued that individual interest is not exclusively generated by the individual (Csíkszentmihályi et al., 1993); school actors such as peers and teachers may affect the formation of well-developed individual interest. Frenzel et al. (2010) found students' development of mathematics interest was positively associated with classroom characteristics like mathematics teacher enthusiasm. Therefore, students' classroom experiences with their mathematics and science teachers may also affect their interests in these areas.

Research questions and hypotheses
Guided by prior literature and the conceptual framework above, this study examines the influence of perceived abilities and interest in mathematics and science on students' STEM major choice. Specifically, this study addresses four research questions. Students with higher perceived mathematical/scientific abilities may be more likely to choose STEM disciplines in college. Similarly, students who have higher math/science interests may be more likely to choose STEM disciplines. It is unclear whether perceived abilities or academic interests are the stronger predictor of STEM major choice. Research on perceived abilities and academic interest discussed above suggests that both could positively affect students' corresponding choices. Given interest in developing interventions to enhance students' opportunities to enter and complete STEM majors, we ask which is the more effective predictor of students' selection of STEM majors. Additionally, we evaluate whether these relationships vary by gender and race. Finally, we consider whether the analytic models are sensitive to alternate specifications of the dependent variable. We assess these relationships on a more nuanced dependent variable, which parses high-growth and high-earning applied majors in technological fields (Computing and Engineering) as compared to other STEM and non-STEM fields.

Data source, declarations, and participants
We used restricted-use data from the newest nationally representative longitudinal U.S. cohort, the High School Longitudinal Study of 2009 (HSLS:09). The National Center for Education Statistics (NCES) followed incoming ninth-grade students through secondary and postsecondary education, beginning in fall 2009. Students completed follow-up surveys in spring 2012, when the majority were in eleventh grade, and at the transitory point from secondary to postsecondary education in 2013, to collect information like college plans and choices after high school completion (Duprey et al., 2018; Ingels et al., 2014). An additional follow-up survey was administered in spring 2016. Beyond these surveys, federal data from other sources were incorporated into the restricted-use HSLS dataset by NCES and are acknowledged where appropriate in the table source information we report in the main paper and appendix. Public-use access to these data is available through the NCES website, as is the application for restricted-use data, such as we used for these analyses: https://nces.ed.gov/surveys/hsls09/hsls09_data.asp. Statistical code generated for the analyses supporting the findings of this study is available from the corresponding author upon request.
The base-year sample includes ninth graders from 940 high schools across the country. Here and throughout, descriptive information is rounded to the nearest tenth in accordance with NCES restricted-data licensing regulations. Parents, teachers, school administrators, and counselors were also surveyed. The base-year student survey asked questions related to students' perceived mathematical/scientific abilities and math/science interests as well as their feelings about math/science teachers. The base-year parent survey collected information about students' family characteristics. Students' first-declared college major was gathered in the postsecondary follow-up student survey (see detailed variable descriptions in Table 7 in Appendix).
The analytic sample represents the students who were ninth graders in fall 2009 and were still college students in spring 2016. We restricted the sample to those who attended postsecondary institutions. Because there are missing cases associated with each analytic variable, we used multiple imputation to reduce nonresponse bias. After imputation, our final analytic sample yielded 11,560 cases. To ensure generalizability to the U.S. student population and enhance external validity, we used the panel strata and survey weights (w4w1stup1) provided by HSLS:09 to address the complex stratified sampling design and to adjust for unequal selection probabilities of sub-populations, consistent with the recommendations by Duprey and colleagues (2018) for appropriate weights when analyzing multi-wave HSLS data.

Measures
This section summarizes all the variables used in the hypothesized model (Fig. 1).
STEM major. The primary dependent variable in this study represents whether a student chooses a STEM major, a dichotomous variable (1 = first major is STEM; 0 = not STEM). According to the HSLS:09 codebook, the source variable (X4RFDGMJSTEM) represents the respondent's major or field of study as of February 2016. STEM designations follow the U.S. Department of Education's (2010) Classification of Instructional Programs. We also developed and investigated an alternate three-category specification drawn from the related source variable X4RFDGMJ14Y, which breaks majors into clusters: (1) Computing/Engineering fields, (2) Social/Behavioral, Natural, and Other Sciences, and (3) non-STEM fields. Past research has shown that Computing and Engineering function distinctly from other scientific areas given the high earnings and growth associated with these applied and mathematically intensive technological fields (see Corbett & Hill, 2015; Scott et al., 2015). The non-STEM category is identical across both specifications, as indicated later in Table 2 and Table 7 in Appendix.
Perceived mathematical/scientific ability. Multiple items in the base-year student survey measured students' perceived ability. We identified seven items that include ability self-assessments and their reflection in students' perceptions of themselves as a mathematics/science "person"; this scale is intended to richly and robustly reflect students' perceived mathematical/scientific abilities (see Table 8 in Appendix). Original items were all 4-point Likert scale items, ranging from 1 (strongly agree) to 4 (strongly disagree), which we reverse-coded for interpretability such that the items range from strongly disagree to strongly agree as they increase from 1 to 4. Cronbach's α for our perceived mathematical and scientific ability scales is 0.89 and 0.87, respectively, indicating strong internal consistency among these seven items (see Kline, 2013). As an additional robustness check, we assessed the correlations between our original scales for students' perceived mathematical and scientific abilities and the NCES-provided measures of mathematics self-efficacy (r = 0.92, p < 0.001) and science self-efficacy (r = 0.70, p < 0.001). More detailed information about these and subsequently discussed survey items is provided in Table 8.
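The reverse-coding and reliability steps described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the three-item toy matrix, and the choice to average items into a scale score are assumptions made for illustration, and the responses are invented rather than HSLS:09 data.

```python
import numpy as np

def reverse_code(items, low=1, high=4):
    """Flip Likert responses so that higher values mean stronger agreement."""
    return (low + high) - np.asarray(items, dtype=float)

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses (5 students x 3 items), coded as in the original survey:
# 1 = strongly agree ... 4 = strongly disagree.
raw = np.array([
    [1, 1, 2],
    [2, 2, 2],
    [4, 3, 4],
    [3, 3, 3],
    [1, 2, 1],
])

recoded = reverse_code(raw)           # now 4 = strongly agree
scale_score = recoded.mean(axis=1)    # assumption: items are averaged into one scale
alpha = cronbach_alpha(recoded)
print(round(alpha, 3))
```

Because every item is flipped in the same way, reverse-coding changes the direction of the scale but not its reliability; the alpha computed on `raw` and on `recoded` is identical.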
Math/science interest. Similarly, mathematics and science interests were measured by a series of items (details in Table 9 in Appendix). Again, the original score on each item ranged from 1 (strong agreement) to 4 (strong disagreement). Two negatively worded items were found. We therefore reverse-coded these measures, such that four indicates "strongly agree", so that high values consistently indicate high interest. Cronbach's α for the scales we generated to represent mathematics and science interests is 0.78 and 0.81, respectively. These three items thus represent math/science interest reasonably well, as reliability scores of α > 0.75 are in the acceptable range of internal consistency (Kline, 2013).
Student characteristics. We included students' demographic characteristics, including dichotomous variables coded 1/0 for students' self-reported gender and race/ethnicity: White, Asian, Black, Latina/o/x, and Multiple/Other race. We also account for students' family socioeconomic status, as designated by parental education and family income. Parental education was distinguished by whether at least one parent had a 4-year college degree. Family income is an ordinal scale representing family income from all sources in 2008, the year prior to ninth grade, and is the most detailed family income variable included in the HSLS restricted-use data; income ranges from 1 = less than or equal to $15,000 to 13 = greater than $235,000. We also account for parental occupation, designated as whether students' fathers and/or mothers were employed in STEM fields. Previous research indicates that students' academic performance matters (e.g., Wang et al., 2013). Therefore, we included students' high school mathematics and science course grade point averages (GPA) to measure students' observed ability, in distinction to their perceived abilities. Finally, as indicators of high school preparation for STEM coursework in college, we also include indicators of the highest mathematics and science courses taken. Detailed coding information is provided in Table 7 in Appendix.
High school and college characteristics. High school characteristics may affect students' STEM plans in college (Legewie & DiPrete, 2014). Therefore, we controlled for high school characteristics by using variables representing high school type and urbanicity, as shown in Table 7. High school types include public, Catholic, and other private high schools. Urbanicity indicates whether the school is in an urban, suburban, town, or rural area.
Classroom teachers likely have direct influence on their students' STEM ambitions as well. Aaronson et al. (2007) found math teacher quality was positively associated with students' math scores. Additionally, since academic interest is one of the main independent variables and previous research (Frenzel et al., 2010) indicates interest can be influenced by students' relationships with teachers, we include students' ratings of their experiences with their math/science teachers in our analyses. In the base-year student survey, these two variables were measured by nine items. We generated these variables through a similar process, testing Cronbach's reliability for students' experiences with their mathematics (α = 0.89) and science (α = 0.89) teachers (see details in Table 10 in Appendix). Higher values of each variable are associated with students having better experiences with their math/science teacher.
Previous studies have found that postsecondary institutions matter (e.g., Phelps et al., 2018;Wang, 2013b). We included students' first college type to account for postsecondary characteristics. This variable is a dichotomous variable, coded as 1 = 2-year or less and 0 = 4-year.

Limitations
Academic interests may be situational, as Hidi and Renninger (2006) note, and affected by external environments. It is possible that during the period of the base-year survey, students were interested in mathematics and science, but then lost interest before entering college, influencing their major choice. Our measures may not fully represent students' true academic interests at the point they were making college major decisions, which could bias the estimated effects of academic interests on STEM major choice. Indeed, research suggests deep engagement and/or passion in a STEM subject could be transformative to sustaining interest (e.g., Eccles & Wigfield, 2020; Schneider et al., 2016). Our assessment of this variable is limited to the effect of self-reported interest at the start of high school, which indeed may be malleable in accordance with the variables we measure and control for, including subsequent course taking and school context variables. Another limitation is the nature of secondary self-reported large-scale data. Our measure of student gender is drawn from a measure of biological sex (male/female) without options beyond this binary during the first three study waves. We note this again later in reporting the results. In addition, both perceived abilities and academic interests were self-reported by adolescents, which may influence the survey accuracy. Additionally, HSLS:09 is not explicitly designed for the measurement of psychological constructs and therefore may not be able to measure perceived abilities and academic interests fully. However, we argue that the choice of dataset allows us to combine reliable measures (as noted by the scores reported above) and a nationally representative longitudinal cohort to assess how interests and perceived ability in mathematics and science affect STEM choices over time.

Analytic approaches
We addressed our research questions and hypotheses through logistic regression due to the categorical nature of our dependent variables: binary logistic regression when predicting "STEM" major choice and multinomial logistic regression with our more detailed categorical dependent variable. Regression equations and corresponding hypotheses are described below. For the first research question, the dependent variable is students' college STEM choices. The main independent variables were perceived mathematical and scientific abilities and interests, while adjusting for other demographic, high school, and classroom characteristics. Two main hypotheses were proposed for the first research question:
H1a: Students with high perceived mathematical/scientific abilities are more likely to choose a STEM major.
H1b: Students with high interest in mathematics/science are more likely to choose a STEM major.
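To make the binary logistic regression step concrete, the following is a minimal sketch that fits a logit model by Newton-Raphson and converts coefficients to odds ratios. It is an illustration under stated assumptions, not the authors' procedure: the data are simulated, the predictor names and true coefficients are hypothetical, and the sketch ignores the survey weights, strata, and multiple imputation used in the study.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Fit a binary logistic regression by Newton-Raphson; an intercept is added."""
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # predicted probabilities
        w = p * (1.0 - p)                        # per-observation variance weights
        gradient = X.T @ (y - p)                 # score vector
        hessian = X.T @ (X * w[:, None])         # observed information matrix
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Simulated (hypothetical) standardized predictors; not HSLS:09 data.
rng = np.random.default_rng(0)
n = 5000
ability = rng.normal(size=n)
interest = rng.normal(size=n)
true_logit = -1.0 + 0.8 * ability + 0.2 * interest   # assumed effects
stem = rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))

beta = fit_logit(np.column_stack([ability, interest]), stem)
odds_ratios = np.exp(beta[1:])   # odds ratio per one-unit increase in each predictor
print(odds_ratios)
```

An odds ratio above 1 indicates that a one-unit increase in the predictor is associated with higher odds of the outcome, holding the other predictor constant, which is how the coefficients for H1a and H1b would be read.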
To test H1a, we started with an initial model that included only students' STEM major choice as the dependent variable and perceived mathematical/scientific abilities as the independent variables. Then, we added student characteristics, high school/college characteristics, and classroom characteristics one by one to test our hypotheses. For simplification, we present only the final model, Eq. (1), where STEM_i = whether student i chose a college major in STEM fields; PMA_i = the perceived mathematical ability of student i; PSA_i = the perceived scientific ability of student i; STC_i = a vector of student i's demographic information such as gender, race/ethnicity, parental education, parental occupation, family income, and student preparation and observed ability in math and science in high school; SC_i = a vector of student i's high school/college information including school type, school's urbanicity, and college type; and CC_i = a vector of student i's classroom experiences with their mathematics/science teachers. The procedure for testing H1b is similar to that used when examining the effect of perceived abilities on students' STEM major choice. The only difference is that the interest variables are substituted for the perceived ability variables, which gives our final model for H1b, Eq. (2).
The second research question assesses whether perceived abilities or interests are more predictive of students' STEM major choice. To address this research question, we entered both the perceived ability and the interest variables into one model, Eq. (3). Because the scale used for perceived abilities and interest is the same, we reported regression coefficients and odds ratios to compare the relative effects of mathematics and science perceived ability and interest variables on the dependent variable. In addition, we reported the model statistic (F-test) to compare overall model strength.
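A sketch of the three models described above (referred to as Eqs. (1) to (3) in the summary of the analytic approach), written to be consistent with the variable definitions given and assuming the standard logit link of binary logistic regression, $\operatorname{logit}(p) = \ln\frac{p}{1-p}$. The interest symbols $MI_i$ and $SI_i$ (mathematics and science interest of student $i$) and all coefficient labels are assumptions introduced here for illustration; they are not taken from the source.

```latex
\begin{align}
\operatorname{logit}\Pr(STEM_i = 1) &= \beta_0 + \beta_1 PMA_i + \beta_2 PSA_i
  + \boldsymbol{\beta}_3^{\top} STC_i + \boldsymbol{\beta}_4^{\top} SC_i + \boldsymbol{\beta}_5^{\top} CC_i \tag{1} \\
\operatorname{logit}\Pr(STEM_i = 1) &= \gamma_0 + \gamma_1 MI_i + \gamma_2 SI_i
  + \boldsymbol{\gamma}_3^{\top} STC_i + \boldsymbol{\gamma}_4^{\top} SC_i + \boldsymbol{\gamma}_5^{\top} CC_i \tag{2} \\
\operatorname{logit}\Pr(STEM_i = 1) &= \delta_0 + \delta_1 PMA_i + \delta_2 PSA_i + \delta_3 MI_i + \delta_4 SI_i
  + \boldsymbol{\delta}_5^{\top} STC_i + \boldsymbol{\delta}_6^{\top} SC_i + \boldsymbol{\delta}_7^{\top} CC_i \tag{3}
\end{align}
```

Under this form, the odds ratio for a predictor is the exponential of its coefficient, for example $e^{\beta_1}$ for perceived mathematical ability in Eq. (1).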
We additionally assessed whether perceived abilities and interest were too highly correlated with each other or otherwise linearly related. Specifically, we estimated an ordinary least squares model to check the variance inflation factor (VIF) for multicollinear relationships among our variables, before entering them into the logistic regression.
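The VIF check described above can be sketched in a few lines. This is a minimal illustration, assuming only NumPy and synthetic (made-up) data; the function name `variance_inflation_factors` and the simulated variables are hypothetical, and the sketch ignores the survey weights and multiple imputation used in the study.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = students, columns = predictors).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from an OLS regression of
    column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        # Design matrix: intercept plus every predictor except column j
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic data (not HSLS:09): two moderately correlated scales and one unrelated control
rng = np.random.default_rng(0)
ability = rng.normal(size=2000)
interest = 0.5 * ability + rng.normal(scale=np.sqrt(0.75), size=2000)
control = rng.normal(size=2000)

vifs = variance_inflation_factors(np.column_stack([ability, interest, control]))
print(np.round(vifs, 2))
```

A VIF near 1 means a predictor is nearly uncorrelated with the others; larger values signal that its coefficient's variance is inflated by overlap with the other predictors.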
The third research question examines whether the relationships between perceived abilities and interests and STEM major choices vary by gender and race/ethnicity. We generated multiple interaction terms to see whether these relationships vary by gender and race (Kaufman, 2018). We also estimated a model with state fixed effects, controlling for unobserved factors at the state level such as STEM employment opportunities and access to 4-year colleges. This model did not improve upon the prior model's explanatory power, nor did it change the pattern of the results. Therefore, to conserve space and focus on key findings, we do not report those results here.
To answer our last research question, we estimated multinomial logistic regression models with the same covariates used for research questions 2 and 3 but with a three-category rather than a binary dependent variable, where Non-STEM is the reference group. We then assessed how the results differ between the aggregated STEM outcome reported in Table 4 and the STEM major clusters reported in Table 6: Computing/Engineering Sciences, and Social/Behavioral, Natural, and Other Sciences.
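A sketch of the multinomial specification described above, assuming the usual baseline-category logit form with Non-STEM as the reference outcome. The outcome symbol $Y_i$, the interest symbols $MI_i$ and $SI_i$ (mathematics and science interest), and the coefficient labels are assumptions introduced for illustration.

```latex
\begin{align}
\ln\frac{\Pr(Y_i = c)}{\Pr(Y_i = \text{Non-STEM})}
  &= \beta_{0c} + \beta_{1c} PMA_i + \beta_{2c} PSA_i + \beta_{3c} MI_i + \beta_{4c} SI_i
   + \boldsymbol{\beta}_{5c}^{\top} STC_i + \boldsymbol{\beta}_{6c}^{\top} SC_i + \boldsymbol{\beta}_{7c}^{\top} CC_i, \\
c &\in \{\text{Computing/Engineering},\ \text{Other STEM}\}, \qquad \mathrm{RRR}_{kc} = e^{\beta_{kc}}
\end{align}
```

Each cluster $c$ gets its own coefficient vector, so a predictor can raise the relative risk of one cluster while lowering that of the other; a relative risk ratio above 1 indicates a higher relative risk of choosing cluster $c$ over a non-STEM major.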
In summary, RQ1 is assessed using Eqs. (1) and (2), and RQ2 is evaluated using Eq. (3). RQ3 is subsequently investigated by adding interaction terms to Eqs. (1) and (2). RQ4 is examined by changing the analytic approach used for RQ2 and RQ3 from binary to multinomial logistic regression while retaining the predictors, and by changing the dependent variable in Eq. (3) to the recoded three-category major variable.

Descriptive results
Table 1 reports the descriptive statistics for control variables for the weighted analytic sample. Approximately 53.6% of sampled students identify as women. White students accounted for 54.6% of the analytic sample, while Latina/o/x, Black, Asian, and Multiple/Other Race students accounted for 20.3%, 12.2%, 4.5%, and 8.5%, respectively. Notably, there was less than a one percentage point difference between the share of students' fathers (15.3%) and mothers (14.5%) working in STEM occupations. While on average most students completed Intermediate and many completed Advanced Math, most students had not completed a year of physics by the time they entered postsecondary school (as the mean of 0.5 is halfway between 0 = No Physics and 1 = General Physics). Even among students in our analytic sample of postsecondary degree declarers, this is a striking indicator of U.S. students' under-preparation in Physics, a key gateway course for most postsecondary STEM majors. With respect to school characteristics, 90.3% of students in the sample attended public high schools, and 23.9% of schools were in rural areas. In addition, students reported that they had good experiences with their mathematics and science teachers; the weighted means were both 3.1 on a scale where 4 means "strongly agree" and 1 means "strongly disagree".
Table 1 Descriptive statistics of control variables for the analytic sample with multiple imputation
The analytic sample size is n = 11,560. S.E. refers to Standard Error. Survey weights (w4w1stup1) are applied to account for student and parent nonresponse and to enhance external validity. This table uses percentages (0-100 scale) to describe the dichotomous variables defined in Table 7 in the Appendix, to ease interpretation; the analysis itself uses the original 0-1 scale. In addition, the means were rounded to the nearest tenth to comply with NCES restricted-data regulations. Missing data figures reported were generated from Stata 16's "misstable" command prior to imputation.
Zhao and Perez-Felkner International Journal of STEM Education (2022) 9:42
a meaningful, and positive, difference as compared to their peers. Next, we turned to our review of disparities in Computing/Engineering vs. Other Science major selection, which we found, similar to the pattern above, varies significantly by gender (χ² = 375.9, p < 0.001) and (less so) by race (χ² = 12.8, p < 0.01). Here, we filtered out non-STEM students to compare Computing/Engineering majors to those in the Natural Sciences (e.g., Biology, Physics), Social/Behavioral Sciences, and often applied Other Sciences (e.g., Agriculture, Architecture). We found that two-thirds of boys in STEM fields selected Computing/Engineering majors, as compared to just over one-quarter of girls. With respect to race, we found limited differences between groups in who selects Computing/Engineering, such that variation did not exceed a 13-percentage point difference (between Multiple/Other at 44.0% and Latina/o/x at 56.5%). Among the remaining groups, 55.0% of White students, 51.5% of Black students, and 50.4% of Asian students selected Computing/Engineering rather than Other STEM fields. Across these analyses, we found that race disparities existed but were less severe than the gender disparities.
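To make the chi-square comparisons above concrete, here is a minimal sketch of a Pearson chi-square test of independence on a two-by-two table. The counts are hypothetical and unweighted, chosen only to echo the reported shares of roughly two-thirds of boys and one-quarter of girls; the study's own statistics come from weighted, multiply imputed data, so this sketch will not reproduce them. The function name is an assumption.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts. Rows: boys, girls. Columns: Computing/Engineering, Other STEM.
counts = [[670, 330], [260, 740]]
stat, dof = chi_square_independence(counts)

# With exactly 1 degree of freedom, the p-value has a closed form via erfc
p_value = math.erfc(math.sqrt(stat / 2.0))
print(f"chi2 = {stat:.1f}, dof = {dof}, p = {p_value:.3g}")
```

The closed-form p-value applies only when `dof == 1`; tables with more categories (such as the five race groups) need a general chi-square distribution function from a statistics library.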
Table 2 Dependent variable distribution in the analytic sample with multiple imputation, by gender and race (N = 11,560)
Total Number refers to the total number of observations in each gender and/or race group. The survey weight (w4w1stup1) was applied to account for nonresponse and the stratified sampling design, to enhance the external validity. The N in each subgroup was rounded to the nearest ten and the percentages were rounded to the nearest tenth, to comply with NCES restricted-data regulations. The total number of cases in each subgroup may not be exactly equal to that of the analytic sample because of this rounding. Statistical significance within each gender and race group was assessed using Chi-square tests, with the test statistic and p-value reported for each variable.

Gender and race intersections: Differences in key psychological predictors
To better understand gender and race differences in students' perceived abilities and interests in mathematics and science, we constructed a series of gender-race subgroups, such as White male and Black female, and computed Hedges' g effect sizes for mean differences. Hedges' g is appropriate when the sizes of subgroups differ (Kemp and colleagues, 2010). Using White males as the reference group, results reported in Table 3 showed that White females had significantly lower perceived mathematical and scientific abilities, which was consistent with the Riegle-Crumb, King, et al. (2019) and Riegle-Crumb, Morton, et al. (2019) findings that males' math and science self-efficacies were statistically significantly higher than those of females. It might be that negative gender stereotypes socialize many girls to believe they are not good at mathematics and science (Martinot & Désert, 2007). Notably, Latino males' mathematics interest and Black males' interest in mathematics and science were significantly higher than those of White males, in line with Riegle-Crumb et al. (2010). However, Latino and Black males reported lower perceived scientific ability than their White male counterparts. This suggests that racially minoritized students may have strong interest but low self-efficacy in these STEM-related subjects. Similarly, Black females reported lower perceived scientific ability than White males (2.7 vs. 2.9; 0.43 SD difference). However, their mathematical interests were significantly higher than those of their White male peers (3.0 vs. 2.9; 0.26 SD difference). Our descriptive findings, which do not yet adjust for other factors, were consistent with previous findings that gender and racial-ethnic gaps exist in STEM fields; controls and covariates are added in the next steps of our analysis.
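For readers unfamiliar with Hedges' g, a minimal sketch of the standard formula follows: the mean difference divided by the pooled standard deviation, multiplied by a small-sample correction. The two means echo the rounded values quoted above, but the standard deviations and group sizes are hypothetical placeholders, so the result is illustrative and will not match Table 3. Survey weights and multiple imputation are ignored.

```python
import math

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g: standardized mean difference with a small-sample correction.

    Group 2 is treated as the reference group, so g < 0 means group 1 is lower.
    """
    # Pooled variance weights each group's variance by its degrees of freedom,
    # which is why unequal subgroup sizes are handled naturally.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    d = (mean1 - mean2) / math.sqrt(pooled_var)
    # Common approximation to the bias-correction factor
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return correction * d

# Means as quoted in the text (rounded); SDs and Ns are hypothetical.
g = hedges_g(mean1=2.7, sd1=0.50, n1=600, mean2=2.9, sd2=0.45, n2=3000)
print(f"Hedges' g = {g:.2f}")
```

Because the published means are rounded to the nearest tenth, effect sizes recomputed from them can differ noticeably from those calculated on the unrounded data.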

Effects of perceived abilities and academic interests on postsecondary STEM major
RQ1: Do perceived abilities and academic interests predict STEM major choice?
To measure the longitudinal relationship and potential interaction between students' postsecondary STEM major selection and their earlier perceived abilities and academic interests, we tested multiple models. Initial models included only the outcome variable and perceived mathematical and scientific abilities. Next, we added student characteristics, and then high school/college and classroom characteristics. Across all models, perceived mathematical and scientific abilities both positively influenced STEM major choice. For simplicity, Table 4 shows only the full models, which included student, high school/college, and classroom characteristics. Unstandardized coefficients were reported in addition to odds ratios, which indicated the effect of the predictor on the dependent variable (STEM major choice), such that the relative effect on the odds of the event occurring (choosing a STEM major) could be compared across the predictors, irrespective of the units of each predictor variable.
Beginning with Model 1, we found that a one-unit increase in students' perceived mathematical ability scale was associated with a 62% increase in the odds of choosing STEM disciplines, holding other variables constant. Similarly, a one-unit increase in students' perceived scientific ability scale was associated with a 61% increase in the odds of majoring in STEM fields. The odds of choosing STEM majors were 59% lower for HSLS cohort girls than for otherwise similar boys. Asian students had 1.49 times the odds of White students of majoring in STEM fields.
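The percentage statements above follow directly from the odds ratios: an odds ratio of 1.62 corresponds to a (1.62 − 1) × 100 = 62% increase in the odds, and the underlying logit coefficient is its natural logarithm. The short sketch below illustrates these conversions and shows that a change in odds is not the same as a change in probability. The baseline probability of 0.20 is a hypothetical value chosen for illustration, and the function names are assumptions.

```python
import math

def odds_ratio_to_percent_change(odds_ratio):
    """Percent change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def odds_ratio_to_logit_coefficient(odds_ratio):
    """Unstandardized logit coefficient b such that exp(b) equals the odds ratio."""
    return math.log(odds_ratio)

def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by the odds ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

for label, or_value in [("perceived math ability", 1.62),
                        ("perceived science ability", 1.61)]:
    pct = odds_ratio_to_percent_change(or_value)
    b = odds_ratio_to_logit_coefficient(or_value)
    print(f"{label}: OR = {or_value}, odds change = {pct:+.0f}%, b = {b:.3f}")

# Hypothetical baseline: a student with a 20% chance of choosing STEM
new_p = apply_odds_ratio(0.20, 1.62)
print(f"probability moves from 0.20 to {new_p:.3f}")
```

In this hypothetical case, a 62% increase in the odds raises the probability by about nine percentage points, not by 62%.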
Students' high school physics course completion significantly and positively predicted STEM major choice (OR = 1.72, p < 0.001). A significantly negative relationship was found between private schools and STEM degrees: the odds of choosing STEM majors were lower for students from private schools than for students from public schools (OR = 0.61, p < 0.001).
Next, we examined the effects of mathematics and science interest in high school on STEM major choice. The results of Model 2 in Table 4 show that both mathematics and science interests significantly and positively predicted students' STEM choices, with OR = 1.23, p < 0.05 and OR = 1.30, p < 0.05, respectively. A one-unit increase in mathematics interest was associated with a 23% increase in the odds of majoring in STEM, all else constant. A one-unit increase in science interest was associated with a 30% increase in the odds of choosing STEM majors. Gender remained significant and even more negative in this model, as girls saw a decrease of 64% in the odds of choosing STEM majors (p < 0.001). In addition, high school physics course completion was positively associated with STEM major choice (OR = 1.78, p < 0.001) as was, to a lesser degree, high school science GPA (OR = 1.34, p < 0.05). Again, attending a private school was (even more) negatively associated with STEM major choice (OR = 0.51, p < 0.001).
Table 3 Descriptive statistics for students' math/science-related characteristics: mean, standard deviations, and effect sizes
Standard deviations are in parentheses. Effect sizes were calculated as mean differences between two groups, using Hedges' g (reference = White male). An asterisk means the given mean is statistically significantly different at the 0.05 level. The survey weight (w4w1stup1) was applied to account for nonresponse, to enhance external validity, and multiply imputed data are used. To comply with NCES restricted-data regulations, means were rounded to the nearest tenth, and sample Ns were rounded to the nearest ten. When comparing subgroup means, some significantly different means may appear identical because of rounding. Source:

RQ2: Comparison testing: Perceived abilities or academic interests?
Research Question 2 examined whether perceived abilities or interests more strongly predict students' STEM major choice, drawing on the results of Table 4's Model 3. Multiple tests yielded the same answer. Looking across our three models (perceived abilities only (1), interests only (2), and abilities + interests (3)), the F-tests all showed a significant predictive relationship with STEM major choice. However, the perceived abilities only model was the strongest (F = 26.1), and the interests only model was the weakest (F = 19.1). Next, the independent variables' t-test results also favored perceived abilities over interests as predictors of STEM major choice. When both perceived abilities and interests were examined together in Model 3, only perceived mathematical/scientific abilities remained significant, positively predicting STEM degree major fields. Adding mathematics and science interests to the model only slightly diluted the predictive power of perceived mathematical and scientific ability in Model 3 as compared to the simpler Model 1. In both cases, the ability measures were significant beyond the p < 0.001 level, with changes being less than 0.05 in magnitude for the slope coefficient and less than 0.10 for odds ratios. In summary, the limitations of these measures notwithstanding, we found that perceived abilities mattered more than interests as predictors of students' STEM degrees, all else equal. Before closing our discussion of the independent effects of abilities and interests on STEM major choice, we added an additional analysis to confirm this finding and assessed the degree to which perceived abilities and interests might be highly correlated or non-linear in their relationship. The correlations between perceived mathematical/scientific abilities and math/science interests were both 0.51, moderate correlations as suggested by Evans (1996).
Next, we estimated an OLS regression model to obtain the variance inflation factors (VIF), to rule out any potential multicollinearity. As seen in Table 5, all the VIFs were less than 3; therefore, multicollinearity was not a concern (Thompson et al., 2017).
Having ruled out such concerns, we could then infer with confidence that perceived abilities were stronger and more important predictors of students' STEM majors than academic interests. This finding recalled our earlier descriptive finding that there was more significant race-gender variation in perceived ability than in academic interest (Table 3). Altogether, we found that perceived abilities played a more crucial role than academic interests when students chose their college majors.

RQ3: Do these relationships vary by gender and race?
To examine whether these relationships between perceived abilities, interests, and STEM major choice varied by gender and race, we added interaction terms into the respective models. All other variables' predictive relationships with STEM degree fields remained consistent with Model 1 and Model 2 as reported earlier in Table 4. After estimating these two models with gender and race interaction terms, significant interaction terms were found for gender only. Specifically, we found a positive main effect for male (b = 3.19, p < 0.001) and perceived science ability (b = 0.71, p < 0.001), and a negative interaction term (b = −0.04, p < 0.05). This indicated that the nature of these relationships varied to a degree by gender (favoring boys) but not by race. We investigated further gender and race interactions in response to Research Question 4.

RQ4: Assessing Computing/Engineering major choice vs. other STEM fields
Given the heterogeneity of disciplinary fields represented within "STEM", we examined how sensitive our earlier findings were to an alternative specification with a three-level dependent variable. Multinomial logistic regression was used to assess the effects of mathematical/scientific perceived abilities and interests on the choices of postsecondary degrees in Computing/Engineering fields, Other STEM fields, and non-STEM fields (the reference group). We reported the findings from the full model in Table 6, using Model 3 from Table 4 (abilities + interests) to investigate differences between the predictive patterns.
Table 5 Correlations and multicollinearity between perceived abilities and interests
N = 11,560. Variance inflation factor (VIF) was obtained from OLS regression to test for potential multicollinearity. N was rounded to the nearest ten, to comply with NCES restricted-data regulations.
The primary findings held in this more nuanced analysis. Academic interest did not reach significance for either STEM major cluster. Perceived ability remained highly significant as a predictor of students' attainment of Computing/Engineering degrees (RRR math ability = 1.63, p < 0.01; RRR science ability = 1.50, p < 0.001) and other STEM fields (RRR science ability = 1.50, p < 0.001). The gender disparity in STEM varied widely and indeed flipped depending on which STEM cluster we examined: girls were far less likely than boys to major in Computing/Engineering rather than a non-STEM field (RRR = 0.25, p < 0.001), but far more likely to major in other STEM fields, which were generally less applied and less sex-segregated, all else equal (RRR = 2.35, p < 0.001). Race mattered differently in each cluster, and only in part.
Identifying as Latina/o/x as compared to White was positively associated with Computing/Engineering major choice (RRR = 1.48, p < 0.05), and identifying as Asian was positively associated with STEM major choices in the comparatively less applied Other STEM category (RRR = 1.65, p < 0.001). Private school attendance remained a significant and negative predictor, but only for Computing/Engineering majors (RRR = 0.68, p < 0.01). High school science GPA was predictive only of the non-Computing/Engineering STEM cluster (RRR = 1.25, p < 0.01), and course taking was not significant for either category.
Returning to Research Question 3, we report on the interaction models with figures, to show more clearly the predictive and in some cases intersecting relationships between demographic characteristics, abilities, and interests across these major clusters. More specifically, we used the postestimation margins command in Stata 16 to estimate from the interaction model students' predicted probabilities of choosing a STEM major. Figures 2 and 3 show whether the relationships between perceived abilities/interests and STEM major choice varied for men and women. The slopes did not vary by gender for mathematics/science interest but did vary for perceived mathematical and scientific abilities. In other words, the relationship between mathematics/science interests and STEM major choice (specifically, Computing/Engineering and Other Sciences) did not differ for boys and girls. However, perceived mathematical and scientific ability significantly moderated the relationship between gender and the choice of Computing/Engineering (vs. Non-STEM) degrees, whereby higher perceived ability had a greater effect on boys than girls (p < 0.05 for each). The interaction with Social/Behavioral, Natural, and Other STEM fields (vs. non-STEM) was significant in the other direction, where perceived mathematics ability had a more positive effect on girls than boys (p < 0.05).
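To illustrate what differing slopes mean in an interaction model, the sketch below computes predicted probabilities from a binary logit with an ability × gender interaction. All coefficients are hypothetical values chosen only to produce a weaker ability slope for girls; they are not estimates from this study. The sketch also evaluates the model at fixed values rather than averaging over a sample as Stata's margins command can, and it uses a binary rather than a multinomial model for brevity.

```python
import math

def predicted_probability(ability, female, b0, b_ability, b_female, b_interaction):
    """Predicted probability from a logit model with an ability x female interaction."""
    linear = (b0
              + b_ability * ability
              + b_female * female
              + b_interaction * ability * female)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients (illustration only, not estimates from the study)
coefs = dict(b0=-3.0, b_ability=0.6, b_female=-0.5, b_interaction=-0.2)

# Perceived ability on a 1-4 agreement scale
for ability in (1, 2, 3, 4):
    p_boys = predicted_probability(ability, 0, **coefs)
    p_girls = predicted_probability(ability, 1, **coefs)
    print(f"ability = {ability}: boys {p_boys:.3f}, girls {p_girls:.3f}")

gain_boys = predicted_probability(4, 0, **coefs) - predicted_probability(1, 0, **coefs)
gain_girls = predicted_probability(4, 1, **coefs) - predicted_probability(1, 1, **coefs)
print(f"gain from ability 1 to 4: boys {gain_boys:.3f}, girls {gain_girls:.3f}")
```

With a negative interaction coefficient, the same increase in perceived ability produces a larger gain in predicted probability for boys than for girls, which is the pattern of non-parallel lines that such figures are designed to reveal.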
Figures 4 and 5 in Appendix show predicted probabilities by race, by degree field cluster. These lines consistently had similar trends and slopes. These figures supported our null finding: the predictive relationship between perceived abilities and interests on STEM major choice was stable across students' self-identified race categories.
We added one additional robustness check on our findings: a full re-analysis of our models with the NCES-generated scales for perceived abilities and interests, to assess the merits of our predictors as compared to those made freely available to users of the HSLS dataset (mathematics and science self-efficacy, respectively: x1mtheff, x1scieff; mathematics and science interest, respectively: x1mthint, x1sciint). We did so to add further transparency for any studies that might employ these measures for replication or cross-cohort comparisons. The core findings aligned, but our predictor- and model-level statistics were stronger than those using the NCES constructs. For instance, Table 4's full model with perceived abilities and interests had an F-test of 24.3 as compared to its counterpart in the NCES-variable analysis, with an F-test of 19.3. Similarly, Table 6's F-test of 19.5 was stronger than that of the comparative model, which had the same N but an F-test of 17.3. We did not report these findings here given the already extensive series of tables and figures included in the manuscript and its supplement. These checks did enhance our confidence in this study's design.
In summary, across our analyses, we found that perceived abilities and interests could significantly and positively predict students' STEM major choice. Notably, perceived mathematical and scientific abilities seemed to be a better predictor than mathematics and science interests. Finally, we found that these relationships did not vary by race. However, the relationship between perceived mathematics and science ability and STEM major choice did vary by gender, and with distinct effects depending on how STEM was defined and categorized.

Discussion
Given persistent disparities in STEM postsecondary majors and degrees, it is imperative to identify malleable factors that may welcome rather than deter students from majoring in these key fields. Our descriptive analyses of U.S. students reported in Tables 1 and 2 show a clear gender gap and more nuanced racial disparities in STEM major choice. The effect size analysis suggests that while Black and Latino male students reported higher mathematics and/or science interest as compared to White male peers, women and some racially minoritized groups still represent a small share of STEM majors. Further exploration of the data reveals that besides perceived abilities and academic interests, students' observed mathematics and science abilities (grades) and father's STEM occupation also predict their STEM major choices. Two policy recommendations emerge from these findings. First, we find that the gender gap in STEM fields remains severe. The results reported in Table 3 indicate that every female subgroup had lower perceived mathematical and scientific abilities than their corresponding same-race male peers (e.g., Black girls had lower perceived abilities than Black boys).
Table 6 Predictive relationships between perceived abilities, interests, and STEM major clusters
^p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001. Standard errors are shown in parentheses. Multinomial logistic regression results are reported. RRR represents relative risk ratios, where values greater than one represent an increase in the relative probability of an event occurring. N was rounded to the nearest ten to comply with NCES restricted-data regulations. Source: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS:09), Base Year, Student Survey, 2009; Parent Survey, 2009; Update, Second Follow-up, Student Survey, 2016; Common Core of Data, Private School Survey, 2005
This is in line with extant research showing girls' self-ratings are lower even when test scores are identical to those of boys. Traditional social norms such as "girls are not as good as boys at math" may continue to influence girls' cognitive development (Harro, 2000). In our analytic model results, girls are less than half as likely as boys to major in STEM, three-quarters less likely to major in Computing/Engineering fields, and more than twice as likely to major in other STEM fields.
Notably, in the moderating effect we observe with our significant interaction terms, where perceived mathematical and scientific ability change the nature of the relationship between gender and major choice, we see the status quo enhanced, not lessened. More positive perceptions of one's ability promote STEM, Computing/Engineering, and Other STEM major choice for both girls and boys, as we see in Tables 4 and 6 and Fig. 2. However, boys see the greatest gains in STEM and Computing/Engineering major choice (as compared to non-STEM fields), even though they are already overrepresented in these fields. Girls experience greater gains than boys in the relationship between perceived scientific ability and Social/Behavioral, Natural, and Other STEM fields (see Fig. 2). But while these patterns support individual students, at the aggregate level they reinforce rather than undo existing gender distinctions. It may be that gendered (and racialized) social norms mute the potential effect of academic interest and perceived abilities for students who do not regularly encounter reinforcing motivational forces in their secondary and postsecondary schooling environments.
Socializing contexts in schools, families, and broader culture could be sites to undo the status quo, to allow girls to believe (as boys seem to) that they are positioned to be successful in mathematics and science classrooms and careers that employ these skills. This may be especially important in public schools, which enroll most U.S. students, as shown in our descriptive results. Our findings indicate the chances of majoring in STEM are considerably higher in public than in private high schools, even after controlling for state fixed effects as we did in an earlier analysis. Public high schools appear especially poised for potential interventions to increase STEM interest, which may also foster higher perceived mathematics and science ability in high school and subsequent postsecondary STEM majors. Given the positive association between high school physics course taking (at higher levels, and indeed at any level) and STEM major choice, making these foundational courses available to students, irrespective of demographic background, appears essential to allow them the opportunity to enter and complete STEM majors across disciplines. Studies of 2-year colleges also suggest both opportunities and challenges for STEM equity in this sector, as gender gaps remain in mathematics-intensive STEM fields and research attending to gender and race together finds additional challenges for racially minoritized women transferring from 2-year to 4-year college STEM programs (Allen et al., 2022).
Our second policy recommendation focuses on financial support for STEM-aspiring students. With respect to socioeconomic challenges that disproportionately affect students of color, St. John and Asker (2003) illustrated that many students with academic preparation cannot enter college because of financial constraints. Quadlin (2017) found that students without financial burdens were more likely to choose liberal arts majors in college, while low-income students may be motivated to pursue college majors associated with higher paying careers, including STEM. However, STEM majors may require additional time and coursework for students whose high schools did not offer as many opportunities, which may present a financial burden. Public and institutional policy interventions can address such challenges by enhancing access to financial aid for underrepresented students to pursue STEM careers, with benefit for larger economic opportunity.

Conclusion
Using the most recent nationally representative dataset, this study examines the longitudinal effects of self-assessed perceived ability and academic interest in mathematical and scientific domains on high school students' postsecondary STEM major choice. Major findings include that (1) perceived mathematical and scientific abilities positively predict students' STEM major choice; (2) higher mathematics and science interests increase the likelihood of majoring in STEM fields, but less so than perceived mathematical and scientific abilities; and (3) these relationships vary by gender but do not consistently vary by race/ethnicity. Together, these findings suggest that high school educators ought to focus more on encouraging students in mathematics and science courses and activities, to enhance their perceived mathematical and scientific abilities, ultimately positioning them to potentially enter STEM fields. Additionally, public schools organized to support the ambitions of STEM students of all backgrounds may be better positioned to reduce postsecondary disparities in STEM major choice.
Continuous variable S1MPERSON1, S1MPERSON2, S1MUNDERST, S1MTESTS, S1MTEXTBOOK, S1MSKILLS, S1MASSEXCL
Math observed ability (GPA)

Math interests
Student's math interests, which is measured by three items using Cronbach's alpha Continuous variable S1MENJOYING, S1MWASTE, S1MBORING