Inclusion in practice: a systematic review of diversity-focused STEM programming in the United States


Colleges across the United States have shown a commitment to advancing diversity in the STEM fields by creating programs aimed at improving outcomes of women and/or racially and ethnically minoritized students. However, most existing literature focuses on the successes of singular college programs rather than comparing these STEM interventions across the higher education landscape. This systematic review investigates the literature on diversity-focused “STEM intervention programs” (SIPs) at the postsecondary level. We categorize key features of these programs and their outcomes, and we look at which program components have the most empirical support. We examine 82 articles that reported on SIPs with disaggregated outcomes, coding each initiative’s features and outcomes. Across these articles, we found six common program components, with most programs including more than one component, and five common program outcomes. Just 53 articles tested differences in outcomes of participants relative to a comparison group. This subset of research found support for the effectiveness of all coded components for improving student outcomes, though studies of multi-component programs did not parse the relative contributions of each component. Based on these findings, we conclude multi-component interventions that create a welcoming environment and focus on the successes of minoritized students help redress existing institutional shortcomings and are a promising step towards diversity, equity, and inclusion in STEM. However, more rigorous quantitative studies are needed to empirically assess the effectiveness of individual SIP program components.


The lack of gender and racial parity in the field of STEM has been studied for more than four and a half decades (Kanny et al., 2014; National Center for Science and Engineering Statistics [NCSE], 2021; Ong et al., 2011). Yet students belonging to certain ethnic and racial groups—including Latinx, Indigenous, and Black/African-American students—still earn a disproportionately low percentage of bachelor’s degrees in STEM fields in comparison to their representation in the general population of the United States (Fry et al., 2021; NCSE, 2021). Black and Latinx STEM majors transfer out of those majors more often than White STEM students (Flynn, 2016). Furthermore, while women earn almost equal numbers of science undergraduate degrees as men, the number of women awarded degrees in “hard science” fields like computer science and engineering is very low (Fry et al., 2021; NCSE, 2021). Black and Latinx workers and women have historically been underrepresented in STEM occupations, and they are particularly underrepresented in the highest-earning STEM occupations in tech, computer science, and engineering (Funk & Parker, 2018; Muro et al., 2018; U.S. Bureau of Labor Statistics, 2021). In turn, the racial/ethnic and gender wage gaps are even larger in STEM occupations than in non-STEM occupations (Funk & Parker, 2018). Women of color are affected by racial and gender trends simultaneously, a phenomenon labeled within STEM literature as “the double bind” (Hall & Sandler, 1982; Ong, et al., 2011). The Pew Research Center reports that “[c]urrent trends in STEM degree attainment appear unlikely to substantially narrow these gaps”, even “amid longstanding efforts to increase diversity in STEM” (Fry et al., 2021, pp. 4–5).

The enduring struggle to diversify STEM fields necessitates continued research into diversity-focused interventions. As Kanny et al. (2014) note, the fact that research on diversity, equity, and inclusion (DEI) in STEM has continued for so long without resolving underrepresentation points to a major gap in both literature and praxis. This article examines one prevalent tactic used throughout the United States to encourage STEM persistence: “STEM intervention programs” (SIPs), college programs dedicated to helping students historically underrepresented in STEM to prepare for and graduate from STEM fields (Rincón & George-Jackson, 2016a, p. 743, 2016b). In this article, we refer to groups that have been historically underrepresented in STEM, particularly in technology, computer science, and engineering, as minoritized populations, where minoritized describes groups with STEM outcomes tied to their experiences of historical and present marginalization by dominant group members (Chase et al., 2014, p. 671).

While STEM programming is different at every institution, certain program features and components appear repeatedly across the higher education landscape (Castro, 2014; Chubin et al., 2015; Rincón & George-Jackson, 2016a, 2016b; Tsui, 2007). However, for many years there has been a lack of up-to-date, comprehensive reviews of STEM college programming in the United States, especially ones that look into these different features. Thus, the purpose of this article is to systematically review DEI-focused STEM interventions, categorize these programs’ features and outcomes, and hypothesize patterns of association between STEM programming reported in the literature and outcomes for minoritized populations in STEM, particularly in technology, computer science, and engineering. We ask: what are the features of STEM programs that produce positive outcomes for underrepresented minorities? We argue that diversity-focused STEM programs have clear features associated with successful outcomes and that overall, DEI-focused STEM programs show promise in fighting the discriminatory environments and lack of institutional support in many college STEM departments.

Literature review

Institutional failures to support diversity in STEM

Educational institutions serve as a primary site of experiences and opportunities that shape students’ STEM interest, efficacy, and outcomes (Fouad & Santana, 2017; Lent & Brown, 2019). A large body of literature has documented ways educational institutions have failed to support STEM diversity at the postsecondary level. This literature has focused on how academic environments cause students of color and women to feel excluded, how schools provide insufficient academic preparation to minoritized youth, the prevalence of overly complex institutional course structures and financial aid requirements, and other institutional shortcomings.

Numerous studies have found that the climate of STEM higher education programs is often unwelcoming for certain minoritized populations. The “chilly climate” theory (Hall & Sandler, 1982) was coined almost 40 years ago and continues to be discussed in contemporary literature on higher education (Bottia et al., 2021; Giles, 2015; Lee & McCabe, 2020; Morris, 2003; Rincón & George-Jackson, 2016a; Rolin, 2008). According to this theory, minoritized students in STEM may experience discrimination in almost every aspect of college life, from interactions with peers to faculty to administrators, due to a “chilly” culture which tacitly allows discrimination and hostility towards these students (Bottia et al., 2021; Giles, 2015; Kanny et al., 2014; Lee et al., 2020; McGee, 2020; Ong, 2005; Rolin, 2008). For example, Lee and McCabe (2020) found that “gendered expectations”, including those perpetuated by professors, may lead to an environment where female students speak less and/or more hesitantly during STEM classes (p. 50). Although the phrase “chilly climate” originally aimed to encapsulate women’s negative experiences in higher education, it has been expanded to include male students from minoritized ethnic and racial groups as well (Bottia et al., 2021; Giles, 2015; Hall & Sandler, 1982; Harper, 2012). Building on this work, Harper (2012) and Giles (2015) assert that higher educational environments are characterized by discriminatory actions and barriers better labeled as racist and sexist than by euphemistic terms like “chilly”. Lord et al. (2009) add, “If the climate has been characterized as ‘chilly’ for women […] the terrain is ‘icy’ for minority women” (p. 170). Many negative behaviors that contribute to a chilly environment for women and racial and ethnic minorities also fall under the category of micro-aggressions, shown by Lee et al. (2020) to be prevalent in higher education STEM spaces. Whether this phenomenon is called chilly climate, the discriminatory environment of STEM, or something else, it has a noticeably negative effect on minoritized students in STEM majors.

A second approach posits that minoritized students leave STEM because they lack sufficient academic preparation. There are two ways to think about this approach: through a deficit lens, and through an anti-deficit lens. Deficit thinking is described by Valencia (2010) as “an endogenous theory—positing that the student who fails in school does so because of [their] internal deficits or deficiencies” (pp. 6–7). Deficit thinking steers higher education institutions to hold minoritized students liable for their insufficient preparation in STEM prior to college, ultimately victim-blaming students and faulting them for institutions’ own lack of diversity (Castro, 2014; Harper, 2010, 2012; McGee, 2020; Valencia, 2010). Anti-deficit thinking recognizes that in educational institutions in the United States, “[r]acialized opportunity structures lead to racialized academic achievement patterns”, which includes “school failure” from students both before and during college (Valencia, 2010, pp. 2–3). For example, students in racial and ethnic minoritized groups are more likely to attend high schools that prepare them inadequately for college-level academics (Bound et al., 2009; Ciocca Eller & DiPrete, 2018; Deil-Amen & DeLuca, 2010; Jennings et al., 2015). A lack of funding for racially minoritized students’ K-12 schools could affect their pre-college exposure to fields like computer science or to high-level coursework in STEM (Bottia et al., 2021; Byrd, 2020). Academic under-preparation is also due in part to academic curricular tracking. A large body of literature has found that between-class sorting based on perceived academic ability disproportionately channels racial minority students into low-level academic coursework, where they perform worse than their peers in heterogeneously grouped classes (Gamoran, 2009; Loveless, 2009; Oakes, 1985; Rosenbaum, 1976; Rui, 2009). Rather than “pathologiz[ing]” students, anti-deficit thinking asks for reflection and action from education institutions on their role in ensuring the retention of minoritized students in STEM (Castro, 2014, p. 415).

Even when they are equally academically prepared, Black and Latinx students are more likely than White students to attend open-access community colleges and less likely to attend selective 4-year colleges (Carnevale et al., 2018). Black and Latinx students are then disproportionately assigned to remedial coursework, which increases the time and cost of degree completion (Palmer et al., 2010; Sanabria et al., 2020). Open-access public colleges also receive much less in state appropriations than the selective public colleges that White students are more likely to attend. Researchers have found that lower institutional resources are associated with lower degree completions (Bound et al., 2009). Due to their more ample funds, selective public colleges are able to offer higher-quality instruction, advising, and other student support services (Brock, 2010; Carnevale et al., 2018). Open-access community colleges, on the other hand, often have severely limited advising services that are difficult for students to access (Rosenbaum et al., 2017). Community college students who aim to attain a bachelor’s degree in STEM must navigate complex institutional requirements that hamper many students’ efforts to make the transition to a 4-year school (Jenkins & Fink, 2016).

A lack of institutional support may also encompass other aspects of higher education for students. For example, because of a paucity of women and people of color in STEM faculty roles, students of color and female students may not see themselves in the upper echelons of STEM and may experience a lack of support and mentoring from institutional figures (Espinosa, 2011; McGee, 2020). Espinosa (2011) also faults colleges, particularly “predominantly White, large public research institutions”, for perpetuating “impersonal, large classrooms; unapproachable professors; and competitive grading practices resultant from a system that actively attempts to ‘weed’ students out of STEM majors” (p. 214), while McGee (2020) suggests that “Eurocentric” STEM departments sustain a culture of “meritocracy”, “unrelenting competition”, and more (p. 634). These systemic cultural facets of university STEM departments may be particularly discouraging for minoritized populations. For example, among students with low performance in introductory college STEM courses, racial/ethnic minority and female students were less likely than White male students with similar performance to complete a STEM degree (Hatfield et al., 2022). Finally, complex financial aid requirements reduce rates of college completion (Dynarski & Scott-Clayton, 2013; Ciocca Eller & DiPrete, 2018). STEM programs may suffer if they do not have consistent and “strategic” institutional support in the form of funding for both students and program administrators—especially if these SIPs are aimed at low-income or first-generation students (Chubin et al., 2015; Linley & George-Jackson, 2013, p. 101; National Center for Education Statistics, 2019; Rincón & George-Jackson, 2016).

STEM intervention programs

Postsecondary STEM intervention programs (SIPs) are designed to address underrepresentation in STEM. Rincón and George-Jackson (2016a) describe SIPs as “supplemental programs offered by colleges and universities to attract, retain, and support traditionally underrepresented students” (p. 743). Although many SIPs are dedicated to increasing diversity in STEM, the institutional rationale behind these programs varies (George et al., 2019). For example, a qualitative study of 39 SIPs by George et al. (2019) lists “recruitment and retention”, “external funding opportunit[ies]”, and the documented achievements of other SIPs as common motivators for universities to enact STEM programming for their students (p. 1654). While this article focuses on features and outcomes of STEM programs, it is important to note that the reasons programs are initiated are likely to have effects on their results (George et al., 2019). Additionally, although certain programs have better reputations and are more widely studied than others (George et al., 2019, p. 1646), geographic and community context is crucial to the formation, running, and discussion of each SIP (Chubin et al., 2015, p. 275; Lent & Brown, 2019).

SIPs have drawn a number of critiques. For example, George et al. (2019) express skepticism towards STEM initiatives solely focused on increasing statistical representation in the student body, as “the presence of individuals from particular backgrounds does not necessarily result in salient markers of postsecondary success, such as students’ inclusion, sense of belonging, or persistence” (p. 1652). Achieving these less tangible markers of meaningful postsecondary STEM diversity may require broader institutional reform. Scholars including López et al. (2022), McGee (2016, 2020), Miriti (2020), Robinson (2022), and Whittaker and Montgomery (2012) have argued that student-focused interventions, independent of broader systemic transformation around higher education’s biased culture, values, structures, and practices, are insufficient for, or even distracting from, broad and sustained progress toward diversity, equity, and inclusion in STEM. According to this viewpoint, efforts toward systemic change require actively confronting the ways dominant cultural biases are embedded in higher education’s research, teaching, and service. Such biases may shape perceptions of which research agendas are legitimate, which activities are most rewarded (where publications and grants outrank mentoring and service toward diversity efforts), and which faculty members’ perspectives are considered in the construction of institutional policy (Whittaker & Montgomery, 2012, pp. 239–240). SIPs and other interventions that target students rather than higher educational institutions themselves are unlikely to move the needle on these cultural and social dynamics. As Linley and George-Jackson (2013) state, “Programs that seek to repair students rather than initiate institutional change will fail to contribute to the social change that is needed to include and advance underrepresented students in the STEM fields” (p. 100).

Additionally, theoretical frameworks that have been used to understand student persistence in STEM, such as the widely used social cognitive career theory (Fouad & Santana, 2017; Lent & Brown, 2019), emphasize that learning experiences like SIPs are only one piece of a much broader puzzle. Students’ interest in and persistence in STEM are shaped by their feelings regarding their own self-efficacy and possible outcomes in STEM, which in turn are influenced by personal factors and contextual factors such as community access to supports and information about STEM, experiences with family and friends, and so forth. Learning experiences are generally not designed to impact these other important influences.

Despite these critiques and considerations, previous reviews and syntheses of STEM program features have documented the benefits of SIPs for women and/or racial and ethnic minorities. One of the most comprehensive articles may be the work of Tsui (2007), who divides the STEM program features reported in the literature into ten distinct categories: summer bridge, mentoring, research experience, tutoring, career counseling and awareness, learning center, workshops and seminars, academic advising, financial support, and curriculum and instructional reform. Their study reports that mentorship and research experience are some of the most commonly reported STEM program features, although the most successful programs are those which provide the best comprehensive support through multiple features (Tsui, 2007, p. 21). Tsui’s (2007) categories have parallels in other articles exploring the features of SIPs. For example, in Rincón and George-Jackson’s (2016b) examination of 48 STEM programs, the authors classify these programs’ “services” as academic advising, financial support, professional development, exposure to STEM, social interaction, structured learning and tutoring, hands-on experience and research, residential experiences, recruitment, and mentoring and networking (p. 433); and George et al. (2019) use almost identical program categories. The National Academies of Sciences, Engineering, and Medicine (2016) breaks down SIPs (which they call co-curricular programming) and their features into the categories of internships, summer bridge programs, student professional groups, peer tutoring, living and learning environments, and comprehensive interventions (pp. xviii, 95–102). Most recently, Pearson et al. (2022) detailed features and outcomes of 25 STEM intervention programs for low-income, first-generation, and underrepresented student groups. The authors focused on STEM programs related to engineering, and they categorized 13 program features: recruitment; professional development/networking; research experiences; tutoring and study skills; targeted academic interventions; graduate school/GRE prep; mentoring; social integration experiences; community service; summer bridge transition programs; experiences influencing character traits; and financial support. They found that these features were correlated with positive outcomes, including year-over-year retention, graduation rate, grade point average, and students’ beliefs linked to persistence.

The current study

In this study, we build upon previously published reviews of STEM programming, developing a comprehensive and up-to-date view of DEI-focused SIPs. Like Tsui (2007) and Pearson et al. (2022), we review STEM programs that center racially minoritized students. We also include programs that target women, focusing specifically on programs to improve DEI in technology, computer science, and engineering, the STEM fields in which future disparities in career outcomes may be critical, as discussed in the introduction. While previous literature has introduced categorizations of program features, few expanded on their labeling schema to the same level as Tsui (2007), and only one has categorized program outcomes (Pearson et al., 2022). While Pearson et al. (2022) provide an up-to-date review of diversity-focused STEM interventions, our study differs in several important ways: we include studies from a broader range of years as well as studies of technology and computer science programs, and we limit our focus to the undergraduate level.

Here, we categorize SIP program features and outcomes, characterizing each in-depth and describing how they relate to one another. We frame program features as components of institutional supports for individual students and outcomes as evidence of institutional supports, recognizing that the student-centered nature of SIPs may limit the extent to which programs directly address the roots of systemic institutional failures.


In this article, we conducted a systematic review (Alexander, 2020; Booth et al., 2016) of 82 qualitative, quantitative, and mixed-methods studies published between 1991 and 2020 that report on STEM programs at 4-year U.S. colleges. The central research question for this article was: what are the features of STEM programs that produce positive outcomes for underrepresented minorities? Our goal was to describe and synthesize this literature using systematic procedures, as outlined below (Alexander, 2020; Booth et al., 2016). Given weaknesses in the underlying literature, this review does not contain a meta-analysis of the literature. We do not estimate the size or directionality of associations between STEM program components and student outcomes, as this kind of analysis was not appropriate for this literature set. Next, we detail the methods for our selection, coding, and analysis of the literature reviewed in this study.

Criteria for inclusion of literature

After defining our research questions, we outlined search criteria and criteria for inclusion. Our original criterion for literature was that studies should be focused on a technology, computer science, or engineering education program. For the purposes of this article, we defined a “program” similarly to Rincón and George-Jackson (2016a, 2016b), whose definition of a SIP is incorporated into the literature review. Upon finding few studies that focused exclusively on technology, computer science, or engineering, we expanded our search to STEM in general, seeking to include programs that may have addressed the fields of technology/computer science/engineering but used the term “STEM” to describe their programs. Programs narrowly focused on specific disciplines within the “science” or “math” realms of STEM (e.g., environmental science, chemistry) were excluded, as the information yielded in such studies may not apply to our interest area of technology, computer science, and engineering. Studies also needed to speak directly about a program or intervention, or specific program or intervention features, rather than about general practices that could be used in any program or initiative. Next, we limited our search to articles that were published in peer-reviewed journals and that focused on the postsecondary education level, and we later limited our analysis to undergraduate education due to a lack of literature on graduate student and workforce programming. We only included programs implemented in community college settings if they were also implemented at or in partnership with a 4-year institution. We also restricted the search to the United States to collect a cohesive body of evidence in a national context, and we narrowed our review to studies that included some documentation of outcomes. These outcomes could be either quantitative or qualitative, but they needed to be present in some form. Studies reporting any outcomes were included, even if those outcomes were unrelated to our constructs of interest. Program descriptions without outcomes, policy pieces, and editorials were excluded.

Articles included in the review were required to have disaggregated data for the minoritized groups of interest. This meant one of several things: either the program was geared towards or targeting students from a specific minoritized group, so outcomes were implicitly disaggregated by group; demographic data for participants were reported and included some participants from minoritized groups; or outcome data were disaggregated based on some sort of demographic data for the minoritized groups. The last major requirement for studies was that the program or intervention was directly student-focused; faculty- or institution-focused interventions that may or may not have indirect effects on students via faculty or institutional behaviors were excluded.

Search and eligibility

We searched the Education Resources Information Center (ERIC) database on both the ProQuest (covers 1966–2021) and EBSCOHost (does not specify dates covered) search engines to find the most comprehensive body of literature possible. We generated search terms by brainstorming possible iterations of the terms STEM or technology education, postsecondary education, and diversity, and we ran these terms through the databases in various combinations using Boolean shortcuts and syntax. NOT terms were added when searches yielded too many articles displaying characteristics that did not meet eligibility criteria (e.g., NOT K-12). See Additional file 1: Appendix A for a comprehensive list of search strings.
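As a rough illustration of how concept terms can be combined with Boolean operators, the sketch below assembles one query string in Python. It is only a sketch: the helper names (`quote`, `or_group`, `build_query`) are hypothetical, the generic AND/OR/NOT syntax is an assumption rather than the exact syntax of either search engine, and the resulting string is not one of the actual search strings, which are listed in Additional file 1: Appendix A.

```python
def quote(term):
    """Wrap a term in double quotes so multi-word phrases stay together."""
    return f'"{term}"'


def or_group(terms):
    """Join alternative terms for one concept with OR, inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(concept_groups, exclude=()):
    """AND the concept groups together, then append one NOT clause per excluded term."""
    query = " AND ".join(or_group(group) for group in concept_groups)
    for term in exclude:
        query += " NOT " + quote(term)
    return query


# Concept terms named in the text; the grouping itself is an assumption.
concepts = [
    ["STEM", "technology education"],
    ["postsecondary education"],
    ["diversity"],
]

query = build_query(concepts, exclude=["K-12"])
print(query)
```

Keeping each concept in its own OR group makes it easy to add or drop variants of a term without rewriting the whole string.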

All articles were imported into RefWorks citation management software, and all screening for eligibility was done by one author. These searches initially yielded 9,187 articles after removing all duplicates. Due to the high volume of articles, we utilized the tag function of RefWorks to remove articles including tags that would exclude them from eligibility. Once studies were flagged by the tag function, the screener scanned all titles and saved those they thought might be pertinent to the review. See Additional file 1: Appendix B for a complete list of tags searched and removed, along with articles saved by title search. The remaining articles underwent abstract scanning by one author based on the predetermined eligibility criteria. After abstract scanning, 144 articles remained. Two authors reviewed the 144 articles’ contents, and articles found to be ineligible through full text review during this phase were then excluded. An additional three articles were excluded because we were unable to access the full text. Two additional articles were added at this stage; these articles were referenced by one of the excluded articles, and we found that they fit our criteria. After the full text of the articles was analyzed by the authors, 93 studies were found to be eligible for this review of literature. Because we then limited the scope of this article to undergraduate education, the final number of studies included was 82.
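The screening counts above can be tallied in a few lines of Python. The stated figures come directly from the text; the three derived figures are inferences that the article does not report, and they assume that the two articles added from reference lists are counted among the 93 eligible studies.

```python
# Counts stated in the text.
after_deduplication = 9187
after_abstract_screening = 144
full_text_unavailable = 3
added_from_references = 2
eligible_all_levels = 93
final_undergraduate_only = 82

# Derived counts (inferences, not reported in the article).
removed_before_full_text = after_deduplication - after_abstract_screening
excluded_at_full_text = (
    after_abstract_screening
    - full_text_unavailable
    + added_from_references
    - eligible_all_levels
)
excluded_not_undergraduate = eligible_all_levels - final_undergraduate_only

print("Removed by tag, title, and abstract screening:", removed_before_full_text)
print("Excluded as ineligible at full-text review:", excluded_at_full_text)
print("Excluded when scope narrowed to undergraduates:", excluded_not_undergraduate)
```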


Three of the authors coded the studies in an Excel spreadsheet using a binary coding scheme. We devised the coding scheme using the PICOS framework, which typically stands for Participants, Intervention, Comparison, Outcomes, and Study Design (Pollock & Berge, 2018; Methley et al., 2014). In coding participant data for the studies, we coded for the program’s intended target population and then documented the number of participants, participant race and ethnicity, and gender. When participants included multiple racial groups, as was the case with most articles, all groups mentioned in the study were included. When participant race or gender was not specified, we recorded this with a separate code. We also coded for several other factors that emerged during our analysis, including whether participants were considered low-income or were first-generation college students. While we focused on articles that targeted minoritized groups, especially in terms of race and ethnicity and gender, we also included articles that disaggregated for or noted characteristics of students such as first-generation, low-income, disabled, and academically at risk.

With regard to the interventions covered in the literature, our coding focused on documenting the type of program. Specifically, we coded for types of activities or components implemented as part of the programs or interventions. We separated these programs into areas of study: engineering; computer science; or STEM in general, defined as programs that targeted STEM students but did not specify subject areas. We also coded whether a program targeted subjects in addition to engineering, computer science, or STEM in general (such as other science or math courses). These categories were maintained throughout coding, and when multiple areas of emphasis were noted in the study, all categories were indicated. We coded for outcomes and for how each study addressed the counterfactual: whether it compared participants to a control or comparison group, and whether it calculated the statistical significance of the difference between groups. This specific coding was an iterative process which followed the model of hybrid coding (Braun & Clarke, 2012). We also recorded study design based on type of study: qualitative or quantitative.

All three coders coded 19 of the 82 articles, and inter-rater reliability (IRR) for coding of program interventions, outcomes, and statistical significance was calculated using Fleiss’ kappa. Kappa was 0.61, indicating substantial agreement (Landis & Koch, 1977). Disagreements were discussed between the coders, and a consensus was reached for all disagreements. Coding was then revised to reflect consensus decisions, but these revisions were not included in the calculation of IRR. The remaining 63 studies were split among the three coders. When questions about coding arose, the coders met to resolve confusion and recoded data as necessary.
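For readers unfamiliar with the statistic, the following minimal Python sketch shows how Fleiss’ kappa can be computed for binary codes assigned by three coders, and how a value maps onto the Landis and Koch (1977) bands. The ratings are invented for illustration and are not the study’s data; the function names are hypothetical, and the study’s actual calculation tool is not specified in the text.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for a list of items, each a list of codes (one per rater).

    Assumes every item was coded by the same number of raters.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # For each item, count how many raters chose each category.
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # Observed agreement per item, then averaged over items.
    p_items = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance, from overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1:  # every rating falls in one category; kappa is undefined
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


def landis_koch_label(kappa):
    """Agreement band from Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical binary codes (1 = component present) from three coders on six items.
example = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 1, 0], [1, 1, 1], [0, 0, 0]]
kappa = fleiss_kappa(example)
print(round(kappa, 3), landis_koch_label(kappa))
```

Under these bands, the reported kappa (61%, that is, 0.61) sits at the lower edge of the “substantial” range.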

Program component descriptions

In order to determine what commonly found program components looked like in practice, we gathered all articles containing each specific component that found statistically significant results in one or more areas, and we analyzed the descriptions of each specific component using deductive coding (Braun & Clarke, 2012). We present a narrative summary of this qualitative review in our results.


These results are organized by the elements of the PICOS framework examined in this study. See Additional file 1: Appendix C for details on individual articles.


Articles often reported on the number of participants in the actual program (or programs) being studied, but they more frequently and reliably provided the number of participants in the studies of the program (which might include control groups). The reported program sizes ranged from quite small (serving 16 or 19 students, for instance) to quite large (serving thousands of students). Table 1 shows the range of participant numbers in studies of the programs. In interpreting Table 1, please note that while 52 articles in our review contained a single study of a program or program set, 30 articles contained multiple studies of a program or program set, each with slightly different samples. Thus, Table 1 reports the frequency for each study, meaning the total N is higher than the 82 total articles in our review.

Table 1 Frequency of participants in studies analyzed

These sample sizes ranged from 17 to 12,000 participants. (We cannot calculate an exact mean, as studies sometimes reported sample sizes in a fashion that allowed general inference about size, but not a specific n.) Sample sizes did relate to the kinds of studies conducted. The 30 articles that contained multiple studies used both quantitative and qualitative methods 70% of the time, while articles with single studies used both methods 10% of the time. Indeed, articles with single studies were predominantly quantitative only (75%); both multiple- and single-study articles tested for statistical significance at fairly similar rates (70% and 62%, respectively). Similarly, articles that included small samples of study participants (< 50) were much more likely to use both methods (60%) than articles with only large samples (23%). Articles with larger samples tended to use solely quantitative methods (69% of the time) and to test for significance more frequently than small-sample articles (73% and 40%, respectively). This pattern makes sense, as many articles conducted quantitative analyses with large numbers of program participants, then conducted more qualitative analyses with a subsample.

Table 2 reports the frequencies of articles (N = 82) that reported participants in each gender and race/ethnicity category. Most articles described programs that served both genders, and it was also most common that programs served multiple racial/ethnic groups. In addition, 13 articles reported participants’ age, while 69 did not. For the articles that reported age, all 13 had participants between 17 and 25 years old, while 5 had additional participants beyond that age range. Eleven articles reported that some or all of their participants were from low-income households, two articles included some or all participants with disabilities, and 19 articles reported that some or all participants were first-generation college students.

Table 2 Gender and race or ethnicity in articles in the systematic review


The 82 articles in this review examined a wide range of programs. Seventy articles examined a single program, while 12 articles examined a set of programs deemed similar in some regard. A handful of programs were examined by multiple articles, led by the Meyerhoff Scholars Program (9 articles), but most articles focused on unique programs. As noted previously, some articles examined the program(s) with one study/sample (N = 52), while others examined the program(s) with multiple studies/samples (N = 30), often using mixed methods (i.e., a quantitative look followed by a qualitative look with fewer participants). However, no articles examined multiple programs separately; indeed, all articles that examined multiple programs did so with one combined study/sample.

Out of the total number of articles (N = 82), 47 studied engineering programs, 35 studied computer science or technology programs, 14 studied general STEM programs, and 52 studied programs that targeted some other combination of STEM disciplines. Since many programs had more than one disciplinary target, the total in the previous sentence is greater than 82; for instance, an article that looked at a program for engineering and computer science students would be counted twice—once for engineering and once for computer science.
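The double counting described above can be illustrated with a short sketch. The three article records and their discipline labels below are hypothetical and are not drawn from the review's dataset; the sketch only shows that when one article can carry several codes, the per-category frequencies sum to more than the number of articles.

```python
from collections import Counter

# Hypothetical coded articles (illustrative only, not the review's data).
# Each article maps to the set of disciplines its program targeted.
articles = {
    "article_a": {"engineering", "computer science"},
    "article_b": {"engineering"},
    "article_c": {"general STEM"},
}


def tally_disciplines(coded):
    """Count how many articles carry each discipline code."""
    counts = Counter()
    for disciplines in coded.values():
        counts.update(disciplines)
    return counts


counts = tally_disciplines(articles)
total_codes = sum(counts.values())

# article_a is counted once under each of its two disciplines,
# so the sum of frequencies exceeds the number of articles.
print(counts["engineering"], total_codes, len(articles))
```

The same logic applies to any multi-label coding in the review (target populations, outcomes, program features): frequencies describe codes, not articles.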

Articles most commonly were focused on programs targeting students who were underrepresented in STEM on the basis of race or ethnicity (N = 38), while a substantial number focused on women (N = 27) and generalized “underrepresented minorities” (N = 20). Some studies (N = 16) concentrated on another target population, such as students who were low-income, first-generation, or academically underprepared. Many articles (N = 19) studied programs with multiple targets (making the total count greater than 82). There were also several programs without a target population (N = 9), but they were included because they disaggregated the data for different student populations.


Table 3 documents the most frequently reported outcomes among articles included in this review. This does not differentiate which outcomes had better results, only the frequency with which they were found in the scope of the review. In interpreting Table 3, note that 56 out of the 82 articles included multiple outcomes, so the N is much higher than 82.

Table 3 Most frequently studied program outcomes in the literature

Of the 82 articles included in the review, 72 used quantitative methods and 36 used qualitative methods, with 26 using both methods. One major finding of this review pertains to the quality of the quantitative research base on this topic. Of the 72 articles with a quantitative component, only 53 measured statistical significance in relation to a null hypothesis based on some sort of comparison or control group, such as through longitudinal designs or pre–post analyses, and none used other statistical methods to evaluate outcomes.
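As a quick consistency check on the counts in this paragraph, the overlapping method tallies can be split into mutually exclusive groups. The sketch below uses only the figures reported above; the function and variable names are illustrative, and it assumes every article used at least one of the two methods.

```python
# Counts reported in the text above.
TOTAL_ARTICLES = 82
QUANTITATIVE = 72
QUALITATIVE = 36
BOTH = 26
MEASURED_SIGNIFICANCE = 53


def method_breakdown(quant, qual, both):
    """Split overlapping method counts into mutually exclusive groups."""
    return {
        "quantitative_only": quant - both,
        "qualitative_only": qual - both,
        "mixed": both,
    }


breakdown = method_breakdown(QUANTITATIVE, QUALITATIVE, BOTH)

# Inclusion-exclusion: the exclusive groups should add up to all articles.
assert sum(breakdown.values()) == TOTAL_ARTICLES

# Share of quantitative articles that measured statistical significance.
share_tested = MEASURED_SIGNIFICANCE / QUANTITATIVE
print(breakdown)
print(f"{share_tested:.1%}")
```

Under these figures, 46 articles were quantitative only, 10 were qualitative only, and 26 were mixed, and roughly 74% of the quantitative articles measured significance.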

Table 4 displays information about significant findings for Retention Outcomes, Academic Outcomes, and Psychological Outcomes. For each of these outcomes, we present number of articles that measured significance, number and percentage of those articles that reported positive significance, and the frequency of features reported for the program(s) studied in those articles. Note that many of the articles measured more than one of the outcomes. Because there were not enough studies to make meaningful comparisons in articles measuring the statistical significance of Graduate School Outcomes and Employment Outcomes, those articles are not included in Table 4 (changing the total number of articles to 47).
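The per-outcome summary described above (number of articles measuring significance, and the number and percentage reporting positive significance) could be computed along the following lines. This is a minimal sketch: the records are hypothetical, the field names are assumptions, and it is not the coding procedure used in the review.

```python
from collections import defaultdict

# Hypothetical coding records (illustrative only, not the review's data).
# One record per article-outcome pair that measured significance.
records = [
    {"article": "a", "outcome": "retention", "positive": True},
    {"article": "a", "outcome": "academic", "positive": False},
    {"article": "b", "outcome": "retention", "positive": True},
    {"article": "c", "outcome": "retention", "positive": False},
    {"article": "c", "outcome": "psychological", "positive": True},
]


def summarize(rows):
    """Return {outcome: (n_measured, n_positive, pct_positive)}."""
    measured = defaultdict(int)
    positive = defaultdict(int)
    for row in rows:
        measured[row["outcome"]] += 1
        positive[row["outcome"]] += int(row["positive"])
    return {
        outcome: (n, positive[outcome], round(100 * positive[outcome] / n, 1))
        for outcome, n in measured.items()
    }


summary = summarize(records)

# Articles that measured several outcomes appear in several rows,
# so the per-outcome counts sum to more than the distinct article count.
distinct_articles = len({row["article"] for row in records})
print(summary["retention"], distinct_articles)
```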

Table 4 Overview of articles that reported significance for each outcome

It is important to keep in mind that the findings reported by these articles, as a group, should not be interpreted causally. Few articles employed experimental or quasi-experimental designs, and most programs that were studied had multiple components—meaning that the components were studied together, not individually. In interpreting these results, we note that publication bias may have prevented studies that found null or negative results from being included in this review. In turn, the positive associations found should be interpreted cautiously (Torgerson, 2006). A version of Table 4 in which positively significant findings are related to features can be found in Additional file 1: Appendix E. Because many of the features were combined in programs, we believe such a breakdown can be slightly misleading, and thus do not include it in the main text.

Finally, Table 5 delineates the most common program features found as a part of our review and the frequency with which they showed up in the 82 articles without accounting for effectiveness or outcomes. The categories of program features will be explained according to their descriptions in the existing literature. Again, note that the sum of frequencies is greater than 82 because most articles (N = 67) studied programs with multiple features.

Table 5 Common program features in the literature

Skill building

Skill building refers to opportunities for students to apply academic or professional skills in context. In the existing literature, skill building programs were most frequently present in the forms of undergraduate research and service learning.

Most undergraduate research programs were carried out during the summer (Dunn et al., 2018; Hrabrowski & Maton, 1995; Huziak-Clark et al., 2015; Kassaee & Rowell, 2016; Maton et al., 2000; Pender et al., 2010). These programs usually lasted around 8 to 10 weeks (Dunn et al., 2018; Huziak-Clark et al., 2015) and required a full-time commitment of 40 h per week during that time (Huziak-Clark et al., 2015). However, there were several undergraduate research programs that ran during the school year and required students to participate for around five hours per week (Fisler et al., 2000; Windsor et al., 2015). These programs were generally staffed by existing university or college faculty (Baron et al., 2020; Dunn et al., 2018; Fisler et al., 2000; Huziak-Clark et al., 2015; Kassaee & Rowell, 2016; Windsor et al., 2015). Although it was not specified in many cases, it seems that most students were placed into research teams on their own campus, though some programs placed students at other universities, government, and corporate research sites (Hrabrowski & Maton, 1995; Maton et al., 2000; Pender et al., 2010). Programs with skill building components might include workshops or classes on research skills (Baron et al., 2020; Fisler et al., 2000) in addition to more hands-on research activities. Some programs placed students into existing faculty research projects (Fisler et al., 2000), with assignment based on student interests (Huziak-Clark et al., 2015). Many programs offered participants compensation or a stipend for the time they spent working on these projects (Dunn et al., 2018; Fisler et al., 2000; Kassaee & Rowell, 2016; Windsor et al., 2015).

The other way skill building frequently played a part in undergraduate SIPs was in the form of service learning. In the existing body of literature, service learning was often built into for-credit courses as a part of a program (D’Souza et al., 2018; Liou-Mark et al., 2018). According to Howard (2001), service learning includes three main parts: “relevant and meaningful service with the community”, “enhanced academic learning”, and “purposeful civic learning” (p. 15). Examples of service-learning activities include peer leadership on campus (Liou-Mark et al., 2018) and participating in STEM outreach programs in K-12 schools (D’Souza et al., 2018).

Supplemental learning

Supplemental learning refers to opportunities for learning content, academic skills, or professional skills outside of regular university programming. Supplemental learning was presented in several ways in the established body of research, including workshops and seminars, tutoring, supplemental instruction, and learning communities.

Many SIPs included workshops or seminars for program participants. The content of these activities varied, but often included instruction on study skills and learning strategies as well as college life skills (Dunn et al., 2018; Kassaee & Rowell, 2016; Lisberg & Woods, 2018; Van Sickle et al., 2020). Workshops and seminars also featured guest speakers and information on different careers or opportunities (Allen, 1999; D'Souza et al., 2018; Dunn et al., 2018; Gibson et al., 2019; Huziak-Clark et al., 2015; Van Sickle et al., 2020). The frequency of these events varied by program. When scheduling was specified in the research, it was most commonly documented that they occurred on a monthly basis (D'Souza et al., 2018; Dunn et al., 2018; Huziak-Clark et al., 2015).

Tutoring is another activity that falls under the umbrella of supplemental learning. Though many studies reported that programs included or required a tutoring component, most did not disclose detailed information on these activities. However, some studies reported that tutoring was staffed by graduate assistants or peer tutors (Dagley et al., 2016; D'Souza et al., 2018; Pender et al., 2010). Supplemental instruction was another frequently observed supplemental learning activity. It was sometimes required for program participants or was graded and attendance-based (Peterfreund et al., 2008; Van Sickle et al., 2020). Supplemental instruction sessions may have been staffed by peer leaders who had done well in the class previously (Archat-Mendes et al., 2019; Van Sickle et al., 2020). When documented in the studies, supplemental instruction took up 90 min (Peterfreund et al., 2008) or 150 min (Van Sickle et al., 2020) per week.


Mentorship

Mentorship was an integral part of many programs examined in the scope of this review. This included peer, faculty, and professional mentoring. Peer mentors in these programs were frequently upperclassmen who were alumni of the programs themselves (Good et al., 2002; Huziak-Clark et al., 2015; Ikuma et al., 2019; Lisberg & Woods, 2018). Configurations and models of peer mentoring varied greatly among programs, from one-on-one mentoring (Huziak-Clark et al., 2015) to one mentor per ten students (Dunn et al., 2018). Peer mentors’ roles included providing social-emotional support (Dunn et al., 2018), sharing their own experiences (Lisberg & Woods, 2018), and leading workshops or supplemental instruction for mentees (Liou-Mark et al., 2018; Van Sickle et al., 2020).

Mentorship by faculty members was often part of undergraduate research experiences (D'Souza et al., 2018; Huziak-Clark et al., 2015; Maton et al., 2000; Pender et al., 2010). Faculty took on roles in which they provided social, emotional, and practical support to their mentees (D'Souza et al., 2018; Estrada et al., 2018). Mentors also served as sources of formal or informal academic advising (Dagley et al., 2016; D'Souza et al., 2018; Dunn et al., 2018). Mentorship by professionals in industry was a part of two programs (Hrabrowski & Maton, 1995; Ikuma et al., 2019; Maton et al., 2000).


Socializing

Social components were described in less detail than other types of program components. However, from the body of literature, we concluded that these types of activities may include cultural events (Allen, 1999; Hrabrowski & Maton, 1995; Maton et al., 2000), dinners (Allen, 1999; Good et al., 2002), field trips (Gibson et al., 2019; Windsor et al., 2015), and networking events (Windsor et al., 2015). One program, the Meyerhoff Scholars Program, included family members as part of its program community by inviting them to events (Hrabrowski & Maton, 1995; Maton et al., 2000). The frequency of social activities varied greatly, from weekly (Good et al., 2002) to once per semester (Van Sickle et al., 2020). A notable strategy utilized by one SIP was to recruit upperclassmen to serve as leaders during networking events and interact with underclassmen (Windsor et al., 2015).

Learning communities were also included in several programs. Study groups were often a large part of these communities (D'Souza et al., 2018; Pender et al., 2010; Windsor et al., 2015). Several learning communities provided an option to live together in the same dorm community, also known as living-learning communities (Allen, 1999; Fisler et al., 2000; Szelényi & Inkelas, 2011). Learning communities seemed to be a larger framework into which other types of program components were integrated.

Financial aid

Of the programs that offered some sort of financial aid or incentive, aid was primarily offered in the form of scholarships or stipends. Although some studies of programs that included scholarships did not specify the amount of aid given, those that did were primarily focused on the Meyerhoff Scholars Program, which provides a full academic scholarship, room and board, and coverage of books and fees for participating students, on the condition that they maintain a B average and a science or engineering major (Hrabrowski & Maton, 1995; Maton et al., 2000; Pender et al., 2010). Furthermore, one SIP based scholarship amounts on a sliding scale dependent on students’ scores on application criteria (D'Souza et al., 2018). Those programs that included stipends varied in the amount of support provided, from $250 (Baron et al., 2020) to $3500 (Dunn et al., 2018), depending on the amount of work or time commitment expected in return. Several programs’ stipends were dependent on participation in program activities or research work (Baron et al., 2020; Dunn et al., 2018; Fisler et al., 2000; Lisberg & Woods, 2018).

Bridge programs

Bridge programs are programs that occur between students leaving their last institution (e.g., high school, community college) and courses beginning at their new institution. These programs often included many of the same components as STEM programs at large. Some contained for-credit coursework (Hrabrowski & Maton, 1995; Maton et al., 2000), while others included mock classes (Lisberg & Woods, 2018; Murphy et al., 2010) or intensive instruction in one or more areas (Huziak-Clark et al., 2015; Kassaee & Rowell, 2016). Some programs also included social events (Hrabrowski & Maton, 1995; Maton et al., 2000). These programs were often staffed by both faculty and peer mentors or coaches (Huziak-Clark et al., 2015; Lisberg & Woods, 2018; Murphy et al., 2010). The duration of these bridge programs varied from 4 days (Fisler et al., 2000) to 5 weeks (Murphy et al., 2010), but the most common duration was 2 weeks (Kassaee & Rowell, 2016; Lisberg & Woods, 2018; Van Sickle et al., 2020).

Discussion and implications

This study began by asking the question: what are the features of STEM programs that produce positive outcomes for underrepresented minorities? We systematically reviewed 82 published articles on STEM intervention programming in the United States. Studies focused particularly on the fields of technology, computer science, engineering, or STEM in general, and they disaggregated information on students’ gender and/or racial or ethnic identity. Like Tsui (2007), George et al. (2019), Rincón and George-Jackson (2016), and Pearson et al. (2022), this article categorizes the common features of STEM intervention programs. We found six groups: supplemental learning, mentorship, skill building, financial aid, socializing, and bridge programs. All of these components can be considered institutional supports to address prior educational system failures, where failures include excluding underrepresented minorities from STEM environments (Bottia et al., 2021; Giles, 2015; Hall & Sandler, 1982; Harper, 2012; Kanny et al., 2014; Lee & McCabe, 2020; Lee et al., 2020; Lord et al., 2009; McGee, 2020; Morris, 2003; Ong, 2005; Rincón & George-Jackson, 2016a); inadequately preparing them for rigorous coursework (Bottia et al., 2021; Bound et al., 2009; Byrd, 2020; Ciocca Eller & DiPrete, 2018; Deil-Amen & DeLuca, 2010; Gamoran, 2009; Jennings et al., 2015; Loveless, 2009; Oakes, 1985; Rosenbaum, 1976; Rui, 2009; Valencia, 2010); imposing arduous course requirements coupled with inadequate advising (Brock, 2010; Carnevale et al., 2018; Palmer et al., 2010; Rosenbaum et al., 2017; Sanabria et al., 2020); and maintaining burdensome financial aid processes for low-income students, who are disproportionately students of color (Ciocca Eller & DiPrete, 2018; Dynarski & Scott-Clayton, 2013). Additionally, we categorized commonly reported program outcomes into five groups: retention or graduation rate, academic outcomes, psychological outcomes, graduate school admission or intent, and employment.

Only about three-quarters of the quantitative articles included in this review (53 of 72) used statistical techniques to evaluate their outcomes. Articles may not have measured or reported how participant outcomes fared relative to comparison students for a number of reasons, including low numbers of participants, poorly matched comparison groups, or a lack of statistically significant findings. In turn, the sample of articles we used to evaluate program effectiveness may be biased, such as by having an overrepresentation of articles with positive findings. We also note that the majority of the articles we reviewed measured correlations rather than causal relationships between program features and student outcomes. In turn, we document these correlations but are not able to provide evidence that program components directly cause any of the outcomes observed. These findings point to the need for more rigorous quantitative methodological designs that evaluate the effectiveness of various program features, implemented independently and in combination with one another.

When limiting our analyses to articles that evaluated participant outcomes relative to a null hypothesis based on comparison students, we found that each category of features showed promise for improving outcomes for minoritized students in STEM, as all were included in studies that found statistically significant positive outcomes. One reason these program features appear to be successful at achieving positive outcomes may be a negation or softening of STEM’s chilly climate. Because these programs are dedicated to uplifting minoritized students, students have the chance to socialize and learn with others who share their experiences with and feelings about STEM (Tsui, 2007). For example, in Ramsey et al.’s (2013) study of the University of Michigan’s Women in Science and Engineering program, the researchers found that “environmental reminders of ingroup success made women seem more prevalent in STEM careers and reduced participants’ stereotyping concerns” (p. 393). The committed support for diverse students within university STEM programming could explain why these programs are successful at achieving their outcomes, as well as showcase a possible solution to helping students feel like they belong in STEM spaces.

The STEM program features studied here may also be successful because they are dedicated to improving academic preparation and providing other student supports (Valencia, 2010). STEM programs with features like financial aid, supplemental learning, skill building, and bridge programming give students educational and institutional support beyond the norm. The Meyerhoff Program, mentioned above, is one such example. By providing financial support that is contingent on high grades, the Meyerhoff Program and others like it aid students while driving them to succeed. Meyerhoff participants are also provided access to tutors, optional study groups, a preparatory summer bridge program, and academic counselors, ensuring that students do not have to struggle by themselves to meet high program expectations (Maton et al., 2000; Tsui, 2007). This initiative is just one example of how colleges can use their resources to address systemic institutional failures and help students in STEM to thrive.

By showing the success of program features at achieving positive outcomes for students from diverse backgrounds, this review provides evidence that SIPs can help students who historically have been insufficiently supported in STEM, particularly the technology, computer science, and engineering fields, to persist and achieve. Therefore, colleges devoted to diversity in STEM fields should consider creating or expanding these STEM-focused programs. With technology and science as omnipresent as they are, helping present and future college students explore their passions for STEM is more critical than ever (Bottia et al., 2021; Funk & Parker, 2018; U.S. Bureau of Labor Statistics, 2021).

However, the slow progress at improving diversity in STEM over the past 40 years (Hall & Sandler, 1982; Kanny et al., 2014; NCSE, 2021; Ong et al., 2011) despite the increasing prevalence of SIPs (Rincón & George-Jackson, 2016a, 2016b) suggests that scaling these programs alone is insufficient to achieve equitable representation. The program features identified do not show the full breadth of actions that program-running institutions can and should take to promote diversity in STEM and tackle discrimination in the world of academia and beyond (Allen-Ramdial & Campbell, 2014; BrckaLorenz et al., 2021; George et al., 2019; McGee, 2020). While this systematic review primarily focuses on programs implemented within existing systems or institutions, what may really be necessary to remedy this issue sustainably is a fundamental change in the way that these systems and institutions operate (López et al., 2022; McGee, 2016, 2020; Miriti, 2020; National Academies of Sciences, Engineering, & Medicine, 2016; Robinson, 2022; Whittaker & Montgomery, 2012).


This review has several limitations. For example, we may have passed over interventions that were very effective or promising because the article studying the program did not provide indicators to assess effectiveness. Publication bias may also have been a significant limitation in our systematic review. There is a documented tendency for journals to publish studies that have positive results (Torgerson, 2006). We could be missing out on a well-rounded body of literature because studies deeming STEM program practices to be ineffective may not be published.

Because this review focused specifically on undergraduate programs, we cannot be sure that DEI programs housed within other types of postsecondary programs would have the same outcomes. Similarly, because we grouped different populations together in our analysis, we are unable to tease apart relationships between program features and outcomes for specific minoritized subgroups. In turn, not all findings may be applicable to all groups underrepresented in STEM (Chubin et al., 2015). SIP program features that are effective overall may not be effective for certain racial or ethnic groups or for women, and SIPs that are effective for White women may not be helpful for women of color because of “the way in which gender operates together with race” (Lord et al., 2009). As an example, Lord et al. (2009) note, “Women in engineering do not necessarily share common experiences of marginality. For example, women of color may experience both sexism and racism, compounding their experiences of exclusion.” Therefore, STEM programs serving women of color must actively work to address “the double bind” of oppression that these students face, or else they may fail or only partially succeed (Hall & Sandler, 1982; Lord et al., 2009; Ong et al., 2011).


This systematic review yields several implications for practice. The first is that all the program components included in this review show promise for improving outcomes for minoritized students in undergraduate STEM programs. Supplemental learning and mentorship appeared most often in articles showing positively significant findings, but skill building, socializing, bridge programs, and financial aid also show potential for success. These features were less common overall, but studies of programs that included them often found positive results. The majority of programs covered in this systematic review had multiple program components. As others have argued (Tsui, 2007), we believe it is likely that providing a wide range of features in a program increases participant success, in part because such programs seem to address multiple institutional failures underlying underrepresentation in STEM rather than just one. Programs focused on minoritized students in STEM can address not only academic preparation, but also issues of campus climate and culture as well as institutional structures such as admission policies and distribution of resources.

Future research should examine not only the efficacy of individual components, but also the efficacy of components as they interact. More research should be done specifically on financial aid, socializing, and bridge programs as interventions for minoritized groups in STEM. These components show promise in the existing research, but further study would help to support or refute their efficacy. Another implication for research relates to the quality of the body of literature we found. A greater number of high-quality quantitative studies on SIPs must be published in order to promote best practices and ensure the successes of future generations of minoritized students in STEM. This means using the necessary statistical methods to support conclusions about programs and interventions, something that is not prevalent in the current body of research about STEM programming. Finally, researchers should examine how SIPs and other student-focused interventions are implemented alongside, or instead of, interventions that address institutions’ systemic biases and barriers to inclusion.


In this article, we presented an updated systematic review of 82 articles about diversity-focused STEM programs and their features and outcomes. The aim of this review was to answer the question: what are the features of STEM programs that produce positive outcomes for underrepresented minorities? Following in the footsteps of prior literature reviews on this topic, we created new categories for STEM program features, and went a step further to classify commonly studied outcomes of these programs. We found that the program features examined here—supplemental learning, mentorship, skill building, financial aid, socializing, and bridge programs—represent various forms of institutional support for STEM students, and all have demonstrated associations with positive outcomes for SIP participants. Thus, students struggling in STEM due to an unwelcoming climate, inadequate prior academic preparation, or other institutional shortcomings may find their retention, academic success, and psyche boosted after participating in an SIP. Although the interventions investigated throughout this review were successful, more work needs to be done to enhance our understanding of how to promote and sustain equity in STEM for minoritized students.

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.



Abbreviations

AAPI: Asian-American and Pacific Islander

DEI: Diversity, equity, and inclusion

IRR: Interrater reliability

ERIC: Education Resources Information Center

PICOS: Participants, Intervention, Comparison, Outcomes, and Study Design

SIP: STEM intervention program

STEM: Science, technology, engineering, and math


Citations with an asterisk (*) next to them were analyzed during the systematic review process. A complete list of the articles included in the systematic review can be found in Additional file 2.

  • Alexander, P. A. (2020). Methodological guidance paper: The art and science of quality systematic reviews. Review of Educational Research, 90(1), 6–23.

  • Allen, C. (1999). Wiser women: Fostering undergraduate success in science and engineering with a residential academic program. Journal of Women and Minorities in Science and Engineering, 5(3), 265–277. *

  • Allen-Ramdial, S. A., & Campbell, A. G. (2014). Reimagining the pipeline: Advancing STEM diversity, persistence, and success. BioScience, 64(7), 612–618.

  • Archat-Mendes, C., Anfuso, C., Johnson, C., Shepler, B., Hurst-Kennedy, J., Pinzon, K., Simmons, R., Dekhane, S., Savage, J., Sudduth, E., D’Costa, A., Leader, T., Pursell, D., Runck, C., & Awong-Taylor, J. (2019). Learning, leaders, and STEM skills: Adaptation of the supplemental instruction model to improve STEM success and build transferable skills in undergraduate courses and beyond. Journal of STEM Education: Innovations and Research, 20(2), 14–23. *

  • Baron, S. I., Brown, P., Cumming, T., & Mengeling, M. (2020). The impact of undergraduate research and student characteristics on student success metrics at an urban, minority serving, commuter, public institution. Journal of the Scholarship of Teaching and Learning, 20(1), 85–104. *

  • Booth, A., Sutton, A., & Papaioannou, D. (2016). Systematic approaches to a successful literature review. Sage.

  • Bottia, M. C., Mickelson, R. A., Jamil, C., Moniz, K., & Barry, L. (2021). Factors associated with college STEM participation of racially minoritized students: A synthesis of research. Review of Educational Research, 91(4), 614–648.

  • Bound, J., Lovenheim, M., & Turner, S. (2009). Why have college completion rates declined? An analysis of changing student preparation and collegiate resources (No. 15566; NBER Working Paper Series).

  • Braun, V., & Clarke, V. (2012). Thematic analysis. In H. Cooper, P. M. Camic, D. L. Long, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology, Vol. 2. Research designs: Quantitative, qualitative, neuropsychological, and biological (pp. 57–71). American Psychological Association.

  • BrckaLorenz, A., Haeger, H., & Priddie, C. (2021). An examination of inclusivity and support for diversity in STEM fields. Journal for STEM Education Research, 4, 363–379.

  • Brock, T. (2010). Young adults and higher education: Barriers and breakthroughs to success. The Future of Children, 20(1), 109–132.

  • Byrd, A. (2020). “Like coming home”: African Americans tinkering and playing toward a computer code bootcamp. College Composition and Communication, 71(3), 426–452.

  • Carnevale, A., Van Der Werf, M., Quinn, M. C., Strohl, J., & Repnikov, D. (2018). Our separate and unequal public colleges: How public colleges reinforce white racial privilege and marginalize Black and Latino students. Georgetown Center on Education and the Workforce.

  • Castro, E. L. (2014). “Underprepared” and “at-risk”: Disrupting deficit discourses in undergraduate STEM recruitment and retention programming. Journal of Student Affairs Research and Practice, 51(4), 407–419.

  • Chase, M. M., Dowd, A. C., Pazich, L. B., & Bensimon, E. M. (2014). Transfer equity for “minoritized” students: A critical policy analysis of seven states. Educational Policy, 28(5), 669–717.

  • Chubin, D. E., Didion, C., & Beoku-Betts, J. (2015). Promising programs: A cross-national exploration of women in science, education to workforce. In W. Pearson Jr., L. Frehill, & C. McNeely (Eds.), Advancing women in science: An international perspective (pp. 275–305). Springer.

  • Ciocca Eller, C., & DiPrete, T. A. (2018). The paradox of persistence: Explaining the Black-White gap in bachelor’s degree completion. American Sociological Review, 83(6), 1171–1214.

  • Dagley, M., Georgiopoulos, M., Reece, A., & Young, C. (2016). Increasing retention and graduation rates through a STEM learning community. Journal of College Student Retention: Research, Theory & Practice, 18(2), 167–182. *

  • Deil-Amen, R., & DeLuca, S. (2010). The underserved third: How our educational structures populate an educational underclass. Journal of Education for Students Placed at Risk, 15(1–2), 27–50.

  • D’Souza, M. J., Shuman, K. E., Wentzien, D. E., & Roeske, K. P. (2018). Working with the Wesley College Cannon scholar program: Improving retention, persistence, and success. Journal of STEM Education: Innovations and Research, 19(1), 31–40. *

  • Dunn, C., Shannon, D., McCullough, B., Jenda, O., & Qazi, M. (2018). An innovative postsecondary education program for students with disabilities in STEM (Practice Brief). Journal of Postsecondary Education and Disability, 31(1), 91–101. *

  • Dynarski, S., & Scott-Clayton, J. (2013). Financial aid policy: Lessons from research. National Bureau of Economic Research.

  • Espinosa, L. L. (2011). Pipelines and pathways: Women of color in STEM majors and the experiences that shape their persistence. Harvard Educational Review, 81(2), 209–241.

  • Estrada, M., Hernandez, P. R., & Schultz, P. W. (2018). A longitudinal study of how quality mentorship and research experience integrate underrepresented minorities into STEM careers. CBE—Life Sciences Education. *

  • Fisler, J. L., Young, J. W., & Hein, J. L. (2000). Retaining women in the sciences: Evidence from Douglass College’s project SUPER. Journal of Women and Minorities in Science and Engineering, 6(4), 349–372. *

  • Flynn, D. T. (2016). STEM field persistence: The impact of engagement on postsecondary STEM persistence for underrepresented minority students. Journal of Educational Issues, 2(1), 185–214.

  • Fouad, N. A., & Santana, M. C. (2017). SCCT and underrepresented populations in STEM fields: Moving the needle. Journal of Career Assessment, 25(1), 24–39.

  • Fry, R., Kennedy, B., & Funk, C. (2021, April 1). STEM jobs see uneven progress in increasing gender, racial and ethnic diversity. Pew Research Center.

  • Funk, C., & Parker, K. (2018, January 9). Women and men in STEM often at odds over workplace equity. Pew Research Center.

  • Gamoran, A. (2009). Tracking and inequality: New directions for research and practice. The Routledge international handbook of the sociology of education (1st ed., pp. 231–246). Routledge.

  • George, C. E., Castro, E. L., & Rincon, B. (2019). Investigating the origins of STEM intervention programs: An isomorphic analysis. Studies in Higher Education, 44(9), 1645–1661.

  • Gibson, A. D., Siopsis, M., & Beale, K. (2019). Improving persistence of STEM majors at a liberal arts college: Evaluation of the Scots science scholars program. Journal of STEM Education: Innovations and Research, 20(2), 6–13. *

  • Giles, M. (2015). Acclimating to the institutional climate: There’s a “chill” in the air. In F. A. Bonner II., A. F. Marbley, F. Tuitt, P. A. Robinson, R. M. Banda, & R. L. Hughes (Eds.), Black faculty in the academy. Routledge.

  • Good, J., Halpin, G., & Halpin, G. (2002). Retaining Black students in engineering: Do minority programs have a longitudinal impact? Journal of College Student Retention, 3(4), 351–364. *

  • Hall, R. M., & Sandler, B. R. (1982). The classroom climate: A chilly one for women? Association of American Colleges.

  • Harper, S. R. (2010). An anti-deficit achievement framework for research on students of color in STEM. New Directions for Institutional Research, 2010(148), 63–74.

  • Harper, S. R. (2012). Race without racism: How higher education researchers minimize racist institutional norms. Review of Higher Education: Journal of the Association for the Study of Higher Education, 36(Suppl. 1), 9–29.

  • Hatfield, N., Brown, N., & Topaz, C. M. (2022). Do introductory courses disproportionately drive minoritized students out of STEM pathways? PNAS Nexus.

  • Howard, J. (2001). Service learning course design workbook. Michigan Univ., Ann Arbor. Edward Ginsberg Center for Community Service and Learning.

  • Hrabowski, F. A., & Maton, K. I. (1995). Enhancing the success of African-American students in the sciences: Freshman year outcomes. School Science and Mathematics, 95(1), 19–27. *

  • Huziak-Clark, T., Sondergeld, T., Staaden, M., Knaggs, C., & Bullerjahn, A. (2015). Assessing the impact of a research-based STEM program on STEM majors’ attitudes and beliefs. School Science and Mathematics, 115(5), 226–236. *

  • Ikuma, L. H., Steele, A., Dann, S., Adio, O., & Waggenspack, W. N., Jr. (2019). Large-scale student programs increase persistence in STEM fields in a public university setting. Journal of Engineering Education, 108(1), 57–81. *

  • Jenkins, D., & Fink, J. (2016). Tracking transfer: New measures of institutional and state effectiveness in helping community college students attain Bachelor’s degrees. Community College Research Center, Teachers College, Columbia University.

  • Jennings, J. L., Deming, D., Jencks, C., Lopuch, M., & Schueler, B. E. (2015). Do differences in school quality matter more than we thought? New evidence on educational opportunity in the twenty-first century. Sociology of Education, 88(1), 56–82.

  • Kanny, M. A., Sax, L. J., & Riggers-Pieh, T. A. (2014). Investigating forty years of STEM research: How explanations for the gender gap have evolved over time. Journal of Women and Minorities in Science and Engineering, 20(2), 127–148.

  • Kassaee, A. M., & Rowell, G. H. (2016). Motivationally-informed interventions for at-risk STEM students. Journal of STEM Education: Innovations and Research, 17(3), 77–84. *

  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

  • Lee, J. J., & McCabe, J. M. (2020). Who speaks and who listens: Revisiting the chilly climate in college classrooms. Gender & Society.

  • Lee, M. J., Collins, J. D., Harwood, S. A., Mendenhall, R., & Huntt, M. B. (2020). “If you aren’t White, Asian or Indian, you aren’t an engineer”: Racial microaggressions in STEM education. International Journal of STEM Education, 7(1), 48.

  • Lent, R. W., & Brown, S. D. (2019). Social cognitive career theory at 25: Empirical status of the interest, choice, and performance models. Journal of Vocational Behavior, 115, 103316.

  • Linley, J. L., & George-Jackson, C. E. (2013). Addressing underrepresentation in STEM fields through undergraduate interventions. New Directions for Student Services, 2013(144), 97–102.

  • Liou-Mark, J., Ghosh-Dastidar, U., Samaroo, D., & Villatoro, M. (2018). The peer-led team learning leadership program for first year minority science, technology, engineering, and mathematics students. Journal of Peer Learning, 11, 65–75. *

  • Lisberg, A., & Woods, B. (2018). Mentorship, mindset and learning strategies: An integrative approach to increasing underrepresented minority student retention in a STEM undergraduate program. Journal of STEM Education: Innovations and Research, 19(3), 14–19. *

  • López, N., Morgan, D. L., Hutchings, Q. R., & Davis, K. (2022). Revisiting critical STEM interventions: A literature review of STEM organizational learning. International Journal of STEM Education.

  • Lord, S. M., Camacho, M. M., Layton, R. A., Long, R. A., Ohland, M. W., & Wasburn, M. H. (2009). Who’s persisting in engineering? A comparative analysis of female and male Asian, Black, Hispanic, Native American, and White students. Journal of Women and Minorities in Science and Engineering, 15, 167–190.

  • Loveless, T. (2009). Tracking and detracking: High achievers in Massachusetts middle schools.

  • Maton, K. I., Hrabowski, F. A., III., & Schmitt, C. L. (2000). African American college students excelling in the sciences: College and postcollege outcomes in the Meyerhoff scholars program. Journal of Research in Science Teaching, 37(7), 629–654. *

  • McGee, E. O. (2016). Devalued Black and Latino racial identities: A by-product of STEM college culture? American Educational Research Journal, 53(6), 1626–1662.

  • McGee, E. O. (2020). Interrogating structural racism in STEM higher education. Educational Researcher, 49(9), 633–644.

  • Methley, A. M., Campbell, S., Chew-Graham, C., McNally, R., & Cheraghi-Sohi, S. (2014). PICO, PICOS and SPIDER: A comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Services Research, 14(1), 579.

  • Miriti, M. N. (2020). The elephant in the room: Race and STEM diversity. BioScience, 70(3), 237–242.

  • Morris, L. K. (2003, November 6). The chilly climate for women: A literature review. Annual meeting of the Mid-South Educational Research Association, Biloxi, MS.

  • Muro, M., Berube, A., & Whiton, J. (2018). Black and Hispanic underrepresentation in tech: It’s time to change the equation. Brookings.

  • Murphy, T. E., Gaughan, M., Hume, R., & Moore, S. G. (2010). College graduation rates for minority students in a selective technical university: Will participation in a summer bridge program contribute to success? Educational Evaluation and Policy Analysis, 32(1), 70–83. *

  • National Academies of Sciences, Engineering, and Medicine. (2016). Barriers and opportunities for 2-year and 4-year STEM degrees: Systemic change to support students’ diverse pathways. The National Academies Press.

  • National Center for Education Statistics. (2019, February). Indicator 22: Financial aid.

  • National Center for Science and Engineering Statistics (NCSES). (2021). Women, minorities, and persons with disabilities in science and engineering: 2021 (NSF 21–321). National Science Foundation.

  • Oakes, J. (1985). Keeping track: How schools structure inequality. Yale University Press.

  • Ong, M. (2005). Body projects of young women of color in physics: Intersections of gender, race, and science. Social Problems, 52(4), 593–617.

  • Ong, M., Wright, C., Espinosa, L. L., & Orfield, G. (2011). Inside the double bind: A synthesis of empirical research on undergraduate and graduate women of color in science, technology, engineering, and mathematics. Harvard Educational Review, 81(2), 172–208.

  • Palmer, R. T., Davis, R. J., & Thompson, T. (2010). Theory meets practice: HBCU initiatives that promote academic success among African Americans in STEM. Journal of College Student Development, 51(4), 440–443. *

  • Pearson, J., Giacumo, L. A., Farid, A., & Sadegh, M. (2022). A systematic multiple studies review of low-income, first-generation, and underrepresented, STEM-degree support programs: Emerging evidence-based models and recommendations. Education Sciences, 12(5), 1–27.

  • Pender, M., Marcotte, D. E., Sto Domingo, M. R., & Maton, K. I. (2010). The STEM pipeline: The role of summer research experience in minority students’ Ph.D. aspirations. Education Policy Analysis Archives, 18(30), 1–39. *

  • Peterfreund, A. R., Rath, K. A., Xenos, S. P., & Bayliss, F. (2008). The impact of supplemental instruction on students in STEM courses: Results from San Francisco State University. Journal of College Student Retention: Research, Theory & Practice, 9(4), 487–503. *

  • Pollock, A., & Berge, E. (2018). How to do a systematic review. International Journal of Stroke, 13(2), 138–156.

  • Ramsey, L. R., Betz, D. E., & Sekaquaptewa, D. (2013). The effects of an academic environment intervention on science identification among women in STEM. Social Psychology of Education: An International Journal, 16(3), 377–397. *

  • Rincón, B. E., & George-Jackson, C. E. (2016a). STEM intervention programs: Funding practices and challenges. Studies in Higher Education, 41(3), 429–444.

  • Rincón, B. E., & George-Jackson, C. E. (2016b). Examining department climate for women in engineering: The role of STEM interventions. Journal of College Student Development, 57(6), 742–747.

  • Robinson, T. N. (2022). The myths and misconceptions of change for STEM reform: From fixing students to fixing institutions. New Directions for Higher Education, 2022(197), 79–89.

  • Rolin, K. (2008). Gender and physics: Feminist philosophy and science education. Science & Education, 17(10), 1111–1125.

  • Rosenbaum, J. E. (1976). Making inequality: The hidden curriculum of high school tracking. Wiley.

  • Rosenbaum, J. E., Ahearn, C. E., & Rosenbaum, J. E. (2017). Bridging the gaps: College pathways to career success. Russell Sage Foundation.

  • Rui, N. (2009). Four decades of research on the effects of detracking reform: Where do we stand? A systematic review of the evidence. Journal of Evidence-Based Medicine, 2(3), 164–183.

  • Sanabria, T., Penner, A., & Domina, T. (2020). Failing at remediation? College remedial coursetaking, failure and long-term student outcomes. Research in Higher Education, 61, 459–484.

  • Szelényi, K., & Inkelas, K. K. (2011). The role of living-learning programs in women’s plans to attend graduate school in STEM fields. Research in Higher Education, 52(4), 349–369. *

  • Torgerson, C. J. (2006). Publication bias: The Achilles’ heel of systematic reviews? British Journal of Educational Studies.

  • Tsui, L. (2007). Effective strategies to increase diversity in STEM fields: A review of the research literature. The Journal of Negro Education, 76(4), 555–581.

  • U.S. Bureau of Labor Statistics. (n.d.). Occupational outlook handbook. Retrieved August 17, 2021, from

  • Valencia, R. (2010). The construct of deficit thinking. Dismantling contemporary deficit thinking: Educational thought and practice (pp. 1–18). Taylor & Francis.

  • Van Sickle, J., Schuler, K. R., Quinn, C., Holcomb, J. P., Carver, S. D., Resnick, A., Jackson, D. K., Duffy, S. F., & Sridhar, N. (2020). Closing the achievement gap for underrepresented minority students in STEM: A deep look at a comprehensive intervention. Journal of STEM Education: Innovations and Research, 21(2), 5–18. *

  • Whittaker, J. A., & Montgomery, B. L. (2012). Cultivating diversity and competency in STEM: Challenges and remedies for removing virtual barriers to constructing diverse higher education communities of success. The Journal of Undergraduate Neuroscience Education, 11(1), A44–A51.

  • Windsor, A., Bargagliotti, A., Best, R., Franceschetti, D., Haddock, J., Ivey, S., & Russomanno, D. (2015). Increasing retention in STEM: Results from a STEM talent expansion program at the University of Memphis. Journal of STEM Education: Innovations and Research, 16(2), 11–19. *


Acknowledgements

The authors would like to thank Dr. Danielle Clark and the rest of the teams at Discovery Partners Institute and the Illinois Workforce and Education Research Collaborative for their edits, comments, and support. We would especially like to thank Phyllis Baker for her help in outlining the project’s original concept and for her guidance throughout this project. Finally, the authors would like to acknowledge the support of the Pritzker and Pritzker Traubert foundations.


Funding

This project was funded by the Discovery Partners Institute (DPI), the Illinois Workforce and Education Research Collaborative (IWERC), the Pritzker Foundation, and the Pritzker Traubert Foundation.

Author information

Authors and Affiliations



Contributions

MB outlined the original concept and helped with coding, data analysis, and revisions. MB and SD worked together on the original selection criteria and coding framework. CC and SC contributed to quantitative analysis and the shaping of the analysis. OP worked as a coder, collected articles for the literature review, and assisted in writing the article. SC helped with coding, writing the article, and the calculation of inter-rater agreement. SD performed the search and eligibility screening, collected and analyzed data from the articles (including coding), and helped to write the article. All authors worked together to revise the article and reviewed the article prior to submission. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Olivia Palid.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Appendix A. Search terms. Appendix B. Studies removed by tag function. Appendix C. Article details. Appendix D. Categorization of study outcomes. Appendix E. The percentage of positively significant findings for each outcome by feature included in the program.

Additional file 2. Articles Included in Systematic Review.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Palid, O., Cashdollar, S., Deangelo, S. et al. Inclusion in practice: a systematic review of diversity-focused STEM programming in the United States. IJ STEM Ed 10, 2 (2023).
