Predicting Graduation and the Achievement Gap

The median time to graduation at California State University, Northridge (CSUN) is long, at five years, and fewer than fifty percent of first-time freshmen graduate in less than eight years. We have used data mining and predictive analytics to determine some of the key academic indicators of success at CSUN. The most important indicators that we have found are (a) first math course attempted; (b) grade point average (GPA) at the end of each of the first two terms; and (c) successful completion of a freshman experience seminar course (UNIV 100). When all three are considered simultaneously, we can correctly identify over two-thirds of the students who will drop out without graduating, while misidentifying approximately one-fifth of the students who ultimately graduate as at risk of not graduating.
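To make the two rates in the previous paragraph concrete, here is a minimal sketch of how they are computed from a table of outcomes. The function name `screening_rates` and all of the counts are hypothetical, chosen only for illustration; they are not CSUN data.

```python
def screening_rates(flagged_dropouts, missed_dropouts,
                    flagged_graduates, cleared_graduates):
    """Return (share of eventual non-graduates flagged as at-risk,
    share of eventual graduates flagged as at-risk)."""
    dropouts = flagged_dropouts + missed_dropouts
    graduates = flagged_graduates + cleared_graduates
    return flagged_dropouts / dropouts, flagged_graduates / graduates


# Hypothetical counts for illustration only, not CSUN data.
caught, false_alarm = screening_rates(
    flagged_dropouts=700, missed_dropouts=300,
    flagged_graduates=200, cleared_graduates=800,
)
print(f"non-graduates correctly flagged: {caught:.2f}")
print(f"graduates flagged in error: {false_alarm:.2f}")
```

The first rate is measured against students who eventually drop out and the second against students who eventually graduate, so the two do not need to add up to one.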

One of the tools we use is logistic regression.

Logistic regression fit of data using first-term GPA as a predictor of graduation. The red curve gives the estimated probability of graduation based purely on GPA. The red circles show the number of students who graduated (in bins of width 0.1 in GPA); the blue circles show the same for students who did not graduate. This probabilistic model shows the errors inherent in the conceptual model: wherever the threshold for acceptance (saying that a student is likely to graduate) is set, there will be some false predictions. These are indicated by False Positives (FP) and False Negatives (FN) in the figure.
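As a rough illustration of the idea, the sketch below fits a one-predictor logistic regression, P(graduate | GPA) = 1 / (1 + exp(-(b0 + b1 * GPA))), and then counts false positives and false negatives at a chosen threshold. It is not the code used for the figure: the data are synthetic, the "true" coefficients (-3.0 and 1.5) are made up, and the helper names `fit_logistic` and `count_errors` are our own. The fit uses Newton's method, which is one common way to maximize the likelihood.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system H * delta = g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def count_errors(xs, ys, b0, b1, threshold=0.5):
    """Count (false positives, false negatives) when predicting
    'will graduate' whenever the fitted probability >= threshold."""
    fp = fn = 0
    for x, y in zip(xs, ys):
        predicted = sigmoid(b0 + b1 * x) >= threshold
        if predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
    return fp, fn


# Synthetic students (an assumption, not CSUN data): GPA uniform on [0, 4],
# graduation drawn from a made-up logistic model.
random.seed(0)
gpas = [random.uniform(0.0, 4.0) for _ in range(2000)]
graduated = [1 if random.random() < sigmoid(-3.0 + 1.5 * g) else 0 for g in gpas]

b0, b1 = fit_logistic(gpas, graduated)
for t in (0.3, 0.5, 0.7):
    fp, fn = count_errors(gpas, graduated, b0, b1, threshold=t)
    print(f"threshold={t}: FP={fp}, FN={fn}")
```

Raising the threshold can only reduce false positives and increase false negatives, so no choice of threshold removes both kinds of error when the two groups overlap in GPA.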

The Achievement Gap

An achievement gap is a disparity in educational measures between the performance of students in different groups. Such gaps have been most frequently noted between different racial, economic, and gender groups.

At CSUN the achievement gap can be seen in the following logistic regression, which predicts the probability of graduating in six years based on GPA at the end of the first semester. It was based on aggregated data from students who started as freshmen at CSUN in 2005, 2006, and 2007 (and hence would have graduated by 2011, 2012, or 2013, respectively).

LogRegEthnicities

Pell grants are provided by the government to students based on income; approximately one third of all students in the US receive Pell grants. The maximum award for 2015-2016 is $5775 per year, but many students receive less. Since most students who are eligible for Pell grants apply and receive some aid, we expected that a student’s Pell grant status would approximately reflect their socio-economic status. However, a student’s Pell grant status is not a significant predictor of graduation, as the following would seem to suggest:

LogRegPell

Does Living on Campus Help?*

Whether or not students live on campus does not, at first glance, look like a good predictor of the gap:

housing

However, the above plot combines all students together. What this plot shows is that living on campus, in and of itself, is not specifically a predictor of success for all ethnic and socio-economic groups.

As the following plot shows, when the data are broken down by group, we see that African American students who live on campus have a significantly higher likelihood of graduation than African American students with the same GPA who do not live on campus.

Logistic regression of six-year graduation rate against GPA at the end of the first semester for African American first-time freshmen at CSUN, broken down into those who lived on campus and those who did not live on campus during their first semester.

This is in stark contrast to other groups. For example, non-Hispanic white students who live on campus actually have a lower chance of graduating than other students with comparable GPAs at the end of their first semester. In fact, the higher-performing students tended to do worse on campus.

white-housing
Comparison of probability of graduation in six years for non-Hispanic white students as a function of first-semester GPA, living on campus or off campus.

Lower-performing Asian American students who live on campus have a slightly higher probability of graduation compared to other students with the same GPA at the end of their first semester. There is less improvement for higher-performing Asian American students.

asian-american-housing
Comparison of probability of graduation in six years for Asian American students as a function of first-semester GPA.

The results are particularly perplexing for Latino students. The graduation rate is better for lower-achieving students when they live on campus, and better for higher-achieving students when they live off campus.

latinohousing
Comparison of probability of six-year graduation as a function of first-semester GPA for Latino students who live on campus and off campus.
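A pattern like this, where one group does better below some GPA and the other does better above it, means the two fitted curves cross. Because the logistic function is strictly increasing, the curves cross exactly where their linear predictors are equal, that is, where a0 + a1 * GPA = c0 + c1 * GPA. The sketch below solves for that GPA. The coefficient pairs are hypothetical values picked to reproduce the qualitative shape described above; they are not the fitted CSUN coefficients, and `crossover_gpa` is our own helper name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def crossover_gpa(on_campus, off_campus):
    """GPA at which two logistic curves give the same probability.

    Each argument is an (intercept, slope) pair. Returns None when the
    slopes are equal, since the curves then never cross.
    """
    (a0, a1), (c0, c1) = on_campus, off_campus
    if a1 == c1:
        return None
    return (c0 - a0) / (a1 - c1)


# Hypothetical coefficients for illustration only.
on_campus = (-1.0, 0.8)    # higher intercept, flatter slope
off_campus = (-2.0, 1.2)   # lower intercept, steeper slope

x_star = crossover_gpa(on_campus, off_campus)
print(f"curves cross at GPA = {x_star:.2f}")

for gpa in (2.0, 3.5):
    p_on = sigmoid(on_campus[0] + on_campus[1] * gpa)
    p_off = sigmoid(off_campus[0] + off_campus[1] * gpa)
    print(f"GPA {gpa}: on campus {p_on:.2f}, off campus {p_off:.2f}")
```

A crossing point that falls outside the 0 to 4 GPA range would mean one group is ahead over the whole range, so the location of the crossing is worth checking before interpreting it.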

*Plots prepared by Jorge Martinez and Andrew Miller
