Pants-of-dog wrote:SCOTUS has made verifiably incorrect claims in the recent past. This could easily be one of them.
From the same source used in the previous post:
Harvard retained David Card, professor of economics at the University of California, Berkeley, as their expert witness. Card recently won the 2021 Nobel Memorial Prize in Economic Sciences.
Card agreed with Arcidiacono’s general analysis approach but wrote in his expert report that Arcidiacono’s models place too much emphasis on academic factors as predictors of admissions outcomes. According to Card, Harvard’s admissions practices place significant weight on contextual factors “that account for the life experience and background of each candidate” including “high school, community and family background.”
After adding these contextual factors to his model, Card argued that considering racial and ethnic factors does not produce the bias against Asian-American students that Arcidiacono found.
So even the source presented as evidence does not claim that Arcidiacono's analysis is correct.
I am reading both of the briefs. This, from Arcidiacono's brief, is... interesting:
Arcidiacono wrote:It is worth pausing to note that the opportunity for racial penalties and preferences is least present in academic and extracurricular ratings for two reasons. First, both are easily measured. For the academic rating, Harvard’s files contain information on the test scores of the students, their grades, number of AP exams taken and the scores on these AP exams, etc. For the extracurricular rating, lists of activities are included that specify the type of activity, the years the student participated in that activity, and the number of hours per week devoted to the activity. Second, they are specific, reflecting how an applicant scored on a particular set of tasks.
This is in contrast to the personal rating, which is difficult to measure directly, and the various ratings that reflect agglomerations of another individual’s rating of a candidate along many dimensions (e.g., the counselor and teacher ratings, as well as the overall ratings of the reader and the alumni interviewer). Harvard’s Reader Guidelines illustrate why it would be easy to manipulate the personal rating. While the guidelines provide detailed instructions for the various other ratings, for the personal rating, the guidelines list only the following: “1. Outstanding. 2. Very strong. 3. Generally positive. 4. Bland or somewhat negative or immature. 5. Questionable personal qualities. 6. Worrisome personal qualities.”48
Harvard’s OIR researchers in fact recognized racial differences in the assignment of personal ratings in 2013. Using data over ten years, they found that Harvard’s admissions officers assigned substantially lower personal ratings to Asian-American applicants versus white applicants, especially when compared to the ratings assigned by teachers, counselors, and alumni interviewers.49
This is Card's take on the personal ratings:
Card wrote:5.2.3. Prof. Arcidiacono’s analysis does not support the conclusion that the personal rating is biased
145. The models discussed above include as a control variable Harvard's personal rating. Using an ordered logit model that predicts personal ratings, Prof. Arcidiacono has argued that the personal rating is biased against Asian-American applicants. Based on this result, he then argues that the inclusion of the personal rating in the model is inappropriate. As discussed in Section 2 above, there are several reasons why Prof. Arcidiacono's statistical evidence of bias in the personal rating is weak and does not justify the exclusion of the personal rating from his model. Here, I expand on this issue.
146. First, Prof. Arcidiacono's model of personal ratings cannot reliably explain the assignment of personal ratings. The Pseudo R-Squared value of the model is 0.28, which is quite low; for example, Prof. Arcidiacono's more reliable model of the academic rating has a Pseudo R-Squared value of 0.56.120 Additionally the model has very low predictive accuracy. Of the 47 applicants in Prof. Arcidiacono's sample who have personal ratings of 1, his model correctly predicts their rating zero percent of the time, and of the 30,976 applicants with a rating of 2, it correctly predicts their rating only 45% of the time.121
147. As detailed above, a common methodological challenge in assessing the potential for racial bias using regression models is that a model almost always excludes some relevant information. This concern is particularly significant in attempting to model Harvard's personal rating, which considers many individualized and hard-to-quantify factors (i.e., the "missing data" I discuss above). Thus, if a regression estimates that race affects applicants' personal ratings, there is a serious question whether that estimated effect might actually be explained not by race but by racial differences in some factor that is not included in the model and that affects the personal rating—in other words, by omitted-variable bias (or "missing data"). One clear example of such missing data is an applicant's personal essay, which according to documents and testimony in this case is an important consideration in the determination of the personal rating.122
148. As discussed above, one way to determine if the missing data problem is affecting the estimated effects of race in a particular model is to consider how the estimated effect in the model changes as more of the available variables are added to the model. Importantly, Prof. Arcidiacono's own regression results show that the estimated effect of Asian-American ethnicity on the personal rating shrinks as non-academic factors are added to his model of the personal rating. This pattern suggests that, were more information available, the alleged effect could shrink further. For example, in Table B.6.7 of Prof. Arcidiacono's report, the coefficient of Asian-American ethnicity is -0.542 in Model 3 before he has added controls for neighborhood and school background and for the relevant ratings that feed into the personal rating. When he adds those controls (in his Model 5), the coefficient falls to -0.366.123 If the model could account for unobserved factors like the personal essay, the gap could fall further.
149. Another sign that Prof. Arcidiacono's regression models of the personal and overall ratings are not capturing actual bias against Asian-American applicants is that his models find a statistically significant positive effect of Asian-American ethnicity on the academic and extracurricular ratings. As noted above in Section 5.1.6, such a pattern calls into question whether the effects his models attribute to race are more properly explained by factors that are missing from his models (either because he does not include them or because they are unobservable). If Harvard were in fact biased against Asian-American applicants, it would make little sense for Harvard to give an unexplained advantage to Asian-American applicants in the academic and extracurricular ratings. On the other hand, if Harvard were not biased, but the ratings models were simply missing relevant variables that explain the differences across race in ratings assignments, it would not be surprising to see an inconsistent pattern of "bias" across the profile ratings.
150. Further, as detailed in Section 3, the essential function of the ratings is to quantify the otherwise unobservable information about applicants that admissions officers discern from their intensive review of each file. It is therefore unsurprising that regression models struggle to reliably explain the ratings; the whole point of the ratings is to capture information that is hard to measure.
151. Despite my view that Prof. Arcidiacono's analysis does not support an inference that the personal rating is biased against Asian-American applicants, I have also conducted an analysis that assumes for the sake of argument that the personal rating is biased, and therefore removes it from the model. This approach is an extremely conservative analysis that overcorrects for any concern of bias in the personal rating, because it completely removes from the model the personal rating (a factor on which White applicants, in aggregate, are relatively stronger than Asian-American applicants), rather than removing only the allegedly discriminatory component of the rating. In fact, Prof. Arcidiacono's Table 6.1––which uses his personal ratings regression to calculate the share of Asian-American applicants who would receive a rating of 1 or 2 under the assumption that there was no bias in the personal rating––shows that White applicants are still, on average, a bit more likely than Asian-American applicants to have a personal rating of 1 or 2.124
152. As Exhibit 21 shows, even in this very conservative model that ignores an important dimension of the admissions process on which White applicants are relatively strong, I still find only weak and inconsistent evidence of a disparity between Asian-American and White admission rates. Specifically, I find no evidence of a significant negative effect of Asian-American ethnicity in five of the six years of data I analyze.
154. Before moving on, I want to respond to three other arguments offered by Prof. Arcidiacono in support of his claim that the personal and overall ratings are biased. First, Prof. Arcidiacono's model of the overall rating, like his model of the personal rating and other non-academic ratings, is weak; it has a Pseudo R-Squared value of just 0.34.125 Given the evidence detailed above that the estimated negative effect of Asian-American ethnicity on applicants' probability of admission shrinks as available non-academic qualifications are added to the model, and given that non-academic qualifications are harder to measure than academic qualifications, the small negative effect that the model attributes to Asian-American ethnicity is not reliable evidence of bias; it is entirely possible and even likely that that effect is attributable to omitted non-academic variables. Additionally, Prof. Arcidiacono's overall rating model has very poor predictive accuracy. Of the 109 applicants in Prof. Arcidiacono's sample who have overall ratings of 1 (including pluses and minuses), his model correctly predicts their rating only 18% of the time, and of the 8,124 applicants with a rating of 2 (including pluses and minuses), it correctly predicts their rating only 28% of the time.126 Further, as explained above, I have not included the overall rating in any of my regressions because it is the one rating that may be influenced by applicants' race (in the sense that, for example, the overall ratings of African-American, Hispanic, or Other (AHO) applicants may reflect the contribution they would make to the racial diversity of the student body). As I have shown above, even without the overall rating in my regression, I find no evidence of systematic bias in Harvard's admissions process against Asian-American applicants.
155. Second, Prof. Arcidiacono suggests that the school support (teacher and guidance counselor) ratings assigned by Harvard are biased against Asian-American applicants because he observes that Asian-American applicants with the strongest academic qualifications (defined as those in the top deciles (4-10) of the academic index) are less likely to receive strong school support ratings relative to applicants of other races.127 Again, this conclusion depends on Prof. Arcidiacono's assumption that candidates who are strong on academic factors are also strong on non-academic factors—an assumption that, as discussed above, is not supported by the available data. The teacher and guidance counselor ratings reflect strength across both academic and non-academic dimensions. Thus, the small gap between Asian-American and White applicants' school support ratings may well be attributable to the fact that Asian-American applicants tend on average to be weaker than White applicants on the available measures of non-academic factors that Prof. Arcidiacono's analysis explicitly ignores by focusing on only deciles of the academic index.
156. Third, Prof. Arcidiacono also suggests that differences between the alumni overall and personal ratings and Harvard's admissions officers' overall and personal ratings show that Harvard's personal and overall ratings are biased. But that argument once again depends on Prof. Arcidiacono's regression models of the ratings—which, again, are quite low in predictive accuracy and do not reliably control for the many hard-to-measure factors that are likely very important to the determination of the ratings. Second, the alumni and admissions-officer ratings are based on different sources. An alumni personal rating reflects only the alumni interviewer's brief interaction with the applicant, whereas the personal rating assigned by Harvard admissions officers considers not just the alumni interview (to the extent it has occurred before the rating is assigned, which is often not the case) but also the candidate's essays, teacher recommendations, secondary school report, and so on. Alumni ratings are also much more generous in general. For example, 62% of applicants receive an alumni personal rating of 1 or 2, while only 23% of the sample receive a personal rating of 1 or 2.128 Moreover, the personal ratings given by the Harvard admissions officers explain much more about Harvard's admissions decisions than the alumni interviewer personal ratings do. For Prof. Arcidiacono's expanded sample, the Pseudo R-Squared value of a model that controls for only the personal rating is 0.19, while a model that controls for only the alumni personal rating has a Pseudo R-Squared value of just 0.08.129 Given all of this, it is not particularly surprising that there exist differences in the size of various coefficients across the two models.
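For anyone unfamiliar with the omitted-variable mechanism Card invokes in paragraphs 147-148, it can be reproduced in a few lines. This is a minimal simulation of my own, not code from either report; every variable name and magnitude is hypothetical:

[code]
# Omitted-variable bias in a logit, in miniature. "quality" stands in for an
# unobserved factor (e.g., essay strength) that is correlated with group
# membership and drives the rating; the rating does NOT depend on group itself.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50_000
group = rng.integers(0, 2, n)              # 1 = focal group, 0 = baseline
quality = rng.normal(-0.5 * group, 1, n)   # unobserved, correlated with group
academic = rng.normal(0, 1, n)             # observed control
latent = 1.0 * quality + 0.5 * academic + rng.logistic(size=n)
high_rating = (latent > 0).astype(int)

short = sm.Logit(high_rating, sm.add_constant(np.column_stack([group, academic]))).fit(disp=0)
full = sm.Logit(high_rating, sm.add_constant(np.column_stack([group, academic, quality]))).fit(disp=0)
print("group coef, quality omitted: ", round(short.params[1], 3))  # spuriously negative
print("group coef, quality included:", round(full.params[1], 3))   # roughly zero
[/code]

Whether Harvard's data actually behave like this is, of course, exactly what the two experts dispute.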
Splitting the data by year to account for year effects is just wrong. If you want to control for year fixed effects, you add them to the model; you do not fit a different model for each year. I think I already explained why elsewhere in the forum, but it is simply inappropriate to split the data by the levels of an independent variable and fit a separate model for each level.
As for Arcidiacono, he also shouldn't simply drop data (legacy admits, athletic admits, early applicants, those on the Dean's and Director's interest lists, etc.) from the analysis. If he wants to control for those groups, he can always add the appropriate categorical variables; both fixes are shown in the sketch below.
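A minimal sketch of the two fixes, assuming a pandas DataFrame df with one row per applicant; all column names here are hypothetical and taken from neither report:

[code]
# Pool all years and all applicants; control for year and for special
# categories instead of splitting the data or dropping rows.
import statsmodels.formula.api as smf

pooled = smf.logit(
    "admitted ~ asian + academic_index + extracurricular"
    " + C(year)"                                  # year fixed effects, not six separate models
    " + legacy + athlete + early + deans_list",   # control for these groups, don't drop them
    data=df,
).fit(disp=0)
print(pooled.params["asian"])  # one pooled estimate instead of six noisy yearly ones
[/code]

And if you genuinely suspect the effect varies by year, you add an interaction such as asian:C(year) inside the pooled model rather than discarding the pooling altogether.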
Furthermore, the fact that one model has a lower fit than another does not make it wrong, nor does a higher fit make a model right; Card seemingly forgets about overfitting. Of course you will get a better in-sample fit if you just keep adding variables, as Card does; that alone doesn't say much.
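This is easy to demonstrate (again, my own hypothetical illustration, not anyone's actual model): padding a logit with pure-noise regressors can only raise the in-sample McFadden pseudo R-squared, while held-out fit gets worse.

[code]
# In-sample pseudo R-squared rises mechanically as regressors are added,
# even when the added regressors are pure noise; held-out fit deteriorates.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2_000
x = rng.normal(size=(n, 1))                       # the one real predictor
y = (x[:, 0] + rng.logistic(size=n) > 0).astype(int)
noise = rng.normal(size=(n, 50))                  # 50 irrelevant variables

train, test = slice(0, 1_000), slice(1_000, n)
for label, X in [("1 real var      ", x), ("+50 noise vars  ", np.hstack([x, noise]))]:
    Xc = sm.add_constant(X)
    fit = sm.Logit(y[train], Xc[train]).fit(disp=0)
    held_out = sm.Logit(y[test], Xc[test]).loglike(fit.params)  # log-lik on unseen data
    print(label, "in-sample pseudo R2:", round(fit.prsquared, 3),
          " out-of-sample loglik:", round(held_out, 1))
[/code]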
I also find it hard to understand why the adcom ratings don't fit well with the alumni ratings. Arguing that alumni ratings are less reliable because they come from the applicant spending a short while with whoever scored him, while defending ratings that draw on many sources except actually meeting and interviewing the applicant, is extremely odd, and it's hard to believe teachers or counselors are more in tune with Harvard's culture than alumni. Not that it matters: I very much doubt Card and others would stick to this line if the ones with the lower personal ratings were African-American, Hispanic, or Indigenous applicants. It is this that should raise eyebrows above everything else, even the admission rates.