Blind Spots in the ‘Blind Audition’ Study
A lauded 2000 article claiming to find sexism in American orchestras looks increasingly spurious.
It is one of the most famous social-science papers of all time. Carried out in the 1990s, the “blind audition” study attempted to document sexist bias in orchestra hiring. Lionized by Malcolm Gladwell, extolled by Harvard thought leaders, and even cited in a dissent by Justice Ruth Bader Ginsburg, the study showed that when orchestras auditioned musicians “blindly,” behind a screen, women’s success rates soared. Or did they?
Nobody questions the basic facts that led to the study’s publication. During the 1970s and ’80s, America’s orchestras became more open and democratic. To ensure impartiality, several introduced blind auditions. Two economists, Claudia Goldin of Harvard and Cecilia Rouse of Princeton, noticed that women’s success rates in auditions increased along with the adoption of screens. Was it a coincidence or the result of the screens? That is the question the two economists tried to answer in “Orchestrating Impartiality: The Impact of ‘Blind’ Auditions on Female Musicians,” published in 2000 in the American Economic Review.
They collected four decades of data from eight leading American orchestras. But the data were inconclusive: The paper includes multiple warnings about small sample sizes, contradictory results and failures to pass standard tests of statistical significance. Yet few readers seem to have noticed. What caught everyone’s attention was a big claim in the final paragraph: “We find that the screen increases—by 50 percent—the probability that a woman will be advanced from certain preliminary rounds and increases by severalfold the likelihood that a woman will be selected in the final round.”
According to Google, the study has received more than 1,500 citations in academic articles and thousands of media mentions. It has been featured in TED Talks, celebrated at the Davos conference, and showcased in so many diversity workshops that one attendee begged never to hear about it again. Inspired by the “academically verified Orchestra study,” GapJumpers, a Silicon Valley startup, offers companies software to conduct blind interviews in other contexts.
The study’s appeal is clear: Two prominent economists, in a top journal, wielding state-of-the-art econometrics, captured and quantified bias against women and documented a solution. Or so it seemed.
The research went uncriticized for nearly two decades. That changed recently, when a few scholars and data scientists went back and read the whole study. The first thing they noticed is that the raw tabulations showed women doing worse behind the screens. But perhaps, Ms. Goldin and Ms. Rouse explained, blind auditions “lowered the average quality of female auditionees.” To control for ability, they analyzed a small subset of candidates who took part in both blind and nonblind auditions in three of the eight orchestras.
The result was a tangle of ambiguous, contradictory trends. The screens seemed to help women in preliminary audition rounds but men in semifinal rounds. None of the findings were strong enough to draw broad conclusions one way or the other.
So where did Ms. Goldin and Ms. Rouse get their totemic conclusion that blind auditions dramatically improved the success of women candidates? After warning that their findings were not statistically significant, they declared them to be “economically significant.” What does that mean in this context?
“That doesn’t mean anything at all,” writes Columbia University data scientist Andrew Gelman, in a recent post about the study. “Some fine words but the punchline seems to be that the data are too noisy to form any strong conclusions.” My guess is that the authors thought they had detected something with real-world relevance despite an absence of statistical rigor. But that’s a reason to call for more research, not to declare the transformative power of screens in women’s quest for equality.
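To make the point concrete, here is a minimal sketch in Python using purely hypothetical counts that are not drawn from the paper: even a 50 percent relative improvement in advancement rates can be statistically indistinguishable from chance when only a few dozen candidates are involved.

```python
# Hypothetical illustration only -- these counts are invented, not Goldin and Rouse's data.
# It shows how a 50% relative increase in advancement rates can still fail a standard
# significance test when samples are small.
from scipy.stats import fisher_exact

blind_advanced, blind_total = 6, 20   # 30% of women advanced behind a screen (hypothetical)
open_advanced, open_total = 4, 20     # 20% advanced without a screen (hypothetical)

table = [
    [blind_advanced, blind_total - blind_advanced],
    [open_advanced, open_total - open_advanced],
]
_, p_value = fisher_exact(table, alternative="two-sided")

relative_increase = (blind_advanced / blind_total) / (open_advanced / open_total) - 1
print(f"Relative increase in advancement rate: {relative_increase:.0%}")  # 50%
print(f"Fisher exact p-value: {p_value:.2f}")                             # about 0.72
```

With data this sparse, a large point estimate and a large p-value can coexist, which is exactly the kind of noise Mr. Gelman describes: the numbers are consistent with a real effect, with no effect, or even with a reversed one.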
Still, isn’t it obvious that the screens at least contributed to equal hiring? No. The screens might have been a reflection of changing attitudes, and perhaps those attitudes, not the screens, helped women. After all, women didn’t need blind auditions to move ahead in law, business, medicine or the academy—or at the Cleveland Orchestra, which, according to the study, did not use them.
Mr. Gelman and the other critics don’t deny the existence of bias against women or question the potential merits of blind auditions. Anonymous tryouts make sense as a means of achieving impartiality. But Ms. Goldin and Ms. Rouse verified nothing about the value of blind recruitment for women.
Nor has anyone else. The subsequent research is a morass of baseless claims, retracted statements and contradictory findings. There is, however, one study that stands out for its rigor and transparency. In 2017 a team of behavioral economists in the Australian government published the results of a large, randomized controlled study entitled “Going Blind to See More Clearly.” It was directly inspired by the blind-audition study. Iris Bohnet, a Harvard Kennedy School dean and Goldin-Rouse enthusiast, served as an adviser.
For the study, more than 2,000 managers in the Australian Public Service were asked to select recruits from randomly assigned résumés—some disguising the applicant’s sex, others not. The research team fully expected to find far more female candidates shortlisted when sex was disguised. But, as the stunned team leader told the local media: “We found the opposite, that de-identifying candidates reduced the likelihood of women being selected for the shortlist.” It turned out that many senior managers, aware that sexist assumptions had once kept women out of upper-level positions, already practiced a mild form of affirmative action. Anonymized hiring was not only time-consuming and costly, it proved to be an obstacle to women’s equality. The team plans to look elsewhere for solutions.
Truth matters. Overhyped claims of scientific certainty create confusion, undermine public trust and send scarce resources in the wrong direction. Most of all, they don’t solve problems. Sex discrimination in the workplace is a serious matter. But improvements require solid data, replicable research and careful evaluations of causation. As the scholar Alice Dreger says, “Evidence is an ethical issue.” If blind auditions aren’t helpful to women, it’s important we know.