AI tools are becoming the norm as a first filter for job applicants, particularly where large numbers of applicants compete for a small number of jobs. In theory, AI should be more objective in selecting applicants because it doesn't have implicit biases against people from poorer backgrounds, women, ethnic minorities, people with unusual names, etc. However, a series of press articles has claimed that AI tools select against the most qualified candidates and can easily be manipulated.
In my view, the problem with AI tools in HR is not that they are inherently biased or that they filter out the most qualified candidates. In a review of studies on the topic, Zhisheng Chen concluded that AI does make fairer (i.e. less biased) decisions, but only when it is trained on an unbiased dataset.
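To make that condition a little more concrete, one way to see whether a screening model would learn from biased data is to audit the historical hiring decisions before training. The sketch below is purely illustrative (the records and field names are invented for the example); it applies a simple adverse-impact check in the spirit of the EEOC 'four-fifths rule' to selection rates by group.

```python
# Illustrative sketch only: audit the historical hiring decisions a screening
# model would be trained on. The records and field names are invented.
from collections import defaultdict

training_records = [
    {"gender": "female", "hired": True},
    {"gender": "female", "hired": False},
    {"gender": "female", "hired": False},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": False},
]

# Selection rate (share hired) per group in the training data.
counts = defaultdict(lambda: [0, 0])  # group -> [hired, total]
for rec in training_records:
    counts[rec["gender"]][1] += 1
    if rec["hired"]:
        counts[rec["gender"]][0] += 1

rates = {group: hired / total for group, (hired, total) in counts.items()}
best = max(rates.values())

for group, rate in rates.items():
    # 'Four-fifths rule': a selection rate below 80% of the highest group's
    # rate is a common red flag for adverse impact in the training data.
    ratio = rate / best
    flag = "possible adverse impact" if ratio < 0.8 else "ok"
    print(f"{group}: selection rate {rate:.2f}, ratio to best {ratio:.2f} -> {flag}")
```

A dataset that fails this kind of check will pass its bias straight on to any model trained on it, which is the point Chen makes.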
In an effort to include AI in their offerings, businesses selling HR software tend to rush products to market with limited or no testing. And these hastily assembled products are full of biases. It's the usual slogan of 'move fast and break things', except that in this case, the fast movers are breaking people's careers and job prospects.
However, if done right, AI tools in recruitment can be extremely effective and help both applicants and hiring firms. This has been neatly demonstrated by Mallory Avery and her collaborators in the context of gender bias in tech jobs.
The tech industry struggles with a severe gender imbalance, both because few women are interested in qualifying for such jobs or working in the industry, and because the industry has a reputation for a very male 'tech-bro' culture that is biased against women.
Through two field experiments, the new study looked at how job candidates are sorted depending on where AI tools were used and where humans were involved. The chart below summarises the findings, but I will talk you through it (or am I writing you through it?).
Reduced gender bias through the use of AI tools in recruiting
Source: Avery et al. (2024)
The thick black line in the chart shows the share of female candidates in each percentile among all candidates for a job when candidates apply without the help of AI tools and are evaluated by human recruiters (i.e. no AI involved). As one can see, most female applicants are classified below the 50th percentile, indicating that women are on average considered less qualified than men. And looking at the highest percentiles, the share of women drops rapidly, showing that few if any women make it onto a recruiter's shortlist in this setup.
Next, let's look at the dashed black line, which shows the situation where human recruiters still select candidates, but the candidates have been aided by AI tools in submitting their applications. Here, more women are evaluated as above average and the gender bias in applications is reduced significantly. Yet, when we look at the supposedly 'best' candidates, the share of women still falls off rapidly because recruiters subconsciously continue to discriminate against them. However, using AI tools to help applicants through the application process makes sense because the experiments show that more women complete the application process when they know they are doing so for an unbiased AI tool, not a human. They simply think the process is fairer and are thus less likely to give up midway through their application.
Now, let's look at the solid green line, which shows a situation where applicants apply without the help of AI tools but are evaluated and ranked by an unbiased AI tool. Suddenly, without the biased human recruiter, more women are classified as above average, and about ten times more women are classified as top candidates who make it onto the shortlist for the second round. This gives you an idea of how large the bias against women by human recruiters is.
Now compare the solid green line with the dashed red line, which shows the setup where applicants are supported by AI tools and recruiters get an AI score for each applicant but themselves decide whom to pass on to the second round. The simple solution of providing recruiters with an unbiased, objective score for each candidate largely removes the bias against female applicants.
Finally, let's look at the dashed green line, which shows a setup where applicants are guided through the application with AI tools and the recruitment decision is made by another AI tool rather than a human. As one can see, this largely debiases the ranking of candidates and gives women the highest chance of ending up on the shortlist for the second round.
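For readers who want to see what the lines in the chart actually measure, here is a minimal sketch of the calculation. This is not the authors' code, and the scores are randomly generated placeholders; it simply computes the share of female candidates in each score decile (a coarser version of the chart's percentiles), once for a regime that systematically rates women lower and once for a neutral one.

```python
# Illustrative sketch (not the authors' code): share of female candidates per
# score decile under two scoring regimes. All data below are toy placeholders.
import random

def female_share_by_decile(candidates):
    """candidates: list of (gender, score) tuples; returns share of women per score decile."""
    ranked = sorted(candidates, key=lambda c: c[1])  # lowest score first
    n = len(ranked)
    shares = []
    for d in range(10):
        decile = ranked[d * n // 10:(d + 1) * n // 10]
        women = sum(1 for gender, _ in decile if gender == "female")
        shares.append(round(women / len(decile), 2))
    return shares

random.seed(0)
# Toy data: one regime scores women systematically lower, the other is neutral.
human_scored = [("female", random.gauss(45, 15)) for _ in range(300)] + \
               [("male", random.gauss(55, 15)) for _ in range(300)]
ai_scored = [("female", random.gauss(50, 15)) for _ in range(300)] + \
            [("male", random.gauss(50, 15)) for _ in range(300)]

print("biased scoring regime: ", female_share_by_decile(human_scored))
print("neutral scoring regime:", female_share_by_decile(ai_scored))
```

In the biased regime the female share shrinks towards the top deciles, which is exactly the fall-off the thick black line shows; in the neutral regime it stays roughly flat, as with the dashed green line.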
Thus, if done right, AI tools can and do help de-bias recruitment. And these tools outperform human recruiters in terms of the quality of the candidates selected, as Chen discusses in the review of the existing research mentioned above. But as I said, the underlying necessary condition is that one uses unbiased AI tools, which doesn't seem to be the case in real life at the moment.
AI recruitment may also have issues with employment discrimination. If a recruiter with biases related to gender, region, or age uses an algorithmic system to screen resumes, it creates a 'discrimination information cocoon': the resumes that are filtered and pushed by the algorithm are those that have already passed through biased screening. This makes the already subtle problem of discrimination even more difficult to detect.
Amazon tried using AI/ML and ran into the same issue about ten years back, and even after making the system neutral, the project was abandoned, which tells me that the reasons for abandoning it went beyond just gender bias. More here (https://tinyurl.com/3ud85b75).
“By 2015, it was clear that the system was not rating candidates in a gender-neutral way because it was built on data accumulated from CVs submitted to the firm mostly from males, Reuters claimed.
The system started to penalise CVs which included the word "women". The program was edited to make it neutral to the term, but it became clear that the system could not be relied upon, Reuters was told.
The project was abandoned, although Reuters said that it was used for a period by recruiters who looked at the recommendations generated by the tool but never relied solely on it.”
I have not read the study. Does the author think new models/algorithms have found a way to overcome the missing piece of the puzzle? I agree AI will be able to help if it is trained on better-quality data, but I think a black box that does not explain why someone or something was picked is bound to propagate one bias or another in the data and will require a human in the loop. However, humans start relying on AI systems even when they are wrong. Here is an article about it (https://hai.stanford.edu/news/ai-overreliance-problem-are-explanations-solution)
“In theory, a human collaborating with an AI system should make better decisions than either working alone. But humans often accept an AI system’s recommended decision even when it is wrong – a conundrum called AI overreliance. This means people could be making erroneous decisions in important real-world contexts such as medical diagnosis or setting bail, says Helena Vasconcelos, a Stanford undergraduate student majoring in symbolic systems.”