AI Resume Screeners Prefer Resumes Written by the Same AI Model. The Bias Is 23-60%.
A study found AI screening tools shortlist candidates who used the same model to write their resume at rates 23-60% higher. What hiring managers and applicants need to do about it.
By Forge Team
If you use an AI tool to screen job applications, it is probably rating candidates more favorably when they used the same AI model to write their resume. A study circulated on Hacker News (May 2, 327 upvotes) found that candidates who wrote their resumes with the same LLM used for screening were shortlisted at rates 23-60% higher than candidates who used a different model or none. Self-preference bias across major models ranged from 68% to 88%. The screening tool wasn't evaluating candidate quality; it was recognizing its own style and scoring it higher.
Why this happens
AI models produce distinctive patterns: sentence structures, word choices, how they frame accomplishments, how they order information. A resume written by GPT-5.5 looks different from one written by Claude, not in ways you'd necessarily notice, but in ways a language model can pick up on when the text resembles its own output.
When the screening tool and the resume-writing tool share the same underlying model, the screener rewards familiar patterns. Candidates who used a different AI, or no AI, are penalized for style differences that have nothing to do with their qualifications. The study found business-related fields showed the largest penalties. One straightforward intervention — asking the screening tool to evaluate explicit criteria rather than holistic fit — cut the bias by more than 50%.
What to do differently Monday morning
Two different people need to act on this.
If you're in hiring: The default "relevance" or "fit" score from an AI screening tool is not a neutral measure of candidate quality. Before your next screening run, define explicit criteria and score each one separately. "Does this candidate have experience managing a team of 10 or more?" gives you a more reliable signal than "How strong is this candidate overall?" Spot-check your shortlist by pulling five applications the tool scored low and reviewing them yourself. If well-qualified candidates are landing there repeatedly, the holistic mode is working against you.
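Here is a minimal sketch of criterion-by-criterion scoring in Python. It is an illustration under assumptions, not the study's method or any vendor's API: the `Criterion` class, the `revenue_target` question, the YES/NO answer format, and the `ask_screener` callable (a stand-in for however you call your screening model) are all hypothetical. The point is only that each criterion is asked and recorded on its own, rather than requesting one overall score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Criterion:
    key: str
    question: str


# The first question comes from the article; the second is a hypothetical example.
CRITERIA: List[Criterion] = [
    Criterion("team_size", "Does this candidate have experience managing a team of 10 or more?"),
    Criterion("revenue_target", "Has this candidate owned a revenue target?"),
]


def build_prompt(criterion: Criterion, resume_text: str) -> str:
    """Build one narrow question per criterion instead of asking for overall fit."""
    return (
        "Answer YES or NO based only on evidence stated in the resume.\n"
        f"Question: {criterion.question}\n"
        f"Resume:\n{resume_text}"
    )


def score_resume(
    resume_text: str,
    ask_screener: Callable[[str], str],
    criteria: List[Criterion] = CRITERIA,
) -> Dict[str, bool]:
    """Ask each criterion separately and record a yes/no result per criterion."""
    results: Dict[str, bool] = {}
    for criterion in criteria:
        answer = ask_screener(build_prompt(criterion, resume_text))
        results[criterion.key] = answer.strip().upper().startswith("YES")
    return results


def criteria_met(results: Dict[str, bool]) -> int:
    """Count how many criteria were satisfied."""
    return sum(results.values())


def demo_screener(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, used only to run the sketch.

    It answers YES only when the prompt contains the phrase "team of 10".
    """
    return "YES" if "team of 10" in prompt else "NO"


if __name__ == "__main__":
    results = score_resume("Managed a team of 12 across two regions.", demo_screener)
    print(results, criteria_met(results))
```

Keeping the per-criterion answers, rather than only the total, makes it easier to see why a candidate ranked where they did when you review the results by hand.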
If you're applying: Writing your resume with AI is fine, but don't optimize solely for one platform's output style. Keep a human-edited version that uses your own voice, your specific numbers, and language you'd use in a conversation. That version will hold up across screeners better than one that reads like it came from the same tool that might be evaluating it.
The shortlist that was shorter than it should have been
Priya runs talent acquisition at a 220-person B2B software company. Her team received 340 applications for a Head of Growth role and used an AI screening tool to cut the list down before any human review. She trusted the tool's relevance scoring — it had been reliable for engineering roles, and she had no reason to expect it behaved differently for business functions.
After reading the study, she pulled the 20 applications the tool had ranked in the middle tier: candidates it scored as "possible fit" rather than screening out entirely. Three of them had stronger growth experience than two people on her final shortlist of 18. When she re-evaluated those three against specific criteria (relevant revenue scale, number of channels owned, team size managed), all three scored higher than the same two shortlisted candidates.
The tool hadn't made an error. Its holistic scoring mode had weighted style patterns she never asked it to weight.
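A re-check like Priya's can be sketched in a few lines of Python. Everything here is hypothetical: the `Application` fields, the sample data, and the choice to use a simple count of criteria met as the comparison. The sketch flags anyone outside the shortlist who meets more explicit criteria than the weakest shortlisted candidate.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Application:
    name: str
    shortlisted: bool
    criteria_met: int  # explicit criteria satisfied, as judged in a manual review


def overlooked(applications: List[Application]) -> List[Application]:
    """Return non-shortlisted applicants who beat the weakest shortlisted one."""
    shortlisted = [a for a in applications if a.shortlisted]
    if not shortlisted:
        return []
    floor = min(a.criteria_met for a in shortlisted)
    candidates = [a for a in applications if not a.shortlisted and a.criteria_met > floor]
    # Strongest first, so the most worrying misses are reviewed first.
    return sorted(candidates, key=lambda a: a.criteria_met, reverse=True)


# Hypothetical sample data for illustration only.
SAMPLE = [
    Application("A", shortlisted=True, criteria_met=3),
    Application("B", shortlisted=True, criteria_met=1),
    Application("C", shortlisted=False, criteria_met=2),
    Application("D", shortlisted=False, criteria_met=1),
    Application("E", shortlisted=False, criteria_met=3),
]

if __name__ == "__main__":
    for app in overlooked(SAMPLE):
        print(app.name, app.criteria_met)
```

If this list is regularly non-empty, that is a sign the tool's ranking and your stated criteria are measuring different things.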
Find where an AI assessment is measuring the wrong thing.
The other side of the same problem
Darren is a senior operations director applying for supply chain leadership roles at mid-sized manufacturers — companies between 250 and 800 employees. He uses Claude to help structure his application materials. He has no idea which AI tools the companies he applies to use for screening, and in most cases he never will.
What he can control: writing a version of his resume that doesn't rely on AI-distinctive structures. His Claude-assisted draft starts accomplishments with phrases like "Spearheaded cross-functional alignment to optimize procurement cycles." His rewritten version: "Cut raw material lead time from 18 days to 11 by moving three key suppliers to consignment inventory." That specificity should hold up better across screeners, human or AI, because it gives any evaluation method something concrete to recognize.
Surface the assumptions an AI evaluation is making that you didn't ask for.
The reliable signal
The bias doesn't mean AI screening is broken. It means holistic AI scoring is an unreliable signal when you haven't defined the criteria. Any tool making a judgment call on vague input will find patterns in style and proximity rather than substance. Define the criteria, score each one separately, and spot-check the applications the tool scored low. That adjustment takes 10 minutes before a screening run and changes what you're actually measuring.