Algorithmic judgements are better (and why you should use them in assessments)

In using assessment data to make talent decisions, most IO Practitioners are familiar with the basics: construct a test battery that will measure a job profile which in turn is based on the organisation’s competency framework. Once a candidate or candidates have been assessed, collate the results in the form of different test scores, and then find the candidate who best matches the job profile. Sounds simple enough, doesn’t it?

Of course, the reality is far more complex. Just how the overall picture of a candidate’s scores is composed and how that is related to a job profile can take multiple forms. And traditionally, there has been much disagreement among assessment professionals on what the best method might look like.

Asking which assessment integration method ought to be used leads straight into a debate that has been raging ever since human judgements were first studied scientifically. While this debate may well seem intimidating to the uninitiated, it is a vital one to understand. Why? Because it bears directly on the accuracy of our predictions of job success based on assessment data.

In addition, the extent to which IO Professionals can expect organisations to invest in assessments flows directly from our professional capacity to accurately predict which candidates are the best fit, the most useful talent to take into the organisation.

Summarising the entire debate over which kind of judgement works best falls outside the scope of this article, but over the last few decades psychological research has pointed in a very specific direction.

Representative of this trend is a study conducted by Nathan Kuncel from the University of Minnesota and his colleagues, who investigated two commonly used methods in assessment decision-making: clinical and mechanical/algorithmic data integration.

Clinical methods require practitioners or hiring managers to subjectively apply their professional judgement in making predictions about an individual’s fit for a role. In doing so, they would presumably draw upon collective insights, experience, and training to arrive at a conclusion.

In contrast, mechanical or algorithmic decision-making is based on a pre-defined decision rule, often in the form of weighted results that produce an overall score for each candidate. Traditionally, IO Professionals have been loath to entertain the notion that their expert judgements could possibly be inferior to such simple mechanisms as algorithms (albeit algorithms constructed by fellow professionals).
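To make the idea of a pre-defined decision rule concrete, here is a minimal sketch in Python of one possible mechanical rule: a fixed weighted sum of standardised test scores. The assessment names, the weights and the function names are all hypothetical and chosen purely for illustration. They are not taken from Kuncel's study or from any particular test battery.

```python
# Hypothetical weights, fixed BEFORE any candidate is scored.
# In practice these would come from job analysis or validation evidence.
WEIGHTS = {"cognitive": 0.5, "conscientiousness": 0.3, "interview": 0.2}


def composite_score(scores, weights=WEIGHTS):
    """Combine one candidate's standardised scores into a single number.

    `scores` maps assessment name -> standardised score (e.g. a z-score).
    The same weights are applied to every candidate, with no case-by-case
    adjustment, which is what makes the rule mechanical.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def rank_candidates(candidates, weights=WEIGHTS):
    """Return candidate names ordered from highest to lowest composite."""
    return sorted(
        candidates,
        key=lambda name: composite_score(candidates[name], weights),
        reverse=True,
    )
```

The point of the sketch is that the rule is written down in advance and applied identically to every candidate. A practitioner's expertise goes into choosing the assessments and setting the weights, not into adjusting individual results after the fact.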

In Kuncel’s study, the researchers used meta-analytical methods to summarise 25 independent research papers that investigated the relative worth and accuracy of these two decision-making methodologies.

So, what did Kuncel and his colleagues find?

  • Across the 25 studies, the researchers spotted a strong common thread: when decisions were made clinically, they were far less accurate than those that were made mechanically or algorithmically.
    • What that means is when assessment scores are weighted using pre-specified algorithms to produce overall scores, predictions proved to be more accurate by up to 50% compared to predictions made using clinical methods.
  • Perhaps even more interesting was the additional finding that the validity (such as it was) of clinical judgements was unaffected by the experience, organisational allegiance or specific job in question.
    • This suggests that clinical judgements, as a whole, have inherent limitations that transcend the knowledgeability of the practitioner who employs them.

Studies before and since the one conducted by Kuncel and his colleagues have shown very similar results, including in domains that go beyond IO Psychology. Engineering, medicine and other professions are equally vulnerable to the limitations of clinical judgement. What are IO Practitioners to make of these findings?

For one, the science reveals that we ought to be very sceptical of decisions based on subjective, professional judgement. While it may seem intuitively appealing to prefer our own experience and expertise when deciding whether or not to recommend a candidate based on our reading of their assessment results, doing so risks giving our clients and organisations deeply flawed advice.

A hallmark of good scientific research is that it often delivers counter-intuitive results, findings that we couldn’t have guessed at. Studies of clinical versus algorithmic judgements are excellent examples of this characteristic of scientific inquiry. The paradigm of using subjective and collective wisdom to arrive at an integrated judgement of assessment results is, at best, outmoded and, at worst, potentially irresponsible.

For IO Professionals to continue to deliver valuable advice to the businesses they serve, and for managers to continue to have confidence in the ability of assessments to predict job success, switching to mechanical and algorithmic methods of data integration is imperative. Ultimately, our value as IO Psychologists will be in constructing the most advanced algorithms that science can allow for, in getting close to the job in our role profiles, and in being level-headed, empirically driven advisors to the organisations we serve.

If you’d like to know how TTS uses algorithmic judgements to help our clients make better talent decisions, why not drop us a line at

Source: Kuncel, N. R., Klieger, D. M., Connelly, B. S., & Ones, D. S. (2013). Mechanical versus clinical data combination in selection and admissions decisions: A meta-analysis. Journal of Applied Psychology, 98(6), 1060.