Are Video Games the Future of Standardized Tests?

Perhaps it’s a function of the need to get things done with a large and diverse population, but the United States has had a long love affair with standardized aptitude tests.

Right through the education system, scores on nationally standardized aptitude tests are both measures of success and gateways to the next level of study. Even at the graduate level, GREs and GMATs govern admissions to higher degrees.

Beyond the academic world, the US military has long been a leader in aptitude and IQ testing, with the datasets they have produced over large populations being of huge interest to academics in psychology and related disciplines.

Standardized tests come in for a lot of flak, but they have important functions. Below, we see why standardized tests are necessary, but note that existing tests are often highly flawed.

To see how standardized tests might be reformed over time, we look at how the management consulting firm McKinsey replaced a dysfunctional old test with their new McKinsey Problem Solving Game – and how this might represent the future for standardized tests in general.

The Real Advantages of Standardized Testing

It has become fashionable to criticize standardized testing. Current tests are certainly not perfect, but moves against testing in general threaten to throw out the baby with the bathwater.

Where standardized testing works well, it is hugely useful, both in producing social mobility amongst test-takers and in improving the overall efficiency of the institutions which administer those tests.

For an individual from a deprived or otherwise underprivileged background, top scores in the exact same test everyone else took act as clear, undeniable proof of hard work and talent. Subsequently, it becomes significantly harder for a biased recruiter to lock these individuals out of academic or working opportunities.

The US Military is a case in point for this kind of testing regime. During the Second World War in particular, when vast numbers of conscripted men had little formal education, testing was used as a means to find the individuals with the capacity required to serve in leadership positions or in technical roles as engineers, radar operators, and the like.

Problems With Standardized Tests

At the same time, a perfect test is impossible.

Even the best-designed test in the world only captures the taker on one day – which might be an unusually good or bad day. After all, we have likely all messed up a test in school when we simply had an unexplainable bad day.

However, we are (very) far from the point where we have perfected aptitude testing.

Say we are testing candidates who have applied for a job to see who might be best for the role:

We might not be testing for the correct skills. For example, we might mistakenly test mathematical ability or something like raw intelligence for a job which mostly requires people skills to be successful.
We might not be able to measure the traits we are interested in very well. Thus, even if we simply wanted to test raw intelligence, we might have to work very hard to find a test that reliably measured that trait, could not be gamed, and did not otherwise give some candidates unfair advantages.

As a result of these and other factors, standardized test results might not actually be predictive of subsequent performance. That is to say, if we use a bad test to select people for a job, the people we select might not actually be any good at the job.
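One common way to put a number on "predictive of subsequent performance" is to correlate test scores with a later measure of how people actually did. The sketch below is only an illustration: the function name `pearson_r`, the variable names, and all of the numbers are invented for this example and do not come from any real test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

# Made-up numbers for illustration only (not real data):
# screening test scores and later job performance ratings for five hires.
test_scores = [52, 61, 70, 74, 88]
job_ratings = [3.1, 2.8, 3.9, 3.5, 4.4]

validity = pearson_r(test_scores, job_ratings)
print(f"predictive validity (Pearson r): {validity:.2f}")
```

A value near 1 would suggest the test ranks people much as their later performance does, while a value near 0 would suggest the test tells us little about who will do the job well.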

Real Life: The Management Consulting Industry

All this might sound very theoretical so far, but we have a real-world example from the management consulting industry – with implications that could feed back into testing in the academic and other domains.

Specifically, we will be looking at McKinsey. However, this is largely because that firm is the industry leader in the consulting sector and has simply been the first to really begin to address issues that affect the field more generally.

In doing so, though, McKinsey has also pointed the way towards more useful forms of standardized testing.

Why Tests?

Each recruiting season, McKinsey finds itself swamped with far more candidates than it has jobs. Interviews are expensive and inconvenient as they require taking senior staff off active engagements.

In short, then, there is no way McKinsey can interview every applicant.

Whilst many are cut simply on the strength of their resumes, there is still a huge number left. To thin down this pack further, the firm has long used screening tests. These were used to cut down the applicant pool to a manageable number to interview.

Previously, these were in the form of written, multiple-choice, business-themed reasoning questions – in much the same format as the GMAT that MBA applicants sit.

The similarity to the GMAT was appropriate when the typical new consultant was an MBA grad being recruited to work on generalist projects, where no previous technical experience would be required.

Changing Times

However, the world has changed and the management consulting industry with it.

Management consultants solve business problems for clients. In previous times, a clever MBA grad was generally sufficient to work on these problems.

However, each year, clients have been coming to firms like McKinsey with more and more complex, technical problems. For McKinsey to keep their competitive edge, they need to be able to solve these problems.

The problem is that this means recruiting a whole new kind of consultant, selecting recruits with industry-specific knowledge and skills to work on specific kinds of projects: programmers to work on tech projects, for example, or chemical engineers to work on oil and gas engagements.

This meant that the old GMAT-style screening test was no good, in that it assumed general business knowledge a technical expert might not have. Thus, the candidates the old test selected for were no longer necessarily the ones who would do the best job.

The Brief

All this meant that McKinsey really needed a screening test that could select candidates with the highly developed cognitive skills they need but without pre-supposing any kind of specific previous experience.

The ideal was also to be able to evaluate the candidate’s thinking process rather than just the eventual answers (what was called a “process score” as opposed to a “product score”). This both controls for the kind of fluke correct answers which might swing a multiple-choice test and gives a much deeper degree of insight into how the candidate actually thinks.
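To get a feel for how much "fluke correct answers" can matter, the chance of guessing at least a given number of questions right follows the binomial distribution. The sketch below is illustrative only: the function name `prob_at_least` is made up, and it assumes every question has the same number of options (four by default) and that each guess is independent, neither of which is stated about the old McKinsey test.

```python
from math import comb

def prob_at_least(k, n, options=4):
    """Chance of getting at least k of n questions right by pure guessing.

    Assumes each question has `options` equally likely answers and that
    guesses are independent (a simplification for illustration).
    """
    p = 1 / options
    # Add up the binomial probabilities of exactly i correct, for i = k..n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Example: chance of guessing at least 3 out of 10 four-option questions.
chance = prob_at_least(3, 10)
print(f"chance of at least 3 lucky answers out of 10: {chance:.1%}")
```

A product score cannot tell these lucky answers apart from reasoned ones, which is part of the appeal of scoring the process as well.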

A Video Game!

Enter the company Imbellus, which McKinsey contracted to create the McKinsey Digital Assessment. This is a truly innovative standardized test in that it makes use of a video game to assess candidates.

The game takes place in a natural setting, either an alpine or a marine ecosystem, and has candidates solve problems involving animals, with no mathematical component whatsoever.

The idea is that this evens the playing field, as pretty well everyone applying for this kind of job will have a reasonable idea of how these generic kinds of environment behave. Certainly, there is no business component.

Powerful software, driven by statistical learning, then tracks the specific way the candidate approached the game scenarios, examining every movement and click of the mouse to look for ordered, rational thought.

What is more, to prevent any kind of “gaming” the test by candidates conferring with one another or preparing for common scenarios, the Assessment generates unique games for each individual test-taker. Thus, the system changes the parameters of each new scenario so that the “correct” answer one time might not be so again.
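How the Assessment actually generates its games is not public, so the following is only a rough sketch of the general idea of varying scenario parameters per test-taker. Everything in it is hypothetical: the function `make_scenario`, the candidate ID format, and the parameter names and ranges are invented for illustration and are not taken from Imbellus or McKinsey.

```python
import random

def make_scenario(candidate_id):
    """Derive a hypothetical ecosystem scenario from a candidate's ID.

    Seeding the generator with the ID means the same candidate always
    gets the same scenario, while different candidates will usually
    get different parameters.
    """
    rng = random.Random(candidate_id)
    return {
        "ecosystem": rng.choice(["alpine", "marine"]),
        "species_count": rng.randint(5, 12),           # hypothetical range
        "food_supply": round(rng.uniform(0.5, 1.5), 2),  # hypothetical scale
    }

# Two invented candidate IDs, each mapped to its own scenario.
for candidate in ["candidate-001", "candidate-002"]:
    print(candidate, make_scenario(candidate))
```

Because the parameters shift from one candidate to the next, a memorized "correct" answer from a friend's session is unlikely to transfer.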

Importantly, the test is constantly improved as results are compared to subsequent performance at in-person interviews. This way, McKinsey can be sure their Assessment really is selecting the right people for the job.

The Future?

Imbellus have explicitly set their sights on the academic sphere, ideally hoping to replace the tests currently governing university admissions.

So far, their test with McKinsey is very promising, and they might be able to make the case for change in a system already widely regarded as dysfunctional.

It might be a scary idea to have one’s future education or employment hinge on how one performs in a seemingly ridiculous video game.

However, if these tests really do work, then the results should be less random than the current testing regime. As such, we should be content that our chances really will reflect our talents and that society really is giving university places and jobs to the best people, regardless of their backgrounds.

Of course, this benefits the individuals concerned. However, society as a whole also stands to materially gain from making sure we have the most competent minds in the positions which are most influential on the lives of others.
