Thursday, March 24, 2011

B-School Admission Criteria: A Critique, Part One

Disclaimer: I have spent well over two years trying to gain admission into a top-notch B-school in India, because I think an MBA education is necessary for my career progression.

A small note
Until a month or so ago, I never paid much attention to the logic behind the admission criteria of top B-schools. All I knew was that such criteria existed, and in order to gain an admission, I needed to satisfy them.

While going through the rigmarole of admission processes in 4 Symbiosis institutes in Pune, I happened to buy a book called Fooled by Randomness by Nassim Nicholas Taleb at a roadside bookstall. Reading that book changed my thinking. It made me realize the uselessness of B-school admission criteria.

That knowledge can now be yours, for free, if you take the time out to read this somewhat lengthy article.

Even non-MBA aspirants can gain from reading this. How? I am outlining a way of thinking here that can be applied universally, in practical situations faced in daily life.

I hope you gain as much pleasure and understanding reading this as I did from writing it.

THE PROCESS
B-schools give weightage to a lot of different criteria. Such division of weightage among ‘relevant’ parameters is considered to be a positive.

While the relative weightages of these criteria differ, all B-schools have the following criteria:
1. Written Assessment Test (WAT) or Computer Based Test (CBT) with Multiple Choice Questions (MCQs)
2. Profile of the candidate (Academics, achievements, work experience etc.)
3. Interview (Group/ Personal)

Other commonly used criteria:
1. Essay writing/ Précis writing
2. Group discussion
3. Case study analysis
4. Extempore
5. Group task/ group exercise

The final selection is based on an overall composite score, with minimum qualifying marks in the interview.

So, what’s wrong with this process?
At first glance, everything seems fine. All the above criteria appear to be objective and likely to ensure that the B-school gets a great candidate overall, competent in a lot of relevant aspects.

Delve into them a bit, though, and we start wading into murky waters. All the above criteria fail as tests of a candidate’s potential to handle an MBA program and a career beyond. The rest of this article will be devoted to explaining why.

First, the assumptions
The obvious assumption I have made is that the B-schools employ these criteria in order to get the ‘right’ talent for their institute. They may have other motives, but if so, it remains a secret to the aspirants.

As an MBA aspirant myself, and speaking on behalf of all aspirants, I can safely say that we assume the selection criteria are intended to get the ‘best’ students.

Debunking the Process
First, the WAT (or CBT) with MCQs in the areas of Language Comprehension, Quantitative Ability, Reasoning and Data Analysis. The mark allotment and weightage given to each area differs, but all entrance exams have these. Some also include a General Awareness (GA) section.

To explain the logic behind these parameters, I have to digress a little. The primary aim of schooling was to educate children in math, logic and language. These 3 are regarded as the fundamental skills every man needs (that ‘school’ has an etymology that translates into leisure is a different issue).

An MBA program is most probably the last formal education a student will receive in his life. To weed out those who have had at least 17 years (2+10+2+3) of formal education and still have not acquired these skills, B-schools employ such objective tests of math, logic and language (English). I’ll come to the GA part later.

I am in favour of testing for these skills; what I object to is the way these test scores are used.

Note: The relative weightage to be given to each section is subjective, and cannot be argued against provided there is not too huge a bias in favour of or against a section.

Let’s look at two types of entrance tests: a paper-and-pencil (PP) one (say SNAP, with +1 mark for a right answer and -0.25 for an incorrect answer) and a computer-based one with ‘normalized’ scoring (say CAT).
In the case of SNAP, everyone gets the same paper, which translates into an equal opportunity for all to prove their mettle vis-à-vis their competitors in that paper. The only unknowns at play are luck (which cannot be controlled) and the preparedness of the candidate. It is a good way of testing.

Note: No man is perfect. No test designed by man can be perfect. A good test is therefore one which is the least imperfect.

The problem with SNAP (leaving GA apart for now) lies not in the test itself, but in the interpretation of the test score. Say SIBM-P has a cut-off score of 118, and calls candidates at and over that threshold score for due process. They think candidates who meet the cut-off are worthy of further consideration.

But how is a candidate with a score of 118 so much better than one with 117.75 that one gets an interview call and the other does not? Clearly, he isn’t.

I don’t deny that for the sake of convenience, the admissions committee has to pick some random cut-off number, which will invariably be a round number. I only say it is not an objective measure of a candidate’s ability.

Yes, the candidate with a 117.75 is out of luck. It can’t be helped. Or can it? I say yes. Let us now look at how the marking system in SNAP is skewed in favour of risk takers.

The problem with negative marking
Why negative marking? Obvious answer: to minimize guessing, so that only real ability is rewarded. By penalizing a candidate for incorrect answers, they hope to ensure that people who make it to the cut-off do so due to their competence, without the help of lucky guesses.

But negative marking actually has the opposite effect: it rewards the guessers and penalizes the talented.

How?

Let us say you’re a salaried employee working as a coder in an IT company. You work hard, spend frugally and save your money in a bank. Would you gamble with it, knowing that you may lose all the fruits of your labour? I assume not.

But say your job is to gamble with other people’s money. You then do it probabilistically, with expected value calculations, hedges and stop losses in place. This is called investing.

Getting back to the SNAP scenario, consider two students A and B, both at a score of 100 due to sheer effort.

A is prudent, B has the investor mindset. A does not guess the answers he doesn’t know. His score stops at 100.

B is a risk-taker. He decides to randomly guess 10 answers. He is risking a maximum loss of 2.5 marks for a maximum gain of 10 marks. And his expected value is positive, at 0.625 marks.
[Expected value, assuming four options per question: (0.25 × 1 − 0.25 × 0.75) × 10 = 0.625]
A net positive, just from guessing!
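
Here is the same arithmetic as a small Python sketch (assuming, as above, four options per question, +1 for a correct answer and -0.25 for an incorrect one; the numbers are SNAP-style, not official):

```python
# Expected value of blind guessing under SNAP-style marking.
# Assumed scheme: 4 options per question, +1 for a right answer, -0.25 for a wrong one.

P_CORRECT = 1 / 4        # chance of a lucky hit on a pure guess
MARK_RIGHT = 1.0
MARK_WRONG = -0.25

def expected_gain(num_guesses: int) -> float:
    """Expected marks gained (or lost) from blindly guessing `num_guesses` questions."""
    ev_per_guess = P_CORRECT * MARK_RIGHT + (1 - P_CORRECT) * MARK_WRONG
    return ev_per_guess * num_guesses

print(expected_gain(10))      # 0.625  -> B's expected gain from 10 guesses
print(10 * MARK_RIGHT)        # 10.0   -> maximum possible gain
print(10 * MARK_WRONG)        # -2.5   -> maximum possible loss (the 'stop loss')
```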

But wait. If guessing can be so rewarding, why doesn’t B just guess all the questions he doesn’t know? If he is intelligent, he won’t. An intelligent investor always has a stop loss in place.

Understanding stop loss
Let us say you own a share worth Rs. 100/- at present market conditions. You expect it to rise in value, to a maximum of Rs. 110/-, at which point you will surely sell it, which, incidentally, is called profit booking (realizing the gain). In case the market moves against you, you don’t want to take too much loss on your investment. So, you place a stop loss order at Rs. 97.5/-.

The moment the share price touches Rs. 97.5/- (called stop price), your broker sells your share for you and gets you out with a loss of Rs. 2.5/-. (This is a simplified version; sometimes your broker may be unable to sell at the stop price and you may make a greater loss). Your net risk is 2.5% of your investment. Your net reward can be up to 10% of your investment. Does the risk seem to be worth it? It depends on how much risk you can stomach.

Consider a few more scenarios with the same analogy (current stock price Rs. 100/- in each case):

Stop Loss | Expected Profit | Expected Value | Risk vs. Reward
Rs. 95/- | Up to Rs. 20/- | Rs. 1.25/- | Up to 5% downside for up to 20% upside
Rs. 92.5/- | Up to Rs. 30/- | Rs. 1.875/- | Up to 7.5% downside for up to 30% upside
Rs. 90/- | Up to Rs. 40/- | Rs. 2.5/- | Up to 10% downside for up to 40% upside
Rs. 87.5/- | Up to Rs. 50/- | Rs. 3.125/- | Up to 12.5% downside for up to 50% upside

These scenarios I have outlined are basically the same as guessing in SNAP. I stopped at a gain of 50 since SNAP has only 150 questions and B has already answered 100 through his talent.

B has decided to place a stop loss at 97.5 because of his subjective assessment. A has placed his stop loss at 100 by not taking any risk.

If B thinks the cut-off is surely more than 100, he is definitely motivated to guess some answers. He either makes it, luck being in his favour, or does not make it, luck being against him. If the cut-off is above 100, A has no chance of making it, while B does. The risk-taker has been successfully rewarded by SNAP.
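
To make this concrete, here is a rough Monte Carlo sketch under assumed numbers: a cut-off of 102, and B blindly guessing 20 four-option questions. The exact figures are mine, not SNAP’s; only the asymmetry matters.

```python
import random

# A and B both have 100 marks from talent. A stops there; B guesses 20 more
# questions (4 options each, +1 right, -0.25 wrong). Assumed cut-off: 102.

def b_final_score(base: float = 100.0, guesses: int = 20) -> float:
    score = base
    for _ in range(guesses):
        score += 1.0 if random.random() < 0.25 else -0.25
    return score

TRIALS = 100_000
CUTOFF = 102.0
b_clears = sum(b_final_score() >= CUTOFF for _ in range(TRIALS)) / TRIALS

print("A clears the cut-off with probability 0.00 (his score is stuck at 100)")
print(f"B clears the cut-off with probability ~{b_clears:.2f}")
```

With these assumed numbers, B crosses the cut-off a substantial fraction of the time purely on luck, while A never can.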

More worryingly, if B has access to past data regarding cut-offs in SNAP for various institutes, he can do a time-series analysis and estimate the current cut-off (if he knows what he is doing, he can do it with remarkable accuracy). Armed with this estimate, and depending on his score due to talent, he can determine the optimal level of risk he needs to take. And if he can eliminate one or two choices per question he guesses, with 100% certainty, his expected value will shoot up!
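
Roughly how the arithmetic changes when options can be eliminated with certainty (same assumed marking scheme as before):

```python
# Expected value per guess when some of the 4 options can be ruled out.
# Assumed marking: +1 for a right answer, -0.25 for a wrong one.

def ev_per_guess(options_eliminated: int) -> float:
    remaining = 4 - options_eliminated
    p_correct = 1 / remaining
    return p_correct * 1.0 + (1 - p_correct) * -0.25

for e in range(3):
    print(f"eliminate {e} option(s): EV per guess = {ev_per_guess(e):.4f}")
# eliminate 0 option(s): EV per guess = 0.0625
# eliminate 1 option(s): EV per guess = 0.1667
# eliminate 2 option(s): EV per guess = 0.3750
```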

Thus, we see how the intelligent risk-takers are rewarded and the prudent non-guessers are punished.

Lesson: Negative marking makes luck play a more important role and, worse still, that luck helps only a few (the risk-takers) rather than all. A better way to deal with the role of luck would be to give everyone an equal opportunity to use it: stop penalizing incorrect answers.

The problem with no negative marking
If guesses are not penalized, then everybody is incentivized to guess. No candidate’s score will therefore be an accurate measure of his talent! Is not having negative marking worse than the problem it is trying to solve?

No.

Consider the following table:



Type of candidate | Score under negative marking reflects | Score under no negative marking reflects
A (no risk) | Talent | Talent / more than talent
B (risk-taker) | More than / less than / equal to talent | Talent / more than talent


When there is no negative marking, both A and B have an equal opportunity to use luck, and their scores will be at least equal to their level of talent. When there is negative marking, A’s score reflects his real talent while B’s score may not. Also, under negative marking, if you don’t know who A is and who B is, how will you judge who is merely talented and who is talented plus a risk-taker? You can’t. Whereas, under no negative marking, you can more accurately discount the influence of luck on each candidate’s score. (If you attempt that in a negative marking scenario, you penalize A by pushing his score below what he earned by talent alone.)

Sure, discounting the influence of luck (under no negative marking) may still harm those who were favoured less by luck than the discounted value, and benefit those who were overly favoured by luck. But it is less imperfect than the current negative marking system.
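
To show what ‘discounting the influence of luck’ could look like in practice, here is a minimal sketch. It assumes a 150-question paper with four options per question, no negative marking, and a candidate who attempts everything, guessing whatever he doesn’t know; the numbers are illustrative.

```python
# With no negative marking, a candidate who knows `known` answers and guesses
# the rest has an expected score of known + (TOTAL - known) * 0.25.
# Inverting that gives a crude, luck-discounted estimate of talent.
# Assumptions: 150 questions, 4 options each, every question attempted.

TOTAL = 150

def expected_score(known: int) -> float:
    return known + (TOTAL - known) * 0.25

def estimated_talent(observed_score: float) -> float:
    return (observed_score - 0.25 * TOTAL) / 0.75

print(expected_score(100))       # 112.5 -> expected score of a 100-question talent
print(estimated_talent(112.5))   # 100.0 -> luck discounted back out
```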

(I have to make a confession here. There is a statistical concept called “Regression to the Mean” which can be used to take my explanation of luck further, but I haven’t understood it sufficiently well to attempt to invoke the concept here).

And so I say, since luck influences your chances of an interview call anyway, everyone should be given an equal opportunity to use it, and not just the risk-takers. There may be outliers who make it on sheer luck, but that is true in both negative marking and no negative marking scenarios.

Consider this example: while it is entirely possible that among a large group of music illiterates hitting 60 keys on a piano randomly, one may produce the starting notes of a symphony, the chances of him performing at Carnegie Hall are very, very slim. His lucky performance does not invalidate the ability of the talented performers. That is, a test should not be judged based on outliers. (There is a concept in statistics called smoothing, which tries to capture patterns while eliminating the effects of the outliers. While this has some defects, it holds good for analyzing data sets where the effects of noise (here, luck) on the actual data need to be minimized).
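
For the curious, here is the simplest possible version of such smoothing, a moving average that damps isolated spikes (the ‘lucky’ outliers) while keeping the overall pattern; the numbers are made up.

```python
# A simple moving average: replace each point with the mean of a small window
# around it. Isolated spikes get pulled towards their neighbours.

def moving_average(values, window=3):
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

scores = [60, 62, 61, 95, 63, 64, 62]   # one 'lucky' outlier at 95
print(moving_average(scores))            # the spike at 95 is pulled down sharply
```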

Also, it is more ‘fair’ to those MBA aspirants who just need to be good at their 3 fundamental skills in order to do an MBA, such as those who do not want to specialize in finance. They don’t need to cultivate the investor mindset for their profession; they shouldn’t be forced to do so just to take an entrance exam.

Moving on to CAT
If even a PP based test like SNAP suffers from a lack of objectivity, what of CBTs with normalized (scaled) scores like CAT and NMAT? NMAT has rightly decided to stop penalizing incorrect answers (and has allowed multiple attempts), so I stick to CAT.

Statistics is reliable only at the macro level; it doesn’t truly represent any individual sample. Let’s say you flip a fair coin twice in succession. According to statistics, you should get 1 Tail (T) and 1 Head (H) in some order (more precisely, the results converge to that ratio over many repetitions). The possible outcomes are {HH, HT, TH, TT}. Of these, {HT, TH} individually satisfy the condition, while {HH, TT} do not, although taken together they do. If all 4 outcomes (collectively called the sample space) materialize once each over 4 experiments, we get 4 Heads and 4 Tails in total, the 1:1 ratio predicted by statistics. When this experiment is repeated a large number of times, the ratio of H to T converges to 1:1, though you will still find numerous strings of {HH} and {TT}.
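
A quick simulation makes the same point (a sketch, not a proof):

```python
import random

# Flip a fair coin many times: the overall H:T ratio converges towards 1:1,
# yet strings of HH and TT keep showing up throughout.

random.seed(42)
flips = ['H' if random.random() < 0.5 else 'T' for _ in range(100_000)]

heads = flips.count('H')
tails = len(flips) - heads
print(f"H:T ratio over {len(flips)} flips = {heads / tails:.3f}")

same_pairs = sum(1 for a, b in zip(flips, flips[1:]) if a == b)
print(f"adjacent pairs that are HH or TT: {same_pairs} of {len(flips) - 1}")
```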

What we understand from this is that while statistics ‘work’, they can still misrepresent an individual sample. A statistical scoring technique which is 99.95% reliable will tend to mess up the scores of 5 out of every 10,000 candidates. At over 2 lakh test takers, around 100 may get short shrift. And you won’t know if you’re one among them.

The more complicated something is, the greater the chances of failure. Do you trust an opaque algorithm to evaluate you accurately?

The problem is exacerbated with the presence of negative marking.

For CAT, where a 0.01 percentile could be the difference between an IIM-A call and no IIM-A call, how many deserving candidates are excluded because they ran out of luck? (I would not say the converse though, that many undeserving candidates got an IIM-A call because Lady Fortuna smiled on them. You need both talent and luck to get an IIM-A call, and frankly, you can’t differentiate between the two. And beyond a threshold level of talent, talent doesn’t really matter for your performance).

A better selection system would be a lottery for all those who passed a threshold percentile. Skip the illusion of objective selection, acknowledge the role of chance. It’s much more ‘fair’ to the candidates than the current system.
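
In code, the proposal is almost embarrassingly simple (a sketch with made-up candidate data, an assumed threshold percentile and an assumed number of seats):

```python
import random

# The proposed two-step selection: (1) keep everyone above a threshold
# percentile, (2) fill the seats by lottery from that shortlist.
# Candidate data, threshold and seat count below are all made up.

candidates = {f"candidate_{i}": random.uniform(0, 100) for i in range(10_000)}

THRESHOLD_PERCENTILE = 95.0
SEATS = 60

shortlist = [name for name, pct in candidates.items() if pct >= THRESHOLD_PERCENTILE]
admitted = random.sample(shortlist, min(SEATS, len(shortlist)))

print(f"{len(shortlist)} cleared the threshold; {len(admitted)} admitted by lottery")
```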

How does CAT compare students across time slots?
They use ‘similar’ questions to judge the caliber of the test takers in each slot, and based on this, all scores are brought to the same scale. I have my reservations about the use of these ‘similar’ questions and how they are scored. I’m not critiquing the method itself; since they keep it a secret, it is not possible to critique it.

Consider two similar questions to understand my point.

Q1. Suppose I have a pack of cards, each of which has a letter written on one side and a number written on the other side. Suppose, in addition, I claim that the following rule is true: If a card has a vowel on one side, then it has an even number on the other side.

Imagine that I now show you 4 cards from the pack: E 6 K 9
Which card(s) should you turn over in order to decide whether the rule is true or false?

Q2. You are a bartender in a town where the legal drinking age is 21, and you are responsible for making sure the rule is not violated. You are confronted with the following patrons and would have to ask a patron to show you either his age or what he is drinking. Which of the 4 patrons would you have to question?

1. Drinking beer
2. Over 21
3. Drinking soft drinks
4. Under 21

While both these problems are ‘similar’, the second question is much easier to answer. (The answer is to check 1 and 4 in both cases.) Why? Because it is easier for us to relate to people than to cards, even though the logic is identical. My question now is: how will these types of ‘similar’ questions be evaluated?
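
The underlying logic is mechanical enough to write out: a card (or a patron) needs checking only if what you can see could be hiding a violation of ‘if vowel, then even’. A small sketch for the cards in Q1:

```python
# Which cards must be turned over to test the rule "if vowel, then even"?
# Only a visible vowel (the hidden number might be odd) or a visible odd
# number (the hidden letter might be a vowel) can conceal a violation.

VOWELS = set("AEIOU")

def must_check(visible: str) -> bool:
    if visible.isalpha():
        return visible.upper() in VOWELS   # vowel showing: verify the number behind it
    return int(visible) % 2 == 1           # odd number showing: verify the letter behind it

cards = ["E", "6", "K", "9"]
print([c for c in cards if must_check(c)])   # ['E', '9'] -- i.e. cards 1 and 4
```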

How well do entrance tests test your 3 fundamental skills?
It can be argued that any question in math, reasoning and data analysis helps test for your math and logic skills, but what about language? Does the answer to this question really test anything useful?

A baby deer is called a ____.
(a) Foal
(b) Fawn
(c) Calf
(d) Joey

That apart, in many questions, the answers are subjective. The ‘right’ answer is anybody’s guess. This is also true of the decision-making caselets in XAT.

Imagine you are driving a motorbike, wearing a helmet, and at a red signal, you are asked by a traffic cop to produce your license. You don’t have it with you at that moment. Do you…

(a) Plead with the cop saying you have it at home and you would produce it if only he lets you go get it? (And actually do so if he lets you go).
(b) Stall and wait for the signal to turn green and just zoom away on your motorbike.
(c) Pay him a bribe which is less than the fine.
(d) Pay the fine.

Which is the ‘right’ decision to take? Clearly, it depends. If I am sure I could zoom away, I would. If I’m uncertain about that, I would either plead or pay a bribe (depending on whether my time or my money is more important at that moment). I would never pay the fine, if I could avoid it. Now, which choice should I tick? (The worst choice of all, pay the full fine, would probably be the ‘best’ one from an admission point of view though. Does ticking that tell the examiner anything about my actual decision-making ability? What are the odds that a person does in real life what he says he will do in an exam paper?)

A test of decision-making is probably the worst test one could contrive, save for the test of general awareness.

What’s wrong with testing a candidate’s general awareness? Surely a manager should know what’s going on around him. In fact, everyone should know what’s going on around them. Why not test for it?

I counter-question: How will you test ‘general’ awareness with ‘specific’ questions? Can specific questions test for general answers? A specific question can only test for a specific answer. Only open-ended questions can test your general awareness. Let me explain the difference with an example.

Specific question: What was India’s real GDP growth in the last fiscal year?

Open-ended question: What do you think of India’s real GDP growth in the last fiscal year?

The first type of question tests your memory; the second tests your awareness and gives an insight into your understanding of the data. A rote learner can answer the first question, but not the second. A ‘generally aware’ person can most definitely answer the second question, and he may be able to answer the first with approximate figures. Whom does the GA section of the entrance exam benefit? The rote learner! The one who mugs up a CSR yearbook excels, while a real thinker who doesn’t care for the exact numbers (which, in any case, are made up and manipulated so much that they may as well be fictitious) is penalized. The insidious practice of rewarding rote learning over thinking continues well beyond school and undergraduate education!

An even more saddening criterion
There are institutes whose placement eligibility criteria include test scores on ‘business awareness.’ That’s taking stupidity to new highs. I wouldn’t be surprised if they churned out rote learners by the dozen. I pity the thinkers who aspire to be real managers but make the mistake of pursuing their MBA in such institutes. I’m glad I’m not among them. (Full disclosure: I got rejected by one such institute).

All in all though, there isn’t much to criticize about the entrance test. It is the most objective of all the selection criteria (though still flawed).

To be continued....

6 comments:

smriti said...

I agree with your views, though it never occurred to me to analyse this deeply. Nice to see someone capable of such analysis, and your presentation is great as well. Liked the interspersed humour as well :) Keep blogging!

culdude said...

Kashyap, you have come up with a wonderful post backed with strong logic...I admire the way in which this long post has been constructed to keep the readers interested :)

Bharathwaj said...

Kashyap, you have come up with a wonderful post...I especially admire the way in which the post has been constructed to keep the readers interested till the end :)

Bharathwaj said...

I can safely say that we assume the selection criteria are intended to get the ‘best’ students.

Actually, this is indeed an assumption...I have had the feeling that the IIMs have always tried to eliminate the max number of candidates to narrow down to 800 from 2.5 lakhs...Why else would they consider my 10th and 12th grade percentage? I'm not quite sure how scoring better in history, geography and Sanskrit some 10-odd years ago would make me a better candidate for an MBA?!

Bharathwaj said...

Thus, we see how the intelligent risk-takers are rewarded and the prudent non-guessers are punished.

Apart from analytical, logical and time management skills, I strongly believe that a manager should possess risk-taking ability...You might ask how taking a risk can be an ability when risk is something we take without being sure of the outcome? Well, intelligent and calculated risk is always expected of a manager...I suppose these exams expect one to take calculated risks and hence they are rightly rewarded

Kashyap said...

Thanks for the feedback :)

Bharathwaj,
"I suppose these exams expects one to take calculated risks and hence they are rightly rewarded"

I don't accept that managers are risk-takers. If you're a shareholder, would you prefer that your company minimizes risk and maintains a steady revenue, or that it takes 'calculated risks' but might go bankrupt if they don't work out?

Also, this isn't about calculated risk. Among say 100,000 calculated risk takers, only a handful get lucky. It's still a case of luck favouring only the elite.