At that company (a ~200-engineer, privately held software company) we found a few things:

- In-person tests were less predictive than take-home tests.
- Tests that did not provide automated test cases as examples were less predictive than those that did.
- There was virtually no predictive power to 'secret test cases' that we ran without providing to the candidate.
- No other part of the interview pipeline was predictive at all. Not whiteboarding, not presenting, not personality interviews, not culture-fit testing, not credentials, not where experience came from. Nothing. That held across all interviewers and candidates.
A few caveats about this:

- This was before take-home testing had become widespread and many companies had screwed it up. At the time we were doing this, it was seen as novel and interesting by candidates, not as just one more painful hoop to jump through.
- We never interviewed enough candidates to reach true statistical significance.
- False negatives were our biggest concern; they are extremely hard to measure (and measuring them can potentially open you up to lawsuits). The best we ended up doing was making our pipeline less selective to account for them. This did not seem to reduce employee quality.
In a more meta sense, that experience led me to believe that strict hiring pipelines are largely not useful. Bad candidates still get through and good candidates don't. Also, many other things have a far bigger impact on productivity than whether a candidate was 'good'. It turns out humans do not produce at consistent levels all the time, and factors outside of what you can interview for matter more: company process, employee health, life events, etc. all have far more impact on employee productivity than their 'score' at interview time.
Did you test the predictive power of individual interviewers? At a company I worked at previously we did, and this was by far the best overall predictor: some interviewers simply did a much better job of identifying candidates likely to succeed than others. That could also explain why you saw little predictive power when you looked at those other items across all interviewers: the variance between interviewers essentially "swamps" any smaller differences between interview techniques.
Note this didn't surprise me that much, as you see this dynamic in lots of other "person-to-person" endeavors. For example, when looking at whether one type of psychotherapy intervention is better than another, most of the data that I've seen shows that by far the most important factor is the skill and "match" between therapist and client, far more important than any individual modality.
Again, we didn't have enough data points for real statistical validity, so it could be that, but I became convinced that it didn't matter who was interviewing or what the format of the interview was. Some candidates are good at interviewing and some aren't, but that didn't carry over to the job.
This brings back some unpleasant memories of a take-home I got from a FAANG.
Basically, I was given a loose spec to implement, with no real data or test cases (and was told, when I asked, that none would be provided). After submitting my work I received a terse rejection with zero constructive feedback for my six hours of work. Uncool.
The only exception I've made is if the company pays for the time.
The point of a coding interview is to eliminate, as fast as possible, people who simply can't code. I'm being completely serious here. They can even have a CS degree (or claim to, though if you look closely they were in an easier program and just took CS electives) yet be unable to write a simple program on the board in an hour.
It's also why I don't like take-homes. First, it's trivial to cheat (I don't mean looking stuff up online; I mean flat-out having someone else do the work), and because of that the final stage would still have to be an in-person whiteboard (or pair programming over Slack, which still means an engineer spending 40+ minutes with the candidate).
I adopt the style the role calls for. I capitalize on opportunities to make decisions I can discuss: "I used tape instead of jest because this example product will be distributed to many developers. The reduced API surface area keeps us focused on the how, not the what."
I tone that down if the role seems like more rote work, at which point I try to highlight my ability to solve problems and learn quickly. For example, a comment above some network call: "// I was getting a CORS error and found out I can run my own proxy for this"
I'm not putting in half a day of work for zero pay to help you with your first-pass weed-out phase before we've even bothered to check that we align otherwise and that this looks like a good fit. Thanks, bye, next (employer) candidate.
If you are talking about short-term trials, many devs are bound by anti-moonlighting employment agreements that either outright bar working for someone else or require notification.
For long-term trials, you severely limit your hiring pool because that is effectively temp-to-hire, which many devs simply will not do.
How did you measure the candidate once hired?
What factors were indicative of a "good" hire vs. a "bad" hire?
If we are recruiting a senior, we would expect them to easily complete basic technical tests. If they are more junior we might use them only as an indicator of their ability.
I don't particularly expect a strong correlation between how well they did in the tests and their long-term ability since their value is made up of many things, only one of which is their ability in the tests.
There was only one time I was still unsure and didn't want to waste the candidate's time, so rather than telling him sorry, I set him a paid coding test to develop a microservice in order to judge his style, how long he took, what questions he asked, etc. I didn't think the result was good enough, but because we paid him, we parted on good terms and he came away with some useful feedback.
I have found that investing the time to correctly onboard new team members makes a huge difference. Correctly onboard an average/good hire and they go on to produce solid output and often thrive. On the other hand, you could have a great new hire but because of no/poor onboarding they "sink" instead of swim.
This seems like one of those occasions where improving your reliability by just a few percent (even if far from statistical significance) can massively reduce costs in the long run. (Maybe Kahneman even used interviewing as a concrete example of this in his latest book?)
Do you check on the applicants who were denied based on their test and see where they ended up working? E.g., you are a mid-tier startup that rejects someone who ends up working at Amazon as a high-level engineer: do you mark that as a failure?
I'd be careful to presume you can know these things from an interview.
> unless (of course) the candidate that was actually hired ends up being an even worse fit (ergo the need to fix your hiring process).
Total lack of self awareness in the corporate world really is an amazing thing to behold. I suppose this is "iterating" (in HR speak, not code speak): taking a set of criteria which generates a wrong conclusion, and then applying all that to ancillary things to find more wrong answers.
Unless one's focus is research and development, there is a non-zero cost to training for production skills, so it's best to start with someone who understands the delivery process.
Linear metrics are probably less useful, inasmuch as it will become rather obvious as to which employees are self-starting and work well with others, versus those that require motivation or are staunch individualists.
Timed algo challenges encompass a slew of antipatterns in terms of how good code is actually written and shipped. To begin with, pitting someone against a clock and hidden test cases (and a foreign editor) is actively optimizing against solutions that are readable to other human beings -- or to the person writing them, a year from now. The nature of running them in a browser means that it can't evaluate a person's capacity to actually use tools outside of core language functionality. Never mind that building the entire exercise around predetermined test cases precludes any way to gauge whether the person taking it has an understanding of writing tests.
And that's assuming your test environment doesn't add obnoxious and arbitrary restrictions of its own. Like telling you that using documentation is cheating. (Btw imagine listening for ctrl+t here, but not ctrl+n.) Or offering you "the language of your choice," but then throwing API call exercises at you while limiting your choice of JavaScript runtimes to a bare installation of Node -- the only one still in active development, out of a list that also includes every browser you would use to actually access the test -- that doesn't support Fetch.
The submitted question seems to just brush over this aspect, but so far when I've tried to evaluate interviewing techniques, that has been the primary obstacle: people just can't agree on what success means once employed, so anything that tries to correlate interviewing with it will be equally junk.
I interviewed hundreds of technical people in my career, across dev, test, and ops skill sets. I saw limited correlation between tests and aptitude. If you talk to someone about a project they've done, you know pretty quickly:
1) Can they communicate technical ideas?
2) Can I develop a rapport with this person and work together?
3) Do they understand what they built? Can they talk about the tradeoffs they made? Did they learn anything from the experience?
A fizz buzz test isn't a terrible idea, but you also need an interviewer who knows how to administer it within the wider context of the interview. An interviewer who doesn't understand it themselves isn't qualified to administer it.
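For anyone unfamiliar with the exercise, it is tiny by design: walk the numbers 1 through 100, substituting "Fizz" for multiples of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both. A minimal sketch (the function name and structure are mine, not something from this thread):

```python
def fizz_buzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:          # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

for i in range(1, 101):
    print(fizz_buzz(i))
```

The point of the screen is not the solution itself but watching whether the candidate can produce something this simple at all, which is why the interviewer needs to know what they are looking for.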
https://catonmat.net/programming-competitions-work-performan...
We have short, standardized, broad interviews. We look for what can be added to the team rather than poking holes, and we're still trying to improve.
So far we’ve hired 7 decent and 3 great people. No truly bad people have made it through that pipeline yet.
I can’t say anything about why, and I’d be prejudiced in any case.
Of course, it's really not possible at all to do this at the level of rigor expected of, say, clinical trials. Each new hire will know what type of interview you put them through, and there is no reliable way you can prevent them from telling others.
On top of being hard to measure, the data points generated through hiring are just too few, and the data collection process is too long and subjective.
Just ask your team if they like the new hire and whether they can make progress together. Things like: Do you like working with the new hire? Is the new hire bringing new insights to the team? Is the new hire easy to work with? Is the new hire learning new things?
And most importantly: can the team let go of a mismatch fast enough? Overall, I would say it is just not worth trying to measure hiring.
However, we do hire some contractors essentially without an interview, and it is fairly apparent that's a bad idea.