The Car Wash Problem: A variable isolation study on prompt architecture
Last week, the "Car Wash problem" (car wash 50m away: walk or drive?) went viral here on HN. Every major LLM failed because it missed the implicit physical constraint: the car has to be at the car wash to get washed. While testing InterviewMate's prompt architecture, I posed the same question, and it answered "drive" immediately. But I didn't actually know why it worked, so I ran a variable isolation study to find out: 100 API calls against Claude Sonnet 4.5, across 5 conditions:
Baseline (no prompt): 0%
Role only: 0%
Context injection (user profile, car location): 30%
Structured reasoning (STAR framework): 85%
Full stack (both combined): 100%
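To make the setup concrete, here is a minimal sketch of how the conditions and pass/fail grading could be wired up. The condition prompts and the grader below are my illustrative assumptions, not the actual InterviewMate prompts; the real ones are in the linked repo.

```python
# Hypothetical eval-harness sketch: five prompt conditions plus a grader.
# The actual prompts used in the study live in the repo; these are stand-ins.

CONDITIONS = {
    "baseline": "",
    "role_only": "You are InterviewMate, a decision-support assistant.",
    "context_injection": "User profile: the user's car is parked at home.",
    "structured_reasoning": (
        "Before answering, reason in STAR order: Situation, Task "
        "(state the goal explicitly), Action, Result."
    ),
}
# "Full stack" combines context injection with structured reasoning.
CONDITIONS["full_stack"] = (
    CONDITIONS["context_injection"] + "\n" + CONDITIONS["structured_reasoning"]
)

QUESTION = "The car wash is 50m away. Should I walk or drive?"


def grade(answer: str) -> bool:
    """Pass iff the answer commits to driving before (or instead of) walking.

    Rationale: the correct answer requires noticing the car must be at the
    car wash, so any answer leading with "walk" fails.
    """
    text = answer.lower()
    drive_pos = text.find("drive")
    walk_pos = text.find("walk")
    return drive_pos != -1 and (walk_pos == -1 or drive_pos < walk_pos)


# Each condition would then be sent as the system prompt for 20 calls
# (100 total across 5 conditions), with grade() scoring each response.
```

A keyword grader like this is crude; the repo's eval data would show whether the study used something stricter, e.g. an LLM judge or exact-answer matching.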
Throwing facts at the model doesn't work unless the architecture forces it to explicitly evaluate the task goal first. Without structure, the model jumps straight to the distance heuristic: "50m is short, walk." I'm writing a paper on this and wanted to share the raw data with HN first. Code and raw eval data: https://github.com/JO-HEEJIN/interview_mate/tree/main/car_wash