Which is the problem: it is using natural language processing techniques that were state of the art in 2009 but have been completely eclipsed in the past couple of years. The "rigor" tends to be at odds with the fluidity of user input: typos, search query-ese, ambiguity, etc.
The challenging problem that Wolfram|Alpha tries to solve is the conversion of natural language queries to structured ones. Although I doubt Wolfram's parser has been completely static for a decade, the most recent generation of language models is vastly better at that translation. See also: how terrible Siri is at everything.