Every high school student should learn how to grapple with uncertainty, how to evaluate statistical claims and experiments, how to interpret graphs and charts, and how machine learning models work (at a high level), and should internalize concepts like "significance", "error bars", and "expected value."
This training will help all students every single day of their lives, because it teaches them how to think. Society benefits from having more people with the tools to evaluate data and deal with uncertainty, especially as we face a looming epistemological crisis.
Calculus, on the other hand, will be used by very few students, and even those few are unlikely to use it every day. Yes, it is a prerequisite for some STEM courses in a degree program, so calculus can be taught in the first year to undergraduates pursuing a STEM field (or taken as an elective in high school by those who want it).
It's a shame that Stanford and Harvard, which set the tone for high schools and high schoolers, are going in the wrong direction here.
Pet peeve: can we just go back to calling these things statistics?
While I agree with you that statistics should be more heavily emphasized at the high school level, the issue goes much deeper within American math education than the one class.
Visualization, scripting, data collection, models, simulation. edX had a great course by Guttag and Grimson. Add to this Scott E. Page’s Model Thinking. Add edX Data Analytics and Learning from UT Arlington. And some Tufte.
I say this because I work in the accounting field and brought scripting to my firm from my own self-study. It’s been a superpower for me, and it has solved several problems which my colleagues had tackled using Excel alone.
I’ve also studied statistics, but found it less generally useful.
How do you expect students to understand what they are doing with "data science" without learning probability and statistics, and how do you expect students to get probability and statistics without learning calculus?
I mean, Bayes' theorem. How do you get people to get it if they don't know calculus?
Bayes' theorem follows straightforwardly from P(A & B) = P(A|B) P(B) and P(A & B) = P(B & A). The latter tells us that we can swap A and B in the former without changing the value, giving us P(A|B) P(B) = P(B|A) P(A).
Rearranging gives P(A|B) = P(B|A) P(A) / P(B), which is Bayes' theorem.
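As a quick sanity check of the derivation above, here is a minimal Python sketch that verifies the identity by counting outcomes of two fair dice. The events A and B and the helper name `prob` are made up for illustration; nothing beyond counting and fractions is involved.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes), by counting."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Hypothetical events chosen for illustration.
def A(o):
    return o[0] + o[1] == 8   # the two dice sum to 8

def B(o):
    return o[0] == 6          # the first die shows 6

p_a, p_b = prob(A), prob(B)
p_a_and_b = prob(lambda o: A(o) and B(o))

# Conditional probabilities from the definition P(X|Y) = P(X & Y) / P(Y).
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B).
bayes_rhs = p_b_given_a * p_a / p_b
print(p_a_given_b, bayes_rhs)  # both print as 1/6
```

Using `Fraction` keeps the arithmetic exact, so the two sides can be compared with `==` rather than with a floating-point tolerance.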
If you want to introduce continuous distributions like the Gaussian, you can just say "area under the curve" when you need to connect the density to a numerical probability. Students don't have to know how to do the integral; in the case of a Gaussian, it's just tabulated anyway.
I'd argue that you could teach a perfectly reasonable high school stats class using this kind of approach.
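To illustrate the "area under the curve" idea without doing any integral by hand, here is a rough Python sketch that plays the role of the printed table. It leans on the standard library's `math.erf`; the function names `normal_cdf` and `prob_between` are hypothetical helpers, not part of any course material.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(lo, hi, mean=0.0, sd=1.0):
    """Area under a Gaussian density between lo and hi."""
    return normal_cdf((hi - mean) / sd) - normal_cdf((lo - mean) / sd)

# Share of values within one and two standard deviations of the mean.
print(round(prob_between(-1, 1), 4))  # 0.6827
print(round(prob_between(-2, 2), 4))  # 0.9545
```

A student only needs to know that the function returns an area, which is the same thing a z-table lookup gives.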
A "calculus-free" method is mostly what is done for high school physics, with occasional nods in that direction to set the students up for later. And as with physics, the obvious connection of continuous probability to calculus will be a nice motivation later on.
One analogy is how we teach probability to sophisticated engineering undergraduates. I'm not aware of undergrad engineering curricula that use measure theory. This results in awkwardness around delta "functions" and probabilities of certain sets of measure zero (sets that cannot be integrated without the Lebesgue integral).
And sure, some of those undergrads don't ever take that measure theory class, so they escape to the wild without knowing the answers to awkward questions.
> Calculus, on the other hand, will be used by very few students,
These two statements do not mesh. Understanding how machine learning models work requires Calculus.
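One way to make this concrete: many models are trained by following the derivative of a loss function downhill. Below is a minimal sketch of that idea, assuming a one-parameter model y = w * x and a squared-error loss. The toy data, the learning rate, and the function names are all invented for illustration.

```python
# Toy data generated from y = 3x; hypothetical, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def loss(w):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    """Derivative of the loss with respect to w, via the chain rule."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0
learning_rate = 0.05
for _ in range(200):
    w -= learning_rate * grad(w)  # step against the slope of the loss

print(round(w, 4))  # 3.0
```

The only calculus here is the single derivative in `grad`, which is arguably first-semester material.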
Those two institutions are recommending more foundational (calculus) rather than applied courses (data science).
I knew a professor in the math department of a top European university who taught data science courses and swore that "data science" and "data mining" were just marketing terms invented to sell statistics.
UC CS undergrads had to take statistics for engineers and scientists.
UC CS undergrad majors in particular could end up within two courses of a math undergrad degree. Is it not the case that squishier applied courses are possible?
EE/CS undergrads had to take the entire upper-division physics track for scientists and engineers, including modern physics.
So has something changed since then and is something changing back?
> This is the [open] textbook for the Foundations of Data Science class at UC Berkeley: "Computational and Inferential Thinking: The Foundations of Data Science" http://inferentialthinking.com/ (JupyterBook w/ notebooks and MyST Markdown)
> [#1 Undergrad Data Science program, #2 ranked Graduate Statistics program]
> Data literacy is distinguished from statistical literacy since it involves understanding what data means, including the ability to read graphs and charts as well as draw conclusions from data.[6] Statistical literacy, on the other hand, refers to the "ability to read and interpret summary statistics in everyday media" such as graphs, tables, statements, surveys, and studies. [6]
Data Literacy and Statistical Literacy are essential for good leadership. For citizens to be capable of Evidence-Based Policy, we need Data Driven Journalism (DDJ) and curricular data science in the public high schools.
This is the Stanford guidance. Mathematics: four years of rigorous mathematics incorporating a solid grounding in fundamental skills (algebra, geometry, trigonometry). We also welcome additional mathematical preparation, including calculus and statistics.
This is the Harvard guidance. Update to math curricular guidance: There is no single academic path we expect all students to follow, but the strongest applicants take the most rigorous secondary school curricula available to them. We receive many questions specifically about what type of math courses students should take. Applicants to Harvard should excel in a challenging high school math sequence corresponding to their educational interests and aspirations. Rigorous and relevant data science, computer science, statistics, mathematical modeling, calculus, and other advanced math classes are given equal consideration in the application process.
It is possible to teach calculus without trig (just for polynomials) and I think it is very useful just at that level.
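As a rough illustration of how little machinery the polynomial-only version needs, here is a small Python sketch of the power rule applied to a list of coefficients. The function names and the ball-throwing example are hypothetical.

```python
def differentiate(coeffs):
    """Derivative of c0 + c1*x + c2*x**2 + ... given as [c0, c1, c2, ...],
    using the power rule d/dx x**n = n * x**(n - 1)."""
    return [n * c for n, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

# Hypothetical example: height of a thrown ball, h(t) = 2 + 20t - 5t^2.
height = [2, 20, -5]
velocity = differentiate(height)  # 20 - 10t

print(velocity)               # [20, -10]
print(evaluate(velocity, 2))  # 0, so the ball peaks at t = 2
```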
A whole lot of stuff in AP Stats is a relative dead end for many people, but geometry and geometric reasoning are necessary for all kinds of engineering-ish math.
Math is interesting in that the early foundation is so useful, but the usefulness drops off quickly, whereas I feel like other areas often become more useful as I learn more. Possibly that's because I haven’t spent 15 years on those topics like I have on math.
This is the most jarring thing I’ve read today. I can’t say I agree, but I haven’t spent 15 years studying math myself, so who am I to disagree.
For example, if you go into medicine and medical research, having a good understanding of statistics is useful, but very little in calculus or analysis is useful (and even if you do need calculus, most of the useful stuff for those fields is taught in the first semester of calculus).
The Mathematics for Machine Learning book[1] presents this as a top-down vs. bottom-up problem. While both approaches have pros and cons, a sweet spot may lie somewhere in the middle, and that requires embracing some inevitable backtracking (i.e. college curricula should not forget to add some courses that model the world using the math and thoughtfully explain why the underlying theory and math are actually useful in describing and/or predicting reality).
PS: I also think there is still a lot of focus on solving problems manually.
[1] https://mml-book.github.io/book/mml-book.pdf, page 13.
Statements like this are a big part of the reason statisticians never trust anyone who works in "data science". The whole field is basically applied statistics/calculus and you're saying none of that is useful.
Not if you want to leave open the possibility of majoring in engineering, physical and biological science, or economics.