Stanford, Harvard data science no more

(stanforddaily.com)

23 points | MauiWarrior | 3y ago | 54 comments

roddylindsay 3y ago

A high school "data science" course, if designed properly, will be far more useful to students and beneficial to society than calculus.

Every high school student should learn how to grapple with uncertainty, how to evaluate statistical claims and experiments, how to interpret graphs and charts, understand how machine learning models work (at a high level), and internalize concepts like "significance", "error bars", and "expected value."

This training will help all students every single day of their lives, because it teaches them how to think. Society benefits from having more people with the tools to evaluate data and deal with uncertainty, especially as we face a looming epistemological crisis.

Calculus, on the other hand, will be used by very few students, and even for those few, it will not likely be used every day. Yes, it is a prerequisite for some STEM courses as part of a degree program, and so calculus can be taught to undergraduates pursuing a STEM field in their first year (or those who take it as an elective in high school.)

It's a shame that Stanford and Harvard, which set the tone for high schools and high schoolers, are going the wrong direction here.

comte7092 3y ago

> Every high school student should learn how to grapple with uncertainty, how to evaluate statistical claims and experiments, how to interpret graphs and charts, understand how machine learning models work (at a high level), and internalize concepts like "significance", "error bars", and "expected value."

Pet peeve: can we just go back to calling these things statistics?

While I agree with you that statistics should be more heavily emphasized at the high school level, the issue goes much deeper within American math education than the one class.

roddylindsay 3y ago

I would assume that a data science class is mostly "good old statistics." But if "data science" is the phrase that gets education boards to put more student butts in seats in stats class, I'm all for it.

xtiansimon 3y ago

Wouldn’t a data science curriculum be more multi-disciplinary than a ‘statistics’ course?

Visualization, scripting, data collection, models, simulation. edX had a great course by Guttag and Grimson. Add to this Scott E. Page’s Model Thinking. Add edX’s Data, Analytics and Learning from UT Arlington. And some Tufte.

I say this because I work in the accounting field and brought scripting to my firm from my own self-study. It’s been a superpower for me, and solved several problems which my colleagues had tackled using Excel alone.

I’ve also studied statistics, but found it less generally useful.

comte7092 3y ago

>Wouldn’t a data science curriculum be more multi-disciplinary than a ‘statistics’ course?

I would say yes; however, the items listed in the comment I quoted fall squarely within the realm of statistics. I don’t have a problem with calling a curriculum of statistics + data manipulation tools “data science”, but that’s not what’s realistically being covered in these high school programs.

simplotek 3y ago

> A high school "data science" course, if designed properly, will be far more useful to students and beneficial to society than calculus.

How do you expect students to understand what they are doing with "data science" without learning probability and statistics, and how do you expect students to get probability and statistics without learning calculus?

I mean, Bayes' theorem. How do you get people to get it if they don't know calculus?

tzs 3y ago

I don't recall Bayes' theorem involving calculus. Are you sure you aren't thinking of some other theorem?

Bayes' theorem follows straightforwardly from P(A & B) = P(A|B) P(B) and P(A & B) = P(B & A). The latter tells us that we can swap A and B in the former without changing the value, giving us P(A|B) P(B) = P(B|A) P(A).

Rearranging gives P(A|B) = P(B|A) P(A) / P(B), which is Bayes' theorem.
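As a quick sanity check of the identity above, here is a minimal sketch that verifies it by plain counting on two fair dice. The events A and B are arbitrary choices made for illustration; nothing beyond fractions is needed.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes), found by counting."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """P(event | given), computed as P(event & given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

# Illustrative events, chosen arbitrarily for this sketch.
A = lambda o: o[0] + o[1] == 8   # the two dice sum to 8
B = lambda o: o[0] == 6          # the first die shows 6

lhs = cond_prob(A, B)                        # P(A|B)
rhs = cond_prob(B, A) * prob(A) / prob(B)    # P(B|A) P(A) / P(B)
print(lhs, rhs)  # both sides come out to 1/6
```

Both sides are exact fractions, so the equality holds exactly rather than up to rounding.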

mturmon 3y ago

You can sidestep calculus by just using the discrete setting rather than a continuous one.

If you want to introduce continuous distributions like the Gaussian one, you can just say "area under the curve" if you need to connect the density to a numerical probability. They don't have to know how to do the integral; in the case of a Gaussian, it's just tabulated anyway.

I'd argue that you could teach a perfectly reasonable high school stats class using this kind of approach.
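To make the "area under the curve" idea concrete, here is a rough sketch that adds up thin trapezoids under the standard Gaussian density and compares the result with the value a table would give (obtained here from `math.erf`). The helper names and the interval from -1 to 1 are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_curve(f, a, b, n=10_000):
    """Approximate the area under f between a and b using n thin trapezoids."""
    width = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * width)
    return total * width

# Probability that a standard normal value lands within one
# standard deviation of the mean.
approx = area_under_curve(normal_pdf, -1.0, 1.0)
tabulated = math.erf(1.0 / math.sqrt(2.0))
print(round(approx, 4), round(tabulated, 4))  # 0.6827 0.6827
```

The student only needs to accept that probability equals area; the summing is bookkeeping a computer or a printed table can do.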

A "calculus-free" method is mostly what is done for high school physics, with occasional nods in that direction to set the students up later. And like physics, the obvious connection of continuous probability to calculus will be a nice motivation later on.

One analogy is how we teach probability to sophisticated engineering undergraduates. I'm not aware of undergrad engineering curricula that use measure theory. This results in awkwardness around delta "functions" and probabilities of certain sets of measure zero (sets that cannot be integrated without the Lebesgue integral).

And sure, some of those undergrads don't ever take that measure theory class, so they escape to the wild without knowing the answers to awkward questions.

simplotek 3y ago

> If you want to introduce continuous distributions like the Gaussian one, you can just say "area under the curve" if you need to connect the density to a numerical probability.

What name do you give to this "area under the curve", or the "rate of change" of this area? They are pretty fundamental concepts with important and basic properties, which affect things like local optima and minimization, and expected value and covariance, etc. I mean, you can't cover linear models and least squares without this stuff, and if you don't then I wouldn't really call it learning.

comte7092 3y ago

High schools often teach physics without calculus as a prerequisite. It definitely makes it more challenging, but you can still communicate the concepts at a different level of detail.

simplotek 3y ago

> High schools often teach physics without calculus as a prerequisite.

Do they, though? For example, you simply cannot teach Newton's laws of motion without knowing what a derivative is.

roddylindsay 3y ago

You can definitely explain Bayes' theorem without calculus. I just asked ChatGPT to do it and it came up with a great example using a deck of cards and some fraction math.
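The ChatGPT example itself is not quoted in the thread, so the following is only a guess at what a deck-of-cards version might look like: a sketch asking "given that a card is a face card, what is the chance it is a king?" The question and all names below are assumptions made for illustration.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(r, s) for r in ranks for s in suits]  # 52 cards

def fraction_of(cards, predicate):
    """Share of the given cards that satisfy the predicate."""
    matching = [c for c in cards if predicate(c)]
    return Fraction(len(matching), len(cards))

is_king = lambda c: c[0] == "K"
is_face = lambda c: c[0] in {"J", "Q", "K"}

p_king = fraction_of(deck, is_king)   # 4/52
p_face = fraction_of(deck, is_face)   # 12/52
# Every king is a face card, so P(face | king) is 1.
p_face_given_king = fraction_of([c for c in deck if is_king(c)], is_face)

# Bayes' theorem: P(king | face) = P(face | king) P(king) / P(face)
p_king_given_face = p_face_given_king * p_king / p_face
# Direct count among the 12 face cards, for comparison.
direct = fraction_of([c for c in deck if is_face(c)], is_king)
print(p_king_given_face, direct)  # 1/3 1/3
```

Only counting and fraction arithmetic are involved, which is the point being made about the discrete case.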

hollandheese 3y ago

>understand how machine learning models work (at a high level)

>Calculus, on the other hand, will be used by very few students,

These two statements do not mesh. Understanding how machine learning models work requires Calculus.

acchow 3y ago

If the goal is to teach them basic statistics to be useful and not to do science with it, then just make them watch a few YouTube videos on the topic as part of their 9th grade math class?

CobaltFire 3y ago

Note that these are the recommendations for applying as an undergrad, not their classes or program offerings.

Those two institutions are recommending foundational courses (calculus) rather than applied courses (data science).

mlyle 3y ago

They continue to recommend statistics. They removed data science from a sentence that also included calculus and statistics.

simplotek 3y ago

> They continue to recommend statistics. They removed data science from a sentence that also included calculus and statistics.

I knew a professor in the math department of a top European university who taught data science courses and swore that data science and data mining were just marketing terms invented to sell statistics.

CobaltFire 3y ago

Thanks, reread that. I’ve corrected it but wanted to post to make sure your comment kept context.

Guybrush_T 3y ago

I think it makes sense for them to emphasize a strong understanding of the fundamentals. It will help those students who later want to go into data science as well.

theGnuMe 3y ago

It's rote learning, though, as ChatGPT proves by its scores on AP calculus exams. This is just a way to maintain elite admissions.

vineyardmike 3y ago

Just because ChatGPT can do it, doesn’t mean that it isn’t valuable for a human to learn. This is especially true for foundational courses.

drewcoo 3y ago

> Just because ChatGPT can do it, doesn’t mean that it isn’t valuable for a human to learn.

No, but it does sort of suggest that, doesn't it?

> This is especially true for foundational courses.

Sure, but calculus is about memorizing ways to answer problems. We're not talking about real analysis, the course in which students develop the calculus and prove it works.

theGnuMe 3y ago

It’s valuable if you wrap it up in an applied problem-solving course like physics. That’s the point of the new curriculum reform anyway. Other countries that exceed us in test scores do this.

So it really begs the question as to what is the point? The only thing I can think of is college admissions. A specific selection of rigorous memorization for elite admission.

tinglymintyfrsh 3y ago

Seems like a mischaracterization because I don't see a problem.

UC CS undergrads had to take statistics for engineers and scientists.

UC CS undergrad majors in particular could end up within 2 courses of a math undergrad degree. Is it not the case that squishier applied courses are possible?

EE/CS undergrads had to take the entire upper-division physics track for scientists and engineers, including modern physics.

So has something changed since then and is something changing back?

westurner 3y ago

Do Stanford or Harvard have an equivalent of UC BIDS (UC Berkeley Institute of Data Science)?

> This is the [open] textbook for the Foundations of Data Science class at UC Berkeley: "Computational and Inferential Thinking: The Foundations of Data Science" http://inferentialthinking.com/ (JupyterBook w/ notebooks and MyST Markdown)

https://data.berkeley.edu/ :

> [#1 Undergrad Data Science program, #2 ranked Graduate Statistics program]

westurner 3y ago

Data literacy: https://en.wikipedia.org/wiki/Data_literacy :

> Data literacy is distinguished from statistical literacy since it involves understanding what data means, including the ability to read graphs and charts as well as draw conclusions from data.[6] Statistical literacy, on the other hand, refers to the "ability to read and interpret summary statistics in everyday media" such as graphs, tables, statements, surveys, and studies. [6]

Data Literacy and Statistical Literacy are essential for good leadership. For citizens to be capable of Evidence-Based Policy, we need Data Driven Journalism (DDJ) and curricular data science in the public high schools.

https://news.ycombinator.com/item?id=20173228

AbrahamParangi 3y ago

Familiarity with pandas and familiarity with taking an integral aren’t even close to the same thing. I don’t think it ever made sense to group them together.

rawgabbit 3y ago

The article is poorly written, with a clickbait title.

This is the Stanford guidance:

> Mathematics: four years of rigorous mathematics incorporating a solid grounding in fundamental skills (algebra, geometry, trigonometry). We also welcome additional mathematical preparation, including calculus and statistics.

This is the Harvard guidance:

> Update to math curricular guidance: There is no single academic path we expect all students to follow, but the strongest applicants take the most rigorous secondary school curricula available to them. We receive many questions specifically about what type of math courses students should take. Applicants to Harvard should excel in a challenging high school math sequence corresponding to their educational interests and aspirations. Rigorous and relevant data science, computer science, statistics, mathematical modeling, calculus, and other advanced math classes are given equal consideration in the application process.

hw-guy 3y ago

It's amazing how many bad ideas, if you scroll down far enough, are justified by an appeal to "equity." Which usually translates into dumbing things down.

humanistbot 3y ago

Misleading editorialized title. The actual title is "Stanford, Harvard revise high school math curriculum recommendations, exclude data science"

dooglius 3y ago

Geometry isn't really any more "fundamental" than statistics (for concreteness, let's say the geometry covered by the SAT and statistics as covered by the AP Statistics exam). Maybe they are using it as a proxy for formal proofs? A proof-based course in probability would actually be a lot more fundamental than either geometry or statistics, I think.

fnordpiglet 3y ago

I think geometry and trig are pretty related, and have a lot of relevance in calculus, especially multivariate. To your point, geometry is also the first place formal proofs take shape. That said, I think geometry and trig could each be a quarter of a year long and be taught to the extent needed for almost any pursuit, with supplemental material at the point of need. In my education geometry and trig were two entire years, and it was so dull I lost all interest in math until I took calculus.

I agree a probability course would be useful, but I don’t think probability before calculus is an awesome idea. Statistics and probability can be taught at a rudimentary level without calculus, but insight really requires calculus and linear algebra. I found taking a non-calc stats and probability course made the calc version harder.

vaidhy 3y ago

While trig is used for starting on calculus now, in practice I never had to use trig for ML. Most of the work was in numerical differentiation and integration. I would think trig's usefulness is more in some hard sciences, while ML has much more horizontal applicability.

It is possible to teach calculus without trig (just for polynomials) and I think it is very useful just at that level.
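As a rough illustration of how far polynomial-only calculus goes, here is a minimal sketch of the power rule applied to a list of coefficients, used to locate the minimum of a parabola. The coefficient-list representation and the example polynomial are assumptions made for this sketch.

```python
def derivative(coeffs):
    """Differentiate c0 + c1*x + c2*x**2 + ... given as [c0, c1, c2, ...].

    Power rule: the term c_k * x**k becomes k * c_k * x**(k - 1).
    """
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Illustrative polynomial: p(x) = 3 - 4x + x^2, a parabola opening upward.
p = [3, -4, 1]
dp = derivative(p)  # p'(x) = -4 + 2x

# The slope is zero where -4 + 2x = 0, so the minimum sits at x = 2.
x_min = -dp[0] / dp[1]
print(dp, x_min, evaluate(p, x_min))  # [-4, 2] 2.0 -1.0
```

No trigonometric function appears anywhere, yet the example already covers slopes and optimization.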

fnordpiglet 3y ago

It seems hard to conceive of a world where e^ix isn’t important in ML, unless that ML is sans probability, neural networks, or really most anything useful. Perhaps for regressions, so long as they have no periodic component. I think you probably can skate by in a job mechanically, without any understanding of trig, but I don’t think you can understand much ML without it, and you certainly can’t reason about the limitations of an ML technique. While you might not directly use trig, I feel you must use things that were taught using trig to justify the technique and bound its applicability.

But really trig isn’t very complex a topic. I don’t think you should attempt to avoid teaching it. I just think it’s like a 1 month topic that is filled in as you learn calculus, linear algebra, and physics. The real intuition of trig comes from the use of it in other areas, and as a standalone subject it’s just boring.

mlyle 3y ago

I don’t like the whole layer cake of math that we do. Still, in a traditional geometry course you get a lot of pieces that help with trig and calculus, and an exposure to at least informal proofs.

A whole lot of stuff in AP Stats is a relatively dead end for many people, but geometry and geometric reasoning are necessary for all kinds of engineering-ish math.

kenjackson 3y ago

While I’ve led a data science team, I’ve never taken a data science course, so I’m not sure what it teaches. But I do feel pretty confident in saying that I think math does lose its usefulness around after trig. Not to say there aren’t useful aspects, but the curriculum is so inefficient. And maybe it’s because everyone needs some part of it, but that part is different for each person.

Math is interesting in that the early foundation is so useful, but the use drops off quickly, while I feel like other areas often become more useful as I learn more. Possibly that's because I haven’t spent 15 years on those topics like I have on math.

2devnull 3y ago

“math does lose its usefulness around after trig”

This is the most jarring thing I’ve read today. I can’t say I agree, but I haven’t spent 15 years studying math myself, so who am I to disagree.

kenjackson 3y ago

I think most people would agree if pressed. Math is so ridiculously useful prior to trig. Almost any white collar job relies on these maths. Trig is an interesting inflection point in that so much math gets built on top of it, although it's not that useful in and of itself. And then after trig, things become much more fragmented and you really need to go into specific subfields to determine which branch of math is of value.

For example, if you go into medicine and medical research having a good understanding of statistics is useful, but very little in calculus or analysis is useful (and even if you do need Calculus, most of the useful stuff for those fields is taught in the 1st semester of Calculus).

mlyle 3y ago

I think a lot of things use calculus concepts, even if calculus isn't explicitly invoked.

A whole lot of finance and pharmacology are about exponential functions and their derivatives and integrals, for instance. A whole lot of fields use optimization, even if "just asking the computer to do it", etc.
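To illustrate the exponential-function point, here is a small sketch assuming a simple first-order elimination model, where concentration decays as c0 * exp(-k t). The initial concentration and half-life are made-up numbers, not taken from any real drug. It checks numerically that the rate of change is always -k times the current concentration.

```python
import math

# Hypothetical values chosen only for illustration.
c0 = 100.0                   # initial concentration (arbitrary units)
half_life = 4.0              # hours
k = math.log(2) / half_life  # elimination rate constant

def concentration(t):
    """Concentration after t hours under first-order elimination."""
    return c0 * math.exp(-k * t)

def slope(f, t, h=1e-5):
    """Central-difference estimate of the rate of change of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

for t in (0.0, 4.0, 8.0):
    c = concentration(t)
    # The numerical slope and -k * c should agree closely.
    print(t, round(c, 2), round(slope(concentration, t), 3), round(-k * c, 3))
```

The same structure, with growth instead of decay, describes continuously compounded interest.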

I admit I am weaker now in calculus and linear algebra because I lean on CAS and simulation a lot... but at least I know how it works so that I have an idea of what I'm doing.

mrbungie 3y ago

I wouldn't say math loses usefulness after trig. Rather, due to how it is usually taught (my experience being in Chile), it rapidly becomes too abstract and decouples from its "real world" use cases, and then it is easy to miss the forest for the trees.

The Mathematics for Machine Learning book[1] frames this as a top-down vs bottom-up problem. While both approaches have pros and cons, a sweet spot may lie somewhere in the middle, and that requires embracing some inevitable backtracking (i.e. college curricula should not forget to add some courses on modelling the world using the math and thoroughly explaining why that underlying theory and math is actually useful in describing and/or predicting reality).

PS: I also think there is still a lot of focus on solving problems manually.

[1] https://mml-book.github.io/book/mml-book.pdf, page 13.

hollandheese 3y ago

>While I’ve led a data science team, I’ve never taken a data science course, so I’m not sure what it teaches. But I do feel pretty confident in saying that I think math does lose its usefulness around after trig.

Statements like this are a big part of the reason statisticians never trust anyone who works in "data science". The whole field is basically applied statistics/calculus and you're saying none of that is useful.

kenjackson 3y ago

Sorry, the statement I made wasn't intended to be connected that way. Data science uses a bunch of math beyond trig. I meant that in general math beyond trig becomes much less useful. I was talking about the general usefulness of different levels/types of taught math for white collar jobs/living. Not what is of use for data science.

hw-guy 3y ago

"math does lose its usefulness around after trig"

Not if you want to leave open the possibility of majoring in engineering, physical and biological science, or economics.

danielmarkbruce 3y ago

ChatGPT might argue linear algebra and calculus are useful.
