Scanning the topics here, what seems really interesting is how basic data structures are, naturally, containers for data, while these advanced data structures are more like "mix-ins" or effects: qualities that can be added to a container. Time travel most obviously, but also the various optimizing tree containers.
Not sure how to make this an interesting question, but it's interesting. How many effects can we add to a given structure? Can we make these additions automatically? Can we combine all of them, or are some incompatible with each other? What are the cost-benefit tradeoffs among them?
I find it interesting that a common story of people who did big things is that nobody would have predicted they'd end up doing the things they did.
Reminds me of my own reflections on things that I have done that I never would have expected I'd be doing. As well as this recent article by the creator of Entrepreneur magazine [0].
"literally nobody, myself included, would have once predicted that I’d be running a magazine called Entrepreneur"
So, I suggest you just read everything you can, getting the gist of things so that if you encounter something similar you know where to look.
I also suggest looking for little tricks you can apply again and again. For example, some of the data structures in that course use a trick where you break the structure into a top part, using one algorithm, and a bunch of leaf parts, each of size maybe O(log n), that are handled using another algorithm. Often this combination does better than either algorithm by itself.
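A minimal sketch of that two-level trick, in Python (all names are hypothetical; this is a toy illustration of the indirection idea, not one of the course's actual structures): keep a sorted list split into leaf buckets of roughly O(log n) elements, with a binary-searchable "top" index of bucket minimums.

```python
import bisect
import math

class TwoLevelList:
    """Sorted container split into a 'top' index plus small leaf buckets.

    The top level is handled with binary search; each leaf bucket
    (kept around O(log n) in size) is scanned linearly. A toy sketch
    of the two-level decomposition trick.
    """

    def __init__(self, items):
        items = sorted(items)
        size = max(1, int(math.log2(len(items) + 1)))  # leaf size ~ log n
        self.buckets = [items[i:i + size] for i in range(0, len(items), size)]
        self.tops = [b[0] for b in self.buckets]  # first key of each bucket

    def __contains__(self, key):
        # Binary search over bucket minimums, then linear scan inside one bucket.
        i = bisect.bisect_right(self.tops, key) - 1
        return i >= 0 and key in self.buckets[i]
```

The same split shows up in real structures: the top part gives the asymptotics, while the small leaves make updates cheap because only O(log n) elements ever need to be shuffled locally.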
[1] https://courses.csail.mit.edu/6.851/fall17/lectures/
[2] https://github.com/6851-2017
[3] https://courses.csail.mit.edu/6.851/fall17/psets/
I only kept looking because I didn't think a useless site would make it to the top of HN. People who find the course site another way might not have a reason to keep looking.
"Smart data structures and dumb code works a lot better than the other way around." -- Guy Steele
"The Representation Principle: Once a problem is described using an appropriate representation, the problem is almost solved." -- P. H. Winston
"Show me your [code] and conceal your [data structures], and I shall continue to be mystified. Show me your [data structures], and I won't usually need your [code]; it'll be obvious." -- Fred Brooks
"I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds
"Rule 5: Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming." -- Rob Pike
"What are tensors? The facts of the universe." -- Lillian Lieber
I found that the greatest barrier to tackling seemingly hard concepts was getting the right intuition about what "might" be happening under the hood. For me this is absolutely crucial, even if my intuition about a "thing" might not be completely correct.
As someone who didn't learn a lot of the basics in a classroom, I dive in head-first and read papers, I mean a lot of papers, to help build this mental model. Even though I don't understand everything in them, they let me pick up patterns for solving similar problems I might be having.
For things you don't understand, you can always fill in these gaps in your mental model as you dig deeper into the abstraction cake and repeat the whole "intuition"-building loop.
I find that the same also applies to math and other domains that build on a bunch of primitives that have to be really well known before you can reason at higher abstractions.
I feel software is no different... dig deep enough, there will be a hashmap of some sort.
My favorite so far is Skiena's "The Algorithm Design Manual". I especially love the computational geometry section.
Just genuinely curious, because I love reading technical books but I can never finish them cover to cover.
Even that is probably overkill for a lot of people, practically speaking.
On the other hand, if you're learning for fun, you should study the thing you're most excited about because you will do the most interesting things with knowledge and skills that you're passionate about.
Ultimately, in my personal experience, the latter course has also led to career advancement.
From a practical perspective... The best sign is if you're spending a lot of time trying to wrangle performance or scalability and most of the bottleneck is inside the standard data structures.
E.g. if you're spending too much time or memory doing dictionary lookups, then it may be worth looking into Bloom filters. But if dictionary lookups are a trivial component of your overall performance, then there are no real gains to be had from replacing them with a fancier data structure.
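For illustration, a minimal Bloom filter sketch in Python (parameters and names are made up; real implementations tune the bit-array size and hash count to the expected number of keys and acceptable false-positive rate):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: answers 'definitely not present' or
    'possibly present', using far less memory than a full dict/set."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive k independent bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means probably present.
        return all(self.bits[pos] for pos in self._positions(key))
```

The catch, of course, is that a Bloom filter can only tell you "definitely not here" with certainty; a positive answer still needs a real lookup to confirm, so the win comes from filtering out the misses cheaply.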
There's a follow-up point I'd like to make too. Skill with data structures is more about knowing the right questions to ask, rather than memorizing an encyclopedic list of a whole bunch of exotic structures. When you understand the major tradeoffs and design considerations underlying data structure design, then it's usually easy to find a solution in the literature even if you were completely unaware of it prior.
Effectively that means something like being able to translate between business requirements and academic language. "Oh, I need something that works on directed graphs, handles cycles, logarithmic in space, linear in average case time, resistant to adversaries, respects cache locality during lookup, and behaves deterministically."
The best way to learn this stuff is to read a bunch of famous papers on data structures. Again, the point isn't to memorize a zoo; it's to get comfortable with the well-developed framework that previous researchers have used. It's very unlikely that your business problem is truly unique at a theoretical level, so most likely someone before you has already worked through the tradeoffs and considerations that you are facing.
Excellent comment! I'd love to see a catalog of data structures that documented those characteristics; do you know of a good resource like that?
I mostly disagree. You won't need to implement them, but a working knowledge of the basics is the basis of any performance analysis and can make a huge difference in system performance in day to day programming. At some point all the small inefficiencies also add up.
I would go so far as to say that data structures and basic algorithms (e.g. Dijkstra) are more important to become a competent engineer than any of the practical tools he listed. It's like the difference between a mechanic and a mechanical engineer.
There is also a reason they are taught very early in a computer science degree.
In my experience knowing how to implement such things isn't very useful in daily practical programming. Knowing that they exist, what problems they solve, the pros and cons that is incredibly useful day to day. Especially if you can recognize when you've accidentally strayed into implementing one of them under the guise of solving a business problem. You can then apply known patterns or have the foresight to say "Yo this is a solved problem, let's find a library"
I have the same question, except I already have a partial answer (i.e. I'm less interested in chasing it; I sometimes do these things if I'm interested enough). I'm currently working my way through 6.006, but I wonder why I'd need to do advanced data structures. My guess: I don't, because it's more of a research area than anything else.
I would need to take a quick look to see if there are some interesting data structures solving some interesting problems.
Personally, I'd be interested in time travel and memory hierarchy (caching) as topics. I think they might have some immediate practical utility in those areas.
I studied a lot of the Intel 64 and IA-32 Architectures Software Developer's Manuals, particularly the first and third volumes, and have forgotten many of the details of what I read. I should have waited until I was ready to invest time in writing an OS or a first-stage bootloader or something. The broader material was useful, but I learned a lot of details that I haven't used and have forgotten. I'm sure it's still useful in the sense that I could quickly pick it back up if I ever needed to, but still. Same for all those things I read from the OSDev Wiki.
https://en.wikipedia.org/wiki/Erik_Demaine
EDIT: so is the coauthor! http://joebergeron.io/posts/post_four.html
https://ocw.mit.edu/courses/electrical-engineering-and-compu...
Since this is meant to be a graduate level research course, I wonder if any cutting edge topics have been added since? Were there any major advances or new ideas in fundamental data structures in the past few years?
And then it lists five requirements...do I get the MIT course description equivalent of a Knuth check now?
Nearest I might have got was trying to use a bloom filter in a vehicle tracker, but nope.
What learning about this stuff generally has given me is indirect: the ability to dig down and optimise because I understand systems. As I read up on e.g. sorting, I learned that quicksort is very much not always the quickest sort, and I've fixed bad performance bugs caused by its misuse.
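To make the quicksort point concrete, here's a toy sketch (hypothetical code, not from any particular library) showing the classic failure mode: a naive first-element pivot degrades to quadratic work on already-sorted input, while a random pivot stays near n log n.

```python
import random

def quicksort(xs, pivot_fn, counter):
    """Toy quicksort; counter[0] accumulates partition work (~comparisons)."""
    if len(xs) <= 1:
        return list(xs)
    counter[0] += len(xs)          # cost of one partition pass
    pivot = pivot_fn(xs)
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return (quicksort(less, pivot_fn, counter) + equal +
            quicksort(greater, pivot_fn, counter))

data = list(range(500))            # already sorted: worst case for naive pivot

naive = [0]
assert quicksort(data, lambda xs: xs[0], naive) == data

rand = [0]
assert quicksort(data, random.choice, rand) == data

# First-element pivot does ~n^2/2 partition work here; random pivot ~n log n.
```

This is exactly the bug that bites people who feed mostly-sorted data to a quicksort with a fixed pivot choice: same code, same input size, wildly different running time.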
Or understanding B-trees so I can get very good performance out of DBs, even if I've never written a B-tree and frankly never want to.
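As a small illustration of why that B-tree knowledge pays off, here's a sketch using Python's built-in sqlite3 (SQLite implements its indexes as B-trees; the table and column names are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}@example.com") for i in range(1000)])

query = "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?"

# Without an index, the plan's detail string reports a SCAN: a full table scan.
plan_before = conn.execute(query, ("user500@example.com",)).fetchall()

# An index turns the lookup into a B-tree search instead of a scan.
conn.execute("CREATE INDEX idx_email ON users(email)")
plan_after = conn.execute(query, ("user500@example.com",)).fetchall()
```

Inspecting the last column of each plan row shows the difference: `plan_before` mentions a scan of the table, while `plan_after` reports a search using the index, i.e. O(n) versus O(log n) per lookup.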
Stuff like that. Maybe one day I'll find the need for dynamic programming and learn it properly, until then it's just a waste of time.
But use them directly, no. Surely sux.
https://news.ycombinator.com/item?id=19718287
Looks like a great course though!