And so the perennial cycle continues: another Julia AD package emerges and ignores most previous work in order to claim novelty.
Without claiming a complete list: ReverseDiff.jl, ForwardDiff.jl, Zygote.jl, Enzyme.jl, Tangent.jl, Diffractor.jl, and many more whose names have disappeared in the short history of Julia...
Having an alternative to Python would benefit the ML ecosystem, which is too much of a monoculture right now. Julia has some really interesting statistics, probabilistic programming and physics-informed ML packages.
Nothing wrong with this. Very successful languages like Python and JavaScript have tons of reimplementations of the same things because those languages are _easy_. This isn't a loss for the language; it's a victory. If a lot of people can use it, that's good.
As a consequence there will be lots of garbage libs, like there are lots of garbage libs in Python and JavaScript.
(I also remember getting frustrated by frequent uninterruptible kernel hangs in Jupyter, but that might have been a skill issue on my part. It was definitely a friction I don't encounter with Python, though. When I was developing in Julia, I remember feeling anxiety/dread about hitting enter on new cells, double- and triple-checking my code lest I trigger an uninterruptible error, have to restart my kernel, and lose all my compilation progress, meaning another long wait to run code and generate new graphs.)
Tiny example (which blends Julia-the-language and Julia-the-ecosystem, for better and worse): I just timed reading the most recent CSV I generated in real life, a relatively small 14k rows x 19 columns. 10ms in Julia+CSV+DataFrames, 37ms in Python+Pandas, i.e. much faster in Julia, but also not a pain point either way.
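For anyone wanting to run the same kind of micro-benchmark, here's a minimal timing sketch on the Python side (illustrative only: the data is synthesized in memory with the stdlib csv module rather than Pandas, and the 10ms/37ms figures above come from the commenter's own file and machine):

```python
import csv
import io
import time

def time_csv_parse(text):
    """Parse CSV text and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = list(csv.reader(io.StringIO(text)))
    return rows, time.perf_counter() - start

# Synthesize a 14k-row x 19-column CSV, roughly matching the anecdote's size.
data = "\n".join(
    ",".join(str(i * 19 + j) for j in range(19)) for i in range(14_000)
)

rows, elapsed = time_csv_parse(data)
print(f"parsed {len(rows)} rows x {len(rows[0])} cols in {elapsed * 1000:.1f} ms")
```

Disk I/O, parser warm-up, and type inference all shift numbers like these, so single-run timings are only a rough signal either way.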
- Lack of multi-GPU support,
- some other weird bugs related to autograd which I never fully figured out,
- and the killer one: none of my coauthors used Julia, so I decided to just go with PyTorch.
PyTorch has been just fine, and it's nice to not have to reinvent the wheel for every new model architecture.
Also, it's now pretty easy to call Python from Julia (and vice versa) [1]. I haven't used it for deep learning, but I have been using it to implement my algorithms in Julia while making use of JAX-based libraries from Python, and it's certainly quite smooth and ergonomic.
sometimes that grad student is a brilliantly productive programmer + the libraries reach escape velocity and build a community, and then you get areas where Julia is state of the art, like differential equation solving or other areas of "classical" scientific computing.
in other cases the grad student is merely a very good programmer, and the libraries just sort of float along being "almost but not quite there" for a long time, maybe abandoned depending on the maintainer's career path.
the latter case is pretty common in the machine learning ecosystem. a lot of people get excited about using a fast language for ML, see that Julia can do what they want in a really cool way, and then run into some breaking problem or missing feature ("will be fixed eventually") after investing some time in a project.
The Julia repl is incredibly nice though, I do miss that.
Edit: I see, I think you mean exporting lowered StableHLO code in a shape-polymorphic format. From the docs: https://jax.readthedocs.io/en/latest/export/shape_poly.html
This is not the thing I usually think when someone says dynamic shape support.
In this model, you have to construct a static graph initially; then you're allowed to specify a restricted set of input shapes to be symbolic, to avoid the cost of lowering. But you'll still incur the cost of compilation for any new shapes the graph hasn't been specialized for (because those shapes affect the array memory layouts, which XLA needs to know in order to optimize aggressively).
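That cost profile can be sketched as a toy model in plain Python (nothing to do with real XLA internals; `run`, `compile_count`, and the cache are made-up names, purely to illustrate "pay the compile cost once per new concrete shape"):

```python
compile_count = 0
_cache = {}

def run(fn, shape, *args):
    """'Compile' fn for this input shape on first sight, then reuse it."""
    global compile_count
    key = (fn, shape)
    if key not in _cache:
        compile_count += 1   # specialization: compile cost paid per new shape
        _cache[key] = fn     # a real system would store shape-specialized code
    return _cache[key](*args)

def double(xs):
    return [2 * x for x in xs]

run(double, (3,), [1, 2, 3])   # new shape -> compile
run(double, (3,), [4, 5, 6])   # seen shape -> cache hit, no recompile
run(double, (5,), [1] * 5)     # new shape -> compile again
# compile_count is now 2
```

A symbolic dimension in the exported graph effectively widens the cache key to cover a whole family of shapes, which is why it avoids the re-lowering cost but not every downstream specialization cost.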
Ironically, it's very hard to write actual low-level parallel code (like CUDA) through Python; there's really no choice but to call out to Fortran and C libraries, as the likes of PyTorch do.
[0] https://en.wikipedia.org/wiki/Real_Programmers_Don't_Use_Pas...
I'm not really a fan of this convergence, but the old-school imperative CPU way of thinking about things is dead in this space.
Really? Why do you feel the need to say this? Not liking Python is fine, but this kind of comment is just stupid elitism. What's next, the only REAL programmers are the ones who make their own punch cards?
As someone who programs C/C++/Python/Rust/JS, you had me curious in the first half of the post. But that comment makes me wonder about the quality of the rest of what you're saying.