Then, when you committed a large binary in a format git understood, what git would really be committing in its place would be a directory tree: something like the "resource tree" you see if you open an MKV file, PNG file, etc. in an editor, but realized as actual files in directories. Git would generate it, then commit it.
On checkout, this process would happen in reverse: a matching git-smudge filter could notice a metadata file in each of these generated directories, and collapse the contents of the directory together to form a binary chunk; recursively, up the tree, until you hit the toplevel, and end up with the original large binary again.
Since most of the generated leaf-nodes from this process wouldn't change on each commit, this would eliminate most of the storage overhead of having many historical versions of large files in git. (In exchange for: 1. the potentially-huge CPU overhead of doing this "taking apart" of the file on every commit; 2. the added IOPS for temporarily creating the files to commit them; and 3. the loss of any file-level compression [though git itself compresses its packfiles, so that's a wash.])
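The clean/smudge wiring for this already exists in git. A clean filter transforms a file's content on the way into the object store, and the matching smudge filter reverses it on checkout. A minimal round-trip sketch, using base64 as a stand-in for the hypothetical explode/implode step (clean/smudge filters operate on a single content stream, so this shows the mechanics rather than the tree explosion itself; the filter name "treeify" is made up):

```shell
cd "$(mktemp -d)"
git init -q demo && cd demo
# Toy clean/smudge pair: "clean" runs when content enters the object
# store, "smudge" runs on checkout. base64 stands in for the
# hypothetical explode/implode of a binary into a directory tree.
git config filter.treeify.clean  'base64'
git config filter.treeify.smudge 'base64 -d'
echo '*.bin filter=treeify' > .gitattributes
printf 'hello' > blob.bin
git add .gitattributes blob.bin
# What git actually stored is the cleaned (transformed) content:
git cat-file -p :blob.bin   # prints aGVsbG8=, not the original bytes
```

The real version would need the clean step to emit many small blobs rather than one stream, which is where it departs from what stock filters can do.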
I'm almost inspired to try this out for a simple binary tree format like https://en.wikipedia.org/wiki/Interchange_File_Format. But ELF wouldn't be too hard, either! (You could even go well past the "logical tree" of ELF by splitting the text section into objects per symbol, and ensuring the object code for each symbol is stored in a PIC representation in git, even if it isn't in the binary.)
I remember reading about this technique years ago and thinking “cool, when this catches on, updates for all my software will be tiny”. But no, for some reason macOS updates are still gigabytes in size. I have no idea why.
https://github.com/elfshaker/elfshaker#applicability
In summary, for manyclangs, we compile with -ffunction-sections and -fdata-sections, and store the resulting object files. These are fairly robust to insertions and deletions, since the addresses are section relative, so the damage of any addresses changing is contained within the sections. A somewhat surprising thing is that this works well enough when building many revisions of clang/llvm -- as you go from commit to commit, many commits have bit identical object files, even though the build system often wants to rebuild them because some input has changed.
elfshaker packs use a heuristic of sorting all unique objects by size, before concatenating them and storing them with zstandard. This gives us an amortized cost-per-commit of something like 40kiB after compression with zstandard.
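That heuristic is simple enough to sketch in a few lines: sort the unique objects by size, concatenate them, and compress the whole stream once, so that similar-sized (and often similar) objects end up near each other in the compression window. A toy version, with gzip standing in for zstandard just to keep it dependency-free (elfshaker's actual packing is more involved):

```shell
cd "$(mktemp -d)"
mkdir objs
printf 'aaaa'   > objs/a.o
printf 'bb'     > objs/b.o
printf 'cccccc' > objs/c.o
# Sort unique objects by size, concatenate, compress once
find objs -type f -printf '%s %p\n' \
  | sort -n \
  | cut -d' ' -f2- \
  | xargs cat \
  | gzip > pack.gz
# Pack order is by size: b.o (2 bytes), then a.o (4), then c.o (6)
gzip -dc pack.gz   # prints bbaaaacccccc
```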
[0] (edit: despite the playful name suggesting otherwise -- when we chose the name we planned to do more with ELF files, but it turned out to be unnecessary for our use case)
Perforce is still used by game developers and other creatives because it handles large binaries quite well.
In fact, I'm not sure if they still do it, but one of the game engines (Unreal, I think) used to have a free tier that also included a free Perforce install.