All recordings were successfully compressed.
Original size (bytes): 146,800,526
Compressed size (bytes): 123,624
Compression ratio: 1187.47
The eval.sh script was downloaded, and the files were encoded and decoded without loss, as verified using diff.
What do you think? Is this true?
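For reference, a lossless round-trip check like the one described boils down to something like this sketch (the ./compress and ./decompress binaries and the filenames are placeholders, not the challenge's actual interface):

```python
import filecmp
import subprocess

# Hypothetical round trip: compress, decompress, then byte-compare.
subprocess.run(["./compress", "data.wav", "data.bin"], check=True)
subprocess.run(["./decompress", "data.bin", "data.out.wav"], check=True)

# shallow=False compares actual file contents, like diff does.
assert filecmp.cmp("data.wav", "data.out.wav", shallow=False), "round trip is lossy"
```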
https://www.linkedin.com/pulse/neuralink-compression-challen... context: https://www.youtube.com/watch?v=X5hsQ6zbKIo
Until this A/D linearity problem is fixed, there is no point in pursuing compression schemes. The data is so badly mangled that it's nearly impossible to find patterns.
As a trivial example, if your dataset is one trillion binary digits of pi, it is essentially incompressible by any regular compressor, but a generator that produces it fits in well under 1 kB.
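To make that concrete (the parent says binary digits; this streams decimal digits, but the point is identical): Gibbons' unbounded spigot algorithm generates pi from a few hundred bytes of source.

```python
from itertools import islice

def pi_digits():
    # Gibbons' unbounded spigot algorithm: yields decimal digits of pi
    # one at a time, using only integer arithmetic.
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n  # next digit is certain; emit it and shift the state
            q, r, n = 10 * q, 10 * (r - n * t), 10 * (3 * q + r) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

print(list(islice(pi_digits(), 8)))  # [3, 1, 4, 1, 5, 9, 2, 6]
```

A general-purpose compressor sees that digit stream as noise, yet the program above regenerates any prefix of it exactly.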
Why didn't every other company think of this?
Yup:
"Submit with source code and build script."
But hey, the reward is a job. Maybe.
I mean, not everyone can be privileged enough to experience Ultra Hardcore™ toxic work culture.
The sample data compresses poorly. Very simple first-order difference encoding plus a decent Huffman coder easily gets it down to about 4.5 bits per sample, but that's only ~2.2x from the original 10 bits.
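A minimal sketch of how you'd sanity-check that figure, assuming the samples are already loaded into a NumPy integer array (file parsing omitted). The empirical entropy of the first differences is within 1 bit/sample of what a Huffman code over the delta alphabet can achieve:

```python
import numpy as np

def delta_entropy_bits(samples: np.ndarray) -> float:
    # First-order difference: represent each sample as (x[i] - x[i-1]).
    deltas = np.diff(samples.astype(np.int64))
    # Empirical (order-0) entropy of the deltas, in bits per sample --
    # a lower bound for any symbol-by-symbol coder such as Huffman.
    _, counts = np.unique(deltas, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```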
However, let's assume there is massive cross-correlation between the 1024 channels. In the extreme, they are all the same, meaning if we encode 1 channel we get the other 1023 for free. That puts a lower limit of 4.5/1024 ≈ 0.0044 bits per sample, or a compression ratio of about 2275. Voilà!
If data patterns exist and can be found, then more sophisticated coding algorithms could achieve better compression, or tolerate more variation (i.e., less cross-correlation) between channels.
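As a sketch of what exploiting that redundancy could look like -- assuming the channels arrive as an aligned (channels × samples) integer array; the function names here are made up:

```python
import numpy as np

def to_residuals(channels: np.ndarray) -> np.ndarray:
    # Keep channel 0 verbatim; store every other channel as its
    # difference from channel 0. Perfectly invertible, so still lossless.
    # If the channels are near-copies, the residuals are near-zero and
    # entropy-code far below the 4.5 bits/sample of a lone channel.
    out = channels.astype(np.int64).copy()
    out[1:] -= out[0]
    return out

def from_residuals(res: np.ndarray) -> np.ndarray:
    # Exact inverse: add the reference channel back.
    out = res.copy()
    out[1:] += out[0]
    return out
```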
We may never know unless Neuralink releases a full data set, i.e. 1024 channels at 20 kHz and 10 bits for 1 hour. That's roughly 92 GB, but if they want serious analysis they should release serious data.
Finally, there is no apparent reason to require lossless compression. The end result -- correct data to control the cursor and so on -- is what matters. Neuralink should let challengers submit DATA to a test engine that compares cursor output for the original data against cursor output for the submitted data, reports a match score, and maybe a graph or something. That sort of feedback might allow participants to create a satisfactory lossy compression scheme.
It's 2275X
That's the compression ratio for complete cross-correlation. It's (10 bits uncompressed / 4.5 bits compressed on 1 channel) × 1024 channels.
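Spelled out:

```latex
\frac{10\ \text{bits/sample}}{4.5\ \text{bits} / 1024\ \text{channels}}
  = \frac{10 \times 1024}{4.5} \approx 2275
```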
I'm all for challenges, but it is fairly standard to have prizes.
Here’s why: https://x.com/raffi_hotter/status/1795910298936705098
https://x.com/JohnSmi48253239/status/1794328213923188949?t=_...
Does that mean the radio uses a portion of this 10 mW? If so, how much?