This happens at the expense of detail in low-contrast areas, producing a plastic-like appearance of human skin and hair, and making low-contrast text unintelligible, which is why it's generally not done by default.
I'm sure you know exactly how much of which filter to apply for similar results. Laymen like ourselves will need a lot more trial and error. Their contribution here is to provide a push-button, automated mechanism.
I would have probably also tried something simple and given up due to the noise. So this is definitely interesting.
What you are describing is usually called automatic tone mapping. This is basically noise reduction and possibly color normalization applied when brightening a dark image. Showing their black image as the starting point is silly, because JPEG will make a mess of the remaining information. What they should show is the raw image brightened by a straight multiplier, to show the noisy version you would get from trying to increase brightness in a trivial way.
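That trivial baseline is a one-liner; a minimal numpy sketch (not anyone's actual pipeline, and `gain=100.0` is an arbitrary illustrative value):

```python
import numpy as np

def naive_brighten(raw, gain=100.0):
    """Trivial brightening: multiply (black-level-subtracted) raw values
    in [0, 1] by a constant gain, clipping at the top. This amplifies
    sensor noise along with the signal, which is exactly the noisy
    baseline a fair comparison should show."""
    return np.clip(raw * gain, 0.0, 1.0)

# A pixel at 0.1% of full scale becomes 10%; anything above 1% clips.
out = naive_brighten(np.array([0.001, 0.02]), gain=100.0)
```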
Huh? At 1:40 in the video that's exactly what they do.
[0]: https://raw.githubusercontent.com/cchen156/Learning-to-See-i...
For those curious, our current approach differs in some very significant ways from the author's implementation, such as performing our denoising and enhancement on a raw Bayer -> raw Bayer basis, with a separate pipeline for tone mapping, white balance, and HDR enhancement. We also explored a fair number of different architectures for the CNN and concluded that a heavily mixed multi-resolution layering approach produces superior results.
As other commenters have pointed out, the most interesting part is really coming to terms with the fact that, as war1025 put it, "The message has an entropy limit, but the message isn't the whole dataset." It is incredible what can be accomplished with even extraordinarily noisy information, as long as one has an extremely "knowledge packed" prior.
If anyone has any questions about our research in this space, please feel free to ask.
Often flash is not the look people are going for, but they would be okay with the flash firing in order to improve the non-flash photo.
As a proof of concept that this task can be tackled directly, a quick search brought up "DeepFlash: Turning a Flash Selfie into a Studio Portrait"[0]
Beyond denoising, we are already running experiments with very promising results on haze, lens flare, and reflection removal; super resolution; region adaptive white balancing; single exposure HDR; and a fair bit more.
One of the other cooler things we are doing is putting together a unified SDK where our algorithms and neural nets will be able to run pretty much anywhere, on any hardware, using transparent backend switching (e.g. CPU, GPU, TPU, NPU, DSP, other accelerator ASICs, etc.).
What would happen if you
- begin capturing video (unsure of fps) on a phone-quality sensor in a near-dark environment
- pulse the phone's flash LED(s) like you're taking a photo
- do super-resolution on the resulting video to extract a photo...
- ...while factoring in the decay in brightness/saturation in consecutive video frames produced by the flash pulse?
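One way to picture that last step: normalize each frame by its own estimated brightness before merging, so frames captured during the flash decay still line up. A hypothetical numpy sketch (assumes frames are already spatially aligned; real super-resolution would also need sub-pixel registration, which this deliberately skips):

```python
import numpy as np

def decay_compensated_merge(frames):
    """Normalize each video frame by its mean luminance (a crude stand-in
    for the flash-decay curve), then average the normalized frames. The
    averaging is a simple multi-frame denoise, not true super-resolution."""
    frames = np.asarray(frames, dtype=np.float64)
    gains = frames.mean(axis=(1, 2), keepdims=True)   # per-frame brightness
    normalized = frames / np.maximum(gains, 1e-6)     # undo the decay
    return normalized.mean(axis=0)                    # merge

# Three frames of the same scene at full, half, and quarter flash output:
base = np.arange(12, dtype=np.float64).reshape(3, 4) + 1.0
merged = decay_compensated_merge([base, base * 0.5, base * 0.25])
```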
I vaguely recall reading somewhere that oversaturated photos have more signal in them and are easier to fix than undersaturated. Hmm.
IIRC super-resolution worked with 30fps source video for better quality; I wonder if 60fps or 120fps source video would produce better brightness decay data, or whether super-resolution could actually help extract more signal out of the decay sequence too.
On the other hand, I'm not sure if super-resolution fundamentally requires largely consistent brightness in order to work as well as it does. :/
Perhaps individual networks could be trained/tuned to specific slices/windows of the brightness gradient. I also wonder if it would be useful to factor the super-resolution process into each of the brightness-specific stages or just to do it at the end.
We also have some more raw data[2] where the original Bayer data is available as .npy files with 40 dB of analog gain applied; however, I think the calibration targets show off what we are able to accomplish more dramatically. Finally, we have a short YouTube video[3] that shows how it works when applied to video.
[0] https://www.dropbox.com/s/0bm4dpxhn35vkhe/ALLIS_Investor_Int...
[1] https://www.dropbox.com/sh/k861saentyq1cs6/AADmO7X_L49nUkEI_...
[2] https://www.dropbox.com/sh/fv8omdf4fbx59m9/AABDnf6sdvv7rtIml...
[1] - https://github.com/cchen156/Learning-to-See-in-the-Dark/blob...
If you want to know what the next hot thing in software engineering will be, just pay attention to whatever Jeff Dean is doing.
IMHO, credit for the rapid spread of deep learning should always go to Alex Krizhevsky. He showed us it was possible. Even without TensorFlow and PyTorch, we would be fine with Caffe, Torch, MXNet, or Julia.
"Generally, you just need to subtract the right black level and pack the data in the same way as the Sony/Fuji data. If using rawpy, you need to read the black level instead of using 512 as in the provided code. The data range may also differ if it is not 14 bits. You need to normalize it to [0,1] for the network input."
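That recipe is short in numpy. A sketch (not the repo's code; the attribute names `raw_image_visible`, `black_level_per_channel`, and `white_level` are from rawpy's documented API, and 512 / 16383 are the 14-bit Sony values the paper's code hard-codes):

```python
import numpy as np

def normalize_raw(bayer, black_level, white_level):
    """Subtract the sensor black level and scale raw Bayer data to [0, 1].
    With rawpy: bayer = raw.raw_image_visible, black_level from
    raw.black_level_per_channel, white_level = raw.white_level.
    For 14-bit Sony data: black_level=512, white_level=2**14 - 1 = 16383."""
    data = bayer.astype(np.float32) - black_level
    data = np.maximum(data, 0)                   # clip negative noise
    return data / (white_level - black_level)    # -> [0, 1]

# Synthetic 14-bit values standing in for real sensor data:
fake_bayer = np.array([[512, 16383], [8447, 300]], dtype=np.uint16)
out = normalize_raw(fake_bayer, black_level=512, white_level=16383)
```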
The Sony and Fuji training code looks mostly the same; they haven't bothered to pull the common code out and reuse it.
But, many DNN concepts (and ML concepts themselves) can be described with a few lines of pseudocode. CNNs, RNNs, etc. can all be described in a few lines.
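As a concrete illustration of the "few lines" claim, here is one CNN layer (valid cross-correlation plus bias and ReLU) in plain numpy; a teaching sketch, not how any framework actually implements it:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D cross-correlation (what DL libraries call convolution)."""
    h, w = k.shape
    out = np.zeros((x.shape[0] - h + 1, x.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

def cnn_layer(x, k, b):
    """One convolutional layer: convolve, add bias, apply ReLU."""
    return np.maximum(conv2d(x, k) + b, 0)

y = cnn_layer(np.ones((3, 3)), np.ones((2, 2)), 0.0)
```

A full network is just these layers stacked, with the kernels learned by gradient descent; the hard part is the training, not the description.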
It's really quite amazing; most of the work goes into first creating the network from theory, then training and tuning it until you get good results.
Machine learning is really machine-enhanced educated-guesswork, which has its place but also has its limits.
Being able to read the title on the books in the example photo is great; you could rely on the title for evidentiary purposes, the smaller text probably not so much. So for a security camera it would do poorly at identifying the color of a car, but might well be sufficient to read the license plate.
You show a CNN-enhanced low-light image of a car in a courtroom and there it is, literally 'clear as day' - the jury will find it pretty compelling. But maybe the data really wasn't there in the original image, and the CNN just filled in some blanks based on previous images of license plates, letters, and just random noise it had seen in the past.
The worry is when these kinds of algorithms get built in to basic image capture processes, so you never even see the raw data, only data that has already been filtered through the inbuilt prejudices of the CNN enhancement suite.
The camera never lies, but now it doesn't have to, because it can convince itself it saw something that wasn't really there...
There is an entropy limit to the message, but the message isn't actually the only data.
One thing humans are great at is integrating existing knowledge into a messy situation and intuiting more than is available just from the raw message.
I.e. The message has an entropy limit, but the message isn't the whole dataset.
It's not trying to make things readable; it's trying to make things look like there was more light in the room when they were shot. In rooms with high lighting, some objects have glare. That's "correct"—it's what would appear in the training data.
Wait, did it? Isn't the middle photo being shown for comparison only, rather than as an input?
People bring this up all the time as hot takes in these areas. It's conditional inference. It's no more disingenuous than linear regression.
This seems to replicate the post-processing we do in our brain (which is also a giant neural network). I wonder if the process is similar?
“The human eye can detect a luminance range of 10¹⁴, or one hundred trillion (100,000,000,000,000) (about 46.5 f-stops), from 10⁻⁶ cd/m², or one millionth (0.000001) of a candela per square meter, to 10⁸ cd/m², or one hundred million (100,000,000) candelas per square meter. This range does not include looking at the midday sun (10⁹ cd/m²)[21] or lightning discharge.”
“The pretrained model probably not work for data from another camera sensor. We do not have support for other camera data. It also does not work for images after camera ISP, i.e., the JPG or PNG data.”
It would be cool to see how they come up with better models that would allow them to overcome the above limitations.
https://github.com/cchen156/Learning-to-See-in-the-Dark/blob...
If you take the dark image (a) from that and balance its color, the information that is present in it simply cannot contain the text from the book covers and so on. In fact, it's full of JPEG artifacts despite the image being a PNG. It would be useful if they presented a histogram equalized image of (a).
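For anyone who wants to check that claim themselves, histogram equalization is easy to run on image (a). A self-contained numpy sketch of the standard CDF-based method (for an 8-bit grayscale array; OpenCV's `cv2.equalizeHist` does the same thing):

```python
import numpy as np

def equalize_histogram(img):
    """Histogram-equalize an 8-bit grayscale image: build the intensity
    histogram, normalize its cumulative sum to [0, 1], and use that CDF
    as a lookup table stretching the used range across 0..255."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[img]

# A very dark image using only levels 0..3 gets spread over the full range:
dark = np.array([[0, 1], [2, 3]], dtype=np.uint8)
eq = equalize_histogram(dark)
```

On a near-black photo this makes the surviving detail (and the noise, and any compression artifacts) immediately visible.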
http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...?
So if you're planning to do crime, make choices where the evidence relies on spectra rather than geometry. Steal Rothkos rather than Mondrians; baggy coveralls are in, form-fitting ninja wear is out.
- Did they create a special network topology for this problem?
- Does the network need to see the entire image, or only an NxN subblock at a time?
- How did they obtain the training data? Is it possible to take daylight images and automatically turn them into nighttime images somehow?
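On that last question: a plausible way to fake nighttime data would be to scale the exposure down and add a simple Poisson-Gaussian sensor noise model. A hypothetical sketch (the paper itself instead captures real short/long-exposure pairs; `exposure` and `read_noise` here are made-up illustrative parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_low_light(img, exposure=0.01, read_noise=0.02):
    """Simulate a dark capture from a daylight image in [0, 1]:
    scale the signal down (fewer photons), then add photon shot noise
    (Poisson) and read noise (Gaussian)."""
    dark = img * exposure
    shot = rng.poisson(dark * 1000) / 1000      # shot noise at ~1000 e- full well
    noisy = shot + rng.normal(0.0, read_noise, img.shape)
    return np.clip(noisy, 0.0, 1.0)

daylight = rng.random((8, 8))
night = synthesize_low_light(daylight)
```

Whether a network trained on such synthetic noise transfers to a real sensor is exactly the kind of thing the paper's real captured pairs sidestep.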
I didn't even think this was possible. Have people ever done this manually before? Like without AI?
X27 is also using some kind of neural algorithm to denoise and get the maximum out of the CIS (CMOS image sensor).
>Still images: ISO 100-102400 (expandable to ISO 50-409600),
[1]:https://www.sony.co.uk/electronics/interchangeable-lens-came...
So in this instance they're processing lossily on top of an image already processed lossily in-camera.
The other option is spelling it out.
Most people will read CNN as the news channel. Even those familiar with neural networks.
See also: https://en.m.wikipedia.org/wiki/Thiotimoline
The major peculiarity of the chemical is its "endochronicity": it starts dissolving before it makes contact with water.
But still, could we make an effort not to devolve into what has happened on Reddit, i.e. comment sections which mainly consist of puns and other low effort jokes?
It would make sense to add TensorFlow to make it more specific.
And it is even more "HN" to comment on details of the title or the article because you don't really know what to say about the article.
Look at my comment.
https://www.thedrive.com/the-war-zone/25803/this-is-what-col...