tachyonbeam 6y ago

It's pretty common for neural networks to combine recurrent nets that process text input with convolutional layers. A classic example is visual question answering (is there a duck in this picture?), where the recurrent part loops over the question one token at a time while the convolutional part processes the image. That is a simple case of looping over one part of the model. Ideally you want that looping to happen as locally as possible, so you don't waste time having a program on a CPU dispatch work, wait for results and control data flow.
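To make the "looping over one part of the model" point concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration (the scalar recurrent cell, the 1-D convolution standing in for the image branch, and the product used to fuse the two); it is not how any real VQA model or Cerebras software works. The only thing it is meant to show is that the recurrent branch has a loop whose trip count depends on the input, while the convolutional branch does not.

```python
import math

def rnn_encode(tokens, w_in=0.5, w_rec=0.9):
    """Toy scalar recurrent cell: the loop body runs once per input token."""
    h = 0.0
    for t in tokens:  # trip count depends on the input length
        h = math.tanh(w_in * t + w_rec * h)
    return h

def conv1d(signal, kernel):
    """Toy 'valid' 1-D convolution (strictly, cross-correlation) for the image branch."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]

def answer_score(tokens, image_row, kernel):
    """Fuse both branches into one score (hypothetical fusion by product)."""
    question = rnn_encode(tokens)         # looped part of the model
    features = conv1d(image_row, kernel)  # fixed-shape part of the model
    return question * max(features)
```

If the `for` loop in `rnn_encode` has to be driven step by step from a host CPU, every token costs a dispatch and a wait, which is the overhead described above.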

Having talked to someone at Cerebras, I also know that they don't just want to do inference with this; they want to accelerate training as well. That can involve much more complex control flow than you might think. Start reading about automatic differentiation and you will soon realize that it's complex enough to basically be its own subfield of compiler design. There have been multiple entire books written on the topic, and I can guarantee you there can be control-flow-driven optimizations in there (e.g. if x == 0 then don't compute this large subgraph).
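As a rough illustration of the "if x == 0 then don't compute this large subgraph" idea, here is a hand-written sketch in plain Python. The function f(x, w) = x * g(w), the `expensive_subgraph` stand-in for g, and the call counter are all hypothetical, and the sketch assumes only the gradient with respect to w (the trainable parameter) is needed. A real autodiff system would derive this kind of branch itself rather than have it written by hand.

```python
import math

calls = {"expensive": 0}  # counts how often the large subgraph is evaluated

def expensive_subgraph(w):
    """Stand-in for a large subgraph g: returns (g(w), dg/dw)."""
    calls["expensive"] += 1
    value = sum(math.sin(w * k) for k in range(1, 1001))
    grad = sum(k * math.cos(w * k) for k in range(1, 1001))
    return value, grad

def gated_value_and_grad(x, w):
    """Return (f, df/dw) for f(x, w) = x * g(w), skipping g when x == 0."""
    if x == 0:
        # f = 0 and df/dw = x * dg/dw = 0, so g is never evaluated.
        # Assumption: df/dx (which equals g(w)) is not needed here.
        return 0.0, 0.0
    g, dg = expensive_subgraph(w)
    return x * g, x * dg
```

Calling `gated_value_and_grad(0, 0.3)` leaves `calls["expensive"]` untouched, while any nonzero `x` pays for the full subgraph in both the forward value and the gradient. The branch is data-dependent, so whatever runs the training step has to support it.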

Veedrac 6y ago

I would be surprised if Cerebras was trying to handle any recurrence inside the overall forward/backward passes. It seems like a lot of difficulty (as mentioned) for peanuts.

I don't get your point about training. Yes, it's backwards rather than forwards, and yes, it often has fancy stuff intermixed (dropout, Adam, ...), but these are CPUs; they can do that as long as it fits the memory model.
