For personal computers - desktops and laptops - I don't think we have a shortage of processor cycles. The minimal specs of the Raspberry Pi make it usable: 256 MB of RAM, a 700 MHz CPU, a few GB of storage, and enough network throughput to saturate a home broadband connection. What is compelling about the best contemporary personal computing devices is form factor: how easy it is to provide input; how nice the screen is; and, for a mobile device, how heavy it is and whether the battery lasts long enough.
Does a personal parallel computer really help me? At first blush, I am having a hard time seeing how. Clearly, there are CPU-intensive workloads that people have mentioned in this discussion - ray tracing is one. The video mentions robotics and algorithms. I have mixed feelings about that, since I personally believe the future of robotics lies in computation off the physical robot itself - aka cloud robotics. A use case I personally would find beneficial is the ability to run dozens of VMs on the same machine. Heck ... each of my 50 open browser tabs could run inside a separate VM. I know lightweight container technology has been around for a while - e.g. jails, LXC. But what about hypervisor-based virtualization - e.g. VMware, Xen? While the parallelism offered by this tech would be awesome, what seems to be missing is the ability to address lots and lots of memory.
Back in '06, I remember seeing fear in the eyes of some hardware and software engineers. Within the next year, we were supposed to have 100 cores in our plain old desktops. How the heck were we going to program them? I found the situation a bit irrational. Every talk started with the death of Moore's Law because we couldn't shrink dies any further. More cores were posited as the only solution - except no one could code for them in general-purpose apps like Word, Excel, etc. In retrospect, I wonder why I don't have 100 cores in my desktop in 2012. I suspect it's because they aren't useful for the average Joe.
P.S. Forgive my directionless rambling. I don't have a particularly strong opinion on this subject anymore.
That plus cheap access to a massively parallel computer could also be very interesting.
Except that where a Raspberry Pi + online storage could be useful to many, many people, massive parallelism is probably only interesting to folks like us.
I suspect the problem is that it has no compelling (and immediate) "use case". If they could communicate a set of application ideas then I suspect that a whole new raft of supporters will be happy to risk at least $99.
Also, the $3 million stretch goal is just waaaaay too far off; it's too bad the better design is reserved for that level.
Hoping their funding drive succeeds. I like the fact that the ISA is being fully documented and that we will have a fully open-source toolchain to work with the system.
(Disclaimer: Not associated with Adapteva in any way).
But on purely geek terms this thing seems to warrant a "holy shit":
http://www.adapteva.com/products/silicon-devices/e64g401/
Again, I don't know how (un)common that sort of thing is, but I wasn't expecting to see 64 cores in that tiny a form factor. Does anyone here know how cutting-edge this thing is, if at all?
[Edit]
Also does anyone here want to address use cases for this thing?
Having looked at the data a bit more: I like their specs concerning system balance. 100 GFLOPS over 6.4 GB/s gives a system balance of 15.625 FLOPS per byte of memory bandwidth - about the same balance as a Westmere Xeon, which is pretty good for real-world algorithms.
For comparison: NVIDIA's Fermi has a system balance of about 20, meaning Fermi becomes bounded by memory bandwidth sooner - and memory bandwidth is very often the limiting factor in real-world computations.
One thing, though: high-performance computing is all about software and tooling support. If this company comes out with OpenCL support in C (or, even better, Fortran 90+), then we're talking.
Edit: By similar 'range' I meant the cores-per-mm^2 ratio.
For example, one particular embedded 40nm GPU design that I know about can deliver about 25 GFlops or so in the same die area.
edit: No OpenMP support.
Tilera did a very similar-looking 64 cores on a chip in 2007, which is the oldest instance I know of off the top of my head. Their devices cost (or at least used to cost) a few grand, though. Tilera has since bumped it up to around 100 cores per chip. I don't know anything about either architecture, so it is hard to say how 64 1 GHz Adapteva cores compare with 64 1.5 GHz Tilera cores.
So: not quite cutting edge, just an under-explored side channel.
Dedicated machines to host backend applications -- SQL servers, Apache, nginx, etc.
http://www.anandtech.com/show/2918/2
That first picture shows 4 clusters made of 4 sub-cores with 32 processing elements each. Now, NVIDIA would claim each of those 32 processing elements is a core, but those "cores" cannot act independently. So it is more like a very wide, very hyper-threaded 16-core processor.
It is not really a performance designation. It doesn't define a certain architecture or design.
It is pretty clearly an economic designation.
In general: money. Buying more of the most performant equipment available.
So. It's an economic designation.
(Though I see the more informative "50 GFLOPS/Watt" below... and I like the prospect of something that would make it cheap to play with large-scale real-time neural nets.)
The fact that the cores don't run in lockstep could be shader heaven! I'm imagining using the cores in a pipeline with zoning, so one core 'owns' a tile of the screen and does z-buffering, another core does clipping of graphics primitives for each tile, and a sea of compute cores between them chews up work and pushes it onwards.
You could also use the cores as a kind of spatial index, passing rays to other cores as they propagate beyond the AABB belonging to each core.
Doubtless it wouldn't work like that, and wouldn't work well. But it's fun to think about! :)
I can see this platform being a good tool for students and researchers to experiment with algorithmic speedups by parallelizing their sequential code.
In my parallel programming class, our teacher had to rig together a computer lab, connecting 12 quad-core computers to simulate a 64-core cluster. Then again, a 64-core cluster of Parallellas would cost something like $7,000. You could get the same 64-core setup by buying eight 8-core consumer desktops for under $3,000, which would still be more cost-effective and probably have ten times more computing power because of the x86 architecture.
It is a more powerful expression of the benefit of scaling with parallelism: instead of scaling speed with respect to a fixed data size, you scale the data size with respect to a fixed speed.
Having more cores means you (sometimes) can have more data. You still need those parallel programmers with their parallel algorithms though :-)
And yes I get that it's open source blah blah blah, but this project is certainly part of the plan for an institutionally-funded business to make money. Adapteva is a .com, not a .org.
Separately: if Adapteva is only 8 months from delivering completed product to users, shouldn't they be able to raise more funds through traditional channels? They clearly have/had VC buy-in and can raise through institutional channels. If they are just finishing the final debugging/SDKs/etc. at this point, it's not a good sign that they can't raise another $750k from their existing backers to cover final launch costs.
I don't have a horse in this race, but it doesn't feel quite right to me.
http://www.youtube.com/user/GreenArraysInc?feature=CAQQwRs%3...
I would totally agree that memory constraints are sort of inherent to manycore architectures, but in this case I find them pushed to the limit.
If the Kickstarter falls through, what options could you still make available to hobbyists? Is there some version of your current prototype setup that you could sell, even if it's not one convenient board?
And if that is so, would expecting an Erlang compiler be out of the question? :)