For personal computers - desktops and laptops - I don't think we have a shortage of processor cycles. The minimal specs of the Raspberry Pi make it usable: 256 MB of RAM, a 700 MHz CPU, a few GB of storage, and enough network throughput to saturate a home broadband connection. What is compelling about the best contemporary personal computing devices is form factor: how easy it is to provide input; how nice the screen is; and, for a mobile device, how heavy it is and whether the battery lasts long enough.
Does a personal parallel computer really help me? At first blush, I am having a hard time seeing how. Clearly, there are CPU-intensive workloads that people have mentioned in this discussion - ray tracing is one. The video mentions robotics and algorithms. I have mixed feelings about that, since I personally believe the future of robotics lies in computation off the physical robot itself - aka cloud robotics. A use case I personally would find beneficial is the ability to run dozens of VMs on the same machine. Heck ... each of my 50 open browser tabs could run inside a separate VM. I know lightweight container technology has been around for a while - e.g. jails, LXC. But what about hypervisor-based virtualization - e.g. VMware, Xen? While the parallelism offered by this tech would be awesome, what seems to be missing is the ability to address lots and lots of memory.
Back in '06, I remember seeing fear in the eyes of some hardware and software engineers. Within the next year, we were supposed to have 100 cores in our plain old desktops. How the heck were we going to program them? I found the situation a bit irrational. Every talk started with the death of Moore's Law because we couldn't shrink dies any further. More cores were posited as the only solution - except no one could code for them in general-purpose apps like Word, Excel, etc. In retrospect, I wonder why I don't have 100 cores in my desktop in 2012. I suspect it's because they aren't useful for the average Joe.
P.S. Forgive my directionless rambling. I don't have a particularly strong opinion on this subject anymore.
That plus cheap access to a massively parallel computer could also be very interesting.
Except that where a Raspberry Pi + online storage could be useful to many, many people, massive parallelism is probably only interesting to folks like us.
I suspect the problem is that it has no compelling (and immediate) "use case". If they could communicate a set of application ideas then I suspect that a whole new raft of supporters will be happy to risk at least $99.
Also, the $3 million stretch goal is just waaaaay too far off; it's too bad the better design is reserved for that level.
Hoping their funding drive succeeds. I like the fact that the ISA is being fully documented and that we will have a fully open-source toolchain to work with the system.
(Disclaimer: Not associated with Adapteva in any way).
But on purely geek terms this thing seems to warrant a "holy shit":
http://www.adapteva.com/products/silicon-devices/e64g401/
Again, I don't know how (un)common that sort of thing is, but I wasn't expecting to see 64 cores in that tiny a form factor. Does anyone here know how cutting-edge this thing is, if at all?
[Edit]
Also does anyone here want to address use cases for this thing?
Having looked at the data a bit more: I like their specs concerning system balance. 100 GFLOPS over 6.4 GB/s gives a system balance of 15.625 FLOPS per byte of memory bandwidth - about the same balance as a Westmere Xeon, which is pretty good for real-world algorithms.
For comparison: NVIDIA's Fermi has a system balance of about 20, meaning Fermi becomes bounded by memory bandwidth sooner - and memory bandwidth is very often the limiting factor in real-world computations.
One thing, though: high-performance computing is all about software and tooling support. If this company comes out with OpenCL support in C (or, even better, Fortran 90+), then we're talking.
Edit: By similar 'range' I meant the cores-per-mm^2 ratio.
For example, one particular embedded 40nm GPU design that I know about can deliver about 25 GFlops or so in the same die area.
edit: No OpenMP support.
Tilera did a very similar-looking 64 cores on a chip in 2007, which is the oldest instance I know of off the top of my head. Their devices cost (or at least used to cost) a few grand, though. Tilera has since bumped it up to around 100 cores per chip. I don't know anything about either architecture, so it is hard to say how 64 1 GHz Adapteva cores compare with 64 1.5 GHz Tilera cores.
So: not quite cutting edge, just an under-explored side channel.
Dedicated machines to host backend applications -- SQL servers, Apache, nginx, etc.
http://www.anandtech.com/show/2918/2
That first picture shows 4 clusters made of 4 sub-cores with 32 processing elements each. Now, NVIDIA would claim each of those 32 processing elements is a core, but those "cores" cannot act independently. So it is more like a very wide, very hyper-threaded 16-core processor.
It is not really a performance designation. It doesn't define a certain architecture or design.
It is pretty clearly an economic designation.
In general: money. Buying more of the most performant equipment available.
So. It's an economic designation.
(Though I see the more informative "50 GFLOPS/Watt" below... and I like the prospect of something that would make it cheap to play with large-scale real-time neural nets.)
The fact that the cores don't run in lockstep could be shader heaven! I'm imagining using the cores in a pipeline with zoning, so one core 'owns' a tile of the screen and does z-buffering, another core does clipping of graphics primitives for each tile, and a sea of compute cores between them chews up work and pushes it onwards.
You could also use the cores as a kind of spatial index, passing rays to other cores as they propagate beyond the AABB belonging to each core.
Doubtless it wouldn't work like that, and wouldn't work well. But it's fun to think about! :)
I can see this platform being a good tool for students and researchers to experiment with algorithmic speedups by parallelizing their sequential code.
In my parallel programming class, our teacher had to rig together a computer lab, connecting 12 quad-core computers to simulate a 64-core cluster. Then again, a 64-core cluster of Parallellas would cost something like $7,000. You could get the same 64-core setup by buying eight 8-core consumer desktops for under $3,000, which would still be more cost-effective and probably have ten times more computing power because of the x86 architecture.
It is a more powerful expression of the benefit of scaling with parallelism: instead of scaling speed with respect to a fixed data size, you scale the data size with respect to a fixed speed.
Having more cores means you (sometimes) can have more data. You still need those parallel programmers with their parallel algorithms though :-)
And yes I get that it's open source blah blah blah, but this project is certainly part of the plan for an institutionally-funded business to make money. Adapteva is a .com, not a .org.
Separately: if Adapteva is only 8 months from delivering completed product to users, shouldn't they be able to raise more funds through traditional channels? They clearly have/had VC buy-in and can raise through institutional channels. If they are just finishing the final debugging/SDKs/etc. at this point, it's not a good sign that they can't raise another $750k from their existing backers to cover final launch costs.
I don't have a horse in this race, but it doesn't feel quite right to me.
http://www.youtube.com/user/GreenArraysInc?feature=CAQQwRs%3...
I would totally agree that memory constraints are sort of inherent to manycore architectures, but in this case I find them pushed to the limit.
If the Kickstarter falls through, what options could you still make available to hobbyists? Is there some version of your current prototype setup that you could sell, even if it's not one convenient board?
And if that is so, would expecting an Erlang compiler be out of the question? :)