While evented I/O is great for a certain class of problems (building network servers that move bits around in memory and across network pipes at both ends of a logic sandwich), it is a totally asinine way to write most logic. I'd rather deal with threading's POTENTIAL shared-mutable-state bullshit than have to write every single piece of code that interacts with anything outside of my process in async form.
In node, you're only really saved from this if you don't have to talk to any other processes and you can keep all of your state in memory and never have to write it to disk.
Further, threads are still needed to scale out across cores. What the hell do these people plan on doing when CPUs are 32 or 64 core? Don't say fork(), because until there are cross-process heaps for V8 (aka never), that only works for problems that fit well into the message-passing model.
It won't work for every problem, of course.
dnode is a good way to easily talk to other node.js processes without the HTTP overhead. It can talk over HTTP too, with socket.io.
node-http-proxy is useful as a load balancer, and a load balancer can distribute work between cores.
Finally, most of the node.js people I've met, online and offline, are polyglots, and are happy to pick a good tool for a job. But right now node.js has great libraries for realtime apps, the ability to share code on the client and server in a simple way, and good UI DSLs like jade, less, and stylus.
I feel you about the polyglot and tend to agree, but I think some people are really trying to force awkward things into node, like people attempting to write big full-stack webapps using it.
To handle branching flow-control like 'if' statements, Twisted gives you the Deferred object[1], which is basically a data structure that represents what your call stack would look like in a synchronous environment. For example, the author's example would look something like this in a hypothetical JS port:
d = asynchronousCache.get("id:3244"); // returns a Deferred
d.addCallback(function (result) {
    if (result == null) {
        return asynchronousDB.query("SELECT * from something WHERE id = 3244");
    } else {
        return result;
    }
});
d.addCallback(function (result) {
    // Do various stuff with the result here
});
Not quite as elegant as the original synchronous version, but much tidier than banging raw callbacks together - and more composable. Deferred also has a .addErrback() method that corresponds to try/catch in synchronous code, so asynchronous error-handling is just as easy.

For the second issue raised, about asynchronous behaviour in loops, Twisted supplies the DeferredList - if you give it a list (an Array, in JS) of Deferreds, it will call your callback function when all of them have either produced a result or raised an exception - and give you the results in the same order as the original list you passed in.
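The DeferredList idea maps directly onto what JavaScript later standardized as Promise.all: fire once every input has a result, and hand back results in the original list order. A minimal sketch (the delayed() helper is a hypothetical stand-in for real asynchronous work):

```javascript
// Hypothetical promise-producing task standing in for a Deferred.
function delayed(value, ms) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value); }, ms);
  });
}

// Like DeferredList: fires when every input has produced a result,
// and the results come back in list order, not completion order.
var all = Promise.all([delayed("a", 30), delayed("b", 10), delayed("c", 20)]);
all.then(function (results) {
  console.log(results.join(","));  // prints "a,b,c"
});
```

Note that "b" finishes first, but still comes back second, which is exactly the ordering guarantee described above.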
It is a source of endless frustration to me that despite Twisted having an excellent abstraction for dealing with asynchronous control-flow (one that would be even better with JavaScript's ability to support multi-statement lambda functions), JavaScript frameworks generally continue to struggle along with raw callbacks. Even the frameworks that do support some kind of Deferred or Promise object generally miss some of the finer details. For example, jQuery's Deferred is inferior to Twisted's Deferred: http://article.gmane.org/gmane.comp.python.twisted/22891
[1]: http://twistedmatrix.com/documents/current/core/howto/defer....
The differences between your example and the common JavaScript practice for promises (when they're used; most of the time they aren't) are that then is used instead of addCallback and that chaining is available and taken advantage of.
getSomething("id:3244", function (err, thething) {
    // one true code path
});

function getSomething(id, callback) {
    var myThing = synchronousCache.get(id);
    if (myThing) {
        callback(null, myThing);
    } else {
        async(id, callback); // fall back to the asynchronous source
    }
}
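The then-and-chaining style mentioned above would express the same cache-or-fallback flow like this; asynchronousCache and asynchronousDB here are hypothetical promise-returning stand-ins, with the cache simulating a miss:

```javascript
// Hypothetical promise-returning stand-ins for the cache and DB.
var asynchronousCache = {
  get: function (key) { return Promise.resolve(null); }  // simulate a miss
};
var asynchronousDB = {
  query: function (sql) { return Promise.resolve({ id: 3244 }); }
};

var result = asynchronousCache.get("id:3244")
  .then(function (cached) {
    // Returning another promise from then() splices it into the chain,
    // just like returning a Deferred from addCallback.
    return cached !== null
      ? cached
      : asynchronousDB.query("SELECT * from something WHERE id = 3244");
  });

result.then(function (myThing) {
  console.log(myThing.id);  // prints 3244 on a cache miss
});
```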
A minor quibble with language style isn't exactly what I would call "A Giant Step Backwards".

I was under the impression that you could not do _anything_ synchronous? What if the call blocks for 100ms? Or 1000ms? Won't that delay all other clients and all other requests?
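That concern is real: node's event loop is single-threaded, so any synchronous work delays every pending callback. A small demonstration (the 100ms figure is arbitrary):

```javascript
// A timer scheduled for 0 ms still cannot fire until the
// synchronous busy-loop below releases the event loop.
var start = Date.now();
setTimeout(function () {
  console.log("timer fired after ~" + (Date.now() - start) + " ms");
}, 0);

// Block the one and only thread for roughly 100 ms.
var until = Date.now() + 100;
while (Date.now() < until) { /* spin */ }
// Only now can the event loop run the timer callback - along with
// every other client's pending callbacks.
```

So yes: a single blocking call stalls all other requests, which is why everything touching the outside world has to be async in this model.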
It also made for a better title than "Confusion, Then Indifference, Slowly Turning Into Understanding & Affinity"
I recently wrote a project that needs to do hundreds or thousands of possibly slow network requests per second. The first try was Ruby threads. That was a disaster (as I should have predicted). I had an entire 8-core server swamped and wasn't getting near the performance I needed.
The next try was node. I got it running and the performance was fantastic. A couple orders of magnitude faster than the Ruby solution and a tenth of the load on the box. But, all those callbacks just didn't sit right. Finding the source of an exception was a pain and control flow was tricky to get right. So, I started porting to other systems to try to find something better. I tried Java (Akka), EventMachine with/without fibers, and a couple others (not Erlang though).
I could never get anything else close to the performance of Node. They all had the same problems I have with Node (mainly that if something breaks, the entire app just hangs and you never know what happened), but they were way more complicated, _harder_ to debug, and slower.
I have a new appreciation for Node now. And now that I'm much more used to it, it's still difficult to do some of the more crazy async things, but I enjoy it a lot more. It's a bit of work, and you have to architect things carefully to avoid getting indented all the way to the 80-char margin on your editor, but you get a lot for that work.
It's asynchronous, not actually parallel. Only a single CPU core will be used in node.js.
However, while asynchronous tasks are waiting, other tasks can run in the meantime, which can feel like parallelism.
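That interleaving can be seen directly: two "tasks" that each wait asynchronously make progress in alternation on the single thread:

```javascript
var log = [];

// Each "task" does one step, then yields to the event loop
// via setTimeout before doing the next step.
function tick(name, count) {
  if (count === 0) return;
  log.push(name + count);
  setTimeout(function () { tick(name, count - 1); }, 0);
}

tick("a", 2);
tick("b", 2);

// The steps interleave - a2, b2, a1, b1 - which is concurrency
// on one core, not parallelism.
setTimeout(function () { console.log(log.join(",")); }, 50);
```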
I don't mean to be offensive, but welcome to at least the 1980s. We've known this doesn't scale for ages. The fact that you even tried it and thought it might be a viable solution just shows your education has failed you. I am highly biased against Node, I think it is a giant step backwards. Every blog post I have read that says the opposite admits they have no experience in anything else so they just default to Node being good. I only hope Node is a fad.
This holier-than-thou attitude is exactly the thing that prevents more people from becoming educated on these kinds of subjects. Knowledge and experience in these areas are _not_ trivial and are _not_ easy to obtain! Information about what scales, what does not, and why, is scattered all over the place and difficult to find. It may be very obvious to you after you already know it, but it's really not. If, instead of spending so much time declaring other people dumb or uneducated, people would spend more time educating other people, the world would be much better off.
And on a side note "your education has failed you"? Seriously? You can't just preface something with "I don't mean to be offensive" and then say whatever you like. I don't mean to be offensive, but get yourself some social skills.
Also the first example, the cache hitting and missing, could be rewritten with async, too.
async.waterfall([
    function(callback) {
        asynchronousCache.get("id:3244", callback);
    },
    function(myThing, callback) {
        if (myThing == null) {
            asynchronousDB.query("SELECT * from something WHERE id = 3244", callback);
        } else {
            callback(null, myThing); // waterfall callbacks are error-first
        }
    },
    function(myThing, callback) {
        // We now have a thing from the DB or cache; do something with the result
        // ...
    }
]);

From a readability standpoint I'll take the "old" version any day:
function getFromDB(foo) {
    var result = asynchronousCache.get("id:3244");
    if (null == result) {
        result = asynchronousDB.query("SELECT * from something WHERE id = 3244");
    }
    return result;
}

x = db.getFutureResult("x");
y = db.getFutureResult("y");
whenFuturesReady([x, y], function (x, y) {
    useResults(x, y);
});
This looks reasonably similar to typical synchronous code,

x = db.getResult("x")
y = db.getResult("y")
useResults(x, y)

but it allows db queries to happen simultaneously and doesn't break the node paradigm.

http://gfxmonk.net/2010/07/04/defer-taming-asynchronous-java...
There's a die-hard core of callback proponents (especially in twisted- and lately in node-land) who claim the pure callback-style is more predictable, robust and testable.
This is not my experience. I've been through that with twisted (heavily), some with EventMachine and some with node.js.
The range of use-cases where I'd benefit from that style was extremely narrow.
For most tasks it would turn into a tedium of keeping track of callbacks and errbacks, littering supposedly linear code-paths with a ridiculous number of branches, and constantly working against test-frameworks that covered the easy 90% well but then fell down on the interesting 10% (i.e. verifying the interaction between multiple requests or callback-paths).
I'm sticking to coroutines where possible now (eventlet/concurrence) and remain baffled over the node-crew's resistance against adding meaningful abstractions to the core.
I like javascript a lot (more so with coffee), but I see little benefit in dealing with the spaghetti when that doesn't even give me transparent multi-process or multi-machine scalability.
And to prevent the obligatory: Yes, I know about Step, dnode and the likes. They remain kludges as long as the default style (i.e. the way all libraries and higher level frameworks are written) is callback-bolognese.
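For reference, the coroutine style described above is essentially what JavaScript later standardized as async/await: the cache-or-DB branch reads linearly again. The promise-returning asynchronousCache/asynchronousDB objects here are hypothetical stand-ins, with the cache simulating a miss:

```javascript
// Hypothetical promise-returning stand-ins.
var asynchronousCache = { get: function (key) { return Promise.resolve(null); } };
var asynchronousDB = { query: function (sql) { return Promise.resolve("row-3244"); } };

// Reads like the synchronous original, but each await yields
// control back to the event loop instead of blocking it.
async function getThing() {
  var result = await asynchronousCache.get("id:3244");
  if (result === null) {
    result = await asynchronousDB.query("SELECT * from something WHERE id = 3244");
  }
  return result;
}

getThing().then(function (thing) { console.log(thing); });  // prints "row-3244"
```

This is exactly the "supposedly linear code-path" kept linear: the branches are ordinary if statements, not callback wiring.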
I believe that JavaScript could become the dominant language on the server. We just need to have a set of consistent synchronous interfaces across the major server side JavaScript platforms. This would allow for innovation and code reuse higher up the stack.
I'm doing my bit by maintaining Common Node (https://github.com/olegp/common-node), which is a synchronous CommonJS compatibility layer for Node.js.
Wouldn't it be better to describe it as running serially, using non-blocking asynchronous function calls? Guess that doesn't really roll off the tongue, though.
https://github.com/scalien/scaliendb/blob/master/src/Framewo...
I guess the OP is saying inlining [in a language where this is even possible] leads to unreadable code, which sounds about right.
function handler(yes, no) {
    return function (err, data) {
        if (data) {
            yes(err, data);
        } else {
            no(err, data);
        }
    };
}

function get() {
    function done(err, data) {
        // do something with data
    }
    function db() {
        asynchronousDb.query("SELECT * from something WHERE id = 3244", done);
    }
    asynchronousCache.get("id:3244", handler(done, db));
}

My experience (mostly in perl - EV, AnyEvent, etc.) is that combining events with finite state machines gives more structured code, with smaller functions that interact in a predefined manner.
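A minimal sketch of that event-plus-FSM structure, using the cache/DB flow from this thread as the example (state and event names are made up for illustration):

```javascript
// Minimal event-driven finite state machine: each state maps
// incoming events to a handler that returns the next state.
var states = {
  idle:     { start: function () { return "fetching"; } },
  fetching: {
    hit:  function () { return "done"; },       // cache had the value
    miss: function () { return "querying"; }    // fall through to the DB
  },
  querying: { result: function () { return "done"; } },
  done:     {}
};

function Machine() { this.state = "idle"; }
Machine.prototype.emit = function (event) {
  var handler = states[this.state][event];
  if (handler) this.state = handler();  // unknown events are ignored
  return this.state;
};

var m = new Machine();
m.emit("start");   // -> "fetching"
m.emit("miss");    // -> "querying"
m.emit("result");  // -> "done"
console.log(m.state);  // prints "done"
```

Each callback then only has to emit an event; the legal orderings live in one table instead of being implicit in nested closures.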
Meanwhile there are other choices that are about as easy, like Python libraries and Google's Go. Too bad they don't have the same zealous community support.
There is SpiderNode; I'm not sure what its status is, but it replaces V8 in node.js with SpiderMonkey. SpiderMonkey already has yield and a lot of other new JS syntactic sugar.
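yield is what makes sync-looking async code possible: a driver function can feed each asynchronous result back into the generator at the yield point. A minimal runner in that style (the pattern libraries like co later popularized; run() here is a made-up name):

```javascript
// Drive a generator: each yielded promise is awaited, and its
// value is sent back into the generator at the yield point.
function run(genFn) {
  var gen = genFn();
  return new Promise(function (resolve) {
    (function step(value) {
      var next = gen.next(value);
      if (next.done) return resolve(next.value);
      Promise.resolve(next.value).then(step);
    })();
  });
}

var p = run(function* () {
  var a = yield Promise.resolve(2);  // looks synchronous...
  var b = yield Promise.resolve(3);  // ...but each yield suspends
  return a + b;
});
p.then(function (sum) { console.log(sum); });  // prints 5
```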
http://blog.zpao.com/post/4620873765/about-that-hybrid-v8mon...
That post mentions they're working closely with the node team. And the whole talk is about fixing up JavaScript into a modern language: removing the weird syntax quirks around classes, modules, etc., so you can say what you mean instead of writing weird closure soup.
I had many of the same concerns with node.js. Every time I attempted to wrap my head around how I'd write the code I needed to write, it seemed like node was making it more complicated. Since I learned erlang several years ago, and first started thinking about parallel programming a couple decades ago, this seemed backwards to me. Why do event driven programming, when erlang is tried and true and battle tested?
The reason is, there isn't something like node.js for erlang, and so I set out to fix that.
For about a year I've been thinking about design, and for a couple of months I've been implementing a new web application platform that I'm calling Nirvana. (Sorry if that sounds pretentious. It's my personal name for it - I've been storing up over a decade's worth of requirements for my "ideal" web framework.)
Nirvana is made up of an embarrassingly small amount of code. It allows you to build web apps and services in coffeescript (or javascript) and have them execute in parallel in erlang, without having to worry too much about the issues of parallel programming.
It makes use of some great open source projects (which do all the heavy lifting): Webmachine, erlang_js and Riak. I plan to ship it with some appropriate server side javascript and coffee script libraries built in.
Some advantages of this approach: (from my perspective)
1) Your code lives in Riak. This means rather than deploying your app to a fleet of servers, you push your changes to a database.
2) All of the I/O actions your code might do are handled in parallel. For instance, to render a page, you might need to pull several records from the database, and then based on them, generate a couple map/reduce queries, and then maybe process the results from the queries, and finally you want to render the results in a template. The record fetches happen in parallel automagically in erlang, as do the map/reduce queries, and components defined for your page (such as client js files, or css files you want to include) are fetched in parallel as well.
3) We've adopted Riak's "No Operations Department" approach to scalability. That is to say, every node of Nirvana is identical, running the same software stack. To add capacity, you simply spin up a new node. All of your applications are immediately ready to be hosted on that node, because they live in the database.
4) Caching is built in, you don't have to worry about it. It is pretty slick- or I think it will be pretty slick-- because Basho did all the heavy lifting already in Riak. We use a Riak in-memory backend, recently accessed data is stored in RAM on one of the nodes. This means each machine you add to your cluster increases the total amount of cache RAM available.
5) There's a rudimentary sessions system built in, and built in authentication and user accounts seem eminently doable, though not at first release. Also templating, though use any js you want if you don't like the default.
So, say, you're writing a blog. You write a couple handlers, one for reading an article, one for getting a list of articles and one for writing an article. You tie them to /, /blog/article-id, and /post. For each of these handlers, any session information is present in the context of your code.
To get the list of articles, you just run the query, format the results as you like with your template preference and emit the html. If it is a common query, you just set a "freshness" on it, and it will be cached for that long. (E.g. if you post new articles once a week, you could set the freshness to an hour and it would pull results from the cache, only doing the actual query once an hour.)
To display a particular article, run a query for the article id from the URL (which is extracted for you) and, again this can be cached. For posting, you can check the session to see if the person is authorized, or the header (using cookies) and push the text into a new record, or update an existing record. Basically this is like most other frameworks, only your queries are handled in parallel.
The goal is to allow rapid development of apps, easy code re-use, and easy, built-in scalability, without having to think much about scalability, or have an ops department.
This is the very first time I've publicly talked about the project. I think that I'm doing something genuinely new, and genuinely worth doing, but it's possible I've overlooked something important, or otherwise embarrassed myself. I don't mean to hijack this thread, but felt that I needed to out my project sometime. A real announcement will come when I ship.
If you're interested in keeping up to date with the project I describe above, please follow me on twitter @NirvanaCore.
EDIT TO ADD: -- This uses Riak as the database with data persisted to disk in BitCask. The Caching is done by a parallel backend in Riak (Riak supports multiple simultaneous backends) which lives in RAM. So, the RAM works as a cache but the data is persisted to disk.
Ship it tomorrow! ;)
Yes, you have overlooked something important; there will be something to be embarrassed by -- whether it turns up next week or next decade -- and we'll all have a good laugh. Don't sweat it. And don't worry that the thing isn't finished; the kind of geeks who might sign on at this stage like unfinished things; that is why they can't resist reinventing the wheel. Plus, it doesn't have to be finished to give people ideas, which is half the point. You are ready to start spreading the news; your writeup says as much.
The public repository beckons!
(Frankly, this sounds like a great experiment, although I would never be too quick to predict the end of the ops department. ;)
I also shouldn't predict the end of the ops department, until I've had it running in production with a significant number of users.
I think it would be better to say- my goal is to have the ops department working on really interesting stuff, rather than shepherding a fleet of servers, every one of which has a different configuration.
I have some plans in this area, but I couldn't guess how to best fit into other people's workflows.
What might be nice is if there were a way to sync a git repository with Riak, so Nirvana could just pull the relevant code from that. That seems like the best solution, but looking into it -- from possibly integrating with a GitHub API (do they have one?) to command-line scripts -- is something I'm punting on to focus on the essentials.
But I do agree with your points!
That is not true, see: https://github.com/hookio/hook.io, been in development in Node.js for over two years.
By which, I meant to say "platform for building server applications in javascript, backed by the power of the erlang OTP platform."
Node.js gives server side javascript a platform, that's great. What I'm working on is giving server side coffeescript and javascript access to the erlang platform (and some really great erlang technologies.)