I haven't tried celluloid yet, but I've learned one lesson building things with PHP, Ruby, Python, Node, Java, Scala, etc. That lesson is to use tools with a community of people around it that are using those tools to solve the same problems as you. The more community is around a project the more likely that someone else has stumbled upon whatever bug or problem you have hit already and found a solution if it exists. Also, documentation tends to be better and easier to find on more popular languages and frameworks.
So, if you like busting out CRUD web apps, you probably should look at PHP and Zend/Symfony/etc. framework or Ruby on Rails or Python Django or Java Play. If you like doing single page JS apps you should look at Knockout, Backbone, and Ember. If you want to do more evented/parallel networked apps you should look at node, scala, clojure, go, and erlang because those communities care a lot about threading, evented, actor pattern type programming.
Evented/multicore/multithreaded programming is just not something that say PHP, Ruby, and Python have embraced as much as a community because it's not for the most part a problem that the average PHP, Ruby, and Python dev is trying to solve most of the time.
Node does not own evented I/O.
I didn't mean to say node owns evented I/O, just that their whole community embraces it.
Checkout Akka, it's awesome: http://akka.io/
It's well documented, straightforward to use, robust and works seamlessly on top of many event loops (libev, libevent, Glib, Tk and more).
And I just need to look at AnyEvent::* namespace on CPAN to see what can be run with it - http://search.cpan.org/search?query=anyevent%3A%3A*&mode... (currently lists 630 modules)
Use em-synchrony…
I think DRb pretty cool and has long been underutilized. That said, there are a few fundamental design problems with the way it works that I think DCell solves:
1. Distributed systems really need to be built on top of asynchronous protocols, and DRb is a direct mapping of a synchronous method dispatch protocol onto a distributed systems protocol. Similar attempts at this include: CORBA and SOAP. If anyone disagrees with this I can go into more detail but I think you will find ample distributed systems literature condemning synchronous protocols. DCell is fully asynchronous (but also provides synchronous calls over an underlying asynchronous protocol)
2. DRb is multithreaded but does not provide the user with any assistance in building multithreaded programs. This becomes particularly confusing when you have to deal with a proxy object (DRb::DRbObject) in a remote scenario but not in a local scenario
1. https://github.com/celluloid/dcell/commit/e3115f284084a78756...
EventMachine is • A frankenstein guts the ruby internals • Not in active development • Makes non-blocking IO block • Requires special code from Ruby libraries • Hard to use in an OOP way • Is really difficult to work with • Poorly documented
I'm not sure I understand how EventMachine "guts the Ruby internals" (I didn't watch the talk). It's true that EventMachine's internals are C++, not Ruby; there was originally a reason (again, I think it had to do with green threads) that it was designed this way. I'm not sure I can think of the Ruby functionality that EventMachine changes, or the manner in which EventMachine mucks with the interpreter or its runtime. I'm obviously ready to be corrected, but I'm missing how this impacts me as a programmer. Maybe he's talking about exception handling?
I also wasn't aware EventMachine "wasn't under active development". Because it's just an IO loop. Is libevent in active development? Do I need to be aware of that to use it? The underlying OS capabilities EventMachine maps haven't changed in over a decade. I think I'm actually happy they aren't constantly changing it.
I also don't understand how EventMachine "makes non-blocking block". All EventMachine I/O is nonblocking; it's essentially a select loop.
I also don't understand what special code EventMachine demands from libraries. Maybe he means database libraries? That is, maybe he's referring to the fact that you can't use standard Ruby database libraries that rely on blocking I/O inside an EventMachine loop? I'm wondering, then, what he expected. We wrote a little library (a small part of it is on Github) to do evented Mysql, but we stopped doing that when we realized that Redis evented naturally, and we just hook Mysql up through Redis.
"Hard to use in OOP way" just seems wrong, given that the ~30 evented programs I can find in my codebase directory all seem to be pretty object-oriented. So, that's not so much a question on my part.
Really difficult to work with? I've taught 7 different people EventMachine, in a few hours each. EventMachine is easier than Ruby's native sockets interface, in several specific ways.
I think maybe the issue here isn't so much EventMachine, but the idea of using EventMachine as a substrate for frameworks like Sinatra and Rails. That idea is whack, I agree. Trying to retrofit a full-featured web framework onto an event loop seems like an exercise in futility.
But on the other hand, I've been writing Golang code for the past 2 months, and Golang is militantly anti-event; it doesn't even offer a select primitive! Just alternating read/write on two different sockets seems to demand threads! And what I find is, my programs tend to decompose into handler functions naturally anyways. I try to force myself to write socket code like I did when I was 13, reading a line, parsing it, and writing its response, but that code is brittle and harder to follow than a sane set of handler functions.
So, long story short: I'm not arguing that evented code is the best answer to every problem, or that web frameworks should all be evented, or that actor frameworks aren't useful. It's probably true that a lot of people rushed to event frameworks who shouldn't have done that. But there are problems --- like, backend processing, or proxies, or routers and transformers, or feed processors --- where event loops are the most natural way to express a performant solution.
EventMachine does not use the Ruby IO primitives (e.g. TCPSocket, UDPSocket, etc) as the basis of its IO abstraction, and instead has reimplemented its own set of primitives for doing IO.
Because of this it can't take advantage of work being done in Ruby core to advance Ruby's socket layer. For this reason IPv6 support langered, among other problems.
This also severely complicates making multiple implementations of the EventMachine API, such as its JRuby backend (which maps onto Java NIO)
I think the real problem is EventMachine's original goal was to be a cross-language I/O backend similar to libevent or libev, but since it wasn't a particularly good one, the only language that wound up using it was Ruby. Compare to Twisted, which is built on libevent, or to Node, which is built on libev/libuv
* select() call used to segfault. * the non blocking read/write behaved very differently under different OSes and I am not talking Windows. There were differences between Linux and OSX for example. * Again frequent crashes when reading/writing when nonblock flag is set.
I am sure, situation is lot better now but for building a reactor library, native Ruby IO primitives fell short. There is always lack of advanced selectors (Epoll/KQueue) as well.
Now as I replied to thomas(?) below and since you yourself have wrapped libevent for Ruby 1.9, there were severe limitation in interpreter back then for even libevent wrapper to work. So I guess, given historical reasons, it made sense Eventmachine did not use native Ruby IO primitives or libevent.
Again, this is apocryphal, but I remember someone trying to wrap Ruby around libevent and failing; I remember there being a reason this had to be done bespoke. And having fallen into this particular NIH trap many times before: there's just not a whole lot to a simple socket I/O loop.
What's been your experience with JRuby/EventMachine?
I am sure you could write a libevent-based reactor, but twisted was started a long time ago (10 years ? The twisted book in o'Reilly was published in 2005), before libevent existed I think.
Or compared to AnyEvent which works with [m]any event loop - https://metacpan.org/module/AnyEvent
s/anti-event/anti-callback/. They are not the same thing. Under the hood, Golang is doing the eventing and select() for you, while letting you write simple procedural code.
> I try to force myself to write socket code like I did when I was 13, reading a line, parsing it, and writing its response, but that code is brittle and harder to follow than a sane set of handler functions.
How is simple, procedural code hard to follow? Even the way you write it sounds simple "read, process, write". That is really more hard to follow than a series of onRead, onWrite callbacks?
> I'm not arguing that evented code is the best answer to every problem
No, but like many others you seem to be confusing and conflating call-back driven code with event-loop based code. Everyone agrees on the benefits of event loops over kernel threads. Not everyone agrees that call-backs are the best interface to event loops - more and more people are switching on to green threads (which look just like normal threaded code) as the best and simplest interface to the event-loop - as seen by Golang, gevent, Eventmachine, Coro (Perl) etc etc.
I don't want to turn this into "events work for every program", because like I said above: I don't think event interfaces work for all kinds of programs. I've found them poorly suited to full-featured web frameworks, for instance.
I did not predict this incredibly boring "evented runtime vs. evented API" controversy, but will dispense with it by saying that it is incredibly boring, so you win it in advance. :|
That list is just screaming for someone to write a comment mentioning that you forgot Erlang.
The lock conditions it has are for particular workloads at particular throughput/utilization, so 99% of the people using it won't hit those conditions initially (and a lot of people aren't writing systems that will ever reach those limits). However, we have a particular service that acts as a TCP multiplexer/router/load balancer to provide high availability for some of our mission critical applications. A little over a year ago, I initially wrote it using EventMachine in a couple of weeks, then spent almost a month trying to find what I thought was a bug in my code where a full deadlock would occur if 3 or more connections were established in a small enough time frame. Turned out to be a bug in EventMachine. After fixing that one, I found another where if 5 or more open connections fired the same event within a small enough time frame, all 5 would hit livelock and given enough connections hitting the condition, the whole process would deadlock. Once I hit the second bug in EventMachine itself directly related to concurrency handling, I switched to Netty and rewrote the whole thing in Scala in about a week and it's been rock solid since.
I've been doing some development on the side in Go because its particular flavor of types and concurrency model is fascinating. It's not that Go is anti-event, it's that it's overwhelmingly stream oriented (not low-level streams, but data streams). Once I finally hit that moment of clarity that goroutines/channels was all about connecting streams of data and not connecting raw streams, they became a much more natural solution. However, don't get me started on the difference between a non-blocking read on a channel and a blocking read on a channel.
> Golang is militantly anti-event; it doesn't even offer a select primitive! Just alternating read/write on two different sockets seems to demand threads!
Go only blocks the goroutine. It uses native event primitives (epoll/kqueue/etc) under the hood so that IO is non-blocking to the rest of the process (e.g. other goroutines). You said you have used Go for 2 months, so I assume you are aware of that.So..can you be more specific about what you were talking about in that statement?
Obviously, the whole of Golang's concurrency model is lightweight threads scheduled on I/O events. But that's not exposed to the programmer; in fact, it's hermetically sealed away from the programmer from what I can tell.
So now you know what I meant by that statement.
Do like the http library does and separate out socket handling to be in terms of net.Listeners/net.Conns, and create a handler abstraction for yourself. Your socket code will be testable, because its trivial to write fake net.Conns that are backed with byte buffers. Your handler code will be testable because they are in terms of parsed objects.
I tend to write socket code once per server project, and then leave it alone for months, so this isn't a big deal for me. What makes Go nice for me is that I can block in client code, and retain the efficiency of using a select loop.
I don't have access to the source for the proxy I wrote for work, but here's one I whipped up quickly: http://play.golang.org/p/Fz19qSehCg
Go doesn't expose select, and has the runtime do that for you; but this allows them to make all Go libraries share a select loop, which has nice performance characteristics. Although, now that I think about it, it is probably possible to have a userspace implementation of select that works atop the runtime's shared select loop. Hmmm...
What EventMachine does (or at least what it did a year or so ago when I was debugging this) is this: sub processes are equated to popen. When the input side of that pipe closes (that is, when the sub process closes STDOUT) the process will be finally be waited upon—and if it doesn't terminate in a hard coded timeout which by the way blocks the rest of your program, then it will be forcibly gunned down, with SIGKILL if necessary.
Among the problems that arise here, note that unless your daemon script is written to unconditionally drop STDOUT upon forking (uncommon) and you attempt to launch a daemon from within a sub process you are managing using EventMachine, the subprocess itself will terminate quickly, the daemon will go on its merry way, and your driver program will never, ever tell you it has finished running until that daemon is dead and anything it has spawned that might possibly use STDOUT is also dead. And god forbid it close that stream and then dare to continue running, for EventMachine will shoot it dead within IIRC 20 seconds, and lock up your driver program for the duration to boot.
Programmers who understand how the Unix process model works will write a very small signal handler for SIGCHLD that writes a byte on a pipe or some similar method of notifying the main event loop and call wait on the child immediately and then close its end of those pipes. I am reliably informed by those who understand the Windows process model that what EventMachine does is even more wrong there. This is a subsystem that was not written by anyone who knew what popen does, could not be bothered (or was perhaps incompetent to read) what any of a dozen standard implementations of it do, and appears to have debugged the code into some form of submission and then released it upon an unsuspecting public.
This is the only colossally wrong decision they made that I can list off the top of my head, but that's because it was so stupid I stopped looking for trouble after that. EventMachine does not handle anything but a very straightforward select loop very well, and I am sufficiently terrified of what lives under the covers in that system that I would rather write the select by hand (massive pain though it may be) than let this system anywhere near it. The thing that really alarms me is that people build walls of cardboard like NeverBlock (which reaches deeply into the guts of the Ruby software I/O and replaces it with EventMachine driven coroutines) atop this foundation of sand and then wonder when it falls over sideways in an impenetrable and impossible to debug fashion.
Coroutine programming (for that is, essentially, what we are talking about) can be a very elegant way to solve certain problems, but it works best when it is simple, or it least localized (e.g. samefringe). In an event driven server, every little piece must be audited carefully to ensure that it does not block. You get all the same problems any preemptive concurrency model does, with some added nastiness; in exchange you get some slightly better scalability numbers. It is at its best in a fairly simple program such as, say, nginx in its proxy configuration, where it speaks streams and SSL and talks to some application server on the other end of a different stream for anything sophisticated.
Your backend processing apparently fits in one machine, since you "hook Mysql up through Redis." I'm personally astonished you get so much done without a distributed environment tolerant of the relevant failures I see when I read that sentence.
We've been happily using Go for a few internal systems are are slowly galvanising most of our backend with it too. Life is happier.
I have yet to test it in production, but it looks pretty promising.
I had a lot of problems with EM blocking networks connections if an event loop was tying up CPU, but I suppose that's to be expected.
I knew a guy who chose Scala for a project so he could use Actors for concurrency.
The system never gave the same answers twice and wouldn't peg all the cores on a 4-way machine.
I spent two days trying to fix it, then I got wise and switched back to Java and got it working in 20 minutes with ExecutorService with (i) no race conditions, and (ii) nearly perfect scaling up to eight cores.
Anyway, just wanted to throw it out there that I've deployed several production apps (gaming/messaging servers) using EventMachine, and combined with em-synchrony, I'm pretty comfortable with its performance and limitations. There seems to be good community support (thanks to Ilya Grigorik's articles and em-synchrony) with plenty of examples/documentations.
Just wondering, are there any people out there with Celluloid app experience that's currently in production? Like any other geek, I love programming "the right way" but I know nothing about Celluloid and it's real-world benefits/drawbacks (performance, API, code maintainability, code documentation, community support, blog articles, etc).