1 - Don't catch errors unless you can actually handle them (and chances are, you can't handle them). Let them bubble up to a global handler, where you can have centralized logging. There's a fairly old discussion with Anders Hejlsberg that talks about this in the context of Java's miserable checked exceptions, which I recommend [1]. This is also why, in my mind, Go gets it wrong.
2 - In the context of error handling (and system quality), logging and monitoring are the most important things you can do. Period. Only log actionable items, or else you'll start to ignore your logs. Make sure your errors come accompanied by a date (this can be done in your central error handler, or at ingestion time, via Logstash or what have you).
3 - Display generic/canned errors to users: errors can contain sensitive information.
4 - Turn errors you run into and fix into test cases.
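A minimal sketch of point 1 in Node.js (the JSON log shape and the exit-on-crash policy are my assumptions, not anything prescribed above):

```javascript
// Centralized formatter so every error is logged with a timestamp,
// as point 2 suggests doing in the central handler.
function formatError(err) {
  return JSON.stringify({
    time: new Date().toISOString(),
    name: err.name,
    message: err.message,
    stack: err.stack,
  });
}

// Global last-resort handlers: log centrally, then exit so a
// supervisor (systemd, pm2, etc.) can restart the process.
process.on('uncaughtException', (err) => {
  console.error(formatError(err));
  process.exit(1);
});

process.on('unhandledRejection', (reason) => {
  const err = reason instanceof Error ? reason : new Error(String(reason));
  console.error(formatError(err));
  process.exit(1);
});
```

Everything below the handlers can then throw freely without worrying about where the error lands.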
Catching exceptions to throw exceptions with better messages is something I would strongly suggest, since almost no exceptions are useful without contextual information. For example: which file was not found? The config, not the input or output. Things like this. This is especially useful in C++, where you don't get stack traces, but in other languages too you'll want to present non-technical (i.e. non-dev) users with meaningful messages. Stack traces will just frighten them off.
Your comment on sensitive information also plays into this.
I'll agree that just swallowing errors and going on is a recipe for disaster. This is something that regularly bugged me in most of the C code I've encountered so far.
Also, not every user in the world understands English.
I catch exceptions in two levels. One is the user initiated action. I seldom see this specified, but it seems obvious to me: if the user decides to do X, either X is done or a clear and meaningful message is shown, explaining what and why could not be done.
There is another finer grained location to do what you say (collecting details), logging and then re-raising up to the other level.
Swallowing errors is evil and no, not limited to C code.
Much more important is to wrap these exceptions generated in internal code in custom types. List of raised exceptions is a part of code's interface, and one shouldn't generally expose this kind of internal details as public API.
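Sketching that wrapping-in-custom-types idea in JavaScript (StorageError and the backend object are hypothetical names, invented for illustration):

```javascript
// A custom error type that belongs to this module's public interface,
// so callers never depend on the underlying library's exception types.
class StorageError extends Error {
  constructor(message, options) {
    super(message, options);
    this.name = 'StorageError';
  }
}

function save(record, backend) {
  try {
    backend.write(record);
  } catch (err) {
    // The backend's error type is an internal detail; expose ours,
    // keeping the original as the cause for diagnostics.
    throw new StorageError('failed to save record', { cause: err });
  }
}
```

Callers can now write `catch (e) { if (e instanceof StorageError) ... }` without caring which backend is in use.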
(I have some code for WinCE that walks stack traces in conjunction with SEH so that crashes in production - segfault etc - get logged in a useful manner. It does rely on parsing and decoding instructions ...)
Easy to say but much harder to implement. For example, if you communicate with another service, a few network errors are usually not actionable and you'd have some fallback mechanism in your code. But tons of network errors (e.g. > 20%) is a problem that needs to be fixed now. So would you log the network error or not?
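One possible answer, sketched in JavaScript: record every failure quietly, but only escalate when the rate over a sliding window crosses a threshold (the 20% figure comes from the comment above; the window size is my assumption):

```javascript
// Tracks success/failure outcomes over a sliding window and decides
// when the failure rate has become actionable.
class ErrorRateMonitor {
  constructor(threshold = 0.2, windowSize = 100) {
    this.threshold = threshold;
    this.windowSize = windowSize;
    this.outcomes = []; // true = failure, false = success
  }
  record(failed) {
    this.outcomes.push(failed);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
  }
  rate() {
    if (this.outcomes.length === 0) return 0;
    return this.outcomes.filter(Boolean).length / this.outcomes.length;
  }
  shouldAlert() {
    return this.rate() > this.threshold;
  }
}
```

Individual network errors then go to a debug-level log (or nowhere), while the alert that pages someone fires only on sustained breakage.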
Add Java's checked exceptions, and the practical differences become quite subtle. Of course it's nice to be able to do transformations with higher-level functions.
Considering the number of places I see C# code catch errors and continue as though nothing happened, I have to wonder how many Go and C programmers in the wild simply ignore error codes?
However, the fact that "ignore error" is an easy and built-in paradigm that even shows up in the official docs:
fragileThing, _ := scary.MightNotWork()
that fills me with dread.

I would even define an error to be a situation the code can't handle properly, and those things are usually not under your control (if I rip out a hard disk while some program is running, the developer can't do anything against that - but I would hope that the program fails accordingly and states why it failed).
I'm obviously paraphrasing here, but things like this do happen (USB devices get disconnected, remote services go down etc).
Edit: Obviously, this applies to different code at different levels. Serialization code might fail due to input - but that also is not under the control of the dev writing the serialization logic. Thus, it should fail.
Most files a program opens are not as a result of user action: configuration, libraries, resources, etc. And usually it doesn't make sense to catch these errors at the point of occurrence, because they'll be all over the codebase. And there's very little you can do in response to them.
You should always know how the API you are using returns an error, and use it - most languages have multiple common ways of notifying of an error.
As to leaking: coming from Java's terrible error handling, you just start using runtime errors all the time anyway; at least with JS the result is more concise.
Promises, aside from being far more concise with a huge amount of utility, do not leak errors or exceptions.
doAMillionThings()
  .catch((err) => handleAnything(err));

http://www.gigamonkeys.com/book/beyond-exception-handling-co...
1) interactive debugger
2) programmatic
There is another one:
3) restart dialog
The program presents you a list of restarts, for example in a GUI dialog, and the end user can select a restart - without interacting with a debugger.
The debugger is just one program, which may display the restarts.
That's how one used it in applications on a Lisp Machine. To call a debugger could be an option in the list of restarts. For real end users, even the call to the debugger might not be available, and all they can do is choose an option from the list of restarts. Symbolics offered something called 'Firewall', which did all it could to hide the underlying Lisp system from the end user - here the end user should not interact with a debugger or Lisp listener.
But even in a Lisp listener, if you used the 'Copy File' command you might get a dialog shown with the typical options: abort, try again, use other file, enter debugger, ...
https://github.com/matlisp/matlisp-optimization/blob/master/...
As a fellow Indian lisper, are you by any chance using CL for work ?
Last I heard, the only big CL shop, cleartrip, moved all their codebase to Ocaml.
As others have already mentioned, much of the rest is quite specific to Node/JS, and many of the issues raised there could alternatively be solved by simply choosing a better programming language and tools. The degree to which JS has overcomplicated some of these issues is mind-boggling.
Basically the argument is that once you reach a logic error (e.g. NullReferenceException, IndexOutOfBounds etc) you already potentially corrupted the application state, so using any part of the application state is dangerous, and saving it to be used once the program has been restarted makes it worse - then you load the corrupted state into your restarted program. So while saving data is prudent - it should be done at regular intervals so that after a logic/programmer error is detected, the program can reload saved data from before the error occurred, not after.
One can also imagine having nested "top level" handlers for the various contexts, where errors in one type of context are not as serious as in others. Example: in a graphical application, an exception arising from a mistake in UI code does not affect the "document" the user has open, so it might be possible to "handle" this error by simply reinitializing the UI and reloading the active document (since we know the active document). An exception due to a logic error thrown during a transaction on the document, on the other hand, should probably be considered corrupting, so the application must try to reload some document state from earlier instead. If there is no such state then the correct thing to do is to tear down the application, even if it means losing the document. It's better to lose the work and let the user start over than to allow the user to continue working with data he isn't aware is corrupt.
They also recommend configuring Node to dump core on programmer error, which includes (literally) all of the diagnostic information available on the server.
It really depends upon the language and environment used. I work with C (almost legacy code at this point), and if the program generates a segfault, there is no way to safely store any data (for all I know, it could have been trying to auto-save recovery data when it happened). About the best I can hope for is that it shows itself during testing, but hey, things slip into production (last time that happened in an asynchronous, event-driven C program, the programmer maintaining the code violated an unstated assumption made by the initial developer (who was no longer with the company) and the program went boom in production). At that point, the program is automatically restarted, and I get to pore through a core dump to figure out the problem.
I'm not a fan of defensive programming as it can hide an obvious bug for a long time (I consider it a Good Thing that the program crashed, otherwise we might have gone months, or even years, without noticing the actual bug).
Logging is an art. Too little, and it's hard to diagnose. Too much and it's hard to slog through. There's also the possibility that you don't log the right information. I've had to go back and amend logging statements when something didn't parse right (okay, what are our customers sending us now? Oh nice! The logs don't show the data that didn't parse---the things you don't think about when coding).
And then there are the monumental screw-ups that no one foresaw the consequences of. Again, at work, we receive messages on service S, which transforms and forwards the request to service T, which queries service E. T also sends continuous queries (a fixed query we aren't charged for [1]) to E to make sure it's up. Someone, somewhere, removed the fixed query from E. When the fixed query to E returned "not found," the code in T was written in such a way that it failed to distinguish "not found" from "timed out" (because that fixed query should never have been deleted, right?) and thus, T shut down (because it had nothing to query), which in turn shut down S (because it had nothing to send the data to), which in turn meant many people were called ...
Then there was the routing error which caused our network traffic to be three times higher than expected and misrouted UDP replies ...
Error handling and reporting is hard. Maybe not cache-invalidation-and-naming-things hard, but hard nonetheless.
[1] Enterprise system here.
Not when you do it the right way! You should only mitigate unexpected situations if you also log them, monitor them and handle them with an error callback, etc.
Also see my other comment in this thread : https://news.ycombinator.com/item?id=12871541
FWIW Inkscape tries to save the current document to (IIRC) the user's home directory, displays a message to tell the user about it and quits.
I've had segfaults "hidden" for a long time because my artist coworkers weren't reporting crashes in their tools. They assumed a 5 minute fix was something really complicated. Non-defensive programming is no panacea here. Worse, non-defensive programming often meant crashes well after the initial problem anyways, when all sane context was lost.
My takeaway here is that I need to automatically collect crashes - and other failures - instead of relying on end users to report the problem. This is entirely compatible with defensive programming - right now I'm looking at sentry.io and its competitors (and what I might consider rolling myself) to hook up as a reporting back end for yet another assertion library (since none of them bother with C++ bindings.) On a previous codebase, we had an assert-ish macro:
..._CHECKFAIL( precondition, description, onPreconditionFailed );
Which let code like this (to invent a very bad example) not fatally crash:

..._CHECKFAIL( texture, "Corrupt or missing texture - failed to load [" << texturePath << "]", return PlaceholderTexture() );
return texture;
Instead of giving me a crash deep in my rendering pipeline minutes after loading with no context as to what texture might be missing. Make it annoying as a crash in your internal builds and it will be triaged as a crash. Or even more severely, possibly, if simply hitting the assert automatically opens a bug in your DB and assigns your leads/managers to triage it and CCs QA, whoever committed last, and everyone who reviewed last commit ;)

> Logging is an art.
You're right, and it's hard. However, it's very easy to do better than not logging at all.
And I think something similar applies to defensive programming. You want null to crash your program? Do so explicitly, maybe with an error message describing what assumption was violated, preferably in release too, instead of adding a possible security vulnerability to your codebase: http://blog.llvm.org/2011/05/what-every-c-programmer-should-... . Basically: always-enabled fatal asserts.
This might even be a bit easier than logging - it's hard to pack too much information into a fatal assert. After all, there's only going to be one of them per run.
What's the deal? To someone who programs primarily in C and Ruby, this feels like a tremendous complication of the normal programming process.
Async/await solves this to some extent, because you can just go back to 1 way of error handling, which is throwing and catching exceptions.
The third way (working with EventEmitter) is an odd pattern, but it's really more for specialized use-cases. Wouldn't really call this standard. Imagine a long-running operation that can occasionally broadcast that a non-fatal error occurred.
A global error number is a terrible idea, and return codes are just not idiomatic.
So really there's just two: one for synchronous and one for asynchronous operations.
You'd be in a very similar situation with C. I don't know C too well, but I imagine that most asynchronous operations would be done with threads, and for those operations you also can't just return an error code.
Does Ruby have concurrency or async primitives? I don't know it really well. If it doesn't, it's also obvious why you wouldn't have this problem. If it does, how do you handle exceptions in asynchronous operations? To me it seems that Javascript, Ruby, C, PHP, Java are all pretty similar in these regards and JS is not at all unique.
Go gets this right. The equivalent of this ES7 function call in javascript:
await foo();
In go is a straight up regular function call:
foo();
But not waiting for the result in javascript:
foo();
Is actually handled with the go keyword:
go func();
This, to me, is the major difference in the asynchronous model between Go and Javascript. In javascript (with ES7) blocking is opt-in, in Go it's opt-out. Go is by far the saner model for a programming language that relies heavily on 'green threads' / reactor pattern.
Eventually Javascript will probably let you write f(await g(x)) and transform one async function into a chain of Promises and continuation functions (await will throw if g fails), but it's not yet a standard part of the language and not everyone wants to preprocess this experimental dialect into something Node can run today.
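For illustration, here is how await folds a rejected promise back into an ordinary try/catch (g and run are invented examples; async/await has since become a standard part of the language):

```javascript
// An async function whose promise rejects on bad input.
async function g(x) {
  if (x < 0) throw new Error('negative input');
  return x * 2;
}

// `await g(x)` throws if g's promise rejects, so f(await g(x))
// reads like synchronous code and uses the one familiar
// error-handling mechanism: try/catch.
async function run(x) {
  try {
    return `ok: ${await g(x)}`;
  } catch (err) {
    return `failed: ${err.message}`;
  }
}
```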
The conclusion is: 1. Throw an exception (which will stop the program) if it's a programming error. 2. If it's an operational error (e.g. the user didn't input a password when logging in), handle it in place using a callback, just as you would return an error code in another language. Emitting an error event is really a special case of the callback, for when you want to handle the error somewhere else (maybe globally).
So I think the article is more about when to "throw an exception" vs "return the error" if you take out those javascript special juices.
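Sketching that distinction in Node-style JavaScript (the login function is hypothetical):

```javascript
function login(user, password, callback) {
  if (typeof callback !== 'function') {
    // Programmer error: the caller misused the API.
    // Throw, and let it crash/bubble to the global handler.
    throw new TypeError('callback must be a function');
  }
  if (!password) {
    // Operational error: expected at runtime. Report it to the
    // caller, the way other languages would return an error code.
    return callback(new Error('password is required'));
  }
  callback(null, { user });
}
```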
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
e instanceof Error
or: e instanceof MyError
why does toString() have anything to do with this?

function myApiFunc(callback)
{
    /*
     * This pattern does NOT work!
     */
    try {
        doSomeAsynchronousOperation(function (err) {
            if (err)
                throw (err);
            /* continue as normal */
        });
    } catch (ex) {
        callback(ex);
    }
}

try {
    console.log("see, ");
    setTimeout(() => { throw new Error("oops") }, 100);
    console.log(
        "I can't assume here that"
        + " the prev. line succeeded"
    );
} catch (e) {
    console.log("error!");
}

I'm very much in favor of the opposite approach, defensive coding. Often when I read opinion pieces about how bad defensive coding is, they almost always seem to forget that defensive coding without proper logging, error-handling and monitoring is NOT defensive coding. It is extremely dangerous to just detect error conditions without any feedback: you have no idea what is going on in your system!
IMHO properly applied defensive coding, works as follows:
* Detect inconsistent situations (e.g. in a method, expected an object as input argument, but got a null)
* Log this as an error and provides feedback to the caller of the method that the operation failed (e.g. through an error callback).
* The caller can then do anything to recover, (e.g. reset a state, or move to some sort of error state, close a file or connection, etc.).
* The caller should then also provide feedback to its caller, etc. etc.
This programming methodology gives the following advantages:
* You are made to think about the different problems that can occur and how you should recover them (or not)
* Highly semantic feedback about what is going wrong when an issue occurs; this makes it very easy to pinpoint issues and fix them
* Server application keeps on running to handle other requests, or can be gracefully shut down.
* Client side application UIs don’t break, user is kept in the loop about what is happening
Of course you will need to keep a safety net to catch uncaught exceptions, properly logging and monitoring them (and restarting your application if relevant).
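A minimal sketch of these steps in JavaScript (resizeImage is an invented example):

```javascript
// Detect the inconsistent input, log it with context, and report
// the failure to the caller through an error callback, rather than
// letting it crash somewhere far away later.
function resizeImage(image, callback) {
  if (image == null || typeof image.width !== 'number') {
    const err = new Error(
      'resizeImage: expected an image object, got ' + String(image));
    console.error(err.message); // highly semantic feedback in the log
    return callback(err);       // the caller decides how to recover
  }
  callback(null, { width: image.width / 2, height: image.height / 2 });
}
```

The caller in turn checks `err`, recovers or resets state, and reports upward, so the chain of feedback described above stays unbroken.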
The fail-fast approach, as I have seen it applied, doesn’t do any checking or mitigation, with the effect that:
- you are thrown out of your normal execution path, losing a lot of context to do any mitigation (close a file, close a connection, tell a caller something went wrong)
- you only get a stack trace from which it can be hard to figure out what went wrong
- there can be a big impact on user experience: UIs can stop working, servers can stop responding (for all users).
I have very good experiences with using the defensive coding paradigm, but it takes more work to do it right; for many, especially in communities that use dynamic typing, such as the JS community, this seems to be too big a hurdle to take. This is unfortunate because IMO it could greatly improve software quality.
Any feedback is welcome!
(Edit: formatting to improve readability) (Edit: clarified defensive coding as an opposite approach to fail-fast)
Terminology aside, though, I agree with much of what you say. The idea that it's generally acceptable for buggy code to just crash out seems to be making an unwelcome return recently, often among the same kinds of developers who don't like big design up front or formal software architecture because they want everything to be done incrementally and organically, and in the case of web apps specifically, often among developers who also consider code that runs for a year or two to be long-lived anyway.
What I notice is that developers who also have a background in statically typed (system) languages, are much more disciplined when it comes to defensive programming and logging/error handling. (I'm afraid this also correlates with age).
BTW, I like your description, "designing systems to make fewer assumptions", for defensive programming!
Probably should change the submission title.
There is no language I know of where error handling is both simple and not overbearing.
Java mixed the two kinds of exceptions up completely and checked exceptions just added insult to that injury.
The best implementation I have seen for an imperative language is in Midori (The language used in Microsofts research OS with the same name).
http://joeduffyblog.com/2016/02/07/the-error-model/#bugs-are...
It's basically "C# done right". The blog post is well worth reading.