That said, C needs to be improved, but I think it should be done via the build system, adding layers on top of C, in a tasteful, thought-out way. Some ideas below:
If we're expanding on C by adding build system complexity, Makefiles need to be improved. Make is a great tool, but it's unityped, like shell scripts. And it's essentially a preprocessor over shell, which leads to a mess of sigils. Maybe redesigning make as an embedded language in some lisp would do it. Additionally we could unify the notion of linking to a library and importing code into the makefile to allow dependencies to specify build steps. This could make some of the added build complexity implicit.
Namespaces could probably be implemented as a preprocessor, taking module declarations and import statements and converting all identifiers into their prefix-qualified equivalents, emitting warnings when there would be a collision (like if module "foo" declared "bar_baz" and module "foo_bar" declared "baz").
Rust-style syntax-case macros can be implemented as a preprocessor.
Go-style defer statements could be implemented in a preprocessor, and avoid the somewhat verbose goto-style error handling.
As I said, this approach requires a lot of care to avoid adding a huge amount of complexity to the language.
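For reference, the goto-style error handling that a defer statement would replace tends to look like the sketch below. The function count_common and its behaviour are hypothetical, chosen only because it needs two temporary allocations that must be released on every exit path.

```c
#include <stdlib.h>
#include <string.h>

/* Three-way comparison for qsort. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical example: counts the values present in both arrays.
   Assumes both arrays are non-empty and each holds distinct values.
   Returns -1 if an allocation fails. */
int count_common(const int *a, size_t na, const int *b, size_t nb)
{
    int result = -1;
    int *sa = NULL, *sb = NULL;
    size_t i = 0, j = 0;

    sa = malloc(na * sizeof *sa);
    if (sa == NULL)
        goto out;
    sb = malloc(nb * sizeof *sb);
    if (sb == NULL)
        goto free_sa;          /* only sa needs releasing here */

    memcpy(sa, a, na * sizeof *sa);
    memcpy(sb, b, nb * sizeof *sb);
    qsort(sa, na, sizeof *sa, cmp_int);
    qsort(sb, nb, sizeof *sb, cmp_int);

    /* Walk both sorted copies in step and count matches. */
    result = 0;
    while (i < na && j < nb) {
        if (sa[i] < sb[j])
            i++;
        else if (sa[i] > sb[j])
            j++;
        else {
            result++;
            i++;
            j++;
        }
    }

    free(sb);
free_sa:
    free(sa);
out:
    return result;
}
```

With a defer-like construct, each free would sit next to its malloc and the labels would disappear; the cleanup order would stay the same (last acquired, first released).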
1. "uninitialized var usage is error": unfortunately impossible without at least one of the following compromises: Automatically initialize variables (wastes CPU); False alarms (see Java); Built-in formal proof system; or, Require compilers to solve the halting problem.
2. Removed keyword "static": kills one of my favorite tricks, "self-init'ing functions".
3. New keyword "as": A good invention in Pythonland. Good call to bring this in.
4. New keyword "nil": Redundant with NULL?
5. Example - Base Types: Uses uint8 in place of char. This obscures intent and makes code less readable. Compare: int library_fnc(char asterisk errmsg) versus int library_fnc(uint8 asterisk errmsg). (HN wants to turn my asterisks into italics...) In the former it's clear errmsg is a string, in the latter it's not clear (it could be a pointer to a flag).
6. Example - function types. Doesn't one usually typedef the function pointer, rather than the function itself? So making that require two lines is annoying. Aside from that, the author is right that C has confusing function pointer typedef syntax.
7. Multi-part array initialization: Encourages unmaintainable code. Depending on what's in those "..."'s, might require compiler to solve halting problem?
8. Multi-pass parsing: Trades maintainability for instant gratification.
9. Symbol accessibility: The author makes "public" (and implicit "private") modify entire structs rather than individual fields...
10. Multi-file module: May lead to unmaintainable code
11. I'm worried about the language arbitrarily defining things like "the results of building are stored in the 'output' directory". OTOH the recipe.txt idea could help standardize what amounts to a lot of ad hoc Makefile programming.
12. Build process difference: Theoretically could speed up compilation. I'm worried for social reasons. In module-based languages, we tend to fall into module hell: one symptom being the infamous 20-page stacktrace (see: Java, Clojure, etc.) The nature of C's #include incentivizes shallow dependency trees (a very good thing).
13. "Language scope": trades portability for convenience
14. Tooling: This shouldn't be part of the language, it should be separate.
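On point 2, the comment does not spell out the "self-init'ing function" trick. One plausible reading, and it is only an assumption, is a function that sets up its own state on the first call by using static locals. The name square_lookup and the table size below are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical example: returns n*n from a lookup table that is filled on
   the first call only. Static locals keep their values between calls.
   Not thread-safe as written. */
unsigned square_lookup(size_t n)
{
    static unsigned table[16];
    static int initialized = 0;

    if (!initialized) {
        for (size_t i = 0; i < 16; i++)
            table[i] = (unsigned)(i * i);
        initialized = 1;
    }
    return n < 16 ? table[n] : 0;   /* out-of-range requests yield 0 */
}
```

Callers never see an init function; the first call to square_lookup pays the setup cost and later calls just index the table.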
You can always typedef it if you want it. The point is that naming a basic numeric type "char" is not only confusing, but also wrong in the world that's no longer all ASCII.
The removal of that keyword with several different meanings doesn't mean there isn't/won't be a replacement:
http://c2lang.org/site/language/variables/#local-keyword
>4. New keyword "nil": Redundant with NULL?
https://groups.google.com/d/msg/comp.std.c/fh4xKnWOQuo/IAaOe...
>5. Example - Base Types: Uses uint8 in place of char. This obscures intent and makes code less readable.
http://c2lang.org/site/language/basic_types/
C2 apparently still has char; however, it doesn't seem to be as weird as C's (a distinct type that is either signed or unsigned). It is simply int8.
> 1. "uninitialized var usage is error": unfortunately impossible without at least one of the following compromises: Automatically initialize variables (wastes CPU); False alarms (see Java); Built-in formal proof system; or, Require compilers to solve the halting problem.
I don't think this is true. It should be pretty easy to detect if a variable is initialized or not. I can potentially see how a false alarm would arise, but I don't think that matters in practice. (All the situations I'm imagining involve writing bad code)
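A small sketch of the kind of false alarm being discussed, assuming a checker that follows Java-style definite-assignment rules (it tracks which paths assign a variable, but not the values of conditions). The function pick is hypothetical.

```c
/* Hypothetical example: x is assigned on every path that reads it, but a
   checker that ignores the value of `flag` cannot prove that. */
int pick(int flag)
{
    int x;

    if (flag)
        x = 42;

    /* ... unrelated work that does not modify flag ... */

    if (flag)
        return x;   /* x is always initialized here */
    return 0;
}
```

Under such rules this function would be rejected even though it never reads an uninitialized value; the usual workaround is to restructure the code or add a dummy initializer.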
> 10. Multi-file module: May lead to unmaintainable code
Go does this already and it is fine.
> 12. Build process difference: Theoretically could speed up compilation. I'm worried for social reasons. In module-based languages, we tend to fall into module hell: one symptom being the infamous 20-page stacktrace (see: Java, Clojure, etc.) The nature of C's #include incentivizes shallow dependency trees (a very good thing).
I can see why this is a concern, but I think it is more a problem with JVM languages because of the type of programming Java encourages.
> 14. Tooling: This shouldn't be part of the language, it should be separate.
I thought this way too. Then I used Go and realized the huge benefit tooling integrated into the language provides. (Go has other problems, but tooling is not one of them).
It's easy to trace the code-paths between a variable's declaration and its usage, as long as those don't involve procedure calls. Then that problem becomes "static-analysis complete".
Isn't this just explaining why such usage can't be made a static error? It seems to me that raising runtime errors would avoid each of these compromises (but maybe that's clearly not what was meant).
Though I concede that pointers make things very hard. Let's say you have
void foo(int*);
void bar() {
    int x;
    foo(&x); /* Is this an error? */
}
You can't tell if x will be initialised or is expected to be initialised. You don't even know if foo will read one int, or expects an array of ints. I would have really preferred it if they did something on that front. Maybe have to declare foo(int[n] a), which you then call as foo([1]&x). There was a paper on extending C with exactly this - though their syntax was foo(size_t n, int a[n]), but I haven't been able to find it. One big plus is that you could, when compiling in debug mode, insert checks for every access. In general, I really want the successor to C to disambiguate between pointers and dynamically-allocated arrays.
4. Same reason C++ added nullptr (and I really wish they'd use the same keyword) - if you have a function int foo(int), foo(NULL) compiles fine, but foo(nullptr) doesn't. In C++ especially, since pointer types require casts, NULL can't be ((void ptr)0); it must be just 0. Once you disallow casting a void ptr to any other ptr, you run into this problem.
5. This is more a problem of C's strings being naked pointers to chars. Every serious piece of code I've seen uses some sort of string wrapper struct. Though they might want to alias char to uint8, for this.
6. Function types should really be pointers, I agree.
7. yeah...
8. Practically every other language does it. It's kind of ludicrous to want every function declared before its usage, when the compiler can just collect all declarations, then see where they're used.
9. I think that's about the right level of granularity. Seems like you'd use struct embedding to separate the public and private fields, e.g.
type Foo_priv { private fields }
type Foo { public fields; Foo_priv priv; }
The fact you can't create a Foo from outside the module seems like a bonus to me.
10. I'd be in favour of enforcing Java's "project directory structure must mirror module structure", i.e. module a.b.c's files are all in src/a/b/c. I've heard a lot of C# devs lament that the files for a certain package are strewn everywhere.
11. Kind of a sore point, but yeah. I think emulating Java would be the right thing again - give the compiler a list of dirs which to check for compiled modules when doing module resolution, say where the output directory is.
12. yeah
13. Sounds like #pragma. You can't really escape the fact that when you have N compilers for your language, every one will support its own extensions.
14. I think his point is that the language makes these tools easy to write.
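As a side note on the foo(size_t n, int a[n]) form mentioned earlier in this reply: C99 already accepts that parameter syntax, but the compiler adjusts a[n] to a plain pointer, so the bound is documentation only. The sketch below adds the debug-mode check by hand with assert; get_checked and sum_ints are hypothetical names, and the code assumes a compiler with variable-length array support (GCC and Clang have it, MSVC does not).

```c
#include <assert.h>
#include <stddef.h>

/* Reads a[i], checking the index against the declared length.
   The check is compiled out when NDEBUG is defined. */
int get_checked(size_t n, const int a[n], size_t i)
{
    assert(i < n);
    return a[i];
}

/* Sums the n elements of a. The a[n] in the signature is adjusted to
   `const int *a`; it tells the reader, not the compiler, how many
   elements are expected. */
long sum_ints(size_t n, const int a[n])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += get_checked(n, a, i);
    return total;
}
```

Passing a single int then looks like sum_ints(1, &x), which is roughly what the proposed foo([1]&x) call syntax would make explicit.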
A C string is a sequence of characters, not a pointer. C strings are manipulated using `char*` pointers.
addendum: That doesn't mean C2 is not a worthwhile experiment. I might even try it out when a compiler is available.
Of course, you can do that with some other languages too, but our lack of runtime and no-overhead makes it significantly better, in my opinion.
The incremental path is the official C standards. C will probably gain modules for example.
Anybody have any experience? Is it still basically a toy language?
I have no doubt that they can make the parsing stage faster, as they won't be parsing the same headers over and over again. This is often a bottleneck in C++, but much less often in C. (I've seen 10+MB C++ files after preprocessing).