C, on the other hand, has needlessly complicated syntax; a function definition is hard to detect, and a pointer to a function is hard to interpret, because it's literally convoluted: https://c-faq.com/decl/spiral.anderson.html
Sadly, this is a general stylistic difference: where Pascal tries to go for clarity, C makes do with cleverness, which is more error-prone.
C is almost LR(1), if we allow prior declarations to decide how some tokens are classified, like whether an identifier is a variable or type name.
Declarations like
void (*signal(int, void (*fp)(int)))(int);
are LR(1).
LR(1) sentences are harder to read than LL(1) because you have to keep track of a long prefix of the input, looking for right reductions (if you follow certain LR algorithms). LR parsing algorithms use a stack which essentially provides unlimited lookahead, in comparison to LL(1). Both LL(1) and LR(1) have one symbol of lookahead, but qualitatively it's entirely different because the lookahead in LR is happening after an indefinitely long prefix of the sentence which has not been fully analyzed, and has been shunted into a stack, to be processed later. Many symbols can be pushed onto the stack before a decision is made to recognize a rule and reduce by it. Those pushed symbols represent a prefix of the input that is not yet reduced, while the reduction is happening on the right of that. So it is backwards in a sense; following what is going on in the grammar is a bit like understanding a stack language like Forth or PostScript.
An LL(1) grammar allows sentences to be parsed in a left-to-right scan without pushing anything into a stack to reduce later. Everything is decidable based on looking at the next symbol. Under LL(1), by looking at one symbol, you know what you are parsing; each subsequent symbol narrows it down to something more specific. Importantly, the syntax of symbols that have been processed already (material to the left) is settled; their syntax is not left undecided while we recognize some fragment on the right.
Under LR(1) it's possible for a long sequence of symbols to belong to entirely unrelated phrase structures, only to be decided when something finally appears on the right. A LALR(1) parser generator outputs a machine in which the states end up shared by unrelated rules. The state transitions then effectively track multiple parallel contexts.
Does that include the C preprocessor?
Somehow, I recall someone here (maybe it was user walterbright) suggesting that implementing a C preprocessor was a lot of work - maybe months - so one might consider using Facebook's MIT licensed preprocessor:
[1] https://web.archive.org/web/20230714010215/http://conal.net/...
<specifiers> <declarator> {, <declarator>, ...} ;
The star is a type-deriving operator that is part of the <declarator>, not part of the <specifiers>! This declares two pointers to char:
char *foo, *bar;
This declares foo as a pointer to char, and bar as a char:
char* foo, bar;
We have created a trompe l'oeil by separating the * from the declarator to which it belongs and attaching it to the specifier to which it doesn't.
char* foo, bar;
So that's why I've had so many problems understanding C. I come from the Pascal world, where a type specification is straightforward.
There were suggestions back in the 90s that, to make C easier to parse for humans (and not coincidentally simplify the compiler grammar), this should be `foo, bar: float*;`, and your model of pointerness could actually be true. It never got much more traction than some "huh, that would be better, too bad we've been using this for 10 years already and will never change it" comments :-) (with an occasional side of "maybe use typedefs instead").
If you value your codebase anyway.
Personally i used "float *foo" for years until at some point i found "float* foo" more natural (as the pointer is conceptually part of the type) so i switched to that, which i've also been using for years. I've worked on a bunch of codebases which used both though (both in C and C++) - in some cases even mixed because that's what you get with a codebase where a ton of programmers worked over many years :-P.
I do tend to put pointer variable declarations on their own lines though regardless of asterisk placement.
(and of course there is always "float foo[42]" to annoy you with the whole "part of the type" aspect :-P)
One important (and beautiful) thing to understand about C is that declarations and use in C mirror each other.
Consider the same type written in Go and C: array of ten pointers to functions from int to int.
Go: var funcs [10]*func(int) int
C: int (*funcs[10])(int)
Go's version clearly reads left to right. The C version is ugly.
But the beautiful thing about the C version is that it mirrors how funcs can be used:
(*funcs[0])(5)
See how it's just like the declaration.
Go's version doesn't have this property.
So, now about the *.
Usage of * doesn't require spaces.
If p is a pointer to int, you use it like this: *p
And not like this: * p
And since declarations follow usage, "int *p" makes more sense.
There is also a good argument about "int *p, i". In the end, these usages follow from how the C grammar works.
There are many more musings about that on the web, but here is one of my favourites: https://go.dev/blog/declaration-syntax.
Edit: HN formatting.
We might wish for the C declaration syntax to be <type> <name>[, ...], but it’s not: it’s <specifier>[ ...] <declarator>[, ...], where int, long, unsigned, struct stat, union { uint64_t u; double d; }, and even typedef are all specifiers, and foo, (foo), (((foo))), *bar, baz[10], (*spam)(int), and even (*eggs)[STRIDE] are all declarators (the wisdom of using the last one is debatable, but it is genuinely useful if you can count on the future maintainer to know what it means).
Everybody is free to not like the syntax, but actively misleading the reader about its workings seems counterproductive.
Get with the program: types on the left, names on the right, one declaration per line.