a[b] implemented as *(a+b)
Thing is, that is how we were taught to think about array indexing in the CS lectures of the 70s.
Both the C89 and the C99 standard drafts contain the following:
> The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))
In fact, the expressions a[b], *(a + b) and b[a] are all equivalent.
Here is a perfectly valid snippet of C code that will print out 't':
putchar(3["test"]);
- Array values decay to pointers in rvalue contexts (though not as the argument of sizeof);
- a[b] is syntactic sugar for *(a+b).
— ⁂ —
These two design decisions have some desirable results:
- Arrays, including strings, can be in effect passed as arguments to functions without implementing a special parameter-passing mechanism for arrays.
- Functions on arrays are implicitly generic over the array length, rather than that length being a part of their type. (When this isn't what you want you should probably be using a struct instead.)
- Array iteration state can be represented as a pointer, preventing bugs in which you index the wrong array. In a sense a single pointer represents an array range or slice, as long as you have some way to identify the array end, like nul-termination in strings or a separate length argument.
- You can change a variable (including a struct field) from being an embedded array to being a pointer to an array allocated elsewhere—or vice versa—without changing the code that uses it. (But if this had been a significant design consideration, -> wouldn't be a separate operator from . in C.)
- It's easy to create new "arrays" at runtime: just return a pointer to some memory.
— ⁂ —
Like all design tradeoffs, these also have some drawbacks, which are so severe that no language of the current millennium has followed C's lead on this, although many of C's other design decisions are wildly popular:
- Bounds checking is impossible.
- Alias analysis for optimization is infeasible.
- If you aren't using a sentinel, you have to pass in a separate argument containing the array length whenever you pass in an array pointer, or stuff these base and limit fields into a slice struct, or something.
- Arguably, these decisions are hard to separate from the fact that C strings are terminated by a sentinel value and thus are not binary-safe.
- 3["hello"] is legal C.
— ⁂ —
Of these five drawbacks, the fifth seems like it may not be as severe as the other four?
You can't overload int::operator[](...).
I imagine there are more optimal and less optimal ways of actually doing the indexing in machine code and the former may be better semantics, but I would think a compiler would generate identical machine code for both.
The size of the objects is implicitly taken into account by the compiler; it knows the element size from the type of the pointer.
Edit: it doesn't blow up, not even with -Wall and -std=c99
Nope. For the code snippet I posted an hour ago, even with -pedantic -Wall -Wextra gcc won't issue any warnings. And why should it? It's perfectly standards conformant, because the standard actually defines the [] operator through the equivalent addition expression.
It's extremely poor style, even if the behaviour is identical.
https://en.cppreference.com/w/c/language/storage_duration
It was considered pretty useless by most, so C++11 recycled the keyword to mean something different.
K&R (Second Ed.) makes no mention of the auto keyword in Section 1.10, but it does say:
> Each local variable in a function comes into existence only when the function is called, and disappears when the function is exited. This is why such variables are usually known as automatic [sic] variables[...]
The type is not given at all; I think by default it would be "int".
Yep, this is called the "implicit int" rule, and it was specifically outlawed[1] by C99 and onward.
[1] https://herbsutter.com/2015/04/16/reader-qa-why-was-implicit...
Still a better type system than Twilight.
On a PDP-11, int would have been 16 bits, the same size as a pointer. On 32-bit x86, both are 32 bits. But on x86_64, int is 32 bits while pointers are 64 bits. The easiest way to retain the original assumption with minimal changes to the historical source code while targeting a modern CPU is to compile in 32-bit mode.
(#) https://software.intel.com/content/www/us/en/develop/documen... seems to imply that limit is 2³¹-1. I don’t understand why that would be true.
† Or maybe some variant of BCPL -- I'm not exactly sure how functionally different the two were.
C was heavily inspired by B and, I suspect, written in B as well. Alternatively, BCPL was extremely portable as it compiled to OCode (what we'd recognise today as bytecode), so that might have been another option. Assignment operators such as =+ are straight from B and were later changed to += due to Dennis Ritchie's personal taste.
"Douglas McIlroy ported TMG to an early version of Unix. According to Ken Thompson, McIlroy wrote TMG in TMG on a piece of paper and "decided to give his piece of paper his piece of paper," hand-compiling assembly language that he entered and assembled on Thompson's Unix system running on PDP-7."
We are not worthy, friends. We are not worthy.
https://en.wikipedia.org/wiki/GNU_Compiler_Collection#Histor...
From how I read it, it is not capable of bootstrapping itself, and an earlier C compiler in BCPL existed; this is the first C compiler written in C itself.
On an actual PDP-11 it CAN bootstrap.
Because to be the first, it has to be bootstrapped in an intermediate host language… You have to get a parser running, then the syntax, then the etc… etc…
( immense plug of the Ahl book here…)
To be the first compiler in a language, as was pointed out long before I was born, the compiler has to compile itself, so before it could compile itself, other language-processing programs had to provide the parsing, the syntax, etc.
Porting it to GCC just means that they could compile it with GCC. The big test is to get it to compile itself on whatever platform is the target, because if it cannot generate object code/machine language in the target machine's binary format, then it's not really ported.
Later on, UNIX came with tools to build compilers with, YACC and LEX.
If they got it to produce PDP-7 code, it's not really much of a port.
It wasn't the first C compiler, it was the first self-hosted C compiler, which is different.
I recommend this[0] paper by Ken Thompson dated 1984 and still relevant.
[0]: https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...