Variables `three`, `five` and `seven` are better described as `next_power_of_three`, `next_power_of_five` and `next_power_of_seven`. Since the `ext4_list_backups` function should iterate through 1 (= 3^0 = 5^0 = 7^0), 3, 5, 7, 3^2, 5^2, 3^3, 7^2, ... and 1 should not repeat three times, the initial value of `next_power_of_three` (or any of others) should be 1 and those of others should not be 1. The naming is a bit unfortunate (couldn't they be `threes` etc., for example?) but actually makes sense.
https://github.com/torvalds/linux/blob/d158fc7f36a25e19791d2...
/*
* Iterate through the groups which hold BACKUP superblock/GDT copies in an
* ext4 filesystem. The counters should be initialized to 1, 5, and 7 before
* calling this for the first time. In a sparse filesystem it will be the
* sequence of powers of 3, 5, and 7: 1, 3, 5, 7, 9, 25, 27, 49, 81, ...
* For a non-sparse filesystem it will be every group: 1, 2, 3, 4, ...
*/
I'm not sure what's so noteworthy about that. There's ton of arcane code in any kernel, as long as it's commented correctly there's no issue.Good commenting is no substitute for good naming. For a variable containing the number 1, "three" is a shitty name.
The function being called is directly above this declaration. It's a static function not used outside of this file. What are the chances that someone would edit this code without understanding what "three" is used for in this context? Pretty slim I wager.
It's good that people are auditing the linux source code but if you stumble upon some weird looking code (which again, is not the case here in my opinion) the right way to deal with it is not to post it on hacker news. Contact the maintainer (look at the MAINTAINER file at the root of the kernel) if possible with a patch to fix the issue.
If I adjust your comment to "for a variable whose only purpose is to be passed to another function as the parameter called 'three', 'three' is a shitty name", would you endorse that sentiment?
How do you know it is commented correctly? Now you have to understand the code and the comment and mentally confirm them to be in sync. You also have to maintain the comment when you make changes to the code and you have to trust that others are going to do the same.
It would be better if the code itself was expressive enough so that a comment wasn't needed.
See http://www.nongnu.org/ext2-doc/ext2.pdf#page15 for a slightly more detailed explanation.
"The first version of ext2 (revision 0) stores a copy at the start of every block group, along with backups of the group descriptor block(s). Because this can consume a considerable amount of space for large filesystems, later revisions can optionally reduce the number of backup copies by only putting backups in specific groups (this is the sparse superblock feature). The groups chosen are 0, 1 and powers of 3, 5 and 7."
(Edit: Yes, I know that 3^0=1, but the wording "0, 1 and powers of 3, 5" does not imply 3^0. Thanks!)
> -- Phil Karlton
"Naming things" includes systems like MAC address allocation, IP address allocation, DNS, URLs for documents, process IDs, name to inode mapping in filesystems, autoincrement primary keys in databases, etc.
Any ideas on what Karlton really meant with the quote?
It seems to me that this quote makes a rather standard usage of the "Zeugma" [1] figure of speech. It consists in putting next to each other two things of very different nature. In that case we have the very technical problem of invalidating caches on the one hand, and the much higher level problem of naming things on the other hand.
That produces a rather stylish quote.
But if Phil Karlton was actually referring to the technical problem of assigning unique identifier as you suggest, the figure of speech is lost and the quote becomes less stylish.
So I prefer tho read "naming things" as the very general problem of naming things.
The non-human-readable things like MACs, IPs and auto-increment keys are easy. But when I have to come up with a short, memorable and non-confusing name for something like a script that can take me longer than writing the code.
>The difficulty only arises if the body of a nested function refers directly (i.e., not via argument passing) to identifiers defined in the environment in which the function is defined, but not in the environment of the function call.
Also, the version I know has much more bite:
>There is also a variation on this that says there are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.
unsigned counter1 = 1;
unsigned counter2 = 5;
unsigned counter3 = 7;
or use an array: unsigned counter[3] = {1,5,7};
No more confusion. That being said, I doubt that the names of these local variables is actually a real source of confusion and is not a problem that needs to be fixed.That's absolutely horrible: you've now introduced the possibility of accidental out-of-bounds access where there previously was none, not to mention that you just forced the compiler to put those variables on the stack instead of using registers.
Contrived example:
int array(int lol)
{
unsigned counter[3] = {1,5,7};
for (int i = 0; i < lol; i++)
counter[i % 3]++;
return counter[0] + counter[1] + counter[2];
}
int vars(int lol)
{
unsigned counter1 = 1;
unsigned counter2 = 5;
unsigned counter3 = 7;
for (int i = 0; i < lol; i++)
switch(i % 3) {
case 0: counter1++;
case 1: counter2++;
case 2: counter3++;
}
return counter1 + counter2 + counter3;
}
The output is quite different (gcc -c -O3 -std=gnu99): Disassembly of section .text:
0000000000000000 <array>:
0: 85 ff test %edi,%edi
2: c7 44 24 e8 01 00 00 movl $0x1,-0x18(%rsp)
9: 00
a: c7 44 24 ec 05 00 00 movl $0x5,-0x14(%rsp)
11: 00
12: c7 44 24 f0 07 00 00 movl $0x7,-0x10(%rsp)
19: 00
1a: 7e 5f jle 7b <array+0x7b>
1c: 41 b8 01 00 00 00 mov $0x1,%r8d
22: 31 c9 xor %ecx,%ecx
24: be 56 55 55 55 mov $0x55555556,%esi
29: eb 1f jmp 4a <array+0x4a>
2b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
30: 89 c8 mov %ecx,%eax
32: f7 ee imul %esi
34: 89 c8 mov %ecx,%eax
36: c1 f8 1f sar $0x1f,%eax
39: 29 c2 sub %eax,%edx
3b: 8d 04 52 lea (%rdx,%rdx,2),%eax
3e: 89 ca mov %ecx,%edx
40: 29 c2 sub %eax,%edx
42: 48 63 c2 movslq %edx,%rax
45: 44 8b 44 84 e8 mov -0x18(%rsp,%rax,4),%r8d
4a: 89 c8 mov %ecx,%eax
4c: f7 ee imul %esi
4e: 89 c8 mov %ecx,%eax
50: c1 f8 1f sar $0x1f,%eax
53: 29 c2 sub %eax,%edx
55: 8d 04 52 lea (%rdx,%rdx,2),%eax
58: 89 ca mov %ecx,%edx
5a: 83 c1 01 add $0x1,%ecx
5d: 29 c2 sub %eax,%edx
5f: 39 f9 cmp %edi,%ecx
61: 48 63 c2 movslq %edx,%rax
64: 41 8d 50 01 lea 0x1(%r8),%edx
68: 89 54 84 e8 mov %edx,-0x18(%rsp,%rax,4)
6c: 75 c2 jne 30 <array+0x30>
6e: 8b 44 24 ec mov -0x14(%rsp),%eax
72: 03 44 24 e8 add -0x18(%rsp),%eax
76: 03 44 24 f0 add -0x10(%rsp),%eax
7a: c3 retq
7b: b8 0d 00 00 00 mov $0xd,%eax
80: c3 retq
81: 66 66 66 66 66 66 2e data32 data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
88: 0f 1f 84 00 00 00 00
8f: 00
0000000000000090 <vars>:
90: 85 ff test %edi,%edi
92: 7e 52 jle e6 <vars+0x56>
94: 31 c9 xor %ecx,%ecx
96: 41 b8 07 00 00 00 mov $0x7,%r8d
9c: 41 b9 05 00 00 00 mov $0x5,%r9d
a2: 41 bb 01 00 00 00 mov $0x1,%r11d
a8: 41 ba 56 55 55 55 mov $0x55555556,%r10d
ae: 89 c8 mov %ecx,%eax
b0: 89 ce mov %ecx,%esi
b2: 41 f7 ea imul %r10d
b5: c1 fe 1f sar $0x1f,%esi
b8: 89 c8 mov %ecx,%eax
ba: 29 f2 sub %esi,%edx
bc: 8d 14 52 lea (%rdx,%rdx,2),%edx
bf: 29 d0 sub %edx,%eax
c1: 83 f8 01 cmp $0x1,%eax
c4: 74 09 je cf <vars+0x3f>
c6: 83 f8 02 cmp $0x2,%eax
c9: 74 08 je d3 <vars+0x43>
cb: 41 83 c3 01 add $0x1,%r11d
cf: 41 83 c1 01 add $0x1,%r9d
d3: 83 c1 01 add $0x1,%ecx
d6: 41 83 c0 01 add $0x1,%r8d
da: 39 f9 cmp %edi,%ecx
dc: 75 d0 jne ae <vars+0x1e>
de: 45 01 d9 add %r11d,%r9d
e1: 43 8d 04 01 lea (%r9,%r8,1),%eax
e5: c3 retq
e6: b8 0d 00 00 00 mov $0xd,%eax
eb: c3 retq
Arrays mean memory - don't use them unless you actually want the compiler to use memory.Trying to get a patch into Linux as a new developer without personally knowing one of the (sub-)lieutenants is a futile exercise.
Find the maintainer of the relevant subsystem, talk to them via mail or IRC about what's bugging you and how to improve it, implement it, send a patch and it will most likely be accepted.
Where I have had difficulties getting something accepted is if your patch might cause problems later or doesn't solve the problem in a way that aligns with the vision of the maintainer. But while that might be annoying as a developer that wants a problem fixed now, I'd say that it's probably for the better of the whole kernel.
You're totally wrong about "personal" part.
Yes, the naming is unfortunate. No, there's probably no way to make it better (and if you think code reviews suck, wait until you see a kernel code review) (Well, this came directly from Linus but there's usually some discussion as well)
But in the end this has shipped and is running. (Well, the double goto from Apple as well, but I doubt that went through a code review)
localised comments may help but naming things properly would help more (and negate a comment being required at all), actually adopting some better practices would be a better fix going forward.
looking at the called function's comment:
The counters should be initialized to 1, 5, and 7 before * calling this for the first time.
so how about making a function with the same name, ending in '_first' (maybe rename the original _next) which initialises these variables internally before calling the version intended to be called multiple times...?
since the parameters are not well described by their names renaming them would help too... it is not a pointer to a 3 a 5 or a 7. it might not even be pointing at multiples... but current_multiple_of_three etc. would be better. there is some reason why these values are significant in this algorithm (which I do not know) and the best names would capture that.
the code is moderately difficult to read in the function itself - the method of swapping around a local copy of a pointer based on the conditions is not very easy to follow. although makes sense here to avoid excessive code... comments would help make it clearer what is going on.
So I think this means three = pow(3, 0);
It doesn't explain why it's not 3,1,7 or 3,5,1, but that's because there is no reason for that.
kernel/sysctl.c
#ifdef CONFIG_LOCKUP_DETECTOR
static int sixty = 60;
#endif
static int __maybe_unused neg_one = -1;
static int zero;
static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused three = 3;
static unsigned long one_ul = 1;
static int one_hundred = 100;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
#endifBesides, it made you read the code and the comments.