(n & ((n & 1) - 1)) + ((n ^ (n >> 1)) & 1)
Or a much more readable version: [n, 1, n + 1, 0][n % 4]
which makes it clear that this function cycles through a pattern of length four.

Why this works can be seen if we start with some n that is divisible by four, i.e. it has the two least significant bits clear, and then keep XORing it with its successors. We start with xxxxxx00, which is our n. Then we XOR it with n + 1, which is xxxxxx01, and that clears all the x's and leaves us with 00000001. Now we XOR it with n + 2, which is xxxxxx10, and that yields xxxxxx11, which is n + 3. The cycle finishes when we XOR it with n + 3, which yields 00000000. So we get n, 1, n + 3, 0, and then the cycle repeats, as we are back at zero and at n + 4, which is again divisible by four.
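A quick brute-force check of the period-4 pattern described above (a sketch, not from the original comment): the running XOR of 0..n should equal the four-entry table for every n.

```python
# Verify that XOR of 0..n follows the pattern [n, 1, n + 1, 0][n % 4].
def xor_upto(n):
    acc = 0
    for i in range(n + 1):
        acc ^= i
    return acc

for n in range(1000):
    assert xor_upto(n) == [n, 1, n + 1, 0][n % 4]
print("pattern holds for n < 1000")
```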
My offhand solution not using xor is to subtract from the sum of 1 to n, which has a closed form solution. The closed form roughly halves the execution time, as we only have to iterate over the range once.
Good to know there's a similar speedup available on the xor path...
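Both closed-form speedups can be sketched side by side. This assumes the classic problem variant where the array holds 0..n with one element removed (names are mine, for illustration):

```python
# Missing number via the arithmetic closed form n*(n+1)/2, and via the
# XOR closed form [n, 1, n + 1, 0][n % 4] - each needs only one pass
# over the array, not over the full range.
def missing_by_sum(arr, n):
    return n * (n + 1) // 2 - sum(arr)

def missing_by_xor(arr, n):
    acc = [n, 1, n + 1, 0][n % 4]  # XOR of 0..n without a loop
    for v in arr:
        acc ^= v
    return acc

nums = [v for v in range(101) if v != 57]
print(missing_by_sum(nums, 100))  # 57
print(missing_by_xor(nums, 100))  # 57
```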
xxxxxxx0 ^ xxxxxxx1 = 00000001
If we start at a number divisible by four and do this twice, we get one twice. xxxxxx00 ^ xxxxxx01 = 00000001
xxxxxx10 ^ xxxxxx11 = 00000001
And combining the two of course yields zero and we are right back at the start.

xor =: (16 + 2b0110) b.
f =: 3 : 'xor/ y + i. 4'
f"0 ] 2 * 1 + i. 100
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Summing a hundred million: +/ f"0 ] 2 * i. 100000000 gives zero (it takes a few seconds). So it seems the stated property holds for every even n.
So the cycle of (N, 1, N+3, 0) corresponds to (A) and (B) being: (0,0), (0,1), (1,1), (1, 0) - i.e. the 4 possible combinations of these states.
[(n & ~3), 1, (n & ~3) + 3, 0][n % 4]
where the (n & ~3) makes sure those lower 2 bits are cleared. But note that we only ever can look at the first element when n % 4 == 0. In that case, (n & ~3) == n already. And further, we only ever can look at the third element when n % 4 == 2. In that case (n & ~3) == n - 2, so (n & ~3) + 3 == n + 1. Hence the array can be simplified to the one given in the other comment.
The problem is that in vector sets, the HNSW graph has the invariant that each node has bidirectional links to a set of N nodes. If A links to B, then B links to A. This is unlike most other HNSW implementations. In mine, it is required that links are reciprocal, otherwise you get a crash.
Now, combine this with another fact: for speed concerns, Redis vector sets are not serialized as
element -> vector
And then reloaded and added back to the HNSW. This would be slow. Instead, what I do is serialize the graph itself: each node with its unique ID and all its links. But when I load the graph back, I must be sure it is "sane" and will not crash my systems, and reciprocal links are one of the things to check.

Checking that all the links are reciprocal could be done with a hash table (as in the post's problem), but that would be slower and memory consuming, so how do we use XOR instead? Each time I see a link A -> B, I normalize it, swapping A and B in case A > B. So if links are reciprocal I'll see A -> B two times, and if I use a register to accumulate the IDs and XOR them, at the end, if the register is NOT zero, I've got issues: some link may not be reciprocal.

However, in this specific case, there is a problem: collisions. The register may be 0 even if there are non-reciprocal links, in case they are fancy, that is, the non-reciprocal links are few and happen to XOR to 0. So, to fix this part, I use a strong (and large) hash function that makes a collision extremely unlikely.
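A small sketch of the check described above (names and hash choice are mine, not from the Redis source): normalize each directed link so the smaller id comes first, hash the pair, and XOR all the hashes together. A reciprocal pair contributes the same hash twice and cancels out.

```python
import hashlib

def links_look_reciprocal(links):
    acc = 0
    for a, b in links:
        if a > b:
            a, b = b, a          # normalize so A->B and B->A hash alike
        digest = hashlib.sha256(f"{a}-{b}".encode()).digest()
        acc ^= int.from_bytes(digest, "big")
    # Zero means every normalized link was seen an even number of times;
    # with a 256-bit hash, a collision is astronomically unlikely.
    return acc == 0

good = [(1, 2), (2, 1), (3, 7), (7, 3)]
bad = [(1, 2), (2, 1), (3, 7)]   # 3->7 has no 7->3 back-link
print(links_look_reciprocal(good))  # True
print(links_look_reciprocal(bad))   # False
```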
It is nice to see this post now, since I was not aware of this algorithm when I used it a few weeks ago. Sure, at this point I'm old enough that I never pretend I invented something, so I was sure this was already used in the past, but well, in case it was not used for reciprocal-link testing before, this is a new interview question you may want to use for advanced candidates.
For every normalized link id x:
y = (x << k) | h(x) # append a k-bit hash to the id
acc ^= y
If acc is zero, all links are reciprocal (same guarantee as before).

If acc is non-zero, split it back into (x', h'):
* Re-compute h(x').
* If it equals h', exactly one link is unpaired and x' tells you which one (or an astronomically unlikely collision). Otherwise there are >= 2 problems.
This has collision-resistance like the parent comment and adds the ability to pinpoint a single offending link without a second pass or a hash table.
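The scheme above can be sketched as follows; the mixer function and k = 16 are placeholders of mine, not part of the parent comment:

```python
K = 16
def h(x):
    return (x * 0x9E3779B1) & 0xFFFF  # toy k-bit hash, illustrative only

def check(links):
    acc = 0
    for x in links:                   # x = normalized link id
        acc ^= (x << K) | h(x)        # append a k-bit hash to the id
    if acc == 0:
        return "all reciprocal"
    x2, h2 = acc >> K, acc & 0xFFFF   # split acc back into (x', h')
    if h(x2) == h2:
        return f"single unpaired link: {x2}"
    return ">= 2 problems"

print(check([5, 5, 9, 9]))  # all reciprocal
print(check([5, 5, 9]))     # single unpaired link: 9
```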
The XOR solution was a valid answer, but not the only answer we would have happily accepted.
The interview question was chosen such that it's very easy to understand and quick to solve, meaning it would indicate the candidate knew at least the basics of programming in C#. Almost surprisingly, we actually had candidates applying for "senior" level positions who struggled with this.
It could be solved in a multitude of ways, e.g.:
- XOR as above
- Use of a HashSet<int>
- Use a for loop and a List which contains each number and its count.
- Use LINQ to group the numbers and then find the one with the odd count.
As long as what they did worked, it was a "valid" answer, we could then often discuss the chosen solution with the candidate and see how they reacted when we let them know of other valid solutions.
It was really great for not being a "one clever trick" question and could act as a springboard to slightly deeper discussions into their technical thought processes and understanding.
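Two of the non-XOR approaches listed above can be sketched quickly (assuming the common variant where every value appears twice except one; function names are mine):

```python
from collections import Counter

def single_via_set(nums):
    seen = set()
    for v in nums:
        # toggle membership: a second sighting removes the value,
        # so the leftover element is the answer
        seen.symmetric_difference_update({v})
    return seen.pop()

def single_via_counts(nums):
    return next(v for v, c in Counter(nums).items() if c == 1)

data = [4, 1, 2, 1, 2]
print(single_via_set(data))     # 4
print(single_via_counts(data))  # 4
```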
a := a + b;
b := a - b;
a := a - b;
I'm still proud of little me and I always remember this solution when I encounter XOR tricks. I didn't know about bitwise arithmetic at that time, but sometimes a simple `+` can work just as well.

It's an array of integers, so it fits in memory (otherwise it wouldn't be called an array). As it fits in memory, n cannot be that big. I'd still ask for more requirements, TopCoder problem style: I want to know how big n can be such that the array fits in memory.
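The addition swap above shares the aliasing hazard of the XOR swap: applied to a slot and itself, it destroys the value. A minimal demo (Python ints don't overflow, which sidesteps the other caveat of the `+` version):

```python
def add_swap(arr, i, j):
    arr[i] = arr[i] + arr[j]
    arr[j] = arr[i] - arr[j]
    arr[i] = arr[i] - arr[j]

v = [3, 5]
add_swap(v, 0, 1)
print(v)            # [5, 3]

w = [7]
add_swap(w, 0, 0)   # i == j: the slot becomes 2a, then 0
print(w)            # [0]
```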
I didn't know that XOR trick. My solution would be a bit array with n bits and two for loops: one to light the bit corresponding to each number, and one to find the missing number.
And if my bit array doesn't fit in memory, then neither does the array from the problem (and certainly not the HashSet etc.).
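The two-loop bit-array idea can be sketched with a Python int standing in for the bit array (assuming the array holds 0..n with one value missing):

```python
def missing_via_bits(arr, n):
    bits = 0
    for v in arr:           # loop 1: light the bit for each number seen
        bits |= 1 << v
    for i in range(n + 1):  # loop 2: find the bit that never lit up
        if not bits & (1 << i):
            return i

print(missing_via_bits([0, 1, 3, 4], 4))  # 2
```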
That makes the problem harder which makes it more interesting, a lot of the solutions wouldn't work anymore (this isn't necessarily a good interview question though)
Basically xor swapping a[i] with a[j] triggered the evil logic when i was equal to j.
The state of RC4 consists of a random permutation of bytes. Whenever it outputs a value, it further permutes the state by swapping some bytes of the state. The xor swap trick sets one of these values to zero whenever RC4 attempts to swap an item with itself within the permutation. This gradually zeros out the state, until RC4 outputs the plaintext.
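The bug in miniature: an in-place XOR swap zeroes the element whenever i == j, because x ^ x == 0 at the first step.

```python
def xor_swap(a, i, j):
    a[i] ^= a[j]
    a[j] ^= a[i]
    a[i] ^= a[j]

state = [10, 20, 30]
xor_swap(state, 0, 2)
print(state)          # [30, 20, 10]
xor_swap(state, 1, 1) # same index: the slot is destroyed
print(state)          # [30, 0, 10]
```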
> We can thus search for u by applying this idea to one of the partitions and finding the missing element, and then find v by applying it to the other partition.
Since you already have u^v, you need only search for u, which immediately gives you v.
tromp is saying the last step can be simplified. There is no need to use the "XOR of all elements" method on the second partition to find v, since the earlier steps have given us u^v and u, so simply XORing those two values together gives v.
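The whole two-missing-numbers routine, with tromp's shortcut at the end, can be sketched like this (assuming the array holds 1..n with u and v removed):

```python
def two_missing(arr, n):
    uv = 0                        # will hold u ^ v
    for x in range(1, n + 1):
        uv ^= x
    for x in arr:
        uv ^= x
    bit = uv & -uv                # a bit where u and v differ
    u = 0
    for x in range(1, n + 1):     # partition by that bit, XOR one side
        if x & bit:
            u ^= x
    for x in arr:
        if x & bit:
            u ^= x
    return u, uv ^ u              # no second partition pass needed

nums = [x for x in range(1, 11) if x not in (3, 8)]
print(sorted(two_missing(nums, 10)))  # [3, 8]
```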
If you’re working on an architecture where a single multiplication and a bit shift is cheaper than N xor’s, and where xor, add, and sub are all the same cost, then you can get a performance win by computing the sum as N(N+1)/2; and you don’t need a blog post to understand why it works.
Example 1: Find the missing number
xor =: (16 + 2b0110) b.
iota1000 =: (i. 1000)
missingNumber =: (xor/ iota1000) xor (xor/ iota1000 -. 129)
echo 'The missing number is ' , ": missingNumber
This prints 'The missing number is 129'.

Example 2: Using a random permutation, find the missing number.
permuted =: (1000 ? 1000)
missingNumber =: (xor/ permuted) xor (xor/ permuted -. ? 1000)
Example 3: find the missing number in this matrix.

_ (< 2 2) } 5 5 $ (25 ? 25)
12 9 1 20 19
6 18 3 4 8
24 7 _ 15 23
11 21 10 2 5
0 16 17 22 14
Final test: repeat 10 times the example 3 (random matrices) and collect the time it takes you to solve it in a list of times, then compute the linear regression best fit by times %. (1 ,. i. 10)
Did you get better at solving it by playing more times?

I am not affiliated with J, but in case you want to try some J code there is a playground: https://jsoftware.github.io/j-playground/bin/html2/
Edited: It seems I am procrastinating a lot about something I have to do but don't want to.
f =: ]`1:`>:`0:@.(4&|)"0
Then: (,. ; #: ; [: #: f) i.16
0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 0 0 1
2 0 0 1 0 0 0 1 1
3 0 0 1 1 0 0 0 0
4 0 1 0 0 0 1 0 0
5 0 1 0 1 0 0 0 1
6 0 1 1 0 0 1 1 1
7 0 1 1 1 0 0 0 0
8 1 0 0 0 1 0 0 0
9 1 0 0 1 0 0 0 1
...

The parent's comment (also mine) has a style that was designed not to scare non-J programmers. One should also consider that some people dislike J code, so downvotes are the usual result, except when the post provides some additional insight.
Finally, thank you for this small J lesson; it is a pleasure to find fellow J programmers here.
That is, one should also prove a ^ (b ^ c) = (a ^ b) ^ c. Instinctive, but non-trivial.
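Since XOR acts bitwise and independently per bit, checking associativity on single bits is all that's needed; an exhaustive check is tiny:

```python
# Verify a ^ (b ^ c) == (a ^ b) ^ c for all single-bit inputs.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert a ^ (b ^ c) == (a ^ b) ^ c
print("associativity holds")
```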
> If more than two elements are missing (or duplicated), then analyzing the individual bits fails because there are several combinations possible for both 0 and 1 as results. The problem then seems to require more complex solutions, which are not based on XOR anymore.
If you consider XOR to be a little more general, I think you can still use something like the partitioning algorithm. That is to say, considering that XOR on a bit level behaves like XOR_bit(a,b) = (a+b) % 2, you might consider a generalized XOR_bit(a,b,k) = (a+b) % k. With this I think you can decide partitions with up to k missing numbers, but I'm too tired to verify/implement this right now.
They're really evil on modern CPUs.
In a typical error-correcting code usage, you have an encoder which takes your message, and adds some extra symbols at the end which are calculated so that the syndrome is zero. Then when receiving your message, the receiver calculates the syndrome and if it's not zero, they know that at least one error has occurred. By using the code's decoding algorithm, they can figure out the fewest (and thus hopefully most likely) number of changes which would result in that error syndrome, and use this information to (hopefully) correct the transmission error.
For the missing numbers problem, you can set x_i to "how many times does the number i appear?". Then since the syndrome is sum(x_i * G_i), you can compute the syndrome on an unordered list of the i's. You are expecting the syndrome to be the same as the syndrome of full set 1...n, so when it is not, you can figure out which few x_i's are wrong that would lead to the syndrome you observed. You have an advantage because you know how many numbers are missing, but it's only a slight one.
The author's solution is called the Hamming code: you set F(i) = i, and you do the additions by XORing. Using error-correcting codes generalizes to more missing numbers as well, including using xor, but the math becomes more complicated: you would want to use a fancier code such as a BCH or Goppa code. These also use xor, but in more complicated ways.
And sometimes even faster than a load immediate, hence XOR AX, AX instead of MOV AX, 0.
Shorter usually means faster, even if the instruction itself isn't faster.
Shorter basically means you can fit more in instruction cache, which should in theory improve performance marginally.
> Shorter usually means faster
It depends, so spouting generalities doesn't mean anything. Instruction cache line filling vs. cycle reduction vs. reservation station ordering is typically a compiler constraints optimization problem(s).
XOR is equivalent to addition over the finite field F_2^m. So, in this field, we're calculating the sum. If we have two numbers missing, we calculate the sum and sum of squares, so we know:
x + y
x^2 + y^2
From which we can solve for x and y. (Note all the multiplications are Galois Field multiplications, not integer!)
Similarly for k numbers we calculate sums of higher powers and get a higher order polynomial equation that gives our answer. Of course, the same solution works over the integers and I'd imagine modular arithmetic as well (I haven't checked though).
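The integer version hinted at above can be made concrete: recover two missing numbers from the plain sum and sum of squares, then solve the resulting quadratic (a sketch of mine, using ordinary integer arithmetic rather than Galois Field operations):

```python
from math import isqrt

def two_missing_by_sums(arr, n):
    # arr holds 1..n with x and y removed
    s1 = n * (n + 1) // 2 - sum(arr)                               # x + y
    s2 = n * (n + 1) * (2 * n + 1) // 6 - sum(v * v for v in arr)  # x^2 + y^2
    p = (s1 * s1 - s2) // 2                                        # x * y
    d = isqrt(s1 * s1 - 4 * p)                                     # |x - y|
    return (s1 - d) // 2, (s1 + d) // 2

nums = [v for v in range(1, 11) if v not in (4, 9)]
print(two_missing_by_sums(nums, 10))  # (4, 9)
```

Note this relies on dividing by 2, which is exactly what fails over F_2^m, as a later comment in this thread points out.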
This is also how BCH error-correction codes work (see https://en.wikipedia.org/wiki/BCH_code): a valid BCH codeword has sum(x^i where bit x is set in the codeword) = 0 for t odd powers i=1,3,5, ... Then if some bits get flipped, you will get a "syndrome" s_i := sum(x^i where bit x was flipped) for those odd powers. Solving from the syndrome to get the indices of the flipped bits is the same problem as here.
The general decoding algorithm is a bit involved, as you can see in the Wikipedia article, but it's not horribly difficult:
• First, extend the syndrome: it gives sum(x^i) for odd i, but you can compute the even powers s_2i = s_i^2.
• The syndrome is a sequence of field values s_i, but we can imagine it as a "syndrome polynomial" S(z) := sum(s_i z^i). This is only a conceptual step, not a computational one.
• We will find a polynomial L(z) which is zero at all errors z=x and nowhere else. This L is called a "locator" polynomial. It turns out (can be checked with some algebra) that L(z) satisfies a "key equation" where certain terms of L(z) * S(z) are zero. The key equation is (almost) linear: solve it with linear algebra (takes cubic time in the number of errors), or solve it faster with the Berlekamp-Massey algorithm (quadratic time instead, maybe subquadratic if you're fancy).
• Find the roots of L(z). There are tricks for this if its degree is low. If the degree is high then you usually just iterate over the field. This takes O(#errors * size of domain) time. It can be sped up by a constant factor using Chien's search algorithm, or by a logarithmic factor using an FFT or AFFT.
You can of course use a different error-correcting code if you prefer (e.g. binary Goppa codes).Edit: bullets are hard.
Further edit just to note: the "^" in the above text refers to powers over the finite field, not the xor operator.
> constant factor using Chien's search algorithm
Chien's search is only really reasonable for small field sizes... which I think doesn't really make sense in this application, where the list is long and the missing elements are relatively few.
Fortunately, in characteristic 2 it's quite straightforward and fast to just factor the polynomial using the Berlekamp trace algorithm.
L(z) = z^2 - (x+y)z + xy.
You already have x+y, but what's xy? You can compute it as ((x+y)^2 - (x^2 + y^2))/2. This technique generalizes to higher powers, though I forget the exact details: basically you can generate the coefficients of L from the sums of powers with a recurrence.

Then you solve for the roots of L, either using your finite field's variant of the quadratic formula, or e.g. just by trying everything in the field.
* But wait, this doesn't actually work! *
Over fields of small characteristic, such as F_2^m, you need to modify the approach and use different powers. For example, in the equations above, I divided by 2. But over F_2^m in the example shown above, you cannot divide by 2, since 2=0. In fact, you cannot solve for (x,y) at all with only x+y and x^2 + y^2, because
(x+y)^2 = x^2 + y^2 + 2xy = x^2 + y^2 + 0xy (since 2=0) = x^2 + y^2
So having that second polynomial gives you no new information. So you need to use other powers such as cubes (a BCH code), or some other technique (e.g. a Goppa code). My sibling comment to yours describes the BCH case.

So you don't actually need the first loop (at least for the set of integers 1..n example), but bringing that up is probably out of scope for this article.
It's all theoretical though. On real world data sets that aren't small I don't see why you wouldn't hand these tasks off to C/Zig/Rust unless you're only running them once or twice.
It's silly to ask a web dev these questions and expect these XOR approaches.
Low-level developers ("bare metal" as the kids say), on the other hand? They should have a deep enough understanding of binary representation and bitwise operations to approach these problems with logic gates.
The epitome of turning technical interviews into a trivia contest to make them feel smart. Because isn't that the point of a tech interview?
I believe you under-estimate what a good interviewer is trying to do with questions such as these:
Either you've seen the trick before, in which case you get an opportunity to show the interviewer that you're an honest person by telling them you have. Huge plus, and the interview can move on to other topics.

Or you haven't, and you can demonstrate your analytical skills to the interviewer by dissecting the problem step by step and understanding what the code actually does and how.
Bonus if you can see the potential aliasing problem when used to swap two variables.
Not a trivia question at all.
However, then it is clearly still easier to just phrase everything in terms of = (equality) instead!
Equality for binary inputs is also called XNOR, biconditional, iff, ↔, etc., which is the negation of XOR. But thinking of it immediately as "=" is much more straightforward.
Another advantage of = over ≠/xor is that equality is not just commutative and associative, it's intuitively obvious that it is associative. The associativity of ≠/xor is less obvious. Moreover, equality is also transitive, unlike inequality/xor.
Overall, equality seems a much more natural concept to reason with, yet I don't know of any languages which have a bitwise equality/XNOR/↔ operator, i.e. one that operates on integers rather than Booleans.
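Lack of a native operator aside, bitwise XNOR on m-bit integers is a one-liner: complement the XOR and mask to the width (a sketch, with the width as an explicit parameter since integers aren't fixed-width in Python):

```python
def xnor(a, b, bits=8):
    # set bit i exactly where a and b agree in bit i
    return ~(a ^ b) & ((1 << bits) - 1)

print(bin(xnor(0b1100, 0b1010, 4)))  # 0b1001: the two agree in bits 3 and 0
```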
For those who do/did assembly, this is the common way to set a register to zero in x86 assembly (probably not only) because the instruction does not need an operand, so is shorter, and executes in one cycle only.
One aspect of XOR is that it is the same as binary addition without carry, and therefore it does not overflow.
`XOR[0...n] = 0 ^ 1 .... ^ n = [n, 1, n + 1, 0][n % 4]`
XOR[0...x] = (x&1^(x&2)>>1)+x*(~x&1)
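The branchless expression can be checked against the mod-4 table directly (this works unchanged in Python, where `~x & 1` on non-negative x behaves as in C):

```python
def xor_upto_branchless(x):
    # (bit0 ^ bit1) contributes the "+1"/"1" cases; x*(~x & 1) keeps x
    # only when x is even.
    return (x & 1 ^ (x & 2) >> 1) + x * (~x & 1)

for x in range(1000):
    assert xor_upto_branchless(x) == [x, 1, x + 1, 0][x % 4]
print("branchless form matches the table")
```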
Actually I found something through Gemini based on the table mod 4 idea in previous post. Thanks.
xor ax, ax
Than: mov ax, 0h

This is nonsensical: where does the second truth table come from? Instead you just observe that, by definition, 1^0 == 0^1.