Instruction fusion has no effect on code size; it only affects execution speed.
For example, RISC-V has combined compare-and-branch instructions, while the Intel/AMD ISA does not; however, current Intel and AMD CPUs fuse compare-and-branch instruction pairs.
So there is no speed difference, but the separate compare and branch instructions of Intel/AMD remain longer, at 5 bytes instead of RISC-V's 4 bytes.
Unfortunately for RISC-V, this is the only example favorable to it, because for a large number of ARM or Intel/AMD instructions, RISC-V needs a pair of instructions or even more.
Fusing instructions will not help RISC-V with code density, but it is the only way available for RISC-V to match the speed of other CPUs.
Even if instruction fusion can enable adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance.
Yet, as many have pointed out to you already, RISC-V has the highest code density of all contemporary 64-bit architectures. And aarch64, which you seem to like, is beyond bad.
>but it is the only way available for RISC-V to match the speed of other CPUs.
Higher code density and the lack of flags help the decoder a great deal. This means it is far cheaper for RISC-V to keep execution units well fed. It also enables smaller caches and, in turn, higher clock speeds. It's great for performance.
This, if anything, makes RISC-V the better ISA.
>Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance
Grasping at straws. RISC-V has been designed for fusion, from the get-go. The cost of doing fusion with it has been quoted to be as low as 400 gates. This is something you've been told elsewhere in the discussion, but that you chose to ignore, for reasons unknown.
> This is something you've been told elsewhere in the discussion, but that you chose to ignore, for reasons unknown.
I would call it RISC-V bashing.
Everyone loves to hate RISC-V, probably because it's new and heavily hyped.
It is really common to see irrelevant and uninformed criticism of RISC-V. The article, which the HN audience seems to enjoy, literally says: "I believe that an average computer science student could come up with a better instruction set that Risc V in a single term project". How can anyone say such a thing about a collaborative project of more than 10 years, informed by much academic work and by many companies in the industry?
I do not mean that RISC-V is perfect; there are some points that are sources of debate (e.g., favoring a vector extension rather than classic SIMD is a source of interesting discussion). But I would appreciate reading better analyses and more interesting discussions on HN.
I'm very skeptical that a RISC-V decoder would be much more complex than an x86 one, even with instruction fusion. For the simpler fusion pairs, decoding the fused instructions wouldn't be more complex than matching some of x86's crazy instruction encodings.
For ARM I'm not so sure, but RISC-V does have very significant instruction decoding benefits over ARM too, so my guess would be that they'd be similar enough.
And if you compare 32-bit CPUs, RISC-V has twice as many registers, reducing the number of instructions needed to read from and write to memory.
RISC-V branching takes less space, and so do vector instructions. There are many cases like that, and they add up: RISC-V ends up with the densest ISA in all studies when using compressed instructions.
On the other hand, just splitting up x86 instructions is very expensive, and decoding in general takes a lot of work before you even start to do fancy tricks.
The CPU has a frontend, which has a decoder, which is the part that "reads" the program instructions. When it "sees" a certain pattern, like "instruction x to register r followed by instruction y consuming r", it can treat this "as if" it were a single instruction, provided the CPU has hardware for executing that single instruction (even if the ISA doesn't have a name for that instruction).
This lets the people who build the CPU choose whether this is something they want to add hardware for. If they don't, the pair runs in, e.g., 2 cycles; if they do, it runs in 1. A server CPU might want to pay the cost of running it in 1 cycle, but a microcontroller CPU might not.
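The pattern matching described above can be sketched as a toy decoder pass. This is only an illustration of the idea, not any real ISA or microarchitecture: instructions are hypothetical tuples, and the fusion rule is "a compare writing register r, immediately followed by a branch reading r, becomes one compare-and-branch macro-op".

```python
# Toy sketch of macro-op fusion in a decoder front end.
# Instruction format (hypothetical): ("cmp", dest, src1, src2),
# ("branch", cond_reg, target), or any other opcode tuple.

def decode(instructions):
    """Fuse adjacent (cmp, branch) pairs into one macro-op when the
    branch consumes the compare's destination register."""
    micro_ops = []
    i = 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if (nxt is not None
                and cur[0] == "cmp"
                and nxt[0] == "branch"
                and nxt[1] == cur[1]):  # branch reads the cmp's result
            # Emit a single macro-op the execution hardware can run
            # in one cycle, even though the ISA has no such instruction.
            micro_ops.append(("cmp_branch", cur[2], cur[3], nxt[2]))
            i += 2  # consumed both instructions of the pair
        else:
            micro_ops.append(cur)
            i += 1
    return micro_ops

prog = [
    ("cmp", "r1", "r2", "r3"),    # r1 = compare r2, r3
    ("branch", "r1", "label_A"),  # branch on r1
    ("add", "r4", "r2", "r3"),    # unrelated; passes through unchanged
]
print(decode(prog))
# -> [('cmp_branch', 'r2', 'r3', 'label_A'), ('add', 'r4', 'r2', 'r3')]
```

A real decoder does this with pattern-matching logic in the fetch/decode stages rather than a loop, but the trade-off is the same one described above: the matching logic costs gates, and in exchange the pair occupies one slot in the execution pipeline instead of two.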