On x86 at least, it is a valid instruction. INT3, or CC in hex. There are also the debug registers which implement breakpoints without modifying any code, although it's limited to a maximum of 4 at once.
Characterising gdb as a "C debugger" is quite appropriate --- try to debug the Asm directly with it is an excruciating experience.
One tool that I started exploring is https://pernos.co/, the ability to do dataflow analysis is super cool. Let's you easily answer the question "How did this value get into this register".
It's really striking that it can't work on ARM though, due to an ARM architectural issue which x86 doesn't have.
WinDbg also lets you set the default base to 16, a feature whose usefulness is greatly appreciated when working with Asm: https://docs.microsoft.com/en-us/windows-hardware/drivers/de... GDB... makes an attempt: https://sourceware.org/bugzilla/show_bug.cgi?id=23390
I will rewrite this sentence, "invalid instruction" is too limited. It should be something like "an instruction that traps, and which isn't already used for another purpose". Syscalls trap but they are already used; and INT3/CC are valid instructions.
Another great one that actually walks through writing a basic debugger is Eli Bendersky's series[1].
One nitpick:
> It could, and that would work (that the way valgrind memory debugger works), but that would be too slow. Valgrind slows the application 1000x down, GDB doesn't. That's also the way virtual machines like Qemu work.
This is usecase-dependent: running a program until you hit a breakpoint will be significantly faster with `int 3`, but running a piece of instrumentation on every instruction (or branch, or basic block, or ...) will be significantly faster with Valgrind (or another dynamic binary instrumentation framework). This is because Valgrind and other DBI tools can rewrite the instruction stream to sidecar instrumentation into the same process, versus converting every instruction (or other program feature) into a sequence of expensive system calls.
RISC-V actually did this in a special instruction called ebreak. It can change the CPU privileged mode into Debug Mode.
As I mentioned above, I will rewrite this sentence to include dedicated special instructions. It was a wording mistake not to mention it!
Isn't it possible the compiler will re-order or combine statements differently than how they are written in source?
But there's a wide "grey area" depending on optimization level where the debug information is more or less off yet the mapping is still "good enough". Debugging on optimized builds can work surprisingly well if you know what to expect (e.g. the debugging cursor might jump to unexpected places in the source code because the line mapping is off, or you can't step into a function because it has been inlined).