Also, C64 BASIC has a line-length limit that would overflow if certain shorthand were expanded, and it would be even more confusing for beginners to see their fully written keywords turned into shorthand after loading.
10 for i = 1 to 10
20 : (arbitrary number of spaces) print "hello"
30 next
The shorthand issue is real, too:
1?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?
expands into about six screen lines of "1 print:print:....:print" that you can't simply edit, because the limit is 80 characters (two screen lines).
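To put a number on that, here is a rough Python sketch that expands the `?` shorthand and compares the result against the 80-character limit. The function names are made up for illustration, it only knows the `?` abbreviation, and it assumes the listing puts one space after the line number, so treat it as a back-of-the-envelope check rather than a model of what the C64 actually does.

```python
LINE_LIMIT = 80  # C64 logical line limit: two 40-column screen rows


def expand_shorthand(text: str) -> str:
    """Replace '?' with 'PRINT' outside string literals (the only abbreviation handled)."""
    out = []
    in_string = False
    for ch in text:
        if ch == '"':
            in_string = not in_string
        if ch == "?" and not in_string:
            out.append("PRINT")
        else:
            out.append(ch)
    return "".join(out)


def listed_form(line: str) -> str:
    """Approximate the listed line: number, one space, expanded statement text."""
    digits = len(line) - len(line.lstrip("0123456789"))
    return line[:digits] + " " + expand_shorthand(line[digits:].lstrip())


# The line from the example above: line number 1 followed by 40 '?' joined by ':'
typed = "1" + ":".join(["?"] * 40)
listed = listed_form(typed)

print(len(typed), len(typed) <= LINE_LIMIT)    # what you typed
print(len(listed), len(listed) <= LINE_LIMIT)  # what you get back in the listing
```

The typed line is exactly 80 characters and just fits, while the expanded one is about three times the limit, which is why you can't cursor up and re-enter it.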
And the tokenization didn't prevent style differences anyway: as the article points out, it keeps spaces, for example. It only tokenized a few things, like keywords and line numbers.
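A toy illustration of that point, in Python: keywords collapse to a single byte, but every other character, spaces included, is stored as typed, so two differently spaced versions of the same statement tokenize differently. The token values are what I remember from the C64 BASIC V2 table, so treat them as assumptions. The `tokenize` function is a hypothetical sketch, not the ROM routine, and it uses ASCII where the real thing would use PETSCII.

```python
# Token bytes as I recall them from C64 BASIC V2 (assumption, only a tiny subset)
TOKENS = {"FOR": 0x81, "NEXT": 0x82, "PRINT": 0x99, "TO": 0xA4}


def tokenize(body: str) -> bytes:
    """Tokenize the statement text that follows the line number.

    Keywords outside string literals become one byte each; everything
    else, including spaces, is copied through unchanged.
    """
    upper = body.upper()  # used only for case-insensitive keyword matching
    out = bytearray()
    in_string = False
    i = 0
    while i < len(body):
        if body[i] == '"':
            in_string = not in_string
        if not in_string:
            for word, code in TOKENS.items():
                if upper.startswith(word, i):
                    out.append(code)
                    i += len(word)
                    break
            else:
                out.append(ord(body[i]))  # not a keyword: keep the character
                i += 1
        else:
            out.append(ord(body[i]))  # inside a string: never tokenized
            i += 1
    return bytes(out)


spaced = tokenize("for i = 1 to 10")
packed = tokenize("fori=1to10")
print(spaced, len(spaced))
print(packed, len(packed))
```

Both versions run the same, but they are stored as different byte sequences, so the formatting you typed is the formatting you get back.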
(EDIT: In the late 90s I worked on a project written in Word BASIC. It was also tokenized, and that was used as an opportunity to translate the keywords in the localised versions of Word. But someone had managed to write a bunch of code in the Danish version, somehow export it as text, and import it into the Norwegian version. The languages are similar enough that it was really hard to tell: there was no syntax highlighting, and they'd edited a bunch before realising. I had the fun job of untangling it... Yay...)
This had the side effect that you could still display and (presumably with a bit more mental effort) read the program listing fine, but re-entering a line as shown in that listing would fail, because the computer depended on the spaces to do the parsing even though they were redundant once tokenisation had happened.
I was always surprised that it did all that, but wasn't any faster than Commodore BASIC.