It's the usual pattern with those "stupid specific architectures". They're very good at their one thing, but only ever "good for their size", and only up to a point.
They don't scale up and they don't generalize. Go far enough on task complexity and LLMs just kill them.
Does that make them useless? As an LLM replacement, yes. In general? Maybe not; I can think of possible uses. But I have yet to find a paper demonstrating a real-world use.
It's a special-purpose design for constraint-satisfaction problems with simple rules but complex interactions. For example, when solving a Sudoku, the set of valid choices at every step is easy to determine, but you could make a series of valid choices that backs you into a corner where no more progress is possible and you have to backtrack.
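To make the "simple rules, complex interactions" point concrete, here is a minimal sketch of a plain backtracking Sudoku solver. It is only an illustration of the problem class, not how GRAM or any learned model works. The names `candidates`, `solve`, `PUZZLE` and the `stats` counter are made up for this example. The local rule check is a few lines, while the search has to undo choices that were each valid when they were made.

```python
# Illustrative only: classic depth-first backtracking, not a learned solver.
PUZZLE = [
    "530070000", "600195000", "098000060",
    "800060003", "400803001", "700020006",
    "060000280", "000419005", "000080079",
]  # 0 marks an empty cell


def candidates(grid, r, c):
    """Digits that break no row, column, or box rule at (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid, stats):
    """Fill grid in place; return True once every cell is filled."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d  # locally valid choice
                    if solve(grid, stats):
                        return True
                    grid[r][c] = 0  # ...that still led to a dead end
                    stats["backtracks"] += 1
                return False  # no digit fits here: caller must backtrack
    return True


grid = [[int(ch) for ch in row] for row in PUZZLE]
stats = {"backtracks": 0}
solved = solve(grid, stats)
print(solved, stats["backtracks"])
```

Every digit the solver tries passes the rule check in `candidates`, yet the backtrack counter still ends up above zero: the difficulty is in how the choices interact, not in judging any single one.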
Meanwhile, LLM reasoning failures are more often of the kind where a choice is clearly invalid (as judged by a human observer), but the LLM picks it anyway, because the underlying rule is complex and context-dependent and the model only learned an imperfect approximation that often breaks down.
GRAM won't help with that.