Yeah, I was trying to work out the utility function for selecting words, but I'm more of a gut-feel, intuitive sort of person. I also tried a squared term like yours, but for some reason it didn't feel right when I tested it. That version decided the best initial word was `LARES`. I have a sneaking suspicion that we need to account for how common a word is. I think my solver was getting too hung up on `BRAXY` and `CRURA`, and giving them similar weight to a word like `TRACK`.
However, it was very hard to debug because a slightly buggy version was still decent at playing the game! In fact, I'm fairly certain I still have some bugs. I need to comment my code and get it up on GitHub. It's also super brute force: O(n^2).
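
To make the word-frequency idea concrete, here is a rough sketch of what I mean, not my actual solver. It scores a guess with a squared term, but each possible answer is weighted by how common it is, so obscure words count for less. The function names (`feedback`, `expected_remaining`, `best_guess`) and the `weight` dictionary are made up for illustration, and any weights you plug in would have to come from a real frequency list.

```python
from collections import Counter, defaultdict


def feedback(guess, answer):
    """Wordle-style feedback: 'g' = right spot, 'y' = wrong spot, '.' = absent."""
    result = ["."] * len(guess)
    remaining = Counter()  # answer letters not matched by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            remaining[a] += 1
    for i, g in enumerate(guess):
        if result[i] == "." and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1  # each answer letter can justify only one yellow
    return "".join(result)


def expected_remaining(guess, candidates, weight):
    """Squared-term score: expected weight of candidates left after this guess.

    Lower is better. `weight` maps word -> commonness (hypothetical values);
    words missing from it default to 1.0.
    """
    buckets = defaultdict(float)
    for answer in candidates:
        buckets[feedback(guess, answer)] += weight.get(answer, 1.0)
    total = sum(buckets.values())
    return sum(w * w for w in buckets.values()) / total


def best_guess(guesses, candidates, weight):
    """Brute force: every guess against every candidate, so O(n^2) feedback calls."""
    return min(guesses, key=lambda g: expected_remaining(g, candidates, weight))
```

With all weights equal this reduces to the plain squared-term score. Giving `BRAXY` and `CRURA` a small weight relative to `TRACK` means a bucket holding only rare words barely moves the score, which is the behaviour I suspect is missing.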