Best Wordle guessing strategy (opens in new tab)

(slc.is)

143 pointssplch4y ago151 comments

151 comments

It seems clear to me that there is an optimal starting word, but that the best second word has to depend on the info you gain from the first.

Case in point: If you get 3 green 2 yellow in the first word, you can solve on the next guess.

Of course, you can constrain your strategies to "I always use the same two/three starting words", and in many cases that will be fine. But it's quite obviously not optimal.

Also, the optimal strategy must depend on your goal metric. Do you go for "least average guesses", "least maximum guesses", or "least average guesses while never losing"? There's lots of unstated assumptions in all the analyses thrown around...

nwiswell4y ago

> It seems clear to me that there is an optimal starting word

I took one step further and calculated the optimal starting word for obtaining green matches (I assume that this also makes it likely to produce yellow matches, although I did not explicitly optimize for that).

Beginning with the full list of 5-letter words, I calculated the frequency of each letter of the alphabet in each of the 5 possible positions for a 5 letter word.

Then I iterated through the list a second time, this time assigning a score for each word equal to the sum of frequencies for each letter in its respective position.

By a significant margin, the highest score is SLATE (over 1400). Runners up (over 1300) are SAUTE, SHIRE, and CRATE.

Caveat: this approach assumes that all possible words are equally likely to be the answer.

codefined4y ago

Interestingly, doing a similar analysis where we bucketed each word into 243 (3^5) buckets based on the possible result, we found "RAISE" as the best word. Source[0]

[0] https://gist.github.com/popey456963/a654e98d0180566b897b70ee...

2 more replies

Nursie4y ago

Interesting. I've been using ORATE as it knocks out a lot of the top frequency letters from the english language as a whole. But I guess the distribution in five letter words, and more so in a subset of five letter words, may skew differently.

1 more reply

paulcole4y ago

> Beginning with the full list of 5-letter words, I calculated the frequency of each letter of the alphabet in each of the 5 possible positions for a 5 letter word.

There’s a great many 5-letter words that the creator will never consider because they’re too obscure. Treating all 5-letter words as possible is a mistake when calculating strategy for this.

High-frequency letters will be over-represented in your analysis.

1 more reply

alkonaut4y ago

Nice. I tried a simpler method of only looking at the wikipedia table of overall letter frequency in dictionaries (not in 5 letter words) and the vowels are high, especially E and I followed by A. Then I checked the frequeny of starting letter and that is led by "S". So my guess for a 5 letter word became one with at least two vowels from E,I,A, starts with an S, and has the remaining two characters from the reasonably high frequency set (R, T, N, etc). My guess then was SIREN which has that property. Just out of curiosity, if you still have the output from your program, how bad was SIREN compared to SLATE?

tcskeptic4y ago

That is a really interesting approach -- if you remove the words with any of the letters in SLATE, what is the highest ranked word? (If you have it to hand)

pyrale4y ago

> Case in point: If you get 3 green 2 yellow in the first word, you can solve on the next guess.

That's a very specific edge case, though.

I agree with you that the second word should vary depending on the first result, but picking the best first and second words require lots of analysis. The first word should be picked in order to open up the best options for second word, and while I believe the second word should not contain letters from the first, unless you got extremely lucky, the distribution of letters of remaining available words probably changes which letters you want to cover.

Failing to do the required analysis, my current strategy is to pick two words that cover the 10 most common letters in english, ETAOINSHRD. Sometimes it's "ethos" and "nadir", sometimes "thine" and "roads", etc. So far, it's worked well.

eightails4y ago

> It seems clear to me that there is an optimal starting word, but that the best second word has to depend on the info you gain from the first.

Definitely. The likelihood of a letter appearing in a given place changes depending on the letters around it. A Q will almost always be followed with a U, for example.

I wrote a script yesterday which spits out the relative probabilities of possible letters in each unknown position, given the current known/excluded letters -- it was interesting to see the effect in action.

christiangenco4y ago

I modeled my Wordle solver[1] after Donald Knuth's Mastermind strategy[2].

An optimal guess is found by looking at the list of possible solutions and for each of those possible solutions checking how much each possible guess would narrow down the list of possible solutions.

Once you have a list of all possible guesses and how much each of those possible guesses would narrow down the total number of possible solutions for each possible solution, sort them by the maximum number of total remaining guesses.

The optimal first guesses from my solver are ARISE, RAISE, AESIR, REAIS, or SERAI because each will narrow down the possible word list to at minimum 168 remaining words.

Each guess after that uses the same algorithm with the list of possible guesses filtered with the information you learned in previous guesses.

edit: formatting

1. https://wordlesolver.com source: https://github.com/christiangenco/wordlesolver

2. https://math.stackexchange.com/questions/1192961/knuths-mast...

svat4y ago

Thanks for sharing your code. I had seen other people also mention that these 5 guesses (AESIR, ARISE, RAISE, REAIS, SERAI) all narrow down the list to 168 (in the worst case), but I was curious why the code I wrote to play with this was giving different answers.

After reading your code, I see an error in it. In fact, only RAISE is optimal; the others are worse, and leave bigger lists (according to my code):

    RAISE: 168
    …
    REAIS: 203
    …
    AESIR: 220
    ARISE: 220
    …
    SERAI: 241

The error in your code is that your "evaluateGuess" function (right at the top, in the first 10 or so lines of https://github.com/christiangenco/wordlesolver/blob/9c3bd94a...):

    function evaluateGuess({ solution, guess }) {
      return [...guess].map((letter, index) => {
        return {
          letter,
          included: solution.includes(letter),
          position: letter === solution[index],
        };
      });
    }

is too simplistic, and not actually what the real game does. In the game, for each letter position, there are three possible responses:

• Correct (Green), what you call "position"

• Present (Yellow), what you call "included"

• Absent (Grey)

Here are three test cases, that you could try out on the real Wordle in recent days:

• When the solution is "FAVOR" and our guess is "ERROR", Wordle's response is [Grey, Grey, Grey, Green, Green] — note that for the first two Rs in "ERROR", the correct response is Grey (Absent), because the last R has already "used up" the "Green" response.

• When the solution is "FAVOR" and our guess is "ROARS", Wordle's response is [Yellow, Yellow, Yellow, Grey, Grey] — note that only the first R in "ROARS" gets a Yellow response and the second one gets Grey, because there's only one R in the solution.

• (As pointed out by @pedrosorio in a sibling comment) When the solution is "ABBEY" and our guess is "APNEA", Wordle's response is [Green, Grey, Grey, Green, Grey], but your solver thinks that the second "A" would get a "Yellow" response too.

You mentioned Donald Knuth's Mastermind paper; in fact in the paper (http://www.cs.uni.edu/~wallingf/teaching/cs3530/resources/kn...) Knuth points this out on the very first page:

> Rule 2 is somewhat difficult to state precisely and unambiguously, and the manufacturers have in fact not succeeded in doing so on the directions they furnish with the game […]

and gives an exact rule that you may want to study carefully.

In my code, the `response` function I use (it's not the most efficient, but we can just memoize it) is:

    def response(h, g):
        '''
        - The hidden word is h.
        - The guess is g.
        For each position in the word g, some color:
        - 'green' if in the same position
        - 'yellow' if present (after subtracting 'green's)
        - 'grey' if absent (after subtracting "green"s and "yellow"s)
        '''
        assert len(h) == len(g)
        L = len(h)
        green = [i for i in range(L) if h[i] == g[i]]
        yellow = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in green: continue
            for j in range(L):
                if j in green: continue
                if j in yellow: continue
                if h[j] == g[i]:
                    yellow.append(i)
                    break
        return (green, yellow)

Note the three "continue" statements — they are crucial, to match the behaviour of the real Wordle (or Master Mind) on the three test cases I mentioned above.

svat4y ago

The testcases notwithstanding, I found a bug in my code as well. Fixed my `response` function to:

    def response(h, g):
        assert len(h) == len(g)
        L = len(h)
        correct = [i for i in range(L) if h[i] == g[i]]
        present_h = []
        present_g = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in correct: continue
            for j in range(L):
                if j in correct: continue
                if j in present_h: continue
                if h[j] == g[i]:
                    present_g.append(i)
                    present_h.append(j)
                    break
        return (correct, present_g)

and now I too get (AESIR, ARISE, RAISE, REAIS, SERAI) all leaving 168 words. (But the testcases in the above comment still hold, though, make sure your code works for them.)

pedrosorio4y ago

This does not quite work.

For today's word (abbey):

- input "arise" (match "a", wrong position "e")

- solver says there are 10 possible words, one of them is "apnea"

- input "apnea" (match "a", match "e") EDIT: fixed

- solver says there are 0 possible words (it assumes last "a" should appear yellow, but the wordle page has the last "a" as grey because there is only one "a" in the word)

geoduck144y ago

SPOILER!!!!

1 more reply

phalangion4y ago

Apnea would be match for “a” and “e”

2 more replies

geoduck144y ago

I like the tool, but it feels like cheating because it uses the list of "possible solutions"

For instance, "waxed" is not suggested even though it is a legitimate word

Syzygies4y ago

    5.291 soare
    5.294 roate
    5.299 raise
    5.311 raile
    5.311 reast
    5.321 slate
    5.342 crate
    5.342 salet
    5.345 irate
    5.346 trace
    5.356 arise
    5.360 orate
    5.370 stare
    5.382 carte
    5.390 raine
    5.400 caret
    5.402 ariel
    5.406 taler
    5.406 carle
    5.407 slane

Shown are the twenty best initial guesses using Claude Shannon's definition of information entropy. Each number is the expected number of yes/no questions needed to resolve the remaining uncertainty. Shannon is the "father of information theory", and this is the right measure.

One might recognize SOARE as identified elsewhere by a different measure. I am relieved that I give up very little by moving down to the first word I recognize on this list.

This takes about 20 minutes to code in Ruby, and four minutes to run. There's no point to using a more efficient language, or wasting part of an hour as I did searching for a clever way to score guess words.

In the 1960's my dad used entropy to program Jotto on Kodak's computers. It wasn't feasible then to evaluate every possible clue word, but he determined that one did well enough with a random subset.

hddqsb4y ago

Maximising the information gained using Shannon's entropy is a very good strategy (assuming the goal is to minimise the expected number of guesses), however it is not necessarily optimal!

I have a counter example for a simplified version the game with the following rule changes:

1. The player is only told which letters in the guess are correct (i.e. they are not told about letters that are present but in a different location).

2. If the player knows there is only one possible solution, the player wins immediately (without having to explicitly guess that word).

3. The set of words that the player is allowed to guess may be disjoint from the set of possible solutions.

Here is the list of possible solutions:

    aaaa
    aaab
    aaba
    babb
    abaa
    bbab
    bbba
    bbbb

(There are 8 words. The 2nd, 3rd and 4th letters are the binary patterns of length 3, and the 1st letter is a carefully chosen "red herring".)

Here is the dictionary of words the player is allowed go guess:

    axxx
    xaxx
    xxax
    xxxa

(Each guess effectively lets the player query a single letter of the solution.)

The information gain for each possible initial guess is identical (all guesses result in a 4-4 split), so a strategy based on information gain would have to make an arbitrary choice.

If the initial guess is axxx (the "red herring"), the expected number of guesses is 3.25.

But a better strategy is to guess xaxx (then guess xxax and xxxa). The expected number of guesses is then 3.

(In this example information gain was tied, but I have a larger example where the information gain for the "red herring" is greater than the information gain for the optimal first guess.)

Syzygies4y ago

Interesting. There can't be a proof of general optimality for Shannon entropy, because the words are irregularly distributed. However, (unlike your lists) they're not distributed by an adversary trying to foil Wordle/Jotto strategies.

I suspect a law of large numbers / central limit theorem type result that Shannon entropy is asymptotically optimal for randomly chosen lists, even those generated by state machines like gibberish generators that nearly output English words. In other words, I conjecture that your configurations are rare for long lists.

Early in my career, I was naive enough to code up Grobner bases with a friend, to tackle problems in algebraic geometry. I didn't yet know that computer scientists at MIT had tried random equations with horrid running times, and other computer scientists at MIT had established special cases with exponential space complete complexity. Our first theorem explained why algebraic geometers were lucky here. This is a trichotomy one often sees: "Good reason for asking" / "Monkeys at a keyboard" / "Troublemakers at a demo".

Languages evolve like coding theory, attempting a Hamming distance between words to enhance intelligibility. It could well be that the Wordle dictionary behaves quasirandomly, more uniformly spaced that a true random dictionary, so Shannon entropy behaves better than expected.

1 more reply

Syzygies4y ago

A fork using primes: Primel

https://converged.yt/primel/

The best first guess using Shannon entropy is a dead heat between 12953 and 12539. Not surprisingly to the number theorists out there...

If you study the JavaScript, apparently this guy's girlfriend knows every prime.

SoylentOrange4y ago

I don't think this list is accurate, because the expected number of guesses for the top words should be fewer than 5. I'll post something in more detail later, but the best guesses on the 12,972 5-letter words have an expected number of guesses of around 4.3.

Syzygies4y ago

It's a units issue. I'm measuring in yes/no questions, which is the native scale for Shannon entropy. You're measuring in Wordle questions. There's more information in a Wordle answer than in a yes/no answer.

Syzygies4y ago

> wasting part of an hour as I did searching for a clever way to score guess words.

I'd love to see scoring a Wordle guess added to one of those programming problem sites, as code golf in any language would be amusing. I was once a commercial APL programmer, so APL comes to mind.

KerryJones4y ago

Would you mind sharing your code?

bee_rider4y ago

Which word list did you use?

strstr4y ago

I believe tares is the optimal word (according to shannon entropy) from the game’s list.

2 more replies

paxys4y ago

I don't like the "exploration" strategy of playing Wordle. Once you have a start word, work onwards from there (aka "hard mode") rather than enter another random guess to narrow down more letters. I know it is a valid strategy, but idk just feels like cheating to me.

brianpan4y ago

A few days ago I got 4 of 5 letters then lost with 3 wrong guesses at the last letter. I should have spent one turn getting 5 guesses at the last letter instead of spending 3 turns guessing 3 letters.

It's not cheating, it's the better way. If you don't like it, you can turn on hard mode!

peeters4y ago

It's too bad hard mode only requires that you use positive clues, not negative. I've been trying to push "true hard mode" with my work colleagues. Where every guess must be a potential winner given the information revealed thus far.

So, hard mode, plus no reusing grey letters, plus no using yellow letters in the same incorrect slot.

alkonaut4y ago

What, one can re-use a gray letter in hard mode? That's not hard enough!. I find the fun challenge isn't finding the final word but finding valid guesses (i.e. words that exist and aren't using any grays) along the way.

albert_e4y ago

this is my default approach unless really out of options i can think of

but i do slip up at times and jump to punch in a word that i later realize violates one of those yellow rules for instance

reificator4y ago

Go to the gear in the top right, and enable `Hard Mode - Any revealed hints must be used in subsequent guesses`.

duderific4y ago

I was playing hard mode accidentally. I didn't realize there was a more optimal way to play.

kzrdude4y ago

Without exploring, I just can't see any words that fit the known letters. I might not have that great imagination of words, or be that good to fit them to fixed clues.

jsmeaton4y ago

Agree. I used one of the strategies over a couple of days and it took all of the fun out of the game. Switching to hard mode brought all of that fun right back.

If you're going to write a program to optimise your guesses then why not just go to the source and pull the answer straight from the encoded answer list? I know people derive fun from these things in different ways but that's kind of where I landed.

jackson14424y ago

Agreed. That strategy hasn't failed me yet, and it definitely makes me think a little harder.

colinmsaunders4y ago

Fight your wordle bot at https://botfights.io/game/wordle

kregasaurusrex4y ago

Looks like fun!

schaefer4y ago

omg, thank you!

biesnecker4y ago

Over the weekend I wrote a tool to help me figure this out: https://github.com/biesnecker/wordlesmith

Wordle's wordlist is interesting. There is a large (~10,000 words) list of words that it will accept as guesses but that will never be the answer, and a much smaller (~2500 words) list of words it will both accept and could be the answer.

My tool's simple algorithm scores words by taking the product of the frequency with which each of its letters appears in the wordlist multiplied by the frequency that the letter appears in that specific spot. When looking at the entire list of words the top five choices are:

1. tares 2. lares 3. cares 4. pares 5. dares

However, if you just look at the words that could be answers (which I what I've decided to do), the list changes to:

1. soare 2. saine 3. slane 4. saice 5. slate

In either case I've never had a puzzle where repeatedly choosing the first suggested word didn't get me to the answer in six or fewer guesses.

8note4y ago

The American spelling of favor got me yesterday. I'd never think of it as a possibility for a 5 letter word. That word list is pretty handy for knowing which words are in scope

jhas78asd4y ago

Fun. I spent an hour and made this and just tried to score words with vowels and wheel of fortune consonants to start to make a good initial guess. IRATE came up. I like the frequency idea and the entropy discussion and have more ideas.

https://github.com/andyatkinson/wordle_solver

ledauphin4y ago

One of the things noted by the OP is that you can get stuck in a rut if you don't do more info-gathering than just the first word.

I would recommend not even trying to guess the word on attempt 2 - much better almost always to guess 5 totally different letters.

peeters4y ago

Hard mode is so much more fun (which disallows this strategy). It changes it up much more each day and is more challenging.

LimitedInfo4y ago

Happy to see tares as your top choice. I used the same method and came to the same word.

baby4y ago

I refuse to read this post because I'm enjoying this slow, daily, little game. I'm convinced over-thinking this would spoil the game.

noah_buddy4y ago

I don't think reading would ruin and I'm glad others are having fun with solvers. I just really enjoy putting in a random five letter word I've read or seen in the last few days and going from there. So far, I've not lost a single game and that's good enough for me!

andrewingram4y ago

I prefer to minimise vowels in early guesses, because I find i'm good at filling in the blanks between consonants, but not the other way around.

less_less4y ago

I also tried to make a wordle solver, this one just using full brute force to find a strategy that minimizes the maximum number of guesses, and then secondarily the expected number of guesses. The solver tries guesses in a heuristic order, and prunes solutions once they're clearly worse than a previously-found solution, but it eventually should find the global optimal strategy.

I didn't entirely succeed due to the obvious exponential runtime, but I tried a couple constraints: using only words in the dictionary of possible answers; hard mode; or "harder mode" where all guesses must be consistent with the information you have. (The actual hard mode is a weaker constraint, but I misinterpreted it as "harder mode" at first.) Harder mode is of course much easier to brute force, because there are fewer options.

Anyway I tried on several tuples: (max guesses, mode, dictionary size). What I found:

- (4, harder, large): no solution.

- (4, normal, small): no solution.

- (4, normal, large): didn't finish after like a day, but no solution found.

- (5, harder, small): this is pretty hard to guarantee a win; best starting word is SCAMP (-FLOUT-DEIGN if no hits).

- (6, harder, small): best staring word is PLATE (-CHURN-MOODY-SKIFF if no hits).

- (5, normal, small): didn't finish; best starting word so far is TRACE (-GODLY-SPUNK if no hits).

- (5, harder, large): didn't finish; best starting word so far is PALET.

- (6, harder, large): didn't finish; best starting word so far is SALET.

Hard mode really is hard. There are lots of clusters with many options, such as /.OUND/, /.IGHT/, /.ASTE/, /S.ORE/ etc. You can easily end up matching a cluster without enough freedom to solve it in time, especially if you start with common letters.

It isn't possible to guarantee a hard mode win by starting with with e.g. RAISE, because if you get e.g. RS in yellow and E in green, then you have 6 remaining words matching /S.ORE/ (plus 4 others), and you can't deal with more than one consonant per guess. Starting with TRACE is even worse: you can't even guarantee a win in 7 guesses due to the cluster /.ATCH/.

thurn4y ago

I also did this and also got AEROS/UNLIT (well, AROSE/UNTIL in my case). You've made the same minor error I did of looking at the frequency distribution of all guessable wordle words, not the smaller list of possible answers, which has a different frequency distribution. If you look at that you get ORATE/LYSIN as the best starting pair (based on frequency only, not considering position)

ghusbands4y ago

Calling not cheating an "error" is odd. The hypothetical game is one in which any valid word could be the answer. If you're going to use the list of answers, just look up the answer for today's date and get it right on the first guess.

less_less4y ago

If you're writing a program to solve Wordle, you're already cheating in some sense.

Whether or not you use the target word list, it's important to recognize that there *are* two separate lists: you can guess almost any plausible 5-letter word, even bullshit Scrabble words like SOARE. But the target word will always be a relatively common word, because the game is designed to be winnable for ordinary English speakers. Honestly the word list is still a little harder than I would have picked for a broad audience; eg REBUS the other day was on the obscure side.

It's also worth noting that the target word will never be a 3- or 4-letter word pluralized by adding -ES or -S. This is something that you could notice simply by playing for a couple weeks. These seem to have been excluded manually or by a regex: it does have plural words of other forms, and it has words with other endings like -ER or -ING.

thurn4y ago

Well, the author is already looking at the larger Wordle wordlist, probably using an english dictionary would be more in the 'not cheating' spirit.

hddqsb4y ago

Interesting post, but the premise is misleading. The title says "Best ... strategy" and the first sentence reads "This post will derive an optimal ... strategy", but there is nothing to suggest that this strategy is optimal.

One strategy which is obviously optimal is to use minimax (recursively, not like Knuth's Mastermind strategy which was mentioned by @christiangenco), however this strategy is not computationally feasible.

---

Aside: There is something broken with this site's history manipulation. When I open the blog post it creates two entries in the history list, and clicking the browser's back button takes me to a page with the same URL as the blog post that displays "Error Page Not Found".

less_less4y ago

I agree with your criticism. It's also not clear what "optimal" means, since there are multiple desirable metrics: fewest average guesses, fewest worst-case guesses, etc.

Full brute force is allllmost computationally feasible, at least for some metrics. Like you could probably exhaust the search space in a month on a small cluster, at least if you're minimizing either (worst case, average case) or (pr(lose), pr(take 6 guesses), pr(take 5 guesses) ...). It's also significantly easier to brute-force in hard mode.

taejo4y ago

> One strategy which is obviously optimal is to use minimax (recursively, not like Knuth's Mastermind strategy which was mentioned by @christiangenco), however this strategy is not computationally feasible.

With some restrictions it's possible to do a full minimax. In that case, you still don't know if it's optimal, but you've at least got an upper bound. And it turns out, it's possible to guarantee victory while finishing in an expected 3.554 guesses. Alternatively, you can finish in an expected 3.212 guesses if you don't mind a small chance of taking more than six guesses.

hddqsb4y ago

To add to my previous comment...

I realise that this blog post is about a strategy for humans to use rather than for an ideal solver, so a degree of inexactness is to be expected. Still, I dislike the way the words "best" and "optimal" are thrown around. The guesses presented in the blog post are "best" in the sense that they maximise some heuristic (namely covering the most frequent letters), but the blog post doesn't explain why that heuristic is good.

Regarding an ideal solver, to prove that a strategy is optimal (in terms of the worst-case number of guesses) one would need to show two things:

- That the worst-case number of guesses of the strategy is N, and

- That there is no strategy whose worst-case number of guesses is less than N.

The first is quite easy to do (for an automated strategy): simply run the strategy against each possible secret "wordle" and keep track of the maximum number of guesses. The second is much harder.

(Since the worst-case number of guesses is likely to be small, this is not a fine-grained way to compare strategies. One could also look at the distribution of the number of guesses to get more information; for example a strategy that takes 6 guesses half the time and 5 guesses the other half is clearly better than a strategy that always takes 6 guesses. Still, it would be really nice if people who published automated strategies also published their worst-case number of guesses.)

Knuth's strategy is optimal (in terms of the worst-case number of guesses) for Mastermind, but it might not be optimal for Wordle. Knuth showed that his strategy takes at most 5 guesses for Mastermind, and presumably there is no strategy that takes at most 4 guesses. But his strategy is not optimal in terms of the distribution of the number of guesses; there are strategies with a lower average number of guesses (and the same worst-case number of guesses; see https://mathworld.wolfram.com/Mastermind.html). Reducing the set of possibilities as much as possible is a very sensible strategy, but it might not be optimal because the number of guesses required to solve a set depends on the nature of its elements, not just on the size of the set. In the case of Mastermind, this inefficiency does not affect the worst-case number of guesses; but in the case of Wordle it might.

ledauphin4y ago

I did some work on this as well.

One thing I think this analysis seems to be missing is taking into account letter position within the word set. If you take that into account, it actually shifts the first guesses, since you're far better off to discover a green square than a yellow one.

There are some pretty weird words that pop up if you do this analysis, so I've settled on some that are slightly suboptimal but won't make me feel like a robot every day.

I will say that "cares", in my analysis, is a significantly better option than the author's suggestion of aeros, because "e" is more common in position 4 than in 2, and a is much more common in position 2 than 1.

ncmncm4y ago

I bet "s" in final position is much rarer than in typical text, because Wardle seems (reasonably!) to prefer base words over plurals, as well as past tense, participle, and other modifications.

For me the most useful immediate information is about vowels, making "adieu" quite useful. But "irate" and "inter" are also pretty good.

Most people focus on the yellow and green results, but the grey results can be even more informative. A grey result doesn't just tell you about its own position, but about all five. So each grey result from the first guess rules out huge swaths of the dictionary.

The more letters from ETAOIN SHRDLU you can cram into the first two guesses, the better off you are. Thus, any letters in color on the first guess would be wasted if repeated in the second word.

ledauphin4y ago

We all draw our lines somewhere different, of course, but to my mind, taking into account Wardle's specific choices as exposed in the source code feels like cheating, vs taking the full list of 5 letter words that his code "accepts".

Knowing what answers are actually "true" ahead of time is more information than I'm comfortable taking advantage of.

And yes, eliminating letters is crucial - in my opinion it's always better to guess 5 totally new letters for the second word.

1 more reply

AnotherGoodName4y ago

You could probably have fun with

Player 1) create a predictable process that can guess wordle words

Player 2) given the full code of 1 find a word that will not be found by that process in 6 steps

The game is over if someone can decide a process that catches all valid 5 letter words in 6 steps.

ghusbands4y ago

I believe there are already plenty of solvers doing it in five steps, worst case, as in [1]. I don't think any are doing it in four, but I doubt anyone has yet demonstrated that four is impossible.

[1] https://news.ycombinator.com/item?id=29890845

ghusbands4y ago

For future readers, the following link claims that five steps, worst case, is incorrect, and six is the smallest maximum: https://www.poirrier.ca/notes/wordle-optimal/

mmmrtl4y ago

Or maximize entropy: https://twitter.com/quantian1/status/1480337678281363461

lkbm4y ago

Ah, helpful to see people analyzing for hard mode. I don't have that enabled, but I try to force myself to play that way.

Jedd4y ago

A friend wrote up a post a couple of weeks ago about solving Wordle with python:

https://poetix.medium.com/playing-wordle-with-python-6750185...

I note that a very recent Wordle game used the same letter twice, rudely negating one of his otherwise quite reasonable assumptions there.

geoduck144y ago

That double letter threw me for a loop. ><

lucideer4y ago

I've spoken to a lot of people who say they take this "exploration" strategy, but I think it misses the human language element of deduction.

I've been playing Wordle for a while and while I'm no wordsmith, if you take an analytical strategy to word selection from line 2 onward it's never that difficult to get 3/6 (which I presume is the highest score attainable without significant luck).

English is one of the most irregular languages, with so many influences and variations, but it still has structure and common patterns. Once you have knowledge from a well selected exploratory 1st line, there's a lot you can deduct about potential variants in the word structure of the answer to make an informed 2nd line choice that'll be dual purpose: both being a good exploratory word and also a reasonably hopeful lucky guess.

Afterall, if you're aiming for 4/6 you're not really aiming very high.

cookie_monsta4y ago

100% agree. Wordle is not a hard game - it has a pleasing mix of luck and skill, and while all the theories and strategising are interesting, none of them are really necessary. I don't understand the "exploration" strategy at all - playing "hard mode" means every guess is a possible winner.

I've done about 30 now and my highest amount of guesses was 5 when I had the "ank" of CRANK and from there it was pure guessing as there were multiple, equally likely (as I understand it) possibilities.

My best score was 2 guesses and it felt like a very hollow victory - just a lucky guess, really. 3 feels like you've done some work and yeah, 4 feels average.

1 more reply

ricardobeat4y ago

This strategy optimizes for figuring out all the letters that are used in today’s word. But that’s not the goal of the game, it’s finding one word in particular. Those sound similar but are not.

In the example given, the first guess already told you there is an E, but then it’s not used in the next guesses. Figuring out the position of that letter, instead of trying to find a third one, will massively reduce your search space, I’m sure someone can do the math on this being more beneficial.

Yesterday I got one green one yellow in the first guess, and got the word in three steps from there. There are very few words that could fit after those two letters + all the excluded ones. You literally just iterate through the alphabet and possible words in your head while excluding the gray letters.

EDIT: just did today in 4 guesses again using this approach. Lucky streak?

My starting word is BACON.

ralferoo4y ago

> Lucky streak? My starting word is BACON.

I see what you did there

csours4y ago

Using only my memory of Wheel of Fortune default letters, I use

ARTSY MODEL CHUNK

They may not be the best words in the world, but I thought of them, so they are best to me.

YossarianFrPrez4y ago

It's interesting to think about optimal starting words, as some people might be more prone to thinking in terms of phonemes ("th", "ed", "sl") rather than individual letters. For this and other reasons, I imagine cognitive scientists would also be interested in the Wordle play data, as they can look to see which hypothesis testing strategies people people use over the course of game.

Also, one version of Wordle people can play is to pick the best starting word every time, another involves starting with a different word every time.

jlynn4y ago

Two things I've wanted to explore with Wordle:

1. Some sense of "par". Can you crawl twitter for everyone's solution tweets and get a sense of the average guesses for the day? 2. "Unwordle". For people who follow the hard-mode rules, how far can you get deciphering their guesses based on their shared color grid? Could you make it competitive between friends to see how well you can guess each others guesses? Would that encourage more creative guesses to trick your friends?

jonathankoren4y ago

I made my own Wordle solver recently that uses frequency analysis of letters and bigrams of five letter English words as a scoring function. It works fairly well.

Someone in this discussion suggested using frequency analysis at position, which seems interesting, especially when trying to locate misplaced letters. I might have to try that.

https://github.com/jonathankoren/wordle-solver

Supermancho4y ago

What was your source for "English words"? This is an interesting problem, itself.

jonathankoren4y ago

Googling will give you countless lists. The internet solved that problem decades ago.

The biggest problem is when your guessing dictionary contains words that aren’t in the acceptable word list in Wordle. For that, I generate 10 guesses on each turn. At least one of them should show up. I didn’t want to use Wordle’s dictionary because that felt like cheating.

1 more reply

schappim4y ago

I did the same. My source of words was the Unix dictionary found in macOS. The path is:

  /usr/share/dict/words

When I did the frequency analysis, I scoped it to only 5 letter words.

If you're looking for a great "first guess word", "trace" is one of the words that will yield the most clues.

wizofaus4y ago

FWIW, I just had a game where these were my guesses:

IRATE SOUND PLUMB FOUND (the answer)

In fact I nearly always play "SOUND" after "IRATE", even if I know it can't be correct, and in this case I knew PLUMB couldn't be correct - after the 2nd word I knew it was -OUND, but there are a lot of words ending in -OUND, so I wanted to eliminate as many as possible in the next move. PLUMB would either eliminate or indicate POUND MOUND or BOUND as the solution, and once it did the former, it was either FOUND HOUND WOUND, and I got lucky. In fact I should have played WHOMP which would have given me a better chance (given LOUND is not a word). In this sense the game is somewhat different to the coloured peg game "Mastermind", where there's no benefit in making a guess that can't possibly be correct.

aghilmort4y ago

Experimented with goal-based strategies last weekend:

- least time - fewest moves - fewest unique letters - etc.

Most interesting to me was solving in the least time and having fixed words to maximize letter coverage, which theoretically was 25/26 letters during first five guesses. Access to such a list would make it such that the goal is given those words, can we find the answer in less than some fixed time, e.g., one minute from start to finish.

Further interesting was attempting to find such a list without computer assistance, short of having access to word lists and filters, which ultimately led to a near optimal list of words for the first five moves with 24/26 letters:

** spoiler -> the (5) seed words are located at https://pastebin.com/D1DkzXA4 -- the password is gTgRCFrLYL

The 'a' is notably repeated, and worse, is in the same position; was planning to run a computer search to (1) explore if a perfect solution of five dictionary words with entirely unique letters exists &/or failing (1), then (2) is there a list with 24/25 unique letters or 1-2 repeated letters s.t. all letters are in different positions.

There are of course places where such a set of words with fairly maximal coverage still falls short -- a nearly so example is the word pair UNLIT and UNTIL. Determining how many such combos that might be nondeterministic with such a fixed set of starting words exist in the dictionary would be good if someone wanted to dig deeper, e.g., {ARISE, RAISE, AESIR, REAIS, SERAI} which was cited in another comment.

Imnimo4y ago

I think there are two natural optimization criteria.

One is to try to maximize the expected information gain of your guess. If we have an initial set of 128 possible words, we begin with log2(128) = 7 bits of entropy. When we make a guess and receive a response, we narrow down the list of words to the set compatible with that response. If, for example, there are 32 words compatible with our response, then we now have log2(32) = 5 bits of entropy, and our guess was worth 2 bits. For a given guess, there are many possible replies each with its own information gain - in the best case, we get all greens, and are left with 0 bits of uncertainty (for a gain of 7), while in other cases we may get all greys, and learn comparatively little. Further, each reply has its own probability of occurring - all greens is only 1/128, but other replies might be more likely if there are several possible targets that would generate the same reply. Thus, we weight the information gain of each reply by its probability to arrive at the expected information gain for that guess. For the word list provided in the article, I get TARES as the best first guess by this metric.

The second strategy is to continue down the game tree, and find the guess with the lowest number of expected (or alternatively the lowest worst-case) number of guesses. In principle, we might find that while TARES gains us a lot of information as a first guess, it leaves us without a good second guess (since we are restricted to guessing real words, and those words might all be redundant in some way), and thus our total expected number of guesses is larger than if we had taken a slightly less informative first guess. My hunch is that in practice this sort of situation is unlikely to occur, and the best first guess by this metric is probably similar to that of the first metric, although I haven't tried.

oli56794y ago

If it wasn't for the 6-guess limit and you had a vocabulary, then you could use minimax to try and minimize the maximum number of guesses.

E.g. assuming you make the best possible future guesses but your opponent has chosen the least convenient word and propitiating up these minimum and maximums.

I am trying to think if this can be tweaked?

handmodel4y ago

I really like the "hard" version of Wordle better where you can only guess clues which make sense given the clues. It seems more in the spirit and although there may be 26 ideal starting words it would be very hard to memorize the set list of words that could be the best guesses as a result.

not2b4y ago

I've picked a first word with the most common letters (ATONE for example), then do a grep on the 'enable1' word list (the one that a number of word games use) to exclude all 'impossible' words. The second word is chosen to have many common letters not in the first word but also matching any constraints. Then I do another filter; often by the 3rd guess there are fewer than 100 possibilities left. I then choose the word that seems the most common, because Wordle often chooses relatively common words as the answer, I haven't seen an obscure word picked yet. I've gotten it in three this way a couple of times, and if not, repeating the grep pipeline often leaves only a few words for the fourth guess.

petercooper4y ago

Like many, I spent some time thinking about this, and realized if I could analyze the words used so far, I could find some patterns. For example, the creator might not ever use plurals meaning RATES might not be such a helpful starting word with the S.

So I Googled the last several solutions to find such a list and instead I found... a gist where someone had reverse engineered the game and put up a list of every solution, including for the next several months. On the plus side, I now don't spend time on the game.. but this is perhaps a warning to not go down the rabbit hole yourself if you want to keep playing ;-)

Arnavion4y ago

It's because a) the word list is hard-coded in the game's JS, so anyone can see it with "View Source", and b) the list is indexed based on the current day (as reported by the JS) and wraps around to the start.

So you don't just have solutions for the next several months. You have all the solutions ever :)

bejd4y ago

It probably didn't take a lot of reverse engineering: the solution list is stored in plaintext in the site's javascript!

petercooper4y ago

I saw that, but I had naively assumed it would at least not be in the right order and some sort of date based hash would be used to shuffle it up.. ha! The simplest things..

vple4y ago

I also wrote a tool for playing: https://github.com/vple/wordle-solver/blob/main/solver.js

There's probably more tuning I can do for the algo, but roughly:

- I took all the words from the site's js as the dictionary.

- From remaining eligible words, compute the letter distribution (ignoring letters you already know are in the solution).

- Pick a word that uses as many of the most frequent letters as possible.

- Use one of those as a guess.

The goal is essentially to greedily reduce the remaining candidate words as much as possible per guess.

jonathankoren4y ago

This is pretty much what I did, but I mixed in a regexp to hold the location restrictions, and a penalty for using the same letter multiple times. (eg guessing “added” is worse than “aspen” for “a..e.”)

I do wonder if looking at how a letter splits the space of letters and words would be interesting

vple4y ago

Yeah, I wasn't sure how I wanted to deal with duplicates so I mostly ignored them. I track letter positions directly (just a bunch of tuples), but don't actually do anything with this other than restricting candidates words.

I think if I work on this some more I'd try to factor in letter positioning when deciding what to guess. My hunch is that it won't make too much of a difference though.

1 more reply

manuw4y ago

Yesterday I had fun to find out what's the best "opener" for Wordle. I wrote quickly something about it here[0] and here[1]. Looks like "irate" is a good opener.

[0] https://dev.to/mnlwldr/opener-for-wordle-11k0

[1] https://twitter.com/mnlwldr/status/1481678410854412289

Kiro4y ago

Always trying a new first word every day makes it much more fun imo. I love sitting with a blank Wordle and just let my mind try to settle on a random 5-letter word from my subconscious.

KWD4y ago

Same way I approach it. Just whatever word comes to mind at that time.

WhiteOwlEd4y ago

I feel the best strategy to win at Wordle would be to see if someone already posted the answer.

If no answer, use reliable hints ( i.e on Twitter) to better inform the initial guess.

Beyond that, the best initial word goes beyond information theory, letter frequency, etc to also be a word that has a realistic chance of being word of the day. As a trivial example, AEIOU may reveal some information on the first guess, but I find it extremely unlikely that you will win Wordle in 1 with this guess.

azinman24y ago

If you’re looking up the answer, then you’re cheating. What’s the point? You can already get it via JavaScript if you’re so inclined.

meterplech4y ago

I built a command line wordle solver that provides an updated best guess as the game progresses and you learn more. Right now it's just based on word frequency, but working to add more.

Also added a command line version of wordle in case you want to play more and practice and a simulator I'm using now to explore optimal strategy more.

https://github.com/jkatzur/wordle-solver

plank4y ago

Seems that another thing to consider is multiple usages of same letter. Eerie for instance. So if the first try gets a single letter, this does not immediately mean that one should not try multiple letters.

Considering that eg the letter e is used very frequently, one might try words with more the one e, for instance? Off course this does not mean that I have a different, best strategy, only that it might be more complicated.

tobinfricke4y ago

Yet another Wordle solver!

https://github.com/nikitaborisov/autowordl

We choose the guess that is expected to result in the smallest set of possible solutions after one guess.

For the secret answer "QUERY" it suggests the following sequence of guesses:

1. LARES

90 possible answers after this guess.

2. GROUT

Only four possibilities now: ['ENURE', 'INURE', 'QUERY', 'QUIRE']

3. BRIBE

This narrows the field down to one possibility:

4. QUERY

ghusbands4y ago

For that third guess, you could just use QUIRE. Still reduces the possibilities to one, but has a chance of getting the right answer one step sooner. In general, favouring something within the possibilities, where it doesn't pessimise, might reduce your search tree a bit.

tobinfricke4y ago

Yup, I definitely agree. I haven't implemented smart tie-breaking logic yet.

carderne4y ago

I did something similar including third guesses, but got bored before including letter positions. I included a whole bunch of words in case anyone wants to mix it up each morning. [0]

The main problem I think is that the words certainly aren’t chosen purely randomly, so the actual letter frequency could be completely different…

[0] https://rdrn.me/wordle/

IMAYousaf4y ago

I personally always start with the word OUIJA.

From there it's simple to rotate if any of the vowels get picked up into words with common non-vowels like S and T etc.

If no vowel appears out of OUIA, I always go for a word like STREP.

Becomes relatively trivial from there.

Today (01/13/2022) the word was ABBEY.

I'll represent GREEN squares with * and GREY squares with ^

It went as follows:

OUIJA^ B^E^A^ST FA^B*LE^ ABBEY

After FABLE, the first word I could think of that had at least 1 A, at least 1 B, and at least one E was ABBEY.

Adrock4y ago

I computed the Pareto optimal frontier of initial Wordle guesses, showing the tradeoff between maximizing green squares and maximizing yellow squares in this thread: https://twitter.com/adereth/status/1478435989982875648

thurn4y ago

This is interesting, you should re-do this analysis using the actual lists of 2000 valid words and 10,000 valid guesses, since it does fairly significantly change the math vs. a normal English dictionary (the word can almost never be plural, for example, dramatically reducing the frequency of S in the final position).

ledauphin4y ago

correct me if I'm wrong, but it seems like greens are strictly more valuable than yellows, so giving them equal weight doesn't make much sense to me.

Adrock4y ago

Yes, that’s true. But how much more valuable? Would you trade 0.1 expected greens for 0.2 expected yellows? Computing the Pareto optimal frontier shows you all your options.

rckrd4y ago

My contribution to the best-first guess on Wordle is a greedy algorithm that picks the word that eliminates the most solutions. [1]

[1] https://matt-rickard.com/wordle-whats-the-best-starting-word...

elicash4y ago

On easy mode, what should the second word be? Third, fourth? How often can you get the word ENTIRELY by eliminating as many words as possible in this way?

ipsin4y ago

This strategy seems to avoid words that Wordle will accept, while not choosing them as word-of-the-day, e.g. SERAI.

If your goal is to minimize the number of remaining words in the worst case, I think SERAI is the winner, leaving you with at most 697 choices. [Assuming my code is right]

propter_hoc4y ago

Genuinely curious, why would you use SERAI rather than ARISE? Functionally equivalent, doesn't seem like a made-up word, and might actually be the answer one day.

ipsin4y ago

As CrazyStat hints, anagrams are not equivalent in Wordle, because changing the word order means that you're also changing the bucketing, the words in the anagram shifting between yellow and green.

To be more specific, I counted 882 possible words in the worst case using ARISE, versus 697 using SERAI.

CrazyStat4y ago

S is very common as the first letter, so having it in that position gives you more information than having it in fourth position (due to higher probability of green).

sg474y ago

My first two words are POINT and SHEAR though I'm going to change it to SNARE. This covers most of the vowels and frequently occurring letters. From then on it's trying to guess the other consonants and the position of the letters.

kkcorps4y ago

I have been using - WORDS, CHIME, FAULT for the first 3 words. Eliminates all vowels. Then I either proceed to guess the correct position of letters if I have 2-3 yellows from the first 3, or proceed to eliminate consonants.

schappim4y ago

If you're looking for a source of words to build your own solver, you can find one in many *nixy distros (including macOS) here:

  /usr/share/dict/words

ghusbands4y ago

Or just combine the arrays in the wordle source (being careful not to look too closely at the word order in one of them).

Mizza4y ago

What's the opposite - a "bongcloud" for Wordle?

CrazyStat4y ago

IMMIX

k2xl4y ago

Do any bots incorporate the history of wordle words? Wordle never repeats previous words so I wonder if the “best Xth word to guess” could change at some point?

takeda4y ago

I would consider that cheating. If you consider a history you might get all past (as well as future) words from the source code.

For example this Saturday's word will be panic.

k2xl4y ago

I meant for a bot perspective.

davidw4y ago

“A strange game. The only winning move is not to play.”

TheRealNGenius4y ago

That applies more to life than to wordle

davidw4y ago

Well that's bleak. Life can be pretty good, too! Talk to someone if you need to, it helps.

yogrish4y ago

Best start would be to start with a word with maximum VOWELS.(PIOUS,SAUTE etc). Next guess can be based on previous yellows/greens.

kranner4y ago

ADIEU!

natch4y ago

I'm looking for the best strategy for guessing which Wordle app in the App Store is the real one.

Tijdreiziger4y ago

None: https://techcrunch.com/2022/01/12/wordle-is-being-punished-b...

Discussed yesterday at https://news.ycombinator.com/item?id=29906892

pfooti4y ago

Just use SWALE, it hasn't done me wrong yet.

Madmallard4y ago

I pretty much always start with stare

deepsquirrelnet4y ago

I'm sure this is an application for Bayesian Inference, but I'm too uneducated to say anything more insightful than that.

mar_31044y ago

SPADE WROTH CLIMB FUNKY

Left:GJQVXZ

j / k navigate · click thread line to collapse

151 comments

MauranKilom4y ago

It seems clear to me that there is an optimal starting word, but that the best second word has to depend on the info you gain from the first.

Case in point: If you get 3 green 2 yellow in the first word, you can solve on the next guess.

Of course, you can constrain your strategies to "I always use the same two/three starting words", and in many cases that will be fine. But it's quite obviously not optimal.

nwiswell4y ago

> It seems clear to me that there is an optimal starting word

Beginning with the full list of 5-letter words, I calculated the frequency of each letter of the alphabet in each of the 5 possible positions for a 5 letter word.

Then I iterated through the list a second time, this time assigning a score for each word equal to the sum of frequencies for each letter in its respective position.

By a significant margin, the highest score is SLATE (over 1400). Runners up (over 1300) are SAUTE, SHIRE, and CRATE.

Caveat: this approach assumes that all possible words are equally likely to be the answer.

codefined4y ago

Interestingly, doing a similar analysis where we bucketed each word into 243 (3^5) buckets based on the possible result, we found "RAISE" as the best word. Source[0]

[0] https://gist.github.com/popey456963/a654e98d0180566b897b70ee...

2 more replies

Nursie4y ago

1 more reply

paulcole4y ago

> Beginning with the full list of 5-letter words, I calculated the frequency of each letter of the alphabet in each of the 5 possible positions for a 5 letter word.

There’s a great many 5-letter words that the creator will never consider because they’re too obscure. Treating all 5-letter words as possible is a mistake when calculating strategy for this.

High-frequency letters will be over-represented in your analysis.

1 more reply

alkonaut4y ago

tcskeptic4y ago

That is a really interesting approach -- if you remove the words with any of the letters in SLATE, what is the highest ranked word? (If you have it to hand)

pyrale4y ago

> Case in point: If you get 3 green 2 yellow in the first word, you can solve on the next guess.

That's a very specific edge case, though.

eightails4y ago

> It seems clear to me that there is an optimal starting word, but that the best second word has to depend on the info you gain from the first.

Definitely. The likelihood of a letter appearing in a given place changes depending on the letters around it. A Q will almost always be followed with a U, for example.

christiangenco4y ago

I modeled my Wordle solver[1] after Donald Knuth's Mastermind strategy[2].

An optimal guess is found by looking at the list of possible solutions and for each of those possible solutions checking how much each possible guess would narrow down the list of possible solutions.

The optimal first guesses from my solver are ARISE, RAISE, AESIR, REAIS, or SERAI because each will narrow down the possible word list to at minimum 168 remaining words.

Each guess after that uses the same algorithm with the list of possible guesses filtered with the information you learned in previous guesses.

edit: formatting

1. https://wordlesolver.com source: https://github.com/christiangenco/wordlesolver

2. https://math.stackexchange.com/questions/1192961/knuths-mast...

svat4y ago

After reading your code, I see an error in it. In fact, only RAISE is optimal; the others are worse, and leave bigger lists (according to my code):

    RAISE: 168
    …
    REAIS: 203
    …
    AESIR: 220
    ARISE: 220
    …
    SERAI: 241

The error in your code is that your "evaluateGuess" function (right at the top, in the first 10 or so lines of https://github.com/christiangenco/wordlesolver/blob/9c3bd94a...):

    function evaluateGuess({ solution, guess }) {
      return [...guess].map((letter, index) => {
        return {
          letter,
          included: solution.includes(letter),
          position: letter === solution[index],
        };
      });
    }

is too simplistic, and not actually what the real game does. In the game, for each letter position, there are three possible responses:

• Correct (Green), what you call "position"

• Present (Yellow), what you call "included"

• Absent (Grey)

Here are three test cases, that you could try out on the real Wordle in recent days:

You mentioned Donald Knuth's Mastermind paper; in fact in the paper (http://www.cs.uni.edu/~wallingf/teaching/cs3530/resources/kn...) Knuth points this out on the very first page:

> Rule 2 is somewhat difficult to state precisely and unambiguously, and the manufacturers have in fact not succeeded in doing so on the directions they furnish with the game […]

and gives an exact rule that you may want to study carefully.

In my code, the `response` function I use (it's not the most efficient, but we can just memoize it) is:

    def response(h, g):
        '''
        - The hidden word is h.
        - The guess is g.
        For each position in the word g, some color:
        - 'green' if in the same position
        - 'yellow' if present (after subtracting 'green's)
        - 'grey' if absent (after subtracting "green"s and "yellow"s)
        '''
        assert len(h) == len(g)
        L = len(h)
        green = [i for i in range(L) if h[i] == g[i]]
        yellow = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in green: continue
            for j in range(L):
                if j in green: continue
                if j in yellow: continue
                if h[j] == g[i]:
                    yellow.append(i)
                    break
        return (green, yellow)

Note the three "continue" statements — they are crucial, to match the behaviour of the real Wordle (or Master Mind) on the three test cases I mentioned above.

svat4y ago

The testcases notwithstanding, I found a bug in my code as well. Fixed my `response` function to:

    def response(h, g):
        assert len(h) == len(g)
        L = len(h)
        correct = [i for i in range(L) if h[i] == g[i]]
        present_h = []
        present_g = []
        for i in range(L):
            # We want to check whether g[i] is "present" in h
            if i in correct: continue
            for j in range(L):
                if j in correct: continue
                if j in present_h: continue
                if h[j] == g[i]:
                    present_g.append(i)
                    present_h.append(j)
                    break
        return (correct, present_g)

and now I too get (AESIR, ARISE, RAISE, REAIS, SERAI) all leaving 168 words. (But the testcases in the above comment still hold, though, make sure your code works for them.)

pedrosorio4y ago

This does not quite work.

For today's word (abbey):

- input "arise" (match "a", wrong position "e")

- solver says there are 10 possible words, one of them is "apnea"

- input "apnea" (match "a", match "e") EDIT: fixed

- solver says there are 0 possible words (it assumes last "a" should appear yellow, but the wordle page has the last "a" as grey because there is only one "a" in the word)

geoduck144y ago

SPOILER!!!!

1 more reply

phalangion4y ago

Apnea would be match for “a” and “e”

2 more replies

geoduck144y ago

I like the tool, but it feels like cheating because it uses the list of "possible solutions"

For instance, "waxed" is not suggested even though it is a legitimate word

Syzygies4y ago

    5.291 soare
    5.294 roate
    5.299 raise
    5.311 raile
    5.311 reast
    5.321 slate
    5.342 crate
    5.342 salet
    5.345 irate
    5.346 trace
    5.356 arise
    5.360 orate
    5.370 stare
    5.382 carte
    5.390 raine
    5.400 caret
    5.402 ariel
    5.406 taler
    5.406 carle
    5.407 slane

One might recognize SOARE as identified elsewhere by a different measure. I am relieved that I give up very little by moving down to the first word I recognize on this list.

In the 1960's my dad used entropy to program Jotto on Kodak's computers. It wasn't feasible then to evaluate every possible clue word, but he determined that one did well enough with a random subset.

hddqsb4y ago

Maximising the information gained using Shannon's entropy is a very good strategy (assuming the goal is to minimise the expected number of guesses), however it is not necessarily optimal!

I have a counter example for a simplified version the game with the following rule changes:

1. The player is only told which letters in the guess are correct (i.e. they are not told about letters that are present but in a different location).

2. If the player knows there is only one possible solution, the player wins immediately (without having to explicitly guess that word).

3. The set of words that the player is allowed to guess may be disjoint from the set of possible solutions.

Here is the list of possible solutions:

    aaaa
    aaab
    aaba
    babb
    abaa
    bbab
    bbba
    bbbb

(There are 8 words. The 2nd, 3rd and 4th letters are the binary patterns of length 3, and the 1st letter is a carefully chosen "red herring".)

Here is the dictionary of words the player is allowed go guess:

    axxx
    xaxx
    xxax
    xxxa

(Each guess effectively lets the player query a single letter of the solution.)

The information gain for each possible initial guess is identical (all guesses result in a 4-4 split), so a strategy based on information gain would have to make an arbitrary choice.

If the initial guess is axxx (the "red herring"), the expected number of guesses is 3.25.

But a better strategy is to guess xaxx (then guess xxax and xxxa). The expected number of guesses is then 3.

(In this example information gain was tied, but I have a larger example where the information gain for the "red herring" is greater than the information gain for the optimal first guess.)

Syzygies4y ago

1 more reply

Syzygies4y ago

A fork using primes: Primel

https://converged.yt/primel/

The best first guess using Shannon entropy is a dead heat between 12953 and 12539. Not surprisingly to the number theorists out there...

If you study the JavaScript, apparently this guy's girlfriend knows every prime.

SoylentOrange4y ago

Syzygies4y ago

> wasting part of an hour as I did searching for a clever way to score guess words.

I'd love to see scoring a Wordle guess added to one of those programming problem sites, as code golf in any language would be amusing. I was once a commercial APL programmer, so APL comes to mind.

KerryJones4y ago

Would you mind sharing your code?

bee_rider4y ago

Which word list did you use?

strstr4y ago

I believe tares is the optimal word (according to shannon entropy) from the game’s list.

2 more replies

paxys4y ago

brianpan4y ago

It's not cheating, it's the better way. If you don't like it, you can turn on hard mode!

peeters4y ago

So, hard mode, plus no reusing grey letters, plus no using yellow letters in the same incorrect slot.

alkonaut4y ago

albert_e4y ago

this is my default approach unless really out of options i can think of

but i do slip up at times and jump to punch in a word that i later realize violates one of those yellow rules for instance

reificator4y ago

Go to the gear in the top right, and enable `Hard Mode - Any revealed hints must be used in subsequent guesses`.

duderific4y ago

I was playing hard mode accidentally. I didn't realize there was a more optimal way to play.

kzrdude4y ago

Without exploring, I just can't see any words that fit the known letters. I might not have that great imagination of words, or be that good to fit them to fixed clues.

jsmeaton4y ago

Agree. I used one of the strategies over a couple of days and it took all of the fun out of the game. Switching to hard mode brought all of that fun right back.

jackson14424y ago

Agreed. That strategy hasn't failed me yet, and it definitely makes me think a little harder.

colinmsaunders4y ago

Fight your wordle bot at https://botfights.io/game/wordle

kregasaurusrex4y ago

Looks like fun!

schaefer4y ago

omg, thank you!

biesnecker4y ago

Over the weekend I wrote a tool to help me figure this out: https://github.com/biesnecker/wordlesmith

1. tares 2. lares 3. cares 4. pares 5. dares

However, if you just look at the words that could be answers (which I what I've decided to do), the list changes to:

1. soare 2. saine 3. slane 4. saice 5. slate

In either case I've never had a puzzle where repeatedly choosing the first suggested word didn't get me to the answer in six or fewer guesses.

8note4y ago

The American spelling of favor got me yesterday. I'd never think of it as a possibility for a 5 letter word. That word list is pretty handy for knowing which words are in scope

jhas78asd4y ago

https://github.com/andyatkinson/wordle_solver

ledauphin4y ago

One of the things noted by the OP is that you can get stuck in a rut if you don't do more info-gathering than just the first word.

I would recommend not even trying to guess the word on attempt 2 - much better almost always to guess 5 totally different letters.

peeters4y ago

Hard mode is so much more fun (which disallows this strategy). It changes it up much more each day and is more challenging.

LimitedInfo4y ago

Happy to see tares as your top choice. I used the same method and came to the same word.

baby4y ago

I refuse to read this post because I'm enjoying this slow, daily, little game. I'm convinced over-thinking this would spoil the game.

noah_buddy4y ago

andrewingram4y ago

I prefer to minimise vowels in early guesses, because I find i'm good at filling in the blanks between consonants, but not the other way around.

less_less4y ago

Anyway I tried on several tuples: (max guesses, mode, dictionary size). What I found:

- (4, harder, large): no solution.

- (4, normal, small): no solution.

- (4, normal, large): didn't finish after like a day, but no solution found.

- (5, harder, small): this is pretty hard to guarantee a win; best starting word is SCAMP (-FLOUT-DEIGN if no hits).

- (6, harder, small): best staring word is PLATE (-CHURN-MOODY-SKIFF if no hits).

- (5, normal, small): didn't finish; best starting word so far is TRACE (-GODLY-SPUNK if no hits).

- (5, harder, large): didn't finish; best starting word so far is PALET.

- (6, harder, large): didn't finish; best starting word so far is SALET.

thurn4y ago

ghusbands4y ago

less_less4y ago

If you're writing a program to solve Wordle, you're already cheating in some sense.

thurn4y ago

Well, the author is already looking at the larger Wordle wordlist, probably using an english dictionary would be more in the 'not cheating' spirit.

hddqsb4y ago

---

less_less4y ago

I agree with your criticism. It's also not clear what "optimal" means, since there are multiple desirable metrics: fewest average guesses, fewest worst-case guesses, etc.

taejo4y ago

hddqsb4y ago

To add to my previous comment...

Regarding an ideal solver, to prove that a strategy is optimal (in terms of the worst-case number of guesses) one would need to show two things:

- That the worst-case number of guesses of the strategy is N, and

- That there is no strategy whose worst-case number of guesses is less than N.

The first is quite easy to do (for an automated strategy): simply run the strategy against each possible secret "wordle" and keep track of the maximum number of guesses. The second is much harder.

ledauphin4y ago

I did some work on this as well.

There are some pretty weird words that pop up if you do this analysis, so I've settled on some that are slightly suboptimal but won't make me feel like a robot every day.

ncmncm4y ago

I bet "s" in final position is much rarer than in typical text, because Wardle seems (reasonably!) to prefer base words over plurals, as well as past tense, participle, and other modifications.

For me the most useful immediate information is about vowels, making "adieu" quite useful. But "irate" and "inter" are also pretty good.

The more letters from ETAOIN SHRDLU you can cram into the first two guesses, the better off you are. Thus, any letters in color on the first guess would be wasted if repeated in the second word.

ledauphin4y ago

Knowing what answers are actually "true" ahead of time is more information than I'm comfortable taking advantage of.

And yes, eliminating letters is crucial - in my opinion it's always better to guess 5 totally new letters for the second word.

1 more reply

AnotherGoodName4y ago

You could probably have fun with

Player 1) create a predictable process that can guess wordle words

Player 2) given the full code of 1 find a word that will not be found by that process in 6 steps

The game is over if someone can decide a process that catches all valid 5 letter words in 6 steps.

ghusbands4y ago

I believe there are already plenty of solvers doing it in five steps, worst case, as in [1]. I don't think any are doing it in four, but I doubt anyone has yet demonstrated that four is impossible.

[1] https://news.ycombinator.com/item?id=29890845

ghusbands4y ago

For future readers, the following link claims that five steps, worst case, is incorrect, and six is the smallest maximum: https://www.poirrier.ca/notes/wordle-optimal/

mmmrtl4y ago

Or maximize entropy: https://twitter.com/quantian1/status/1480337678281363461

lkbm4y ago

Ah, helpful to see people analyzing for hard mode. I don't have that enabled, but I try to force myself to play that way.

Jedd4y ago

A friend wrote up a post a couple of weeks ago about solving Wordle with python:

https://poetix.medium.com/playing-wordle-with-python-6750185...

I note that a very recent Wordle game used the same letter twice, rudely negating one of his otherwise quite reasonable assumptions there.

geoduck144y ago

That double letter threw me for a loop. ><

lucideer4y ago

I've spoken to a lot of people who say they take this "exploration" strategy, but I think it misses the human language element of deduction.

Afterall, if you're aiming for 4/6 you're not really aiming very high.

cookie_monsta4y ago

My best score was 2 guesses and it felt like a very hollow victory - just a lucky guess, really. 3 feels like you've done some work and yeah, 4 feels average.

1 more reply

ricardobeat4y ago

EDIT: just did today in 4 guesses again using this approach. Lucky streak?

My starting word is BACON.

ralferoo4y ago

> Lucky streak? My starting word is BACON.

I see what you did there

csours4y ago

Using only my memory of Wheel of Fortune default letters, I use

ARTSY MODEL CHUNK

They may not be the best words in the world, but I thought of them, so they are best to me.

YossarianFrPrez4y ago

Also, one version of Wordle people can play is to pick the best starting word every time, another involves starting with a different word every time.

jlynn4y ago

Two things I've wanted to explore with Wordle:

jonathankoren4y ago

I made my own Wordle solver recently that uses frequency analysis of letters and bigrams of five letter English words as a scoring function. It works fairly well.

Someone in this discussion suggested using frequency analysis at position, which seems interesting, especially when trying to locate misplaced letters. I might have to try that.

https://github.com/jonathankoren/wordle-solver

Supermancho4y ago

What was your source for "English words"? This is an interesting problem, itself.

jonathankoren4y ago

Googling will give you countless lists. The internet solved that problem decades ago.

1 more reply

schappim4y ago

I did the same. My source of words was the Unix dictionary found in macOS. The path is:

  /usr/share/dict/words

When I did the frequency analysis, I scoped it to only 5 letter words.

If you're looking for a great "first guess word", "trace" is one of the words that will yield the most clues.

wizofaus4y ago

FWIW, I just had a game where these were my guesses:

IRATE SOUND PLUMB FOUND (the answer)

aghilmort4y ago

Experimented with goal-based strategies last weekend:

- least time - fewest moves - fewest unique letters - etc.

** spoiler -> the (5) seed words are located at https://pastebin.com/D1DkzXA4 -- the password is gTgRCFrLYL

Imnimo4y ago

I think there are two natural optimization criteria.

oli56794y ago

If it wasn't for the 6-guess limit and you had a vocabulary, then you could use minimax to try and minimize the maximum number of guesses.

E.g. assuming you make the best possible future guesses but your opponent has chosen the least convenient word and propitiating up these minimum and maximums.

I am trying to think if this can be tweaked?

handmodel4y ago

not2b4y ago

petercooper4y ago

Arnavion4y ago

So you don't just have solutions for the next several months. You have all the solutions ever :)

bejd4y ago

It probably didn't take a lot of reverse engineering: the solution list is stored in plaintext in the site's javascript!

petercooper4y ago

I saw that, but I had naively assumed it would at least not be in the right order and some sort of date based hash would be used to shuffle it up.. ha! The simplest things..

vple4y ago

I also wrote a tool for playing: https://github.com/vple/wordle-solver/blob/main/solver.js

There's probably more tuning I can do for the algo, but roughly:

- I took all the words from the site's js as the dictionary.

- From remaining eligible words, compute the letter distribution (ignoring letters you already know are in the solution).

- Pick a word that uses as many of the most frequent letters as possible.

- Use one of those as a guess.

The goal is essentially to greedily reduce the remaining candidate words as much as possible per guess.

jonathankoren4y ago

I do wonder if looking at how a letter splits the space of letters and words would be interesting

vple4y ago

I think if I work on this some more I'd try to factor in letter positioning when deciding what to guess. My hunch is that it won't make too much of a difference though.

1 more reply

manuw4y ago

Yesterday I had fun to find out what's the best "opener" for Wordle. I wrote quickly something about it here[0] and here[1]. Looks like "irate" is a good opener.

[0] https://dev.to/mnlwldr/opener-for-wordle-11k0

[1] https://twitter.com/mnlwldr/status/1481678410854412289

Kiro4y ago

Always trying a new first word every day makes it much more fun imo. I love sitting with a blank Wordle and just let my mind try to settle on a random 5-letter word from my subconscious.

KWD4y ago

Same way I approach it. Just whatever word comes to mind at that time.

WhiteOwlEd4y ago

I feel the best strategy to win at Wordle would be to see if someone already posted the answer.

If no answer, use reliable hints ( i.e on Twitter) to better inform the initial guess.

azinman24y ago

If you’re looking up the answer, then you’re cheating. What’s the point? You can already get it via JavaScript if you’re so inclined.

meterplech4y ago

I built a command line wordle solver that provides an updated best guess as the game progresses and you learn more. Right now it's just based on word frequency, but working to add more.

Also added a command line version of wordle in case you want to play more and practice and a simulator I'm using now to explore optimal strategy more.

https://github.com/jkatzur/wordle-solver

plank4y ago

tobinfricke4y ago

Yet another Wordle solver!

https://github.com/nikitaborisov/autowordl

We choose the guess that is expected to result in the smallest set of possible solutions after one guess.

For the secret answer "QUERY" it suggests the following sequence of guesses:

1. LARES

90 possible answers after this guess.

2. GROUT

Only four possibilities now: ['ENURE', 'INURE', 'QUERY', 'QUIRE']

3. BRIBE

This narrows the field down to one possibility:

4. QUERY

ghusbands4y ago

tobinfricke4y ago

Yup, I definitely agree. I haven't implemented smart tie-breaking logic yet.

carderne4y ago

I did something similar including third guesses, but got bored before including letter positions. I included a whole bunch of words in case anyone wants to mix it up each morning. [0]

The main problem I think is that the words certainly aren’t chosen purely randomly, so the actual letter frequency could be completely different…

[0] https://rdrn.me/wordle/

IMAYousaf4y ago

I personally always start with the word OUIJA.

From there it's simple to rotate if any of the vowels get picked up into words with common non-vowels like S and T etc.

If no vowel appears out of OUIA, I always go for a word like STREP.

Becomes relatively trivial from there.

Today (01/13/2022) the word was ABBEY.

I'll represent GREEN squares with * and GREY squares with ^

It went as follows:

OUIJA^ B^E^A^ST FA^B*LE^ ABBEY

After FABLE, the first word I could think of that had at least 1 A, at least 1 B, and at least one E was ABBEY.

Adrock4y ago

thurn4y ago

ledauphin4y ago

correct me if I'm wrong, but it seems like greens are strictly more valuable than yellows, so giving them equal weight doesn't make much sense to me.

Adrock4y ago

Yes, that’s true. But how much more valuable? Would you trade 0.1 expected greens for 0.2 expected yellows? Computing the Pareto optimal frontier shows you all your options.

rckrd4y ago

My contribution to the best-first guess on Wordle is a greedy algorithm that picks the word that eliminates the most solutions. [1]

[1] https://matt-rickard.com/wordle-whats-the-best-starting-word...

elicash4y ago

On easy mode, what should the second word be? Third, fourth? How often can you get the word ENTIRELY by eliminating as many words as possible in this way?

ipsin4y ago

This strategy seems to avoid words that Wordle will accept, while not choosing them as word-of-the-day, e.g. SERAI.

If your goal is to minimize the number of remaining words in the worst case, I think SERAI is the winner, leaving you with at most 697 choices. [Assuming my code is right]

propter_hoc4y ago

Genuinely curious, why would you use SERAI rather than ARISE? Functionally equivalent, doesn't seem like a made-up word, and might actually be the answer one day.

ipsin4y ago

As CrazyStat hints, anagrams are not equivalent in Wordle, because changing the word order means that you're also changing the bucketing, the words in the anagram shifting between yellow and green.

To be more specific, I counted 882 possible words in the worst case using ARISE, versus 697 using SERAI.

CrazyStat4y ago

S is very common as the first letter, so having it in that position gives you more information than having it in fourth position (due to higher probability of green).

sg474y ago

kkcorps4y ago

schappim4y ago

If you're looking for a source of words to build your own solver, you can find one in many *nixy distros (including macOS) here:

  /usr/share/dict/words

ghusbands4y ago

Or just combine the arrays in the wordle source (being careful not to look too closely at the word order in one of them).

Mizza4y ago

What's the opposite - a "bongcloud" for Wordle?

CrazyStat4y ago

IMMIX

k2xl4y ago

Do any bots incorporate the history of wordle words? Wordle never repeats previous words so I wonder if the “best Xth word to guess” could change at some point?

takeda4y ago

I would consider that cheating. If you consider a history you might get all past (as well as future) words from the source code.

For example this Saturday's word will be panic.

k2xl4y ago

I meant for a bot perspective.

davidw4y ago

“A strange game. The only winning move is not to play.”

TheRealNGenius4y ago

That applies more to life than to wordle

davidw4y ago

Well that's bleak. Life can be pretty good, too! Talk to someone if you need to, it helps.

yogrish4y ago

Best start would be to start with a word with maximum VOWELS.(PIOUS,SAUTE etc). Next guess can be based on previous yellows/greens.

kranner4y ago

ADIEU!

natch4y ago

I'm looking for the best strategy for guessing which Wordle app in the App Store is the real one.

Tijdreiziger4y ago

None: https://techcrunch.com/2022/01/12/wordle-is-being-punished-b...

Discussed yesterday at https://news.ycombinator.com/item?id=29906892

pfooti4y ago

Just use SWALE, it hasn't done me wrong yet.

Madmallard4y ago

I pretty much always start with stare

deepsquirrelnet4y ago

I'm sure this is an application for Bayesian Inference, but I'm too uneducated to say anything more insightful than that.

mar_31044y ago

SPADE WROTH CLIMB FUNKY

Left:GJQVXZ

j / k navigate · click thread line to collapse