Statement

 Goal

You like to play 'Wordle' and decide to write a script that produces a simple RegEx ('Regular Expression') pattern to filter for words that match what you know about the 5-letter word.

You are given an integer n followed by n lines, each holding a guess and a result: a word that has been guessed and the correctness of each of its letters, respectively.
e.g. utter YYG_G
A result of G (Green) means the letter in that position is correct (the second t and the r in the example). A result of Y (Yellow) means the letter appears somewhere else in the word but not in that position (the u and the first t in the example). In short, G means 'appears here' and Y means 'appears elsewhere'. This example's only solution is the word tutor, which keeps the 3rd and 5th letters of the guess utter unchanged while the 1st and 2nd letters move to new positions.
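
To make the G/Y/_ marking concrete, here is a minimal Python sketch of how a result string could be derived from a guess and a known answer. The function name `score` and the two-pass handling of repeated letters are assumptions based on the usual Wordle convention; the puzzle itself only supplies results as input and never asks you to compute them.

```python
from collections import Counter


def score(guess: str, answer: str) -> str:
    """Return a G/Y/_ result string for `guess` against `answer` (hypothetical helper)."""
    result = ["_"] * len(guess)
    # Answer letters that are not already matched in place.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # First pass: mark letters that sit in the correct position.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"

    # Second pass: mark letters that occur elsewhere in the answer.
    for i, g in enumerate(guess):
        if result[i] != "G" and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1  # each answer letter can justify only one Y

    return "".join(result)


print(score("utter", "tutor"))  # YYG_G
```

Under this assumed rule, the guess utter scored against tutor reproduces the result shown in the example above.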

You can capture this in RegEx! Of course, there are many different ways to do it, so you've defined a specific way that you want to generate it. Details are below:

• The RegEx starts with a ^ and ends with $. These are 'anchors' that define the start and end of our word.

• Whenever letters are grouped together in square brackets, they appear in alphabetical order. (e.g. [defg])

• Any letters with a Y result, and no G results, appear somewhere unknown in the word, so we use a 'positive look-ahead' to ensure they occur. A 'positive look-ahead' for the letter b looks like (?=.*b). Multiple letters with Y results cannot be grouped in square brackets, since (?=.*[bdl]) would be satisfied by any one of the letters; instead, each letter gets its own look-ahead, ordered alphabetically. If the Y result letters are b, l, d then the start of your RegEx will be ^(?=.*b)(?=.*d)(?=.*l). If there are no Y result letters, then don't include any look-aheads.

• Any letters with a _ result, and no G or Y results, do not appear in the word and are excluded with a 'negative look-ahead'. A 'negative look-ahead' for the letter d looks like (?!.*d). Multiple letters to exclude are grouped in square brackets. If the _ result letters are d, f, e then the next section of your RegEx will be (?!.*[def]).

Be sure not to include letters in this look-ahead that have had G or Y results as well as _. A letter can get a _ if it appears twice in the guess but only once in the answer. For example, in upper __GGG the first p is marked _ and the second is marked G; the first p is not marked Y because the word needs only one p (one valid solution would be taper). If there are no eligible letters, then don't include the look-ahead.

• Any positions that haven't received a G result are then shown as a . character, meaning any character.

Y result positions that are not also G positions have an 'immediate negative look-ahead' placed in front of their . to ensure the Y letter is excluded from that position. These negative look-aheads look like (?!c), where c is whichever letter needs to be excluded at the position they precede. Note that it lacks the .* part, which means 'anything'; including it would exclude the letter from appearing anywhere from this point on, rather than only as the immediate next letter. If multiple letters have received a Y result for the same position then they can be grouped in square brackets. E.g. if a position excludes the letters r, e, d then it will appear as (?![der]) followed by a . character.

• Any positions with a G result will appear in the RegEx as the letter that was marked G in the guess.
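
The rules above can be sketched in Python as follows. This is one possible reading of the specification, not a reference solution: the function name `build_pattern`, the helper `group` and the internal data structures are all hypothetical, and the sketch assumes every guess and result is exactly 5 characters long.

```python
def build_pattern(guesses: list[tuple[str, str]]) -> str:
    """Build the Wordle RegEx from (guess, result) pairs (hypothetical helper)."""
    green = [None] * 5                      # confirmed letter per position
    yellow_at = [set() for _ in range(5)]   # letters ruled out per position
    seen = {"G": set(), "Y": set(), "_": set()}

    for guess, result in guesses:
        for i, (letter, mark) in enumerate(zip(guess, result)):
            seen[mark].add(letter)
            if mark == "G":
                green[i] = letter
            elif mark == "Y":
                yellow_at[i].add(letter)

    def group(letters):
        # One letter stands alone; several are bracketed in alphabetical order.
        s = "".join(sorted(letters))
        return s if len(s) == 1 else f"[{s}]"

    pattern = "^"
    # Positive look-aheads: Y letters that never earned a G, one per letter.
    for letter in sorted(seen["Y"] - seen["G"]):
        pattern += f"(?=.*{letter})"
    # Negative look-ahead: _ letters that never earned a G or a Y.
    absent = seen["_"] - seen["G"] - seen["Y"]
    if absent:
        pattern += f"(?!.*{group(absent)})"
    # Positions: the green letter, or "." guarded by an immediate look-ahead.
    for i in range(5):
        if green[i] is not None:
            pattern += green[i]
        else:
            if yellow_at[i]:
                pattern += f"(?!{group(yellow_at[i])})"
            pattern += "."
    return pattern + "$"


print(build_pattern([("waged", "__YG_"), ("boils", "_G__G")]))
# ^(?=.*g)(?!.*[abdilw]).o(?!g).es$
```

Collecting the letters into sets first keeps the three output sections (positive look-aheads, negative look-ahead, positions) independent of the order in which the guesses were made.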

You can test your RegEx on the RegEx dictionary by Lou Hevly to see which words match your answers: https://www.visca.com/regexdict/

Example 1
2
paper ___G_
boils _G___

Answer
^(?!.*[abilprs]).o.e.$
Solution
There are no Y result letters, so there are no positive look-aheads. The _ result letters with no other results are a, b, i, l, p, r and s, so they go in the negative look-ahead (?!.*[abilprs]). The G result positions are the 2nd and 4th letters, which are o and e, respectively. Each unknown letter is marked with a . character.
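
As a quick check of this answer, the pattern can be applied to a handful of words with Python's `re` module. The word list below is a hypothetical sample chosen for illustration, not part of the puzzle.

```python
import re

# The answer to Example 1, copied verbatim.
pattern = re.compile(r"^(?!.*[abilprs]).o.e.$")

# Hypothetical candidate words; any word list would do.
candidates = ["women", "money", "boxer", "toner", "tower"]

# Keep only the words the pattern accepts.
matches = [word for word in candidates if pattern.search(word)]
print(matches)  # ['women', 'money']
```

The last three words fit the .o.e. shape but contain an excluded letter (b or r), so the negative look-ahead rejects them.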

Example 2
2
waged __YG_
boils _G__G

Answer
^(?=.*g)(?!.*[abdilw]).o(?!g).es$
Solution
The only Y result letter is g, so we start with a positive look-ahead (?=.*g). The _ result letters with no other results are a, b, d, i, l and w, so they go in the negative look-ahead (?!.*[abdilw]). The G result positions are the 2nd, 4th and 5th letters, which are o, e and s, respectively. Each unknown letter is marked with a . character. The 3rd letter's . is preceded by the immediate negative look-ahead (?!g) to exclude a g from appearing as the 3rd letter.
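
To see what each part of this answer contributes, the sketch below splits it into three separate checks and reports which ones a string fails. The function name `failed_checks`, the check labels and the test strings are hypothetical; the strings are arbitrary and not necessarily dictionary words.

```python
import re

# The three parts of the Example 2 answer, each anchored on its own.
CHECKS = {
    "must contain g": r"^(?=.*g)",
    "must not contain a, b, d, i, l, w": r"^(?!.*[abdilw])",
    "must fit .o(?!g).es": r"^.o(?!g).es$",
}


def failed_checks(text: str) -> list[str]:
    """Return the labels of the checks that `text` does not pass."""
    return [label for label, rx in CHECKS.items() if not re.search(rx, text)]


for text in ["gores", "foxes", "gowes", "goges"]:
    print(text, failed_checks(text))
```

A string matches the full answer exactly when the returned list is empty. Here only gores passes; goges is rejected solely because of the (?!g) in front of the 3rd position.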
Input
Line 1 : An integer n, the number of guesses made
Next n lines : A space-separated guess and result, both 5 characters
Output
1 line : The RegEx capturing all possible solutions, structured as described
Constraints
0 ≤ n ≤ 6
Each guess will always be in lowercase
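
Reading the input described above could look like the following minimal sketch. The function name `parse_guesses` is hypothetical, and the sketch assumes the input is well formed, as the constraints promise.

```python
def parse_guesses(text: str) -> list[tuple[str, str]]:
    """Parse the puzzle input into (guess, result) pairs (hypothetical helper)."""
    lines = text.splitlines()
    n = int(lines[0])  # Line 1: the number of guesses made
    pairs = []
    for line in lines[1:1 + n]:
        guess, result = line.split()  # space-separated, both 5 characters
        pairs.append((guess, result))
    return pairs


print(parse_guesses("2\npaper ___G_\nboils _G___\n"))
# [('paper', '___G_'), ('boils', '_G___')]
```

In a real submission the text would typically come from standard input, for example via `sys.stdin.read()`. With n = 0 the function returns an empty list.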
Example
Input
0
Output
^.....$
