
regex-crossword-bbc's Introduction

Puzzle No. 3 – Wednesday 5 July

This is no ordinary crossword!

Instead of a word or phrase, each clue is a regular expression (or a ‘regex’). To complete the puzzle, find the letter matching both the horizontal and vertical regex for each square.

BBC RegEx Crossword

(source)


Step 0 - Set up the puzzle and clean the regexes

OK, so first thing, let's (painstakingly) write out the Regular Expressions that are in the attached image:

0: [^XZCVHFJLQM]+
1: [^PZVJG]{4}(.)[EFUG]{6}\1[^\sPZVJI]{2}
2: [^\sPQFB]{7}[^MGVAJNZ\s]+[^MVZJ]
3: N[OYSRU]{5}[NICE]{6}\s\-
4: .A[A\sDL]{4}O[AECLV\s]+

A: \sA?(SA|PE|N\s){2}
B: ([XYZ])(P|GO|EL)\1.
C: (LS|CA|OS)[L\sP][DO]{2}
D: (U)(T)\2\1[AOB?]
E: (.)\1\1\1\s
F: (FF|BE|QU|OS){2}L
G: ES?F?(OBZ|UCO|PTE)
H: S[MVU]B(TZ|BP|IV)
I: T[GLMV]{2}(E)\1
J: (JK|AE|EN|MG){2}L
K: N(XN|ZB|CA|FS){2}
L: [XHDJ]R[MZVIJ]EC
M: X?W(\sE|OS|PE){2}
N: [VIMZJ]{3}\-\s

There are several character classes used throughout the puzzle, but the literals within the character class aren't sorted alphabetically, which makes it hard to see the similarities between say [^MVZJ] and [VIMZJ].

So, let's sort the literals to make things easier for us down the road. We'll use the sorted versions going forward.

# Character Classes Sorted

0: [^CFHJLMQVXZ]+
1: [^GJPVZ]{4}(.)[EFGU]{6}\1[^IJPVZ\s]{2}
2: [^BFPQ\s]{7}[^AGJMNVZ\s]+[^JMVZ]
3: N[ORSUY]{5}[CEIN]{6}\s\-
4: .A[ADL\s]{4}O[ACELV\s]+

A: \sA?(SA|PE|N\s){2}
B: ([XYZ])(P|GO|EL)\1.
C: (LS|CA|OS)[LP\s][DO]{2}
D: (U)(T)\2\1[ABO?]
E: (.)\1\1\1\s
F: (FF|BE|QU|OS){2}L
G: ES?F?(OBZ|UCO|PTE)
H: S[MUV]B(TZ|BP|IV)
I: T[GLMV]{2}(E)\1
J: (JK|AE|EN|MG){2}L
K: N(XN|ZB|CA|FS){2}
L: [DHJX]R[IJMVZ]EC
M: X?W(\sE|OS|PE){2}
N: [IJMVZ]{3}\-\s
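If you'd rather not sort these by hand, here is a minimal sketch of how the sorting could be automated. The helper names (sort_class, sort_classes) are made up for this example, and it assumes the classes contain only single letters, escapes like \s, and symbols (no ranges such as A-Z), which holds for this puzzle.

```python
import re

# One character class: optional ^, then escapes (like \s) or plain characters.
CLASS_RE = re.compile(r"\[(\^?)((?:\\.|[^\]\\])+)\]")
# Inside a class, an escape such as \s counts as a single token.
TOKEN_RE = re.compile(r"\\.|.")


def sort_class(match):
    """Rebuild one character class with its members sorted."""
    negation, body = match.group(1), match.group(2)
    tokens = TOKEN_RE.findall(body)
    # Plain letters first in alphabetical order, then escapes and symbols.
    tokens.sort(key=lambda token: (not token.isalpha(), token))
    return "[" + negation + "".join(tokens) + "]"


def sort_classes(pattern):
    """Sort the members of every character class in a pattern (hypothetical helper)."""
    return CLASS_RE.sub(sort_class, pattern)


print(sort_classes(r"[^PZVJG]{4}(.)[EFUG]{6}\1[^\sPZVJI]{2}"))
# prints [^GJPVZ]{4}(.)[EFGU]{6}\1[^IJPVZ\s]{2}
```

Sorting the members of a character class doesn't change what it matches, so the sorted rules are equivalent to the originals.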

OK, so let's set up our blank "board." I'm using dots (.) to denote a cell where we don't know the answer yet. At the start, we only have dots!

+---+-----------------------------+
|░░░| A B C D E F G H I J K L M N |
+---+-----------------------------+
| 0 | . . . . . . . . . . . . . . |
| 1 | . . . . . . . . . . . . . . |
| 2 | . . . . . . . . . . . . . . |
| 3 | . . . . . . . . . . . . . . |
| 4 | . . . . . . . . . . . . . . |
+---+-----------------------------+
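For anyone following along in code, the board could be modeled as a plain list of lists. This is just a sketch: the names blank_board, set_cell, and render are invented for this example and aren't part of the puzzle.

```python
import string

ROWS, COLS = 5, 14
COLUMN_LABELS = string.ascii_uppercase[:COLS]  # "ABCDEFGHIJKLMN"


def blank_board():
    """A 5 x 14 grid where every cell starts as an unknown dot."""
    return [["."] * COLS for _ in range(ROWS)]


def set_cell(board, name, value):
    """Write a value into a cell named like 'D0' (column letter, then row number)."""
    column = COLUMN_LABELS.index(name[0])
    row = int(name[1:])
    board[row][column] = value


def render(board):
    """Draw the board in the same ASCII style used in this write-up."""
    border = "+---+" + "-" * (2 * COLS + 1) + "+"
    lines = [border, "|░░░| " + " ".join(COLUMN_LABELS) + " |", border]
    for index, row in enumerate(board):
        lines.append(f"| {index} | " + " ".join(row) + " |")
    lines.append(border)
    return "\n".join(lines)


board = blank_board()
print(render(board))
```

Cells are named column-first (D0 is column D, row 0), matching the way they're referred to in the rest of the walkthrough.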

There's one more "cleanup" step. Looking at the board, we see that each row has a length of 14 and each column has a height of 5.

If we look closely, some of our regexes can be simplified, because some of their optional parts or alternatives would create a string whose length doesn't match the row or column.

These are the rules that can be cleaned:

  1. Regex B (([XYZ])(P|GO|EL)\1.) - If the char P was chosen in the 2nd captured group, then the length would be 4; we need a length of 5. So the P alternative can be removed, leaving ([XYZ])(GO|EL)\1. as regex B (the trailing dot is part of the regex).
  2. Regex M (X?W(\sE|OS|PE){2}) - In the X? part of the regex, we have an optional modifier on the X, so the X can appear 0 or 1 times. If it appeared once, then the whole string would have a length of 6; we need a length of 5. So the X? can be removed, and regex M becomes W(\sE|OS|PE){2}.
  3. Regex A (\sA?(SA|PE|N\s){2}) - Regex A is similar to regex M with the A? part: including the A would give a length of 6. So we can remove it, and A becomes \s(SA|PE|N\s){2}.
  4. Regex G (ES?F?(OBZ|UCO|PTE)) - The part that can be cleaned up is the S?F? section. The following strings match that pattern: '' (empty), S, F, and SF. If we had an empty string, the whole length would be 4 (we need 5). If we chose SF, the whole length would be 6. So the only valid choices are S or F. This means we can make the regex a bit clearer by replacing S?F? with the character class [FS]. So regex G becomes E[FS](OBZ|UCO|PTE).
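As a sanity check on rule 4, here is a small sketch that expands every way of taking or skipping the optional S and F, keeps only the strings of length 5, and confirms the cleaned regex accepts exactly those. The variable names are just for illustration.

```python
import re
from itertools import product

ORIGINAL_G = r"ES?F?(OBZ|UCO|PTE)"
CLEANED_G = r"E[FS](OBZ|UCO|PTE)"
COLUMN_HEIGHT = 5

# Every string the original regex G can produce: S and F are each optional.
candidates = [
    "E" + s + f + tail
    for s, f, tail in product(["", "S"], ["", "F"], ["OBZ", "UCO", "PTE"])
]
assert all(re.fullmatch(ORIGINAL_G, c) for c in candidates)

# Only strings of length 5 can fit in a column.
fits = [c for c in candidates if len(c) == COLUMN_HEIGHT]
print(fits)

# The cleaned regex should accept exactly the strings that fit.
accepted = [c for c in candidates if re.fullmatch(CLEANED_G, c)]
print(accepted == fits)  # prints True
```

The same expand-and-filter check works for regexes A, B, and M by swapping in their optional parts and alternatives.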

After our cleaning steps, our final rules look like:

# Character classes sorted and all rules "cleaned"

0: [^CFHJLMQVXZ]+
1: [^GJPVZ]{4}(.)[EFGU]{6}\1[^IJPVZ\s]{2}
2: [^BFPQ\s]{7}[^AGJMNVZ\s]+[^JMVZ]
3: N[ORSUY]{5}[CEIN]{6}\s\-
4: .A[ADL\s]{4}O[ACELV\s]+

A: \s(SA|PE|N\s){2}
B: ([XYZ])(GO|EL)\1.
C: (LS|CA|OS)[LP\s][DO]{2}
D: (U)(T)\2\1[ABO?]
E: (.)\1\1\1\s
F: (FF|BE|QU|OS){2}L
G: E[FS](OBZ|UCO|PTE)
H: S[MUV]B(TZ|BP|IV)
I: T[GLMV]{2}(E)\1
J: (JK|AE|EN|MG){2}L
K: N(XN|ZB|CA|FS){2}
L: [DHJX]R[IJMVZ]EC
M: W(\sE|OS|PE){2}
N: [IJMVZ]{3}\-\s

Everything should be set up the way we want it, so now we can start solving it!

Step 1 - Literals and Backreferences

In the RegExes, there are a decent number of simple literals. Let's fill those in.

Filled in:

A0, D0, G0, H0, I0, K0, M0,
D1, L1,
H2,
A3, I3, L3, M3, N3
B4, E4, F4, G4, J4, L4, N4

+---+-----------------------------+
|░░░| A B C D E F G H I J K L M N |
+---+-----------------------------+
| 0 |   . . U . . E S T . N . W . |
| 1 | . . . T . . . . . . . R . . |
| 2 | . . . . . . . B . . . . . . |
| 3 | N . . . . . . . E . . E   - |
| 4 | . A . .   L O . . L . C .   |
+---+-----------------------------+
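Finding the literals by eye works fine, but as a rough sketch, the simpler regexes can also be scanned mechanically. The code below only handles patterns built from plain characters, character classes, escapes, and exact counts like {5}; it deliberately gives up on groups, alternation, and ? or +. It also treats \s as a literal space and \- as a hyphen, which is an assumption that suits this puzzle. The names expand and fixed_cells are invented for this example.

```python
import re

# A token is a character class, an escape, or a plain character,
# optionally followed by an exact repeat count such as {5}.
TOKEN_RE = re.compile(r"(\[[^\]]+\]|\\.|[^\\\[\]{}()|?+*])(?:\{(\d+)\})?")
# Assumption for this puzzle: \s always means a space, \- a hyphen.
ESCAPED_LITERALS = {r"\s": " ", r"\-": "-"}


def expand(pattern):
    """Return one token per cell for a group-free, fixed-length pattern."""
    cells = []
    position = 0
    while position < len(pattern):
        match = TOKEN_RE.match(pattern, position)
        if match is None:
            raise ValueError(f"unsupported syntax at index {position}")
        cells.extend([match.group(1)] * int(match.group(2) or 1))
        position = match.end()
    return cells


def fixed_cells(pattern):
    """Map cell index -> character for every cell pinned to a single literal."""
    fixed = {}
    for index, token in enumerate(expand(pattern)):
        if token in ESCAPED_LITERALS:
            fixed[index] = ESCAPED_LITERALS[token]
        elif len(token) == 1 and token != ".":
            fixed[index] = token
    return fixed


print(fixed_cells(r"N[ORSUY]{5}[CEIN]{6}\s\-"))  # prints {0: 'N', 12: ' ', 13: '-'}
print(fixed_cells(r"[DHJX]R[IJMVZ]EC"))          # prints {1: 'R', 3: 'E', 4: 'C'}
```

For row 3, indexes 0, 12, and 13 are columns A, M, and N, which lines up with A3, M3, and N3 on the board. For column L, indexes 1, 3, and 4 are L1, L3, and L4.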

Some of those literals are referenced by backreferences. Let's fill those in where we can.

Filled in:

D2,
D3,
I4

+---+-----------------------------+
|░░░| A B C D E F G H I J K L M N |
+---+-----------------------------+
| 0 |   . . U . . E S T . N . W . |
| 1 | . . . T . . . . . . . R . . |
| 2 | . . . T . . . B . . . . . . |
| 3 | N . . U . . . . E . . E   - |
| 4 | . A . .   L O . E L . C .   |
+---+-----------------------------+

There is another backreference at L1 (the \1 in regex 1), which references the captured group at E1. We already know L1 is R, so E1 must be R too. Let's fill that in.

Filled in:

E1

+---+-----------------------------+
|░░░| A B C D E F G H I J K L M N |
+---+-----------------------------+
| 0 |   . . U . . E S T . N . W . |
| 1 | . . . T R . . . . . . R . . |
| 2 | . . . T . . . B . . . . . . |
| 3 | N . . U . . . . E . . E   - |
| 4 | . A . .   L O . E L . C .   |
+---+-----------------------------+
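To see why L1 forces E1, here is a small sketch that tries every letter (and a space) in E1 while holding L1 fixed. The other cells in the row are padded with filler letters that are only there to satisfy regex 1's character classes; they are not answers to the puzzle. The function name e1_candidates is made up for this example.

```python
import re
import string

REGEX_1 = r"[^GJPVZ]{4}(.)[EFGU]{6}\1[^IJPVZ\s]{2}"


def e1_candidates(l1_value):
    """Return every character that E1 could hold, given the value in L1."""
    matches = []
    for candidate in string.ascii_uppercase + " ":
        # Filler for the unknown cells; only columns E (index 4) and L (index 11) matter.
        row = "AAAT" + candidate + "EEEEEE" + l1_value + "AA"
        if re.fullmatch(REGEX_1, row):
            matches.append(candidate)
    return matches


print(e1_candidates("R"))  # prints ['R']
```

Because \1 must repeat whatever (.) captured, the only value of E1 that survives is the one already sitting in L1.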

OK, that should be it for all the obvious answers. Next we have to start getting a little clever. For each of our characters so far, we only used one rule to get the letter. For example, for D0, we used the regex from rule D. Let's make a list of where the answers were derived from, and then use that to "inspect" the other linked rule. So again from our example, the cell D0 used regex D to determine its state, so let's inspect rule 0 to see if that cell helps us with any part of that rule.

  • A0 - derived from regex A → look at regex 0
  • D0 - derived from regex D → look at regex 0
  • G0 - derived from regex G → look at regex 0
  • H0 - derived from regex H → look at regex 0
  • I0 - derived from regex I → look at regex 0
  • K0 - derived from regex K → look at regex 0
  • M0 - derived from regex M → look at regex 0
  • D1 - derived from regex D → look at regex 1
  • E1 - derived from regex 1 → look at regex E
  • L1 - derived from regex L → look at regex 1
  • D2 - derived from regex D → look at regex 2
  • H2 - derived from regex H → look at regex 2
  • A3 - derived from regex 3 → look at regex A
  • D3 - derived from regex D → look at regex 3
  • I3 - derived from regex I → look at regex 3
  • L3 - derived from regex L → look at regex 3
  • M3 - derived from regex 3 → look at regex M
  • N3 - derived from regex N → look at regex 3
  • B4 - derived from regex 4 → look at regex B
  • E4 - derived from regex E → look at regex 4
  • F4 - derived from regex F → look at regex 4
  • G4 - derived from regex 4 → look at regex G
  • J4 - derived from regex J → look at regex 4
  • L4 - derived from regex L → look at regex 4
  • N4 - derived from regex N → look at regex 4
  • I4 - derived from regex I → look at regex 4
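The bookkeeping in this list is mechanical: every cell sits on exactly one column rule and one row rule, so whichever one we used, the other is the one to inspect. A tiny sketch, with crossing_rule being a made-up helper name:

```python
def crossing_rule(cell, derived_from):
    """Given a cell like 'D0' and the rule that produced it, return the other rule."""
    column, row = cell[0], cell[1:]
    if derived_from == column:
        return row
    if derived_from == row:
        return column
    raise ValueError(f"rule {derived_from} does not pass through cell {cell}")


print(crossing_rule("D0", "D"))  # prints 0
print(crossing_rule("E1", "1"))  # prints E
```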

Looking at regex 0, that isn't going to be a big help just yet:

0: [^CFHJLMQVXZ]+

So we can skip that. D1 (value of T) also doesn't help with regex 1; it merely fits in the character class [^GJPVZ]{4}.

E1 (value of R) is the first one that helps us. We used regex 1, so let's look at regex E:

E: (.)\1\1\1\s

OK, so since the second char is an R, and it was a backreference to the first char which was a dot, that means the first char is an R. Also, there are two more backreferences to that first char (chars 3 and 4), so those are also Rs.

So we can get E0, E2, and E3 to give:

Filled in:

E0, E2, E3

+---+-----------------------------+
|░░░| A B C D E F G H I J K L M N |
+---+-----------------------------+
| 0 |   . . U R . E S T . N . W . |
| 1 | . . . T R . . . . . . R . . |
| 2 | . . . T R . . B . . . . . . |
| 3 | N . . U R . . . E . . E   - |
| 4 | . A . .   L O . E L . C .   |
+---+-----------------------------+
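Since a column is only 5 cells tall, this kind of deduction can also be brute-forced. The sketch below tries every combination for the unknown cells of column E and keeps the strings that match regex E. The alphabet (uppercase letters, space, hyphen) is an assumption about which characters can appear in the grid, and solve_column is a name invented for this example.

```python
import re
from itertools import product

REGEX_E = re.compile(r"(.)\1\1\1\s")
# Assumption: cells hold only uppercase letters, a space, or a hyphen.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ -"


def solve_column(pattern, known, height=5):
    """Return every string of the given height that matches and agrees with known cells."""
    # A known cell has one choice; an unknown cell can be anything in the alphabet.
    choices = [known.get(row, ALPHABET) for row in range(height)]
    solutions = []
    for cells in product(*choices):
        text = "".join(cells)
        if pattern.fullmatch(text):
            solutions.append(text)
    return solutions


# E1 is R (from regex 1) and E4 is a space (from the \s in regex E).
print(solve_column(REGEX_E, {1: "R", 4: " "}))  # prints ['RRRR ']
```

This approach is fine for columns, but it won't scale to the 14-cell rows without more known cells to narrow things down.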

Moving on, L1 already helped with the one backreference (for cell E1). D2 and H2 also don't tell us anything about regex 2.
