Finding Word Squares
I completed a Ruby challenge to find word squares. A (perfect) word square is formed from a group words such that if you arrange them in a square, the same words can be read vertically and horizontally. I used a trie and recursion in my solution.
One of goals of the challenge is to find a 7 x 7 square in under a minute. My solution found a 7 x 7 in five seconds (as measured by RSpec) and an 8 x 8 in 123 minutes. I ran these tests on a 2009 MacBook Pro with macOS Mavericks.
Assumptions
The challenge didn’t specify perfect word squares, because it is possible to find squares where the word in the first row is not the same as the word in the first column. I limited my scope to perfect squares only.
Another assumption I made is to ignore capitalization. Upcase provided a list of 235,886 words that contained proper nouns such as ‘Aaronite’. I converted all letters to lowercase.
Classes
I created two classes: WordReader
and WileySquare
. WordReader is responsible for reading a list of words and returning only those that have a particular size (length).
WileySquare contains the algorithm to find a perfect word square. It uses the trie data structure, courtesy of the algorithms gem.
Algorithm
First, the algorithm populates the trie with the list of words that have length N. The algorithm then loops through each word and checks if it can be used to form an N x N word square. If a square is found, that square is returned and the algorithm ends. The steps described above are capsulated in the method find_square
below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
def find_square
word_list = WordReader.read_words_with_size(@file_name, @word_size)
if word_list.nil? or word_list.empty?
return EMPTY_LIST
elsif word_list.size < @word_size
return NOT_ENOUGH
end
@word_list = word_list
populate_trie()
@word_list.each do |word|
a_wiley_square = construct_square_with(word)
if a_wiley_square.size < @word_size
next
else
return a_wiley_square
end
end
return NO_SQUARE
end
The method construct_square_with(word)
uses the word
to an array with a single element. It then calls the helper method _construct_square_with(words_array, position)
that takes an array of words and a position value. The position value is an positive integer than represents which row or column we need to fill. For example, the second word in a 4 x 4 is in position 2, the third word is in position 3, and the last word is in position 4.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
def _construct_square_with(words_array, position)
possible_words = find_words_for_position(words_array, position)
if position == @word_size and !possible_words.empty?
return words_array.concat([possible_words.first]) # we found a square!
elsif possible_words.empty?
return words_array # return an incomplete square
else
possible_words.each do |p_word|
temp_array = words_array.dup.concat([p_word])
if check_trie(temp_array, position)
candidate_square = _construct_square_with(temp_array, position + 1)
return candidate_square if candidate_square.size == @word_size
end
next
end
end
return words_array # return an incomplete square
end
Line 3 of _construct_square_with
utilizes the trie by calling the method find_words_for_position
. This method finds words (trie keys) that are compatible with the words in the words_array
. As an example, finding candidate words for the third position requires the trie to search for keys that start with the 3rd character from each of first and second words.
Once we have candidates words for ith position, we check if this position is the last one we need. If so, return the square formed by adding the first candidate word to the existing array of words (lines 4–5 above). If there are no candidate words, we return an incomplete square (lines 6–7)
If the ith position is not the last one, lines 9–16 continues the search for the square by looping through the candidate words and (recursively) calling _construct_square_with
. If no square is found, we return the original words_array
parameter (line 19).
On line 11, I included a check for the viability of the candidate word. The method check_trie
is similar to find_words_for_position
, in that it uses the trie to find keys (words). The difference is that it searches for words for the remaining positions given the word array and a candidate word.
For example, if the algorithm started with CARD, and a candidate word for position 2 is AREA, then we need to check the trie for words that begin with RE (for the third position) and DA (fourth position) for a 4 x 4 square. If the trie cannot find possible keys, then the algorithm proceeds to the next candidate word.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# example
# C A R D
# A R E A => candidate word for position 2
# R E * * => check existence of words that start with R E
# D A * * => check existence of words that start with D A
def check_trie(words, position)
wildcards = "*" * (@word_size - position)
(position...@word_size).each do |c|
# get the chars from each word for each position, concat them, and test the trie
prefix = ""
words.each do |word|
prefix.concat(word[c])
end
return false if @trie.wildcard("#{prefix}#{wildcards}").empty?
end
return true
end
Observations
The algorithm goes through the list from WordReader sequentially, and this list is arranged alphabetically. For word squares with size less than 8, the algorithm was able to find a square within the first 5 words. For the 8 x 8, a solution was found using the 165th word as the starting word.
The gem’s implementation of tries allows for storing a [key, value] pair on each node. In my solution, I simply stored the word as the value.