Finding Word Squares

I completed a Ruby challenge to find word squares. A (perfect) word square is formed from a group words such that if you arrange them in a square, the same words can be read vertically and horizontally. I used a trie and recursion in my solution.

One of goals of the challenge is to find a 7 x 7 square in under a minute. My solution found a 7 x 7 in five seconds (as measured by RSpec) and an 8 x 8 in 123 minutes. I ran these tests on a 2009 MacBook Pro with macOS Mavericks.

Assumptions

The challenge didn’t specify perfect word squares, because it is possible to find squares where the word in the first row is not the same as the word in the first column. I limited my scope to perfect squares only.

Another assumption I made is to ignore capitalization. Upcase provided a list of 235,886 words that contained proper nouns such as ‘Aaronite’. I converted all letters to lowercase.

Classes

I created two classes: WordReader and WileySquare. WordReader is responsible for reading a list of words and returning only those that have a particular size (length).

WileySquare contains the algorithm to find a perfect word square. It uses the trie data structure, courtesy of the algorithms gem.

Algorithm

First, the algorithm populates the trie with the list of words that have length N. The algorithm then loops through each word and checks if it can be used to form an N x N word square. If a square is found, that square is returned and the algorithm ends. The steps described above are capsulated in the method find_square below:

def find_square
  word_list = WordReader.read_words_with_size(@file_name, @word_size)
  
  if word_list.nil? or word_list.empty?
    return EMPTY_LIST
  elsif word_list.size < @word_size
    return NOT_ENOUGH
  end
  
  @word_list = word_list
  populate_trie()
  
  @word_list.each do |word|
    a_wiley_square =  construct_square_with(word) 
    if a_wiley_square.size < @word_size
      next
    else
      return a_wiley_square
    end
  end
  
  return NO_SQUARE
end

The method construct_square_with(word) uses the word to an array with a single element. It then calls the helper method _construct_square_with(words_array, position) that takes an array of words and a position value. The position value is an positive integer than represents which row or column we need to fill. For example, the second word in a 4 x 4 is in position 2, the third word is in position 3, and the last word is in position 4.

def _construct_square_with(words_array, position)
  possible_words = find_words_for_position(words_array, position)
  
  if position == @word_size and !possible_words.empty?
    return words_array.concat([possible_words.first]) # we found a square!
  elsif possible_words.empty?
    return words_array # return an incomplete square
  else
    possible_words.each do |p_word|
      temp_array = words_array.dup.concat([p_word])          
      if check_trie(temp_array, position)
        candidate_square = _construct_square_with(temp_array, position + 1) 
        return candidate_square if candidate_square.size == @word_size
      end
      next
    end 
  end
  
  return words_array # return an incomplete square
end

Line 3 of _construct_square_with utilizes the trie by calling the method find_words_for_position. This method finds words (trie keys) that are compatible with the words in the words_array. As an example, finding candidate words for the third position requires the trie to search for keys that start with the 3rd character from each of first and second words.

Once we have candidates words for ith position, we check if this position is the last one we need. If so, return the square formed by adding the first candidate word to the existing array of words (lines 4–5 above). If there are no candidate words, we return an incomplete square (lines 6–7)

If the ith position is not the last one, lines 9–16 continues the search for the square by looping through the candidate words and (recursively) calling _construct_square_with. If no square is found, we return the original words_array parameter (line 19).

On line 11, I included a check for the viability of the candidate word. The method check_trie is similar to find_words_for_position, in that it uses the trie to find keys (words). The difference is that it searches for words for the remaining positions given the word array and a candidate word.

For example, if the algorithm started with CARD, and a candidate word for position 2 is AREA, then we need to check the trie for words that begin with RE (for the third position) and DA (fourth position) for a 4 x 4 square. If the trie cannot find possible keys, then the algorithm proceeds to the next candidate word.

# example
# C  A  R  D
# A  R  E  A => candidate word for position 2
# R  E  *  * => check existence of words that start with R E
# D  A  *  * => check existence of words that start with D A

def check_trie(words, position)
  wildcards = "*" * (@word_size - position)
  
  (position...@word_size).each do |c| 
    # get the chars from each word for each position, concat them, and test the trie
    prefix = ""
    words.each do |word|
      prefix.concat(word[c])
    end
    return false if @trie.wildcard("#{prefix}#{wildcards}").empty?
  end
  
  return true
end

Observations

The algorithm goes through the list from WordReader sequentially, and this list is arranged alphabetically. For word squares with size less than 8, the algorithm was able to find a square within the first 5 words. For the 8 x 8, a solution was found using the 165th word as the starting word.

The gem’s implementation of tries allows for storing a [key, value] pair on each node. In my solution, I simply stored the word as the value.