Bradley Harris

Lesson Learned - Two Character Strings

I needed to generate a list of all possible combinations two capital letters for some test data. The straightforward way to do that is with nested loops but I was sure there was a snappy one liner to do it with a list comprehension. Here is what I came up with in Python:

from itertools import product
from string import ascii_uppercase

[a+b for a, b in list(product(ascii_uppercase, ascii_uppercase))]

Got the job done but the need to dig into itertools to make it happen had me wondering. Would this be simpler in Ruby?

chars = ('A'..'Z').to_a
chars.product(chars).collect { |x, y| x + y }

The syntax differs but the concepts used are really very similar. I think Ruby tends to encourage this sort of thing a little more by including the “.product” method on the Array class. Both languages provide the same basic tools for the job though.

Things get a little more interesting when generating all of the 3 character strings.

Python:

[a+b+c for a, b, c in list(product(ascii_uppercase, ascii_uppercase, ascii_uppercase))]

Ruby:

chars.product(chars, chars).collect { |x, y, z| x + y + z }

Now the stand alone nature of “product” in Python makes things read more clearly. At first it wasn’t even obvious to me that the Ruby “.product” method could take N arguments until I took another look at the documentation.

To drag this out to its nerdy conclusion: the problem of generating all N character strings of capital letters is fundamentally 26N. Those magical “product” methods that Python and Ruby give us are implemented as some version of those nested loops I wanted to avoid. For the 2 or 3 character versions I needed for my tests, there is no real harm. I started wondering though: What if I needed 20 character strings?

("%0+20s" % index.to_s(26)).gsub(" ", "0").split("").collect {|x| chars[x.to_i(26)]}.join

This snippet will return the string at “index” from the list of all 20 character strings without incurring the cost to generate the whole list.

Step by step:

("%0+20s" % index.to_s(26)).gsub(" ", "0")

Convert the index to a Base 26 string representation and pad the converted number with zeros. Kernel::sprintf can zero pad integer values but not Base 26 numbers so we space pad it as a string with spaces and substitute zeros for the spaces.

.split("")

Split converted and padded string into an array of characters. Each item in that list will represent the Base 26 index of a letter in the final string.

.collect {|x| chars[x.to_i(26)]}.join

Convert the Base 26 indices back to integer indices and look up the proper character. Finally join those characters back into a string.