Permutations of a string: Dynamic Programming with Python
Permutations of a string
We want to compute all the permutations of a given string. The idea:
for every char in a string: permutations = char + permutations(rest of the string)
e.g. given the string ‘abc’:
a + perm('bc') = abc, acb b + perm('ac') = bac, bca c + perm('ab') = cab, cba
This translate into the following code (could be prettier if strings were mutable in python, but still)
def permute(s): res = [] if len(s) == 1: res = [s] else: for pos, ch in enumerate(s): for perm in permute(s[:pos] + s[pos+1:]): res += [ch + perm] return res
The problem with this snippet is that obviously its running time is N!
. We can do better than this using Dynamic Programming.
Dynamic Programming
The idea behind DP is to cache the results of previous computations, usually in a hash table for fast lookups. We can use this to speed up the previous code:
def perm(s): res = [] if len(s) == 1: res = [s] return [s] if s in cache: return cache[s] else: for i, c in enumerate(s): for p in perm(s[:i] + s[i+1:]): res += 1 cache[s] = res return res
The base case stays the same. Before the recursive call, we check if the argument of perm()
is already been computed; in this case we return the cached results; otherwise we compute it and we add the result to the cache.
A peek inside the recursion tree
It is interesting to look inside the recursion tree and see exactly what is being cached and when. The recursion tree for perm('abcd')
is the following:
b + p(cd) a + p(bcd) c + p(bd) d + p(bc) a + p(cd) [cache hit] b + p(acd) c + p(ad) d + p(ac) a + p(bd) [cache hit] c + p(abd) b + p(ad) [cache hit] d + p(ab) a + p(bc) [cache hit] d + p(abcd) b + p(ac) [cache hit] c + p(ab) [cache hit]
And if we add a couple of print
to the code we can see what is been cached as well as the cache hits in action:
cd ['cd', 'dc'] bd ['bd', 'db'] bc ['bc', 'cb'] bcd ['bcd', 'bdc', 'cbd', 'cdb', 'dbc', 'dcb'] CACHE HIT cd -> ['cd', 'dc'] ad ['ad', 'da'] ac ['ac', 'ca'] acd ['acd', 'adc', 'cad', 'cda', 'dac', 'dca'] CACHE HIT bd -> ['bd', 'db'] CACHE HIT ad -> ['ad', 'da'] ab ['ab', 'ba'] abd ['abd', 'adb', 'bad', 'bda', 'dab', 'dba'] CACHE HIT bc -> ['bc', 'cb'] CACHE HIT ac -> ['ac', 'ca'] CACHE HIT ab -> ['ab', 'ba'] abc ['abc', 'acb', 'bac', 'bca', 'cab', 'cba'] abcd ['abcd', 'abdc', 'acbd', 'acdb', 'adbc', 'adcb', 'bacd', 'badc', 'bcad', 'bcda', 'bdac', 'bdca', 'cabd', 'cadb', 'cbad', 'cbda', 'cdab', 'cdba', 'dabc', 'dacb', 'dbac', 'dbca', 'dcab', 'dcba']
The code used is simply:
def perm(s): res = [] if len(s) == 1: res = [s] return [s] if s in cache: # cache hit print 'CACHE HIT ' + str(s) + ' -> ' + str(cache[s]) return cache[s] else: for i, c in enumerate(s): for p in perm(s[:i] + s[i+1:]): res += 1 # last cached permutation print s, res cache[s] = res return res
Dynamic Programming using @decorators
An elegant way to implement DP in Python is using a decorator.
def memo(fn): cache = {} miss = object() def wrapper(*args): result = cache.get(args, miss) if result is miss: result = fn(*args) cache[args] = result return result return wrapper
Which checks if the args provided to fn are cached in a dictionary; if they are, the result of fn(*args)
is returned; if they are not, fn(*args)
is executed and its result cached for future use.
Using a decorator is such a good idea that Python3 provides a built-in module for that, @lru_cache.