Thursday, November 20, 2014

Find if a Python string is an anagram of a palindrome

By Vasudev Ram

I saw this interesting thread on Hacker News some 10-odd days ago:

HN: Rust and Go

Apart from being generally of interest, it had a sub-thread that was about finding if a given string is an anagram of a palindrome. A few people replied in the thread, giving solutions in different languages, such as Scala, JavaScript, Go and Python.

Some of the Python solutions were already optimized to some extent (e.g. using collections.Counter and functools.partial - it was a thread about the merits of programming languages, after all), so I decided to write one or two simple or naive solutions instead, and then see if those could be optimized some, maybe differently from the solutions in the HN thread.

Here is one such simple solution to the problem, of finding out if a string is an anagram of a palindrome. I've named it iaop_01.py (for Is Anagram Of Palindrome, version 01). The solution includes a scramble() function, to make an anagram of a palindrome, so that we have input for the test, and a main function to run the rest of the code to exercise things, for both the case when the string is an anagram of a palindrome, and when it is not.

The logic I've used is this (in pseudocode, even though Python is executable pseudocode, ha ha):


For each character c in the string s:
        If c already occurs as a key in dict char_counts,
        increment its count (the value corresponding to the key),
        else set its count to 1.
    After the loop, the char_counts dict will contain the counts 
    of all the characters in the string, keyed by character.
    Then we check how many of those counts are odd.
    If at most one count is odd, the string is an anagram of 
    a palindrome, else not.


And here is the Python code for iaop_01.py:

"""
Program to find out whether a string is an anagram of a palindrome.
Based on the question posed in this Hacker News thread:
    https://news.ycombinator.com/item?id=8575589
"""

from random import shuffle

def anagram_of_palindrome(s):
    char_counts = {}
    for c in s:
        char_counts[c] = char_counts.get(c, 0) + 1
    odd_counts = 0
    for v in char_counts.values():
        if v % 2 == 1:
            odd_counts += 1
    return odd_counts <= 1

def scramble(s):
    lis = [ c for c in s ]
    shuffle(lis)
    return ''.join(lis)

def main():
    # First, test with a list of strings which are anagrams of palindromes.
    aops = ['a', 'bb', 'cdc', 'foof', 'madamimadam', 'ablewasiereisawelba']
    for s in aops:
        s2 = scramble(s)
        print "{} is an anagram of palindrome ({}): {}".format(s2, \
            s, anagram_of_palindrome(s2))
    print
    # Next, test with a list of strings which are not anagrams of palindromes.
    not_aops = ['ab', 'bc', 'cde', 'fool', 'padamimadam']
    for s in not_aops:
        s2 = scramble(s)
        print "{} is an anagram of a palindrome: {}".format(s2, \
            anagram_of_palindrome(s2))

main()
And here is the output of running it:
$ python iaop_01.py
a is an anagram of palindrome (a): True
bb is an anagram of palindrome (bb): True
ccd is an anagram of palindrome (cdc): True
ffoo is an anagram of palindrome (foof): True
daadmamimma is an anagram of palindrome (madamimadam): True
srewaeaawbeilebials is an anagram of palindrome (ablewasiereisawelba): True

ba is an anagram of a palindrome: False
bc is an anagram of a palindrome: False
dec is an anagram of a palindrome: False
loof is an anagram of a palindrome: False
ampdaaiammd is an anagram of a palindrome: False

One simple optimization that can be made is to add these two lines:
if odd_counts > 1:
    return False

just after the line "odd_count += 1". What that does is stop early if it finds that the number of characters with odd counts is greater than 1, even if there are many more counts to be checked, since our rule has been satisfied.

If I think up more optimizations to the above solution, or any alternative solutions, I'll show them in a future post. Update: Since it is on a related topic, you may also like to check out this other post I wrote a while ago: A simple text file indexing program in Python.

BTW, the two longer palindromes are lower-cased, scrunched-together versions of these well-known palindromes:

Madam, I'm Adam.

Able was I ere I saw Elba

(attributed to Napoleon).

- Vasudev Ram - Dancing Bison Enterprises

Signup for news about products from me.

Contact Page

2 comments:

martineh said...

def anagram_of_palindrome(s):
s1 = set()
for c in s:
if c in s1:
s1.remove(c)
else:
s1.add(c)
return len(s1) < 2

Vasudev Ram said...

@martineh: That's a clever solution.

On first look I thought it would not work, because it was adding and removing (different occurrences of) the same character from the set, but I ran it, and it did work.

Then checked the code again, and figured out how it works, and I'm mentioning it for the benefit of any readers who didn't figure it out: for characters with an even count, at the end of the loop, they will not be in the set (because of an even number of adds and removes); only characters with an odd count will be in the set (once only, due to the last add of that character); the return statement takes care of the rest. And to understand the logic we have to remember that a set can contain only unique items.

I was going to use a set in a later optimization, but not sure I would have thought of your solution.

Excellent, and thanks for commenting :)