Advent of Code day 2 [spoilers]

Advent of Code Day 2 asks us to do some string searches. It took me about 40 minutes, and I can't say I'm particularly happy with the results, but it worked.

Part 1 asks us to find the number of words in the input with exactly two of some letter, or exactly three of some letter.

I decided this would be easiest if we sorted the letters in each word. (In Python I'd probably just use a dictionary and count, tbh.) I started out with a recursive solution trying to pattern match; my idea was:

[] -> false
xxyZ -> true
xxxZ -> recursively solve Z
xyZ -> recursively solve yZ

Unfortunately the case where there are three matching letters at the start is buggy, Z might start with another couple of the same letter.

Also, it's more idomatic in Haskell to use a higher-order function instead of writing the recursion yourself. I came up with a way to use foldl to handle one letter at a time, and build up the frequency table in reverse order. I don't think that lazy evaluation prevents us from processing the whole word before looking for frequency 2 or frequency 3, though. If the output list were in-order, this would work but is accessing the last element of the list fast? Or would reversing the list help? I don't know.

countLetter :: [(Char,Int)] -> Char -> [(Char,Int)]
countLetter [] c = [(c,1)]
countLetter l@((c,n):rest) d
 | c == d = [(c,n+1)] ++ rest
 | otherwise = [(d,1)] ++ l

countLetters :: String -> [(Char,Int)]
countLetters s = foldl countLetter [] ( sort s )

exactly :: Int -> (Char,Int) -> Bool
exactly n (_,k) = (n == k)

exactlyTwoOfSomeLetter :: String -> Bool
exactlyTwoOfSomeLetter s = any (exactly 2) ( countLetters s ) 
  
exactlyThreeOfSomeLetter :: String -> Bool
exactlyThreeOfSomeLetter s = any (exactly 3) ( countLetters s ) 

checksum :: [String] -> Int
checksum l = (length (filter exactlyTwoOfSomeLetter l)) * (length (filter exactlyThreeOfSomeLetter l))

Part 2 asks us to find two words in the list that differ by just one letter (in a particular position.) I think there are ways to solve this that aren't O(N^2) but I didn't believe any of them were going to be easy, or necessary.

I wrote a recursive test for differing by one character:

differsByOne :: String -> String -> Bool
differsByOne [] [] = False
differsByOne (a:b) (c:d)
  | a/=c = (b==d)
  | otherwise = differsByOne b d

Maybe it would be better to zip the two words and then check the number of mismatches? But this lets us use a whole-string compare on the tail which might be more efficient if the words were large. They were 26 letters each, so I didn't need to worry about mismatched lengths.

The rest of the part 2 solution is just gross:

findMatch :: String -> [String] -> Maybe String
findMatch x l =
  case filter (differsByOne x) l of
    [] -> Nothing
    (a:b) -> Just a

distance1 :: [String] -> Maybe (String, String)
distance1 [] = Nothing
distance1 (x:s) =
  case findMatch x s of
    Nothing -> distance1 s
    Just y -> Just (x, y)

I looked for an equivalent to Python's itertools.combinations( list, 2 ) which would be a better way to get all pairs. Strangely, this does not seem to be part of the standard library. There are various horrific-looking pieces of code here: https://stackoverflow.com/questions/21265454/subsequences-of-length-n-from-list-performance/21288092#21288092 and here: https://stackoverflow.com/questions/21775378/get-all-possible-combinations-of-k-elements-from-a-list when I was really expecting to see "just use X."

This Haskell package is closest to what I was expecting to find: http://hackage.haskell.org/package/permutation-0.5.0.5/docs/Data-Choose.html, but it works only on integers and so I'd still have to index into the original list.

RosettaCode gives this one-liner:

import Data.List (sort, subsequences)
comb m n = sort . filter ((==m) . length) $ subsequences [0..n-1]

which doesn't seem like a very efficient way to go about it, but it's hard for me to tell. I'm kind of boggled that there isn't a standard way of doing this common combinatorial operation.

Full solution: https://github.com/mgritter/aoc2018/blob/master/day2/day2.hs

Things I learned today:

Not-equals is /= instead of !=
fold comes in at least 4 varieties and I understand the difference between "left" and "right" but not the primed and unprimed versions
No "all K-combinations" function?!?!