Visit the hamming exercise on Exercism to read the full instructions and download the exercise files.
Dig Deeper
Recursion
Recursion
module Hamming
let distance (strand1: string) (strand2: string): int option =
let rec doDistance (letters1: char list) (letters2: char list) (distance: int): int option =
match letters1, letters2 with
| [], [] -> Some distance
| [], _ -> None
| _, [] -> None
| head1 :: tail1, head2 :: tail2 when head1 <> head2 -> doDistance tail1 tail2 (distance + 1)
| _ :: tail1, _ :: tail2 -> doDistance tail1 tail2 distance
doDistance (Seq.toList strand1) (Seq.toList strand2) 0
To use (tail call) recursion to calculate the distance, we’ll introduce a helper function: doDistance.
We define this function within the distance function (also known as a nested function), but it could just as well have been defined outside the distance function.
let rec doDistance (letters1: char list) (letters2: char list) (distance: int): int option
This function takes the remaining letters for both strands as a char list, which means that we’ll be able to pattern match on it.
Besides these two lists, we’ll also take an accumulator parameter: distance, of type int.
This parameter represents the current distance and is updated between the recursive function calls until we’re done processing, at which point it will represent the total distance.
To allow a function to recursively call itself, the `rec` modified must be added.
In other words: by default, functions cannot call themselves.
Within this function, we pattern match on both letter lists at the same time, using:
match letters1, letters2 with
We first check to see if both lists are empty, which means that the strands must have had the same length.
Therefore, everything is fine and we can return the distance by wrapping it in Some:
| [], [] -> Some distance
The next two cases check if either of the lists is empty.
As we previously checked if both lists were empty, one of the lists being empty means that the other one isn’t.
This is turn implies that the strands were of different lengths, so we’ll return None:
| [], _ -> None
| _, [] -> None
At this point, we know that both lists are not empty, so we can use pattern matching to check if the first character in both lists is unequal.
If so, we’ll recursively call our function but with the tails of both lists, and the distance increment by one (as the character were unequal):
| head1 :: tail1, head2 :: tail2 when head1 <> head2 -> doDistance tail1 tail2 (distance + 1)
The final pattern is one where the lists are not empty and the first characters are both equal.
In this case, we can recursively call our function with the tails of both lists, keeping the same distance:
| _ :: tail1, _ :: tail2 -> doDistance tail1 tail2 distance
Calling the recursive helper function
The final step is to call our recursive helper function.
We’ll use Seq.toList to convert the string to char lists, and pass in an initial distance of 0:
doDistance (Seq.toList strand1) (Seq.toList strand2) 0
And with that, we have a working, tail recursive implementation!
Tail call recursion prevents stack overflows when a recursive function is called many times.
While the exercise does not have large test cases that would cause a stack overflow, it is good practice to always use using tail recursion when implementing a recursive functions.
If you'd like to read more about tail recursion, [this MSDN article](https://blogs.msdn.microsoft.com/fsharpteam/2011/07/08/tail-calls-in-f/) goes into more detail.
Another good resource on tail recursion is [this blog post](http://blog.ploeh.dk/2015/12/22/tail-recurse/).
Zip
Zip
module Hamming
let distance (strand1: string) (strand2: string): int option =
if strand1.Length <> strand2.Length then
None
else
Seq.zip strand1 strand2
|> Seq.filter (fun (letter1, letter2) -> letter1 <> letter2)
|> Seq.length
|> Some
Error path
We start by checking if the strings have unequal lengths, and return None if so:
if strand1.Length <> strand2.Length then
None
Note that we're using `string` class' `Length` property, not a function like `Seq.length`.
Even though F# is a functional-first language, you'll use types (like the `string` class) defined in the .NET framework, which is an object-oriented framework.
Inevitably, you'll thus use objects that have methods and properties defined on them.
Don't worry about using methods and objects though, F# is a multi-paradigm language and embraces the interaction with object-oriented code (like the `string` class).
Happy path
In the happy path, we know that the strings have the same length.
We’re using this in our call to Seq.zip:
Seq.zip strand1 strand2
What Seq.zip does is it takes two sequences and returns a sequence of (tuple) pairs, with the first pair containing the first element of the first sequence and the first element of the second sequence, and so on.
We can use this to create letter pairs, as a string can be treated like a char sequence:
Seq.zip "GAG" "CAT"
is equivalent to:
seq { ('G', 'C'); ('A', 'A'); ('G', 'T') }
The next step is to select only those pairs where the letters are different, we se do by piping the zipped sequence to Seq.filter:
|> Seq.filter (fun (letter1, letter2) -> letter1 <> letter2)
With that, we now have a sequence of pairs with different letters.
As the hamming distance is the number of different letter pairs, we can calculate the distance by counting the letter pairs using Seq.length and wrapping the count in a Some value:
|> Seq.length
|> Some
List comprehension
List comprehension
module Hamming
let distance (strand1: string) (strand2: string) : int option =
if strand1.Length <> strand2.Length
then None
else
[ for idx in 0 .. strand1.Length - 1 do
if strand1[idx] <> strand2[idx] then yield 1 else yield 0 ]
|> List.sum
|> Some
Error path
We start by checking if the strings have unequal lengths, and return None if so:
if strand1.Length <> strand2.Length
then None
Note that we're using `string` class' `Length` property, not a function like `Seq.length`.
Even though F# is a functional-first language, you'll use types (like the `string` class) defined in the .NET framework, which is an object-oriented framework.
Inevitably, you'll thus use objects that have methods and properties defined on them.
Don't worry about using methods and objects though, F# is a multi-paradigm language and embraces the interaction with object-oriented code (like the `string` class).
Happy path
In the happy path, we know that the strings have the same length so we can use the length (minus one) of the first string as the max of a range of indices to use to access each char of both strings and compare them:
for idx in 0 .. strand1.Length - 1 do
The entire for expression is surrounded by square brackets ([]) indicating that this is a List comprehension.
This gives you the power of returning intermediate results based on comparing each pair of chars (returning a 1 when they differ or a 0 if they don’t) and then continuing the next pair until you reach the end:
if strand1[idx] <> strand2[idx] then yield 1 else yield 0
The yield keyword indicates that this concerns an intermediate result, this can also be used in C#.
The resulting list of 1’s and 0’s is then piped into a List.sum to get the hamming distance and finally the result is wrapped in a Some.
|> List.sum
|> Some
Source: Exercism fsharp/hamming