Archive

Archive for the ‘Haskell’ Category

Prime Factorization using Unfold in Haskell

March 17th, 2009
Comments Off

I randomly yesterday started thinking about the unfoldr function in Haskell while working out at the gym (how nerdy is that, I am lifting iron but thinking of functional programming). Unfoldr take a single and an unfolding function and turns it into a list (the opposite of fold).  At the gym I was thinking about an application where I can use this and I decided that when I got home I would use it to write a prime factorization function.  This is a method that when given a number returns the list of its prime factors.

It was easy to write the only part I am not pleased about is the code I used to deal with tuples.  It seems clumsy and I am still looking for a way to clean that up.

One note: The code below references a list of prime numbers called primes , which is not shown.

Here is the code:

   1: primeFactors x = unfoldr findFactor x
   2:                  where
   3:                    first (a,b,c) = a
   4:                    findFactor 1 = Nothing
   5:                    findFactor b = (\(_,d,p)-> Just (p, d))
   6:                                   $ head $ filter ((==0).first) 
   7:                                   $  map (\p -> (b `mod` p, b `div` p, p))  primes

This function will take any number which is greater than 1 and return a list of its prime factors.  But don’t take my word for it, I wrote a quickcheck property to ensure the prime factors multiply back to the original number:

   1: prop_factors num =  num > 1 ==> num == (foldr1 (*) $ primeFactors num)

When running quickcheck on this property you see the following: 

quickCheck prop_factors
OK, passed 100 tests.

 

Author: MattManela Categories: Haskell Tags:

Prime Factorization using Unfold in Haskell

March 17th, 2009
Comments Off

I randomly yesterday started thinking about the unfoldr function in Haskell while working out at the gym (how nerdy is that, I am lifting iron but thinking of functional programming). Unfoldr take a single and an unfolding function and turns it into a list (the opposite of fold).  At the gym I was thinking about an application where I can use this and I decided that when I got home I would use it to write a prime factorization function.  This is a method that when given a number returns the list of its prime factors.

It was easy to write the only part I am not pleased about is the code I used to deal with tuples.  It seems clumsy and I am still looking for a way to clean that up.

One note: The code below references a list of prime numbers called primes , which is not shown.

Here is the code:

   1: primeFactors x = unfoldr findFactor x
   2:                  where
   3:                    first (a,b,c) = a
   4:                    findFactor 1 = Nothing
   5:                    findFactor b = (\(_,d,p)-> Just (p, d))
   6:                                   $ head $ filter ((==0).first) 
   7:                                   $  map (\p -> (b `mod` p, b `div` p, p))  primes

This function will take any number which is greater than 1 and return a list of its prime factors.  But don’t take my word for it, I wrote a quickcheck property to ensure the prime factors multiply back to the original number:

   1: prop_factors num =  num > 1 ==> num == (foldr1 (*) $ primeFactors num)

When running quickcheck on this property you see the following: 

quickCheck prop_factors
OK, passed 100 tests.

 

Author: MattManela Categories: Haskell, Programming Tags:

Prime Factorization using Unfold in Haskell

March 17th, 2009
Comments Off

I randomly yesterday started thinking about the unfoldr function in Haskell while working out at the gym (how nerdy is that, I am lifting iron but thinking of functional programming). Unfoldr take a single and an unfolding function and turns it into a list (the opposite of fold).  At the gym I was thinking about an application where I can use this and I decided that when I got home I would use it to write a prime factorization function.  This is a method that when given a number returns the list of its prime factors.

It was easy to write the only part I am not pleased about is the code I used to deal with tuples.  It seems clumsy and I am still looking for a way to clean that up.

One note: The code below references a list of prime numbers called primes , which is not shown.

Here is the code:

   1: primeFactors x = unfoldr findFactor x
   2:                  where
   3:                    first (a,b,c) = a
   4:                    findFactor 1 = Nothing
   5:                    findFactor b = (\(_,d,p)-> Just (p, d))
   6:                                   $ head $ filter ((==0).first) 
   7:                                   $  map (\p -> (b `mod` p, b `div` p, p))  primes

This function will take any number which is greater than 1 and return a list of its prime factors.  But don’t take my word for it, I wrote a quickcheck property to ensure the prime factors multiply back to the original number:

   1: prop_factors num =  num > 1 ==> num == (foldr1 (*) $ primeFactors num)

When running quickcheck on this property you see the following: 

quickCheck prop_factors
OK, passed 100 tests.

 

Author: MattManela Categories: Haskell, Programming Tags:

Parameterized State Transformer Monad in F#?

November 5th, 2008
Comments Off

I have have been playing around with F# and I decided to create a state monad.  This worked out really well since I was able to leverage the F# computation expressions.  I then decided to try to extend this and make it more general by creating a parameterized state transformer monad.  This is a state monad which encapsulates another monad.  What this allow you to do is turn any computation into a statefull one.

 

This concept exists in Haskell which you can see here. However, my attempts at replicating this in F# failed.  Unlike in Haskell, the computation expressions in F# don't share a common interface (or type class).  This prevents a computation from being able to generically take another computation as a parameter.  The reason is that an operation on the parameterized state transformer monad such as bind will result in a bind called on its encapsulated monad.  But since there is no interface for computations there is no bind method to call. 

I attempted to fix this by creating my own monad interface but this didn't work:

   1: type Monad =
   2:     abstract Bind : 'm * ('a -> 'm) -> 'm
   3:     abstract Delay : (unit ->'m) -> 'm
   4:     abstract Let : 'a * ('a -> 'm) -> 'm
   5:     abstract Return : 'a -> 'm
   6:     abstract Zero : unit -> 'm
   7:     abstract Combine : 'm -> 'm2 -> 'm3
   8:     abstract Run : (unit ->'m) -> 'm

 

After playing with that and failing I ended up with a less than ideal solution.  Given a Maybe monad like below:

   1: // Maybe Monad    
   2: type MaybeBuilder() =
   3:     member b.Return(x) = Some x
   4:     member b.Run(f) = f()
   5:     member b.Delay(f) = f
   6:     member b.Let(p,rest) = rest p
   7:     member b.Zero () = None
   8:     member b.Combine(m1,dm2) = match m1 with
   9:                                  | None -> dm2()
  10:                                  | x -> x
  11:  
  12:     member b.Bind(p,rest) = match p with 
  13:                             | None -> None
  14:                             | Some r -> rest r

 

I created a state transformer monad which take an argument of type m:MaybeBuilder

   1: type StateMBuilder(m:MaybeBuilder) = 
   2:     member b.Return(x) = fun s -> m.Return (x,s)
   3:     member b.Run(f) = fun inp -> (f inp)()
   4:     member b.Delay(f) = fun inp -> fun () -> f() inp
   5:     member b.Let(p,rest) = rest p
   6:     member b.Zero () = fun s -> m.Zero()
   7:     member b.Bind(p,rest) = fun s -> m.Bind(p s,fun (v,s2) -> rest v s2)
   8:     member b.Combine(p1,dp2) = fun s -> m.Combine(p1 s, dp2 s)
   9:  
  10:     // State specific functions
  11:     member b.Update f = fun s -> try m.Return (s, f s) with e -> m.Zero()
  12:     member b.Read () = b.Update id
  13:     member b.Set t = b.Update (fun _ -> t)

 

Now although this type is technically parameterized ;) it isn't really the idea since its not generic, it has to be a MaybeBuilder.  To use this with a different monad I would need to change m:MaybeBuild to m:SomeOtherMonad. 

I am still going to play with this but this is as far as I have gotten.

After all of that here is how you create a state transformer monad parameterized over the maybe monad:

   1: let maybe = MaybeBuilder()
   2: let state = StateMBuilder(maybe)

 

If anyone has an idea how I can make this work I would love to hear it.

Author: MattManela Categories: F#, Haskell, Programming Tags:

Writing a Regular Expression parser in Haskell: Part 4

March 11th, 2008

With the previous two modules in place we are now set up to use a DFA to match against a string.  In my implementation I support either a greedy match or an short match.  In a full featured regular expression engine this ability to choose greedy or not would be per operator but for simplicity I have it for the overall match. 

 

To do the matching I have a general function which will create a list of all matches.  Then the difference between short and greedy matching is which of the candidate solutions does it choose.

This is the method:

   1: doMatch func machine st [] = doAccept  machine st []
   2: doMatch func machine st string =  func $ map (\f -> doMatch’ st f []) (tails string)
   3:     where
   4:       doMatch’ state [] soFar = doAccept machine st soFar
   5:       doMatch’ state (s:str) soFar = 
   6:           case findTransition machine s state of
   7:             Nothing -> doAccept machine state soFar
   8:             Just (from, to, val) -> case doMatch’ to str (soFar ++ [s]) of
   9:                                       (False,_) -> case canAccept machine to of
  10:                                                     True -> (True, soFar ++ [s])
  11:                                                     False -> doMatch’ to str (soFar ++ [s])
  12:                                       (True,res) -> (True,res)

 

This creates the list of matches and uses the passed in function to determine how to filter to either the shortest or longest match.

For short or long matches I pass in one of these two functions:

   1: — Get the shortest match
   2: shortest matches = case  filter (\s->fst s) (sort matches) of
   3:                      [] -> (False,”")
   4:                      ms -> head ms
   5:  
   6: — Get the longest match
   7: longest matches = last.sort $ matches

 

I created aliases for the functions to make it more handy:

   1: (=~) = greedyMatch
   2: (=~?) = shortMatch

 

And then the final result:

   1: SimpleRegex> “hiphiphiphorray” =~? “hip(hip)
   2: (True,”hip”)
   3:  
   4: SimpleRegex> “hiphiphiphorray” =~ “hip(hip)
   5: (True,”hiphiphip”)

 

 

I attached a zip of all the files for this project.

Enjoy!

Author: Matt Categories: Haskell, Regular Expression Tags:

Writing a Regular Expression parser in Haskell: Part 3

March 11th, 2008

The third module in the simple regular expression parser is called: NFAtoDFA.  Which as you might have guessed, takes the NFA that resulted from the first module and converts it into a DFA.  The structure that the DFA uses is the same that the NFA uses since they are both finite state machines.

 

Converting an NFA to a DFA requires mapping sets of nodes in the NFA to a single node in the DFA.  Many nodes in a NFA will correspond to one node in the DFA.  Making this change requires updating transitions to point to and from sets of nodes.  To manage this transformation I create a state monad using the following context:

 

   1: — The state which we pass to build the DFA
   2: data ConvertContext = ConvertContext { nfa :: FiniteMachine,
   3:                                        trans :: [Transition],
   4:                                        setMap :: Map.Map (Set Node) Integer,
   5:                                        setStack :: [Set Node],
   6:                                        begin :: Node,
   7:                                        accept :: Set Node,
   8:                                        nextNode :: Node
   9:                                      } deriving (Show, Eq)
  10: type ConvertState a = State ConvertContext a

 

Most of the code in this module is just managing this context and updating it according to two operation:

  1. Epsilon Closure
  2. Set Move

These are explained in more detail in this article.

Basically, epsilon closure is the process of taking a set of initial nodes and returning a new set of all nodes you can traverse to purely on epsilon transitions.  To help with this I created some smaller methods to build up to an epsilon closure.

First are a couple methods (findToNodes and closure):

   1: closure trans value nodes oldSet = Set.union (findToNodes trans value nodes) oldSet
   2:  
   3: — Search the table of transitions to find all nodes you can reach given an initial set of nodes
   4: findToNodes trans value fromNodes = foldr match Set.empty trans
   5:     where 
   6:       match (from, to, val) nodes
   7:           | (from == fromNodes) && (val == value) = Set.insert to nodes
   8:           | otherwise = nodes

findToNodes searches a transition table for all nodes which go from any node in fromNodes on value.  It will builds up a set with all the to  nodes that match. 

closure wraps findToNodes to let us easily union together an initial set and the nodes we can reach from that set.

With this in hand we can write clearly a epsilon closure function:

   1: — Given an initial set of nodes, find the set of all nodes you can reach by taking 
   2: — transitions on epsilon only
   3: epsilonClosure trans nodes = foldUntilRepeat Set.union Set.empty $
   4:                              iterate (Set.fold (closure trans epsilon) Set.empty) nodes
   5:  
   6:  

 

This function takes full advantage of the lazy nature of Haskell.  It repeats the closure on epsilon over and over and streams its results into our function foldUntilRepeat.  This method does what it says, it will fold the values that are streamed in until it sees the same value twice. 

 

The set move is just combination of what you have already seen:

   1: — Given a starting set of nodes the set of all nodes that you can reach on a given value
   2: — This includes epislonClosure on the desitination nodes
   3: moveClosure trans value nodes = epsilonClosure trans $ 
   4:                                 Set.fold (closure trans value) Set.empty nodes
   5:  

 

With these functions in hand, this module just becomes calling them and updating the context until we have no more nodes in the NFA to process.

 

In the next installment I will discuss using the output of this modules to match a regex against a string.

Also, once again the code is attached.

Author: Matt Categories: Haskell, Regular Expression Tags:

Writing a Regular Expression parser in Haskell: Part 2

March 11th, 2008

The first module in my simple regular expression parse is called RegexToNFA.  This module exposes the types that make up a finite state machine and also the functions to convert a regular expression string into a finite state machine.

My structure for a FSM follows closely from the mathematical definition:

   1: data FiniteMachine = FiniteMachine{  table :: [Transition],
   2:                                      alphabet :: Set Char,
   3:                                      start :: Node,
   4:                                      final :: Set Node
   5:  
   6:  
   7: — NFA node
   8: type Node = Integer
   9:  
  10: — The value for an edge in a NFA
  11: type TransitionValue = Maybe Char
  12:  
  13: — A transition in a NFA is a tuple of
  14: — StartNode , DestinationNode, Value to transition on
  15: type Transition = (Node,Node,TransitionValue)
  16:  
  17: — The value of the edge in the NFA is a Maybe Char 
  18: — Where Nothing is the epsilon transition
  19: — therefore lets just rename Nothing to epsilon

 

I have the value which you transition on as a Maybe Char (which I alias as TransitionValue).  This allowed me to define epsilon as Nothing data constructor. 

 

With this structure defined my goal now is to convert a regular expression pattern such as: (a|b)* into a FiniteMachine.  In order to do this there is a lot of state that I need to keep track of which naturally leads to the use of the State monad.  To do this I set up a structure for what data I want to be kept track of and then create a state monad using that structure:

   1: — The state that gets passed around which we used to build up the NFA
   2: data ParseContext = Context 
   3:                     {
   4:                       nodeList :: [Node],
   5:                       transitions :: [Transition],
   6:                       operators :: OperatorList,
   7:                       nextNode :: Node,
   8:                       values :: Set Char
   9:                     } deriving (Show, Eq)
  10:  
  11: — Alias the State data constructor with a more friendly name
  12: type RegexParseState a = State ParseContext a

 

This structure is passed between functions to allow them to see the current state of the parsing and create a new state.  I define many functions, each which deal with a piece of the puzzle of converting the input string into a FSM.  I am not going to address them all but I will point out some which is note worth:

convertToNFA – This is the top level function, it is exposed externally and lets you convert a regex to a NFA.

processOperator – This function determines when we should execute an operator given its precedence.  We assign each operator a precedence which lets us determine when we should execute an operator.  For example in the expression a|b*, we want to execute star before we execute union. 

 

Last but not least are the methods which execute the operators.  For example, there is one called doConcat, which performs the concatenation of two values in the regular expression. doConcat isn’t pretty, since its doing the dirty work of examining the state and create a new state to reflect a partially completed FSM.

   1: — Execute the concat operator
   2: doConcat :: RegexParseState ()
   3: doConcat = do
   4:   st <- get
   5:   let nodes = nodeList st
   6:       newNodes = (nodes !! 0) : (nodes !! 3) : (drop 4  nodes)
   7:       newTransitions = transitions st ++ [(nodes !! 2, nodes !! 1, epsilon)]
   8:       newOperators = tail $ operators st
   9:   put $ st { nodeList = newNodes,
  10:              transitions = newTransitions ,
  11:              operators = newOperators}

 

With all this in place, lets finally see what this module actually outputs.

> convertToNFA “a|(bc)*”

FiniteMachine {table =
[(4,5,Just 'c'),(2,3,Just 'b'),(0,1,Just'a'),
(3,4,Nothing),(6,2,Nothing),(6,0,Nothing),
(1,7,Nothing),(5,7,Nothing),(8,6,Nothing),
(8,7,Nothing),(7,9,Nothing),(9,8,Nothing)],
alphabet = fromList “abc”,
start = 8,
final = fromList [9]}

If you examine the table list in the output, you will see all the transitions for the NFA that accepts “a|(bc)*” and that the start state is node 8 and the accept state is node 9. 

I uploaded the RegexToNFA.hs file for your examination.  I tried to comment it a good amount and I feel it should be pretty easy to read and understand.

 

In the next part I will talk about the next modules: NFAtoDFA

Author: Matt Categories: Haskell, Regular Expression Tags:

Writing a Regular Expression parser in Haskell: Part 1

March 11th, 2008

A few weeks ago I read this article about writing a simple regular expression parser.  That article does a really good job of explaining the theory behind regular expression.  It then goes step by step into how to write a program (he uses C++) to parse a regular expression, convert it into a NFA, convert that into a DFA and then use that DFA to match strings.

After reading that I decided to write my own simple regular expression parser using Haskell.  I saw it as a challenge to try to see how you deal with a more complex program in a pure functional language.  After a couple weeks ( Grand Theft Auto 4 kind of ruined my progress for a while ) I have some results.

I split the project into 3 modules.

  1. RegexToNFA – Provides functionality to parse a simple regular expression and return a NFA.
    1. This modules also define the FiniteMachine type which is a general structure for finite state automata.

  2. NFAtoDFA – Providers functionality to convert a NFA into a DFA.
    1. This module uses the same FiniteMachine type from RegexToNFA

  3. SimpleRegex – Provides the functionality to give take a regular expression and a string and return what it matches (if it matches anything).
    1. This modules uses RegexToNFA and sends its results to NFAtoDFA and then uses the resulting DFA to match against a string.

 

This is a very simple and limited regular expression parser.  It supports only union(|), concatenation, closure(*) and parenthesis.  In addition, I don’t preserve information after the NFA is created about the location of the parenthesis.  This means you can’t pull out sub-matches when a entire expression matches.

In my next three I will talk about each module and point out interesting parts of them.  There is nothing too complex but shows how to approach it in Haskell (making heavy use of the State monad).

 

If you want to see a much more complex and full featured regular expression parser written in Haskell take a look at this.

 

Click here to continue to Part 2.

Author: Matt Categories: Haskell, Regular Expression Tags:

Breadth First Tree Traversal in Haskell

March 11th, 2008

As my interest in functional languages has grown, I have become increasingly interested in using them to implement algorithms which I can already write with imperative languages.

For example, I was taught to implement (and I assume most other people as well)  a breadth first traversal of a tree using a queue and a loop.  An example using this method can be found at the wikipedia page for a breadth first search.  When I wanted to try implement a breadth first search in Haskell I quickly realized that algorithm wouldn’t port over very well.  I thought a bit and was able to come up with this algorithm:

   1: — My Implementation
   2: breadth :: Tree a -> [a]
   3: breadth nd =  map rootLabel $ nd : (breadth’ [nd])
   4:     where             
   5:       breadth’ [] = []
   6:       breadth’ nds = let cs = foldr ((++).subForest) [] nds in
   7:                      cs ++ breadth’ cs

 

The idea was that each call to breadth’ takes a list of nodes (which represents of level of the tree) and will concatenate the children of each of those nodes together and recursively call itself again with that list.  This works but its not pretty Haskell.  When choosing Haskell (from what I have learned) it is best if you can avoid explicit recursion and use built in combinators.  After I coded my breadth first traversal function I decided to look into the Haskell standard libraries to see how it is done there.  What I found was a function called levels which returns a list of lists, where each sub-list is a level of the tree.  I slightly modified this to have the same functionality as my breadth function which creates one list of all the nodes in the breadth first order.

This is the resulting code:

   1: — Haskell Standard Libraray Implementation
   2: br :: Tree a -> [a]
   3: br t = map rootLabel $
   4:        concat $
   5:        takeWhile (not . null) $                
   6:        iterate (concatMap subForest) [t]

 

This is really slick implementation of what I did above.  The algorithm is the same but the way they went about writing it is so much prettier.

Author: Matt Categories: Haskell Tags:

Palindrome Creator in Haskell

March 11th, 2008

The past few days I have been solving problems at this site called Project Euler.  This site contains many seemingly simple math programming problems.  I have been using Haskell to solve the problems on the site and in order to help solve one of the problems I wrote this bit of code to generate palindromes

   1: palindrome :: (Integral a) => a -> [Char] -> [String]
   2: palindrome n al =  concat  $  map (pal n) al
   3:     where 
   4:       pal :: (Integral a)=> a -> Char -> [String]
   5:       pal n x 
   6:           | n > 2 =  map (surround x) (palindrome (n-2) al)
   7:           | n > 1 = [[x,x]]
   8:           | otherwise = [[x]]
   9:           where 
  10:             surround :: Char -> String -> String
  11:             surround lt str = [lt] ++ str ++ [lt]

 

This code take a length(as an integer) and a list of characters and returns all possible palindromes that can be created.  For example:

palindrome 3 ['a'..'f']

will result in

["aaa","aba","aca","ada","aea","afa","bab","bbb","bcb","bdb",

"beb","bfb","cac","cbc","ccc","cdc","cec","cfc","dad","dbd",

"dcd","ddd","ded","dfd","eae","ebe","ece","ede","eee",

"efe","faf","fbf","fcf","fdf","fef","fff"] 

 

While I doubt this is the greatest implementation of a method which generates palindromes,  it was the first one I came up with and I am curious if anyone can do it in a very different (or better) way. So, if anyone reading this wants to show me a different way (in any language) feel free.

Author: Matt Categories: Haskell Tags: