半群 Semigroup 是这样一种数学结构,对非空集合 S,对 S 上的二元运算·: S x S -> S,其满足结合律(即对集合 S 上的元素 a,b,c,有a · (b · c) = (a · b) · c),则二元组(S, ·)为半群;幺半群则在半群的基础上添加了一个幺元 identity element——任何元素对其作运算·仍旧得到它自身,这是说假如令幺元为 e,则对 S 上任意元素 a,有a · e = a = e · a,三元组(S, ·, e)为幺半群。
-- | The class of monoids (types with an associative binary operation that -- has an identity). Instances should satisfy the following: -- -- [Right identity] @x '<>' 'mempty' = x@ -- [Left identity] @'mempty' '<>' x = x@ -- [Associativity] @x '<>' (y '<>' z) = (x '<>' y) '<>' z@ ('Semigroup' law) -- [Concatenation] @'mconcat' = 'foldr' ('<>') 'mempty'@ -- -- The method names refer to the monoid of lists under concatenation, -- but there are many other instances. -- -- Some types can be viewed as a monoid in more than one way, -- e.g. both addition and multiplication on numbers. -- In such cases we often define @newtype@s and make those instances -- of 'Monoid', e.g. 'Data.Semigroup.Sum' and 'Data.Semigroup.Product'. -- -- __NOTE__: 'Semigroup' is a superclass of 'Monoid' since /base-4.11.0.0/. classSemigroup a => Monoid a where -- | Identity of 'mappend' -- -- >>> "Hello world" <> mempty -- "Hello world" mempty :: a
-- | An associative operation -- -- __NOTE__: This method is redundant and has the default -- implementation @'mappend' = ('<>')@ since /base-4.11.0.0/. -- Should it be implemented manually, since 'mappend' is a synonym for -- ('<>'), it is expected that the two functions are defined the same -- way. In a future GHC release 'mappend' will be removed from 'Monoid'. mappend :: a -> a -> a mappend = (<>) {-# INLINE mappend #-}
-- | Fold a list using the monoid. -- -- For most types, the default definition for 'mconcat' will be -- used, but the function is included in the class definition so -- that an optimized version can be provided for specific types. -- -- >>> mconcat ["Hello", " ", "Haskell", "!"] -- "Hello Haskell!" mconcat :: [a] -> a mconcat = foldr mappend mempty {-# INLINE mconcat #-} -- INLINE in the hope of fusion with mconcat's argument (see !4890)
实现其只需要指定二元运算(在半群实例中)和幺元即可,下面定义数字加法幺半群:
1 2 3 4 5 6 7 8 9
newtypePlus = PlusInt
instanceSemigroupPluswhere (<>) :: Plus -> Plus -> Plus (Plus a) <> (Plus b) = Plus $ a + b
instanceMonoidPluswhere mempty :: Plus mempty = Plus0
Monoid 的实例需要满足下面的定律:
1 2 3
(x <> y) <> z = x <> (y <> z) -- associativity mempty <> x = x -- left identity x <> mempty = x -- right identity
容易发现,(Bool, &&, True)和(Bool, ||, False)也是幺半群:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
-- 在标准库中,这里的 And 名称为 All,Or 名称为 Any newtypeAnd = AndBool instanceSemigroupAndwhere (<>) :: And -> And -> And (And a) <> (And b) = And $ a && b instanceMonoidAndwhere mempty :: And mempty = AndTrue
newtypeOr = OrBool instanceSemigroupOrwhere (<>) :: Or -> Or -> Or (Or a) <> (Or b) = Or $ a || b instanceMonoidOrwhere mempty :: Or mempty = OrFalse
Ordering 类型,即 compare 函数的返回值类型,也是 Monoid,其幺元是EQ,二元运算的行为类似 First,其非常适合这种需求——先比较 x,如果 x 相等,比较 y。比如这里写一个函数比较两个字符串的长度大小,其中规定若两个字符串等长,则比较两个字符串内容:
1 2
lengthCompare :: String -> String -> Ordering lengthCompare a b = (length a `compare` length b) <> (a `compare` b)
同态
乘法有一个所谓的分配律,这是说对实数 a,b,c,有 a * (b + c) = a * b + a * c,定义f(x) = a * x,有f(b + c) = f(b) + f(c),又能看到,f(0) = 0,这时称函数 f 是一个加法幺半群到加法幺半群上的同态 Homomorphism,抽象地说,对于 Monoid a 和 b,有:
1 2 3
f :: (Monoid a, Monoid b) => a -> b f mempty = mempty f (x <> y) = f x <> f y
reverseByFoldr :: Foldable t => t a -> [a] reverseByFoldr = foldr (\x acc -> acc ++ [x]) [] -- 忘掉效率!
-- 使用 foldMap 去实现 reverse,重点就是要识别和创建出相应幺半群 -- 容易发现,这里需要这样一个列表的幺半群,它的幺元仍是空集,但二元操作为 flip (++)(应当测试几个用例证明该二元操作满足结合律) newtypeReverse a = Reverse {getReverse :: [a]} instanceSemigroup (Reversea) where (<>) :: Reverse a -> Reverse a -> Reverse a (Reverse a) <> (Reverse b) = Reverse $ b ++ a
instanceMonoid (Reversea) where mempty :: Reverse a mempty = Reverse []
-- 不知道为何 Haskell 没有识别出函数是幺半群,这里直接显式地定义一个幺半群 Endo: newtypeEndo b = Endo {appEndo :: b -> b} instanceSemigroup (Endob) where (<>) :: Endo b -> Endo b -> Endo b (Endo g) <> (Endo f) = Endo $ g . f
instanceMonoid (Endob) where mempty :: Endo b mempty = Endo id
-- foldMap :: Monoid m => (a -> m) -> [a] -> m,替换 m 为 b -> b,替换结果为 Endo b foldComposing :: (a -> (b -> b)) -> [a] -> Endo b foldComposing f = foldMap (Endo . f)
-- 先把`b -> b`列表折成一个 Endo b(即 b -> b),再应用它到初始值 b foldr :: (a -> (b -> b)) -> b -> [a] -> b foldr f z xs = appEndo (foldComposing f xs) z
-- 如果 Haskell 能识别 b -> b 为幺半群的话,直接下面这样就行了 foldComposing :: (a -> (b -> b)) -> [a] -> (b -> b) foldComposinng f = foldMap f
foldr :: (a -> b -> b) -> b -> [a] -> b foldr f z xs = foldComposing f xs z
-- Abridged definition, with just the method signatures. classFoldable t where foldMap :: Monoid m => (a -> m) -> t a -> m foldr :: (a -> b -> b) -> b -> t a -> b
-- All of the following have default implementations: fold :: Monoid m => t m -> m -- generalised mconcat foldr' :: (a -> b -> b) -> b -> t a -> b foldl :: (b -> a -> b) -> b -> t a -> b foldl' :: (b -> a -> b) -> b -> t a -> b foldr1 :: (a -> a -> a) -> t a -> a foldl1 :: (a -> a -> a) -> t a -> a toList :: t a -> [a] null :: t a -> Bool length :: t a -> Int elem :: Eq a => a -> t a -> Bool maximum :: Ord a => t a -> a minimum :: Ord a => t a -> a sum :: Num a => t a -> a product :: Num a => t a -> a
toList 非常有趣,其证明任何可折叠的类型都可以转换成为列表,这借用了列表是幺半群这个特性:
1 2
toList :: (Foldable t) => t a -> [a] toList = foldMap (\x -> [x])
toList reflects the fact that lists are the free monoid for Haskell types. “Free” here means any value can be promoted to the monoid in a way which neither adds nor erases any information (we can convert values of type a to [a] lists with a single element and back through (\x->[x]) and head in a lossless way).
应用
所以,在什么情况下 foldMap 会比 foldr 更香?
考虑二叉树:
1
dataBiTree a = BiTree a (Maybe (BiTreea)) (Maybe (BiTreea)) deriving (Show)
-- 对所有元素进行相同操作,这显然是 Functor,类似列表: instanceFunctorBiTreewhere fmap :: (a -> b) -> BiTree a -> BiTree b fmap f (BiTree v l r) = BiTree (f v) (fmap f <$> l) (fmap f <$> r) -- <$> 提升这个 fmap f 到 Maybe 上下文
-- 求所有子树的和 -- maybe 函数类似于 orElse sumTree :: BiTreeInt -> Int sumTree (BiTree v l r) = maybe 0 sumTree l + maybe 0 sumTree r + v
-- 获取最大深度 maxDepth :: BiTree a -> Int maxDepth (BiTree _ NothingNothing) = 1 maxDepth (BiTree _ l r) = maybe 0 maxDepth l `max` maybe 0 maxDepth r + 1
-- 可以发现,这同时也是一个前序遍历,显然还能有中序遍历,后序遍历,还能有广度优先遍历 instanceFoldableBiTreewhere foldMap :: Monoid m => (a -> m) -> BiTree a -> m foldMap f (BiTree v l r) = f v <> maybe mempty (foldMap f) l <> maybe mempty (foldMap f) r
定义了 Foldable 的实例后,定义 sumTree 就很简单了:
1 2
sumTree :: BiTreeInt -> Int sumTree = getSum . foldMap Sum
treeFold :: b -> (a -> b) -> (b -> b -> b -> b) -> BiTree a -> b -- z 为默认值,在子树为空的情况下填充,mapper 为映射,combiner 为合并函数 treeFold z mapper combiner (BiTree v l r) = combiner (mapper v) l' r' where l' = maybe z (treeFold z mapper combiner) l r' = maybe z (treeFold z mapper combiner) r
maxDepth' :: (Num b, Ord b) => BiTree b -> b maxDepth' = treeFold 0 (const 1) (\_ l r -> l `max` r + 1)
Bonus
下面在 typescript 里利用 foldMap 去实现了 sum 和 groupBy,使用了 type class 模式,比较有趣: