Commit 3591635

Improve module Haddocks (#1115)
* Make module Haddocks largely consistent across Set, Map, IntSet, IntMap, Seq.
* Make public module docs for each structure (e.g. Data.Map, Data.Map.Lazy, Data.Map.Strict) have similar structure and content. This means some duplication of text, but there is no way to avoid it.
* Trim down internal module Haddocks (e.g. Data.Map.Internal) to a structure description and references. There is no need to repeat everything that's in the public module.
* Add a few lines about the complexity of common operations on each set and map structure. In particular, union and intersection are mentioned with a clear description of 'm' and 'n', to reduce the chances that a reader assumes them to be tied to positions and not sizes.
1 parent 6ead786 commit 3591635

15 files changed

+217
-209
lines changed

containers/src/Data/IntMap.hs

Lines changed: 29 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -14,16 +14,30 @@
1414
-- Maintainer : [email protected]
1515
-- Portability : portable
1616
--
17-
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
18-
-- from key of type @Int@ to values of type @v@.
17+
--
18+
-- = Finite Int Maps (lazy interface)
1919
--
2020
-- This module re-exports the value lazy "Data.IntMap.Lazy" API.
2121
--
22-
-- This module is intended to be imported qualified, to avoid name
23-
-- clashes with Prelude functions, e.g.
22+
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
23+
-- from keys of type @Int@ to values of type @v@.
24+
--
25+
-- The functions in "Data.IntMap.Strict" are careful to force values before
26+
-- installing them in an 'IntMap'. This is usually more efficient in cases where
27+
-- laziness is not essential. The functions in this module do not do so.
2428
--
25-
-- > import Data.IntMap (IntMap)
26-
-- > import qualified Data.IntMap as IntMap
29+
-- For a walkthrough of the most commonly used functions see the
30+
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
31+
--
32+
-- This module is intended to be imported qualified, to avoid name clashes with
33+
-- Prelude functions, e.g.
34+
--
35+
-- > import Data.IntMap.Lazy (IntMap)
36+
-- > import qualified Data.IntMap.Lazy as IntMap
37+
--
38+
-- Note that the implementation is generally /left-biased/. Functions that take
39+
-- two maps as arguments and combine them, such as `union` and `intersection`,
40+
-- prefer the values in the first argument to those in the second.
2741
--
2842
--
2943
-- == Implementation
@@ -52,11 +66,11 @@
5266
-- referring to the number of entries in the map and \(W\) referring to the
5367
-- number of bits in an 'Int' (32 or 64).
5468
--
55-
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
56-
-- This means that the operation can become linear in the number of
57-
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
58-
-- (32 or 64). These peculiar asymptotics are determined by the depth
59-
-- of the Patricia trees:
69+
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
70+
-- complexity of \(O(\min(n,W))\). This means that the operation can become
71+
-- linear in the number of elements with a maximum of \(W\) -- the number of
72+
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
73+
-- depth of the Patricia trees:
6074
--
6175
-- * even for an extremely unbalanced tree, the depth cannot be larger than
6276
-- the number of elements \(n\),
@@ -74,6 +88,10 @@
7488
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
7589
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
7690
--
91+
-- Binary set operations like 'union' and 'intersection' take
92+
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
93+
-- are the sizes of the smaller and larger input maps respectively.
94+
--
7795
-----------------------------------------------------------------------------
7896

7997
module Data.IntMap
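
The left-bias note added to the Haddocks above can be illustrated with a minimal sketch. It is not part of the diff; the names `firstMap` and `secondMap` and their contents are made up for illustration.

```haskell
import qualified Data.IntMap.Lazy as IntMap

-- Two hypothetical maps that share the key 2.
firstMap, secondMap :: IntMap.IntMap String
firstMap  = IntMap.fromList [(1, "first-1"), (2, "first-2")]
secondMap = IntMap.fromList [(2, "second-2"), (3, "second-3")]

main :: IO ()
main = do
  -- On the shared key 2, the value from the first argument is kept.
  print (IntMap.union firstMap secondMap)
  -- fromList [(1,"first-1"),(2,"first-2"),(3,"second-3")]

  -- intersection also keeps the first argument's value.
  print (IntMap.intersection firstMap secondMap)
  -- fromList [(2,"first-2")]
```

When the value from the second map is wanted instead, swapping the arguments or using `unionWith` with an explicit combining function are the usual options.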

containers/src/Data/IntMap/Internal.hs

Lines changed: 23 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -37,10 +37,30 @@
3737
-- Authors importing this module are expected to track development
3838
-- closely.
3939
--
40-
-- = Description
4140
--
42-
-- This defines the data structures and core (hidden) manipulations
43-
-- on representations.
41+
-- = Finite Int Maps (lazy interface internals)
42+
--
43+
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
44+
-- from keys of type @Int@ to values of type @v@.
45+
--
46+
--
47+
-- == Implementation
48+
--
49+
-- The implementation is based on /big-endian patricia trees/. This data
50+
-- structure performs especially well on binary operations like 'union'
51+
-- and 'intersection'. Additionally, benchmarks show that it is also
52+
-- (much) faster on insertions and deletions when compared to a generic
53+
-- size-balanced map implementation (see "Data.Map").
54+
--
55+
-- * Chris Okasaki and Andy Gill,
56+
-- \"/Fast Mergeable Integer Maps/\",
57+
-- Workshop on ML, September 1998, pages 77-86,
58+
-- <https://web.archive.org/web/20150417234429/https://ittc.ku.edu/~andygill/papers/IntMap98.pdf>.
59+
--
60+
-- * D.R. Morrison,
61+
-- \"/PATRICIA -- Practical Algorithm To Retrieve Information Coded In Alphanumeric/\",
62+
-- Journal of the ACM, 15(4), October 1968, pages 514-534,
63+
-- <https://doi.org/10.1145/321479.321481>.
4464
--
4565
-- @since 0.5.9
4666
-----------------------------------------------------------------------------

containers/src/Data/IntMap/Lazy.hs

Lines changed: 10 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -28,7 +28,7 @@
2828
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
2929
--
3030
-- This module is intended to be imported qualified, to avoid name clashes with
31-
-- Prelude functions:
31+
-- Prelude functions, e.g.
3232
--
3333
-- > import Data.IntMap.Lazy (IntMap)
3434
-- > import qualified Data.IntMap.Lazy as IntMap
@@ -64,11 +64,11 @@
6464
-- referring to the number of entries in the map and \(W\) referring to the
6565
-- number of bits in an 'Int' (32 or 64).
6666
--
67-
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
68-
-- This means that the operation can become linear in the number of
69-
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
70-
-- (32 or 64). These peculiar asymptotics are determined by the depth
71-
-- of the Patricia trees:
67+
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
68+
-- complexity of \(O(\min(n,W))\). This means that the operation can become
69+
-- linear in the number of elements with a maximum of \(W\) -- the number of
70+
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
71+
-- depth of the Patricia trees:
7272
--
7373
-- * even for an extremely unbalanced tree, the depth cannot be larger than
7474
-- the number of elements \(n\),
@@ -86,6 +86,10 @@
8686
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
8787
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
8888
--
89+
-- Binary set operations like 'union' and 'intersection' take
90+
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
91+
-- are the sizes of the smaller and larger input maps respectively.
92+
--
8993
-- Benchmarks comparing "Data.IntMap.Lazy" with other dictionary
9094
-- implementations can be found at https://github.com/haskell-perf/dictionaries.
9195
--
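
As a quick sanity check of the bound quoted above (a worked special case, assuming base-2 logarithms; it is not part of the diff), setting the smaller map's size to \(m = 1\) recovers the single-element bound stated for 'lookup', 'insert', and 'delete':

```latex
\[
  \min\Bigl(n,\; m \log \frac{2^W}{m}\Bigr)\Big|_{m=1}
  \;=\; \min\bigl(n,\; \log 2^W\bigr)
  \;=\; \min(n, W).
\]
```

So taking the union with a one-element map costs, asymptotically, no more than a single insertion.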

containers/src/Data/IntMap/Strict.hs

Lines changed: 10 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -35,7 +35,7 @@
3535
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
3636
--
3737
-- This module is intended to be imported qualified, to avoid name clashes with
38-
-- Prelude functions:
38+
-- Prelude functions, e.g.
3939
--
4040
-- > import Data.IntMap.Strict (IntMap)
4141
-- > import qualified Data.IntMap.Strict as IntMap
@@ -80,11 +80,11 @@
8080
-- referring to the number of entries in the map and \(W\) referring to the
8181
-- number of bits in an 'Int' (32 or 64).
8282
--
83-
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
84-
-- This means that the operation can become linear in the number of
85-
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
86-
-- (32 or 64). These peculiar asymptotics are determined by the depth
87-
-- of the Patricia trees:
83+
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
84+
-- complexity of \(O(\min(n,W))\). This means that the operation can become
85+
-- linear in the number of elements with a maximum of \(W\) -- the number of
86+
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
87+
-- depth of the Patricia trees:
8888
--
8989
-- * even for an extremely unbalanced tree, the depth cannot be larger than
9090
-- the number of elements \(n\),
@@ -102,6 +102,10 @@
102102
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
103103
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
104104
--
105+
-- Binary set operations like 'union' and 'intersection' take
106+
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
107+
-- are the sizes of the smaller and larger input maps respectively.
108+
--
105109
-- Benchmarks comparing "Data.IntMap.Strict" with other dictionary
106110
-- implementations can be found at https://github.com/haskell-perf/dictionaries.
107111
--
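
The difference between the lazy and strict interfaces described in these Haddocks can be sketched as follows. This is an illustration only, not part of the diff; `lazyMap` and `strictInsertThrows` are hypothetical names, and `error` stands in for any expensive or failing value.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import qualified Data.IntMap.Lazy as Lazy
import qualified Data.IntMap.Strict as Strict

-- The lazy insert stores the value as an unevaluated thunk,
-- so building the map does not trigger the error.
lazyMap :: Lazy.IntMap Int
lazyMap = Lazy.insert 1 (error "value was forced") Lazy.empty

-- The strict insert forces the value before installing it, so evaluating
-- the resulting map raises the error. Returns True if that happens.
strictInsertThrows :: IO Bool
strictInsertThrows = do
  result <- try (evaluate (Strict.insert 1 (error "value was forced") Strict.empty))
              :: IO (Either ErrorCall (Strict.IntMap Int))
  pure (either (const True) (const False) result)

main :: IO ()
main = do
  -- size inspects only the structure, never the stored values.
  print (Lazy.size lazyMap)      -- 1
  strictInsertThrows >>= print   -- True
```

Both modules operate on the same `IntMap` type; only the functions differ in whether they force values.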

containers/src/Data/IntMap/Strict/Internal.hs

Lines changed: 1 addition & 34 deletions
Original file line number | Diff line number | Diff line change
@@ -28,44 +28,11 @@
2828
-- closely.
2929
--
3030
--
31-
-- = Description
31+
-- = Finite Int Maps (strict interface internals)
3232
--
3333
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
3434
-- from key of type @Int@ to values of type @v@.
3535
--
36-
-- Each function in this module is careful to force values before installing
37-
-- them in an 'IntMap'. This is usually more efficient when laziness is not
38-
-- necessary. When laziness /is/ required, use the functions in
39-
-- "Data.IntMap.Lazy".
40-
--
41-
-- In particular, the functions in this module obey the following law:
42-
--
43-
-- - If all values stored in all maps in the arguments are in WHNF, then all
44-
-- values stored in all maps in the results will be in WHNF once those maps
45-
-- are evaluated.
46-
--
47-
-- For a walkthrough of the most commonly used functions see the
48-
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
49-
--
50-
-- This module is intended to be imported qualified, to avoid name clashes with
51-
-- Prelude functions:
52-
--
53-
-- > import Data.IntMap.Strict (IntMap)
54-
-- > import qualified Data.IntMap.Strict as IntMap
55-
--
56-
-- Note that the implementation is generally /left-biased/. Functions that take
57-
-- two maps as arguments and combine them, such as `union` and `intersection`,
58-
-- prefer the values in the first argument to those in the second.
59-
--
60-
--
61-
-- == Warning
62-
--
63-
-- The 'IntMap' type is shared between the lazy and strict modules, meaning that
64-
-- the same 'IntMap' value can be passed to functions in both modules. This
65-
-- means that the 'Functor', 'Traversable' and 'Data.Data.Data' instances are
66-
-- the same as for the "Data.IntMap.Lazy" module, so if they are used the
67-
-- resulting map may contain suspended values (thunks).
68-
--
6936
--
7037
-- == Implementation
7138
--

containers/src/Data/IntSet.hs

Lines changed: 16 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -34,7 +34,7 @@
3434
--
3535
-- The implementation is based on /big-endian patricia trees/. This data
3636
-- structure performs especially well on binary operations like 'union'
37-
-- and 'intersection'. However, my benchmarks show that it is also
37+
-- and 'intersection'. Additionally, benchmarks show that it is also
3838
-- (much) faster on insertions and deletions when compared to a generic
3939
-- size-balanced set implementation (see "Data.Set").
4040
--
@@ -57,16 +57,16 @@
5757
--
5858
-- == Performance information
5959
--
60-
-- Operation comments contain the operation time complexity in
60+
-- The time complexity is given for each operation in
6161
-- [big-O notation](http://en.wikipedia.org/wiki/Big_O_notation), with \(n\)
6262
-- referring to the number of entries in the map and \(W\) referring to the
6363
-- number of bits in an 'Int' (32 or 64).
6464
--
65-
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
66-
-- This means that the operation can become linear in the number of
67-
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
68-
-- (32 or 64). These peculiar asymptotics are determined by the depth
69-
-- of the Patricia trees:
65+
-- Operations like 'member', 'insert', and 'delete' have a worst-case
66+
-- complexity of \(O(\min(n,W))\). This means that the operation can become
67+
-- linear in the number of elements with a maximum of \(W\) -- the number of
68+
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
69+
-- depth of the Patricia trees:
7070
--
7171
-- * even for an extremely unbalanced tree, the depth cannot be larger than
7272
-- the number of elements \(n\),
@@ -79,10 +79,15 @@
7979
-- the set is sufficiently "dense", this becomes \(O(\min(n, \log n))\) or
8080
-- simply the familiar \(O(\log n)\), matching balanced binary trees.
8181
--
82-
-- The most performant scenario for 'IntSet' are keys from a contiguous subset,
83-
-- in which case the complexity is proportional to \(\log n\), capped by \(W\).
84-
-- The worst scenario are exponentially growing elements \(1,2,4,\ldots,2^n\),
85-
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
82+
-- The most performant scenario for 'IntSet' are elements from a contiguous
83+
-- subset, in which case the complexity is proportional to \(\log n\), capped
84+
-- by \(W\). The worst scenario are exponentially growing elements \(1,2,4,
85+
-- \ldots,2^n\), for which complexity grows as fast as \(n\) but again is capped
86+
-- by \(W\).
87+
--
88+
-- Binary set operations like 'union' and 'intersection' take
89+
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
90+
-- are the sizes of the smaller and larger input sets respectively.
8691
--
8792
-----------------------------------------------------------------------------
8893

containers/src/Data/IntSet/Internal.hs

Lines changed: 4 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -34,22 +34,18 @@
3434
-- Authors importing this module are expected to track development
3535
-- closely.
3636
--
37-
-- = Description
3837
--
39-
-- An efficient implementation of integer sets.
38+
-- = Finite Int Sets (internals)
4039
--
41-
-- These modules are intended to be imported qualified, to avoid name
42-
-- clashes with Prelude functions, e.g.
43-
--
44-
-- > import Data.IntSet (IntSet)
45-
-- > import qualified Data.IntSet as IntSet
40+
-- The @'IntSet'@ type represents a set of elements of type @Int@. An @IntSet@
41+
-- is strict in its elements.
4642
--
4743
--
4844
-- == Implementation
4945
--
5046
-- The implementation is based on /big-endian patricia trees/. This data
5147
-- structure performs especially well on binary operations like 'union'
52-
-- and 'intersection'. However, my benchmarks show that it is also
48+
-- and 'intersection'. Additionally, benchmarks show that it is also
5349
-- (much) faster on insertions and deletions when compared to a generic
5450
-- size-balanced set implementation (see "Data.Set").
5551
--

containers/src/Data/Map.hs

Lines changed: 49 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -14,16 +14,50 @@
1414
-- Maintainer : [email protected]
1515
-- Portability : portable
1616
--
17+
--
18+
-- = Finite Maps (lazy interface)
19+
--
20+
-- This module re-exports the value lazy "Data.Map.Lazy" API.
21+
--
1722
-- The @'Map' k v@ type represents a finite map (sometimes called a dictionary)
1823
-- from keys of type @k@ to values of type @v@. A 'Map' is strict in its keys but lazy
1924
-- in its values.
2025
--
21-
-- This module re-exports the value lazy "Data.Map.Lazy" API.
26+
-- The functions in "Data.Map.Strict" are careful to force values before
27+
-- installing them in a 'Map'. This is usually more efficient in cases where
28+
-- laziness is not essential. The functions in this module do not do so.
29+
--
30+
-- When deciding if this is the correct data structure to use, consider:
31+
--
32+
-- * If you are using 'Prelude.Int' keys, you will get much better performance for most
33+
-- operations using "Data.IntMap.Lazy".
34+
--
35+
-- * If you don't care about ordering, consider using @Data.HashMap.Lazy@ from the
36+
-- <https://hackage.haskell.org/package/unordered-containers unordered-containers>
37+
-- package instead.
38+
--
39+
-- For a walkthrough of the most commonly used functions see the
40+
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
2241
--
23-
-- This module is intended to be imported qualified, to avoid name
24-
-- clashes with Prelude functions, e.g.
42+
-- This module is intended to be imported qualified, to avoid name clashes with
43+
-- Prelude functions, e.g.
2544
--
26-
-- > import qualified Data.Map as Map
45+
-- > import Data.Map (Map)
46+
-- > import qualified Data.Map as Map
47+
--
48+
-- Note that the implementation is generally /left-biased/. Functions that take
49+
-- two maps as arguments and combine them, such as `union` and `intersection`,
50+
-- prefer the values in the first argument to those in the second.
51+
--
52+
--
53+
-- == Warning
54+
--
55+
-- The size of a 'Map' must not exceed @'Prelude.maxBound' :: 'Prelude.Int'@.
56+
-- Violation of this condition is not detected and if the size limit is exceeded,
57+
-- its behaviour is undefined.
58+
--
59+
--
60+
-- == Implementation
2761
--
2862
-- The implementation of 'Map' is based on /size balanced/ binary trees (or
2963
-- trees of /bounded balance/) as described by:
@@ -48,16 +82,19 @@
4882
-- \"/Parallel Ordered Sets Using Join/\",
4983
-- <https://arxiv.org/abs/1602.02120v4>.
5084
--
51-
-- Note that the implementation is /left-biased/ -- the elements of a
52-
-- first argument are always preferred to the second, for example in
53-
-- 'union' or 'insert'.
5485
--
55-
-- /Warning/: The size of the map must not exceed @maxBound::Int@. Violation of
56-
-- this condition is not detected and if the size limit is exceeded, its
57-
-- behaviour is undefined.
86+
-- == Performance information
87+
--
88+
-- The time complexity is given for each operation in
89+
-- [big-O notation](http://en.wikipedia.org/wiki/Big_O_notation), with \(n\)
90+
-- referring to the number of entries in the map.
91+
--
92+
-- Operations like 'lookup', 'insert', and 'delete' take \(O(\log n)\) time.
93+
--
94+
-- Binary set operations like 'union' and 'intersection' take
95+
-- \(O\bigl(m \log\bigl(\frac{n}{m}+1\bigr)\bigr)\) time, where \(m\) and \(n\)
96+
-- are the sizes of the smaller and larger input maps respectively.
5897
--
59-
-- Operation comments contain the operation time complexity in
60-
-- the Big-O notation (<http://en.wikipedia.org/wiki/Big_O_notation>).
6198
-----------------------------------------------------------------------------
6299

63100
module Data.Map
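
To make the roles of \(m\) and \(n\) in the `Data.Map` bound above concrete, here are its two extreme cases, worked out as an illustration (not part of the diff; base-2 logarithms assumed):

```latex
\[
  m = 1:\quad m \log\Bigl(\frac{n}{m} + 1\Bigr) = \log(n + 1) = O(\log n),
\]
\[
  m = n:\quad m \log\Bigl(\frac{n}{m} + 1\Bigr) = n \log 2 = O(n).
\]
```

The first case matches the cost of a single 'insert'; the second is a linear-time merge of two equally sized maps. In both, \(m\) and \(n\) are sizes, not argument positions.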
