Module Mset

Multisets

Multisets are a variant of sets where each element may appear several time. The number of occurrences of a given element is called its multiplicity. For instance, the multiset

        {{ a, a, a, b, c, c }}

has three occurrences of element a, one occurrence of element b, and two occurrence of element c. The size of a multiset is the sum of its multiplicities, here 6.

This module implements a persistent data structure for multisets using bitmaps, for multisets small enough to fit in a single machine integer.

The universe (i.e. the elements that can be stored in the multiset and, for each, its maximal multiplicity) has to be provided upfront. Functions over multisets fail if they are given elements not belonging to the universe, or if the capacity of an element is exceeded.

module type S = sig ... end

Signature of the multiset data type

module type UNIVERSE = sig ... end

The type for the elements

module Make (X : UNIVERSE) : sig ... end

Functor building an implementation of the multiset structure given an element type.

val chars : (char * int) list -> (module S with type elt = char)

Returns a multiset implementation for a universe of characters.

val of_string : ?filter:(char -> char option) -> string -> (module S with type elt = char)

Returns a multiset implementation for a universe corresponding to the characters of a given string. Characters can be filtered with function filter: a character c is either transformed to another character d with Some d, or filtered out with None. The default filter function ignores blank characters (ASCII codes 9, 10, 12, 13, and 32).

Multisets of uppercase letters (without accents), according to the frequencies of letters in French and English.

module FR : S with type elt = char
module EN : S with type elt = char