Class EliasFanoMonotoneLongBigList

All Implemented Interfaces:
BigList<Long>, LongBigList, LongCollection, LongIterable, LongStack, Size64, Stack<Long>, Serializable, Comparable<BigList<? extends Long>>, Iterable<Long>, Collection<Long>
Direct Known Subclasses:
EliasFanoIndexedMonotoneLongBigList, EliasFanoPrefixSumLongBigList, SparseSelect

public class EliasFanoMonotoneLongBigList
extends AbstractLongBigList
implements Serializable
An implementation of Elias–Fano's representation of monotone sequences; an element occupies a number of bits bounded by two plus the logarithm of the average gap.

Instances of this class represent in a highly compacted form a nondecreasing sequence of natural numbers. Instances are built by providing either an iterator returning the (nondecreasing) sequence, or an iterable object that provides such an iterator. In the first case, you must also provide in advance the number of elements that will be returned and an upper bound to their values (see below), and at the end of the construction the iterator will be exhausted.

An additional bulk method makes it possible to extract several consecutive entries at high speed, and getDelta(long) computes directly the difference between two consecutive elements. Moreover, the nextLong() method of an iterator will read read consecutive data much faster than repeated calls to getLong(long).

Methods to not usually perform bound checks on the arguments. Bounds checks can be enabled, however, by enabling assertions.

This class is thread safe.

Implementation details

Given a monotone sequence 0 ≤ x0x1 ≤ … ≤ xn − 1 < u, where u is a given upper bound (the size of the universe), the Elias–Fano representation makes it possible to store it using at most 2 + log(u/n) bits per element, which is very close to the information-theoretical lower bound ≈ log e + log(u/n). A typical example is a list of pointer into records of a large file: instead of using, for each pointer, a number of bit sufficient to express the length of the file, the Elias–Fano representation makes it possible to use, for each pointer, a number of bits roughly equal to the logarithm of the average length of a record. The representation was introduced in Peter Elias, “Efficient storage and retrieval by content and address of static files”, J. Assoc. Comput. Mach., 21(2):246−260, 1974, and also independently by Robert Fano, “On the number of bits required to implement an associative memory”, Memorandum 61, Computer Structures Group, Project MAC, MIT, Cambridge, Mass., n.d., 1971.

The elements of the sequence are recorded by storing separately the lower s = ⌊log(u/n)⌋ bits and the remaining upper bits. The lower bits are stored contiguously, whereas the upper bits are stored in an array of n + u / 2s bits by setting, for each 0 ≤ i < n, the bit of index xi / 2s + i; the value can then be recovered by selecting the i-th bit of the resulting bit array and subtracting i (note that this will work because the upper bits are nondecreasing).

This implementation uses SimpleSelect to support selection inside the upper-bits array, and exploits SimpleSelect.select(long, long[], int, int) to implement get(long, long[], int, int).

See Also:
EliasFanoIndexedMonotoneLongBigList, Serialized Form
  • Field Details

    • length

      protected final long length
      The length of the sequence.
    • l

      protected final int l
      The number of lower bits.
    • upperBits

      protected transient long[] upperBits
      The upper bits, stored as unary gaps.
    • lowerBits

      protected long[] lowerBits
      The list of lower bits of each element, stored explicitly.
    • selectUpper

      protected final SimpleSelect selectUpper
      The select structure used to extract the upper bits.
    • lowerBitsMask

      protected final long lowerBitsMask
      The mask for the lower bits.
  • Constructor Details

    • EliasFanoMonotoneLongBigList

      protected EliasFanoMonotoneLongBigList​(long length, int l, long[] upperBits, long[] lowerBits, SimpleSelect selectUpper)
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(IntIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
      Parameters:
      list - an iterable object returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(ShortIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
      Parameters:
      list - an iterable object returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(ByteIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
      Parameters:
      list - an iterable object returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(LongIterable list)
      Creates an Elias–Fano representation of the values returned by the given iterable object.
      Parameters:
      list - an iterable object returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(long n, long upperBound, ByteIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

      This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

      Parameters:
      n - the number of elements returned by iterator.
      upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
      iterator - an iterator returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(long n, long upperBound, ShortIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

      This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

      Parameters:
      n - the number of elements returned by iterator.
      upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
      iterator - an iterator returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(long n, long upperBound, IntIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

      This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

      Parameters:
      n - the number of elements returned by iterator.
      upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
      iterator - an iterator returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      public EliasFanoMonotoneLongBigList​(long n, long upperBound, LongIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

      This constructor is particularly useful if the elements of the iterator are provided by some sequential source.

      Parameters:
      n - the number of elements returned by iterator.
      upperBound - a strict upper bound to the values returned by iterator (note that it used to be non-strict).
      iterator - an iterator returning nondecreasing natural numbers.
    • EliasFanoMonotoneLongBigList

      protected EliasFanoMonotoneLongBigList​(long[] a, LongIterator iterator)
      Creates an Elias–Fano representation of the values returned by an iterator, given that the overall number of elements and an upper bound are provided, too.

      This constructor is used only internally, to work around the usual problems caused by the obligation to call this() before anything else.

      Parameters:
      a - an array containing the number of elements returned by iterator and a strict upper bound to the values returned by iterator (note that it used to be non-strict).
      iterator - an iterator returning nondecreasing natural numbers.
  • Method Details