Advances in Virtuoso RDF Triple Storage (Bitmap Indexing)
Created on 2022-05-25T14:29:56-05:00
Workload: bulk loading of documents, followed by random access with an uneven distribution of search terms.
Uses one thread to parse triples and worker threads to insert the parsed triples into the database.
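A minimal sketch of that one-parser, many-inserters arrangement, using Python's standard `queue` and `threading` modules. This is not Virtuoso's loader: the whitespace-separated line format, the in-memory `set` standing in for the database, and the names `parse_triples`, `insert_worker`, and `bulk_load` are all assumptions made for illustration.

```python
import queue
import threading


def parse_triples(lines, out_queue, n_workers):
    """Parser thread: turn each line into a (subject, predicate, object) tuple."""
    for line in lines:
        parts = line.split(None, 2)  # assumed format: "s p o", whitespace separated
        if len(parts) == 3:
            out_queue.put(tuple(parts))
    for _ in range(n_workers):
        out_queue.put(None)  # one stop marker per worker


def insert_worker(in_queue, store, lock):
    """Worker thread: take parsed triples off the queue and insert them."""
    while True:
        triple = in_queue.get()
        if triple is None:
            break
        with lock:  # the set is a stand-in for the real database insert
            store.add(triple)


def bulk_load(lines, n_workers=4):
    work = queue.Queue(maxsize=1024)  # bounded, so the parser cannot run far ahead
    store, lock = set(), threading.Lock()
    workers = [
        threading.Thread(target=insert_worker, args=(work, store, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    parser = threading.Thread(target=parse_triples, args=(lines, work, n_workers))
    parser.start()
    parser.join()
    for w in workers:
        w.join()
    return store
```

The bounded queue is the design choice worth noting: it applies back-pressure to the parser when the inserting threads fall behind.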
Bitmaps are useful when the cardinality of the data is low but the number of records is high (e.g., one million records, each of which is either male or female).
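To make the low-cardinality case concrete, here is a small sketch that keeps one bitmap per distinct value, with bit i set when record i has that value. Python integers serve as arbitrary-length bitmaps; the function names and the sample data are made up for illustration.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Return {value: bitmap}, where bit i of the bitmap is set if values[i] == value."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def rows_matching(bitmap):
    """List the row numbers whose bit is set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


genders = ["male", "female", "female", "male", "female"]
index = build_bitmap_index(genders)
female_rows = rows_matching(index["female"])  # rows 1, 2 and 4
```

With only two distinct values the index is two bitmaps of one bit per record, regardless of how many records there are.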
Bitmaps need to be space efficient when only a single bit in the map is set.
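One way to picture the single-bit case is an encoding that stores a lone position as a plain integer and only falls back to a dense bit array when more than one bit is set. This is an assumed scheme for illustration, not the layout Virtuoso uses; `universe_bits` and both function names are hypothetical.

```python
def encode_bitmap(positions, universe_bits=8192):
    """Encode set bit positions compactly.

    Assumes every position is in range(universe_bits).
    Returns ("single", pos) when exactly one bit is set,
    otherwise ("dense", bytes) holding universe_bits bits.
    """
    positions = sorted(set(positions))
    if len(positions) == 1:
        return ("single", positions[0])
    dense = bytearray(universe_bits // 8)
    for pos in positions:
        dense[pos // 8] |= 1 << (pos % 8)
    return ("dense", bytes(dense))


def contains(encoded, pos):
    """Test whether bit pos is set in an encoded bitmap."""
    kind, payload = encoded
    if kind == "single":
        return payload == pos
    return bool(payload[pos // 8] & (1 << (pos % 8)))
```

Under these assumptions a one-bit bitmap costs one integer instead of 1024 bytes.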
If each bit in the bitmap corresponds to a block of data, then a set bit means relevant data exists in that block. So a 64-bit bitmap can tell you which of 64 buckets contain matching data.
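The block-per-bit idea can be sketched as a 64-bit summary word. `BLOCK_SIZE` is a hypothetical value chosen for the example, and the function names are illustrative only.

```python
NUM_BLOCKS = 64      # one bit per block, so the summary fits in a 64-bit word
BLOCK_SIZE = 1000    # hypothetical number of rows per block


def build_summary(row_ids):
    """Set bit b when at least one matching row falls in block b."""
    bitmap = 0
    for row_id in row_ids:
        block = row_id // BLOCK_SIZE
        if not 0 <= block < NUM_BLOCKS:
            raise ValueError(f"row {row_id} is outside the {NUM_BLOCKS} blocks")
        bitmap |= 1 << block
    return bitmap


def may_contain(bitmap, row_id):
    """True if the block holding row_id has matching data somewhere in it."""
    return bool((bitmap >> (row_id // BLOCK_SIZE)) & 1)


def blocks_to_scan(bitmap):
    """Return the block numbers whose bit is set."""
    return [b for b in range(NUM_BLOCKS) if (bitmap >> b) & 1]


summary = build_summary([5, 1500, 63999])
```

A set bit only narrows the search to a block; the block itself still has to be read to find the matching rows. A clear bit lets the whole block be skipped.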
Bitmap schemes are only slightly slower than complex B-tree indexing.