dtl
Category: containers | Component type: type |
The vec_multiset is an STL compatible container that emulates a std::multiset using a sorted std::vector. This container works best when you have a setup phase where you are inserting elements followed by a lookup phase with few insertions or deletions. The rationale for this container is that it can provide much better performance than a std::multiset in terms of both memory and speed.
1. If you insert many elements without any lookups in a "setup" phase followed by a "lookup" phase where you search for elements in the container with few changes to the container. Vec_multiset is optimized for this case in terms of performance and memory usage. Sorted containers are often used in a very regular fashion: population before lookup.
2. Hashed containers are not suitable for your application because you need guaranteed worst case lookup time, or you are unwilling to have the memory overhead associated with a std::hash_set.
1. If exception safety is an issue, you are better off with std::multiset. Vec_multiset is only exception-safe for POD's.
2. If you intermix insertions/erasures and lookups arbitrarily. Std::multiset handles this case better than vec_multiset.
3. If you need pointer or reference validity throughout the lifetime of the container, use std::multiset. Vec_multiset's underlying container can shift the data elements around in memory. However, if you write your code in a more modern fashion to use iterators rather than pointers, you can still use vec_multiset.
4. Iterators for vec_multiset are not guaranteed to be valid until after the first lookup into the container. So, if you need to capture iterators during the setup phase of the container (i.e. iterators returned by insert() before the container has been sorted) you must use std::multiset. If you only need iterators to begin() or end(), you can still use vec_multiset as those operations will trigger a sort on the container.
Many applications use std::multiset by first inserting elements into the container, then performing lookups in that container. This is the behavior that vec_multiset is optimized to handle. The designers of std::multiset designed their container to handle generic sequences of inserts, erases, and lookups in an arbitrary order. Thus, they used a tree for all operations. The tree design carries with it much memory overhead resulting in poor memory contiguity and lookups that are much slower than can be achieved by a std::vector.
Vec_multiset makes some assumptions which allow it to perform its job in an efficient manner. Vec_multiset assumes that the user will first populate the container and then perform lookups, it doesn't have to keep the elements sorted until lookup time. Vec_multiset can perform inserts in amortized constant time because it just pushes the item onto the end of the underlying container. Furthermore, vec_multiset can store its elements contiguously in memory. Lookups occur in logarithmic time because the STL algorithms invoked by the vec_multiset use binary searches, but without the memory thrashing that std::multiset can suffer from. After the first lookup operation, vec_multiset will insert any additional elements into the proper place in the underlying vector to keep the container sorted, which takes linear time. These additional inserts after the first lookup are slow, but that is OK as we expect very few insertions into the container once we begin lookups into the vec_multiset. Similarly, erasures from a vec_multiset occur in linear time, but again, we don't expect many calls to vec_multiset::erase() once the vector is sorted.
The following performance results (times listed in seconds) compare the performance of vec_multiset<int> with std::multiset<int> in the case of 6,000,000 inserts followed by 6,000,000 calls to the member find() function on each container. To make an equal playing field in the timing data for insertion, the time to sort the underlying vector (which occurs right before the first lookup) is included in the totals. Note that the std::stable_sort() used to implement the sorting in vec_multiset should occur in linear time. The table below presents the test results for different ratios of lookups to inserts. We ran these tests in Debug build in Microsoft Visual C++ v 6.0 SP 5 using STLPort 4.0. We used the optimizations of 'Maximize Speed' and 'Any Suitable' inlining. We also compare against STLPort 4.0's hash_multiset<int> container. The tests performed the same number of lookups as inserts and the actual times recorded in those runs appear in the 1:1 row of the table. We then extrapolated the results in the other rows (2:1, 3:1, 4:1, and 5:1) by multiplying the lookup times by the appropriate factor and adding these totals to the insertion times for our runs. .
ratio of lookups to inserts | multiset inserts |
vec_multiset inserts |
hash_multiset inserts | multiset lookups | vec_multiset lookups | hash_multiset lookups | multiset total | vec_multiset total | hash_multiset total |
1:1 | 15 | 6 | 4 | 2 | 2 | 1 | 17 | 8 | 5 |
2:1 | 15 | 6 | 4 | 4 | 4 | 2 | 19 | 10 | 6 |
3:1 | 15 | 6 | 4 | 6 | 6 | 3 | 21 | 12 | 7 |
4:1 | 15 | 6 | 4 | 8 | 8 | 4 | 23 | 14 | 8 |
5:1 | 15 | 6 | 4 | 10 | 10 | 5 | 25 | 16 | 9 |
Vec_multiset still can't beat a hashed container with a good hashing function as the numbers for hash_multiset prove. The numbers above indicate that vec_multiset is a significant improvement over std::multiset for insertions. Vec_multiset's usual case of constant time insertion beats the logarithmic time of std::multiset's corresponding calls. Vec_multiset matches the good logarithmic performance of lookup calls in std::multiset. We were surprised that the std::multiset lookups did not suffer from the performance penalty expected from memory thrashing. The vec_multiset matches the ideal performance that would be achieved by manually using a std::vector and the appropriate algorithm calls on it (std::stable_sort(), std::lower_bound(), and friends).
The container takes advantage of the assumption that first the user populates the container, and then perform lookups in it. When the user is populating the container, the vec_multiset simply tacks newly inserted elements onto the end of the container. The first time the user invokes a lookup method on the container (also including vec_multiset::begin() and vec_multiset::end()), the container sorts itself. The lookup operations simply call the appropriate STL algorithms, passing the predicate function through along with the begin() and end() iterators of the underlying vector which actually holds the elements. The user may still insert or erase elements after the first lookup, but now the container inserts any new items into their proper sorted positions.
Vec_multiset provides iterators that are always valid unless:
1. The container hasn't been sorted yet. This happens as the sort will rearrange the items in the container through a call to std::stable_sort(), which is oblivious to the structure of vec_multiset or its iterators.
or
2. The element pointed to has been removed from the container. This is true for iterators of the standard STL containers, including std::multiset.
To accomplish this, a vec_multiset::iterator stores the index of the element within the vec_multiset it belongs to (actually in its underlying container), rather than a raw pointer to the element. The iterator thus doesn't care about actual memory addresses of the elements, which is especially important when the vec_multiset's underlying container shifts elements to a new block of memory due to getting full. However, elements in the underlying vector get shifted on insertions and deletions from the container. The index of the element an iterator refers to thus may change during one of these operations. The vec_multiset keeps a log of all insertions and erasures that are performed on it after the container is sorted so that iterators may use this information to properly update their indices (or invalidate themselves if the element referred to was erased).
For the following discussion, assume that a pointer takes 4 bytes of memory and we're basing our data here on Microsoft Visual C++ 6.0 SP5's implementation of std::vector and std::multiset. We also ignore the comparison function and allocator members. We specifically compare a vec_multiset whose underlying container is a std::vector.
In general, vec_multiset incurs less memory overhead than std::multiset. Std::multiset contains the red-black tree which holds the data elements as a member. In turn, the tree holds a pointer to its root. The overhead for this pointer is thus 4 bytes.. The per-element overhead for std::multiset is whatever the overhead for a single tree node is. A tree node contains pointers to the left, right, and parent nodes of the element as well as the color flag (red or black). Thus, we have a total of 16 bytes overhead per element in the std::multiset assuming the color flag takes up 4 bytes (enums are 4 bytes in Visual C++).
Thus, the total overhead for std::multiset = 4 bytes for the pointer to the tree root + 16 bytes * number of elements in the container.
Vec_multiset is forgiving in its use of memory. The total overhead is that of the change log (a std::vector<iter_change> recording insertions/erasures after the first lookup into the vec_multiset), the overhead of the underlying std::vector which contains the data stored in the vec_multiset, and the memory taken up by a flag indicating whether the underlying vector's data is sorted or not (a bool, which in Visual C++ is 1 byte).
As both the change log and elements contained in a vec_multiset are stored in an underlying std::vector, first we need to compute the overhead for the std::vector template. A std::vector contains three iterators, which are implemented as pointers, referring to the begin, end, and last (end of storage) elements in the underlying array, taking up 12 bytes total. As the elements are stored in the underlying array, there is no per-element overhead.
The overhead for the change log will thus be 12 bytes plus the number of elements in the change log which should be relatively small as long as the vec_multiset is used as intended (few inserts/erasures after first lookup). Next we need to know the size of an element in the change log. Each entry in the change log contains an enum indicating whether the change was an insertion or an erasure, the index where the insertion/erasure is occurring (of type size_t, 4 bytes in Visual C++), and the number of elements inserted/erased during this operation (of type size_t). Thus, each log entry takes up 4 bytes for the insert/erase flag + 4 bytes for the insert/erase index + 4 bytes for the value indicating number elements inserted = 12 bytes.
Therefore, the overhead due to the change log = 12 bytes for std::vector overhead + 12 bytes * number of insertions and erasures after first lookup
The overhead for the underlying data vector = 12 bytes.
Thus, the total overhead for vec_multiset =12 bytes for the change log vector's overhead itself + 12 bytes * number of insertions and erasures after first lookup * sizeof(iter_change) + 12 bytes for the std::vector which contains the data elements + 1 byte for the sorted flag
Combining like terms:
Total overhead for vec_multiset = 25 bytes + 12 bytes * number of insertions and erasures after first lookup.
Comparing the results shows that in general, vec_multiset incurs far less memory overhead than std::multiset if very few insertions and erasures are done after the first lookup (remember, that's the intended case for vec_multiset!). Especially note that there is absolutely no per-element overhead for a vec_multiset, which is a major plus! Compare that with the overhead of 16 bytes per element in std::multiset.
We could have used two different methods to achieve iterator stability with vec_multiset: 1. A method David Abrahams proposes for iterator stability (indirect_container and indirect_iterator .. storing a container of iterators to the data elements). This means that the user has to populate a data container such as a std::vector<T> and then create some other container whose elements actually refer to elements in that first data vector, such as a std::set<T>::iterator. This method is a lean and mean way to get iterator (or pointer) stability. However, it comes at at possible cost of a loss of memory contiguity (depending on the allocator) which can slow down memory accesses and also extra address arithmetic the machine must perform. 2. Our method references the data elements directly. Vec_multiset<T> internally stores an underlying std::vector<T>which actually contains the data. The vec_multiset<T>::iterator stores several simple pieces of information: * a pointer to the vec_multiset that it is referring to * the index of the element in the vec_multiset's underlying vector * an int that tells where to look in the referred to container's change log of inserts/erasures after the first lookup in the container so the iterator may update its position and thus maintain iterator stability. This method guarantees both iterator stability and contiguity of memory at very little expense. The key here is that there are no address computations necessary each time elements are compared vs. the indirect_iterator method which must dereference an iterator/pointer. We believed that the direct storing of the vector data would be much more efficient rather than the indirect_container approach. So we performed a benchmark test to compare the two methods. For this test, we used a sample of 15,000,000 ints and used a std::vector<int> as our base container for the indirect_iterator-like simulation to eliminate contiguity as a factor. For the direct storing method (the method we used to implement vec_multiset), we did the following: 1. Inserted the ints into the vector using std::vector<int>::push_back(). 2. Called std::stable_sort() from begin() to end() on the vector 3. Did 15,000,000 std::lower_bound() calls on the begin() to end() range to locate a number. We timed each step and then added the results together to get the overall total for this method. To simulate the indirect_container method, we had to: 1. Populate a std::vector<int> with the 15,000,000 elements using std::vector<int>::push_back(). 2. For each element vec[i] in the vector vec, we inserted &vec[i] into a vector<int *> called pvec. 3. Call std::stable_sort(pvec.begin(), pvec.end(), IntStarLess()) where IntStarLess::operator()(pInt1, pInt2) is true iff *p1 < *p2. 4. Perform 15,000,000 std::lower_bound() calls over the pvec.begin() to pvec.end() range with an IntStarLess as the predicate. Again, we timed each step and summed the results together for the overall time. We performed the same operations using each method using the same data set and calls to std::lower_bound(). The results we got back were astonishing which confirmed our reasoning: ---- vector<int> ---- vec.push_back(): 26 stable_sort(vec.begin(), vec.end()): 20 lower_bound(): 6 *** Total time for vector<int> impl.: 52 *** ---- vector <int *> ---- vec_ptrs.push_back(): 8 stable_sort(vec_ptrs.begin(), vec_ptrs.end()): 55 lower_bound(): 44 Subtotal time for vector<int *> ops: 107 (but we're not done as we must add in the vector<int *> push_back time of: 26) *** Total time for vector<int *> impl.: 133 *** As you can see, the std::stable_sort() and std::lower_bound() calls for the direct approach easily outperformed the corresponding calls for the indirect_iterator simulation. Also, there is the additional overhaead in the indirect method of having to push_back the objects onto the data vector even before it starts working with the pointers. The performance test proved that the direct approach we used for vec_multiset is much faster and more efficient than the use of creatures like indirect_iterator. Caveats to these results: 1. With larger objects, the indirect_iterator method probably would perform much faster as the sort only swaps pointers. The direct storing method's sort call actually has to swap the objects themselves, which can definitely be much more expensive. In the case of our example above this penalty does not appear since ints are small. 2. As Dave Abrahams pointed out, if your container holds something like std::string objects, then holding a vector of pointers (e.g. std::string *) may not lose you much anyway since the underlying objects are in fact holding the data via pointers.
using namespace std; int main() { vec_multiset<int> vms; // "setup" phase ... populate our container vms.insert(23); vms.insert(94); vms.insert(76); vms.insert(10); vms.insert(91); vms.insert(12); vms.insert(76); // more inserts here // ... // "lookup" phase ... use our container to find different elements // print out container elements ... they'll appear in sorted order cout << "Container elements: " << endl; copy(vms.begin(), vms.end(), ostream_iterator<int>(cout, " ")); cout << endl; if (vms.find(12) != vms.end())
cout << "Found element with value 12" << endl; else cout << "Could not find element with value 12" << endl; pair<vec_multiset<int>::iterator, vec_multiset<int>::iterator> pr = vms.equal_range(76); cout << "Elements with value 76: " << endl; copy(pr.first, pr.second, ostream_iterator<int>(cout, " ")); cout << endl; // more lookups here, possibly calls to vms.lower_bound(), vms.count(), etc. return 0; }
Defined in the vec_multiset.h header file.
Multiple Sorted Associative Container, Simple Associative Container
Those defined by Multiple Sorted Associative Container and Simple Associative Container.
None.
Parameter | Description | Default |
---|---|---|
Key | The type of elements within the vec_multiset. | |
Pred | The key comparison function, a Strict Weak Ordering whose argument type is key_type; it returns true if its first argument is less than its second argument, and false otherwise. This is also defined as vec_multiset::key_compare and vec_multiset::value_compare. | std::less<Key> > |
A |
The allocator used by the container. |
std::allocator<Key> > |
Container | The underlying container used by the vec_multiset to actually hold the elements. | std::vector<Key, A> |
Identical to those declared for std::multiset.
Multiple Sorted Associative Container, Simple Associative Container.
The idea for this container came from concepts discussed in:
Meyers, Scott. Effective STL, Item 23, pp. 100-106. Addison-Wesley, Boston: 2001.
Copyright © 2002, Michael Gradman and Corwin Joy.
Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appears in all copies and that both that copyright notice and this permission notice appear in supporting documentation. Corwin Joy and Michael Gradman make no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.