
Can size be made O(1)? #138

Open
neongreen opened this issue Nov 15, 2016 · 11 comments · May be fixed by #170

Comments

@neongreen

neongreen commented Nov 15, 2016

It is somewhat unfortunate that size is O(n) for HashMap and HashSet:

  • If a faster size is desired, I have to store it separately in my code and make sure that I update it whenever I do an insert or delete. With filtering functions, union, etc. it gets even more cumbersome.

  • Even experienced programmers often assume that size is O(1) without checking, which leads to performance bugs :(

(I have looked at the code, but I have to confess that I don't understand how HashMap/Set work and so it's hard for me to assess how much of a performance/memory penalty adding a size field to the constructors would be, or whether it is possible at all.)

@tibbe
Collaborator

tibbe commented Nov 15, 2016

I've thought about it in the past but haven't found time to work on it. Basically we'd have to wrap the current HashMap data type (a tree-like structure) in another type that tracks the size:

data HashMapW k v = HashMapW {-# UNPACK #-} !Int !(HashMap k v)

All functions that can change the size would have their internal implementations return an extra Int indicating the size change:

insertInternal :: (Hashable k, Eq k) => k -> v -> HashMap k v -> (Int, HashMap k v)

The normal insert function would extract the HashMap from HashMapW, call insertInternal, and then wrap the result in a HashMapW again, updating the size using the returned Int.

The question is how much this costs in terms of performance and implementation complexity. Someone would have to try it out.
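The wrapper idea can be sketched as follows. This is a minimal, self-contained illustration, not the library's actual internals: an association list stands in for the real HAMT, and all names (Tree, sizeW, insertW) are invented for the sketch.

```haskell
-- Sketch of the wrapper approach: the outer type caches the size,
-- and internal operations report a size delta.
type Tree k v = [(k, v)]  -- stand-in for the internal HAMT

data HashMapW k v = HashMapW !Int (Tree k v)

-- size is now O(1): just read the cached field.
sizeW :: HashMapW k v -> Int
sizeW (HashMapW n _) = n

-- Internal insert returns the size delta (0 or 1) alongside the new tree.
insertInternal :: Eq k => k -> v -> Tree k v -> (Int, Tree k v)
insertInternal k v t =
  case lookup k t of
    Just _  -> (0, (k, v) : filter ((/= k) . fst) t)  -- overwrite: size unchanged
    Nothing -> (1, (k, v) : t)                        -- new key: size + 1

-- The public insert unwraps, calls the internal version, and rewraps,
-- updating the cached size with the returned delta.
insertW :: Eq k => k -> v -> HashMapW k v -> HashMapW k v
insertW k v (HashMapW n t) =
  let (d, t') = insertInternal k v t
  in HashMapW (n + d) t'
```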

@neongreen
Author

The question is how much this costs in terms of performance and implementation complexity. Someone would have to try it out.

Okay, thanks! If someone at my work tries it out, I'll post an update here.

@rockbmb
Contributor

rockbmb commented Oct 3, 2017

@tibbe I've started working on this, and I've gotten to

unionWithKeyInternal :: (Eq k, Hashable k) => (k -> v -> v -> v) -> HashMap k v -> HashMap k v -> (Int, HashMap k v)

which is required by unionWithInternal, which in turn is required by the new unionInternal function that'll use HashMapWrappers. Threading the number of added elements from each recursive call in the go helper is proving complex, especially in the branch vs. branch cases (talking about this: https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Base.hs#L803).

I think changing the HashMap datatype to something like

data HashMap k v
    = Empty
    | BitmapIndexed !Size !Bitmap !(A.Array (HashMap k v))
    | Leaf !Hash !(Leaf k v)
    | Full !Size !(A.Array (HashMap k v))
    | Collision !Size !Hash !(A.Array (Leaf k v))

type Size = Int

similar to what is done for Map would entail a comparable amount of work, and unlike the approach first suggested, would not hurt the code's readability or ease of refactoring.

What do you think about this?
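With a size field in every branch constructor, size becomes a single pattern match on the root. A rough self-contained sketch of that lookup (plain lists stand in for A.Array here, and the Leaf payload is simplified):

```haskell
type Size   = Int
type Hash   = Int
type Bitmap = Int

data Leaf k v = L !k v

-- The proposed layout, with a Size field in each branch constructor.
data HashMap k v
    = Empty
    | BitmapIndexed !Size !Bitmap [HashMap k v]
    | Leaf !Hash !(Leaf k v)
    | Full !Size [HashMap k v]
    | Collision !Size !Hash [Leaf k v]

-- size no longer traverses the tree; it reads the root's field.
size :: HashMap k v -> Size
size Empty                 = 0
size (BitmapIndexed n _ _) = n
size (Leaf _ _)            = 1
size (Full n _)            = n
size (Collision n _ _)     = n
```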

@treeowl
Collaborator

treeowl commented Oct 3, 2017

I think that makes the constructors bigger, which is a definite performance trade-off. That doesn't mean we shouldn't do it, but it's not free. Benchmark! Just what sort of trouble did you run into with the union? I'd fear more trouble from intersection and difference operations, but I still don't know much about these structures.

@rockbmb
Contributor

rockbmb commented Oct 3, 2017

@treeowl I forgot to take performance into consideration, you're right.

A bit of background: as a first step I've added copies of every function that may change a hashmap's size and suffixed their names with Internal - this will be for the final refactoring, or at least that's my plan.

This line and the cases after it in particular (rockbmb@5a6f098#diff-11321ff356efb0247b49763554dc8620R1176) are where I think this approach becomes complex to implement: because go no longer returns a HashMap and now returns (Int, HashMap), a new unionArrayBy function is required whose first argument has type (a -> a -> (Int, a)) rather than (a -> a -> a).

My concern is that unionArrayBy is used elsewhere, so duplicating this nontrivial function this early on, for what is not even the hardest part of the task, seems like a bad sign.

I'll continue with this approach until I have definite proof it won't work, but considering the alternative might not be a waste of time either.
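For what it's worth, the type change described above can be sketched like this. It's a deliberate simplification: plain lists in place of the internal Array type, and a positional merge in place of the real bitmap-driven one, just to show how the deltas get threaded through.

```haskell
-- Hypothetical variant of unionArrayBy whose combining function also
-- returns a size delta; the deltas from every merged slot are summed.
unionArrayByInternal :: (a -> a -> (Int, a)) -> [a] -> [a] -> (Int, [a])
unionArrayByInternal f = go 0
  where
    go acc (x:xs) (y:ys) =
      let (d, z)        = f x y
          (total, rest) = go (acc + d) xs ys
      in (total, z : rest)
    go acc xs ys = (acc, xs ++ ys)  -- leftover slots merge with nothing: delta 0
```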

@tibbe
Collaborator

tibbe commented Oct 3, 2017

I don't think storing the size in the tree is worth it. It will affect performance of much more than size (and in an insidious way, by increasing GC times). Per-element overheads for Haskell data structures are already a bit too high.

@pepeiborra

If O(1) is not viable, is there any way to make size O(log n) at least?

@rockbmb
Contributor

rockbmb commented Jun 11, 2021

@pepeiborra it's been a few years since I last attempted to work on this, but from what I recall, it's straightforward to make it O(1) (making sure there are no regressions, not so much), while it's unclear to me right now how to make it O(log n) in a way that would improve on that approach.

@sjakobi
Member

sjakobi commented Apr 2, 2022

Here is an idea for storing the sizes in the tree without increasing node sizes (by much):

data HashMap k v
    = Empty
    | BitmapIndexed !Bitmap !(A.Array (HashMap k v))
    | Leaf !Hash !(Leaf k v)
    | Full !(A.Array (HashMap k v))
    | Collision !Hash !(A.Array (Leaf k v))

  • For Empty and Leaf, the size is obvious.
  • In the case of Collision, we can query the size of the array.
  • Full nodes should be pretty rare, so adding a Size field shouldn't really hurt.
  • In BitmapIndexed, I think we can store the size in the Bitmap in all but the largest nodes: a BitmapIndexed currently has at most 31 children, so we use only the lower 31 bits of the bitmap. In the remaining 33 bits (on 64-bit systems) we could therefore store the size, up to 2^33 - 1 = 8,589,934,591. Primarily for the benefit of 32-bit systems we could even use a little trick based on the invariant that a BitmapIndexed must contain at least 2 elements, and store (size - 2) in the bitmap. This way, on 32-bit systems, we can store sizes < 4 in the bitmap.
  • For the cases where the size does not fit into the bitmap, we can use a new constructor BitmapIndexedLarge with a dedicated size field.
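If I understand the bit-packing correctly, it could look roughly like this. This is a hypothetical sketch for the 64-bit case only; the names and the Bitmap representation are invented for illustration and differ from the library's actual internals.

```haskell
import Data.Bits (bit, shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

type Bitmap = Word64

-- Lower 31 bits: the child bitmap. Upper 33 bits: (size - 2), using the
-- invariant that a BitmapIndexed node holds at least 2 elements.
childMask :: Bitmap
childMask = bit 31 - 1

-- Store a node's size in the bitmap's unused high bits.
packSize :: Int -> Bitmap -> Bitmap
packSize sz bm = (fromIntegral (sz - 2) `shiftL` 31) .|. (bm .&. childMask)

-- Recover the size in O(1) from the high bits.
unpackSize :: Bitmap -> Int
unpackSize bm = fromIntegral (bm `shiftR` 31) + 2

-- Recover the plain child bitmap for the usual indexing logic.
childBits :: Bitmap -> Bitmap
childBits bm = bm .&. childMask
```

Sizes that fit in the 33 spare bits round-trip; anything larger would need the BitmapIndexedLarge fallback mentioned above.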

@treeowl
Collaborator

treeowl commented Apr 2, 2022

Interesting... I'm a bit nervous about having a different big-O for different word sizes. Should we use 64-bit always to work around that? On the flip side, if we don't do this, would we (now or in the future) benefit by narrowing that field to a Word32 (getting rid of the high bits instead of using them)?

The BitmapIndexedLarge thing sounds just a tad expensive, in code size if nothing else. But there may not be a better way if fast sizes are really important. Are fast sizes really important?

@sjakobi
Member

sjakobi commented Apr 2, 2022

I'm a bit nervous about having a different big-O for different word sizes.

Could you expand a bit on this? Maybe give an example?

Should we use 64-bit always to work around that?

That would be one possibility. Alternatively we could make B (bitsPerSubkey) platform-dependent and set it to 4 on 32-bit systems. The corresponding Bitmaps would then use only the lower 15 bits.

But frankly I'm entirely unconcerned about performance on 32-bit systems. I think just about anyone who cares about performance will use a 64-bit system.

On the flip side, if we don't to this, would we (now or in the future) benefit by narrowing that field to a Word32 (get rid of the high bits instead of using them)?

I don't believe we can save space by this. If we can, I'd like to hear about it.

I think it might be useful for constant folding in some rare cases. Roughly speaking, GHC could then possibly simplify testBit fullNodeMask x to True, regardless of x. But I don't know whether GHC can actually do this already, and I also suspect that any performance gains would be pretty minuscule.

The BitmapIndexedLarge thing sounds just a tad expensive, in code size if nothing else. But there may not be a better way if fast sizes are really important. Are fast sizes really important?

Yeah, at least after inlining, the generated code might be a good bit larger.

Are fast sizes really important?

For some of the participants in this thread they seem to be quite important.

I also agree with this bit from the issue description:

Even experienced programmers often assume that size is O(1) without checking, which leads to performance bugs :(

There's a similar confusion about the complexity of IntMap.size. See e.g. https://www.reddit.com/r/haskell/comments/t3zw0x/monthly_hask_anything_march_2022/i0ieezu/.
