-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider dropping type checks in Keccak256 methods #16
Comments
So, it turns out that during a full sync we're spending so much time on |
@gsalgado I'm betting most of those |
The biggest offenders are actually rlp and trie: rlp/codec.py:72(encode_raw), trie/validation.py:11(validate_is_bytes), rlp/sedes/serializable.py:300(make_immutable). Just those 3 seem to account to almost half of all |
Hrm, I could get onboard with dropping the check from |
I haven't looked at the specific code, but I wonder if the behavior of some of those checks could be duplicated by appropriately coded try/except blocks. According to the docs, try/except is very efficient when an exception is not raised: https://docs.python.org/3.6/faq/design.html#how-fast-are-exceptions. |
not the right solution but I did have a look at |
I think we can improve performance on all of these with
Trivial to move to
Not sure about this one. Can we simply disallow list as a valid input type, and use
This one is tougher, but here is one way to reduce the calls of old code if isinstance(item, Atomic):
if len(item) == 1 and item[0] < 128:
return item
payload = item
prefix_offset = 128 # string
elif not isinstance(item, str) and isinstance(item, collections.Sequence):
payload = b''.join(encode_raw(x) for x in item)
prefix_offset = 192 # list
else:
msg = 'Cannot encode object of type {0}'.format(type(item).__name__)
raise EncodingError(msg, item) new code if isinstance(item, Atomic):
if len(item) == 1 and item[0] < 128:
return item
payload = item
prefix_offset = 128 # string
else:
try:
assert not isinstance(item, str)
except AssertionError as exc:
raise encoding_error(item) from exc
try:
payload = b''.join(encode_raw(x) for x in item)
except TypeError as exc:
raise encoding_error(item) from exc
prefix_offset = 192 # list
def encoding_error(item)
msg = 'Cannot encode object of type {0}'.format(type(item).__name__)
return EncodingError(msg, item) |
I guess this should really be moved to py-evm/pyrlp/py-trie -- seems like we shouldn't remove the type checks in eth-hash until they are clearly a bottleneck. |
We can vendor the |
I've always felt weird about disabling asserts since some libraries might incorrectly use them as a general exception throwing mechanism. Thoughts on that? |
I hope that having libraries break when you turn off assertions would be very rare. I would certainly consider that broken. Is that something you've run into before? @davesque |
@carver Actually, no thankfully. And I guess it's probably not a very big risk and we'd notice it pretty quickly if a library broke in optimized mode. But it does seems a bit undesirable that one should have to use But maybe it's possible that some of that manual type checking could just be removed altogether since a wrong value type would lead pretty quickly to a fairly obvious error (that's why I suggested the try/except approach)? It just feels a bit like we might be trying to shoehorn the wrong concerns into a dynamically typed language. |
eurika! (maybe) We should be able to get roughly the same level of confidence with type checking via |
@pipermerriam Yes, I agree with that. Even if third party users of our libraries didn't use mypy, it would still give some guarantees about their internal correctness and the interactions between them. |
Well, currently we have type definitions for We can move this stuff to a submodule and add type definitions for anything else which is stable, share the submodule across repositories and get some stronger type safety checks across library boundaries, so even consumers of |
Which comes at the cost of requiring that any module that uses the method to use mypy to get a reasonable developer experience. For example, if a module uses Infinite recursion also happens if running with |
It still doesn't sit well with me to disable assertions for performance gains. |
The
__call__()
andnew()
methods there will check the type of their arguments usingisinstance
, and although that should be a cheap check, these are code paths where performance is paramount given the number of times they are called when we perform trie operations on a fully synced DB. For example, when doing a full sync around 10% of the time is being spent into isinstance calls. Even though they're probably not all originating from these checks, I think we should consider removing them as these two methods have type annotations that should make it clear enough what is wrong in case they crash when passed an argument of the wrong typeThe text was updated successfully, but these errors were encountered: