Speed up parsejson 3.25x (with a gcc-10.2 PGO build) on number heavy input #16055
Commits on Nov 19, 2020
Speed up parsejson 3.25x (with a gcc-10.2 PGO build) on number heavy input. Also add a couple convenience iterators and update `changelog.md`.
The basic idea here is to process the ASCII just once in a combined validation/classification/parse rather than validating/classifying a number in `parsejson.parseNumber`, then doing the same work again in `parseFloat`/`parseInt` and finally doing it a 3rd time in the C library `strtod`/etc. The last is especially slow on some libc's/common CPUs due to `long double`/80-bit extended precision arithmetic on each digit. In any event, the new fully functional `parseNumber` is not actually much more code than the old classifying-only `parseNumber`.
Because we aim to do this work just once and the output of the work is an int64|float, this PR adds those exported fields to `JsonParser`. (It also documents two existing but undocumented fields.)
One simple optimization done here is pow10. It uses a small lookup table for the power of 10 multiplier. The full 2048*8B = 16 KiB one is bigger than is useful. In truth most numbers in any given run will likely be of similar orders of magnitude, meaning the cache cost would not be heavy, but it is probably best not to rely only on likelihood. So fall back to a fast integer-exponentiation algorithm when out of range.
The new `parseNumber` itself is more or less a straight-line parse of scientific notation where the '.' can be anywhere in the number. To do the right power-of-10 scaling later, we need to bookkeep a little more than the old `parseNumber`, as noted in side comments in the code.
Note that calling code that always wants a `float` even if the textual representation looks like an integer can do something like `if p.tok == tkInt: float(p.i) else: p.f` or similar, depending on context.
Note that since we form the final float as integer * powerOf10, since powers of 10 have some intrinsic binary representational error, and since some alternative approaches (on just some CPUs) use 80-bit floats, it is possible for the number to mismatch those other approaches in the least significant mantissa bit (or for the ASCII->binary->ASCII round-trip to not be the identity). On those CPUs only, better-matching results could perhaps be obtained from an `emit` using `long double` if desired (also with a `long double` table for powers of 10 and the powers of 10 calculation). This strikes me as unlikely to be truly needed long-term, though.
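To make the single-pass idea concrete, here is a minimal sketch, not the PR's actual code. The names `maxExact`, `smallPow10`, `Num`, `NumKind`, `parseNum` and `asFloat` are hypothetical, the table covers only 10^0..10^22 (an assumed size), and the input is assumed to be a well-formed JSON number with at most 18 significant digits and a small exponent.

```nim
# Hypothetical sketch; names and table size are assumptions, not the PR's code.
const maxExact = 22   # 10^0 .. 10^22 are exactly representable in a float64

func buildTable(): array[maxExact + 1, float] =
  var x = 1.0
  for i in 0 .. maxExact:
    result[i] = x
    x *= 10.0

const smallPow10 = buildTable()   # evaluated at compile time

func pow10(e: int): float =
  ## 10^e: table lookup for small |e|, exponentiation by squaring otherwise.
  let n = abs(e)
  var p = 1.0
  if n <= maxExact:
    p = smallPow10[n]
  else:
    var base = 10.0
    var k = n
    while k > 0:                  # O(log n) multiplications
      if (k and 1) == 1: p *= base
      base *= base
      k = k shr 1
  if e < 0: 1.0 / p else: p

type
  NumKind = enum nkInt, nkFloat   # stand-ins for tkInt / tkFloat
  Num = object
    kind: NumKind
    i: int64
    f: float

func parseNum(s: string): Num =
  ## One pass over the ASCII: accumulate an integer mantissa, count the
  ## digits after '.', read the exponent, then scale exactly once.
  var pos = 0
  var neg = false
  if pos < s.len and s[pos] == '-':
    neg = true
    inc pos
  var mant: int64 = 0
  var fracDigits = 0
  var seenDot = false
  var isFloat = false
  while pos < s.len:
    let c = s[pos]
    if c in {'0'..'9'}:
      mant = mant * 10 + int64(ord(c) - ord('0'))
      if seenDot: inc fracDigits  # bookkeeping for the later scaling
    elif c == '.' and not seenDot:
      seenDot = true
      isFloat = true
    else:
      break
    inc pos
  var exp10 = 0
  if pos < s.len and s[pos] in {'e', 'E'}:
    isFloat = true
    inc pos
    var eNeg = false
    if pos < s.len and s[pos] in {'+', '-'}:
      eNeg = s[pos] == '-'
      inc pos
    while pos < s.len and s[pos] in {'0'..'9'}:
      exp10 = exp10 * 10 + (ord(s[pos]) - ord('0'))
      inc pos
    if eNeg: exp10 = -exp10
  if isFloat:
    var f = float(mant) * pow10(exp10 - fracDigits)
    if neg: f = -f
    result = Num(kind: nkFloat, f: f)
  else:
    result = Num(kind: nkInt, i: (if neg: -mant else: mant))

func asFloat(n: Num): float =
  ## Coerce to float even when the text looked like an integer.
  if n.kind == nkInt: float(n.i) else: n.f
```

With this sketch, `parseNum("1.5e3")` accumulates the mantissa 15 with one fractional digit and scales by `pow10(3 - 1)`, while `asFloat(parseNum("-42"))` mirrors the `tkInt`-to-float coercion mentioned above. Because negative exponents multiply by `1.0 / p`, the last-bit mismatches described in the note above can occur here too.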
Commit 0e24c5e
Commits on Nov 20, 2020
Make the default mode quite a bit faster with setLen+copyMem instead of slice assignment.
Within 1.03x of non-copy mode in a PGO build (1.10x if the caller must save the last valid string). One might think that this argues for ditching `strFloats` & `strIntegers` entirely and just always copying. In some sense, it does. However, the non-copy mode is a useful common case here, as shown in `jsonTokens(string)` example code. So, it is a bit of a judgement call whether supporting that last 10% of perf in a useful common case matters.
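The difference between the two copying styles can be sketched as follows. This is an illustrative helper only: `copyToken` and its parameters are hypothetical names, and it assumes a native (C) backend, since `copyMem` is not available everywhere.

```nim
# Hypothetical helper; not taken from the PR.
proc copyToken(dst: var string; buf: string; start, len: int) =
  ## Copy buf[start ..< start+len] into dst, reusing dst's existing
  ## allocation where possible instead of building a temporary string.
  dst.setLen(len)
  if len > 0:
    copyMem(addr dst[0], unsafeAddr buf[start], len)

var tok = newStringOfCap(32)
copyToken(tok, "[12.5,3]", 1, 4)
doAssert tok == "12.5"
# The slice-assignment equivalent allocates a fresh string each time:
#   tok = buf[start ..< start + len]
```

The `setLen` call keeps the capacity of `tok` across tokens, which is the likely reason the copy mode lands close to the non-copy mode.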
Commit 6cb3375
Commit 46d5f21
Gack. I guess one needs defined(nimscript) not nimvm.
(Someone should make `copyMem` just work everywhere, IMO.)
Commit ad4707c
Ok. This jsOrVmBlock works for `streams` so it should work here. Also, extract mess into a template.
Commit b802eba
Commit 0e66b44
Commit 05a7350
Commit 136fe92
Commit 7fd9287
Commit 5b98112
Fix integer overflow errors. 18 digits is still plenty (IEEE 64-bit
float has only 15.95 decimal digits).
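One way to read the 18-digit limit: every 18-digit decimal integer fits in an `int64` (whose maximum has 19 digits), so capping the accumulated significant digits at 18 rules out overflow. The sketch below illustrates that idea only. `maxSigDigits` and `accumDigits` are hypothetical names, and the PR's real bookkeeping may differ.

```nim
# Hypothetical sketch of digit-count-based overflow avoidance.
const maxSigDigits = 18   # any 18-digit integer fits in int64

func accumDigits(s: string): tuple[mant: int64, dropped: int] =
  ## Accumulate at most `maxSigDigits` significant digits into `mant`.
  ## Further digits are only counted, so a caller could add `dropped`
  ## to the decimal exponent instead of overflowing the mantissa.
  var used = 0
  for c in s:
    if c notin {'0'..'9'}: break
    if used < maxSigDigits:
      if used > 0 or c != '0':    # leading zeros are not significant
        result.mant = result.mant * 10 + int64(ord(c) - ord('0'))
        inc used
    else:
      inc result.dropped

doAssert accumDigits("12345678901234567890") == (123456789012345678'i64, 2)
```

Dropping digits past the 18th loses nothing a 64-bit float could represent, but it is not exact for integers, so a digit count is only a coarse overflow check.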
Commit 61f39cd
Commit 822d91c
This may pass all tests, but it is preliminary to see if, in fact, this is the only remaining problem.
Before merging we should make overflow detection more precise than just a digit count for both `tkInt` and `tkFloat`.
Commit 88d9845
Commits on Nov 23, 2020
Commit 7a076c2
Commit 24192b5
Commit e8c234f