Optimize internally tagged enums -- do not use internal buffer if tag is the first field #1922

Status: Open. Mingun wants to merge 35 commits into base master from the optimize-internal-tagged-enums branch.

Conversation

@Mingun Mingun commented Nov 3, 2020

Fixes #1495

This change also has a positive side effect: if the tag is the first field, the negative effects from #1183 are eliminated, because buffering is not used.
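
For illustration, a minimal sketch of the case this targets (the Message enum and the JSON input are made up for the example; serde and serde_json are assumed as dependencies):

```rust
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "tag")]
enum Message {
    Request { id: u32 },
    Response { id: u32, ok: bool },
}

fn main() {
    // The tag is the first key of the map, so the generated Deserialize impl
    // can in principle select the variant immediately and deserialize the
    // remaining fields directly, instead of buffering the whole map into an
    // intermediate Content value first.
    let json = r#"{"tag": "Response", "id": 7, "ok": true}"#;
    let msg: Message = serde_json::from_str(json).unwrap();
    println!("{:?}", msg);
}
```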

@RReverser, feel free to run your benchmarks against this branch.

@Mingun Mingun force-pushed the optimize-internal-tagged-enums branch from 84c311d to 94e15ef Compare November 3, 2020 20:06
@Mingun Mingun changed the title Optimize internal tagged enums -- do not use internal buffer if tag is the first field Optimize internally tagged enums -- do not use internal buffer if tag is the first field Nov 3, 2020
@Mingun Mingun force-pushed the optimize-internal-tagged-enums branch from 94e15ef to 5106111 Compare February 23, 2021 18:34
@dtolnay dtolnay (Member) left a comment

The discussion in #1495 focuses on whether this would be worth it for the cost in compile time. The biggest problem with internally tagged enums is not that deserialization is slow, but that they take significantly long to compile. Overall I would rather optimize for lowering their compile time, not for performance or features at the expense of compile time.

Would you be able to provide some measurements showing the impact of this change on the time to compile internally tagged enums?

Mingun commented Feb 28, 2021

I cannot agree with the premise of that question. For me, runtime performance is a much more important thing than compile-time performance. Programs are written not for the pleasure of developers, but to solve customer problems.

Would you be able to provide some measurements showing the impact of this change on the time to compile internally tagged enums?

I will try to learn how to do such measurements, but any guidance is welcome.

@RReverser

Overall I would rather optimize for lowering their compile time, not on performance or features at the expense of compile time.

That sounds odd tbh. Compile times affect only developers, while runtime affects every user of the library or application, which is far more impactful. Why choose compile-time over runtime performance here when we don't do that at any other level of development (e.g. opt-level = 0 vs opt-level = 2)?

Mingun commented Mar 6, 2021

I've done some research and here are the results. I created a library project with 1000 types and measured the compilation time.

I noticed a small increase in compilation time, about 0.01 sec per type (or 7-30%). I think that is an acceptable price for better runtime performance.

Test code and raw data

serde-perf.zip

I created a library Cargo project with the following lib.rs content:

use serde::Deserialize;
macro_rules! generate {
  ($(#[$counter:meta])*) => {
    $(
      const _: () = {
        #[$counter]
        #[derive(Deserialize)]
        #[serde(tag = "tag")]
        enum Node {
          Unit,
          Struct {
            name: String,
            list: Vec<Node>,
          },
          // Uncomment for "big enum" tests
          /*
          Newtype1(std::collections::HashMap<String, String>),
          Newtype2(String),
          Newtype3(u32),
          Newtype4(f32),
          Unit1,
          Unit2,
          Unit3,
          Unit4,
          Struct1 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct2 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct3 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct4 { f1: String, f2: u32, f3: bool, f4: f64 },// */
        }
      };
    )*
  };
}
// Expanded manually for "expand" tests
generate!(
  /// ...
  /// 1000 lines
  /// ...
);

Tests were run with the command

cargo +nightly build -Ztimings

Test PC

OS version: Windows_NT x64 10.0.18363
CPUs: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (8 x 1992)

Summary tables (times in seconds, each value averaged over three runs)

Small enum (2 variants), 1000 types

Derived both Serialize and Deserialize, types generated with generate! macro

| Deserialize + Serialize | master (c261015) (100%) | PR (a169bee) | Diff |
| --- | --- | --- | --- |
| all | 47.37 = (46.96 + 47.10 + 48.06)/3 | 58.67 = (59.27 + 59.17 + 57.56)/3 | +11.29 (+29%) |
| codegen | 3.28 = (46.96-43.95 + 47.10-43.92 + 48.06-44.42)/3 | 4.81 = (59.27-53.93 + 59.17-53.85 + 57.56-53.80)/3 | +1.53 (+10%) |

Derived only Deserialize, types generated with generate! macro

| Deserialize | master (c261015) (100%) | PR (a169bee) | Diff |
| --- | --- | --- | --- |
| all | 39.25 = (39.16 + 39.66 + 38.94)/3 | 50.08 = (49.70 + 51.62 + 48.93)/3 | +10.83 (+28%) |
| codegen | 3.62 = (39.16-35.31 + 39.66-35.28 + 38.94-36.31)/3 | 4.53 = (49.70-46.17 + 51.62-45.29 + 48.93-45.21)/3 | +0.91 (+25%) |

Derived only Deserialize, types written manually

| Deserialize + expanded | master (c261015) (100%) | PR (a169bee) | Diff |
| --- | --- | --- | --- |
| all | 38.76 = (38.38 + 39.67 + 38.23)/3 | 50.36 = (50.34 + 50.59 + 50.14)/3 | +11.60 (+30%) |
| codegen | 3.24 = (38.38-35.41 + 39.67-35.65 + 38.23-35.50)/3 | 5.67 = (50.34-44.34 + 50.59-44.74 + 50.14-45.02)/3 | +2.42 (+75%) |

Big enum (14 variants), 1000 types

Derived only Deserialize, types generated with generate! macro

| Deserialize | master (c261015) (100%) | PR (a169bee) | Diff |
| --- | --- | --- | --- |
| all | 241.04 = (236.94 + 239.35 + 246.84)/3 | 257.01 = (257.75 + 258.76 + 254.70)/3 | +16.03 (+7%) |
| codegen | 40.90 = (236.94-197.11 + 239.35-199.61 + 246.84-203.70)/3 | 46.63 = (257.75-211.25 + 258.76-210.17 + 254.70-209.91)/3 | +5.72 (+14%) |

Derived only Deserialize, types written manually

| Deserialize + expanded | master (c261015) (100%) | PR (a169bee) | Diff |
| --- | --- | --- | --- |
| all | 238.41 = (238.49 + 236.58 + 240.16)/3 | 254.86 = (258.93 + 254.75 + 250.90)/3 | +16.45 (+7%) |
| codegen | 39.70 = (238.49-198.46 + 236.58-199.66 + 240.16-198.02)/3 | 45.05 = (258.93-209.91 + 254.75-211.19 + 250.90-208.34)/3 | +5.35 (+13%) |

@Mingun Mingun requested a review from dtolnay March 6, 2021 17:10
@pickfire

@Mingun What about runtime performance improvements?

Mingun commented Mar 26, 2021

I didn't measure it, maybe I should

@Mingun Mingun force-pushed the optimize-internal-tagged-enums branch from a169bee to da641b1 Compare March 10, 2022 16:33
@jarredholman

Would this also fix the incorrect error messages caused by the internal buffer? #1621

Mingun commented Jul 26, 2022

For the optimized case, yes, it should.

@Mingun Mingun left a comment

@dtolnay, @oli-obk, this PR is ready for review again.

Comment on lines 1083 to 1109
fn deserialize_unit<V>(self, visitor: V) -> Result<V::Value, Self::Error>
where
    V: de::Visitor<'de>,
{
    // Covered by tests/test_enum_internally_tagged.rs
    //     newtype_unit
    visitor.visit_unit()
}

fn deserialize_unit_struct<V>(
    self,
    _name: &'static str,
    visitor: V,
) -> Result<V::Value, Self::Error>
where
    V: de::Visitor<'de>,
{
    // Covered by tests/test_enum_internally_tagged.rs
    //     newtype_unit_struct
    self.deserialize_unit(visitor)
}

fn deserialize_newtype_struct<V>(self, _name: &str, visitor: V) -> Result<V::Value, Self::Error>
where
    V: de::Visitor<'de>,
{
    visitor.visit_newtype_struct(self)
}

@Mingun

I'm not sure whether we should change the behavior of SeqAccessDeserializer and MapAccessDeserializer or introduce new private deserializers. On the one hand, those deserializers were created to support various serde attributes. On the other hand, technically this is a breaking change because those types are public.
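
For readers unfamiliar with these types: MapAccessDeserializer (and its SeqAccessDeserializer counterpart) is a public adapter in serde::de::value that turns a MapAccess into a full Deserializer, which is why changing its behavior is a compatibility concern. A minimal sketch of typical usage, with an illustrative Inner type:

```rust
use std::fmt;

use serde::de::value::MapAccessDeserializer;
use serde::de::{MapAccess, Visitor};
use serde::Deserialize;

#[derive(Deserialize)]
struct Inner {
    x: u32,
}

struct InnerVisitor;

impl<'de> Visitor<'de> for InnerVisitor {
    type Value = Inner;

    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str("a map for Inner")
    }

    fn visit_map<A>(self, map: A) -> Result<Inner, A::Error>
    where
        A: MapAccess<'de>,
    {
        // Hand the remaining map entries to Inner's own Deserialize impl.
        Inner::deserialize(MapAccessDeserializer::new(map))
    }
}
```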

@Mingun Mingun force-pushed the optimize-internal-tagged-enums branch from c22590d to 8cd44cf Compare August 25, 2024 16:49
Mingun commented Aug 25, 2024

I realized that the old Visitor::visit helper method was actually a handwritten DeserializeSeed implementation. So the derive now generates a DeserializeSeed, and all the optimization logic now lives in normal code.
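
A simplified sketch of the DeserializeSeed idea (this is not the actual generated code; Node and its variants are invented for the example): the seed carries the tag that was already read and uses it to deserialize the variant content directly, without buffering:

```rust
use serde::de::{DeserializeSeed, Deserializer, Error, IgnoredAny};
use serde::Deserialize;

enum Node {
    Unit,
    Num(u32),
}

struct NodeSeed<'a> {
    // The variant name that was already read from the input.
    tag: &'a str,
}

impl<'de, 'a> DeserializeSeed<'de> for NodeSeed<'a> {
    type Value = Node;

    fn deserialize<D>(self, deserializer: D) -> Result<Node, D::Error>
    where
        D: Deserializer<'de>,
    {
        match self.tag {
            // A unit variant carries no data: the rest of the input is only
            // checked/ignored.
            "Unit" => {
                IgnoredAny::deserialize(deserializer)?;
                Ok(Node::Unit)
            }
            // A data-carrying variant deserializes its content directly.
            "Num" => Ok(Node::Num(u32::deserialize(deserializer)?)),
            other => Err(D::Error::unknown_variant(other, &["Unit", "Num"])),
        }
    }
}
```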

Mingun commented Aug 30, 2024

As I remember, the changes in this PR depend on the changes in #2445.

As I already said, if you prefer, it is possible to leave MapAccessDeserializer and SeqAccessDeserializer untouched and instead introduce new private deserializers.

@dtolnay, @oli-obk, please give your opinion: should I make these changes?

oli-obk commented Sep 3, 2024

I didn't measure it, maybe I should

Did this happen?

Mingun commented Sep 3, 2024

No. Any suggestions for the benchmark are welcome.

@RReverser

No. Any suggestions for the benchmark are welcome.

It's pretty ancient by now, but in the original issue I referenced binast/binjs-ref@22103b9 where I did a manual implementation of this optimisation just for the types we used.

For testing, I checked out a commit right before that: binast/binjs-ref@53bd87a

The numbers before this PR:

test bench_parsing_reuse_parser       ... bench:  88,088,720 ns/iter (+/- 12,560,451)

The numbers with this PR applied:

test bench_parsing_reuse_parser       ... bench:  66,515,390 ns/iter (+/- 7,634,431)

Note that this is far from a pure JSON benchmark - it uses an external Node.js process to parse JS and produce JSON, and only then parses the output using serde-json, but in that context the -25% perf improvement is even more impressive.

It should be easy to save the JSON output and do a pure serde_json benchmark instead (in my original commit I noted that it showed a 2x improvement, which seems realistic), but perhaps someone has more modern examples.

Anything touching JS AST represented as JSON (e.g. Deserialize for ESTree Program from https://swc.rs/) should work.
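
If it helps, here is a minimal standalone timing sketch along those lines (the Expr enum and input are purely illustrative, and serde/serde_json are assumed as dependencies; criterion would give more reliable numbers):

```rust
use std::time::Instant;

use serde::Deserialize;

#[derive(Deserialize)]
#[serde(tag = "type")]
enum Expr {
    Literal { value: f64 },
    Binary { op: String, lhs: Box<Expr>, rhs: Box<Expr> },
}

fn main() {
    // A tiny JSON "AST" where the tag is always the first key of each map.
    let json = r#"{"type":"Binary","op":"+",
        "lhs":{"type":"Literal","value":1.0},
        "rhs":{"type":"Literal","value":2.0}}"#;

    let start = Instant::now();
    for _ in 0..1_000_000 {
        let expr: Expr = serde_json::from_str(json).unwrap();
        std::hint::black_box(&expr);
    }
    println!("1M parses took {:?}", start.elapsed());
}
```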

@oli-obk oli-obk self-assigned this Sep 3, 2024
@oli-obk oli-obk self-requested a review September 3, 2024 15:11
where
    S: SeqAccess<'de>,
{
-   Ok(())
+   match tri!(seq.next_element()) {
@RReverser RReverser commented Sep 3, 2024

This behaves quite differently from IgnoredAny.visit_map. I think the behaviour should be consistent, as in, iterate over the entire sequence and ignore its values instead of erroring out on a non-empty sequence.

@Mingun

I tried that initially, but it failed other tests, and in general it is not what you want. A unit / unit struct is represented in a sequence as nothing, so we need to ensure that the sequence is empty. This is consistent with the normal behavior where struct deserialization from a sequence expects an exact number of values, and with the fact that a flattened unit / unit struct is treated as a struct without fields.
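
In other words (an assumed sketch of the check, not the exact code in this PR): the visitor succeeds only if the sequence yields no further elements, which is what produces the "invalid length 1, expected 0 elements in sequence" error:

```rust
use serde::de::{Error, IgnoredAny, SeqAccess};

// Succeeds only for an empty sequence; any remaining element makes it fail
// with an "invalid length" error.
fn expect_empty_seq<'de, A>(mut seq: A) -> Result<(), A::Error>
where
    A: SeqAccess<'de>,
{
    match seq.next_element::<IgnoredAny>()? {
        None => Ok(()),
        Some(_) => Err(A::Error::invalid_length(1, &"0 elements in sequence")),
    }
}
```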

@Mingun

Actually, unit_variant_with_unknown_fields is the test that fails if we consume the whole sequence here.

// Unknown elements are not allowed in sequences
assert_de_tokens_error::<InternallyTagged>(
    &[
        Token::Seq { len: None },
        Token::Str("Unit"), // tag
        Token::I32(0),
        Token::SeqEnd,
    ],
    "invalid length 1, expected 0 elements in sequence",
);

@RReverser

The unit / unit struct represented in sequence as nothing

Hm but "nothing" should be pretty different conceptually from "ignored any". I'd expect a custom check just for the nothing case, whereas ignored any should be able to consume anything thrown at it silently.

@Mingun

This code tries to read something, no matter what it is. We expect an empty sequence, so if it contains any element, we fail.

@RReverser

Nevermind, I'm sleepy - I thought you were changing how IgnoredAny works everywhere. I've expanded the context of the diff and I see this is a change to this one specific visitor.

Please disregard my original comment 🤦‍♂️

Although I now wonder if visit_map should be changed to check length as well.

@Mingun

By default, maps in serde allow unknown keys, and when a unit is flattened, all keys become unknown. But you're right -- in the case of #[serde(deny_unknown_fields)] we should return an error if the map is not empty. That's an idea for another PR!
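
For reference, a tiny sketch of the opt-in attribute being referred to (the Strict type is just an illustration):

```rust
use serde::Deserialize;

// With deny_unknown_fields, deserialization fails if the input map contains
// any key that does not correspond to a declared field.
#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Strict {
    tag: String,
}
```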

Mingun commented Sep 15, 2024

@oli-obk, how is your review progressing?

Mingun and others added 29 commits October 22, 2024 01:28
…sibility of the Visitor

Examples of errors produced during deserialization of internally tagged enums in tests
if a Str("unexpected string") is provided instead of a Seq/Map:

In tests/test_annotations.rs
  flatten::enum_::internally_tagged::tuple:
    before: `invalid type: string "unexpected string", expected tuple variant`
    after : `invalid type: string "unexpected string", expected tuple variant Enum::Tuple`

  flatten::enum_::internally_tagged::struct_from_map:
    before: `invalid type: string "unexpected string", expected struct variant`
    after : `invalid type: string "unexpected string", expected struct variant Enum::Struct`
Deserializer methods are only hints, which the deserializer is not obliged to follow.
Both TaggedContentVisitor and InternallyTaggedUnitVisitor accept only
visit_map and visit_seq, and that is what the derived implementation of Deserialize
does for structs. Therefore it is fine to call deserialize_map here, as the
derived Deserialize implementation already does.
…ms in non self-describing formats

The Visitor that is passed to deserialize_any supports only the visit_map method,
so we can always request deserialize_map
(review this commit with "ignore whitespace changes" option on)
…ms if tag is the first field

failures (2):
    newtype_unit
    newtype_unit_struct
Fixes (2):
    newtype_unit
    newtype_unit_struct
…quence

failures (3):
    newtype_unit
    newtype_unit_struct
    unit_variant_with_unknown_fields
failures (1):
    unit_variant_with_unknown_fields

Fixed (2):
    newtype_unit
    newtype_unit_struct
Otherwise the following tests will fail:
- test_internally_tagged_newtype_variant_containing_unit_struct
- test_internally_tagged_struct_variant_containing_unit_variant

This reverts commit 1986c17.

failures (2):
    newtype_variant_containing_unit
    struct_variant_containing_unit_variant
@Mingun Mingun force-pushed the optimize-internal-tagged-enums branch from 8cd44cf to 132dc81 Compare October 21, 2024 20:31
Successfully merging this pull request may close these issues.

Internally-tagged enum representation could be more efficient