feat: Optionally store content marked as "base16" encoded into `bytea` columns (#287)

Motivation: more efficient `bytea` storage of data marked with `"contentEncoding": "base16"` JSON Schema

References:
- https://datatracker.ietf.org/doc/html/rfc4648#section-8
- https://json-schema.org/draft/2020-12/draft-bhutton-json-schema-validation-00#rfc.section.8.3

Bytea is at least twice as efficient as string to store hex data:

```
select
  octet_length('\x2BdfBd329984Cf0DC9027734681A16f542cF3bB4'::bytea) as "bytea",
  octet_length('0x2BdfBd329984Cf0DC9027734681A16f542cF3bB4') as "string"
;
 bytea | string
-------+--------
    20 |     42
```

It's probably a good idea to make this behaviour opt-in. I'm not sure how to implement that since most of the type detection code is inside static methods of `PostgresConnector`. I can't find a way to inject the target config without making a big change.

---------

Co-authored-by: Edgar Ramírez Mondragón <[email protected]>
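The size comparison in the commit message can be reproduced outside the database. The following is a minimal Python sketch, not code from this commit: it decodes the same 40-digit hex value into raw bytes and compares the byte count with the length of the `0x`-prefixed text form. The variable names are illustrative.

```python
# Same sample value as in the commit message, in its "0x"-prefixed text form.
hex_text = "0x2BdfBd329984Cf0DC9027734681A16f542cF3bB4"

# Drop the "0x" prefix and decode the remaining hex digits into raw bytes.
raw = bytes.fromhex(hex_text[2:])

# Size when stored as text (one byte per ASCII character) vs. as raw bytes.
text_size = len(hex_text.encode("utf-8"))
binary_size = len(raw)

print(binary_size, text_size)  # 20 42
```

Each pair of hex digits collapses into one byte, so the text form costs at least twice the binary form, plus two bytes for the prefix.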
1 parent 1f7b73d, commit 557c9da

Showing 6 changed files with 221 additions and 9 deletions.
target_postgres/tests/data_files/base16_content_encoding_interpreted.singer (8 changes: 8 additions & 0 deletions)
@@ -0,0 +1,8 @@
{"type":"SCHEMA","stream":"test_base_16_encoding_interpreted","schema":{"type":"object","properties":{"id":{"type":"string"},"contract_address":{"type":"string","contentEncoding":"base16"},"raw_event_data":{"type":["string","null"],"contentEncoding":"base16"}},"required":["id","contract_address","raw_event_data"]},"key_properties":["id"]}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"test_handle_an_hex_str","contract_address":"0xA1B2C3D4E5F607080910","raw_event_data":"0xA1B2C3D4E5F60708091001020304050607080910010203040506070809100102030405060708091001020304050607080910"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"empty_0x_str","contract_address":"0x","raw_event_data":"0x"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"empty_str","contract_address":"","raw_event_data":""},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"test_nullable_field","contract_address":"","raw_event_data":null},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"test_handle_hex_without_the_0x_prefix","contract_address":"A1B2C3D4E5F607080910","raw_event_data":"A1B2C3D4E5F6070809100102030405060"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"test_handle_odd_and_even_number_of_chars","contract_address":"0xA1","raw_event_data":"A12"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_interpreted","record":{"id":"test_handle_upper_and_lowercase_hex","contract_address":"0xa1","raw_event_data":"A12b"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
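The records in this fixture cover an optional `0x` prefix, empty strings, a null value, mixed case, and odd-length hex. Below is a hedged sketch of one way to normalise such values before writing them to a `bytea` column. The helper name `hex_to_bytes` is hypothetical, and left-padding odd-length input with a zero digit is an assumption made for illustration; the diff shown here does not confirm how the target itself treats that case.

```python
from typing import Optional


def hex_to_bytes(value: Optional[str]) -> Optional[bytes]:
    """Decode a base16 string into bytes (hypothetical helper, not from the commit)."""
    # Nullable fields pass through unchanged.
    if value is None:
        return None
    # Strip an optional "0x" / "0X" prefix.
    digits = value[2:] if value.startswith(("0x", "0X")) else value
    # Assumption: pad odd-length input with a leading zero so it decodes
    # to whole bytes. bytes.fromhex() rejects odd-length strings.
    if len(digits) % 2:
        digits = "0" + digits
    # bytes.fromhex() accepts upper- and lowercase digits and returns b"" for "".
    return bytes.fromhex(digits)


print(hex_to_bytes("0xA1B2C3D4E5F607080910"))
print(hex_to_bytes("A12"))
```

Under these assumptions both `"0x"` and `""` decode to an empty byte string, and `"A12"` decodes to the two bytes `0a 12`.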
target_postgres/tests/data_files/base16_content_encoding_not_interpreted.singer (8 changes: 8 additions & 0 deletions)
@@ -0,0 +1,8 @@
{"type":"SCHEMA","stream":"test_base_16_encoding_not_interpreted","schema":{"type":"object","properties":{"id":{"type":"string"},"contract_address":{"type":"string","contentEncoding":"base16"},"raw_event_data":{"type":["string","null"],"contentEncoding":"base16"}},"required":["id","contract_address","raw_event_data"]},"key_properties":["id"]}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"test_handle_an_hex_str","contract_address":"0xA1B2C3D4E5F607080910","raw_event_data":"0xA1B2C3D4E5F60708091001020304050607080910010203040506070809100102030405060708091001020304050607080910"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"empty_0x_str","contract_address":"0x","raw_event_data":"0x"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"empty_str","contract_address":"","raw_event_data":""},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"test_nullable_field","contract_address":"","raw_event_data":null},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"test_handle_hex_without_the_0x_prefix","contract_address":"A1B2C3D4E5F607080910","raw_event_data":"A1B2C3D4E5F6070809100102030405060"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"test_handle_odd_and_even_number_of_chars","contract_address":"0xA1","raw_event_data":"A12"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
{"type":"RECORD","stream":"test_base_16_encoding_not_interpreted","record":{"id":"test_handle_upper_and_lowercase_hex","contract_address":"0xa1","raw_event_data":"A12b"},"time_extracted":"2023-09-15T19:33:01.841018+00:00"}
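This second fixture carries the same schema and records under a different stream name, which suggests the same input is loaded once with interpretation enabled and once without. As a rough sketch of the opt-in described in the commit message, column type selection might look like the following. Both the function name `choose_column_type` and the `interpret` flag are hypothetical stand-ins, not the connector's actual API.

```python
def choose_column_type(prop_schema: dict, interpret: bool) -> str:
    """Pick a Postgres column type for one property (hypothetical sketch)."""
    # Only switch to bytea when the (hypothetical) opt-in flag is set
    # and the property is explicitly marked as base16-encoded.
    if interpret and prop_schema.get("contentEncoding") == "base16":
        return "bytea"
    # Otherwise the value stays a plain string column.
    return "text"


# Property schema taken from the fixtures' SCHEMA message.
contract_address = {"type": "string", "contentEncoding": "base16"}

print(choose_column_type(contract_address, interpret=True))   # bytea
print(choose_column_type(contract_address, interpret=False))  # text
```

Keeping the default as plain text means existing pipelines see no change unless they opt in.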