Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NpgSqlBatch and Postgres pipeline mode #5231

Closed
hrrrrustic opened this issue Aug 26, 2023 · 10 comments
Closed

NpgSqlBatch and Postgres pipeline mode #5231

hrrrrustic opened this issue Aug 26, 2023 · 10 comments

Comments

@hrrrrustic
Copy link

hrrrrustic commented Aug 26, 2023

There is a page https://www.npgsql.org/doc/performance.html with Batching/Pipelining section that describes NpgSqlBatch usage. I assumed that Pipelining means support pipeline mode, but looks like it doesn't support it.

I have a code like this:

await using var connection = new NpgsqlConnection(connectionString);

await using var batch = connection.CreateBatch();

var command1 = new NpgsqlBatchCommand("SELECT $1");
command1.Parameters.Add(new NpgsqlParameter<int>() {TypedValue = 1});
var command2 = new NpgsqlBatchCommand("SELECT $1");
command2.Parameters.Add(new NpgsqlParameter<int>() {TypedValue = 2});
    
batch.BatchCommands.Add(command1);
batch.BatchCommands.Add(command2);

await connection.OpenAsync();
await using var reader = await batch.ExecuteReaderAsync();
// The result doesn't matter here

This code works totally fine with

  • Direct connection to PostgreSql,
  • Through PgBouncer
  • Through Odyssey with reserve_prepared_statements=false

and looks in WireShark like this
image

but throws exception with Odyssey and reserve_prepared_statements=true:
Internal server error: Received backend message ParseComplete while expecting BindCompleteMessage. Please file a bug.

WireShark again:
(Don't pay attention to malformed packet, I opened dump on another machine with different decode settings I guess. It was fine on original machine with PostgreSql)
image

Looks like this is pretty similar problem to yandex/odyssey#486, so we just returned to reserve_prepared_statements=false for now. That's OK, but a bit sad because we also have Go services with pgx v5 that supports pipeline.

So, the questions are:

  • Does NpgSql support pipelining, rather than batching?
  • If yes, what's wrong with my code? 😄 Am I missing some feature flag/property or config?
  • If no, what exactly Pipelining means in the performance page? Is there any plans to support pipelining?

UPD:
We use .NET 7, NpgSql 7.0.4 and PostgreSql 14.7

@roji
Copy link
Member

roji commented Aug 27, 2023

@hrrrrustic the above looks like a bug in Odyssey rather than any sort of problem in Npgsql; this seems to be confirmed in yandex/odyssey#486 ("v4 doesn't support pipeline mode").

Does NpgSql support pipelining, rather than batching?

That depends on exactly what you mean when you use these words... At the wire level, in both cases the client sends multiple SQL statements in the same roundtrip, without waiting for the results of earlier statements to arrive before sending later statements. As you've seen in Wireshark, when you batch with Npgsql (and most other drivers), you get a chain of P1/B1/D1/E1/P2/B2/D2/E2/S.

Pipelining sometimes means inserting a sync in between the statements (P1/B1/D1/E1/S1/P2/B2/D2/E2/S2), so that they're kept separate in terms of transactionality and error handling. This can be done with Npgsql but setting NpgsqlCommand.EnableErrorBarriers to true. But I'm not sure this difference matters to Odyssey.

In any case, I'm not sure what exactly is supported and what isn't by Odyssey - and in any case, Npgsql is only using standard PostgreSQL techniques here.

@vonzshik
Copy link
Contributor

Does NpgSql support pipelining, rather than batching?

Npgsql has multiplexing mode which should work the same way as libpq's pipeline mode, though it's still an experimental feature (which is the main reason for it no being documented).

@roji
Copy link
Member

roji commented Aug 27, 2023

Thanks for the added note @vonzshik. Just to be extra clear, from the protocol/server perspective multiplexing doesn't do anything different than a regular batch with EnableErrorBarriers - it's just a chain of statements with Syncs inserted between them.

@hrrrrustic
Copy link
Author

hrrrrustic commented Aug 27, 2023

this seems to be confirmed in yandex/odyssey#486 ("v4 doesn't support pipeline mode").

v4 references to pgx version, not Odyssey. Pgx v5 understand and successfully parse response from the second WireShark picture. That's the point why I created issue here before going to Odyssey repo. Perhaps this response is not valid by specification, I'm just trying to understand why Go driver parsed this and NpgSql not :)

Go code that we tested:

package main

import (
 "context"
 "log"
 "os"

 "github.com/jackc/pgx/v5"
 "github.com/jackc/pgx/v5/pgconn"
)

func main() {
 // os.Args[1] can be replaced with "user=postgres password=postgres host= port= dbname=postgres sslmode=disable"
 conn, err := pgx.Connect(context.Background(), os.Args[1])
 if err != nil {
  panic(err)
 }
 ctx := context.Background()
 pp := conn.PgConn().StartPipeline(ctx)
 eqb := pgx.ExtendedQueryBuilder{}
 if err := eqb.Build(conn.TypeMap(), nil, []any{1}); err != nil {
  panic(err)
 }
 pp.SendQueryParams(select $1::bigint, eqb.ParamValues, nil, eqb.ParamFormats, eqb.ResultFormats)
 if err := eqb.Build(conn.TypeMap(), nil, []any{1}); err != nil {
  panic(err)
 }
 pp.SendQueryParams(select $1::bigint, eqb.ParamValues, nil, eqb.ParamFormats, eqb.ResultFormats)
 if err := pp.Sync(); err != nil {
  panic(err)
 }
 results, errGR := pp.GetResults()
 if errGR != nil {
  panic(errGR)
 }
 rr := results.(*pgconn.ResultReader)
 rows := pgx.RowsFromResultReader(conn.TypeMap(), rr)
 var n int64
 for rows.Next() {
  if err := rows.Scan(&n); err != nil {
   panic(err)
  }
 }
 _ = pp.Close()
}

I'll try to use EnableErrorBarriers add see anything changes, thanks

@roji
Copy link
Member

roji commented Aug 27, 2023

Ah sorry, seems like I misread your issue - if I understand correctly, you're saying that when you use Npsgql with batching with Odyssey and reserve_prepared_statements=true, you get the internal Npgsql error, right? Can you attach a full wireshark dump of this exchange, without malformed packets? Assuming Odyssey does something to the packets when reserve_prepared_statements is true, that would allow us to understand what's going on.

@hrrrrustic
Copy link
Author

hrrrrustic commented Aug 27, 2023

if I understand correctly, you're saying that when you use Npsgql with batching with Odyssey and reserve_prepared_statements=true, you get the internal Npgsql error, right?

Exactly. The error appeared only for batch queries, simple NpgSqlCommand for both read and write queries worked fine at the same time. We also didn't use .Prepare() explicitly or change any related parameters (e.g. Max Auto Prepare) in the app.

Can you attach a full wireshark dump of this exchange, without malformed packets?

Can I send it to email mentioned on your GitHub profile? We dumped it on our staging environment, so it contains IPs and some service information.

@hrrrrustic
Copy link
Author

@roji Did you have a chance to look at the dump? I sent it a few days ago to your email (didn't receive your answer here)

@roji
Copy link
Member

roji commented Sep 1, 2023

Unfortunately not - there's too much stuff going on at the moment. You'll have to be patient for a little longer.

@hrrrrustic
Copy link
Author

That's ok 👍
Just wanted to make sure you received the dump :)

@NinoFloris
Copy link
Member

NinoFloris commented Mar 12, 2024

There are quite some open issues around this feature of oddysey
https://github.com/yandex/odyssey/issues?q=is%3Aissue+is%3Aopen+pool_reserve_prepared_statement+

This particular response looks really broken as oddysey decides to interleave part of its replies.
263498371-12f06b74-1cd1-48fc-86d7-4fb1e4cb96f9

Npgsql sends P/B/D/E twice and instead of returning its replies in sequence 1/2/T/D{0,n}/C (x2) oddysey does something extremely out of spec. It front loads all the parse complete and bind complete messages (which is why you see 1/1 2/2 in the dump) before returning to send each row description, data, command complete in the original sequence again.

As to why this works in go, pgx does something fairly dangerous https://github.com/jackc/pgx/blob/master/pgconn/pgconn.go#L2161 which is why it works there. When you do getResults() pgx just keeps looping until it reads a message type that breaks its loop. Which in your case will be after all the ParseComplete and BindComplete messages have been consumed. It will then read the first RowDesription and return a row reader based on that...

Anyway, I'm going to close this as this is 100% something you should take up with oddysey. While you're at it you might want to attend the pgx folks to the fact they're doing some very loosey-goosey interpretation of the protocol stream. Whether or not that's exploitable is another matter but I wouldn't be surprised.

@NinoFloris NinoFloris closed this as not planned Won't fix, can't repro, duplicate, stale Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants