Evaluate removing the "create distributed function" section from the quick start guide #1033

Open
ozgune opened this issue Mar 11, 2022 · 2 comments

Comments

@ozgune

ozgune commented Mar 11, 2022

Why are we implementing it? (sales eng)

What are the typical use cases?

Communication goals (e.g. detailed howto vs orientation)

Our Quick Start guide is an opportunity to introduce simple concepts to our users.

https://docs.citusdata.com/en/v10.2/get_started/tutorial_multi_tenant.html

In the multi-tenant quick start guide, we introduce the following concept. I feel that the notion of additional roundtrips, creating a new UDF, and then declaring the use of the UDF as a distributed function goes beyond a quick start.

Could we evaluate removing the following section from our Quick Start Guide?

I'm asking because I haven't used create_distributed_function() in this way before. Although I'm not a power user, I also feel that this goes beyond what's needed to get started on Citus.

"Each statement in a transaction causes roundtrips between the coordinator and workers in multi-node Citus. For multi-tenant workloads, it’s more efficient to run transactions in distributed functions. The efficiency gains become more apparent for larger transactions, but we can use the small transaction above as an example.

First create a function that does the deletions:

CREATE OR REPLACE FUNCTION
  delete_campaign(company_id int, campaign_id int)
RETURNS void LANGUAGE plpgsql AS $fn$
BEGIN
  DELETE FROM campaigns
   WHERE id = $2 AND campaigns.company_id = $1;
  DELETE FROM ads
   WHERE ads.campaign_id = $2 AND ads.company_id = $1;
END;
$fn$;

Next, use create_distributed_function to instruct Citus to run the function directly on workers rather than on the coordinator (except on a single-node Citus installation, which runs everything on the coordinator). It will run the function on whichever worker holds the shards of tables ads and campaigns that correspond to the value of company_id.

SELECT create_distributed_function(
  'delete_campaign(int, int)', 'company_id',
  colocate_with := 'campaigns'
);

-- you can run the function as usual
SELECT delete_campaign(5, 46);"
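
For reference, the "small transaction above" that the quoted passage mentions is not reproduced in this issue. Judging from the body of delete_campaign, it presumably looks something like the sketch below. The ids 5 and 46 are borrowed from the example call and are only illustrative.

```sql
-- Sketch of the plain multi-statement transaction that the function replaces.
-- Reconstructed from the body of delete_campaign; per the quoted passage,
-- each statement is a separate roundtrip between the coordinator and a
-- worker on a multi-node cluster.
BEGIN;
DELETE FROM campaigns WHERE id = 46 AND company_id = 5;
DELETE FROM ads WHERE campaign_id = 46 AND company_id = 5;
COMMIT;
```

Wrapping these statements in a function is what lets Citus hand the whole unit of work to a single worker, which is the concept the section asks a first-time reader to absorb.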

Good locations for content in docs structure

How does this work? (devs)

Example sql

Corner cases, gotchas

Are there relevant blog posts or outside documentation about the concept/feature?

Link to relevant commits and regression tests if applicable

@ozgune ozgune changed the title from 'Evaluate "creating a distributed function" in the multi-tenant quick start guide' to 'Evaluate removing the "create distributed function" section from the quick start guide' on Mar 11, 2022
@onderkalaci
Member

related to #1024.

"Distributed functions" is an advanced topic, so it makes sense not to have it on the quick start.

Users typically create a distributed function and expect the function to speed up (expecting behavior similar to create_distributed_table). In reality, the schema and functions must be set up properly to benefit from distributed functions. As a result, users are confused by the concept of distributed functions.
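
As a rough illustration of what "properly set up" could mean for the quick start example, the sketch below assumes the tutorial's campaigns and ads tables are both distributed on company_id, which is the column the function's distribution argument refers to. This is an assumption about the setup, not a complete list of requirements.

```sql
-- Assumed prerequisite: both tables touched by delete_campaign are
-- distributed on the same column, so rows for one company_id are
-- co-located on the same worker.
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');

-- Only with that layout does it help to name company_id as the
-- distribution argument and co-locate the function with campaigns,
-- as in the create_distributed_function call quoted in the issue.
```

If the function touched tables distributed on a different column, or filtered on something other than its distribution argument, declaring it as a distributed function would not be expected to bring the same benefit.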

In fact, Marco thinks we could rename create_distributed_function to something more explicit like delegate_procedure_to_nodes or such.

@jonels-msft
Member

"Distributed functions" is an advanced topic, so it makes sense not to have it on the quick start.

IIRC there was a push to advertise that feature when it was released, but I agree that it's a distraction in an early tutorial.

In fact, Marco thinks we could rename create_distributed_function to something more explicit like delegate_procedure_to_nodes or such.

Sounds like a good idea. "Distributed function" suggests a false analogy with "distributed table."


3 participants