Monoid aggregations #48

Open
wants to merge 3 commits into
base: master

bachdavi
Collaborator

This is not at all done.

Using DiffVector we can aggregate SUM and COUNT in a more efficient, "parallel" way.

The idea is to port all of them.

What does not work yet is the correct rearrangement of input and output variables.

We can use Differential's monoids to track different aggregates in the Diff.

We explode() the value vector into a DiffVector and maintain the monoid corresponding to the given aggregation. Using count, we then resurface the aggregated values into the data part.
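To make the idea above concrete, here is a minimal std-only sketch. It does not use the Differential Dataflow API: `SumCount`, `explode` and `aggregate` are hypothetical stand-ins that only model how a value can move into a vector-valued diff, get accumulated component-wise, and be resurfaced as data afterwards.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a vector-valued diff; NOT the real `DiffVector` API.
/// Component 0 accumulates SUM, component 1 accumulates COUNT.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub struct SumCount(pub [i64; 2]);

impl SumCount {
    /// Monoid operation: component-wise addition, with [0, 0] as identity.
    pub fn plus(self, other: SumCount) -> SumCount {
        SumCount([self.0[0] + other.0[0], self.0[1] + other.0[1]])
    }
}

/// "Explode" one input change: the value leaves the data part and enters the diff.
/// `multiplicity` is +1 for an insertion and -1 for a retraction.
pub fn explode(key: &str, value: i64, multiplicity: i64) -> (String, SumCount) {
    (key.to_string(), SumCount([value * multiplicity, multiplicity]))
}

/// Accumulate the diffs per key, then resurface (sum, count) as data.
/// Keys whose accumulated count is zero are dropped from the output.
pub fn aggregate(changes: &[(String, SumCount)]) -> HashMap<String, (i64, i64)> {
    let mut acc: HashMap<String, SumCount> = HashMap::new();
    for (key, diff) in changes {
        let entry = acc.entry(key.clone()).or_default();
        *entry = (*entry).plus(*diff);
    }
    acc.into_iter()
        .filter(|(_, d)| d.0[1] != 0)
        .map(|(k, d)| (k, (d.0[0], d.0[1])))
        .collect()
}
```

In this sketch a retraction is simply the negated diff, so inserting 3 and 5 under one key and then retracting 3 leaves a sum of 5 and a count of 1 for that key.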

A few minor caveats:

  1. Median is not a monoid operation (there might still be a way, which is technically not correct, but at least morally ;) )
  2. Different aggregations: currently every element in the DiffVector is a Sum monoid. If we want to use Min or Max, we need an enum wrapping them, e.g. Diff <- doesn't look too nice though
  3. Implementing AVG or VAR requires some post-processing that is currently not there.
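As a rough illustration of caveat 2, an enum-wrapped component could look like the sketch below. The names `Agg`, `combine` and `combine_all` are made up for this example and are not types from Differential Dataflow or from this PR.

```rust
/// Hypothetical per-component aggregate; each element of the diff vector
/// carries its own kind of combination.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Agg {
    Sum(i64),
    Min(i64),
    Max(i64),
}

impl Agg {
    /// Combine two components of the same kind; returns None on a kind mismatch.
    pub fn combine(self, other: Agg) -> Option<Agg> {
        match (self, other) {
            (Agg::Sum(a), Agg::Sum(b)) => Some(Agg::Sum(a + b)),
            (Agg::Min(a), Agg::Min(b)) => Some(Agg::Min(a.min(b))),
            (Agg::Max(a), Agg::Max(b)) => Some(Agg::Max(a.max(b))),
            _ => None,
        }
    }
}

/// Combine two diff vectors component-wise.
/// Returns None if the lengths differ or any pair of components mismatches.
pub fn combine_all(left: &[Agg], right: &[Agg]) -> Option<Vec<Agg>> {
    if left.len() != right.len() {
        return None;
    }
    left.iter().zip(right).map(|(l, r)| l.combine(*r)).collect()
}
```

One thing the sketch makes visible: unlike Sum, the Min and Max variants have no inverse, so a retraction cannot be expressed by negating the diff. How that should interact with retractions is left open here.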

@bachdavi
Collaborator Author

Corresponding PR in Differential Dataflow: TimelyDataflow/differential-dataflow#156

@comnik
Owner

comnik commented Mar 10, 2019

Exciting stuff!

Yay for separate module, nay for killing the old aggregate test ;)

  1. What would that look like?

  2. But seems worth it? Or am I missing something?

  3. I kind of want to get rid of any aggregation that clients could easily derive from a lower-level one, on which Differential can do the heavy lifting. So AVG and VARIANCE would be kicked out. What do you think?
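For reference, a sketch of what such client-side derivation could look like. It assumes the client receives COUNT and SUM, plus a sum of squares for the variance; that third aggregate and the `Moments` struct are assumptions of this example and are not part of the PR.

```rust
/// Lower-level aggregates as a client might receive them (hypothetical shape).
#[derive(Clone, Copy, Debug)]
pub struct Moments {
    pub count: i64,
    pub sum: i64,
    pub sum_of_squares: i64,
}

/// AVG = SUM / COUNT; None when there are no values.
pub fn avg(m: Moments) -> Option<f64> {
    if m.count <= 0 {
        None
    } else {
        Some(m.sum as f64 / m.count as f64)
    }
}

/// Population variance via E[x^2] - E[x]^2; None when there are no values.
pub fn variance(m: Moments) -> Option<f64> {
    let mean = avg(m)?;
    Some(m.sum_of_squares as f64 / m.count as f64 - mean * mean)
}
```

Note that the E[x^2] - E[x]^2 form can lose precision when the mean is large relative to the spread, so a client might prefer a different formula in practice.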
