PyDBSP

Introduction - (a subset of) Differential Dataflow for the masses

This library provides an implementation of the DBSP language for incremental streaming computations. It is a tool primarily meant for research. See it as the PyTorch of streaming.

It has zero dependencies and is written in pure Python.

Here you can find a single-notebook implementation of almost everything in the DBSP paper. It mirrors what is in this library in an accessible way, and with more examples.

What is DBSP?

DBSP is Differential Dataflow's less expressive successor. It is a theory and framework that competes with other stream processing systems such as Flink and Spark.

Its value is easiest to see in what it does: it transforms "batch", possibly iterative, relational queries into "streaming incremental" ones. This, however, conveys only a fraction of the theory's power.
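To make the batch-versus-incremental distinction concrete, here is a minimal sketch in plain Python. It does not use this library's API: `zset_add` and `select_even` are hypothetical helpers written only for this illustration. It assumes the usual Z-set representation (rows mapped to integer weights, where +1 is an insertion and -1 a deletion) and a linear query (a filter), for which applying the query to each change and accumulating the results gives the same view as re-running the query on the full snapshot.

```python
def zset_add(a, b):
    """Add two Z-sets (dicts of row -> integer weight), dropping zero weights."""
    out = dict(a)
    for row, weight in b.items():
        total = out.get(row, 0) + weight
        if total == 0:
            out.pop(row, None)
        else:
            out[row] = total
    return out


def select_even(zset):
    """Hypothetical 'batch' query: keep rows whose value is even."""
    return {row: weight for row, weight in zset.items() if row % 2 == 0}


# A stream of changes: insert 1 and 2, then insert 4 and delete 2, then insert 6.
deltas = [{1: 1, 2: 1}, {4: 1, 2: -1}, {6: 1}]

# Batch style: rebuild the snapshot and re-run the query on all of it.
snapshot = {}
batch_results = []
for delta in deltas:
    snapshot = zset_add(snapshot, delta)
    batch_results.append(select_even(snapshot))

# Incremental style: run the query on each change only, then accumulate.
view = {}
incremental_results = []
for delta in deltas:
    view = zset_add(view, select_even(delta))
    incremental_results.append(view)

# Both styles produce the same view after every step.
assert batch_results == incremental_results
```

The incremental loop never touches the full snapshot, so its cost per step depends on the size of the change rather than the size of the data. Non-linear operators such as joins need more care than this filter does.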

As an extreme example, you can find an incremental interpreter for Datalog under pydbsp.algorithm. Datalog is a query language similar to SQL, with a focus on efficiently supporting recursion. By implementing Datalog interpretation with DBSP, we get an interpreter whose queries can change at runtime and that responds to new data as it is streamed in.
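As a rough illustration of what incrementally maintaining a recursive Datalog query involves, the sketch below keeps a reachability relation up to date as edges are inserted. It is not the interpreter in pydbsp.algorithm and does not use this library's API: `step` and `insert_edges` are hypothetical names, the technique is plain semi-naive evaluation over Python sets, and it handles insertions only (deletions would need weighted Z-sets).

```python
# Rules being maintained:
#   reach(x, y) :- edge(x, y).
#   reach(x, z) :- reach(x, y), edge(y, z).


def step(paths, edges):
    """Join paths (x, y) with edges (y, z) to derive paths (x, z)."""
    return {(x, z) for (x, y) in paths for (y2, z) in edges if y == y2}


def insert_edges(edges, reach, new_edges):
    """Return updated (edges, reach) after inserting new_edges.

    Only facts that are actually new are pushed through the recursive rule.
    """
    edges = edges | new_edges
    # New facts come from the new edges themselves, or from old paths
    # extended by a new edge.
    delta = (new_edges | step(reach, new_edges)) - reach
    reach = set(reach)
    while delta:
        reach |= delta
        # Extend only the freshly derived paths, using all known edges.
        delta = step(delta, edges) - reach
    return edges, reach


# Stream edges in two batches.
edges, reach = insert_edges(set(), set(), {(1, 2), (2, 3)})
edges, reach = insert_edges(edges, reach, {(3, 4)})

# Recomputing from scratch with all edges at once gives the same relation.
_, from_scratch = insert_edges(set(), set(), {(1, 2), (2, 3), (3, 4)})
assert reach == from_scratch
```

The second call only explores paths that involve the new edge `(3, 4)`, which is the property an incremental interpreter aims for on recursive queries.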

Examples

Paper walkthroughs

Implementation of the DBSP Paper

Blogposts

Streaming Pandas on the GPU

Notebooks

Tests

There are many examples in each test/test_*.py file.
