-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
integration: added HaystackAction #398
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 660a268 in 32 seconds
More details
- Looked at
215
lines of code in2
files - Skipped
2
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_71RMLCbEWlBw8VfO
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
A preview of f9e171b is uploaded and can be seen here: ✨ https://burr.dagworks.io/pull/398 ✨ Changes may take a few minutes to propagate. Since this is a preview of production, content with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
- Let's mark the pipeline converter as experimental
- Let's do tests with "custom" haystack components -- should be easy for tests
- Add to refs/docs
https://docs.haystack.deepset.ai/docs/custom-components
burr/integrations/haystack.py
Outdated
self._component = component | ||
self._name = name | ||
self._reads = list(reads.keys()) if isinstance(reads, dict) else reads | ||
self._writes = list(writes.values()) if isinstance(writes, dict) else writes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably validate that writes doesn't have a duplicate mapping, given that it's the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed some mappings to ensure reads={socket: state_field}
and writes={state_field: socket}
.
This seems to make the most sense because:
- on
reads
, each key should be unique and map to a unique kwarg ofComponent.run()
. The same state value could be passed to different kwargs - on
writes
, each key should be a unique state field. Even thoughComponent.run()
returns a dictionary where keys are unique, we want to prevent calling.update()
on the same field several times.
In other words, this would be invalid:
class CustomComponent:
def run() -> dict:
return {"bar": 1, "baz": 2}
# mapping both `bar` and `baz` to `State["foo"]` is invalid
HaystackAction(
component=CustomComponent()
writes={"bar": "foo", "baz": "foo"}
)
# reverting the mapping ensures that two outputs can't be mapped to `foo`
HaystackAction(
component=CustomComponent()
writes={"foo1": "bar", "foo2": "baz"}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that makes sense, just make sure it's thoroughly documented
return self._writes | ||
|
||
@property | ||
def inputs(self) -> tuple[dict[str, str], dict[str, str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably assign this in the constructor? Break it into a helper function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding .inputs()
, it is currently a property, but it's return value may change if we allow a .bind()
method. What was previously a required/optional input is now bound.
If .bound()
is removed, then yes we could set values in __init__()
It seems that this logic belongs to .inputs()
and wouldn't be of much use elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, .bound()
should return a copy of the object so the .inputs()
are frozen IMO. That said, this should take in bound_inputs
in the constructor, and we don't need a method? You don't really want properties dynamically computed like this, it's hard to debug and a bit iffy IMO.
burr/integrations/haystack.py
Outdated
|
||
return state.update(**state_update) | ||
|
||
def bind(self, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So bind
is only (currently) for functions, actions don't have it. Might be a bit confusing here, but if we like it, it should really be at the Action
level, as that makes a lot of sense IMO. Then it could just do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise we can make this a more specific name to not conflict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand, the pattern is to only allow bound_params
at init? This affects the Component.run()
method
This would be ok:
haystack_retrieve_documents = HaystackAction(
component=InMemoryEmbeddingRetriever(InMemoryDocumentStore()),
name="retrieve_documents",
reads=["query_embedding"],
writes=["documents"],
bound_params={"foo": "bar"},
)
This would not be ok?
haystack_retrieve_documents = HaystackAction(
component=InMemoryEmbeddingRetriever(InMemoryDocumentStore()),
name="retrieve_documents",
reads=["query_embedding"],
writes=["documents"],
)
haystack_retrieve_documents.bind(foo="bar")
I should simply remove the .bind()
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, or call it something else? Alternatively, we can make bind work at the action level. Currently it's only for functions, and class-based actions don't have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So yeah, the pattern should be a copy-operation here, or remove this and just do the inputs in the parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 82ee637 in 15 seconds
More details
- Looked at
13
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. pyproject.toml:65
- Draft comment:
Add 'haystack' to the '[project.optional-dependencies]' section to ensure it is recognized as an optional dependency. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_DEHzhJ627MrB9dwi
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on f9e171b in 26 seconds
More details
- Looked at
26
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. burr/integrations/haystack.py:191
- Draft comment:
Remove the leftoverprint
statement for cleaner code. - Reason this comment was not posted:
Comment looked like it was already resolved.
2. pyproject.toml:118
- Draft comment:
Ensure thathaystack-ai
is the correct package name for the intended dependency. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_kEbWbKwmrel9XTj4
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quick point about bind()
doing a mutation, otherwise looks good
burr/integrations/haystack.py
Outdated
raise ValueError( | ||
f"Socket `{output_socket_name}` missing from output of `Component.run()`" | ||
) from e | ||
print(state_update) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
printline
burr/integrations/haystack.py
Outdated
self._component = component | ||
self._name = name | ||
self._reads = list(reads.keys()) if isinstance(reads, dict) else reads | ||
self._writes = list(writes.values()) if isinstance(writes, dict) else writes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that makes sense, just make sure it's thoroughly documented
burr/integrations/haystack.py
Outdated
|
||
return state.update(**state_update) | ||
|
||
def bind(self, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So yeah, the pattern should be a copy-operation here, or remove this and just do the inputs in the parameters.
Add
HaystackAction
which allows to define a BurrAction
from a HaystackComponent
.Changes
HaystackAction
classhaystack_pipeline_to_burr_graph()
functionHow I tested this
Notes
Checklist
Important
Add
HaystackAction
and conversion function to integrate Haystack components and pipelines with Burr.HaystackAction
class inhaystack.py
to convert HaystackComponent
to BurrAction
.haystack_pipeline_to_burr_graph()
function inhaystack.py
to convert HaystackPipeline
to BurrGraph
.haystack.rst
for integration reference.examples/haystack-integration/README.md
.application.py
demonstrating integration usage.test_burr_haystack.py
to testHaystackAction
and pipeline conversion.haystack
topyproject.toml
under optional dependencies.This description was created by for f9e171b. It will automatically update as commits are pushed.