diff --git a/docs/3.0rc/get-started/index.mdx b/docs/3.0rc/get-started/index.mdx index a62103228871..ef65a7fa09ff 100644 --- a/docs/3.0rc/get-started/index.mdx +++ b/docs/3.0rc/get-started/index.mdx @@ -31,7 +31,7 @@ if __name__ == "__main__": Supercharge Prefect with enhanced governance, security, and performance capabilities. - + Upgrade from Prefect 2 to Prefect 3 to get the latest features and performance enhancements. diff --git a/docs/3.0rc/resources/upgrade-prefect-3.mdx b/docs/3.0rc/resources/upgrade-prefect-3.mdx deleted file mode 100644 index 13844bfe8669..000000000000 --- a/docs/3.0rc/resources/upgrade-prefect-3.mdx +++ /dev/null @@ -1,406 +0,0 @@ ---- -title: Upgrade to Prefect 3 -description: Learn how to upgrade from Prefect 2 to Prefect 3. ---- - -Prefect 3 is the most flexible, powerful, and performant version of Prefect yet. -It offers more ways than ever to [model your workflows](https://github.com/PrefectHQ/prefect/releases/tag/3.0.0rc1). -This means that some patterns supported by Prefect 2 are no longer supported with Prefect 3, or are enabled differently. - -Most Prefect 2 workflows can be upgraded to Prefect 3 with no changes. -This guide covers changes that you may need to make to ensure your Prefect 2 workflows keep flowing. - - -Prefect 3 is an upgrade to the open source `prefect` Python package. -The Prefect 3 release does not involve any changes to Prefect Cloud. -All previous and future Prefect 2 releases will continue to work with Prefect Cloud. - - -## Client Side Task Orchestration -Prefect 3 introduces client-side task run orchestration as the default. Task creation and state updates now happen locally, dramatically reducing API calls to the Prefect server during execution. This change brings a 10x performance improvement for task executions in many scenarios. 
- -The reduced server dependency also means your workflows are more resilient to network issues or server load, continuing to run even if there's a temporary loss of connection to the Prefect server. - -Task updates are now logged in batch, greatly improving efficiency and enabling Prefect 3 to handle much larger and more complex workflows. You can create and run workflows with thousands of tasks efficiently, supporting large-scale data processing and complex orchestration scenarios. - -State changes and logs in the Prefect UI are now eventually consistent, this trade-off enables the substantial performance gains and increased workflow scale that Prefect 3 delivers. - -## Module location and name changes - -Some modules have been renamed, reorganized, or removed for clarity. -Objects whose import paths have changed will remain functional until their six month deprecation period ends, but will raise deprecation warnings. - -See the [migration script](https://github.com/PrefectHQ/prefect/blob/main/src/prefect/_internal/compatibility/migration.py#L22) for the full list of modules that have been moved or removed. - -## Integrations must be upgraded - -Integration packages compatible with Prefect 2 generally cannot be used with Prefect 3 and vice versa. -You must update all of your integration packages when upgrading from Prefect 2 to Prefect 3. - -Specify extras when you install the `prefect` package to ensure that you install the integration package version that corresponds to your Prefect version. For example, use `pip install prefect[aws] --pre` to install Prefect 3 and a compatible `prefect-aws` version. - -## Asynchronous task execution behavior - -Prefect 2 abstracts away certain task execution concerns, which is often convenient, but sometimes causes unexpected behavior. -Prefect 3 defaults to running tasks as they would run as normal Python code, and requires you to be more explicit when you modify task execution behavior. 
-This results in clearer behavior with fewer surprises. - -In Prefect 3, all workflow code, including task functions, runs on the main thread. This means: - -- Python code that is not explicitly defined as asynchronous executes synchronously, as normal Python code does. -- Non-threadsafe objects, such as database connections, can be shared globally within a workflow. - - -In Prefect 3, asynchronous tasks and flows **cannot** be called from synchronous tasks or flows as they could in Prefect 2. - - -To execute tasks asynchronously, you must either write asynchronous Python functions or use the `Task.submit` method to submit to a Prefect task runner. -Prefect 3 encourages conventional asynchronous Python programming principles. -Updating your workflows to bring them in line with these principles requires code changes. -Here are some examples: - - - -```python Prefect 2 -@task -async def my_task(): - return my_value - - -# Enclosing flow can be sync -@flow -def my_flow(): - return my_task() - - -if __name__ == "__main__": - my_flow() -``` - -```python Prefect 3 -@task -async def my_task(): - return my_value - - -# The enclosing flow must be async as well -@flow -async def my_flow(): - return await my_task() - - -if __name__ == "__main__": - asyncio.run(my_flow()) -``` - - -Because `Task.submit` is always a synchronous call, it does not need to be awaited. -`Task.submit` is still non-blocking and execution of the submitted task is also still non-blocking. 
- -You do not need to use the `await` keyword in front of your `.submit` call: - - -```python Prefect 2 -@task -async def my_task(): - return my_value - - -@flow -async def my_flow(): - # Must await `.submit()` - future = await my_task.submit() - return future -``` - -```python Prefect 3 -@task -async def my_task(): - return my_value - - -@flow -async def my_flow(): - # No need to await `.submit()` - future = my_task.submit() - return future -``` - - -The `PrefectFuture` methods are also always synchronous in Prefect 3. Names and return values changed to reflect this: - -- `PrefectFuture.get_state()` is now `PrefectFuture.state` -- `PrefectFuture.wait()` now returns `None` instead of a `State` - -Wait for the future to complete, then retrieve the state separately: - - -```python Prefect 2 -@task -async def my_task(): - return my_value - - -@flow -async def my_flow(): - future = await my_task.submit() - # State is returned by `wait()` - state = await future.wait() - return state -``` - -```python Prefect 3 -@task -async def my_task(): - return my_value - - -@flow -async def my_flow(): - future = my_task.submit() - # Wait for future completion - # before retrieving the state - future.wait() - state = future.state - return state -``` - - -If you do not first wait for the future to complete, the run may raise an `AttributeError: coroutine object has no attribute get` exception. - -In Prefect 3, `PrefectFuture`s resulting from submitted task runs are not automatically awaited or resolved before exiting a flow, unless they're a dependency of other flows or tasks. 
You must wait for a task's `PrefectFuture` resolution before the end of a flow: - - -```python Prefect 2 -@task -def my_task(): - return my_value - - -@flow -def my_flow(): - # The flow will not exit until - # these futures are resolved - futures = my_task.map([1, 2, 3, 4]) -``` - -```python Prefect 3 -@task -def my_task(): - return my_value - - -@flow -def my_flow(): - futures = my_task.map([1, 2, 3, 4]) - # You must wait for futures before exiting flow. - futures.wait() -``` - - -If you do not wait for the futures, the tasks may fail with a `Flow must be running to start a task` error. -The one exception to this case is that `PrefectFuture`s returned from a flow resolve to states before completing the flow: - -```python Prefect 3 -@flow -def my_flow(): - # Prefect will resolve these futures to - # states because the flow returned them, - # so you don't have to call `wait()`. - return my_task.map([1, 2, 3, 4]) -``` - -## Task caching and result persistence - -Prefect 3 supports transactional semantics — tasks runs can be grouped into larger transactions, enabling more resilient workflows. -In keeping with the transactional model, Prefect 3 encourages task runs to be idempotent — intended to run only once per flow run. - -To support this, Prefect 3 has more eager, intelligent task caching defaults than Prefect 2. -Whenever a task runs, Prefect 3 first computes its cache key, then uses that cache key to evaluate whether that same task has run with the same inputs and as part of the same flow before. -If it has, the cached result from the previous run is used. If it hasn't, the task runs. - -Tasks that may have re-run with the same inputs in Prefect 2 do not re-run in Prefect 3 unless the cache key is modified to be more precise. -The cache key can be configured through the new global `cache_policy` setting, or through the `cache_policy` on a particular task. 
-If a particular task isn't idempotent, you can disable this behavior by setting `cache_policy=NONE` on the task. - -This caching behavior only takes effect when result persistence is enabled. -See the [caching documentation](/3.0rc/develop/task-caching) for more information. - -## Workers replace agents - -A deprecation warning was applied to Prefect interfaces related to agent-based deployments in Prefect's [2.16.4 release](https://github.com/PrefectHQ/prefect/releases/tag/2.16.4), including infrastructure blocks and `prefect deployment` commands. -Agents will continue to be supported in Prefect 2 until September 2024, but are not included in Prefect 3. -If you are using agents, learn how to [upgrade from agents to workers](/3.0rc/resources/upgrade-agents-to-workers). - -## Type enforcement via Pydantic - -If you use Pydantic V1 models for data validation and type enforcement, particularly for flow parameters and custom blocks, you must upgrade those models to Pydantic V2 to work with Prefect 3. -See the [Pydantic migration guide](https://docs.pydantic.dev/latest/migration/) for more information. - -## Variables can be typed with JSON - -In Prefect 2, variables are strings. -To store and retrieve *typed* values in Prefect 2, you have to use certain single-value blocks, including JSON, Date Time, and String blocks. - -In Prefect 3, variables can be any JSON-compatible type. -Single-value blocks created in Prefect 2 are still accessible in Prefect 3, but you cannot create new single-value blocks. - -See the [set and get variables](/3.0rc/develop/variables) page for more information. 
- -## Self-hosted server database upgrade - -If you [self-host a Prefect server instance](/3.0rc/manage/self-host), run the following command to update your database for Prefect 3: - -```bash -prefect server database upgrade -y -``` - -## Flow run final state determination - -In Prefect 2, the final state of a flow run was influenced by the states of its task runs; if any task run failed, the flow run was marked as failed. - -In Prefect 3, the final state of a flow run is entirely determined by: - -1. The `return` value of the flow function (same as in Prefect 2): - - Literal values are considered successful. - - Any explicit `State` that is returned will be considered the final state of the flow run. If an iterable of `State` objects is returned, all must be `Completed` for the flow run to be considered `Completed`. If any are `Failed`, the flow run will be marked as `Failed`. - -2. Whether the flow function allows an exception to `raise`: - - Exceptions that are allowed to propagate will result in a `Failed` state. - - Exceptions suppressed with `raise_on_failure=False` will not affect the flow run state. - -This change means that task failures within a flow do not automatically cause the flow run to fail unless they affect the flow's return value or raise an uncaught exception. - - -When migrating from Prefect 2 to Prefect 3, be aware that flows may now complete successfully even if they contain failed tasks, unless you explicitly handle task failures. - - -To ensure your flow fails when critical tasks fail, consider these approaches: - -1. Allow exceptions to propagate by not using `raise_on_failure=False`. -2. Use `return_state=True` and explicitly check task states to conditionally `raise` the underlying exception or return a failed state. -3. Use try/except blocks to handle task failures and return appropriate states. 
- -### Examples - - -```python Allow Unhandled Exceptions -from prefect import flow, task - -@task -def failing_task(): - raise ValueError("Task failed") - -@flow -def my_flow(): - failing_task() # Exception propagates, causing flow failure - -try: - my_flow() -except ValueError as e: - print(f"Flow failed: {e}") # Output: Flow failed: Task failed -``` - -```python Use return_state -from prefect import flow, task -from prefect.states import Failed - -@task -def failing_task(): - raise ValueError("Task failed") - -@flow -def my_flow(): - state = failing_task(return_state=True) - if state.is_failed(): - raise ValueError(state.result()) - return "Flow completed successfully" - -try: - print(my_flow()) -except ValueError as e: - print(f"Flow failed: {e}") # Output: Flow failed: Task failed -``` - -```python Use try/except -from prefect import flow, task -from prefect.states import Failed - -@task -def failing_task(): - raise ValueError("Task failed") - -@flow -def my_flow(): - try: - failing_task() - except ValueError: - return Failed(message="Flow failed due to task failure") - return "Flow completed successfully" - -print(my_flow()) # Output: Failed(message='Flow failed due to task failure') -``` - - -Choose the strategy that best fits your specific use case and error handling requirements. 
- -## Breaking changes: Errors and resolutions - -#### `can't be used in 'await' expression` - -In Prefect 3, certain methods that were contextually sync/async in Prefect 2 are now synchronous: - -##### `Task` -- `submit` -- `map` - -##### `PrefectFuture` -- `result` -- `wait` - -Attempting to use `await` with these methods will result in a `TypeError`, like: - -```python -TypeError: object PrefectConcurrentFuture can't be used in 'await' expression -``` - -**Example and Resolution** - -You should **remove the `await` keyword** from calls of these methods in Prefect 3: - -```python -from prefect import flow, task -import asyncio - -@task -async def fetch_user_data(user_id): - return {"id": user_id, "score": user_id * 10} - -@task -def calculate_average(user_data): - return sum(user["score"] for user in user_data) / len(user_data) - -@flow -async def prefect_2_flow(n_users: int = 10): # ❌ - users = await fetch_user_data.map(range(1, n_users + 1)) - avg = calculate_average.submit(users) - print(f"Users: {await users.result()}") - print(f"Average score: {await avg.result()}") - -@flow -def prefect_3_flow(n_users: int = 10): # ✅ - users = fetch_user_data.map(range(1, n_users + 1)) - avg = calculate_average.submit(users) - print(f"Users: {users.result()}") - print(f"Average score: {avg.result()}") - -try: - asyncio.run(prefect_2_flow()) - raise AssertionError("Expected a TypeError") -except TypeError as e: - assert "can't be used in 'await' expression" in str(e) - -prefect_3_flow() -# Users: [{'id': 1, 'score': 10}, ... , {'id': 10, 'score': 100}] -# Average score: 55.0 -``` \ No newline at end of file diff --git a/docs/3.0rc/resources/upgrade-to-prefect-3.mdx b/docs/3.0rc/resources/upgrade-to-prefect-3.mdx new file mode 100644 index 000000000000..ea73eb2c6b15 --- /dev/null +++ b/docs/3.0rc/resources/upgrade-to-prefect-3.mdx @@ -0,0 +1,175 @@ +--- +title: Upgrade to Prefect 3 +description: Learn how to upgrade from Prefect 2 to Prefect 3. 
+--- + +Prefect 3 introduces exciting new features and improvements while maintaining compatibility with most Prefect 2 workflows. For the majority of users, upgrading to Prefect 3 will be a seamless process that requires few or no code changes. This guide highlights key changes that you may need to consider when upgrading. + +## Quickstart + +To upgrade to Prefect 3, run: + +```bash +pip install "prefect>=3.0.0rc1" --pre +``` + +Note that the `--pre` flag is required to install the release candidate until the final release is available. The version specifier is quoted so that the shell does not treat `>` as an output redirect. + + +If you self-host a Prefect server, run this command to update your database: + +```bash +prefect server database upgrade +``` + +If you use a Prefect integration or extra, remember to upgrade it as well. For example: + +```bash +pip install "prefect[aws]>=3.0.0rc1" --pre +``` + +## Upgrade notes + +### Pydantic V2 + + +This change affects you if: You use custom Pydantic models with Prefect features. + + +Prefect 3 is built with Pydantic 2 for improved performance. All Prefect objects will automatically upgrade, but if you use custom Pydantic models for flow parameters or custom blocks, you'll need to ensure they are compatible with Pydantic 2. You can continue to use Pydantic 1 models in your own code if they do not interact directly with Prefect. + +Refer to [Pydantic's migration guide](https://docs.pydantic.dev/latest/migration/) for detailed information on necessary changes. + +### Module location and name changes + +Some less commonly used modules have been renamed, reorganized, or removed for clarity. The old import paths will continue to be supported for 6 months, but will emit deprecation warnings. You can look at the [deprecation code](https://github.com/PrefectHQ/prefect/blob/main/src/prefect/_internal/compatibility/migration.py) to see a full list of affected paths. + +### Async tasks + + +This change affects you if: you use advanced asynchronous behaviors in your flows. 
+ + +Prefect 3 makes a few changes to handling asynchronous code. There are three key changes to be aware of: + +1. **Async Tasks in Synchronous Flows**: In Prefect 2, it was possible to call native `async` tasks from synchronous flows, a pattern that is not normally supported in Python. Prefect 3 removes this ability to reduce complexity and potential issues. If you relied on asynchronous tasks in synchronous flows, you must either make your flow asynchronous or use a task runner that supports asynchronous execution. + +### Flow final states + + +This change affects you if: you want your flow to fail if any task in the flow fails, and you invoke your tasks in a way that doesn't automatically raise an error (including submitting them to a `TaskRunner`). + + +In Prefect 2, the final state of a flow run was influenced by the states of its task runs; if any task run failed, the flow run was marked as failed. + +In Prefect 3, the final state of a flow run is entirely determined by: + +1. The `return` value of the flow function (same as in Prefect 2): + - Literal values are considered successful. + - Any explicit `State` that is returned will be considered the final state of the flow run. If an iterable of `State` objects is returned, all must be `Completed` for the flow run to be considered `Completed`. If any are `Failed`, the flow run will be marked as `Failed`. + +2. Whether the flow function allows an exception to `raise`: + - Exceptions that are allowed to propagate will result in a `Failed` state. + - Exceptions suppressed with `raise_on_failure=False` will not affect the flow run state. + +This change means that task failures within a flow do not automatically cause the flow run to fail unless they affect the flow's return value or raise an uncaught exception. + + +When migrating from Prefect 2 to Prefect 3, be aware that flows may now complete successfully even if they contain failed tasks, unless you explicitly handle task failures. 
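The two rules above can be modelled in plain Python to make the decision order easier to follow. This is only an illustrative sketch and not Prefect's implementation: `StateType` and `final_flow_state` are hypothetical names invented for this example, and the model assumes just two states (`Completed` and `Failed`), which is a simplification of the real state machine.

```python
from enum import Enum
from typing import Any, Optional


class StateType(Enum):
    # Hypothetical stand-in for a flow or task state; not a Prefect class.
    COMPLETED = "Completed"
    FAILED = "Failed"


def final_flow_state(
    return_value: Any = None,
    raised: Optional[BaseException] = None,
) -> StateType:
    """Toy model of how the final flow run state is chosen (assumed, simplified)."""
    # Rule 2: an exception that propagates out of the flow fails the run.
    if raised is not None:
        return StateType.FAILED
    # Rule 1: an explicitly returned state becomes the final state.
    if isinstance(return_value, StateType):
        return return_value
    # Rule 1: a non-empty list or tuple of states is Completed only if all are.
    if (
        isinstance(return_value, (list, tuple))
        and return_value
        and all(isinstance(item, StateType) for item in return_value)
    ):
        if all(item is StateType.COMPLETED for item in return_value):
            return StateType.COMPLETED
        return StateType.FAILED
    # Rule 1: any other (literal) return value counts as success.
    return StateType.COMPLETED
```

In this model, a flow that swallows a task failure and returns a literal still ends as `Completed`, for example `final_flow_state("done")`, while `final_flow_state([StateType.COMPLETED, StateType.FAILED])` and `final_flow_state(raised=ValueError("Task failed"))` both end as `Failed`.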
+ + +To ensure your flow fails when critical tasks fail, consider these approaches: + +1. Allow task exceptions to propagate by not using `raise_on_failure=False`. +2. Use `return_state=True` and explicitly check task states to conditionally `raise` the underlying exception or return a failed state. +3. Use try/except blocks to handle task failures and return appropriate states. + +#### Examples + + +```python Allow Unhandled Exceptions +from prefect import flow, task + +@task +def failing_task(): + raise ValueError("Task failed") + +@flow +def my_flow(): + failing_task() # Exception propagates, causing flow failure + +try: + my_flow() +except ValueError as e: + print(f"Flow failed: {e}") # Output: Flow failed: Task failed +``` + +```python Use return_state +from prefect import flow, task +from prefect.states import Failed + +@task +def failing_task(): + raise ValueError("Task failed") + +@flow +def my_flow(): + state = failing_task(return_state=True) + if state.is_failed(): + raise ValueError(state.result()) + return "Flow completed successfully" + +try: + print(my_flow()) +except ValueError as e: + print(f"Flow failed: {e}") # Output: Flow failed: Task failed +``` + +```python Use try/except +from prefect import flow, task +from prefect.states import Failed + +@task +def failing_task(): + raise ValueError("Task failed") + +@flow +def my_flow(): + try: + failing_task() + except ValueError: + return Failed(message="Flow failed due to task failure") + return "Flow completed successfully" + +print(my_flow()) # Output: Failed(message='Flow failed due to task failure') +``` + + +Choose the strategy that best fits your specific use case and error handling requirements. + +----- + +### Futures interface + + +This change affects you if: you directly interact with `PrefectFuture` objects. + + +PrefectFutures now have a standard synchronous interface, with an asynchronous one [planned soon](https://github.com/PrefectHQ/prefect/issues/15008). 
+ + +### Automatic task caching + + +This change affects you if: You rely on side effects in your tasks + + +Prefect 3 introduces a powerful idempotency engine. By default, tasks in a flow run are automatically cached if they are called more than once with the same inputs. If you rely on tasks with side effects, this may result in surprising behavior. To disable caching, pass `cache_policy=None` to your task. + +### Workers + + +This change affects you if: You're using agents from an early version of Prefect 2. + + +In Prefect 2, agents were deprecated in favor of next-generation workers. Workers are now standard in Prefect 3. For detailed information on upgrading from agents to workers, please refer to our [upgrade guide](https://docs-3.prefect.io/3.0rc/resources/upgrade-agents-to-workers). diff --git a/docs/3.0rc/resources/whats-new-prefect-3.mdx b/docs/3.0rc/resources/whats-new-prefect-3.mdx new file mode 100644 index 000000000000..66913f599512 --- /dev/null +++ b/docs/3.0rc/resources/whats-new-prefect-3.mdx @@ -0,0 +1,46 @@ +--- +title: What's new in Prefect 3 +sidebarTitle: What's new +--- + +Prefect 3 represents a significant leap forward in workflow orchestration, bringing a host of new features, performance improvements, and expanded capabilities to enhance your data engineering experience. Let's explore the exciting new additions and enhancements in this release. + +Most Prefect 2 users can upgrade without changes to their existing workflows. Please review the [upgrade guide](/3.0rc/resources/upgrade-to-prefect-3) for more information. + +## Open source events and automation system + +One of the most anticipated features in Prefect 3 is the introduction of the events and automation system to the open-source package. Previously exclusive to Prefect Cloud, this powerful system now allows all users to create sophisticated, event-driven workflows. 
+ +With this new capability, you can trigger actions based on specific event payloads, cancel runs if certain conditions aren't met, or automate workflow runs based on external events. For instance, you could initiate a data processing pipeline automatically when a new file lands in an S3 bucket. The system also enables you to receive notifications for various system health events, giving you greater visibility and control over your workflows. + +## New transactional interface + +Another major addition in Prefect 3 is the new transactional interface. This powerful feature makes it easier than ever to build resilient and idempotent pipelines. With the transactional interface, you can group tasks into transactions, automatically roll back side effects on failure, and significantly improve your pipeline's idempotency and resilience. + +For example, you can define rollback behaviors for your tasks, ensuring that any side effects are cleanly reversed if a transaction fails. This is particularly useful for maintaining data consistency in complex workflows involving multiple steps or external systems. + +## Flexible task execution + +Prefect 3 has no restrictions on where tasks can run. Tasks can be nested within other tasks, allowing for more flexible and modular workflows; they can also be called outside of flows, essentially enabling Prefect to function as a background task service. You can now run tasks autonomously, apply them asynchronously, or delay their execution as needed. This flexibility opens up new possibilities for task management and execution strategies in your data pipelines. + + +## Enhanced client-side engine + +Prefect 3 comes with a thoroughly reworked client-side engine that brings several improvements to the table. You can now nest tasks within other tasks, adding a new level of modularity to your workflows. The engine also supports generator tasks, allowing for more flexible and efficient handling of iterative processes. 
+ +One of the most significant changes is that all code now runs on the main thread by default. This change improves performance and leads to more intuitive behavior, especially when dealing with shared resources or non-thread-safe operations. + +## Improved artifacts and variables + +Prefect 3 enhances the artifacts system with new types, including progress bars and image artifacts. These additions allow for richer, more informative task outputs, improving the observability of your workflows. + +The variables system has also been upgraded to support arbitrary JSON, not just strings. This expansion allows for more complex and structured data to be stored and retrieved as variables, increasing the flexibility of your workflow configurations. + +## Workers + +Workers were first introduced in Prefect 2 as next-generation agents, and are now standard in Prefect 3. Workers offer a stronger governance model for infrastructure, improved monitoring of jobs and work pool/queue health, and more flexibility in choosing compute layers, resulting in a more robust and scalable solution for managing the execution of your workflows across various environments. + +## Performance enhancements + +Prefect 3 doesn't just bring new features; it also delivers significant performance improvements. Users running massively parallel workflows on distributed systems such as Dask and Ray will notice substantial speedups. In some benchmark cases, we've observed up to a 98% reduction in runtime overhead. These performance gains translate directly into faster execution times and more efficient resource utilization for your data pipelines. 
+ diff --git a/docs/mint.json b/docs/mint.json index 7f96d807c58d..15fa8a34ae35 100644 --- a/docs/mint.json +++ b/docs/mint.json @@ -12,9 +12,9 @@ "url": "https://communityinviter.com/apps/prefect-community/prefect-community" }, { - "icon": "link", - "name": "Prefect.io", - "url": "https://www.prefect.io/" + "icon": "cloud", + "name": "Prefect Cloud", + "url": "https://app.prefect.cloud" } ], "api": { @@ -52,6 +52,7 @@ "group": "Get started", "pages": [ "3.0rc/get-started/index", + "3.0rc/resources/whats-new-prefect-3", "3.0rc/get-started/install", "3.0rc/get-started/quickstart" ], @@ -197,7 +198,7 @@ { "group": "Resources", "pages": [ - "3.0rc/resources/upgrade-prefect-3", + "3.0rc/resources/upgrade-to-prefect-3", "3.0rc/resources/upgrade-agents-to-workers", "3.0rc/resources/cancel", "3.0rc/resources/visualize-flow-structure", @@ -1276,8 +1277,8 @@ } ], "topbarCtaButton": { - "name": "TRY PREFECT CLOUD", - "url": "https://app.prefect.cloud/" + "type": "github", + "url": "https://github.com/PrefectHQ/Prefect" }, "versions": [ "3.0rc",