diff --git a/docs/contributing/contributing.rst b/docs/contributing/contributing.rst
index 450a6439..ab3859a7 100644
--- a/docs/contributing/contributing.rst
+++ b/docs/contributing/contributing.rst
@@ -1,3 +1,5 @@
+.. _contributing:
+
 ============
 Contributing
 ============
diff --git a/docs/examples/deployment/_web-server.png b/docs/examples/deployment/_web-server.png
deleted file mode 100644
index 20b85af4..00000000
Binary files a/docs/examples/deployment/_web-server.png and /dev/null differ
diff --git a/docs/examples/deployment/aws.md b/docs/examples/deployment/aws.md
deleted file mode 100644
index 99869df3..00000000
--- a/docs/examples/deployment/aws.md
+++ /dev/null
@@ -1,101 +0,0 @@
-# AWS Lambda
-
-[AWS Lambda](https://aws.amazon.com/lambda/) is a serverless compute service from AWS.
-
-This example shows how to deploy a "hello-world" AWS Lambda with a simple Burr application.
-It is based on the official instructions: https://docs.aws.amazon.com/lambda/latest/dg/python-image.html#python-image-instructions
-
-Burr can be deployed within a Lambda function. This is a good option if you want to run your
-application in response to events, or if you want to run your application in a serverless environment.
-
-See the [repository on GitHub](https://github.com/DAGWorks-Inc/burr/tree/main/examples/deployment/aws/lambda) for the full code example and for step-by-step instructions on how to deploy a Burr application to AWS Lambda.
-
-## Prerequisites
-
-- **AWS CLI Setup**: Make sure the AWS CLI is set up on your machine. If you haven't done this yet, no worries! You can follow the [Quick Start guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html) for easy setup instructions.
-
-## Step-by-Step Guide
-
-### 1. Build the Docker image
-
-- **Build the Docker image to deploy to AWS ECR**:
-
-  ```shell
-  docker build --platform linux/amd64 -t aws-lambda-burr .
-  ```
-
-- **Test locally**:
-
-  Run the Docker container:
-
-  ```shell
-  docker run -p 9000:8080 aws-lambda-burr
-  ```
-
-  Send a test request to check that the Docker container handles it correctly:
-
-  ```shell
-  curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"body": {"number":3}}'
-  ```
-
-### 2. Create an AWS ECR repository
-
-Ensure the AWS account number (`111122223333`) is replaced with yours:
-
-- **Authenticate Docker to Amazon ECR**:
-
-  Retrieve an authentication token to authenticate your Docker client to your Amazon Elastic Container Registry (ECR):
-
-  ```shell
-  aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 111122223333.dkr.ecr.us-east-1.amazonaws.com
-  ```
-
-- **Create the ECR repository**:
-
-  ```shell
-  aws ecr create-repository --repository-name aws-lambda-burr \
-      --region us-east-1 \
-      --image-scanning-configuration scanOnPush=true \
-      --image-tag-mutability MUTABLE
-  ```
-
-### 3. Deploy the image to AWS ECR
-
-Ensure the AWS account number (`111122223333`) is replaced with yours:
-
-```shell
-docker tag aws-lambda-burr 111122223333.dkr.ecr.us-east-1.amazonaws.com/aws-lambda-burr:latest
-docker push 111122223333.dkr.ecr.us-east-1.amazonaws.com/aws-lambda-burr:latest
-```
-
-### 4. Create a simple AWS Lambda role
-
-Example of creating an AWS role for Lambda execution:
-
-```shell
-aws iam create-role \
-    --role-name lambda-ex \
-    --assume-role-policy-document '{"Version": "2012-10-17","Statement": [{ "Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'
-```
-
-### 5. Create the AWS Lambda function
-
-Ensure the AWS account number (`111122223333`) is replaced with yours:
-
-```shell
-aws lambda create-function \
-    --function-name aws-lambda-burr \
-    --package-type Image \
-    --code ImageUri=111122223333.dkr.ecr.us-east-1.amazonaws.com/aws-lambda-burr:latest \
-    --role arn:aws:iam::111122223333:role/lambda-ex
-```
-
-### 6. Test the AWS Lambda function
-
-```shell
-aws lambda invoke \
-    --function-name aws-lambda-burr \
-    --cli-binary-format raw-in-base64-out \
-    --payload '{"body": {"number": 5}}' \
-    response.json
-```
diff --git a/docs/examples/deployment/index.rst b/docs/examples/deployment/index.rst
deleted file mode 100644
index 7fb0c272..00000000
--- a/docs/examples/deployment/index.rst
+++ /dev/null
@@ -1,9 +0,0 @@
-=============
-✈ Deployment
-=============
-
-.. toctree::
-   :maxdepth: 2
-
-   web-server
-   aws
diff --git a/docs/examples/deployment/infrastructure.rst b/docs/examples/deployment/infrastructure.rst
new file mode 100644
index 00000000..be4d7ecc
--- /dev/null
+++ b/docs/examples/deployment/infrastructure.rst
@@ -0,0 +1,8 @@
+-------------------------------------
+Provisioning Infrastructure/Deploying
+-------------------------------------
+
+Burr is not opinionated about the deployment method or cloud you use. Any method that can run a Python server will work
+(AWS, Vercel, etc...). Note that we aim to add more examples here -- see `this issue `_ to track progress!
+
+- `Deploying Burr in an AWS Lambda function `_
diff --git a/docs/examples/deployment/monitoring.rst b/docs/examples/deployment/monitoring.rst
new file mode 100644
index 00000000..4138b6af
--- /dev/null
+++ b/docs/examples/deployment/monitoring.rst
@@ -0,0 +1,24 @@
+------------------------
+Monitoring in Production
+------------------------
+
+Burr's telemetry UI is meant both for debugging and for running in production. It can consume `OpenTelemetry traces `_,
+and has a suite of useful capabilities for debugging Burr applications.
+
+It currently has two implementations:
+
+1. `Local (filesystem) tracking `_ (the default; for debugging, or for lower-scale production use-cases with a distributed file-system)
+2. `S3-based tracking `_ (meant for production use-cases)
+
+Each comes with an implementation of data storage on the server.
+
+To deploy these in production, you can follow these examples:
+
+1. `Burr + FastAPI + docker `_ by `Matthew Rideout `_. This contains a sample API + UI + tracking server, all bundled in one!
+2. `Docker compose + nginx proxy `_ by `Aditha Kaushik `_ for the email assistant example, demonstrating how to run the docker image with the tracking server.
+
+We also have a few open issues tracking documentation for deploying Burr's monitoring system in production:
+
+- `deploy on AWS `_
+- `deploy on GCP `_
+- `deploy on Azure `_
diff --git a/docs/examples/deployment/web-server.md b/docs/examples/deployment/web-server.md
deleted file mode 100644
index 495dad11..00000000
--- a/docs/examples/deployment/web-server.md
+++ /dev/null
@@ -1,224 +0,0 @@
-# Web service (FastAPI, Flask, Django, etc.)
-
-Burr is meant to run interactive apps. This means running it as part of a web service that
-responds to requests, manages state, and documents its capabilities.
-The interactive nature of Burr (moving in/out of programmatic control) means we want to think
-carefully about how to expose our Burr applications to the web. Burr makes it natural to integrate
-with a web server such as FastAPI.
-
-In this tutorial we will use the [email assistant example](https://github.com/DAGWorks-Inc/burr/tree/main/examples/email-assistant) as a walk-through.
-Our goal is to expose the email assistant in a web server that a UI can easily be built on top of.
-While we will not be building the UI here, we will link out to the final product for you to explore.
-
-## Email Assistant
-
-The email assistant is an example of a "human-in-the-loop" generative AI application. This means that
-it requires human assistance at multiple points to build a better product.
-
-### Running the example
-
-If you want to get a sense of how this looks, open the Burr UI:
-
-```bash
-pip install "burr[start]"
-burr
-```
-
-Then navigate to the email assistant at http://localhost:7241/demos/email-assistant.
-
-You can create a new "application" and watch it run through, with the telemetry on the right side.
-
-### Conceptual Model
-At a high level, the email assistant does the following:
-
-1. Accepts an email + instructions to respond
-2. Comes up with a set of clarifying questions (if the LLM deems them required)
-3. Using the answers to those questions, generates a draft
-4. Accepts feedback on that draft and generates another one, repeating until the user is happy
-5. Returns the final draft
-
-Due to the stochastic, often complex nature of LLMs, this has been shown to be one of the most promising
-applications -- a collaboration between humans and AI to quickly build high-quality responses.
-
-### Modeling with Burr
-This is a brief overview; for a more in-depth look at the email assistant, see the [email assistant example](https://github.com/DAGWorks-Inc/burr/tree/main/examples/email-assistant).
-To model our email assistant with Burr, we can use the following diagram:
-
-![Modeling](./_web-server.png)
-
-There are three points at which the user can interact:
-1. `process_input`: This is where the user inputs the email and instructions
-2. `clarify_instructions`: The LLM has created a set of clarification questions
-3. `process_feedback`: The user has provided feedback on the draft
-
-(3) repeats until the user is happy with the draft (in our implementation, this occurs when the feedback they provide is empty)
-
-Recall that we use the word "application" in Burr to refer to an instance of the process above
-(with persisted state).
-
-You can see the full application in [application.py](https://github.com/DAGWorks-Inc/burr/tree/main/examples/email-assistant/application.py).
-
-## Integrating in a web server
-
-For this example we will use [FastAPI](https://fastapi.tiangolo.com/) and [pydantic](https://docs.pydantic.dev/latest/),
-but it should work with any other web stack that uses Python.
-
-### Endpoints
-
-We construct the following endpoints:
-
-1. `POST /create`: This creates a new application and returns the ID
-2. `PUT /initialize_draft/{id}/`: This calls out to `process_input`, passing in the email and instructions
-3. `PUT /clarify_instructions/{id}`: This gives the user's answers back to the LLM
-4. `PUT /process_feedback/{id}`: This gives the user's feedback back to the LLM
-5. `GET /{id}/state`: This returns the current state of the application
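-
-Each of the `PUT` endpoints above accepts a small request body. As a minimal sketch of what those
-pydantic models might look like (the first two names are assumptions for illustration; only `Feedback`,
-with its `feedback` field, appears in the server code below):
-
-```python
-from typing import List
-
-import pydantic
-
-
-class DraftInput(pydantic.BaseModel):
-    """Body for `initialize_draft` -- the email and the instructions (hypothetical name)."""
-    email_to_respond: str
-    response_instructions: str
-
-
-class AnswersInput(pydantic.BaseModel):
-    """Body for `clarify_instructions` -- answers to the clarifying questions (hypothetical name)."""
-    answers: List[str]
-
-
-class Feedback(pydantic.BaseModel):
-    """Body for `process_feedback` -- feedback on the current draft."""
-    feedback: str
-```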
-
-The `GET` endpoint allows us to get the current state of the application -- this enables
-the user to reload if they quit the browser/get distracted. Each of these endpoints returns the full state of the application,
-which can be rendered on the frontend. Furthermore, it indicates the next API endpoint to
-call, which allows the UI to render the appropriate form and gather the right inputs.
-
-Using FastAPI + pydantic, this becomes very simple to implement. First, let's add a utility to
-get the `application` object. This uses a cached version or instantiates a new one:
-
-```python
-@functools.lru_cache(maxsize=128)
-def _get_application(project_id: str, app_id: str) -> Application:
-    # project_id is part of the cache key so applications are cached per project
-    app = email_assistant_application.application(app_id=app_id)
-    return app
-```
-
-All this does is call our function `application` in `email_assistant`, which
-recreates the application. We have not included the `create` function here,
-but it calls out to the same API.
-
-### Data Model
-
-Let's then define a pydantic model to represent the state, which the FastAPI endpoints will return:
-
-```python
-class EmailAssistantState(pydantic.BaseModel):
-    app_id: str
-    email_to_respond: Optional[str]
-    response_instructions: Optional[str]
-    questions: Optional[List[str]]
-    answers: Optional[List[str]]
-    drafts: List[str]
-    feedback_history: List[str]
-    final_draft: Optional[str]
-    # This stores the next step, which tells the frontend which endpoint to call
-    next_step: Literal["process_input", "clarify_instructions", "process_feedback", None]
-
-    @staticmethod
-    def from_app(app: Application):
-        # implementation left out; call app.state and translate it to this pydantic model
-        # we can use `app.get_next_action()` to get the next step and return it to the user
-        ...
-```
-
-### Execution
-
-Next, we can run through to the next step, starting from any point:
-
-```python
-def _run_through(project_id: str, app_id: Optional[str], inputs: Dict[str, Any]) -> EmailAssistantState:
-    email_assistant_app = _get_application(project_id, app_id)
-    email_assistant_app.run(  # we run this for its side effect and just get the state after it halts
-        halt_before=["clarify_instructions", "process_feedback"],
-        halt_after=["final_result"],
-        inputs=inputs,
-    )
-    return EmailAssistantState.from_app(email_assistant_app)
-```
-
-We `halt_before` the steps that require user input, and `halt_after`
-the final result. This allows us to get the state after each step.
-
-Finally, we can define our endpoints. For instance:
-
-```python
-@router.post("/provide_feedback/{project_id}/{app_id}")
-def provide_feedback(project_id: str, app_id: str, feedback: Feedback) -> EmailAssistantState:
-    return _run_through(project_id, app_id, dict(feedback=feedback.feedback))
-```
-
-This represents a simple but powerful architecture. We can continue calling these endpoints
-until we reach a "terminal" state, at which point we can always ask for the state.
-If we decide to add more input steps, we can simply modify the state machine.
-We are not required to hold state in the app (it is all delegated to Burr's persistence),
-so we can easily load up from any given point, allowing the user to wait seconds,
-minutes, hours, or even days before continuing.
-
-As the frontend simply renders based on the current state and the next step, it will always
-be correct, and the user can always pick up where they left off. With Burr's telemetry capabilities
-you can debug any state-related issues, ensuring a smooth user experience.
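-
-For completeness, the state endpoint is just a read -- a minimal sketch (the explicit
-`project_id`/`app_id` route shape mirrors the feedback endpoint above and is an assumption):
-
-```python
-@router.get("/{project_id}/{app_id}/state")
-def get_state(project_id: str, app_id: str) -> EmailAssistantState:
-    # no execution here -- just load the application and translate its state
-    return EmailAssistantState.from_app(_get_application(project_id, app_id))
-```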
-
-### Persistence
-
-Note that we never called out to databases -- it all just magically worked. This is because we decouple the persistence
-layer from the web call. The application will be persisted (to whatever database you want)
-by Burr's plugin capabilities -- read more [here](https://burr.dagworks.io/concepts/state-persistence/).
-This greatly reduces the amount you have to think about when developing. As Burr persistence is
-pluggable, you can write to your own database with whichever schema you prefer, customizing
-the schema for your project or using a generic one (state is just a JSON object -- you can easily serialize/deserialize it).
-
-### Additional concerns
-
-#### Scaling
-
-But [is this webscale](https://www.youtube.com/watch?v=b2F-DItXtZs)? As with anything, it depends on how you implement it.
-Two factors determine the scalability of this system:
-
-1. The database layer -- can the database support the volume of inputs/outputs?
-2. The compute layer -- can the server run fast enough to keep up with the users?
-
-For the database layer, it depends largely on the underlying database, as well as the
-schema you use. That said, Burr makes this easier due to the natural partitioning of the data
-by `application_id` and `partition_key` (the latter could be the user ID), making common
-operations (such as _give me all applications for user X_ and _give me the state of application Y_)
-simple if you index your state table on the application ID and `partition_key`.
-
-For the compute layer, you can simply scale horizontally. The only tricky aspect is ensuring state synchronization
-and locking. As we cached the application object, we could potentially get into a position
-in which the state is out of sync. To solve this, you can:
-
-1. Use a locking method (e.g., in the database) to ensure that only one server is running a given application at any point
-2. Use sticky sessions/sharding to ensure that a given user always hits the same server
-3. Handle forking/resolution of state at the persistence layer with a custom implementation
-
-Or possibly some combination of the above.
-
-#### Async
-
-While we implemented synchronous calls, you can easily make these async by using `async def` and `await` in the appropriate places,
-and using the `arun` method in Burr. Read more about async capabilities in [applications](https://burr.dagworks.io/concepts/state-machine/)
-and [actions](https://burr.dagworks.io/concepts/actions/).
-
-#### Streaming
-
-You can use streaming to send back the stream of the output at any given point. You do this by creating a
-[streaming action](https://burr.dagworks.io/concepts/streaming-actions/). You can then integrate with the
-streaming response in FastAPI to send back the stream of the output. You can do this with any steps
-(intermediate or final) in your application.
-
-#### Authentication/Data access
-
-While Burr does not operate at the data-access layer, this can easily be handled at the application layer.
-Any authentication system will tell you the user ID, which you can look up in your DB to determine
-access to your partition key, as sketched below.
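-
-For example, a FastAPI dependency can resolve and check the partition key (a minimal sketch;
-`get_current_user_id` and `user_owns_application` are hypothetical stand-ins for your auth
-system and your DB lookup):
-
-```python
-from fastapi import Depends, HTTPException
-
-
-def authorized_partition_key(
-    app_id: str,
-    user_id: str = Depends(get_current_user_id),  # hypothetical: provided by your auth system
-) -> str:
-    # hypothetical DB lookup: does this user have access to this application?
-    if not user_owns_application(user_id, app_id):
-        raise HTTPException(status_code=403, detail="Not your application")
-    return user_id  # use the authenticated user ID as the partition key
-```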
-
-## Wrap-up
-
-In this tutorial we showed how to integrate Burr into a web server. We used FastAPI and pydantic
-to create a simple but powerful API that allows users to interact with the email assistant, leveraging
-Burr's persistence capabilities to ensure that the user can always pick up where they left off.
-
-At a high level, the real value of representing your application as a state machine (as Burr does)
-is that everything becomes easier to reason about. You don't have to conceptually model state persistence,
-dataflow, and the web infrastructure in one piece -- they can all be built separately.
-
-In the future we will be automating this process, allowing you to generate a FastAPI app from a Burr application.
-
-For now, you can find the resources for the current implementation:
-- [application.py](https://github.com/DAGWorks-Inc/burr/tree/main/examples/email-assistant/application.py)
-- [server.py](https://github.com/DAGWorks-Inc/burr/tree/main/examples/email-assistant/server.py)
-- [ui](https://github.com/DAGWorks-Inc/burr/tree/main/telemetry/ui/src/examples/EmailAssistant.tsx) -- this uses [React Query](https://tanstack.com/query/latest/docs/framework/react/overview) to call the API and [React](https://react.dev/) to render the state.
diff --git a/docs/examples/deployment/web-server.rst b/docs/examples/deployment/web-server.rst
new file mode 100644
index 00000000..a139db49
--- /dev/null
+++ b/docs/examples/deployment/web-server.rst
@@ -0,0 +1,22 @@
+--------------------
+Burr in a web server
+--------------------
+
+We largely use `FastAPI `_ as our web server, but Burr can work with any Python-friendly server framework
+(`django `_, `flask `_, etc...).
+
+To run Burr in a FastAPI server, see the following examples:
+
+- `Human-in-the-loop FastAPI server `_ (`TDS blog post `__)
+- `OpenAI-compatible agent with FastAPI `_
+- `Streaming server using SSE + FastAPI `_ (`TDS blog post `__)
+- `Use typed state with pydantic + FastAPI `_
+
+Connecting to a database
+------------------------
+
+To connect Burr to a database, you can use one of the provided persisters, or build your own:
+
+- :ref:`Documentation on persistence `
+- :ref:`Set of available persisters `
+- `Simple chatbot intro with persistence to SQLite `_
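+
+As a minimal sketch of wiring a persister into an application (the class and parameter names below
+follow Burr's persistence documentation, but treat them as assumptions and check the links above
+for exact signatures):
+
+.. code-block:: python
+
+    from burr.core import ApplicationBuilder, State, action
+    from burr.core.persistence import SQLLitePersister
+
+    @action(reads=[], writes=["counter"])
+    def increment(state: State) -> State:
+        return state.update(counter=state.get("counter", 0) + 1)
+
+    persister = SQLLitePersister(db_path="./state.db", table_name="burr_state")
+    persister.initialize()  # create the backing table if it does not exist
+
+    app = (
+        ApplicationBuilder()
+        .with_actions(increment)
+        .with_transitions(("increment", "increment"))
+        # load prior state for this app_id if it exists, else start fresh
+        .initialize_from(
+            persister,
+            resume_at_next_action=True,
+            default_state={"counter": 0},
+            default_entrypoint="increment",
+        )
+        .with_state_persister(persister)  # save state after every step
+        .with_identifiers(app_id="example-app", partition_key="example-user")
+        .build()
+    )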