Address the use case distinguishing between multiple models
Signed-off-by: Mark McLoughlin <[email protected]>
markmc committed Jul 8, 2024
1 parent e6e5ee1 commit c8a3626
docs/sdg/sdg-flow-yaml.md

### Future - Model Serving

Custom pipelines may have more specialized model serving requirements. Instead of serving a single model, we may need to launch the model server with a base model plus an additional model with an adapter. vLLM, for example, can host both a model and a model+adapter under two different model IDs.
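As a sketch of what that looks like with vLLM's OpenAI-compatible server, the base model and the adapter are registered under separate model IDs via the LoRA options; the model name, adapter name, and path below are hypothetical placeholders:

```shell
# Serve a base model and a LoRA adapter under two different model IDs.
# "skills-adapter" and the paths are illustrative placeholders only.
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
    --enable-lora \
    --lora-modules skills-adapter=/path/to/lora/adapter
```

Requests sent with `model="mistralai/Mixtral-8x7B-Instruct-v0.1"` go to the base model, while `model="skills-adapter"` routes through the adapter.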

The pipeline author needs some way of distinguishing between these multiple models - i.e. the definition of each `LLMBlock` needs to specify which model it uses.

Right now, the `Pipeline` constructor takes two relevant parameters - the OpenAI client instance, and the model ID of the default model. It's important to note that this model ID is chosen by the user at runtime, so it may not match the model IDs that the pipeline author used.
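To make the split of responsibilities concrete, here is a minimal sketch of the runtime context described above - `PipelineContext` is named in the doc, but the exact fields and signatures here are assumptions, not the real API:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class PipelineContext:
    """Sketch of the runtime context shared by all blocks in a pipeline.

    `client` is the OpenAI client instance and `model_id` is the ID of
    the default teacher model. Both are supplied by the *user* at
    runtime - the pipeline author never sees the concrete values.
    """

    client: Any  # an openai.OpenAI instance in practice
    model_id: str


# The user wires these up when running the pipeline; the model ID they
# choose may differ from any names used in the pipeline definition.
ctx = PipelineContext(client=None, model_id="my-teacher-model")
```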

The use cases will be:

1. Most `LLMBlock` definitions will use the default teacher model - we can adopt the semantics that if the pipeline author doesn't specify a model in an `LLMBlock`, the default in `PipelineContext.model_id` is used.
2. In cases where a model+adapter is served, it will usually be the pipeline author who defines both the model serving configuration and the model's use in an `LLMBlock`. So the pipeline author can name the model in the serving config and reference it by that name in the pipeline config.
3. In cases where a model+adapter is served with a name that doesn't match that chosen by the pipeline author, the user can supply a mapping between the name used in the `LLMBlock` and the name used in the user's serving config.
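The three cases above amount to a simple resolution rule. A minimal sketch, assuming a hypothetical `resolve_model_id()` helper (not part of the actual codebase):

```python
from typing import Mapping, Optional


def resolve_model_id(
    block_model: Optional[str],
    default_model_id: str,
    model_map: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve which served model ID an LLMBlock should use (sketch).

    1. No model named in the block: fall back to the default teacher
       model chosen by the user at runtime.
    2. A model named by the pipeline author: used as-is, matching the
       name in the author's serving config.
    3. A user-supplied mapping: overrides the author's name when the
       user's serving config uses a different model ID.
    """
    if block_model is None:
        return default_model_id        # case 1: default teacher model
    if model_map and block_model in model_map:
        return model_map[block_model]  # case 3: user remaps the name
    return block_model                 # case 2: author-chosen name
```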

In the future, it may make sense to define some model serving config (e.g. which adapter to use for SDG) as part of a flow definition. We will address this in a future enhancement, if needed.

