pub trait ModelRuntimePort: Send + Sync + Debug {
    // Required methods
    fn ensure_model_running<'life0, 'life1, 'async_trait>(
        &'life0 self,
        model_name: &'life1 str,
        num_ctx: Option<u64>,
        default_ctx: u64,
    ) -> Pin<Box<dyn Future<Output = Result<RunningTarget, ModelRuntimeError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait,
             'life1: 'async_trait;
    fn current_model<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Option<RunningTarget>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
    fn stop_current<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Result<(), ModelRuntimeError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
}
Port for managing the model runtime (ensuring that models are running).
This is the primary interface the proxy uses to get a running model server. Implementations handle:
- Model resolution (name → file path)
- Process lifecycle (start, stop, health check)
- Context size management
- Single-swap or concurrent strategies
Required Methods
fn ensure_model_running<'life0, 'life1, 'async_trait>(
    &'life0 self,
    model_name: &'life1 str,
    num_ctx: Option<u64>,
    default_ctx: u64,
) -> Pin<Box<dyn Future<Output = Result<RunningTarget, ModelRuntimeError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
Ensure a model is running and ready to serve requests.
This method:
- Resolves the model name to a database entry
- Checks whether the model is already running with the requested context size
- Starts or restarts the model if needed
- Waits for the health check to pass
- Returns the target information for routing
Arguments
- model_name - Name or alias of the model to run
- num_ctx - Optional context size override from the request
- default_ctx - Default context size, used when num_ctx is not specified
Errors
Returns ModelRuntimeError if the model cannot be started.
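The steps listed for ensure_model_running can be sketched with a small in-memory implementation. This is only an illustration under stated assumptions, not the crate's actual code: the fields of RunningTarget, the NotFound variant of ModelRuntimeError, the InMemoryRuntime type and the block_on helper are all hypothetical, and the trait is a trimmed mirror written by hand with a single lifetime rather than generated by a macro.

```rust
use std::fmt::Debug;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Hypothetical stand-ins; the real fields and variants are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct RunningTarget {
    pub model_name: String,
    pub ctx: u64,
}

#[derive(Debug, PartialEq)]
pub enum ModelRuntimeError {
    NotFound(String),
}

// Trimmed mirror of the trait with a single lifetime for brevity.
pub trait ModelRuntimePort: Send + Sync + Debug {
    fn ensure_model_running<'a>(
        &'a self,
        model_name: &'a str,
        num_ctx: Option<u64>,
        default_ctx: u64,
    ) -> BoxFuture<'a, Result<RunningTarget, ModelRuntimeError>>;
}

// Hypothetical single-swap runtime that only tracks state in memory.
#[derive(Debug, Default)]
pub struct InMemoryRuntime {
    pub known_models: Vec<String>,
    pub current: Mutex<Option<RunningTarget>>,
    pub starts: Mutex<u32>, // number of simulated (re)starts
}

impl ModelRuntimePort for InMemoryRuntime {
    fn ensure_model_running<'a>(
        &'a self,
        model_name: &'a str,
        num_ctx: Option<u64>,
        default_ctx: u64,
    ) -> BoxFuture<'a, Result<RunningTarget, ModelRuntimeError>> {
        Box::pin(async move {
            // Step 1: resolve the name (a list lookup stands in for the database).
            if !self.known_models.iter().any(|m| m.as_str() == model_name) {
                return Err(ModelRuntimeError::NotFound(model_name.to_string()));
            }
            // The per-request override wins; otherwise use the default.
            let wanted = RunningTarget {
                model_name: model_name.to_string(),
                ctx: num_ctx.unwrap_or(default_ctx),
            };
            let mut current = self.current.lock().unwrap();
            // Step 2: reuse the running model if name and context both match.
            if current.as_ref() == Some(&wanted) {
                return Ok(wanted);
            }
            // Steps 3 and 4: a real implementation would (re)start the server
            // process and wait for its health check here.
            *self.starts.lock().unwrap() += 1;
            *current = Some(wanted.clone());
            // Step 5: hand the routing target back to the caller.
            Ok(wanted)
        })
    }
}

// Toy executor for the demo; real code would use an async runtime instead.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

In this sketch, two consecutive calls with the same model name and context size trigger a single simulated start, while a call with a different num_ctx counts as a restart.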
fn current_model<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = Option<RunningTarget>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Get information about the currently running model, if any.
Returns None if no model is currently running.
fn stop_current<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = Result<(), ModelRuntimeError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Stop the currently running model.
This is primarily for cleanup/shutdown scenarios.
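For the cleanup/shutdown case, a caller might combine current_model and stop_current roughly as follows. This is a sketch under assumptions: the shutdown helper, the StubRuntime type, the block_on executor, the single field of RunningTarget and the StopFailed variant are hypothetical, and the trait is a hand-written mirror trimmed to the two methods involved. Since this page does not say how stop_current behaves when nothing is running, the helper checks current_model first.

```rust
use std::fmt::Debug;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Hypothetical stand-ins; the real fields and variants are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct RunningTarget {
    pub model_name: String,
}

#[derive(Debug, PartialEq)]
pub enum ModelRuntimeError {
    StopFailed(String),
}

// Trimmed mirror of the trait: only the two methods used during shutdown.
pub trait ModelRuntimePort: Send + Sync + Debug {
    fn current_model<'a>(&'a self) -> BoxFuture<'a, Option<RunningTarget>>;
    fn stop_current<'a>(&'a self) -> BoxFuture<'a, Result<(), ModelRuntimeError>>;
}

// Hypothetical helper: stop whatever is running and report what was stopped.
pub async fn shutdown(
    port: &dyn ModelRuntimePort,
) -> Result<Option<RunningTarget>, ModelRuntimeError> {
    match port.current_model().await {
        Some(target) => {
            port.stop_current().await?;
            Ok(Some(target))
        }
        // Nothing is running, so there is nothing to stop.
        None => Ok(None),
    }
}

// Hypothetical stub that keeps the "running" model in memory.
#[derive(Debug)]
pub struct StubRuntime {
    pub current: Mutex<Option<RunningTarget>>,
}

impl ModelRuntimePort for StubRuntime {
    fn current_model<'a>(&'a self) -> BoxFuture<'a, Option<RunningTarget>> {
        Box::pin(async move {
            let snapshot = self.current.lock().unwrap().clone();
            snapshot
        })
    }

    fn stop_current<'a>(&'a self) -> BoxFuture<'a, Result<(), ModelRuntimeError>> {
        Box::pin(async move {
            // A real implementation would terminate the server process here.
            *self.current.lock().unwrap() = None;
            Ok(())
        })
    }
}

// Toy executor for the demo; real code would use an async runtime instead.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Taking the port as &dyn ModelRuntimePort keeps the helper independent of whether the implementation uses a single-swap or a concurrent strategy.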