Ollama keep_alive is not working #256

@woofy0

Hi,

When using Ollama and passing "keep_alive" in "language_model_params", the value is not applied: the model is loaded with the default keep_alive of 5 minutes.

import os

import langextract as lx

# input_text, prompt, and examples are defined elsewhere.
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OllamaLanguageModel,
    model_id="qwen2.5:14b",
    model_url=os.getenv("OLLAMA_HOST", "http://localhost:11434"),
    temperature=0.3,
    fence_output=False,
    use_schema_constraints=False,
    max_char_buffer=5000,
    language_model_params={
        "num_ctx": 8192,
        "keep_alive": 10 * 60,  # 10 minutes, in seconds
        "timeout": 10 * 60,     # 10 minutes, in seconds
    },
)

To verify, run the following command (assuming the model wasn't already in memory). It shows the model loaded for 5 minutes instead of the requested 10:
ollama ps
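
The same check can probably be scripted against the server's HTTP API instead of the CLI. This is only a rough sketch: it assumes the Ollama server exposes GET /api/ps and that the response has a "models" list whose entries carry "name" and "expires_at" fields, as the Ollama API documentation describes. The helper names are mine and are not part of Ollama or langextract.

```python
import json
import urllib.request
from typing import Any


def summarize_running_models(ps_response: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (model name, expiry timestamp) pairs from a parsed /api/ps response."""
    return [
        (entry.get("name", "?"), entry.get("expires_at", "?"))
        for entry in ps_response.get("models", [])
    ]


def fetch_running_models(base_url: str) -> dict[str, Any]:
    """Ask the Ollama server which models are currently held in memory."""
    url = f"{base_url.rstrip('/')}/api/ps"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Example call (needs a running Ollama server, so it is left commented out):
# for name, expires_at in summarize_running_models(
#     fetch_running_models("http://localhost:11434")
# ):
#     print(f"{name} expires at {expires_at}")
```

Comparing "expires_at" with the time of the request should show whether the model was scheduled to unload after 5 or 10 minutes.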

In the Ollama.py file, keep_alive appears to be placed inside the "options" parameter. However, the Ollama API documentation lists it as a top-level request parameter, so the payload should be:

payload: dict[str, Any] = {
    'model': model,
    'prompt': prompt,
    'system': system,
    'stream': False,
    'raw': raw,
    'keep_alive': keep_alive,  # top level, not inside 'options'
    'options': options,
}
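
For illustration, here is a minimal sketch of how keep_alive could be hoisted out of the user-supplied parameters before the request body is built. The function name build_generate_payload and its signature are hypothetical; this is not the current langextract code, just one way the fix might look, assuming everything other than keep_alive belongs under "options".

```python
from typing import Any


def build_generate_payload(
    model: str,
    prompt: str,
    params: dict[str, Any],
) -> dict[str, Any]:
    """Build an /api/generate body with keep_alive as a top-level field."""
    options = dict(params)  # copy so the caller's dict is not mutated
    keep_alive = options.pop("keep_alive", None)

    payload: dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": options,  # model options such as num_ctx stay nested
    }
    if keep_alive is not None:
        # Seconds as a number, or a duration string such as "10m".
        payload["keep_alive"] = keep_alive
    return payload
```

With params of {"num_ctx": 8192, "keep_alive": 600}, the resulting body carries keep_alive next to "model" and "prompt", while num_ctx remains inside "options".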
