Overview
How to use the Ollama Python library to talk to a remote LLM server, covering the generate and chat APIs and LangChain integration.
Steps
1. Remote Server Connection
Pass host and timeout to ollama.Client to connect to a remote Ollama server. When using LangChain, set base_url on ChatOllama instead.
import ollama

# Connect to a remote Ollama server (replace the address with your server's)
client = ollama.Client(
    host="http://192.168.x.x:11434",
    timeout=300  # seconds; generous, since remote responses can be slow
)
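Before sending prompts, it can help to confirm the server is actually reachable. A minimal sketch, assuming the standard client.list() call (which returns the models available on the server); the helper name server_reachable is my own:

```python
def server_reachable(client):
    """Return True if the remote Ollama server answers a model-list request."""
    try:
        client.list()  # lightweight call: lists the server's available models
        return True
    except Exception:  # connection refused, DNS failure, timeout, ...
        return False

# Usage (with a real client):
#   client = ollama.Client(host="http://192.168.x.x:11434", timeout=300)
#   server_reachable(client)
```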
2. generate and chat
# Single response generation
result = client.generate(
    model="qwen3:8b",
    prompt="Explain what Python is in one sentence"
)
print(result["response"])

# Conversational (message-based) interface
reply = client.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Hello"}]
)
print(reply["message"]["content"])  # attribute access (reply.message.content) also works in recent library versions
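Note that the chat endpoint is stateless: for a multi-turn conversation you must resend the prior messages on every call. A sketch of one way to manage that, assuming the dict-style message format above; the helper name chat_turn is my own:

```python
def chat_turn(client, model, history, user_text):
    """Send one user turn with the full history attached, then record the reply.

    Ollama's chat endpoint is stateless, so the caller resends prior
    messages each turn for the model to see the conversation so far.
    """
    history.append({"role": "user", "content": user_text})
    reply = client.chat(model=model, messages=history)
    content = reply["message"]["content"]
    history.append({"role": "assistant", "content": content})
    return content

# Usage (with a real client):
#   history = []
#   chat_turn(client, "qwen3:8b", history, "Hello")
#   chat_turn(client, "qwen3:8b", history, "What did I just say?")
```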
3. LangChain Integration
Using LangChain’s ChatOllama makes it easy to build a conversational interface.
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage

llm = ChatOllama(
    model="qwen3:8b",
    base_url="http://192.168.x.x:11434"
)

while True:
    user_input = input("Enter your question (quit: exit): ")
    if user_input.lower() == "exit":
        break
    messages = [HumanMessage(content=user_input)]
    response = llm.invoke(messages)
    print("Answer:", response.content)
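For long answers, it is nicer to print tokens as they arrive instead of waiting for invoke to finish. ChatOllama supports LangChain's standard .stream() interface, which yields message chunks with a .content attribute. A sketch; the helper name stream_print is my own:

```python
def stream_print(llm, messages):
    """Print a LangChain chat model's reply token-by-token and return the full text."""
    parts = []
    for chunk in llm.stream(messages):  # yields message chunks as they arrive
        print(chunk.content, end="", flush=True)
        parts.append(chunk.content)
    print()
    return "".join(parts)

# Usage (with a real model):
#   llm = ChatOllama(model="qwen3:8b", base_url="http://192.168.x.x:11434")
#   stream_print(llm, [HumanMessage(content="Hello")])
```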
4. Timeout
Even an 8B model can take a while to respond depending on server hardware. If a request exceeds the default timeout, the client raises a ConnectionError (or a timeout error), so setting a generous timeout value is recommended.
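Alongside a generous timeout, a simple retry wrapper can smooth over occasional slow or dropped requests. A generic sketch (the helper name generate_with_retry and its parameters are my own; it catches broadly since the exact exception type depends on the transport):

```python
import time

def generate_with_retry(client, model, prompt, retries=2, backoff=5.0):
    """Call generate, retrying on connection/timeout errors from a slow server."""
    for attempt in range(retries + 1):
        try:
            return client.generate(model=model, prompt=prompt)["response"]
        except Exception:  # e.g. ConnectionError or a transport-level timeout
            if attempt == retries:
                raise  # give up after the last retry
            time.sleep(backoff)  # wait before trying again

# Usage (with a real client):
#   generate_with_retry(client, "qwen3:8b", "Hello", retries=2, backoff=10)
```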
Resources
- Introduction to AI Agent Development with Ollama and Open-Source LLMs (올라마와 오픈소스 LLM을 활용한 AI 에이전트 개발 입문)