Overview
How to use the Ollama Python library to talk to a remote LLM server, covering the generate and chat APIs and LangChain integration.
Steps
1. Remote Server Connection
Pass host and timeout to ollama.Client to connect to a remote Ollama server. When using LangChain, set base_url on ChatOllama instead.
import ollama

# Connect to a remote Ollama server (replace the address with your server's)
client = ollama.Client(
    host="http://192.168.x.x:11434",
    timeout=300  # seconds; generous, since remote responses can be slow
)
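Before sending prompts, it can help to confirm the server is actually reachable. A minimal sketch, assuming the standard client.list() call (which returns the models available on the server); the helper name server_reachable is my own:

```python
def server_reachable(client):
    """Return True if the remote Ollama server answers a model-list request."""
    try:
        client.list()  # lightweight call: lists the server's available models
        return True
    except Exception:  # connection refused, DNS failure, timeout, ...
        return False

# Usage (with a real client):
#   client = ollama.Client(host="http://192.168.x.x:11434", timeout=300)
#   server_reachable(client)
```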
2. generate and chat
# Single response generation
result = client.generate(
    model="qwen3:8b",
    prompt="Explain what Python is in one sentence"
)
print(result["response"])

# Conversational (message-based) interface
reply = client.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Hello"}]
)
print(reply["message"]["content"])  # attribute access (reply.message.content) also works in recent library versions
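Note that the chat endpoint is stateless: for a multi-turn conversation you must resend the prior messages on every call. A sketch of one way to manage that, assuming the dict-style message format above; the helper name chat_turn is my own:

```python
def chat_turn(client, model, history, user_text):
    """Send one user turn with the full history attached, then record the reply.

    Ollama's chat endpoint is stateless, so the caller resends prior
    messages each turn for the model to see the conversation so far.
    """
    history.append({"role": "user", "content": user_text})
    reply = client.chat(model=model, messages=history)
    content = reply["message"]["content"]
    history.append({"role": "assistant", "content": content})
    return content

# Usage (with a real client):
#   history = []
#   chat_turn(client, "qwen3:8b", history, "Hello")
#   chat_turn(client, "qwen3:8b", history, "What did I just say?")
```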
3. LangChain Integration
Using LangChain’s ChatOllama makes it easy to build a conversational interface.
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage

llm = ChatOllama(
    model="qwen3:8b",
    base_url="http://192.168.x.x:11434"
)

while True:
    user_input = input("Enter your question (quit: exit): ")
    if user_input.lower() == "exit":
        break
    messages = [HumanMessage(content=user_input)]
    response = llm.invoke(messages)
    print("Answer:", response.content)
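For long answers, it is nicer to print tokens as they arrive instead of waiting for invoke to finish. ChatOllama supports LangChain's standard .stream() interface, which yields message chunks with a .content attribute. A sketch; the helper name stream_print is my own:

```python
def stream_print(llm, messages):
    """Print a LangChain chat model's reply token-by-token and return the full text."""
    parts = []
    for chunk in llm.stream(messages):  # yields message chunks as they arrive
        print(chunk.content, end="", flush=True)
        parts.append(chunk.content)
    print()
    return "".join(parts)

# Usage (with a real model):
#   llm = ChatOllama(model="qwen3:8b", base_url="http://192.168.x.x:11434")
#   stream_print(llm, [HumanMessage(content="Hello")])
```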
4. Timeout
Even an 8B model can take a while to respond depending on server hardware. If a request exceeds the default timeout, the client raises a ConnectionError (or a timeout error), so setting a generous timeout value is recommended.
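Alongside a generous timeout, a simple retry wrapper can smooth over occasional slow or dropped requests. A generic sketch (the helper name generate_with_retry and its parameters are my own; it catches broadly since the exact exception type depends on the transport):

```python
import time

def generate_with_retry(client, model, prompt, retries=2, backoff=5.0):
    """Call generate, retrying on connection/timeout errors from a slow server."""
    for attempt in range(retries + 1):
        try:
            return client.generate(model=model, prompt=prompt)["response"]
        except Exception:  # e.g. ConnectionError or a transport-level timeout
            if attempt == retries:
                raise  # give up after the last retry
            time.sleep(backoff)  # wait before trying again

# Usage (with a real client):
#   generate_with_retry(client, "qwen3:8b", "Hello", retries=2, backoff=10)
```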
Resources
- Introduction to AI Agent Development with Ollama and Open-Source LLMs (올라마와 오픈소스 LLM을 활용한 AI 에이전트 개발 입문)