Local models with Ollama.ai
This file contains examples of how to work with Ollama.ai models. It assumes that you've already installed and launched the Ollama server. For more details or troubleshooting advice, see the Frequently Asked Questions section.
First, let's import the package and define a helper alias for calling un-exported functions:
using PromptingTools
const PT = PromptingTools
PromptingTools
There are several models from https://ollama.ai/library that we have added to our PT.MODEL_REGISTRY, which means you don't need to worry about schema changes. E.g., "llama2" or "openhermes2.5-mistral" (see PT.list_registry() and PT.list_aliases()).
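To see what's available, you can call the two inspection functions mentioned above:
PT.list_registry()  # full model names registered in PT.MODEL_REGISTRY
PT.list_aliases()   # short aliases that map to those names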
Note: You must download these models before using them by running `ollama pull <model_name>` in your Terminal.
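For example, to fetch llama2 you can also run the pull command directly from Julia (a minimal sketch, assuming the ollama binary is on your PATH):
run(`ollama pull llama2`)  # downloads the model weights; needed only once per model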
If you use an Apple Mac with an M1-M3 chip, make sure to provide `api_kwargs=(; options=(; num_gpu=99))` to ensure the whole model is offloaded onto your GPU. The current default is 1, which makes some models unusable. Example for running Mixtral: `msg = aigenerate(PT.OllamaSchema(), "Count from 1 to 5 and then say hi."; model="dolphin-mixtral:8x7b-v2.5-q4_K_M", api_kwargs=(; options=(; num_gpu=99)))`
Text Generation with aigenerate
Simple message
TL;DR: if you use models in PT.MODEL_REGISTRY, you don't need to add schema as the first argument:
msg = aigenerate("Say hi!"; model = "llama2")
AIMessage("Hello there! *adjusts glasses* It's nice to meet you! Is there anything I can help you with or would you like me to chat with you for a bit?")
Standard string interpolation
model = "openhermes2.5-mistral"
a = 1
msg = aigenerate("What is `$a+$a`?"; model)
name = "John"
msg = aigenerate("Say hi to {{name}}."; name, model)
AIMessage("Hello John! *smiles* It's nice to meet you! Is there anything I can help you with today?")
Advanced Prompts
conversation = [
PT.SystemMessage("You're master Yoda from Star Wars trying to help the user become a Jedi."),
PT.UserMessage("I have feelings for my iPhone. What should I do?")]
msg = aigenerate(conversation; model)
AIMessage("(Deep sigh) A problem, you have. Feelings for an iPhone, hmm? (adjusts spectacles)
Much confusion, this causes. (scratches head) A being, you are. Attached to a device, you have become. (chuckles) Interesting, this is.
First, let go, you must. (winks) Hard, it is, but necessary, yes. Distract yourself, find something else, try. (pauses)
Or, perhaps, a balance, you seek? (nods) Both, enjoy and let go, the middle path, there is. (smirks) Finding joy in technology, without losing yourself, the trick, it is. (chuckles)
But fear not, young one! (grins) Help, I am here. Guide you, I will. The ways of the Jedi, teach you, I will. (winks) Patience and understanding, you must have. (nods)
Now, go forth! (gestures) Explore, discover, find your balance. (smiles) The Force be with you, it does! (grins)")
Schema Changes / Custom models
If you're using some model that is not in the registry, you can either add it:
PT.register_model!(;
name = "llama123",
schema = PT.OllamaSchema(),
description = "Some model")
PT.MODEL_ALIASES["l123"] = "llama123" # set an alias you like for it
"llama123"
OR define the schema explicitly (to avoid dispatch on the global PT.PROMPT_SCHEMA):
schema = PT.OllamaSchema()
aigenerate(schema, "Say hi!"; model = "llama2")
AIMessage("Hello there! *smiling face* It's nice to meet you! I'm here to help you with any questions or tasks you may have, so feel free to ask me anything. Is there something specific you need assistance with today? 😊")
Note: If you only use Ollama, you can change the default schema to PT.OllamaSchema() via PT.set_preferences!("PROMPT_SCHEMA" => "OllamaSchema", "MODEL_CHAT" => "llama2"). Restart your session and run aigenerate("Say hi!") to test it.
Note that in version 0.6, we've introduced OllamaSchema, which supersedes OllamaManagedSchema and allows multi-turn conversations and conversations with images (eg, with Llava and Bakllava models). OllamaManagedSchema has been kept for compatibility and as an example of a schema where one provides the prompt as a string (not as dictionaries like the OpenAI API).
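Because OllamaSchema supports multi-turn conversations, you can feed the previous exchange back in and continue it. A minimal sketch, assuming the "llama2" model is already pulled:
conv = aigenerate("Say hi!"; model = "llama2", return_all = true)  # return the whole conversation vector
push!(conv, PT.UserMessage("Now say it in French."))               # append a follow-up turn
msg = aigenerate(conv; model = "llama2")                           # continue where we left off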
Providing Images with aiscan
It's as simple as providing a local image path (keyword image_path). You can provide one or more images:
msg = aiscan("Describe the image"; image_path=["julia.png","python.png"], model="bakllava")
The image_url keyword is not supported at the moment (use Downloads.download to download the image locally).
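A minimal sketch of that workaround, using a hypothetical image URL and the bakllava model from above:
using Downloads
local_path = Downloads.download("https://example.com/photo.png", "photo.png")  # hypothetical URL
msg = aiscan("Describe the image"; image_path = local_path, model = "bakllava")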
Embeddings with aiembed
Simple embedding for one document
msg = aiembed(schema, "Embed me"; model) # access msg.content
PromptingTools.DataMessage(JSON3.Array{Float64, Vector{UInt8}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}} of size (4096,))
One document, where we materialize the data into a Vector with copy (the postprocess function argument):
msg = aiembed(schema, "Embed me", copy; model)
PromptingTools.DataMessage(Vector{Float64} of size (4096,))
Multiple documents embedding
Multiple documents are embedded sequentially; you can get faster speeds with async calls:
msg = aiembed(schema, ["Embed me", "Embed me"]; model)
PromptingTools.DataMessage(Matrix{Float64} of size (4096, 2))
You can use Threads.@spawn or asyncmap, whichever you prefer, to parallelize the model calls:
docs = ["Embed me", "Embed me"]
tasks = asyncmap(docs) do doc
msg = aiembed(schema, doc; model)
end
embedding = mapreduce(x -> x.content, hcat, tasks)
size(embedding)
4096×2 Matrix{Float64}:
...
Using a postprocessing function
Add normalization as a postprocessing function to normalize embeddings on reception (for easy cosine similarity later):
using LinearAlgebra
schema = PT.OllamaSchema()
msg = aiembed(schema,
["embed me", "and me too"],
LinearAlgebra.normalize;
model = "openhermes2.5-mistral")
PromptingTools.DataMessage(Matrix{Float64} of size (4096, 2))
Cosine similarity is then a simple matrix multiplication, because the columns are already normalized to unit length:
msg.content' * msg.content[:, 1]
2-element Vector{Float64}:
0.9999999999999982
0.40796033843072876
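By the same logic, the full pairwise cosine-similarity matrix is just the product of the normalized embedding matrix with itself:
sim = msg.content' * msg.content  # 2×2 matrix of pairwise cosine similarities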
This page was generated using Literate.jl.