API Reference
API reference for SemanticCaches.
SemanticCaches.HashCache
SemanticCaches.SemanticCache
SemanticCaches.similarity
SemanticCaches.HashCache - Type
HashCache
A cache that uses string hashes to find exactly matching items. Useful for long input strings, which cannot be embedded quickly.
Any incoming request must match key exactly (in lookup); otherwise, it is not accepted. key represents whatever the user considers meaningful to match strictly (e.g., model name, temperature).
Fields
- items: A vector of cached items (type CachedItem).
- lookup: A dictionary that maps keys to the indices of the items that have that key.
- items_lock: A lock for the items vector.
- lookup_lock: A lock for the lookup dictionary.
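To make the field layout concrete, here is a minimal sketch of a cache with the same four fields. ToyItem, ToyCache, and toy_push! are hypothetical names invented for this illustration; they are assumptions and not the package's actual definitions.

```julia
# Hypothetical stand-ins for illustration only; not the package's actual definitions.
mutable struct ToyItem
    key::String
    input_hash::UInt64
    output::Union{Nothing, String}
end

struct ToyCache
    items::Vector{ToyItem}
    lookup::Dict{String, Vector{Int}}   # key => indices into `items`
    items_lock::ReentrantLock
    lookup_lock::ReentrantLock
end

ToyCache() = ToyCache(ToyItem[], Dict{String, Vector{Int}}(), ReentrantLock(), ReentrantLock())

# Append an item and record its index under its key,
# guarding each container with its own lock.
function toy_push!(cache::ToyCache, item::ToyItem)
    idx = lock(cache.items_lock) do
        push!(cache.items, item)
        length(cache.items)
    end
    lock(cache.lookup_lock) do
        push!(get!(cache.lookup, item.key, Int[]), idx)
    end
    return cache
end

cache = ToyCache()
toy_push!(cache, ToyItem("model-a", hash("first prompt"), "answer 1"))
toy_push!(cache, ToyItem("model-b", hash("second prompt"), "answer 2"))
toy_push!(cache, ToyItem("model-a", hash("third prompt"), "answer 3"))
```

In this sketch, a request with key "model-a" only ever inspects items 1 and 3, which is how the key acts as a strict pre-filter before any hash or embedding comparison.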
SemanticCaches.HashCache - Method
(cache::HashCache)(key::String, fuzzy_input::String; verbose::Integer = 0, min_similarity::Real = 1.0)
Finds the item that EXACTLY matches the provided cache key and EXACTLY matches the hash of fuzzy_input.
Arguments
- key::String: The key to match exactly.
- fuzzy_input::String: The input to compare the hash of.
- verbose::Integer = 0: The verbosity level.
- min_similarity::Real = 1.0: The minimum similarity (an exact match is expected, defined as 1.0).
Returns
A CachedItem:
- If an exact match is found, the output field is set to the cached output.
- If no exact match is found, the output field is set to nothing.
You can check whether an item has been found by testing that output is not nothing, or simply by calling isvalid(item).
Example
using SemanticCaches

cache = HashCache()
item = cache("key1", "fuzzy_input")
# Add it to the cache if it is new
if !isvalid(item)
    # Calculate the expensive output (expensive_calculation is a placeholder for your own function)
    output = expensive_calculation()
    item.output = output
    # Add it to the cache
    push!(cache, item)
end
# If you ask again, it will be faster because it's in the cache
item = cache("key1", "fuzzy_input")
SemanticCaches.SemanticCache - Type
SemanticCache
A cache that stores embeddings and uses semantic search to find the most relevant items.
Any incoming request must match key exactly (in lookup); otherwise, it is not accepted. key represents whatever the user considers meaningful to match strictly (e.g., model name, temperature).
Fields
- items: A vector of cached items (type CachedItem).
- lookup: A dictionary that maps keys to the indices of the items that have that key.
- items_lock: A lock for the items vector.
- lookup_lock: A lock for the lookup dictionary.
SemanticCaches.SemanticCache - Method
(cache::SemanticCache)(key::String, fuzzy_input::String; verbose::Integer = 0, min_similarity::Real = 0.95)
Finds the item that EXACTLY matches the provided cache key and is the most similar given its embedding. Similarity must be at least min_similarity. Search is done via cosine similarity (dot product).
Arguments
- key::String: The key to match exactly.
- fuzzy_input::String: The input to embed and compare to the cache.
- verbose::Integer = 0: The verbosity level.
- min_similarity::Real = 0.95: The minimum similarity.
Returns
A CachedItem:
- If the similarity is at least min_similarity, the output field is set to the cached output.
- If the similarity is below min_similarity, the output field is set to nothing.
You can check whether an item has been found by testing that output is not nothing, or simply by calling isvalid(item).
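The acceptance rule can be sketched in isolation as follows. toy_select is a hypothetical helper written for this illustration, and the precomputed similarity scores are made-up values; the package's own lookup works on CachedItem objects and may differ in detail. The sketch assumes the "at least min_similarity" reading stated in the method description.

```julia
# Hypothetical helper for illustration; names and return type are assumptions.
function toy_select(outputs::Vector{String}, sims::Vector{Float64};
        min_similarity::Real = 0.95)
    isempty(sims) && return nothing
    max_sim, max_idx = findmax(sims)
    # Accept the best candidate only if it clears the threshold.
    return max_sim >= min_similarity ? outputs[max_idx] : nothing
end

outputs = ["cached answer A", "cached answer B"]

hit = toy_select(outputs, [0.97, 0.40])    # best score clears 0.95
miss = toy_select(outputs, [0.80, 0.40])   # best score is too low
```

Lowering min_similarity trades precision for a higher hit rate, so the threshold is worth tuning per use case.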
Example
using SemanticCaches

cache = SemanticCache()
item = cache("key1", "fuzzy_input"; min_similarity=0.95)
# Add it to the cache if it is new
if !isvalid(item)
    # Calculate the expensive output (expensive_calculation is a placeholder for your own function)
    output = expensive_calculation()
    item.output = output
    # Add it to the cache
    push!(cache, item)
end
# If you ask again, it will be faster because it's in the cache
item = cache("key1", "fuzzy_input"; min_similarity=0.95)
SemanticCaches.similarity - Method
similarity(cache::HashCache, items::Vector{CachedItem}, indices::Vector{Int}, hash::UInt64)
Finds the items whose hash exactly matches hash.
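As a rough illustration of exact-hash matching over a set of candidate indices, consider the sketch below. toy_hash_matches is a hypothetical helper that works on a plain vector of hashes; the package's similarity method takes CachedItem objects and may differ in its return type and internals.

```julia
# Hypothetical helper for illustration; not the package's `similarity` method.
function toy_hash_matches(item_hashes::Vector{UInt64}, indices::Vector{Int},
        target::UInt64)
    # Keep only the candidate indices whose stored hash equals the target hash.
    return [idx for idx in indices if item_hashes[idx] == target]
end

stored = UInt64[hash("long document A"), hash("long document B"), hash("long document A")]
candidates = [1, 2, 3]   # indices that share the same cache key

hits = toy_hash_matches(stored, candidates, UInt64(hash("long document A")))
misses = toy_hash_matches(stored, candidates, UInt64(hash("unseen document")))
```

Comparing fixed-size hashes avoids embedding the input at all, which is why this path suits long strings.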
SemanticCaches.similarity - Method
similarity(cache::SemanticCache, items::Vector{CachedItem}, indices::Vector{Int}, embedding::Vector{Float32})
Finds the most similar item in the cache to the given embedding. Search is done via cosine similarity (dot product).
Arguments
- cache::SemanticCache: The cache to search in.
- items::Vector{CachedItem}: The items to search in.
- indices::Vector{Int}: The indices of the items to search in.
- embedding::Vector{Float32}: The embedding to search for.
Returns
A tuple (max_sim, max_idx), where
- max_sim: The maximum similarity.
- max_idx: The index of the most similar item.
Notes
- The returned item is not guaranteed to be very similar; you need to check whether the similarity is high enough.
- We assume that embeddings are normalized to have L2 norm 1, so cosine similarity is the same as the dot product.
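The search described above can be sketched with the standard library alone. toy_most_similar and unit are hypothetical helpers written for this illustration, and the two-dimensional embeddings are made-up values; this is not the package's implementation. The sketch relies on the same assumption as the notes: all vectors have L2 norm 1, so the dot product equals the cosine similarity.

```julia
using LinearAlgebra

# Hypothetical helper for illustration; not the package's implementation.
function toy_most_similar(embeddings::Vector{Vector{Float32}}, indices::Vector{Int},
        query::Vector{Float32})
    max_sim, max_idx = -Inf32, 0
    for idx in indices
        sim = dot(embeddings[idx], query)   # equals cosine similarity for unit vectors
        if sim > max_sim
            max_sim, max_idx = sim, idx
        end
    end
    return max_sim, max_idx
end

# Normalize a vector to L2 norm 1 and convert it to Float32.
unit(v) = Float32.(v ./ norm(v))

embeddings = [unit([1.0, 0.0]), unit([1.0, 1.0]), unit([0.0, 1.0])]
query = unit([0.9, 0.1])

max_sim, max_idx = toy_most_similar(embeddings, [1, 2, 3], query)
```

In this sketch an empty indices vector yields (-Inf32, 0), so a caller comparing max_sim against a threshold would treat it as a miss.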