API Reference

SemanticCaches.HashCache
SemanticCaches.SemanticCache
SemanticCaches.similarity

SemanticCaches.HashCache — Type

HashCache

A cache that uses string hashes to find exactly matching items. Useful for long input strings that cannot be embedded quickly.

Any incoming request must match key exactly (in lookup); otherwise, it is not accepted. The key represents whatever the user considers essential to match strictly (e.g., model name, temperature).

Fields

items: A vector of cached items (type CachedItem)
lookup: A dictionary that maps keys to the indices of the items that have that key.
items_lock: A lock for the items vector.
lookup_lock: A lock for the lookup dictionary.
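
The relationship between items and lookup can be illustrated with a small standalone sketch. It does not use the package's internals: ToyItem, ToyCache, and toy_push! are hypothetical names, and the real CachedItem may well carry different fields.

```julia
# Standalone sketch; ToyItem, ToyCache and toy_push! are hypothetical names,
# not part of SemanticCaches.
struct ToyItem
    key::String
    input_hash::UInt64
    output::String
end

struct ToyCache
    items::Vector{ToyItem}
    lookup::Dict{String, Vector{Int}}
    items_lock::ReentrantLock
    lookup_lock::ReentrantLock
end
ToyCache() = ToyCache(ToyItem[], Dict{String, Vector{Int}}(),
    ReentrantLock(), ReentrantLock())

function toy_push!(cache::ToyCache, item::ToyItem)
    # Append the item under the items lock and remember its index
    idx = lock(cache.items_lock) do
        push!(cache.items, item)
        length(cache.items)
    end
    # Register that index under the item's key
    lock(cache.lookup_lock) do
        push!(get!(cache.lookup, item.key, Int[]), idx)
    end
    return cache
end

cache = ToyCache()
toy_push!(cache, ToyItem("model-a", hash("hello"), "out1"))
toy_push!(cache, ToyItem("model-b", hash("hello"), "out2"))
toy_push!(cache, ToyItem("model-a", hash("world"), "out3"))
```

In this sketch, a request with key "model-a" would only ever be compared against items 1 and 3, which is the sense in which the key must match exactly before any hash comparison takes place.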

SemanticCaches.HashCache — Method

(cache::HashCache)(key::String, fuzzy_input::String; verbose::Integer = 0, min_similarity::Real = 1.0)

Finds the item that EXACTLY matches the provided cache key and EXACTLY matches the hash of fuzzy_input.

Arguments

key::String: The key to match exactly.
fuzzy_input::String: The input to compare the hash of.
verbose::Integer = 0: The verbosity level.
min_similarity::Real = 1.0: The minimum similarity (an exact match is defined as 1.0).

Returns

A CachedItem:

If an exact match is found, the output field is set to the cached output.
If no exact match is found, the output field is set to nothing.

You can check whether an item was found by testing that output is not nothing, or simply by calling isvalid(item).

Example

cache = HashCache()
item = cache("key1", "fuzzy_input")

# Add it to the cache if it is new
if !isvalid(item)
    # Compute the expensive output (expensive_calculation is a placeholder)
    output = expensive_calculation()
    item.output = output
    # Store the item in the cache
    push!(cache, item)
end

# If you ask again, it will be faster because it's in the cache
item = cache("key1", "fuzzy_input")

SemanticCaches.SemanticCache — Type

SemanticCache

A cache that stores embeddings and uses semantic search to find the most relevant items.

Any incoming request must match key exactly (in lookup); otherwise, it is not accepted. The key represents whatever the user considers essential to match strictly (e.g., model name, temperature).

Fields

items: A vector of cached items (type CachedItem)
lookup: A dictionary that maps keys to the indices of the items that have that key.
items_lock: A lock for the items vector.
lookup_lock: A lock for the lookup dictionary.

SemanticCaches.SemanticCache — Method

(cache::SemanticCache)(
    key::String, fuzzy_input::String; verbose::Integer = 0, min_similarity::Real = 0.95)

Finds the item that EXACTLY matches the provided cache key and is the most similar given its embedding. Similarity must be at least min_similarity. Search is done via cosine similarity, computed as a dot product because embeddings are assumed to be normalized to unit L2 norm.

Arguments

key::String: The key to match exactly.
fuzzy_input::String: The input to embed and compare to the cache.
verbose::Integer = 0: The verbosity level.
min_similarity::Real = 0.95: The minimum similarity.

Returns

A CachedItem:

If the similarity is at least min_similarity, the output field is set to the cached output.
Otherwise, the output field is set to nothing.

You can check whether an item was found by testing that output is not nothing, or simply by calling isvalid(item).

Example

cache = SemanticCache()
item = cache("key1", "fuzzy_input"; min_similarity=0.95)

# Add it to the cache if it is new
if !isvalid(item)
    # Compute the expensive output (expensive_calculation is a placeholder)
    output = expensive_calculation()
    item.output = output
    # Store the item in the cache
    push!(cache, item)
end

# If you ask again, it will be faster because it's in the cache
item = cache("key1", "fuzzy_input"; min_similarity=0.95)

SemanticCaches.similarity — Method

similarity(cache::HashCache, items::Vector{CachedItem},
    indices::Vector{Int}, hash::UInt64)

Finds the items whose hash exactly matches hash.
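
As a rough illustration of exact matching by hash (not the package's actual implementation), the sketch below scans the candidate indices and reports a similarity of 1.0 for the first item whose stored hash equals the query hash. The helper exact_match, its plain vector of hashes, and its (similarity, index) return shape are assumptions made for this example only.

```julia
# Hypothetical helper, not part of SemanticCaches: exact matching on hashes.
function exact_match(hashes::Vector{UInt64}, indices::Vector{Int}, query::UInt64)
    for idx in indices
        # An equal hash is treated as a perfect match (similarity 1.0)
        hashes[idx] == query && return (1.0, idx)
    end
    # No candidate matched; index 0 signals "not found" in this sketch
    return (0.0, 0)
end

hashes = UInt64[hash("first prompt"), hash("second prompt"), hash("third prompt")]
query = UInt64(hash("second prompt"))

found = exact_match(hashes, [1, 2, 3], query)   # candidate 2 is searched
not_found = exact_match(hashes, [1, 3], query)  # candidate 2 is excluded
```

Because the comparison is all-or-nothing, the similarity is either 1.0 or a miss, which is consistent with the default min_similarity of 1.0 on the HashCache method.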

SemanticCaches.similarity — Method

similarity(cache::SemanticCache, items::Vector{CachedItem},
    indices::Vector{Int}, embedding::Vector{Float32})

Finds the most similar item in the cache to the given embedding. Search is done via cosine similarity (dot product).

Arguments

cache::SemanticCache: The cache to search in.
items::Vector{CachedItem}: The items to search in.
indices::Vector{Int}: The indices of the items to search in.
embedding::Vector{Float32}: The embedding to search for.

Returns

A tuple (max_sim, max_idx) where

max_sim: The maximum similarity.
max_idx: The index of the most similar item.

Notes

The returned item is not guaranteed to be very similar; you need to check that the similarity is high enough.
We assume that embeddings are normalized to have L2 norm 1, so cosine similarity is the same as the dot product.
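
The normalization assumption can be made concrete with a minimal sketch. It works on a plain vector of embeddings rather than on CachedItem objects, and most_similar is a hypothetical helper written for illustration, not the package's implementation.

```julia
using LinearAlgebra

# Hypothetical helper, not part of SemanticCaches.
function most_similar(embeddings::Vector{Vector{Float32}}, indices::Vector{Int},
        query::Vector{Float32})
    max_sim, max_idx = -Inf32, 0
    for idx in indices
        # For unit-norm vectors the dot product equals the cosine similarity
        sim = dot(embeddings[idx], query)
        if sim > max_sim
            max_sim, max_idx = sim, idx
        end
    end
    return (max_sim, max_idx)
end

# Normalize everything to unit L2 norm first
embeddings = [normalize(Float32[1, 0]), normalize(Float32[1, 1]),
    normalize(Float32[0, 1])]
query = normalize(Float32[2, 2.1])

max_sim, max_idx = most_similar(embeddings, [1, 2, 3], query)

# The caller still decides whether the best match is good enough
is_hit = max_sim >= 0.95
```

If the vectors were not normalized, the dot product would also scale with their lengths, and the threshold comparison against min_similarity would no longer be meaningful as a cosine value.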
