API Reference

API reference for FlashRank.

FlashRank.BertTextEncoder - Type
BertTextEncoder

The text encoder for the BERT model (WordPiece tokenization).

Fields

  • wp::WordPiece: The WordPiece tokenizer.
  • vocab::Dict{String, Int}: The vocabulary, mapping tokens to 0-based indices (to match the Python implementation).
  • startsym::String: The start symbol.
  • endsym::String: The end symbol.
  • padsym::String: The pad symbol.
  • trunc::Union{Nothing, Int}: The truncation length. Defaults to 512 tokens.
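
For orientation, here is a minimal sketch of inspecting an encoder's configuration. It assumes the encoder is reachable through a model's encoder field (as the embedder.encoder.trunc reference under embed below suggests); the symbol values in the comments are the standard BERT ones and may differ per vocabulary.

using FlashRank

embedder = EmbedderModel(:tiny_embed)
enc = embedder.encoder
enc.startsym  # typically "[CLS]"
enc.endsym    # typically "[SEP]"
enc.padsym    # typically "[PAD]"
enc.trunc     # truncation length, 512 by default
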
FlashRank.EmbedResult - Type
EmbedResult{T <: Real}

The result of embedding passages.

Fields

  • embeddings::AbstractArray{T}: The embeddings of the passages, stored as a column-major matrix of size (embedding_dimension, batch_size), i.e., one column per passage.
  • elapsed::Float64: The time taken to embed the passages.
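
A short sketch of reading the result fields, reusing the :tiny_embed alias and the 312-dimensional output from the embed example below:

embedder = EmbedderModel(:tiny_embed)
result = embed(embedder, ["Hello, how are you?"])
size(result.embeddings)  # (312, 1): one column per passage
result.elapsed           # seconds taken to embed
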
FlashRank.EmbedderModel - Type
EmbedderModel

A model for embedding passages, including the encoder and the ONNX session for inference.

For embedding, call embed(embedder, passages) or use the model as a functor: embedder(passages).

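
A minimal sketch of the two equivalent call forms, using the :tiny_embed alias from the embed example below:

embedder = EmbedderModel(:tiny_embed)
result = embed(embedder, ["Hello, how are you?"])
result = embedder(["Hello, how are you?"])  # functor form, equivalent to the line above
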
FlashRank.RankerModel - Type
RankerModel

A model for ranking passages, including the encoder and the ONNX session for inference.

For ranking, call rank(ranker, query, passages) or use the model as a functor: ranker(query, passages).

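
A minimal sketch of the two equivalent call forms; the :tiny model alias is an assumption for illustration, so substitute whichever alias you have available:

ranker = RankerModel(:tiny)  # :tiny is assumed for illustration
query = "How are you?"
passages = ["Fine, thanks.", "The sky is blue."]
result = rank(ranker, query, passages)
result = ranker(query, passages)  # functor form, equivalent to the line above
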
FlashRank.WordPiece - Type
WordPiece

WordPiece is a tokenizer that splits a string into a sequence of KNOWN sub-word tokens (or their token IDs). It uses a double-array trie to store the vocabulary and its index.

Implementation is based on: https://github.com/chengchingwen/Transformers.jl

Fields

  • trie::DoubleArrayTrie: The double-array trie of the vocabulary (for fast token lookups).
  • index::Vector{Int}: The index of the vocabulary. It is 0-based because we provide token IDs to models trained in Python.
  • unki::Int: The index of the unknown token in the TRIE (i.e., the trie index, not the token ID).
  • max_char::Int: The maximum number of characters in a token. Default is 200.
  • subword_prefix::String: The prefix of a sub-word token. Default is "##".
FlashRank.WordPiece - Method
(wp::WordPiece)(x; token_ids::Bool = false)

WordPiece functor that tokenizes a string into a sequence of tokens (or token IDs).

Arguments

  • token_ids::Bool = false: If true, return the token IDs directly. Otherwise, return the tokens.
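
A short sketch of the functor in both output modes, assuming the tokenizer is reachable via the encoder's wp field (see BertTextEncoder above); the sub-word split in the comment is illustrative and depends on the vocabulary:

embedder = EmbedderModel(:tiny_embed)
wp = embedder.encoder.wp
wp("embeddings")                    # tokens, e.g. ["em", "##bed", "##ding", "##s"]
wp("embeddings"; token_ids = true)  # the corresponding 0-based token IDs
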
FlashRank.embed - Method
embed(
    embedder::EmbedderModel, passage::AbstractString; split_instead_trunc::Bool = false)

Embeds a single passage.

If the passage is too long for the model AND split_instead_trunc is true, the passage is split into several smaller chunks of at most embedder.encoder.trunc tokens, and each chunk is embedded separately.

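
A sketch of embedding an over-long passage with splitting enabled; since the chunks are embedded separately, the result presumably carries one embedding column per chunk:

embedder = EmbedderModel(:tiny_embed)
long_passage = repeat("FlashRank embeds and reranks passages. ", 200)
result = embed(embedder, long_passage; split_instead_trunc = true)
size(result.embeddings, 2)  # number of chunks, > 1 for this input
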
FlashRank.embed - Method
embed(
    embedder::EmbedderModel, passages::AbstractVector{<:AbstractString})

Embeds passages using the given embedder model.

Arguments

  • embedder::EmbedderModel: The embedder model to use.
  • passages::AbstractVector{<:AbstractString}: The passages to embed.

Returns

  • EmbedResult: The embeddings of the passages, with property embeddings as a column-major matrix of size (embedding_dimension, batch_size), i.e., one column per passage.

Example

model = EmbedderModel(:tiny_embed)
result = embed(model, ["Hello, how are you?", "How is it going?"])
result.embeddings # 312x2 matrix of Float32, one column per passage
FlashRank.encode - Method
encode(enc::BertTextEncoder, text::String; add_special_tokens::Bool = true,
    max_tokens::Int = enc.trunc, split_instead_trunc::Bool = false)

Encodes the text and returns the token IDs, token type IDs, and attention mask.

We require max_tokens to be a concrete number here to be able to do split_instead_trunc: when enabled, it splits any overly long sequence into several smaller ones instead of truncating it.

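
A minimal sketch of encoding a string, reusing the embedder.encoder field mentioned under embed above; the exact container of the three outputs is not documented here, so they are captured in a single variable:

embedder = EmbedderModel(:tiny_embed)
out = encode(embedder.encoder, "Hello, how are you?")  # token IDs, token type IDs, attention mask
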
FlashRank.rank - Method
rank(
    ranker::RankerModel, query::AbstractString, passages::AbstractVector{<:AbstractString};
    top_n = length(passages))

Ranks passages for a given query using the given ranker model. Ranking scores each passage by how suitable it is for answering the query (a higher score is better).

Arguments

  • ranker::RankerModel: The ranker model to use.
  • query::AbstractString: The query to rank passages for.
  • passages::AbstractVector{<:AbstractString}: The passages to rank.
  • top_n: The number of most relevant documents to return. Default is length(passages).
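
A usage sketch with top_n; the :tiny model alias is an assumption for illustration:

ranker = RankerModel(:tiny)  # :tiny is assumed for illustration
passages = ["The sky is blue.", "FlashRank ranks passages by relevance.", "Cats sleep a lot."]
result = rank(ranker, "What does FlashRank do?", passages; top_n = 2)  # keep only the 2 best
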
FlashRank.tokenize - Method
tokenize(enc::BertTextEncoder, text::AbstractString;
    add_special_tokens::Bool = true, add_end_token::Bool = true, token_ids::Bool = false,
    max_tokens::Union{Nothing, Int} = enc.trunc)

Tokenizes the text and returns the tokens or, optionally, the token IDs directly (to avoid looking up the IDs twice).

Arguments

  • add_special_tokens::Bool = true: Add special tokens at the beginning and end of the text.
  • add_end_token::Bool = true: Add end token at the end of the text.
  • token_ids::Bool = false: If true, return the token IDs directly. Otherwise, return the tokens.
  • max_tokens::Union{Nothing, Int} = enc.trunc: The maximum number of tokens to return (usually defined by the model).
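
A closing sketch of the two output modes, again assuming the encoder is reachable through a model's encoder field; the start/end wrapping follows from add_special_tokens = true:

embedder = EmbedderModel(:tiny_embed)
enc = embedder.encoder
tokenize(enc, "Hello, how are you?")                    # tokens, wrapped in startsym/endsym
tokenize(enc, "Hello, how are you?"; token_ids = true)  # 0-based token IDs instead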