API Reference
API reference for FlashRank.
FlashRank.BertTextEncoder
FlashRank.EmbedResult
FlashRank.EmbedderModel
FlashRank.RankerModel
FlashRank.WordPiece
FlashRank.WordPiece
FlashRank.bert_cased_tokenizer
FlashRank.bert_uncased_tokenizer
FlashRank.embed
FlashRank.embed
FlashRank.encode
FlashRank.rank
FlashRank.tokenize
FlashRank.BertTextEncoder — Type
BertTextEncoder
The text encoder for the BERT model (WordPiece tokenization).
Fields
- wp::WordPiece: The WordPiece tokenizer.
- vocab::Dict{String, Int}: The vocabulary, with 0-based indexing of tokens to match the Python implementation.
- startsym::String: The start symbol.
- endsym::String: The end symbol.
- padsym::String: The pad symbol.
- trunc::Union{Nothing, Int}: The truncation length. Defaults to 512 tokens.
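For illustration, a minimal sketch of inspecting the encoder that a model carries; reading it from the model's encoder field is an assumption based on the embedder.encoder.trunc reference in the embed docstring below:
using FlashRank
embedder = EmbedderModel(:tiny_embed)
enc = embedder.encoder                 # assumed field name, holds the BertTextEncoder
enc.trunc                              # truncation length, 512 by default
enc.startsym, enc.endsym, enc.padsym   # start, end, and pad symbols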
FlashRank.EmbedResult — Type
EmbedResult{T <: Real}
The result of embedding passages.
Fields
- embeddings::AbstractArray{T}: The embeddings of the passages, stored as a column-major matrix with one passage embedding per column, i.e., of size (embedding_dimension, batch_size) (see the embed example below, where two passages yield a 312x2 matrix).
- elapsed::Float64: The time taken to embed the passages.
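For illustration, a short sketch of consuming an EmbedResult (both fields are documented above; the model alias is taken from the embed example further below):
result = embed(EmbedderModel(:tiny_embed), ["a short passage", "another passage"])
result.embeddings  # one column per passage
result.elapsed     # seconds spent on embedding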
FlashRank.EmbedderModel — Type
EmbedderModel
A model for embedding passages, including the encoder and the ONNX session for inference.
For embedding, use embed(embedder, passages) or call the model as a functor: embedder(passages).
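A minimal usage sketch of the two calling styles described above (the :tiny_embed alias is taken from the embed example further below):
embedder = EmbedderModel(:tiny_embed)
res1 = embed(embedder, ["first passage", "second passage"])  # explicit call
res2 = embedder(["first passage", "second passage"])         # functor form, equivalent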
FlashRank.RankerModel — Type
RankerModel
A model for ranking passages, including the encoder and the ONNX session for inference.
For ranking, use rank(ranker, query, passages) or call the model as a functor: ranker(query, passages).
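A minimal usage sketch of the two calling styles described above; the :tiny model alias is an assumption (see the package README for the available aliases):
ranker = RankerModel(:tiny)  # model alias assumed for illustration
query = "What is the capital of France?"
passages = ["Paris is the capital of France.", "Berlin is the capital of Germany."]
res1 = rank(ranker, query, passages)  # explicit call
res2 = ranker(query, passages)        # functor form, equivalent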
FlashRank.WordPiece — Type
WordPiece
WordPiece is a tokenizer that splits a string into a sequence of KNOWN sub-word tokens (or token IDs). It uses a double array trie to store the vocabulary and the index of the vocabulary.
Implementation is based on: https://github.com/chengchingwen/Transformers.jl
Fields
- trie::DoubleArrayTrie: The double array trie of the vocabulary (for fast lookups of tokens).
- index::Vector{Int}: The index of the vocabulary. It is 0-based because we provide token IDs to models trained in Python.
- unki::Int: The index of the unknown token in the TRIE (i.e., this is not the token ID, but the trie index).
- max_char::Int: The maximum number of characters in a token. Default is 200.
- subword_prefix::String: The prefix of a sub-word token. Default is "##".
FlashRank.WordPiece — Method
(wp::WordPiece; token_ids::Bool = false)(x)
WordPiece functor that tokenizes a string into a sequence of tokens (or token IDs).
Arguments
- token_ids::Bool = false: If true, return the token IDs directly. Otherwise, return the tokens.
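A hedged sketch of calling the functor; obtaining a WordPiece instance from a model's encoder (the wp field of BertTextEncoder above) is an assumption, and the sample outputs in the comments are indicative only:
wp = EmbedderModel(:tiny_embed).encoder.wp  # assumed access path
tokens = wp("flashrank")                    # sub-word tokens, e.g. ["flash", "##rank"]
ids = wp("flashrank"; token_ids = true)     # 0-based token IDs instead of tokens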
FlashRank.bert_cased_tokenizer — Method
bert_cased_tokenizer(input)
Google BERT tokenizer that preserves case during tokenization. Recommended for multi-lingual data.
FlashRank.bert_uncased_tokenizer — Method
bert_uncased_tokenizer(input)
Google BERT tokenizer that lowercases the input before tokenization.
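For illustration, the two pre-tokenizers applied to the same input (the tokens in the comments are indicative only):
bert_cased_tokenizer("Hello, World!")    # keeps case, e.g. ["Hello", ",", "World", "!"]
bert_uncased_tokenizer("Hello, World!")  # lowercases first, e.g. ["hello", ",", "world", "!"]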
FlashRank.embed — Method
embed(embedder::EmbedderModel, passage::AbstractString; split_instead_trunc::Bool = false)
Embeds a single passage.
If the passage is too long for the model AND split_instead_trunc is true, the passage is split into several smaller chunks of size embedder.encoder.trunc and embedded separately.
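A minimal sketch of the single-passage method, with and without splitting (the :tiny_embed alias is taken from the example below):
embedder = EmbedderModel(:tiny_embed)
res_short = embed(embedder, "A short passage.")
long_text = repeat("A rather long passage. ", 500)
res_split = embed(embedder, long_text; split_instead_trunc = true)  # split into chunks of embedder.encoder.trunc tokens, embedded separately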
FlashRank.embed — Method
embed(embedder::EmbedderModel, passages::AbstractVector{<:AbstractString})
Embeds passages using the given embedder model.
Arguments
- embedder::EmbedderModel: The embedder model to use.
- passages::AbstractVector{<:AbstractString}: The passages to embed.
Returns
- EmbedResult: The embeddings of the passages, with property embeddings as a column-major matrix with one passage embedding per column, i.e., of size (embedding_dimension, batch_size).
Example
model = EmbedderModel(:tiny_embed)
result = embed(model, ["Hello, how are you?", "How is it going?"])
result.embeddings # 312x2 matrix of Float32
FlashRank.encode — Method
encode(enc::BertTextEncoder, text::String; add_special_tokens::Bool = true,
    max_tokens::Int = enc.trunc, split_instead_trunc::Bool = false)
Encodes the text and returns the token IDs, token type IDs, and attention mask.
We enforce max_tokens to be a concrete number here so that split_instead_trunc can be applied; split_instead_trunc splits any long sequences into several smaller ones instead of truncating them.
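A hedged sketch of calling encode; reaching the encoder through the model's encoder field is an assumption (as in embedder.encoder.trunc above), and the three outputs are assumed to be returned in the documented order:
enc = EmbedderModel(:tiny_embed).encoder  # assumed access path
token_ids, token_type_ids, attention_mask = encode(enc, "How are you?")
enc_long = encode(enc, repeat("long text ", 1000); split_instead_trunc = true)  # split rather than truncate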
FlashRank.rank — Method
rank(ranker::RankerModel, query::AbstractString, passages::AbstractVector{<:AbstractString};
    top_n = length(passages))
Ranks passages for a given query using the given ranker model. The ranking reflects how suitable each passage is for answering the query (a higher score is better).
Arguments
- ranker::RankerModel: The ranker model to use.
- query::AbstractString: The query to rank passages for.
- passages::AbstractVector{<:AbstractString}: The passages to rank.
- top_n: The number of most relevant documents to return. Default is length(passages).
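A minimal usage sketch; the :tiny model alias is an assumption, and since the return type is not documented here only the call itself is shown:
ranker = RankerModel(:tiny)  # model alias assumed for illustration
query = "How does FlashRank rank passages?"
passages = ["FlashRank ranks passages with an ONNX model.", "Bananas are yellow.", "The most relevant passages get the highest scores."]
result = rank(ranker, query, passages; top_n = 2)  # keep only the 2 most relevant passages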
FlashRank.tokenize — Method
tokenize(enc::BertTextEncoder, text::AbstractString;
    add_special_tokens::Bool = true, add_end_token::Bool = true, token_ids::Bool = false,
    max_tokens::Union{Nothing, Int} = enc.trunc)
Tokenizes the text and returns the tokens or token IDs (to skip looking up the IDs twice).
Arguments
- add_special_tokens::Bool = true: Add special tokens at the beginning and end of the text.
- add_end_token::Bool = true: Add the end token at the end of the text.
- token_ids::Bool = false: If true, return the token IDs directly. Otherwise, return the tokens.
- max_tokens::Union{Nothing, Int} = enc.trunc: The maximum number of tokens to return (usually defined by the model).
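A hedged sketch of tokenize with the documented keyword arguments; obtaining the encoder from a model is an assumption:
enc = EmbedderModel(:tiny_embed).encoder                      # assumed access path
tokens = tokenize(enc, "Hello, how are you?")                 # sub-word tokens incl. special tokens
ids = tokenize(enc, "Hello, how are you?"; token_ids = true)  # 0-based token IDs
bare = tokenize(enc, "Hello!"; add_special_tokens = false, add_end_token = false)  # no start/end symbols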