Tuning Relevance in Elasticsearch with Custom Boosting

Elasticsearch offers different options out of the box in terms of ranking function (similarity function, in Lucene terminology). The default ranking function is a variation of TF-IDF, relatively simple to understand and, thanks to some smart normalisations, also quite effective in practice.

Each use case is a different story so sometimes the default ranking function doesn’t works as well as it does for the general case. In particular, when the collection starts being fairly diverse in terms of document structure (different document types, with different fields, with different size, etc.), we need to adjust the default behaviour so that we can tune relevance.

Static Boosting

The first option we discuss is static boosting, or in other words boosting at indexing time. The boost factor is calculated depending on the nature of the document.

For example, if we have different types of documents in the index, some type could be given more importance because of a specific business rule, or just to compensate for different morphology, as short documents are promoted by the default ranking, so on the other side long documents are penalised.

A different example is the need to apply some sort of popularity/authority score, such as PageRank or similar, that takes into account how a document is linked to other documents.

Once we have a clear formula for the boost factor, we can store its value as an extra field in the document, and use a function_score query to incorporate it into the relevance score, e.g.:

{
    "query": {
        "function_score": {
            "query": {  
                "match": {
                    "some_field_name": "query terms here"
                }
            },
            "functions": [{
                "field_value_factor": { 
                    "field": "my_boost_field"
                }
            }],
            "score_mode": "multiply"
        }
    }
}

In this example, the boost factor is stored in my_boost_field: its value and the relevance score coming from the similarity function are multiplied to achieve the final score used for the ranking.

Because of the nature of this approach, if we want to give a different boosting to the document, the document must be re-indexed in order to apply. In other words, we shouldn’t use this approach if the business rules that define the boosting are likely to change, or in general if we know that over time the relative importance of a document will change (and its boost factor with it).

Dynamic boosting

A different approach consists in the possibility of boosting the documents at query time.

In a previous post – How to Promote Recent Articles in Elasticsearch – we discussed an example of dynamic boosting used to promote recently published documents using a so-called decay function. In this case, we used the value of a date field to calculate the relative importance of the documents.

When our query is a combination of different sub-queries, we can assign different weights to the different components, for example:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title": {
              "query": "some query terms here",
              "boost": 3
            }
          }
        },
        {
          "match": { 
            "content": "other query terms there"
          }
        }
      ]
    }
  }
}

This is a boolean OR query with two clauses, one over the field title and one over the field content. The clauses can be any type of query, and the boosting factor set to 3 means the the first clause is three times more important than the second clause.

It is worth noticing that because of the length normalisation, short fields are implicitely boosted by the default similarity, and the title is usually shorter than the whole content (similar consideration for subject vs body of a message, etc.).

When the query is a multi_match one, we can plug the boost factor in-line, for example:

{
  "multi_match" : {
    "query": "some query terms here", 
    "fields": [ "title^3", "content" ] 
  }
}

In this case, the same match query is execture over the two different fields, and using the caret syntax we can specify, like in the previous example, that the title is three times more important than the content.

Sensible values for this kind of boost factors are usually in the range 1-10, or 1-15 max, as the Elasticsearch manual suggests. This is because some boost normalisation happens internally, so bigger values will not have much impact.

Final Thoughts

Improving relevance is a crucial part of a search application, and often the final tuning is a matter of try-and-see. Boosting – in particular dynamic boosting – allows some level of control over relevance tuning, in a way that is easy to experiment with, and easy to change in order to adapt for specific business rules. With dynamic boosting, there is no need to re-index, so this is an interesting approach for relevance tuning.

Elasticsearch offers a boosting option for nearly every type of query, and on top of this, a function_score can also be employed to further personalise the scoring function.

@MarcoBonzanini

How to Promote Recent Articles in Elasticsearch

The default ranking options in Elasticsearch (and Lucene) are purely based on content. Sometimes, it is useful to mix other factors in, such as the location/distance of a restaurant, to promote venues close to the user’s location, or the publication date of an article, to promote recent articles.

This post discusses a simple solution which is already integrated in Elasticsearch and can be enabled at query time: a decay function.

Default Ranking in Elasticsearch

The default ranking function (called similarity in Lucene terminology) is a normalised version of TF-IDF. The normalisation is based on document length and promotes shorter documents over longer ones. Informally, the idea behind this approach is that longer documents will have higher TF’s just because they have more content, so something should be done to avoid penalising short documents. In practice, this works well most of the time, but in case this normalised TF-IDF is not what you need, there are plenty of other options, including BM25, Divergence From Randomness or Language Model. All these ranking functions are based on the content of a field, i.e. they are based on term statistics.

To showcase how the default ranking works, and how we can affect it later, I’ll use a sample database of articles I’ve published on this blog (click here for the Gist). The Gist will create an index on a local instance of Elasticsearch, where each document will have the fields title (string) and published (date).

Once the index is created, we can run a very basic query:

{
    "query": {
        "match": {
            "title": "python"
        }
    }
}

The query will look for the term python in the title field, and you should see how the documents ranking higher are the ones with shorter titles. This is because the term python occurs only once in each title, so what makes the difference in terms of scoring is the document length normalisation.

Function Score and Decay Functions

Elasticsearch provides an extensive support for custom scoring via the query DSL, meaning that relevance can be tweaked at query time without re-indexing. Using a function_score query, we can smooth the default scoring function to include freshness (a.k.a. recency) as a component for relevance.

Specifically, the use of some scoring functions called decay functions (namely gaussian, exponential and linear) is the key to integrate numeric fields, date fields and geo fields in the picture.

As an example, let’s consider the following query:

{
    "query": {
        "function_score": {
            "query": {
                "match": { "title": "python"}
            },
            "gauss": {
                "published": {
                    "scale":  "8w"
                }
            }
        }
    }
}

This query produces the same result set as the previous one, but the ranking will be very different. In particular, recent articles will be pushed on top. The scale parameter is what governs how quickly the score drops when moving away from the origin. In this example, we have used a range of 8 weeks, meaning that articles older than that will have a very low score.

Summary

Content is the key aspect to consider in most ranking problems which involve textual data. Sometimes though, content is not the only aspect to consider. This article has showcased decay functions as a simple Elasticsearch option to influence ranking by including freshness/recency as a core component to be mixed with content. Since decay functions work on numeric fields, date fields and geo-points, they can be used to integrate location/distance, recency and other numerical aspects like popularity (e.g. number of votes) into the ranking function. This can be done at query time, without the need to re-index.