External ranking parameters

This feature is still new. Please contact support if you have issues.

The scoring algorithm works from data such as parameters on the activity itself, analytics counters, or reaction counters. For most use cases this is enough to present a relevant ranked feed based on those (semi-static) parameters, but it cannot rank a feed based on a user’s own preferences.

To rank a feed based on a user’s preferences (i.e., a personalized feed), Stream now accepts external parameters that are supplied at call time and used in the scoring algorithm.

Using external ranking parameters

To use external ranking parameters, two things need to be done:

  1. define at least one external parameter in the scoring algorithm

  2. use the external parameter(s) when fetching and ranking the feed

1. Define an external parameter in the scoring algorithm

External parameters are defined by using the reserved keyword external in the scoring algorithm.

{
  "score": "external.X + 1",
  "defaults": {
    "external": {
      "X": 0
    }
  }
}

2. Use the external parameter

To use external parameters, supply a JSON-encoded (and properly URL-encoded) object in the query string parameter named ranking_vars, like so: ranking_vars=urlencode({"X": 42}). The ranking then assigns external.X the value 42, so the total score is 42 + 1 = 43.
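As a minimal sketch of the encoding step, the Python standard library can produce the query string. Only the parameter name ranking_vars and the value {"X": 42} come from the text above; how the query string is attached to a request depends on your SDK or HTTP client.

```python
import json
from urllib.parse import parse_qs, urlencode

# JSON encode the external variables, then URL encode them as the
# ranking_vars query parameter.
ranking_vars = json.dumps({"X": 42})
query_string = urlencode({"ranking_vars": ranking_vars})
print(query_string)

# Decoding the query string gives back the original object.
decoded = json.loads(parse_qs(query_string)["ranking_vars"][0])
print(decoded)
```

SDKs that accept arbitrary query parameters typically perform the URL encoding themselves, in which case only the json.dumps step is needed.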

This feature is currently not available in all SDKs. SDKs that support sending query parameters dynamically (such as Python and Ruby) can use this feature out of the box. If your SDK does not support this feature, contact us to have it added.

Example: Music service

Consider a music service that lets users upload songs and has a “global” front page feed highlighting the latest uploaded songs (activities). A naive approach is to leave the feed unranked (i.e., ordered by time), but the result might not be front-page-worthy content. A better approach could be to rank the activities by the number of views or reactions, like a “new and hot” feed. Potentially the best approach is a “new and hot” feed that also takes into account the type of activities (i.e., type of music) that the user likes.

Consider two different users: “the metalhead”, who only listens to metal, and “the classicist”, who only listens to classical music. Let’s say we have this list of recent activities posted to the “global” feed:

{"name": "song1", "classes":{"rock": 0.3, "pop": 0.1, "metal": 0.7, "classical": 0.4, "rap": 0.1}}
{"name": "song2", "classes":{"rock": 0.7, "pop": 0.2, "metal": 0.1, "classical": 0.3, "rap": 0.3}}
{"name": "song3", "classes":{"rock": 0.4, "pop": 0.3, "metal": 0.3, "classical": 0.3, "rap": 0.2}}
{"name": "song4", "classes":{"rock": 0.1, "pop": 0.4, "metal": 0.9, "classical": 0.5, "rap": 0.0}}
{"name": "song5", "classes":{"rock": 0.8, "pop": 0.5, "metal": 0.2, "classical": 0.8, "rap": 0.2}}
{"name": "song6", "classes":{"rock": 0.2, "pop": 0.6, "metal": 0.2, "classical": 0.9, "rap": 0.1}}

We can then score these activities with the following scoring algorithm (external prefix dropped for brevity):

score = w_rock*classes.rock + w_pop*classes.pop + w_metal*classes.metal + w_classical*classes.classical + w_rap*classes.rap

At call time we can supply the weights for a specific user. For “the metalhead”, for example, we can specify the weights as

{"w_rock": 5, "w_pop": 1, "w_metal": 20, "w_classical": 4, "w_rap": 0}

which will put song4 (score 20.9) and song1 (score 17.2) at the top of the feed for that user.

$response = $userFeed->getActivities(0, 5, $options=[
  'ranking' => 'front_page_personalized',
  'ranking_vars' => '{"w_rock": 5, "w_pop": 1, "w_metal": 20, "w_classical": 4, "w_rap": 0}',
]);

“The classicist” only cares about classical music, with these weights

{"w_rock": 0, "w_pop": 0, "w_metal": 0, "w_classical": 1, "w_rap": 0}

which will order the feed by the classical score only, so song6 and song5 will be at the top of the feed for that user.

$response = $userFeed->getActivities(0, 5, $options=[
  'ranking' => 'front_page_personalized',
  'ranking_vars' => '{"w_rock": 0, "w_pop": 0, "w_metal": 0, "w_classical": 1, "w_rap": 0}',
]);
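To check the orderings claimed above without calling the API, the weighted sum can be evaluated locally. This is only a sketch of the formula in plain Python; the helper names score and rank are made up for illustration and are not part of any Stream SDK.

```python
SONGS = [
  {"name": "song1", "classes": {"rock": 0.3, "pop": 0.1, "metal": 0.7, "classical": 0.4, "rap": 0.1}},
  {"name": "song2", "classes": {"rock": 0.7, "pop": 0.2, "metal": 0.1, "classical": 0.3, "rap": 0.3}},
  {"name": "song3", "classes": {"rock": 0.4, "pop": 0.3, "metal": 0.3, "classical": 0.3, "rap": 0.2}},
  {"name": "song4", "classes": {"rock": 0.1, "pop": 0.4, "metal": 0.9, "classical": 0.5, "rap": 0.0}},
  {"name": "song5", "classes": {"rock": 0.8, "pop": 0.5, "metal": 0.2, "classical": 0.8, "rap": 0.2}},
  {"name": "song6", "classes": {"rock": 0.2, "pop": 0.6, "metal": 0.2, "classical": 0.9, "rap": 0.1}},
]

METALHEAD = {"w_rock": 5, "w_pop": 1, "w_metal": 20, "w_classical": 4, "w_rap": 0}
CLASSICIST = {"w_rock": 0, "w_pop": 0, "w_metal": 0, "w_classical": 1, "w_rap": 0}

def score(song, weights):
  # Weighted sum over genres: w_<genre> * classes.<genre>
  return sum(weights[f"w_{genre}"] * value for genre, value in song["classes"].items())

def rank(songs, weights):
  # Song names ordered from highest to lowest score
  ordered = sorted(songs, key=lambda song: score(song, weights), reverse=True)
  return [song["name"] for song in ordered]

print(rank(SONGS, METALHEAD)[:2])
print(rank(SONGS, CLASSICIST)[:2])
```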

Full example

Here follows a full example of the music service described above. The ranking algorithm, named docs_music_service in the code, looks like this:

{
  "score": "external.w_rock*classes.rock + external.w_pop*classes.pop + external.w_metal*classes.metal + external.w_classical*classes.classical + external.w_rap*classes.rap",
  "defaults": {
    "external": {
      "w_rock": 1,
      "w_pop": 1,
      "w_metal": 1,
      "w_classical": 1,
      "w_rap": 1
    },
    "classes": {
      "rock": 0,
      "pop": 0,
      "metal": 0,
      "classical": 0,
      "rap": 0
    }
  }
}
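The defaults block suggests that a caller does not have to send every weight. The sketch below illustrates that idea locally, under the assumption that values missing from ranking_vars or from the activity fall back to the configured defaults key by key. That merge behaviour and the helper name resolve_score are assumptions for illustration, so verify them against the API before relying on them.

```python
DEFAULTS = {
  "external": {"w_rock": 1, "w_pop": 1, "w_metal": 1, "w_classical": 1, "w_rap": 1},
  "classes": {"rock": 0, "pop": 0, "metal": 0, "classical": 0, "rap": 0},
}

def resolve_score(activity, ranking_vars):
  # Assumption: supplied values override the defaults key by key.
  external = {**DEFAULTS["external"], **ranking_vars}
  classes = {**DEFAULTS["classes"], **activity.get("classes", {})}
  return sum(external[f"w_{genre}"] * classes[genre] for genre in DEFAULTS["classes"])

# An activity with only two classes set, ranked with only one weight supplied.
activity = {"classes": {"metal": 0.9, "pop": 0.4}}
print(resolve_score(activity, {"w_metal": 20}))
print(resolve_score(activity, {}))
```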

In the Python code below, we simulate the music service by adding 10 activities to a feed with random values assigned to each of the classes. We also define three different users: THE_METALHEAD, THE_ALL_EATER and ANYTHING_BUT_METAL.

import stream
import json
import time
import random

THE_METALHEAD = {"w_rock": -1, "w_pop": -1, "w_metal": 100, "w_classical": -1, "w_rap": -1}
THE_ALL_EATER = {"w_rock": 2, "w_pop": 3, "w_metal": 1, "w_classical": 2, "w_rap": 1}
ANYTHING_BUT_METAL = {"w_rock": 1, "w_pop": 1, "w_metal": -100, "w_classical": 1, "w_rap": 1}

client = stream.connect('{{ api_key }}', '{{ api_secret }}')

def generate_weights():
  return {
    "rock": random.random(),
    "pop": random.random(),
    "metal": random.random(),
    "classical": random.random(),
    "rap": random.random(),
  }

def generate_activities(n):
  activities = []
  for _ in range(n):
    weights = generate_weights()
    prominent_genre = sorted(weights.items(), key=lambda x: x[1], reverse=True)[0][0]
    activities.append(
      {
        "actor": "jimmy",
        "verb": "post",
        "object": f"this is an activity mostly about {prominent_genre} music",
        "classes": weights,
      }
    )
  return activities

def print_personalized_feed(user):
  # the user's weights are sent JSON encoded as ranking_vars
  response = feed.get(
    ranking="docs_music_service",
    withScoreVars=True,
    ranking_vars=json.dumps(user),
  )
  for activity in response["results"]:
    print(f"{activity['object']} (score={activity['score']:.1f})")

feed_id = str(int(time.time()))
feed = client.feed("user", feed_id)
feed.add_activities(generate_activities(10))

print("The metal head")
print_personalized_feed(THE_METALHEAD)
print("\n---------------------------------\n")
print("The all eater")
print_personalized_feed(THE_ALL_EATER)
print("\n---------------------------------\n")
print("Anything but metal")
print_personalized_feed(ANYTHING_BUT_METAL)

When you run this code, it performs the following steps:

  1. construct a new feed in the user feed group

  2. add 10 activities to the feed with random music genre classes assigned

  3. rank and print the feed for each of the 3 users

The output, depending on the random classes, will look something like this:

The metal head
this is an activity mostly about metal music (score=94.3)
this is an activity mostly about metal music (score=79.3)
this is an activity mostly about classical music (score=76.3)
this is an activity mostly about classical music (score=68.1)
this is an activity mostly about rock music (score=37.6)
this is an activity mostly about rap music (score=36.5)
this is an activity mostly about pop music (score=33.4)
this is an activity mostly about rock music (score=21.9)
this is an activity mostly about classical music (score=9.9)
this is an activity mostly about rock music (score=4.5)

---------------------------------

The all eater
this is an activity mostly about rock music (score=6.7)
this is an activity mostly about rock music (score=6.4)
this is an activity mostly about pop music (score=6.0)
this is an activity mostly about metal music (score=5.9)
this is an activity mostly about classical music (score=5.8)
this is an activity mostly about classical music (score=4.4)
this is an activity mostly about rap music (score=3.9)
this is an activity mostly about metal music (score=3.8)
this is an activity mostly about classical music (score=3.3)
this is an activity mostly about rock music (score=2.3)

---------------------------------

Anything but metal
this is an activity mostly about rock music (score=-4.5)
this is an activity mostly about classical music (score=-9.9)
this is an activity mostly about rock music (score=-21.9)
this is an activity mostly about pop music (score=-33.4)
this is an activity mostly about rap music (score=-36.5)
this is an activity mostly about rock music (score=-37.6)
this is an activity mostly about classical music (score=-68.1)
this is an activity mostly about classical music (score=-76.3)
this is an activity mostly about metal music (score=-79.3)
this is an activity mostly about metal music (score=-94.3)

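As you can see, we get three different rankings depending on which user is fetching the feed. Because the ANYTHING_BUT_METAL weights are the exact negation of the THE_METALHEAD weights, that user’s scores are negated and the ordering is the reverse of the metalhead’s.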

Distance based ranking

External ranking parameters can be used together with the dist function from custom ranking to rank activities by their distance from the user. For this to work, each activity must have a latitude and longitude associated with it. If we just want to rank by distance, we can use the following custom ranking (assumed to be named distance below). The score is the negated distance, so closer activities rank higher:

{
  "score": "-dist(lat,lng,external.lat,external.lng)",
  "defaults": {
    "external": {
      "lat": 0,
      "lng": 0
    },
    "lat": 0,
    "lng": 0
  }
}

When we add activities we make sure they have fields lat and lng, like so:

feed.add_activities(
  [
    {
      "actor": "jimmy",
      "verb": "post",
      "object": "MALMÖ",
      "lat": 55.605057,
      "lng": 13.014354,
    },
    {
      "actor": "jimmy",
      "verb": "post",
      "object": "COPENHAGEN",
      "lat": 55.684105,
      "lng": 12.571859,
    },
    {
      "actor": "jimmy",
      "verb": "post",
      "object": "PARIS",
      "lat": 48.864716,
      "lng": 2.349014,
    },
    {
      "actor": "jimmy",
      "verb": "post",
      "object": "LONDON",
      "lat": 51.509865,
      "lng": -0.118092,
    },
    {
      "actor": "jimmy",
      "verb": "post",
      "object": "SYDNEY",
      "lat": -33.865143,
      "lng": 151.209900,
    },
  ]
)

Now, when we read the feed, we can send in the reading user’s location (as denoted by external.lat and external.lng) in the custom ranking:

# simulate a user fetching activities while in Amsterdam
amsterdam = {"lat": 52.377956, "lng": 4.897070}
for r in feed.get(
  ranking="distance",
  ranking_vars=json.dumps(amsterdam),
)["results"]:
  print(
    f"distance to activity in {r['object']} is {round(abs(r['score']), 1)}km (score was: {r['score']})"
  )
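To sanity-check the ordering such a ranking should produce, the distances can be approximated locally. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km. Whether dist uses exactly this formula is an assumption, and haversine_km is a hypothetical helper, not an SDK function.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
  # Great-circle distance in kilometres between two points given in degrees
  phi1, phi2 = math.radians(lat1), math.radians(lat2)
  dphi = phi2 - phi1
  dlambda = math.radians(lng2 - lng1)
  a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
  return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

CITIES = {
  "MALMÖ": (55.605057, 13.014354),
  "COPENHAGEN": (55.684105, 12.571859),
  "PARIS": (48.864716, 2.349014),
  "LONDON": (51.509865, -0.118092),
  "SYDNEY": (-33.865143, 151.209900),
}
AMSTERDAM = (52.377956, 4.897070)

# Nearest city first, which is the order a negated-distance score would give
nearest_first = sorted(CITIES, key=lambda name: haversine_km(*AMSTERDAM, *CITIES[name]))
for name in nearest_first:
  print(f"{name}: {haversine_km(*AMSTERDAM, *CITIES[name]):.1f} km")
```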