One thing that has always bugged me about Bradley Terry models is that it doesn’t model rock-paper-scissors.
To recap the Bradley Terry model posits that each entity has a strength \(\gamma_{i}\) and when competing the probability of victory for entity a over entity b is
\[exp(\gamma_{a}) / exp(\gamma_{a}) + exp(\gamma_{b})\]
or to make it clearer logit(γa - γb).
There’s a couple different approaches in the literature. This one augments each entity with a 2d vector vi and defines the probability as
logit(norm(v_b - v_a) - norm(v_a - v_b) + γa - γb)
This vector difference then provides the intransitivity.
I want to do a little better: in particular I’m not a huge fan of this difficult to interpret this difference of norms. You also don’t get a lot of pooling: since you have to fit this vector for each entity you won’t do well with sparse observations.
So what can we do instead? There is a natural clustering suggested by this rock-paper-scissors example so let’s use it. We augment each entity with a latent cluster Zi and a matrix Ac1, c2 of the advantage an entity of class c1 would have over an entity of class c2. Then the probability becomes
logit(AZa, Zb + γa - γb)
This is perhaps equivalent to this paper although I can’t quite tell as it’s confusingly written. I think they additionally put a DP on the γ terms which doesn’t make a ton of sense to me although they justify it by pointing that with match data certain entities are just statistically indistinguishable.]
So let’s now try it on a simulated rock-paper-scissors-like data. First let’s see what a standard Bradley Terry Model performs.
# TOOD: Plot a heatmap of win probabilities for rock-paper-scissors
For a “real” example we can use the Weedle’s cave data set from Kaggle which captures battle outcomes between different pokemon. Now those of you born in the 90s will know1 that Pokemon have types and these types exhibit rock-paper-scissors like advantages: electric is super effective against water which is super effective against fire and so on.
Data set: https://www.kaggle.com/datasets/terminus7/pokemon-challenge/data
using AlgebraOfGraphics, CairoMakie
using Turing, Turing.RandomMeasures
using DataFrames
using CSV
using MCMCChains
struct MatchData
lo_ids::Vector{Int}
hi_ids::Vector{Int}
lo_wins::Vector{Int}
matches::Vector{Int}
n_entities::Int
n_pairs::Int
end
function simulated_data()
strengths = [1,1,1,2,2,2,3,3,3]
class = [1,2,3,1,2,3,1,2,3]
class_strengths = zeros(3, 3)
class_strengths[1,2] = 3
class_strengths[2,1] = -3
class_strengths[1,3] = -3
class_strengths[3,1] = 3
class_strengths[2,3] = 3
class_strengths[3,2] = -3
lo_ids = Int[]
hi_ids = Int[]
lo_wins = Int[]
matches = Int[]
for lo in 1:9, hi in (lo+1):9
push!(lo_ids, lo)
push!(hi_ids, hi)
logit_p = strengths[lo] - strengths[hi] + class_strengths[class[lo], class[hi]]
p = 1 / (1 + exp(-logit_p))
push!(matches, 20)
push!(lo_wins, rand(Binomial(20, p)))
end
return MatchData(lo_ids, hi_ids, lo_wins, matches, 9, 9*4)
end
function prep_match_data(df)
pairs = Dict{Tuple{Int, Int}, Tuple{Int, Int}}()
for row in eachrow(df)
lo, hi = minmax(row["First_pokemon"], row["Second_pokemon"])
lo_wins = lo == row["Winner"]
if haskey(pairs, (lo, hi))
pairs[lo, hi] = pairs[lo, hi] .+ (lo_wins, 1)
else
pairs[lo, hi] = (lo_wins, 1)
end
end
lo_ids = zeros(Int, length(pairs))
hi_ids = zeros(Int, length(pairs))
lo_wins = zeros(Int, length(pairs))
matches = zeros(Int, length(pairs))
for (ii, (key, val)) in enumerate(pairs)
lo_ids[ii] = key[1]
hi_ids[ii] = key[2]
lo_wins[ii] = val[1]
matches[ii] = val[2]
end
n_pairs = length(pairs)
n_entities = maximum(hi_ids)
return MatchData(lo_ids, hi_ids, lo_wins, matches, n_entities, n_pairs)
end
@model function BradleyTerryDP(match_data)
# Dirichlet Process for infinite mixture of clusters
α = 1.0 # Concentration parameter
rpm = DirichletProcess(α)
H = Normal(0, 1) # Base distribution for cluster effects
n_entities = match_data.n_entities
# Individual strengths
# β = tzeros(n_entities)
β ~ filldist(Normal(0, 1), n_entities)
# Cluster assignments and strengths
cluster = tzeros(Int, n_entities)
max_clusters = 30
cluster_effects = tzeros(max_clusters, max_clusters)
for ii in 1:n_entities
K = maximum(cluster)
nk = Vector{Int}(map(k -> sum(cluster .== k), 1:K))
cluster[ii] ~ ChineseRestaurantProcess(rpm, nk)
if cluster[ii] > K
cluster_effects[K + 1, K + 1] = 0.0
for k in 1:K
cluster_effects[k, K + 1] ~ Normal(0, 1)
cluster_effects[K + 1, k] = -cluster_effects[k, K + 1] # enforce antisymmetry
end
end
end
# Likelihood
for ii in 1:match_data.n_pairs
lo = match_data.lo_ids[ii]
hi = match_data.hi_ids[ii]
logit_p = β[lo] - β[hi] + cluster_effects[cluster[lo], cluster[hi]]
p_lo_win = 1.0 / (1.0 + exp(-logit_p))
match_data.lo_wins[ii] ~ Binomial(match_data.matches[ii], p_lo_win)
end
end
sim_data = simulated_data()
model = BradleyTerryDP(sim_data)
#df = DataFrame(CSV.File("assets/non-transitive-bradley-terry/combats.csv"))
#match_data = prep_match_data(df)
#model = BradleyTerryDP(match_data)
# TODO: figure out what these paramters mean
#chain = sample(model, Gibbs(PG(20, :cluster), NUTS(500, 0.65, :cluster_effects), NUTS(500, 0.65, :β)), n_iterations)
n_iterations = 1000
chain = sample(model, Gibbs(:cluster => PG(100),
:cluster_effects => HMC(0.01, 4),
:β => HMC(0.01, 4)), n_iterations)
# char_df = DataFrame(CSV.File("assets/non-transitive-bradley-terry/pokemon.csv"))
# cluster_type = DataFrame(:cluster => chain[44, namesingroup(chain, :cluster), :].value.data[:],
# :type => char_df[!, "Type 1"])
# cluster_counts = combine(groupby(cluster_type, [:cluster, :type]), nrow => :count)
# plt = data(cluster_counts) * mapping(:cluster => nonnumeric, :type => nonnumeric, :count) * visual(Heatmap)
# draw(plt)
*
-
sadly I know this only second-hand as my parents thought books were better than gameboys: look at the ruin that brought me ↩︎