I wrote here about the bias encoded into the ORES models deployed on Wikipedia, which help editors monitor changes to the encyclopedia. There I showed that the models were unfair to newcomers and anonymous editors under two different quantified notions of fairness: balance and calibration. I also brought up the fact that there is an inherent tradeoff between these two notions, such that in non-trivial situations it is impossible to satisfy both. Here, I'm going to illustrate this point with a simple simulation and show how a straightforward approach to creating a balanced model from an imbalanced one results in a model that is not calibrated.
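Before simulating, it may help to pin down how I'll measure the two notions. A minimal sketch (the helper names are my own, not from ORES): calibration compares the mean predicted score to the observed positive rate within each group, while balance compares mean scores across groups within each true class.

```r
# Hypothetical helpers for measuring the two fairness notions.
# calibration: within each group, the mean predicted probability should
# match the observed rate of the positive class.
calibration_gap <- function(p, y, group) {
  tapply(p, group, mean) - tapply(y, group, mean)
}
# balance: within each true class, the mean score should not differ by
# group (equivalently, equal false/true positive rates at a threshold).
scores_by_class_and_group <- function(p, y, group) {
  tapply(p, list(y, group), mean)
}
```

A perfectly calibrated classifier has gaps of zero in every group; a perfectly balanced one has identical rows across the columns of the class-by-group table.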

```
# I'm going to use these R packages
library(ggplot2)
theme_set(theme_bw())
library(data.table)
```

I'll simulate a dataset of edits with two variables: **anon**, indicating whether the edit was made anonymously, and **X**, which stands for everything else we can observe and include in our model. We'll say the linear probit model with these two variables is the true model.

```
# generate a dataset according to the model
B_anon <- 2
B_X <- 1
n <- 4000
edits <- data.table(anon=c(rep(TRUE,n/2),rep(FALSE,n/2)), X = rnorm(n,0,1)) # one X draw per edit (rnorm(n), not rnorm(n/2), which would recycle values)
edits[,p_damaging := pnorm(B_anon*anon + B_X*X,1,1)]
edits[,damaging := sapply(p_damaging, function(p) rbinom(1,1,p))]
```

Next, I'll fit a probit model to the generated data and compute its predictions.

```
glm_mod = glm(damaging ~ anon + X - 1, data = edits,family=binomial(link='probit'))
edits[,p.calibration := pnorm(predict(glm_mod,newdata=edits))]
edits[,calibration.pred:= p.calibration > 0.5]
```

The true model should be calibrated, but not balanced. Let's verify that is the case.

```
edits[, .(model=mean(p.calibration), true=mean(damaging)),by=c("anon")]
```

So we see that it is calibrated, but is it balanced?

```
edits[,mean(p.calibration),by=c("damaging","anon")]
```

**Not even close!** The model is super unbalanced. Non-damaging anonymous edits have almost the same average score as damaging non-anonymous edits!
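The intuition behind the conflict is worth spelling out: when base rates differ across groups, even a perfectly calibrated score (one equal to the true probability) must assign different average scores to the two groups' negatives. A toy numeric sketch, separate from the simulation above:

```r
# Two groups with different base rates; scores equal the true probability,
# so the score is perfectly calibrated by construction.
set.seed(1)
n <- 10000
p_a <- rep(0.6, n); y_a <- rbinom(n, 1, p_a)   # group A: base rate 0.6
p_b <- rep(0.2, n); y_b <- rbinom(n, 1, p_b)   # group B: base rate 0.2
# Calibration holds in both groups (gaps near zero)...
c(mean(p_a) - mean(y_a), mean(p_b) - mean(y_b))
# ...but among true negatives the mean score is 0.6 in A and 0.2 in B,
# so balance fails for any calibrated score when base rates differ.
c(mean(p_a[y_a == 0]), mean(p_b[y_b == 0]))
```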

A natural first attempt at building a balanced model is to fit one that only uses **X** and ignores anons.

```
glm_mod2 = glm(damaging ~ X , data = edits,family=binomial(link='probit'))
edits[,p.try_balance := pnorm(predict(glm_mod2,newdata=edits))]
edits[,mean(p.try_balance),by=c("damaging","anon")]
```

```
edits[, .(model=mean(p.try_balance), true=mean(damaging)),by=c("anon")]
```

This model is neither balanced nor calibrated: because anons really do damage at a higher rate, the model that only sees **X** underestimates their risk, and within each outcome class the groups still receive different average scores, since damaging edits by non-anons tend to come with unusually high values of **X**.

Now let's plot the ROC curves of the calibrated model, separately for anons and non-anons.

```
roc_x <- 0:100/100
tpr_anon <- edits[anon==TRUE, sapply(roc_x, function(x) sum( (p.calibration > x) & (damaging==TRUE) )/sum(damaging==TRUE))]
fpr_anon <- edits[anon==TRUE, sapply(roc_x, function(x) sum((p.calibration > x) & (damaging==FALSE))/sum(damaging==FALSE))]
tpr_nonanon <- edits[anon==FALSE, sapply(roc_x, function(x) sum( (p.calibration > x) & damaging==TRUE)/sum(damaging==TRUE))]
fpr_nonanon <- edits[anon==FALSE, sapply(roc_x, function(x) sum((p.calibration > x) & damaging==FALSE)/sum(damaging==FALSE))]
roc <- data.table(x=roc_x,tpr_anon=tpr_anon,fpr_anon=fpr_anon,tpr_nonanon=tpr_nonanon, fpr_nonanon=fpr_nonanon)
ggplot(roc) + geom_line(aes(x=fpr_nonanon,y=tpr_nonanon,color="Non anon")) + geom_line(aes(x=fpr_anon,y=tpr_anon,color="Anon")) + ylab("True positive rate") + xlab("False positive rate")
```

So it looks like we can find balance where both groups have an FPR of around 0.21. Let's find the corresponding threshold for each group.

```
(t.nonanon <- roc_x[which.min(abs(fpr_nonanon - 0.21))])
```

```
(t.anon <- roc_x[which.min(abs(fpr_anon - 0.21))])
```

```
## for anons it's where fpr_anon is about 0.22, which is at a threshold of about 0.77
## you could use linear programming to do this exactly, but I'm lazy
edits[anon==TRUE, balance.pred := p.calibration > t.anon]
edits[anon==FALSE, balance.pred := p.calibration > t.nonanon]
edits[,mean(balance.pred),by=.(damaging,anon)]
```
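As an aside, the grid search over `roc_x` isn't strictly necessary: to hit a target false positive rate in a group, you can take the (1 − FPR) quantile of the scores among that group's negatives. A sketch on synthetic scores (the function name is my own):

```r
# Threshold achieving a target FPR within one group: the (1 - fpr)
# quantile of the scores among that group's negative examples.
threshold_for_fpr <- function(scores, y, target_fpr) {
  quantile(scores[y == 0], probs = 1 - target_fpr, names = FALSE)
}

# quick demonstration on synthetic scores
set.seed(2)
scores <- runif(2000)
y <- rbinom(2000, 1, scores)
t <- threshold_for_fpr(scores, y, 0.21)
mean(scores[y == 0] > t)  # approximately 0.21
```

Applied to the simulation, `t.anon` would be `edits[anon==TRUE & damaging==FALSE, quantile(p.calibration, 1 - 0.21)]`, and likewise for non-anons.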

Check that our new predictor is balanced. First, for reference, the four error rates of the original calibrated predictor:

```
#tpr
(edits[anon==TRUE, sum( (calibration.pred==TRUE) & (damaging==TRUE) )/sum(damaging==TRUE)])
(edits[anon==FALSE,sum( (calibration.pred==TRUE) & damaging==TRUE)/sum(damaging==TRUE)])
```

```
#fpr
(edits[anon==TRUE, sum((calibration.pred==TRUE) & (damaging==FALSE))/sum(damaging==FALSE)])
(edits[anon==FALSE, sum((calibration.pred==TRUE) & damaging==FALSE)/sum(damaging==FALSE)])
```

```
#tnr
(edits[anon==TRUE, sum( (calibration.pred==FALSE) & (damaging==FALSE) )/sum(damaging==FALSE)])
(edits[anon==FALSE,sum( (calibration.pred==FALSE) & damaging==FALSE)/sum(damaging==FALSE)])
```

```
#fnr
(edits[anon==TRUE, sum((calibration.pred==FALSE) & (damaging==TRUE))/sum(damaging==TRUE)])
(edits[anon==FALSE, sum((calibration.pred==FALSE) & damaging==TRUE)/sum(damaging==TRUE)])
```

And now the same four rates for the balanced predictor:

```
#tpr
(edits[anon==TRUE, sum( (balance.pred==TRUE) & (damaging==TRUE) )/sum(damaging==TRUE)])
(edits[anon==FALSE,sum( (balance.pred==TRUE) & damaging==TRUE)/sum(damaging==TRUE)])
```

```
#fpr
(edits[anon==TRUE, sum((balance.pred==TRUE) & (damaging==FALSE))/sum(damaging==FALSE)])
(edits[anon==FALSE, sum((balance.pred==TRUE) & damaging==FALSE)/sum(damaging==FALSE)])
```

```
#tnr
(edits[anon==TRUE, sum( (balance.pred==FALSE) & (damaging==FALSE) )/sum(damaging==FALSE)])
(edits[anon==FALSE,sum( (balance.pred==FALSE) & damaging==FALSE)/sum(damaging==FALSE)])
```

```
#fnr
(edits[anon==TRUE, sum((balance.pred==FALSE) & (damaging==TRUE))/sum(damaging==TRUE)])
(edits[anon==FALSE, sum((balance.pred==FALSE) & damaging==TRUE)/sum(damaging==TRUE)])
```

The next question is whether the balanced predictor is calibrated. What do you expect?

```
## check if the classifier is calibrated. No way!
edits[,.(Predicted=mean(balance.pred), True=mean(damaging)), by=c("anon")]
```

```
idx <- sample.int(n,100)
samp <- edits[idx]
samp2 <- edits[idx]
samp[anon==TRUE,threshhold := 0.5]
samp[anon==FALSE,threshhold := 0.5]
samp[ (damaging==TRUE) & (calibration.pred ==TRUE), type:="True Positive"]
samp[(damaging==FALSE) & (calibration.pred ==TRUE), type:="False Positive"]
samp[(damaging==TRUE) & (calibration.pred ==FALSE), type:="False Negative"]
samp[(damaging==FALSE) & (calibration.pred ==FALSE),type:="True Negative"]
samp[,type:=factor(type,levels = c("True Positive","True Negative","False Positive","False Negative"))]
samp[,model:="calibration"]
```

```
samp2[anon==TRUE,threshhold := t.anon]
samp2[anon==FALSE,threshhold := t.nonanon]
samp2[ (damaging==TRUE) & (balance.pred ==TRUE), type:="True Positive"]
samp2[(damaging==FALSE) & (balance.pred ==TRUE), type:="False Positive"]
samp2[(damaging==TRUE) & (balance.pred ==FALSE), type:="False Negative"]
samp2[(damaging==FALSE) & (balance.pred ==FALSE),type:="True Negative"]
samp2[,type:=factor(type,levels = c("True Positive","True Negative","False Positive","False Negative"))]
samp2[,model:="balance"]
samp = rbind(samp,samp2)
samp[,model:=factor(model,levels=c("calibration","balance"))]
```

```
my_labeller = as_labeller(c('FALSE' = "Not Anon", "TRUE" = "Anon","balance"="Balance",'calibration'="Calibration"))
```

```
ggplot(samp, aes(x=X,y=p.calibration,color=type)) + geom_point(alpha=0.7) + geom_hline(data=samp,aes(yintercept = threshhold)) + facet_grid(anon~model, labeller=my_labeller) + scale_color_brewer("",palette = 'Set1') + ylab("Predicted probability") + ggtitle("Illustrating calibration vs balance")
```

```
balance.rates = edits[,mean(balance.pred),by=c('damaging','anon')]
balance.rates[,level := 'Balanced model']
calibration.rates = edits[,mean(calibration.pred),by=c('damaging','anon')]
calibration.rates[,level:='Calibrated model']
true.rates = edits[,mean(p_damaging),by=c('damaging','anon')]
true.rates[,level:='True model']
dt <- rbind(balance.rates, calibration.rates, true.rates)
ggplot(dt,aes(x=damaging==TRUE,color=anon,group=anon,y=V1)) + geom_point() + facet_wrap(.~level) + xlab("Damaging") + ylab("Probability of predicting damage")
```

```
(acc_calib <- edits[,mean(calibration.pred == (damaging==TRUE))])
(acc_trybal <- edits[,mean((p.try_balance > 0.5) == (damaging==TRUE))])
(acc_bal <- edits[,mean(balance.pred == (damaging == TRUE))])
```


```
(recall.calib <- edits[,sum(calibration.pred == TRUE & damaging == TRUE)/sum(damaging==TRUE)])
(precision.calib <- edits[,sum(calibration.pred == TRUE & damaging == TRUE)/sum(calibration.pred==TRUE)])
```

```
(recall.balance <- edits[,sum(balance.pred == TRUE & damaging == TRUE)/sum(damaging==TRUE)])
(precision.balance <- edits[,sum(balance.pred == TRUE & damaging == TRUE)/sum(balance.pred==TRUE)])
```

```
(recall.try_balance <- edits[,sum( (p.try_balance > 0.5) & damaging == TRUE)/sum(damaging==TRUE)])
(precision.try_balance <- edits[,sum( (p.try_balance > 0.5) & damaging == TRUE)/sum(p.try_balance > 0.5)])
```

Even though in this simulated data anons were three times as likely to make damaging edits as non-anons, balancing the model only costs about 5 percentage points of accuracy. Moreover, the balanced model has better accuracy than the model that ignores that anons exist! Similarly, balancing the model takes a substantial hit to precision and recall, but ignoring the very informative fact that the editor is anonymous makes things even worse.

Choosing the point where the ROC curves for the two groups intersect is a good way to pick thresholds that balance the model, but this comes at the cost of calibration. Removing the anon variable from the model is another way to compromise between balance and calibration, but potentially at the cost of accuracy (in this exercise the cost in accuracy was quite high, though if we increase the effect of **X** enough it will not matter much).

However, total balance and total calibration can be thought of as boundaries that define the space of possible tradeoffs between these two notions of fairness. Wikipedians might want to make a principled compromise between them. One way to do this is to choose different thresholds for anons and for non-anons that do not achieve total balance, but that preserve more calibration. There are also good approaches based on adding constraints to the model (e.g. via KKT conditions) that carefully penalize deviations from balance and calibration.
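To make that last idea concrete, here's a minimal sketch (my own, hypothetical) of one family of compromises: linearly interpolate each group's threshold between the single calibrated threshold (0.5) and the balancing thresholds found above, so that lambda = 0 recovers the calibrated classifier and lambda = 1 the balanced one.

```r
# Interpolate per-group thresholds between the calibrated choice (0.5 for
# both groups) and the balancing choices (t_anon, t_nonanon).
compromise_thresholds <- function(t_anon, t_nonanon, lambda) {
  c(anon    = (1 - lambda) * 0.5 + lambda * t_anon,
    nonanon = (1 - lambda) * 0.5 + lambda * t_nonanon)
}

# lambda = 0 -> calibrated thresholds; lambda = 1 -> balancing thresholds
compromise_thresholds(0.77, 0.55, 0)
compromise_thresholds(0.77, 0.55, 1)
```

Sweeping lambda over (0, 1) and recomputing the calibration and balance checks above would trace out the frontier between the two notions.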