Probability and Games: Loot Drops

 from Red Blob Games
DRAFT

Status: just some notes until I’m ready to work on this page

 1  Outline - new#

{{ Types of problems I want to cover: N->k or k->N ; counting k vs spacing of drops ; infinite vs finite sequence. We start with simple code but then look at behavior, and behavior can be bad. Let’s find some other code that gives us better behavior. We start with code and end with behavior but maybe we should be designing backwards, by starting with behavior and ending with code. }}

 1.1. Probability 1

Let’s look at a random event: if you open a treasure chest there’s a chance \(p\) of finding a gem, where \(p\) is between 0.0 (0% chance) and 1.0 (100% chance). If you open \(N=10\) chests with probability \(p=0.2\) (20% chance), how many gems \(k\) will you find?

{ TODO: add a count of gems, and then animate the count into the histogram }

{ TODO: when you’ve added enough sequences, add a button to add 100 sequences at a time }

Each player will have a different experience. Some will get 0 gems; some will get 5. The histogram shows how many players will have each type of experience. The way to read it is to pick the number on the x-axis that represents the number of gems received, and then read the value from the y-axis that represents how many players experienced that. For example, ??13%?? of players will receive 4 gems out of 10 chests.

 1.2. Statistics 1

In this scenario each random event is independent of all the others. The probability \(p\) of a gem being in one chest is the same as the probability of it being in any other chest. It has a simple implementation:

def roll():
    if random.uniform(0, 1) < 0.2:
        return 'gem'
    else:
        return 'empty'

gems = 0
chests = 0
while chests < 10:
    chests += 1
    if roll() == 'gem':
        gems += 1

Instead of running our simulation thousands of times, we can calculate the histogram exactly. Let’s call \(p\) the probability that you’ll find a gem in a chest. Let’s call \(X\) the number of gems we find in \(N\) chests. Then the probability that there will be exactly \(k\) gems is called a binomial distribution (in the special case of 1 chest, it’s also called a bernoulli distribution):

\[ P(X=k) = {N\choose k} p^k (1-p)^{N-k} \]

Let’s see what it looks like with N = chests and p = chance of finding a gem. How many gems k do we find?

Play with this. What’s the range of \(k\)? It can be as little as \(0\) drops and as much as \(N\) drops. What’s the most common outcome? It will be \(k = N p\). How much variation is there? Try doubling the number of chests, but keep the probability the same. Notice that the variation decreases.

{ TODO: figure out better chart scaling, because right now as you drag p around, it rescales and you completely miss that p is affecting the peak – maybe pick a scale range based on p=0.02, and maybe average the calculated max with the desired max so that you get some rescaling }

 1.3. Probability 2

In some situations you know how many chests you’re going to open, and want to know how many gems you’re going to get. In other situations you know how many gems you need, and want to know how many chests you have to open. In the code, all I’ve done is change chests < 10 to gems < 5:

def roll():
    if random.uniform(0, 1) < 0.2:
        return 'gem'
    else:
        return 'empty'

gems = 0
chests = 0
while gems < 5:
    chests += 1
    if roll() == 'gem':
        gems += 1

{ TODO make a diagram where you can click to generate a new random sample k=5, p=0.2; N is added to histogram; after several clicks, add a button to generate more of samples, and change histogram from discrete units to bars, and add a button to generate lots more, and then reveal this text: }

Each player will have a different experience. Lucky players will get all the gems opening only \(N=5\) chests; unlucky players might have to open \(N=20\) or more. This histogram shows how many players will have each type of experience. The way to read it is to pick the number on the x-axis that represents the number of chests opened, and then read the value from the y-axis that represents how many players experienced that. For example, ??13%?? of players had to open 10 chests to get their 5 gems.

 1.4. Statistics 2

Instead of running our simulation thousands of times, we can calculate the histogram exactly. As before, \(p\) is the probability that you’ll find a gem in a chest. Let’s call \(X\) the number of chests we need to open to get \(k\) gems. Then the probability that you have to open \(N\) chests is called a negative binomial distribution (in the special case of one gem, it’s also called a geometric distribution):

\[ P(X=N) = {N-1\choose k-1}p^k (1-p)^{N-k} \]

Let’s see what it looks like for k = gems and p = chance of finding a gem. How many chests N do we open?

Play with this. What’s the range of \(N\)? It can be as little as \(k\) chests and as much as \(\infty\) chests. What’s the most common outcome? It will be \(N = k/p\). How much variation is there?

 1.5. Problem

The big danger here is that \(N\) has no limit. Players have a chance of having an extremely high \(N\), which will lead to an unpleasant playing experience. If there are lots of players, then it’s likely that some players will have a bad experience. Let’s try to quantify this. How many players will have to open at most \(j\) chests?

\[ \sum\limits_{N=k}^{j} P(X=N) \]

{ TODO: diagram showing cumulative distribution }

For example, even though the most common scenario is that the player has to open 10 chests, 3% of players have to open more than 50 chests. Ugh.

Can we do better?

Yes. But it means we have to change something about the way we’re picking random numbers.

 1.6. Probability 3

One option is to cap the number of chests needed. How can we do this? Here’s one way to think about it:

Let’s say we want a cap of \(j=20\) treasure chests, and you want to find \(k=5\) gems. Let’s make a set of chests and randomly shuffle them:

{ TODO diagram showing gems and chests, with shuffle button, and show how many chests have to be opened, with histogram }

 1.7. Statistics 3

How many chests \(N\) have to be opened? There’s a hard cap of 20, but most of the time you have to open close to 20 before you get 5 gems.

Is this related to hypergeometric? It seems so but I’m not sure how.

{ TODO diagram with controls for p, j, k }

How much variance is there? It’s a lot more when \(k\) is a lot smaller than \(j\).

 1.8. Implementation 3

We could implement this system by randomly shuffling an array:

outcomes = ['gem'] * k + ['empty'] * (j-k)
random.shuffle(outcomes)

def roll():
    return outcomes.pop()

Those of you have taken a probability course have seen this as “sampling without replacement” which is slightly different from “sampling with replacement”.

However, this requires us to keep a potentially large array. Can we do better? Yes. Instead of deciding ahead of time what each chest contains, we can keep track of how many gems need to be in the remaining chests, and decide on gem placement one chest at a time.

remaining_chests = j
remaining_gems = k

def roll():
    global remaining_chests, remaining_gems
    
    p = remaining_gems / remaining_chests
    if random.random() < p:
        outcome = 'gem'
        remaining_gems -= 1
    else:
        outcome = 'chest'
    
    remaining_chsts -= 1
    return outcome

The probability that this chest contains a gem depends on how many gems and chests remain. This approach generates the same distribution of events as the randomly shuffled array, but we have to keep only two numbers.

 1.9. Better implementation

Looking at the last implementation, when you open a chest and don’t find a gem, remaining_gems stays the same but remaining_chests decreases, so the probability of a gem increases for the next chest. If you do find a gem, the probability decreases. Eventually, if you have enough bad luck, the probability will reach 100%. The probability of one chest having a gem is no longer independent of another chest having a gem.

This strategy could work in general. We can increase the probability over time, so that if you’re having a streak of bad luck, the increased probability can help you get out. Are there other ways of changing the probability based on the kind of luck you’ve had?

 1.10. Spacing - a whole ‘nuther topic

So far we’ve only been looking at how many gems drop in \(N\) chests, and how many chests are needed to get \(k\) gems. For game design, we also want to consider how they’re distributed. Both of these sequences produce 3 gems in 10 chests but they’re probably not equally fun:



Which of these is more fun? Will the player enjoy the game less knowing that there’s a nonuniform probability? Is there a fair way to have the player experience more fun sequences? Do we need to explain the nonuniformity in the game? If so, is there an in-game mechanism that would produce these sequences? For example, if you want more loot drops towards the end of the sequence, maybe the player’s character is getting better at spotting treasures. If you want more loot drops towards the beginning of the sequence, maybe the monsters are getting better at hiding treasures.

 1.11. Repeating shuffles

If you have a never-ending selection, such as picking music from your music library, the shuffle approach runs into a problem: you might by chance shuffle the array [A, B, Z] and then next time [Z, B, A], so you play Z twice in a row. Another approach is to use filtered randomness, where you generate a sequence but then you remove things that you don’t want.

https://engineering.atspotify.com/2014/02/how-to-shuffle-songs/[1]

TODO Sampling with and without replacement

Hybrid with/out replacement with delay D

D = 0 means with replacement
D = max means without replacement

array { a b c d e f g h }
put a partition at array[D]
pick random number, put it on the right partition
cycle through right partition somehow (it's a circular queue)
why not just two arrays? could be easier, except for the initial state

at D = 0, D = max, does it become what we want?

what are the properties of this?

 2  System libraries#

bernoulli process, does a single line of code end up with such a complex distribution? is this the right generalization from geometric distributions? lots to learn before I can write this section, and amazing that it’s rarely ever explained in the context of games.

average of 20 but the distribution is asymmetric and sometimes it could take a long time. if you have millions of players, playing millions of times, some of them will have bad luck and end up with a poor experience

path of exile completely evenly spaces evasion (instead of doing something random) http://www.pathofexile.com/forum/view-thread/11707/filter-account-type/staff/page/10#p748465[9]

use diagrams to explain how it works, let people change parameters, etc. then ask: what characteristics do you want for the game? in addition to number of drops, do you want those drops to be early or late in the process?

{TO investigate: WoW:LK does something – what?? – to tweak the distribution to cut off the tail ; need to find out if https://www.shacknews.com/article/57886/blizzard-details-secret-world-of[10] progressive drop rates are explained in detail somewhere; even if we don’t have details of what they did I can use a linear interpolation of drop rates from first kill to Nth kill, where N is the design limit on number of kills}

DOTA uses https://dota2.fandom.com/wiki/Random_Distribution[11]

for damage rolls, it might be nice to have a stacked bar chart to show multiple distributions – for example, in the crit diagram, show which of them resulted from a crit

Also look at http://designdeeply.blogspot.com/2006/09/swayed-random-number-generation.html[12] and http://cbloomrants.blogspot.com/2010/09/09-06-10-weighteddecayrand.html[13] and https://devblogs.microsoft.com/shawnhar/the-psychology-of-randomness.aspx[14]http://www.shadowtiger.com/website/d3randomness.html[15] ; and Mark Ivey sends me: https://www.gamedeveloper.com/pc/gdc-sid-meier-s-lessons-on-gamer-psychology[16] and https://gamebalanceconcepts.wordpress.com/2010/08/04/level-5-probability-and-randomness-gone-horribly-wrong/[17] and https://engineering.atspotify.com/2014/02/how-to-shuffle-songs/[18]

https://stackoverflow.com/questions/910215/need-for-predictable-random-generator[19] or randomize the number of 0s between 1s (poisson?)

https://gamedev.stackexchange.com/questions/201805/looking-for-a-middle-ground-between-raw-random-and-shuffle-bags[20]

May also find discussion on https://gamedev.stackexchange.com/questions/60329/why-do-loot-drops-contain-mostly-useless-items[21] to be interesting

Discussion on https://gamedev.stackexchange.com/questions/95675/how-can-i-make-a-random-generator-that-is-biased-by-prior-events[22]

XCOM2 just shows the player a different number than what they actually use (!) https://www.gamedeveloper.com/design/jake-solomon-explains-the-careful-use-of-randomness-in-i-xcom-2-i[23]-

Also see https://www.johndcook.com/blog/2010/06/14/generating-poisson-random-values/[24]

Also see https://www.gearboxsoftware.com/2013/09/inside-the-box-the-borderlands-2-loot-system/[25]

Probabilistic programming language treats random() as a distribution generator instead of returning only one result http://www.pl-enthusiast.net/2014/09/08/probabilistic-programming/[26]