#+TITLE: Probability and Games: Loot Drops #+begin_export html #+end_export #+begin_note Status: just some notes until I'm ready to work on this page #+end_note * Outline - new :PROPERTIES: :CUSTOM_ID: outline-new :END: {{ Types of problems I want to cover: N->k or k->N ; counting k vs spacing of drops ; infinite vs finite sequence. We start with simple code but then look at behavior, and behavior can be bad. Let's find some other code that gives us better behavior. We start with code and end with behavior but maybe we should be designing backwards, by starting with behavior and ending with code. }} ** Probability 1 :PROPERTIES: :CUSTOM_ID: probability-1 :END: Let's look at a random event: if you open a treasure chest there's a chance $p$ of finding a gem, where $p$ is between 0.0 (0% chance) and 1.0 (100% chance). If you open $N=10$ chests with probability $p=0.2$ (20% chance), how many gems $k$ will you find? #+begin_export html
#+end_export { *TODO*: add a /count/ of gems, and then animate the count into the histogram } { *TODO*: when you've added enough sequences, add a button to add 100 sequences at a time } Each player will have a different experience. Some will get 0 gems; some will get 5. The /histogram/ shows how many players will have each type of experience. The way to read it is to pick the number on the x-axis that represents the number of gems received, and then read the value from the y-axis that represents how many players experienced that. For example, ??13%?? of players will receive 4 gems out of 10 chests. ** Statistics 1 :PROPERTIES: :CUSTOM_ID: statistics-1 :END: In this scenario each random event is /independent/ of all the others. The probability $p$ of a gem being in one chest is the same as the probability of it being in any other chest. It has a simple implementation: #+begin_src python def roll(): if random.uniform(0, 1) < 0.2: return 'gem' else: return 'empty' gems = 0 chests = 0 while chests < 10: chests += 1 if roll() == 'gem': gems += 1 #+end_src Instead of running our simulation thousands of times, we can calculate the histogram exactly. Let's call $p$ the probability that you'll find a gem in a chest. Let's call $X$ the number of gems we find in $N$ chests. Then the probability that there will be exactly $k$ gems is called a /binomial distribution/ (in the special case of 1 chest, it's also called a /bernoulli distribution/): \[ P(X=k) = {N\choose k} p^k (1-p)^{N-k} \] #+begin_export html

Let's see what it looks like with N = chests and p = chance of finding a gem. How many gems k do we find?

#+end_export Play with this. What's the range of $k$? It can be as little as $0$ drops and as much as $N$ drops. What's the most common outcome? It will be $k = N p$. How much variation is there? Try doubling the number of chests, but keep the probability the same. Notice that the variation decreases. { *TODO*: figure out better chart scaling, because right now as you drag p around, it rescales and you /completely miss/ that p is affecting the peak -- maybe pick a scale range based on p=0.02, and maybe average the calculated max with the desired max so that you get some rescaling } ** Probability 2 :PROPERTIES: :CUSTOM_ID: probability-2 :END: In some situations you know how many chests you're going to open, and want to know how many gems you're going to get. In other situations you know how many gems you need, and want to know how many chests you have to open. In the code, all I've done is change =chests < 10= to =gems < 5=: #+begin_src python def roll(): if random.uniform(0, 1) < 0.2: return 'gem' else: return 'empty' gems = 0 chests = 0 while gems < 5: chests += 1 if roll() == 'gem': gems += 1 #+end_src { *TODO* make a diagram where you can click to generate a new random sample k=5, p=0.2; N is added to histogram; after several clicks, add a button to generate more of samples, and change histogram from discrete units to bars, and add a button to generate lots more, and then reveal this text: } Each player will have a different experience. Lucky players will get all the gems opening only $N=5$ chests; unlucky players might have to open $N=20$ or more. This histogram shows how many players will have each type of experience. The way to read it is to pick the number on the x-axis that represents the number of chests opened, and then read the value from the y-axis that represents how many players experienced that. For example, ??13%?? of players had to open 10 chests to get their 5 gems. ** Statistics 2 :PROPERTIES: :CUSTOM_ID: statistics-2 :END: Instead of running our simulation thousands of times, we can calculate the histogram exactly. As before, $p$ is the probability that you'll find a gem in a chest. Let's call $X$ the number of chests we need to open to get $k$ gems. Then the probability that you have to open $N$ chests is called a /negative binomial distribution/ (in the special case of one gem, it's also called a /geometric distribution/): \[ P(X=N) = {N-1\choose k-1}p^k (1-p)^{N-k} \] #+begin_export html

Let's see what it looks like for k = gems and p = chance of finding a gem. How many chests N do we open?

#+end_export Play with this. What's the range of $N$? It can be as little as $k$ chests and as much as $\infty$ chests. What's the most common outcome? It will be $N = k/p$. How much variation is there? ** Problem :PROPERTIES: :CUSTOM_ID: problem :END: The big danger here is that $N$ has no limit. Players have a chance of having an extremely high $N$, which will lead to an unpleasant playing experience. If there are lots of players, then it's likely that /some/ players will have a bad experience. Let's try to quantify this. How many players will have to open at most $j$ chests? \[ \sum\limits_{N=k}^{j} P(X=N) \] { *TODO*: diagram showing cumulative distribution } For example, even though the most common scenario is that the player has to open 10 chests, 3% of players have to open more than 50 chests. Ugh. Can we do better? Yes. But it means we have to change something about the way we're picking random numbers. ** Probability 3 :PROPERTIES: :CUSTOM_ID: probability-3 :END: One option is to /cap/ the number of chests needed. How can we do this? Here's one way to think about it: Let's say we want a cap of $j=20$ treasure chests, and you want to find $k=5$ gems. Let's make a set of chests and randomly shuffle them: { *TODO* diagram showing gems and chests, with shuffle button, and show how many chests have to be opened, with histogram } ** Statistics 3 :PROPERTIES: :CUSTOM_ID: statistics-3 :END: How many chests $N$ have to be opened? There's a hard cap of 20, but most of the time you have to open close to 20 before you get 5 gems. Is this related to hypergeometric? It seems so but I'm not sure how. { *TODO* diagram with controls for p, j, k } How much variance is there? It's a lot more when $k$ is a lot smaller than $j$. ** Implementation 3 :PROPERTIES: :CUSTOM_ID: implementation-3 :END: We could implement this system by randomly shuffling an array: #+begin_src python outcomes = ['gem'] * k + ['empty'] * (j-k) random.shuffle(outcomes) def roll(): return outcomes.pop() #+end_src Those of you have taken a probability course have seen this as "sampling without replacement" which is slightly different from "sampling with replacement". However, this requires us to keep a potentially large array. Can we do better? Yes. Instead of deciding ahead of time what each chest contains, we can keep track of how many gems need to be in the remaining chests, and decide on gem placement one chest at a time. #+begin_src python remaining_chests = j remaining_gems = k def roll(): global remaining_chests, remaining_gems p = remaining_gems / remaining_chests if random.random() < p: outcome = 'gem' remaining_gems -= 1 else: outcome = 'chest' remaining_chsts -= 1 return outcome #+end_src The probability that /this/ chest contains a gem depends on how many gems and chests remain. This approach generates the same distribution of events as the randomly shuffled array, but we have to keep only two numbers. ** Better implementation :PROPERTIES: :CUSTOM_ID: better-implementation :END: Looking at the last implementation, when you open a chest and don't find a gem, =remaining_gems= stays the same but =remaining_chests= decreases, so the probability of a gem increases for the next chest. If you do find a gem, the probability decreases. Eventually, if you have enough bad luck, the probability will reach 100%. The probability of one chest having a gem is no longer /independent/ of another chest having a gem. This strategy could work in general. We can increase the probability over time, so that if you're having a streak of bad luck, the increased probability can help you get out. Are there other ways of changing the probability based on the kind of luck you've had? ** Spacing - a whole 'nuther topic :PROPERTIES: :CUSTOM_ID: spacing-a-whole-nuther-topic :END: So far we've only been looking at /how many/ gems drop in $N$ chests, and /how many/ chests are needed to get $k$ gems. For game design, we also want to consider how they're distributed. Both of these sequences produce 3 gems in 10 chests but they're probably not equally fun: #+begin_export html


#+end_export Which of these is more fun? Will the player /enjoy/ the game less knowing that there's a nonuniform probability? Is there a /fair/ way to have the player experience more fun sequences? Do we need to /explain/ the nonuniformity in the game? If so, is there an in-game mechanism that would produce these sequences? For example, if you want more loot drops towards the end of the sequence, maybe the player's character is getting better at spotting treasures. If you want more loot drops towards the beginning of the sequence, maybe the monsters are getting better at hiding treasures. ** Repeating shuffles :PROPERTIES: :CUSTOM_ID: repeating-shuffles :END: If you have a never-ending selection, such as picking music from your music library, the shuffle approach runs into a problem: you might by chance shuffle the array [A, B, Z] and then next time [Z, B, A], so you play Z twice in a row. Another approach is to use /filtered randomness/, where you generate a sequence but then you remove things that you don't want. https://engineering.atspotify.com/2014/02/how-to-shuffle-songs/ #+begin_example TODO Sampling with and without replacement Hybrid with/out replacement with delay D D = 0 means with replacement D = max means without replacement array { a b c d e f g h } put a partition at array[D] pick random number, put it on the right partition cycle through right partition somehow (it's a circular queue) why not just two arrays? could be easier, except for the initial state at D = 0, D = max, does it become what we want? what are the properties of this? #+end_example * System libraries :PROPERTIES: :CUSTOM_ID: system-libraries :END: - [[https://en.cppreference.com/w/cpp/numeric/random][C++ ]] - [[https://www.johndcook.com/blog/distributions_scipy/][Python scipy/random]] * Random links I should check :PROPERTIES: :CUSTOM_ID: random-links-i-should-check :END: - https://www.wolframalpha.com/input/?i=binomial+n%3D100+p%3D0.1 - https://www.wolframalpha.com/input/?i=negative+binomial+n%3D5+p%3D0.1 bernoulli process, does a single line of code end up with such a complex distribution? is this the right generalization from geometric distributions? lots to learn before I can write this section, and amazing that it's rarely ever explained in the context of games. - https://www.wowhead.com/item=71141/eternal-ember#comments:id=1457924 it's mentioned in comments, rarely - https://www.mmo-champion.com/threads/877483-World-BoE-Epics - http://forums.kingdomofloathing.com/vb/showthread.php?t=183137 average of 20 but the distribution is asymmetric and sometimes it could take a long time. if you have millions of players, playing millions of times, some of them will have bad luck and end up with a poor experience path of exile completely evenly spaces evasion (instead of doing something random) http://www.pathofexile.com/forum/view-thread/11707/filter-account-type/staff/page/10#p748465 use diagrams to explain how it works, let people change parameters, etc. then ask: what characteristics do you want for the game? in addition to number of drops, do you want those drops to be early or late in the process? {TO investigate: WoW:LK does something -- what?? -- to tweak the distribution to cut off the tail ; need to find out if https://www.shacknews.com/article/57886/blizzard-details-secret-world-of progressive drop rates are explained in detail somewhere; even if we don't have details of what they did I can use a linear interpolation of drop rates from first kill to Nth kill, where N is the design limit on number of kills} DOTA uses https://dota2.fandom.com/wiki/Random_Distribution for damage rolls, it might be nice to have a stacked bar chart to show multiple distributions -- for example, in the crit diagram, show which of them resulted from a crit Also look at http://designdeeply.blogspot.com/2006/09/swayed-random-number-generation.html and http://cbloomrants.blogspot.com/2010/09/09-06-10-weighteddecayrand.html and https://devblogs.microsoft.com/shawnhar/the-psychology-of-randomness.aspx http://www.shadowtiger.com/website/d3randomness.html ; and Mark Ivey sends me: https://www.gamedeveloper.com/pc/gdc-sid-meier-s-lessons-on-gamer-psychology and https://gamebalanceconcepts.wordpress.com/2010/08/04/level-5-probability-and-randomness-gone-horribly-wrong/ and https://engineering.atspotify.com/2014/02/how-to-shuffle-songs/ https://stackoverflow.com/questions/910215/need-for-predictable-random-generator or randomize the number of 0s between 1s (poisson?) https://gamedev.stackexchange.com/questions/201805/looking-for-a-middle-ground-between-raw-random-and-shuffle-bags May also find discussion on https://gamedev.stackexchange.com/questions/60329/why-do-loot-drops-contain-mostly-useless-items to be interesting Discussion on https://gamedev.stackexchange.com/questions/95675/how-can-i-make-a-random-generator-that-is-biased-by-prior-events XCOM2 just shows the player a different number than what they actually use (!) https://www.gamedeveloper.com/design/jake-solomon-explains-the-careful-use-of-randomness-in-i-xcom-2-i- Also see https://www.johndcook.com/blog/2010/06/14/generating-poisson-random-values/ Also see https://www.gearboxsoftware.com/2013/09/inside-the-box-the-borderlands-2-loot-system/ Probabilistic programming language treats random() as a distribution generator instead of returning only one result http://www.pl-enthusiast.net/2014/09/08/probabilistic-programming/ - https://www.gdcvault.com/play/1014496/Using-Randomness-in-AI-Both - https://xax.dreamwidth.org/81019.html - http://www.fdg2015.org/papers/fdg2015_paper_05.pdf - https://www.keithschwarz.com/darts-dice-coins/ - https://simon.lc/the-history-of-tetris-randomizers - different versions of Tetris tried different approaches to (a) making sure the same piece doesn't occur several times in a row, and (b) making sure certain pieces occur often enough. The later versions use a shuffled bag of 7 or 14 pieces. - https://old.reddit.com/r/roguelikedev/comments/d34qu5/tetris_randomizer/f02eat0/ - Cogmind switched to a shuffle system - https://en.m.wikipedia.org/wiki/Reservoir_sampling - you want to fairly sample N of an unknown M items, but you can't know what M is or go back through elements you've already seen #+begin_export html