add multinomial docu
bvenn committed Feb 23, 2024
1 parent cea81b7 commit 557ee8a
Showing 2 changed files with 125 additions and 12 deletions.
62 changes: 62 additions & 0 deletions docs/Distributions.fsx
@@ -43,6 +43,7 @@ _Summary:_ this tutorial shows how to use the various types of probability distr
- [Discrete](#Discrete)
- [Bernoulli distribution](#Bernoulli-distribution)
- [Binomial distribution](#Binomial-distribution)
- [Multinomial distribution](#Multinomial-distribution)
- [Hypergerometric distribution](#Hypergerometric-distribution)
- [Poisson distribution](#Poisson-distribution)
- [Gamma distribution](#Gamma-distribution)
@@ -705,6 +706,67 @@ cdfComparison |> GenericChart.toChartHTML
(**
You can clearly see that the CDFs are shifted according to the number of successes, because $failures (k) = trials (x) - successes (r)$.
### Multinomial distribution
The multinomial distribution is a generalization of the binomial distribution. While the binomial distribution describes the probability of a single success state, the multinomial distribution
deals with the exact counts of several success states at once.
Example: There are people from 3 different towns: _3 from town A_, _7 from town B_, and _20 from town C_. When 5 people are chosen at random (with replacement, so that the probabilities stay constant), what is the probability of obtaining exactly _1 from town A_,
_1 from town B_, and _3 from town C_? The individual success probabilities follow directly from the group sizes: _p(A)=3/30_, _p(B)=7/30_, and _p(C)=20/30_.
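For reference, the probability asked for in this example can be written out by hand. The formula below is the standard textbook form of the multinomial PMF, not a quote from the library source; the symbols $n$ (number of draws), $k_i$ (count per town), and $p_i$ (probability per town) are introduced here for illustration only:

$$
P(k_1, \dots, k_m) = \frac{n!}{k_1! \cdots k_m!} \prod_{i=1}^{m} p_i^{k_i}
$$

$$
P(1, 1, 3) = \frac{5!}{1! \, 1! \, 3!} \cdot \frac{3}{30} \cdot \frac{7}{30} \cdot \left(\frac{20}{30}\right)^{3} = 20 \cdot \frac{14}{2025} = \frac{56}{405} \approx 0.1383
$$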
*)

// probabilities for all individual success states
let multiNomProb = vector [(3./30.); (7./30.); (20./30.)]

// the success combination that is of interest
let multiNomKs = Vector.Generic.ofList [1; 1; 3]

// gives the probability of obtaining exactly the pattern 1,1,3
let mNom = Discrete.Multinomial.PMF multiNomProb multiNomKs


(***hide***)
(sprintf "The probability to choose 1 person from town A, 1 from B, and 3 from C is: %.4f" mNom)
(***include-it-raw***)

(**
**Relation to binomial distribution**
Input vectors of length 2 correspond to the binomial distribution, so the following two calls are equivalent up to floating point error. While the binomial PMF takes the total number of trials (n), the
multinomial PMF requires the complementary probability (1 - p) and the number of failures (n - k) as second vector entries:
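Why both forms agree can be seen by writing out the two-category case. This is a sketch using the textbook formulas, with $k_1 = k$, $k_2 = n - k$, $p_1 = p$, and $p_2 = 1 - p$ as illustrative symbols:

$$
\frac{n!}{k_1! \, k_2!} \, p_1^{k_1} \, p_2^{k_2} = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}
$$

For the example values this reads $n = 200$, $k = 20$, and $p = 0.123$, hence the second vector entries $1 - p = 0.877$ and $n - k = 180$.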
*)

let mNom_bin_A = (Discrete.Binomial.PMF 0.123 200 20)
let mNom_bin_B = Discrete.Multinomial.PMF (vector [|0.123; 0.877|]) (Vector.Generic.ofArray [|20; 180|])

mNom_bin_A //0.0556956956889893
mNom_bin_B //0.0556956956889898


(**
A cumulative distribution function (CDF) is not well defined here, because it is ambiguous which success state is of interest. If you want to know how probable it is to see at least 3
people from town C, you have to sum the probabilities of all possible combinations that result in this constellation:
```
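Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [1;1;3])
Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [2;0;3])
Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [0;2;3])
Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [1;0;4])
Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [0;1;4])
Discrete.Multinomial.PMF multiNomProb (Vector.Generic.ofList [0;0;5])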
```
The cumulative probability is 0.7901234.
A dedicated CDF function may be added in the future that takes the index of the success state of interest and sums the probabilities of all matching combinations.
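The summation described above can be sketched without listing the combinations by hand. The following is a minimal, self-contained illustration and not the library's implementation: `factorial`, `multinomialPmf`, `townProbs`, `draws`, and `atLeastThreeFromC` are hypothetical names, and the PMF is re-implemented from the textbook formula so that the snippet runs without a package reference.

```fsharp
// Hypothetical helper (not part of FSharp.Stats): n! as a float
let factorial (n: int) : float =
    Seq.fold (fun acc i -> acc * float i) 1.0 (seq { 1 .. n })

// Hypothetical helper: textbook multinomial PMF, n! / (k1! ... km!) times the product of p_i^k_i
let multinomialPmf (probs: float list) (counts: int list) : float =
    let n = List.sum counts
    let coefficient =
        counts |> List.fold (fun acc k -> acc / factorial k) (factorial n)
    List.fold2 (fun acc p k -> acc * pown p k) coefficient probs counts

// probabilities for town A, B, and C, as in the example
let townProbs = [3. / 30.; 7. / 30.; 20. / 30.]
let draws = 5

// every [a; b; c] with a + b + c = draws and at least 3 people from town C
let atLeastThreeFromC =
    [ for c in 3 .. draws do
        for a in 0 .. draws - c do
            yield [a; draws - c - a; c] ]

// sum of the PMF over all matching combinations
let pAtLeastThreeFromC =
    atLeastThreeFromC |> List.sumBy (multinomialPmf townProbs)

printfn "%d combinations, P(C >= 3) = %.4f" atLeastThreeFromC.Length pAtLeastThreeFromC
```

The enumeration yields the same six combinations that are listed above, and the sum should agree with the quoted cumulative probability up to rounding. With the library at hand, `multinomialPmf` could presumably be swapped for `Discrete.Multinomial.PMF` after converting the lists to vectors.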
## Empirical
You can create empirically derived distributions and sample randomly from these.
75 changes: 63 additions & 12 deletions tests/FSharp.Stats.Tests/DistributionsDiscrete.fs
@@ -144,6 +144,7 @@ let binomialTests =

testList "Distributions.Discrete.Binominal" [
// Values taken from R 4.0.3
// dbinom(k,n,p)
testCase "Parameters" <| fun () ->
let param =
match (Discrete.Binomial.Init 0.1 3).Parameters with
@@ -233,12 +234,28 @@ let binomialTests =

testCase "Binomial.PMF" <| fun () ->
let testCase = Discrete.Binomial.PMF 0.69 420 237
let r_value = 4.064494e-08
let r_value = 1.741364593002809e-08
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.PMF with n=420, p=0.69 and k=237 does not equal the expectd 4.064494e-08"
"Binomial.PMF with n=420, p=0.69 and k=237 does not equal the expected 1.741364593002809e-08"

let testCase2 = Discrete.Binomial.PMF 0.5 10 5
let r_value2 = 0.2460937500001213
Expect.floatClose
Accuracy.medium
testCase2
r_value2
"Binomial.PMF with n=10, p=0.5 and k=5 does not equal the expected 0.2460937500001213"

let testCase3 = Discrete.Binomial.PMF 0.123 200 20
let r_value3 = 0.0556956956889893
Expect.floatClose
Accuracy.high
testCase3
r_value3
"Binomial.PMF with n=200, p=0.123 and k=20 does not equal the expected 0.0556956956889893"

testCase "Binomial.PMF_n=0" <| fun () ->
let testCase = Discrete.Binomial.PMF 0.69 0 237
@@ -257,21 +274,37 @@ let binomialTests =
testCase
r_value
"Binomial.PMF with n=420, p=0.69 and k=-10 does not equal the expectd 0"

testCase "Binomial.CDF"<| fun () ->
let testCase = Discrete.Binomial.CDF 0.69 420 237
let r_value = 9.341312e-08
let r_value = 4.064494106136236e-08
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=420, p=0.69 and k=237 does not equal the expectd 9.341312e-08"
"Binomial.CDF with n=420, p=0.69 and k=237 does not equal the expected 4.064494106136236e-08"

let testCase2 = Discrete.Binomial.CDF 0.5 10 5
let r_value2 = 0.6230468749999999
Expect.floatClose
Accuracy.high
testCase2
r_value2
"Binomial.CDF with n=10, p=0.5 and k=5 does not equal the expected 0.6230468749999999"

let testCase3 = Discrete.Binomial.CDF 0.123 200 20
let r_value3 = 0.1901991220393886
Expect.floatClose
Accuracy.high
testCase3
r_value3
"Binomial.CDF with n=200, p=0.123 and k=20 does not equal the expected 0.1901991220393886"

testCase "Binomial.CDF_n=0"<| fun () ->
let testCase = Discrete.Binomial.CDF 0.69 0 237
let r_value = 1.
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=0, p=0.69 and k=237 does not equal the expectd 1."
@@ -280,7 +313,7 @@ let binomialTests =
let testCase = Discrete.Binomial.CDF 0.69 420 0
let r_value = 2.354569e-214
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=420, p=0.69 and k=0 does not equal the expectd 2.354569e-214"
@@ -289,7 +322,7 @@ let binomialTests =
let testCase = Discrete.Binomial.CDF 0.69 420 -10
let r_value = 0.
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=420, p=0.69 and k=-10 does not equal the expectd 0."
@@ -298,7 +331,7 @@ let binomialTests =
let testCase = Discrete.Binomial.CDF 0.69 420 (-infinity)
let r_value = 0.
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=420, p=0.69 and k=-infinity does not equal the expected 0."
@@ -307,7 +340,7 @@ let binomialTests =
let testCase = Discrete.Binomial.CDF 0.69 420 (infinity)
let r_value = 1.
Expect.floatClose
Accuracy.low
Accuracy.high
testCase
r_value
"Binomial.CDF with n=420, p=0.69 and k=infinity does not equal the expected 1."
@@ -395,6 +428,24 @@ let multinomialTests =
testCase3
1.
"Multinominal.PMF is incorrect"

let testCase4 = Discrete.Multinomial.PMF (vector [|0.5; 0.5|]) (Vector.Generic.ofArray [|5; 5|])
let r_value4 = 0.2460937500001213
Expect.floatClose
Accuracy.high
testCase4
r_value4
"Multinomial.PMF (vector [|0.5; 0.5|]) (Vector.Generic.ofArray [|5; 5|]) should result in Binomial.PMF 0.5 10 5"


let testCase5 = Discrete.Multinomial.PMF (vector [|0.123; 0.877|]) (Vector.Generic.ofArray [|20; 180|])
Expect.floatClose
Accuracy.high
testCase5
(Discrete.Binomial.PMF 0.123 200 20)
"Discrete.Multinomial.PMF (vector [|0.123; 0.877|]) (Vector.Generic.ofArray [|20; 180|]) should result in Discrete.Binomial.PMF 0.123 200 20"



testCase "Checks.pSum1" <| fun () ->
let prob2 = vector [0.1;0.3;0.5]