Draw a Circle Using Random Points

This is a post to introduce a couple of probability concepts that are useful for machine learning. To make it more interesting, I am mixing in some Monte Carlo simulation ideas too!

To see examples in Python we first need to introduce the concept of random numbers.

Stochastic vs deterministic numbers

The English word stochastic is an adjective describing something that was randomly determined. It originally came from Greek στόχος (stokhos), meaning 'aim, guess'.

On the other hand, deterministic means that the outcome – given the same input – will always be the same. There is no unpredictability.

Random number

Randomness is a big area by itself; for our scope it is enough to say that randomness is the lack of pattern or predictability in events. A random sequence of events therefore has no order and does not follow an intelligible pattern.

Individual random events are by definition unpredictable, but in many cases the frequency of different outcomes over a large number of events is predictable.
And this is what is interesting for us: if I throw a die with six faces thousands of times, how many times in percent shall I expect to see the face number six?

Let's see some practical examples with Python. As usual they are also available in a notebook.
We generate a (pseudo) random number in Python using the random library:

import random

def genEven():
    '''
    Returns a random even number x, where 0 <= x < 100
    '''
    return random.randrange(0, 100, 2)

genEven()
66
def stochasticNumber():
    '''
    Stochastically generates and returns a uniformly distributed even
    number between 9 and 21
    '''
    return random.randrange(10, 21, 2)

stochasticNumber()
12

And again:

stochasticNumber()          
14

In applications such as security, hardware generators are generally preferred over software algorithms.

A pseudo-random algorithm, like the Python random library above, is called pseudo because it is not really unpredictable: the sequence of random numbers generated depends on the initial seed; using the same number as seed will generate the same sequence.

This is very useful for debugging purposes, but it also means that one needs to be very careful when choosing the seed (for instance, choosing atmospheric signals or other noise).

We can use this property to generate deterministic numbers. If the seed is fixed, the number generated "randomly" will always be the same.

def deterministicNumber():
    '''
    Deterministically generates and returns an even number
    between 9 and 21
    '''
    random.seed(0)  # Fixed seed, always the same.
    return random.randrange(10, 21, 2)

deterministicNumber()
16

And again:

deterministicNumber()          
16

The same number!

Before looking at what the probability of an event is, we define a function that simulates the roll of a six-faced die:

def rollDie():
    """Returns a random int between 1 and 6"""
    return random.randint(1, 6)
    # an alternative: return random.choice([1,2,3,4,5,6])

rollDie()
4

Discrete Probability

If E represents an event, then P(E) is the probability that E will occur.
A probability is a frequency expressed as a fraction of the sample size:

P(E) = number of outcomes favourable to E / total number of outcomes

It is a real value between 0 and 1.

Example: a die has six faces: the space of the possibilities is six.
If I want to calculate the probability of getting the outcome 5 when rolling the die one time, I need to consider that there is one face with the number 5 on it, therefore there is only one possible favourable outcome:

P(5) = 1 / 6

This is the frequentist definition.
Probability can be seen as the long-run relative frequency with which an event occurs given many repeated trials.
Empirically, let's say I throw this die 1000 times and the face number five comes up 125 times. The frequency observed for face five is 125/1000 = 0.125
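As a quick sketch of this frequentist idea, we can simulate the 1000 throws ourselves (the seed here is an arbitrary choice for reproducibility; the count will differ from the 125 quoted above):

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Throw a die 1000 times and count how often face five comes up.
throws = [random.randint(1, 6) for _ in range(1000)]
freqFive = throws.count(5) / len(throws)
print(freqFive)
```

The observed frequency hovers around the theoretical 1/6 ≈ 0.167, and gets closer as the number of throws grows.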

An alternative is Bayesianism, which defines probability as a degree of belief that an event will occur. It depends on one's own state of knowledge and is therefore more subjective than frequentist probability.

By the way, the probability that an event does NOT occur is:
P(not A) = 1 – P(A)

The probability NOT to get a five in a die roll is therefore 1 – 1/6 = 5/6

Monte Carlo simulations

One of the most powerful techniques in any data scientist's tool belt is the Monte Carlo simulation. It's super flexible and extremely powerful, since it can be applied to nearly any situation as long as the problem can be stated probabilistically.

Monte Carlo methods are a class of methods that can be applied to computationally 'hard' problems to arrive at near-enough accurate answers. The general premise is remarkably simple:

  • Randomly sample input(s) to the problem
  • For each sample, compute an output
  • Aggregate the outputs to approximate the solution

The name refers to a famous casino in Monaco. It was coined in 1949 by one of the method's pioneers, Stanislaw Ulam. Ulam's uncle was reportedly a gambling man, and the connection between the element of 'chance' in gambling and in Monte Carlo methods must have been particularly apparent to Stanislaw.

A Monte Carlo simulation of a die roll

We now run a simulation of a die roll and see what the frequencies are for each face of the die:

def simMCdie(numTrials):
    print("Running ", numTrials)
    counters = [0] * 6  # initialize the counters for each face
    for i in range(numTrials):
        roll = rollDie()
        counters[roll - 1] += 1

    return counters

results = simMCdie(10000)
Running  10000
[1657, 1690, 1697, 1655, 1632, 1669]
import matplotlib.pyplot as plt
plt.bar(['1', '2', '3', '4', '5', '6'], results);
Frequencies of the six faces of a die

Independent events

A and B are independent if knowing whether A occurred gives no information about whether B occurred.

Let's define a function to model n rolls of a die, where each roll is independent from the others.

def rollNdice(n):
    result = ''
    for i in range(n):
        result = result + str(rollDie())
    return result

rollNdice(5)
'12344'

These are independent events.

Now an interesting question would be:
What is the probability that two independent events A and B both occur?

For example, the probability of getting two consecutive sixes in a dice roll is therefore:
1/6 * 1/6 = 1 / (6^2) = 1 / 36

Which is quite low.
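A minimal simulation sketch to check the 1/36 figure (the seed and trial count are arbitrary choices):

```python
import random

random.seed(0)  # reproducible sketch

trials = 100_000
doubleSix = sum(1 for _ in range(trials)
                if random.randint(1, 6) == 6 and random.randint(1, 6) == 6)
print(doubleSix / trials)  # should be close to 1/36 ≈ 0.0278
```

With 100,000 trials the empirical rate lands comfortably near the theoretical value.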

This applies also to more than two independent events.
In general, the probability that n independent events all occur is:

P(\forall E_i) = \prod P(E_i)

For a 6-sided die, there are 6^5 possible sequences of length five.
The probability of getting five consecutive sixes is 1 / 6^5, a pretty low number: 1 out of 7776 possibilities.
Let's look at a simulation to check this.

def getTarget(goal):
    # Roll dice until we get the goal
    # goal: a string of N die results, for example 5 six: "66666"
    numRolls = len(goal)

    numTries = 0
    while True:
        numTries += 1
        result = rollNdice(numRolls)

        # if success then leave
        if result == goal:
            return numTries

def runDiceMC(goal, numTrials):
    print("Running ... trials: ", numTrials)
    total = 0
    for i in range(numTrials):
        total += getTarget(goal)

    print('Average number of tries =', total / float(numTrials))

runDiceMC('66666', 100)
Running ... trials:  100
Average number of tries = 6596.93

Remember that the theory says it will take on average 7776 tries.
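The 7776 figure follows from the waiting time until a first success: for success probability p, the number of tries is geometrically distributed with mean 1/p. A tiny sketch of the arithmetic:

```python
# probability of rolling "66666" in a single attempt of five rolls
p = (1/6)**5

# the waiting time until the first success is geometric with mean 1/p
expected_tries = 1/p
print(expected_tries)  # 7776, up to floating point error
```

The simulation above, with only 100 trials, is in the right ballpark of this expectation.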

Pascal's problem

A friend asked Pascal:
would it be profitable, given 24 rolls of a pair of dice, to bet against there being at least one double six?

In the 17th century this was a difficult problem.
Now we know it's:
P(A=6 and B=6) = 1/6 * 1/6 = 1/36 (two independent events)
P(not double six) = 1 – 1/36 = 35/36
P(no double six in 24 rolls) = (35/36)^24

(35.0 / 36.0)**24                
0.5085961238690966

It's very close!
Once again, we can run a simulation to check it:

def checkPascalMC(numTrials, roll, numRolls=24, goal=6):
    numSuccess = 0.0

    for i in range(numTrials):
        for j in range(numRolls):
            die1 = roll()
            die2 = roll()
            if die1 == goal and die2 == goal:
                numSuccess += 1
                break

    print('Probability of losing =', 1.0 - numSuccess / numTrials)

checkPascalMC(10000, rollDie)
Probability of losing = 0.5063

Sampling table

The sampling table gives the number of possible samples of size k out of a population of size n, under various assumptions about how the sample is collected.
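As a concrete sketch, here are the four classic entries of the sampling table computed with Python's math module for n = 15 and k = 2 (the variable names are mine, not terms from this post):

```python
from math import comb, perm

n, k = 15, 2

ordered_with_repl = n**k                  # ordered, with replacement
ordered_without_repl = perm(n, k)         # ordered, without replacement: n!/(n-k)!
unordered_without_repl = comb(n, k)       # unordered, without replacement: "n choose k"
unordered_with_repl = comb(n + k - 1, k)  # unordered, with replacement (multisets)

print(ordered_with_repl, ordered_without_repl,
      unordered_without_repl, unordered_with_repl)  # 225 210 105 120
```

Which of the four counts applies depends on whether order matters and whether a drawn item is returned before the next draw.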

One example:
one ball will be drawn at random from a box containing: 3 green balls, 5 red balls, and 7 yellow balls.

What is the probability that the ball will be green?

green = 3
red = 5
yellow = 7
balls = green + red + yellow
pGreen = green / balls
pGreen
0.2

The population has size 15 and there are therefore 15 possible samples of size 1; out of these 15 possible samples, only 3 of them will answer our question (ball is green).

We defined the variable pGreen as the probability of choosing a green ball from the box. What is the probability that the ball you draw from the box will NOT be green?

1 - pGreen
0.8

Sampling without replacement – generalised

Instead of taking just one draw, consider taking two draws. You take the second draw without returning the first draw to the box. We call this sampling without replacement.

What is the probability that the first draw is green and that the second draw is not green?

# probability of choosing a green ball from the box on the first draw.
pGreen1 = green / balls

# probability of NOT choosing a green ball on the second draw without replacement.
pGreen2 = (red + yellow) / (green - 1 + red + yellow)

# Calculate the probability that the first draw is green and the second draw is not green.
pGreen1 * pGreen2
0.17142857142857143
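We can sanity-check this with a simulation that actually removes the first ball before the second draw; this sketch uses random.sample, and the seed and trial count are arbitrary:

```python
import random

random.seed(0)  # reproducible sketch

box = ['green'] * 3 + ['red'] * 5 + ['yellow'] * 7
trials = 100_000
hits = 0
for _ in range(trials):
    first, second = random.sample(box, 2)  # two draws, without replacement
    if first == 'green' and second != 'green':
        hits += 1
print(hits / trials)  # should be close to 0.1714...
```

random.sample picks two distinct positions in the box, which is exactly sampling without replacement.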

Sampling with replacement – generalised

Now repeat the experiment, but this time, after taking the first draw and recording the color, return it back to the box and shake the box. We call this sampling with replacement.

What is the probability that the first draw is green and that the second draw is not green?

# probability of choosing a green ball from the box on the first draw.
# same as above: pGreen1

# probability of NOT choosing a green ball on the second draw with replacement.
pGreen2r = (red + yellow) / balls

# Calculate the probability that the first draw is green and the second draw is not green.
pGreen1 * pGreen2r
0.16000000000000003

Sampling with replacement – be careful

Say you've drawn 5 balls from a box that has 3 green balls, 5 red balls, and 7 yellow balls – with replacement – and all have been yellow.

What is the probability that the next one is yellow?

# probability that a yellow ball is drawn from the box.
pYellow = yellow / balls

# probability of drawing a yellow ball on the 6th draw.
pYellow
0.4666666666666667

Yes, it doesn't matter that all previous 5 draws were yellow balls: the probability that the sixth ball is yellow is the same as for the first draw and all other draws. With replacement the population is always the same.

Two teams, say Manchester United (M.Utd.) and AC Milan, are playing a 7-game series. Milan are the better team and have a 60% chance of winning each game.

What is the probability that M.Utd. wins at least one game? Remember that they must win one of the first four games, or the series will be over!
Let's assume the games are independent events (in reality losing one match may impact the team's morale for the next match):

p_milan_wins = 0.6

# probability that the Milan team will win the first four games of the series:
p_milan_win4 = p_milan_wins**4

# probability that M.Utd. wins at least one game in the first four games of the series.
1 - p_milan_win4
0.8704000000000001

Here is the Monte Carlo simulation to confirm our answer for M.Utd. winning at least one game:

import numpy as np

def RealWinsOneMC(numTrials, nGames=4):
    numSuccess = 0

    for i in range(numTrials):
        simulatedGames = np.random.choice(["lose", "win"], size=nGames,
                                          replace=True, p=[0.6, 0.4])
        if 'win' in simulatedGames:
            numSuccess += 1

    return numSuccess / numTrials

print(RealWinsOneMC(1000))
0.876

Winning the series – with Monte Carlo

The two teams are playing a seven-game title series. The first to win four games wins the series. The teams are equally good, so they each have a 50-50 chance of winning each game.

If Milan lose the first game, what is the probability that they win the series?

# Create a list that contains all possible outcomes for the remaining games.
possibilities = [(g1, g2, g3, g4, g5, g6) for g1 in range(2) for g2 in range(2)
                 for g3 in range(2) for g4 in range(2) for g5 in range(2)
                 for g6 in range(2)]

# Create a list that indicates whether each row in 'possibilities'
# contains enough wins for Milan to win the series.
sums = [sum(tup) for tup in possibilities]
results = [s >= 4 for s in sums]

# Calculate the proportion of 'results' in which Milan win the series.
np.mean(results)
0.34375
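As an aside, the six nested comprehensions can be written more compactly with itertools.product; this is just an equivalent sketch of the same enumeration:

```python
from itertools import product

# all 2^6 = 64 outcomes of the remaining six games (1 = Milan win)
possibilities = list(product(range(2), repeat=6))
wins = sum(1 for outcome in possibilities if sum(outcome) >= 4)
print(wins, len(possibilities), wins / len(possibilities))  # 22 64 0.34375
```

Both versions enumerate the same 64 equally likely outcomes and count those with at least four Milan wins.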

Confirm the results of the previous question with a Monte Carlo simulation to estimate the probability of Milan winning the series.

def MilanWinsSeriesMC(numTrials, nGames=6):
    numSuccess = 0

    for i in range(numTrials):
        simulatedGames = np.random.choice([0, 1], size=nGames, replace=True)
        if sum(simulatedGames) >= 4:
            numSuccess += 1

    return numSuccess / numTrials

MilanWinsSeriesMC(100)
0.3

That's all for now.
Finally – to show how useful probability and Monte Carlo can be for many real problems – we can see how to estimate the value of the famous PI.

Approximate the value of pi

Pi (or π) is a mathematical constant. It is perhaps most famous for its appearance in the formula for the area of a circle:

A = πr²

π is an example of an irrational number. Its exact value is impossible to represent as any fraction of two integers.
In fact, π is also an example of a transcendental number: there aren't even any polynomial equations (with rational coefficients) to which it is a solution.

Yet obtaining an accurate value for π is not that difficult.
You can find a pretty good estimate of π using a Monte Carlo-inspired method:

Draw a 2m×2m square on a wall. Inside, draw a circular target of radius 1m.
Now, take a few steps back and throw darts randomly at the wall. Count each time the dart lands in the circle.
After a hundred throws, multiply the fraction of throws that landed in the circle by the area of the square. There is your estimate for π.


The circle's area is about 78% that of the square, so about that percentage of darts lands within the circle.
The reason this works is very intuitive. When sampling random points from a square containing a circle, the probability of selecting points from inside the circle is proportional to the circle's area. With enough random samples, we can find a reliable estimate for this proportion, p.

Now, we know the area of the square is 2×2 = 4m², and we know the area of the circle is π×r². Since the radius r equals 1, the area of the circle is just π.

As we know the area of the square and have an estimate for the proportion of its area covered by the circle, we can estimate π: simply multiply p×4.

The more random samples we throw, the better the estimate p will be. However, the gain in accuracy diminishes as we take more and more samples.
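This diminishing return is the usual Monte Carlo 1/√n error scaling: to cut the error in half you need four times as many samples. A rough sketch (the function name, seed, and sample sizes are my own choices):

```python
import random

random.seed(0)  # reproducible sketch

def piEstimateSketch(n):
    # fraction of uniform points in the unit quarter-circle, times 4
    inside = sum(1 for _ in range(n)
                 if random.random()**2 + random.random()**2 <= 1)
    return 4 * inside / n

# absolute error of the estimate for increasing sample sizes
for n in [100, 10_000, 1_000_000]:
    print(n, abs(piEstimateSketch(n) - 3.141592653589793))
```

Each hundredfold increase in samples only shrinks the typical error by a factor of about ten.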

def throwNeedlesInCircle(numNeedles):
    '''
    Throw numNeedles needles randomly in a 2x2 square (area = 4)
    that has a circle inside of radius 1 (area = PI).
    Count how many of those needles landed within the circle.
    Return this estimated proportion: Circle Area / Square Area
    '''
    inCircle = 0  # number of needles inside the circle

    for needle in range(1, numNeedles + 1):

        # by symmetry we sample only one quadrant:
        # the unit square with a quarter-circle of radius 1
        x = random.random()
        y = random.random()

        if (x*x + y*y)**0.5 <= 1.0:
            inCircle += 1

    return inCircle / float(numNeedles)

piEstimation = throwNeedlesInCircle(100) * 4
piEstimation
3.16

The more needles you throw, the more precise the PI value will be.

def getPiEstimate(numTrials, numNeedles):

    print("{t} trials, each has {n} Needles.".format(t=numTrials, n=numNeedles))

    estimates = []
    for t in range(numTrials):
        piGuess = 4 * throwNeedlesInCircle(numNeedles)
        estimates.append(piGuess)

    stdDev = np.std(estimates)
    curEst = sum(estimates) / len(estimates)

    print('PI Estimation = ' + str(curEst))
    print('Std. dev. = ' + str(round(stdDev, 5)))

    return (curEst, stdDev)

getPiEstimate(20, 100);
20 trials, each has 100 Needles.
PI Estimation = 3.106
Std. dev. = 0.18858

We can do better and continue – increasing the number of needles at each trial – until we reach the desired precision.

def estimatePi(precision, numTrials, numNeedles=1000):

    sDev = precision
    while sDev >= precision / 2.0:
        curEst, sDev = getPiEstimate(numTrials, numNeedles)
        numNeedles *= 2
        print("---")

    return curEst

estimatePi(0.005, 100)
100 trials, each has 1000 Needles.
PI Estimation = 3.13456
Std. dev. = 0.05118
---
100 trials, each has 2000 Needles.
PI Estimation = 3.1476800000000016
Std. dev. = 0.0388
---
100 trials, each has 4000 Needles.
PI Estimation = 3.1422299999999996
Std. dev. = 0.02884
---
100 trials, each has 8000 Needles.
PI Estimation = 3.140664999999999
Std. dev. = 0.01864
---
100 trials, each has 16000 Needles.
PI Estimation = 3.141509999999999
Std. dev. = 0.01187
---
100 trials, each has 32000 Needles.
PI Estimation = 3.14153125
Std. dev. = 0.00842
---
100 trials, each has 64000 Needles.
PI Estimation = 3.1414699999999995
Std. dev. = 0.00589
---
100 trials, each has 128000 Needles.
PI Estimation = 3.1423515625000005
Std. dev. = 0.00448
---
100 trials, each has 256000 Needles.
PI Estimation = 3.142143593750002
Std. dev. = 0.0034
---
100 trials, each has 512000 Needles.
PI Estimation = 3.1416040624999995
Std. dev. = 0.00247
---

3.1416040624999995

Monte Carlo techniques are often invoked when we need to do repeated sampling to determine results. Often that means the problem cannot simply be solved mathematically, for example when the outcome of one step in the calculation changes the possibilities in the next steps.

In a probabilistic world, we use random variables to represent stochastic phenomena. Once we know the random variables, we can use the Monte Carlo method.

The Monte Carlo method is a fine way to find the variations of a process; in other words, the risk of the process: the risk of a supply chain being understocked or overstocked, the risk of the costs of a project going over budget, the risk of the work exceeding the timeline, and so on.


Source: https://mashimo.wordpress.com/2019/01/27/montecarlo-and-pi/
