Envato Tuts+ Code - Mathematics (2022-04-14T07:36:00Z)
Mathematical Modules in Python: Statistics (2016-11-26T06:11:53+00:00)
<div data-content-block-type="Wysi" id="ieej" class="content-block-wysi content-block">
<p>Statistical analysis of data helps us make sense of the information as a whole. This has applications in a lot of fields, like biostatistics and business analytics.</p>
<p>Instead of going through individual data points, just one look at their collective mean value or variance can reveal trends and features that we might have missed by observing all the data in raw format. It also makes the comparison between two large data sets way easier and more meaningful.</p>
<p>Keeping these needs in mind, Python has provided us with the <a href="https://docs.python.org/3/library/statistics.html" target="_blank" rel="external noopener">statistics</a> module.</p>
<p>In this tutorial, you will learn about different ways of calculating averages and measuring the spread of a given set of data. Unless stated otherwise, all the functions in this module support <code class="inline">int</code>, <code class="inline">float</code>, <code class="inline">Decimal</code>, and <code class="inline">Fraction</code> based data sets as input.</p>
<table>
<thead>
<tr>
<th>Statistics Task</th>
<th>Typical Functions</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="#calculating-the-mean">Calculating the Mean</a></td>
<td>
<code>mean()</code>, <code>fmean()</code>, <code>geometric_mean()</code>, <code>harmonic_mean()</code>
</td>
</tr>
<tr>
<td><a href="#calculating-the-mode">Calculating the Mode</a></td>
<td>
<code>mode()</code>, <code>multimode()</code>
</td>
</tr>
<tr>
<td><a href="#calculating-the-median">Calculating the Median</a></td>
<td><code>median()</code>, <code>median_low()</code>, <code>median_high()</code></td>
</tr>
<tr>
<td><a href="#Measuring%20the%20Spread%20of%20Data">Measuring the Spread of Data</a></td>
<td>
<code>pvariance()</code>, <code>variance()</code>, <code>pstdev()</code>, <code>stdev()</code>
</td>
</tr>
</tbody>
</table>
<h2>
<a name="calculating-the-mean"></a>Calculating the Mean</h2>
<p>You can use the <code class="inline">mean(data)</code> function to calculate the mean of some given data. It is calculated by dividing the sum of all the data points by the number of data points. If the data is empty, a <a href="https://docs.python.org/3/library/statistics.html#statistics.StatisticsError" target="_blank" rel="external noopener">StatisticsError</a> will be raised. Here are a few examples:</p>
<pre class="brush: python noskimlinks noskimwords">import statistics
from fractions import Fraction as F
from decimal import Decimal as D
statistics.mean([11, 2, 13, 14, 44])
# returns 16.8
statistics.mean([F(8, 10), F(11, 20), F(2, 5), F(28, 5)])
# returns Fraction(147, 80)
statistics.mean([D("1.5"), D("5.75"), D("10.625"), D("2.375")])
# returns Decimal('5.0625')
</pre>
<p>You learned about a lot of functions to generate random numbers in our <a href="https://code.tutsplus.com/tutorials/mathematical-modules-in-python-random--cms-27738" target="_blank" rel="external noopener">last tutorial</a>. Let's use them now to generate our data and see if the final mean is equal to what we expect it to be.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics
data_points = [ random.randint(1, 100) for x in range(1,1001) ]
statistics.mean(data_points)
# returns 50.618
data_points = [ random.triangular(1, 100, 80) for x in range(1,1001) ]
statistics.mean(data_points)
# returns 59.93292281437689
</pre>
<p>With the <code class="inline">randint()</code> function, the mean is expected to be close to the mid-point of both extremes, and with the triangular distribution, it is supposed to be close to <code class="inline">(low + high + mode) / 3</code>. Therefore, the mean in the first and second cases should be 50.5 and 60.33 respectively, which is close to what we actually got.</p>
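<p>As a rough check of those expected values, the sketch below compares the theoretical means with sample means from a larger run. The seed value and the sample size of 100,000 are arbitrary choices made for this illustration.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics

random.seed(7)  # arbitrary seed, used only to make the run repeatable

low, high, mode = 1, 100, 80

# Theoretical means of the two distributions
expected_randint = (low + high) / 2            # 50.5
expected_triangular = (low + high + mode) / 3  # 60.333...

# Sample means from 100,000 draws each
randint_draws = [random.randint(low, high) for _ in range(100_000)]
triangular_draws = [random.triangular(low, high, mode) for _ in range(100_000)]
randint_mean = statistics.mean(randint_draws)
triangular_mean = statistics.mean(triangular_draws)

print(expected_randint, randint_mean)
print(expected_triangular, triangular_mean)</pre>
<p>With this many samples, each sample mean should land well within one unit of its theoretical value.</p>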
<p>One thing that you will realize when using the <code>mean()</code> function in the <code>statistics</code> module is that it has been written to prioritize accuracy over speed. This implies that you will get much better results with wildly varying data by using the <code>mean()</code> function instead of doing regular average computation with a simple sum.</p>
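<p>Here is a small sketch of what that accuracy means in practice. The data set is contrived so that two huge values cancel each other out. A plain running total is used for the naive average, rather than the built-in <code>sum()</code>, because recent Python versions make <code>sum()</code> itself more accurate for floats.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics

# Two huge values that cancel each other out, plus two small ones
data = [1e30, 1, 3, -1e30]

# Naive average: a running float total loses the small values to rounding
total = 0.0
for value in data:
    total += value
naive = total / len(data)

# statistics.mean() avoids the intermediate rounding
accurate = statistics.mean(data)

print(naive)     # 0.0
print(accurate)  # 1.0</pre>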
<p>You can consider using the <code>fmean()</code> function introduced in Python 3.8 if you prefer speed over absolute accuracy. The results will still be accurate in most situations. This function will convert all the data to floats and then return the mean as a <code>float</code> as well.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics
from fractions import Fraction as F
int_values = [random.randrange(100) for x in range(9)]
frac_values = [F(1, 2), F(1, 3), F(1, 4), F(1, 5), F(1, 6), F(1, 7), F(1, 8), F(1, 9)]
mix_values = [*int_values, *frac_values]
print(statistics.mean(mix_values))
# 929449/42840
print(statistics.fmean(mix_values))
# 21.69582166199813</pre>
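<p>Python also supports the calculation of the geometric and harmonic means of data using the <code>geometric_mean(data)</code> and <code>harmonic_mean(data, weights=None)</code> functions. The <code>harmonic_mean()</code> function has been available since Python 3.6, <code>geometric_mean()</code> was added in version 3.8, and the <code>weights</code> parameter was added in version 3.10.</p>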
<p>The geometric mean is calculated by multiplying all the <strong>n</strong> values in the data and then taking the <strong>n</strong><sup><strong>th</strong> </sup>root of the product. The results may be slightly off in some cases due to floating-point errors.</p>
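<p>The definition can be checked directly against the module. This is a minimal sketch with made-up data, and it compares the two results with <code>math.isclose()</code> because the last few digits may differ due to floating-point rounding.</p>
<pre class="brush: python noskimlinks noskimwords">import math
import statistics

data = [2, 4, 8]

# Definition: multiply the n values together, then take the n-th root
product = math.prod(data)  # 2 * 4 * 8 = 64
by_definition = product ** (1 / len(data))

from_module = statistics.geometric_mean(data)

print(by_definition, from_module)                # both are approximately 4
print(math.isclose(by_definition, from_module))  # True</pre>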
<p>One application of the geometric mean is in the quick calculation of compound annual growth rates. For example, let's say the sales of a company in four years are 100, 120, 150, and 200. The percentage growth for three years will then be 20%, 25%, and 33.33%, which corresponds to growth factors of 1.20, 1.25, and 1.3333. The average growth rate of sales for the company is accurately represented by the geometric mean of these growth factors, not of the percentages themselves. The arithmetic mean gives a slightly higher growth rate whenever the yearly rates differ.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics
# Growth factors: each year's sales divided by the previous year's sales
growth_factors = [120 / 100, 150 / 120, 200 / 150]
print(statistics.mean(growth_factors))
# about 1.2611, i.e. 26.11% per year
print(statistics.geometric_mean(growth_factors))
# about 1.2599, i.e. 25.99% per year (100 * 1.2599**3 is about 200)</pre>
<p>The harmonic mean is simply the reciprocal of the arithmetic mean of the reciprocals of the data. Because the <code>harmonic_mean()</code> function works with reciprocals, it only makes sense for non-negative values: a negative value raises a <code>StatisticsError</code> exception, and a value of 0 in the data makes the result 0.</p>
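<p>The following sketch spells out that definition for two made-up values and compares it with the module's result. It also shows the exception raised for a negative value.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics

data = [40, 60]

# Definition: reciprocal of the arithmetic mean of the reciprocals
reciprocals = [1 / x for x in data]
by_definition = 1 / statistics.mean(reciprocals)

from_module = statistics.harmonic_mean(data)
print(by_definition, from_module)  # both are approximately 48

# A negative value is rejected with StatisticsError
try:
    statistics.harmonic_mean([40, -60])
except statistics.StatisticsError as error:
    print("StatisticsError:", error)</pre>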
<p>The harmonic mean is useful for calculating the averages of ratios and rates, such as calculating the average speed, density, or resistance in parallel. Here is some code that calculates the average speed when someone covers a fixed portion of a journey (100km in this case) with specific speeds. </p>
<pre class="brush: python noskimlinks noskimwords">import statistics
speeds = [30, 40, 60]
distance = 100
total_distance = len(speeds)*distance
total_time = 0
for speed in speeds:
    total_time += distance/speed
average_speed = total_distance/total_time
print(average_speed)
# 39.99999999999999
print(statistics.harmonic_mean(speeds))
# 40.0</pre>
<p>Two things worth noticing here are that the <code>harmonic_mean()</code> function reduces all the calculations to a single one-liner and at the same time gives more accurate results without floating-point errors.</p>
<p>We can use the <code>weights</code> argument, available from Python 3.10, to specify the distance covered at each speed.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics
speeds = [30, 40, 60]
distances = [100, 120, 160]
print(statistics.harmonic_mean(speeds, distances))
# 42.222222222</pre>
<h2>
<a name="calculating-the-mode"></a>Calculating the Mode</h2>
<p>The mean is a good indicator of the average, but a few extreme values can result in an average that is far from the actual central location. In some cases, it is more desirable to determine the most frequent data point in a data set. The <code class="inline">mode()</code> function will return the most common data point from discrete numerical or non-numerical data. This is the only statistical function that can be used with non-numeric data.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics
data_points = [ random.randint(1, 100) for x in range(1,1001) ]
statistics.mode(data_points)
# returns 94
data_points = [ random.randint(1, 100) for x in range(1,1001) ]
statistics.mode(data_points)
# returns 49
data_points = [ random.randint(1, 100) for x in range(1,1001) ]
statistics.mode(data_points)
# returns 32
statistics.mode(["cat", "dog", "dog", "cat", "monkey", "monkey", "dog"])
# returns 'dog'</pre>
<p>The mode of randomly generated integers in a given range can be any of those numbers as the frequency of occurrence of each number is unpredictable. The three examples in the above code snippet prove that point. The last example shows us how we can calculate the mode of non-numeric data.</p>
<p>A newer <code>multimode()</code> function in Python 3.8 allows us to return more than one result when there are multiple values that occur with the same top frequency.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics
favorite_pet = ['cat', 'dog', 'dog', 'mouse', 'cat', 'cat', 'turtle', 'dog']
print(statistics.multimode(favorite_pet))
# ['cat', 'dog']</pre>
<h2>
<a name="calculating-the-median"></a>Calculating the Median</h2>
<p>Relying on the mode as a central value can be a bit misleading. As we just saw in the previous section, it will always be the most frequently occurring data point, irrespective of all other values in the data set. Another way of determining the central location is by using the <code class="inline">median()</code> function, which returns the median of the given numeric data. If the number of data points is odd, it returns the middle point. If the number of data points is even, it returns the mean of the two middle points.</p>
<p>The problem with the <code class="inline">median()</code> function is that the final value may not be an actual data point when the number of data points is even. In such cases, you can either use <code class="inline">median_low()</code> or <code class="inline">median_high()</code> to calculate the median. With an even number of data points, these functions will return the smaller and larger value of the two middle points respectively.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics
data_points = [ random.randint(1, 100) for x in range(1,50) ]
statistics.median(data_points)
# returns 53
data_points = [ random.randint(1, 100) for x in range(1,51) ]
statistics.median(data_points)
# returns 51.0
data_points = [ random.randint(1, 100) for x in range(1,51) ]
statistics.median(data_points)
# returns 49.0
data_points = [ random.randint(1, 100) for x in range(1,51) ]
statistics.median_low(data_points)
# returns 50
statistics.median_high(data_points)
# returns 52
statistics.median(data_points)
# returns 51.0
</pre>
<p>In the last case, the low and high medians were 50 and 52. This means that there was no data point with a value of 51 in our data set, but the <code class="inline">median()</code> function still calculated the median to be 51.0.</p>
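<p>Because the examples above use random data, the exact numbers change on every run. A small fixed data set, made up for this sketch, shows the same behavior in a repeatable way.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics

data = [7, 1, 5, 3]  # unsorted on purpose; the input does not need to be sorted

print(statistics.median(data))       # 4.0 (mean of the middle values 3 and 5)
print(statistics.median_low(data))   # 3
print(statistics.median_high(data))  # 5

# With an odd number of points, all three agree on the middle value
odd = [7, 1, 5]
print(statistics.median(odd), statistics.median_low(odd), statistics.median_high(odd))
# 5 5 5</pre>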
<h2>
<a name="Measuring%20the%20Spread%20of%20Data"></a>Measuring the Spread of Data</h2>
<p>Determining how much the data points deviate from the typical or average value of the data set is just as important as calculating the central or average value itself. The <em>statistics</em> module has four different functions to help us calculate this spread of data.</p>
<p>You can use the <code>pvariance(data, mu=None)</code> function to calculate the population variance of a given data set.</p>
<p>The second argument in this case is optional. The value of <em>mu</em>, when provided, should be equal to the mean of the given data. The mean is calculated automatically if the value is missing. This function is helpful when you want to calculate the variance of an entire population. If your data is only a sample of the population, you can use the <code>variance(data, xbar=None)</code> function to calculate the sample variance. Here, <code>xbar</code> is the mean of the given sample and is calculated automatically if not provided.</p>
<p>To calculate the population standard deviation and sample standard deviation, you can use the <code>pstdev(data, mu=None)</code> and <code>stdev(data, xbar=None)</code> functions respectively.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics
from fractions import Fraction as F
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
statistics.pvariance(data) # returns 6.666666666666667
statistics.pstdev(data) # returns 2.581988897471611
statistics.variance(data) # returns 7.5
statistics.stdev(data) # returns 2.7386127875258306
more_data = [3, 4, 5, 5, 5, 5, 5, 6, 6]
statistics.pvariance(more_data) # returns 0.7654320987654322
statistics.pstdev(more_data) # returns 0.8748897637790901
some_fractions = [F(5, 6), F(2, 3), F(11, 12)]
statistics.variance(some_fractions)
# returns Fraction(7, 432)
</pre>
<p>As evident from the above example, a smaller variance implies that more data points are closer in value to the mean. You can also calculate the standard deviation of decimals and fractions.</p>
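<p>To make the difference between the population and sample versions concrete, here is a sketch that computes both by hand for the first data set and compares them with the module's functions. The only difference is the divisor: the number of data points for the population variance, and one less than that for the sample variance.</p>
<pre class="brush: python noskimlinks noskimwords">import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = len(data)
mean = statistics.mean(data)  # 5

# Sum of squared deviations from the mean
squared_deviations = sum((x - mean) ** 2 for x in data)  # 60

population_variance = squared_deviations / n    # divide by n
sample_variance = squared_deviations / (n - 1)  # divide by n - 1

# The precomputed mean is passed as the optional second argument
print(population_variance, statistics.pvariance(data, mean))
print(sample_variance, statistics.variance(data, mean))  # 7.5 7.5
print(population_variance ** 0.5, statistics.pstdev(data))</pre>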
<h2>
<a name="final-thoughts"></a>Final Thoughts</h2>
<p>In this last tutorial of the series, we learned about different functions available in the <em>statistics</em> module. You might have observed that the data given to the functions was sorted in most cases, but it doesn't have to be. I have used sorted lists in this tutorial because they make it easier to understand how the value returned by different functions is related to the input data.</p>
</div>
2022-04-14 07:36:00 UTC, Monty Shokeen
Mathematical Modules in Python: Random (2016-11-25T08:54:06+00:00)
<div data-content-block-type="Wysi" id="iowx" class="content-block-wysi content-block">
<p>Randomness is all around us. When you flip a coin or roll a die, you can never be sure of the outcome. This unpredictability has a lot of applications, like determining the winners of a lucky draw or generating test cases for an experiment with random values produced based on an algorithm. </p>
<p>Keeping this usefulness in mind, Python has provided us with the <a href="https://docs.python.org/3/library/random.html" target="_blank" rel="external noopener">random</a> module. You can use it in games to spawn enemies randomly or to shuffle the elements in a list. </p>
<table>
<tbody>
<tr>
<td><strong>Types of Functions</strong></td>
<td><strong>Example Functions</strong></td>
</tr>
<tr>
<td><a href="#basic">Initialize and use the random number generator</a></td>
<td>
<code>seed()</code>, <code>random()</code>
</td>
</tr>
<tr>
<td><a href="#integers">Random integers in a range</a></td>
<td>
<code>randrange()</code>, <code>randint()</code>
</td>
</tr>
<tr>
<td><a href="#sequence">Random items from a sequence</a></td>
<td>
<code>choice()</code>, <code>shuffle()</code>, <code>sample()</code>
</td>
</tr>
<tr>
<td><a href="#distributions">Random floats with standard distributions</a></td>
<td>
<code>triangular()</code>, <code>uniform()</code>, <code>normalvariate()</code>
</td>
</tr>
<tr>
<td><a href="#weighted">Random items from a weighted list</a></td>
<td>
<code>choice()</code>, <code>choices()</code>, <code>sample()</code>
</td>
</tr>
</tbody>
</table>
<h2>
<a name="basic"></a>How Does Random Work?</h2>
<p>Nearly all of the functions in this module depend on the basic <code class="inline">random()</code> function, which will generate a random float greater than or equal to zero and less than one. Python uses the <a href="https://en.wikipedia.org/wiki/Mersenne_Twister" target="_blank" rel="external noopener">Mersenne Twister</a> to generate the floats. It produces 53-bit precision floats with a period of 2<sup>19937</sup>-1. It is one of the most widely used general-purpose pseudo-random number generators.</p>
<h4>Initialize the Random Number Generator With <code>seed()</code>
</h4>
<p>Sometimes, you want the random number generator to reproduce the sequence of numbers it created the first time. This can be achieved by providing the same seed value both times to the generator using the <code>seed(a=None, version=2)</code> function. If the <code>a</code> parameter is omitted, the generator is seeded from the current system time or, when available, from a randomness source provided by the operating system. Here is an example:</p>
<pre class="brush: python noskimlinks noskimwords">import random
random.seed(100)
random.random()
# returns 0.1456692551041303
random.random()
# returns 0.45492700451402135</pre>
<p>Keep in mind that unlike a coin flip, the module generates pseudo-random numbers which are completely deterministic, so it is not suitable for cryptographic purposes.</p>
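<p>This determinism is easy to observe. In the sketch below, the seed value 2024 is an arbitrary choice; seeding twice with the same value reproduces the same sequence, and a separate <code>random.Random</code> instance seeded the same way does too.</p>
<pre class="brush: python noskimlinks noskimwords">import random

random.seed(2024)  # arbitrary seed value
first_run = [random.random() for _ in range(3)]

random.seed(2024)  # same seed again
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True

# A separate generator object keeps its own state
generator = random.Random(2024)
third_run = [generator.random() for _ in range(3)]
print(first_run == third_run)   # True</pre>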
<h2>
<a name="integers"></a>Generating Random Integers</h2>
<h4>Generate Integers in a Range With <code>randrange()</code> and <code>randint()</code>
</h4>
<p>The module has two different functions for generating random integers. You can use <code class="inline">randrange(a)</code> to generate a random non-negative integer smaller than <code class="inline">a</code>. </p>
<p>Similarly, you can use <code class="inline">randrange(a, b[,step])</code> to generate a random number from <code class="inline">range(a, b, step)</code>. For example, using <code class="inline">random.randrange(0, 100, 3)</code> will only return those numbers between 0 and 100 which are also divisible by 3.</p>
<p>If you know both the lower and upper limits between which you want to generate the numbers, you can use a simpler and more intuitive function called <code>randint(a, b)</code>. It is simply an alias for <code class="inline">randrange(a, b+1)</code>.</p>
<pre class="brush: python noskimlinks noskimwords">import random
random.randrange(100)
# returns 65
random.randrange(100)
# returns 98
random.randrange(0, 100, 3)
# returns 33
random.randrange(0, 100, 3)
# returns 75
random.randint(1,6)
# returns 4
random.randint(1,6)
# returns 6</pre>
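<p>Since the values above change on every run, a few checks on many draws may be more convincing. This is a sketch; the seed value 5 is arbitrary, and the last comparison relies on <code>randint(a, b)</code> being an alias for <code>randrange(a, b+1)</code> as described earlier.</p>
<pre class="brush: python noskimlinks noskimwords">import random

# Every value from randrange(0, 100, 3) belongs to range(0, 100, 3)
multiples = [random.randrange(0, 100, 3) for _ in range(1000)]
print(all(value in range(0, 100, 3) for value in multiples))  # True

# randint(1, 6) includes both end points, like a six-sided die
rolls = [random.randint(1, 6) for _ in range(1000)]
print(min(rolls), max(rolls))  # 1 6 with overwhelming probability

# With the same seed, randint(a, b) and randrange(a, b + 1) agree
random.seed(5)
from_randint = [random.randint(1, 6) for _ in range(10)]
random.seed(5)
from_randrange = [random.randrange(1, 7) for _ in range(10)]
print(from_randint == from_randrange)  # True</pre>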
<h2>
<a name="sequence"></a>Functions for Sequences</h2>
<h4>Choose a Random Element From a List With <code>choice()</code>
</h4>
<p>To select a random element from a given non-empty sequence, you can use the <code>choice(seq)</code> function. With <code class="inline">randint()</code>, you are limited to a selection of numbers from a given range. The <code class="inline">choice(seq)</code> function allows you to choose a number from any sequence you want. </p>
<p>Another good thing about this function is that it is not limited to just numbers. It can select any type of element randomly from a sequence. For example, the name of the winner of a lucky draw among five different people, provided as a string, can be determined using this function easily.</p>
<h4>Shuffle a Sequence With <code>shuffle()</code>
</h4>
<p>If you want to shuffle a sequence instead of selecting a random element from it, you can use the <code class="inline">shuffle(seq)</code> function. This shuffles the sequence <em>in place</em>, so the sequence must be mutable, such as a list. For a sequence with just 10 elements, there are 10! = 3,628,800 different arrangements. The number of permutations grows so quickly that, for a long sequence, it exceeds the period of the random number generator, which implies that most permutations of a long sequence can never be generated.</p>
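<p>To get a feel for how quickly the number of arrangements grows, the sketch below compares factorials with the period of the Mersenne Twister mentioned earlier, 2<sup>19937</sup>-1, and finds the first sequence length whose permutation count exceeds it.</p>
<pre class="brush: python noskimlinks noskimwords">import math

print(math.factorial(10))  # 3628800 arrangements of 10 elements

period = 2 ** 19937 - 1    # period of the Mersenne Twister

# Smallest sequence length whose permutation count exceeds the period
length = 1
while period >= math.factorial(length):
    length += 1
print(length)  # 2081</pre>
<p>In other words, once a list has more than 2080 elements, it has more possible orderings than the generator has internal states to cycle through.</p>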
<h4>Sample Multiple Times With <code>sample()</code> </h4>
<p>Let's say you have to pick 50 students from a group of 100 students to go on a trip. </p>
<p>At this point, you may be tempted to use the <code class="inline">choice(seq)</code> function. The problem is that you will have to call it at least 50 times, and more whenever it picks a student who has already been chosen. </p>
<p>A better solution is to use the <code>sample(seq, k)</code> function. It will return a list of <code>k</code> unique elements from the given sequence. The original sequence is left unchanged. The elements in the resulting list will be in selection order. If <em>k</em> is greater than the number of elements in the sequence itself, a <code><a href="https://docs.python.org/3/library/exceptions.html#ValueError" target="_blank" rel="external noopener">ValueError</a></code> will be raised. </p>
<pre class="brush: python noskimlinks noskimwords">import random
ids = [1, 8, 10, 12, 15, 17, 25]
random.choice(ids) # returns 8
random.choice(ids) # returns 15
names = ['Tom', 'Harry', 'Andrew', 'Robert']
random.choice(names) # returns 'Tom'
random.choice(names) # returns 'Robert'
random.shuffle(names)
names
# returns ['Robert', 'Andrew', 'Tom', 'Harry']
random.sample(names, 2)
# returns ['Andrew', 'Robert']
random.sample(names, 2)
# returns ['Tom', 'Robert']
names
# returns ['Robert', 'Andrew', 'Tom', 'Harry']
</pre>
<p>As you can see, <code class="inline">shuffle(seq)</code> modified the original list, but <code class="inline">sample(seq, k)</code> kept it intact.</p>
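<p>The trip scenario from earlier can be sketched the same way. The student names here are hypothetical placeholders generated for the illustration.</p>
<pre class="brush: python noskimlinks noskimwords">import random

# Hypothetical list of 100 student names
students = [f"student_{number}" for number in range(1, 101)]

trip_group = random.sample(students, 50)

print(len(trip_group))       # 50
print(len(set(trip_group)))  # 50, so nobody was picked twice
print(len(students))         # 100, the original list is unchanged</pre>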
<h2>
<a name="distributions"></a>Generating Random Floats With Standard Distributions</h2>
<p>In this section, you will learn about functions that can be used to generate random numbers based on specific real-value distributions. The parameters of most of these functions are named after the corresponding variable in that distribution's actual equation.</p>
<p>When you just want a number between 0 and 1, you can use the <code class="inline">random()</code> function. If you want the number to be in a specific range, you can use the <code class="inline">uniform(a, b)</code> function with <em>a</em> and <em>b</em> as the lower and higher limits respectively.</p>
<h4>Generating Random Floats With Probability Distributions</h4>
<p>Let's say you need to generate a random number between <code>low</code> and <code>high</code> such that it has a higher probability of lying in the vicinity of another number <code>mode</code>. You can do this with the <code>triangular(low, high, mode)</code> function. The <code>low</code> and <code>high</code> values will be 0 and 1 by default. Similarly, the <code>mode</code> value defaults to the mid-point of the low and high values, resulting in a symmetrical distribution.</p>
<p>There are a lot of other functions as well to generate random numbers based on different distributions. As an example, you can use <code>normalvariate(mu, sigma)</code> to generate a random number based on a normal distribution, with <code>mu</code> as the mean and <code>sigma</code> as the standard deviation.</p>
<h4>Example Random Values From Probability Distributions</h4>
<pre class="brush: python noskimlinks noskimwords">import random
random.random()
# returns 0.8053547502449923
random.random()
# returns 0.05966180559620815
random.uniform(1, 20)
# returns 11.970525425108205
random.uniform(1, 20)
# returns 7.731292430291898
random.triangular(1, 100, 80)
# returns 42.328674062298816
random.triangular(1, 100, 80)
# returns 73.54693076132074
</pre>
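<p>The <code>normalvariate(mu, sigma)</code> function is not covered by the example above, so here is a short sketch of it. The heights, the seed, and the sample size are made-up values for the illustration; the sample statistics should come out close to the requested mean and standard deviation.</p>
<pre class="brush: python noskimlinks noskimwords">import random
import statistics

random.seed(11)  # arbitrary seed, only to make the run repeatable

# 10,000 made-up heights (in cm) with mean 170 and standard deviation 10
heights = [random.normalvariate(170, 10) for _ in range(10_000)]

sample_mean = statistics.fmean(heights)
sample_stdev = statistics.stdev(heights)
print(sample_mean)   # close to 170
print(sample_stdev)  # close to 10

# Roughly 68% of the values fall within one standard deviation of the mean
within_one_sigma = sum(1 for h in heights if 10 >= abs(h - 170)) / len(heights)
print(within_one_sigma)</pre>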
<h2>
<a name="weighted"></a>Random Items With Weighted Probabilities</h2>
<p>As we just saw, it is possible to generate random numbers with a uniform distribution as well as a triangular or normal distribution. Even in a finite range like 0 to 100, an infinite number of floats can be generated. What if there is a finite set of elements and you want to add more weight to some specific values while selecting a random number? This situation is common in lottery systems where numbers with little reward are given a high weighting.</p>
<h4>Choosing From a Weighted List With <code>choice(seq)</code>
</h4>
<p>If it is acceptable for your application to have weights that are integer values, you can create a list of elements whose frequency depends on their weight. You can then use the <code class="inline">choice(seq)</code> function to select an element from this weighted list randomly. Here is an example showing the selection of a prize amount randomly.</p>
<pre class="brush: python noskimlinks noskimwords">import random
w_prizes = [('$1', 300), ('$2', 50), ('$10', 5), ('$100', 1)]
prize_list = [prize for prize, weight in w_prizes for i in range(weight)]
random.choice(prize_list)
# returns '$1'
</pre>
<p>In my case, it took ten trials to get a $2 prize chosen from the list. The chances of getting a $100 prize would be much lower.</p>
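<p>Rather than counting trials by hand, you can tally a large number of draws and compare the observed frequencies with the weights. This is a sketch that reuses the prize list from above; the seed and the number of draws are arbitrary.</p>
<pre class="brush: python noskimlinks noskimwords">import random
from collections import Counter

random.seed(3)  # arbitrary seed, only to make the run repeatable

w_prizes = [('$1', 300), ('$2', 50), ('$10', 5), ('$100', 1)]
prize_list = [prize for prize, weight in w_prizes for i in range(weight)]

# Tally 10,000 draws from the expanded list
draws = Counter(random.choice(prize_list) for _ in range(10_000))
total_weight = len(prize_list)  # 356

# Print the expected and observed share of each prize
for prize, weight in w_prizes:
    expected = weight / total_weight
    observed = draws[prize] / 10_000
    print(prize, round(expected, 4), round(observed, 4))</pre>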
<h4>Choosing From a Weighted List With <code>random.choices()</code>
</h4>
<p>Python 3.6 and later also have a function called <code>random.choices(population, weights=None, *, cum_weights=None, k=1)</code> that allows you to natively pick values from a weighted distribution instead of implementing something similar on your own, as we just did. It accepts four arguments, but only the first one is required. Just passing a single list of values to the function will give you back a list containing one item from the original list.</p>
<p>As you can see below, our weighted probability code could easily be rewritten to get a list of values using the <code>random.choices()</code> function.</p>
<pre class="brush: python noskimlinks noskimwords">import random
prizes = ['$1', '$2', '$10', '$100']
weightings = [300, 50, 5, 1]
print(random.choices(prizes, weightings, k=10))
# ['$1', '$1', '$1', '$1', '$2', '$1', '$1', '$1', '$1', '$2']
print(random.choices(prizes, k=10))
# ['$1', '$1', '$1', '$10', '$10', '$2', '$100', '$10', '$2', '$2']</pre>
<p>Values are selected with equal probability if you don't provide weightings. The <code>choices()</code> function will repeat some of the returned values in the final selected sample. You should note that this is different from the <code>sample()</code> function we discussed earlier, which returns a list of unique elements from the given population. Passing a value of <code>k</code> higher than the population length will result in a <code>ValueError</code> with <code>sample()</code> but works with <code>choices()</code>. Here is an example:</p>
<pre class="brush: python noskimlinks noskimwords">import random
prizes = ['$1', '$2', '$10', '$100']
print(random.choices(prizes, k=10))
# ['$100', '$1', '$1', '$10', '$10', '$100', '$10', '$1', '$10', '$2']
print(random.sample(prizes, k=10))
# ValueError: Sample larger than population or is negative</pre>
<p>The <code>choices()</code> function is useful for simulating things like a coin toss or a dice throw because there is a possibility of repetition. On the other hand, <code>sample()</code> is useful for things like picking people randomly for different teams as the same person cannot be picked for two teams.</p>
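<p>Both use cases can be sketched in a few lines. The player names are hypothetical, and the split into two teams of three is just one way to use the shuffled sample.</p>
<pre class="brush: python noskimlinks noskimwords">import random

# Ten throws of a six-sided die: repeats are expected
faces = [1, 2, 3, 4, 5, 6]
throws = random.choices(faces, k=10)
print(throws)

# Two teams of three from six players (hypothetical names): nobody appears twice
players = ['Ann', 'Ben', 'Cara', 'Dev', 'Eli', 'Fay']
picked = random.sample(players, k=len(players))
team_a, team_b = picked[:3], picked[3:]
print(team_a, team_b)</pre>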
<p>The <code>sample()</code> function was updated in version 3.9 to accept an additional <code>counts</code> parameter, which is simply a list that specifies how many times specific values are repeated in a population. You can use this parameter to simulate weighted distribution.</p>
<pre class="brush: python noskimlinks noskimwords">import random
fruits = ['apple', 'mango', 'banana', 'guava']
numbers = [50, 30, 12, 100]
print(random.sample(fruits, 10, counts=numbers))
# ['guava', 'apple', 'apple', 'apple', 'guava', 'guava', 'mango', 'apple', 'apple', 'guava']</pre>
<p>This is useful in situations where you have to pick something randomly (e.g. fruits from a basket) and then distribute them. Using <code>sample()</code> means that there is no possibility of selecting more bananas than the total amount present in the basket. The <code>counts</code> parameter allows us to avoid creating an actual list of 50 apples, 100 guavas, etc.</p>
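<p>The cap imposed by <code>counts</code> can be verified by drawing most of the basket. This sketch assumes Python 3.9 or later and reuses the fruit counts from above; the sample size of 150 is an arbitrary choice below the total of 192.</p>
<pre class="brush: python noskimlinks noskimwords">import random
from collections import Counter

fruits = ['apple', 'mango', 'banana', 'guava']
numbers = [50, 30, 12, 100]

# Take 150 of the 192 fruits; counts caps how often each fruit can appear
picked = Counter(random.sample(fruits, 150, counts=numbers))
for fruit, available in zip(fruits, numbers):
    print(fruit, picked[fruit], "of", available)

# Asking for more fruit than the basket holds raises ValueError
try:
    random.sample(fruits, 200, counts=numbers)
except ValueError as error:
    print("ValueError:", error)</pre>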
<p>Keeping all these subtle differences between the functions in mind will help you write code that doesn't show unexpected behavior.</p>
<h2>Final Thoughts</h2>
<p>This module can be useful in a lot of situations, like shuffling the questions in an assignment with the <code class="inline">shuffle()</code> function or generating random usernames for your users. For passwords and other security-sensitive values, use the <code>secrets</code> module instead, because the <code>random</code> module is not suitable for cryptographic purposes. You can also generate random numbers uniformly, as well as giving weighting to numbers in a specific range. In our next tutorial, we will be using the functions from this module to generate random data for statistical analysis.</p>
<p>Do you have some interesting applications of random number generators in mind that can be useful to fellow readers? Let us know on the <a href="https://forums.envato.com/c/project-making/envato-courses-and-tutorials/17" target="_self">forum</a>.</p>
</div>
2022-03-09 00:17:50 UTC, Monty Shokeen
Trigonometry, Random Numbers and More With Built-in PHP Math Functions (2018-10-06T03:22:55+00:00)
<div class="content-block-wysi" content-block-type="Wysi">
<p>Basic maths is used a lot during programming. We need to frequently compare, add, multiply, subtract and divide different values when writing code. </p>
<p>Sometimes, the maths required in a program can be more involved. You might need to work with logarithmic, trigonometric or exponential functions.</p>
<p>This tutorial will introduce you to the built-in math functions in PHP for doing trigonometry, exponentiation, and logarithm calculations, with examples. We'll also look at rounding and generating random numbers.</p>
<h2>Trigonometric Functions in PHP</h2>
<p>You can calculate the sine, cosine, and tangent of an angle using <code class="inline">sin($angle)</code>, <code class="inline">cos($angle)</code>, and <code class="inline">tan($angle)</code>. They all return <code class="inline">float</code> values and expect the angle in radians.</p>
<p>This means that when you simply calculate <code class="inline">tan(45)</code>, you won't get 1 as output, because you will actually be calculating the value of tangent at 45 radians, which is about 2,578 degrees. Luckily, PHP has two very useful functions for converting radians to degrees and vice versa. These functions are <code class="inline">rad2deg()</code> and <code class="inline">deg2rad()</code>. So, if you actually want to calculate the value of the tangent of 45 degrees, you will have to write <code class="inline">tan(deg2rad(45))</code>.</p>
<p> It is noteworthy that there is no direct PHP function to calculate the value of <code class="inline">cosec()</code>, <code class="inline">sec()</code>, or <code class="inline">cot()</code>. However, these values are just reciprocals of <code class="inline">sin()</code>, <code class="inline">cos()</code>, and <code class="inline">tan()</code>, so you can still calculate their values indirectly.</p>
<p>You can also do the inverse and calculate the angle at which a trigonometric function has a particular value. These functions are called <code class="inline">asin()</code>, <code class="inline">acos()</code>, and <code class="inline">atan()</code>. One thing you have to remember is that the values of sin and cos can never go beyond the range of -1 to 1 for any angle. This means that <code class="inline">asin()</code> and <code class="inline">acos()</code> can only accept values in the range -1 to 1 as valid arguments. A value outside this range will give you NaN.</p>
<p>Trigonometry has a lot of applications like determining the trajectory of a projectile or the heights and distances of different objects, so having access to these functions is definitely helpful if you are writing code that simulates these situations.</p>
<p>These functions are also very helpful when you want to draw different elements using radial and angular values. Let's say you want to draw a pattern of circles around a larger circle at a uniform distance. If you have read the <a href="https://code.tutsplus.com/tutorials/rendering-text-and-basic-shapes-using-gd--cms-31767" rel="external" target="_blank">PHP GD Shapes tutorial</a> on Envato Tuts+, you probably remember that drawing any shape requires you to pass positions as x, y coordinates, but drawing circular patterns is easier with polar coordinates.</p>
<p>Using these trigonometric functions in this case will help you draw the desired figures using <code class="inline">sin()</code> and <code class="inline">cos()</code> to convert polar coordinates to cartesian form. Here is an example:</p>
<pre class="brush: php noskimlinks noskimwords"><?php
$image = imagecreatetruecolor(800, 600);
$bg = imagecolorallocate($image, 255, 255, 255);
imagefill($image, 0, 0, $bg);
$radius = 80;
$fill_size = (int) (3 * $radius / 4);
$outline_size = (int) (3.5 * $radius / 4);
// Draw 12 circles, 30 degrees apart, around two different centers.
for ($i = 0; $i < 12; $i++) {
    $angle = deg2rad($i * 30);
    // Polar to cartesian: ring of radius 125 centered at (175, 175).
    $x = (int) round(175 + 125 * cos($angle));
    $y = (int) round(175 + 125 * sin($angle));
    $col_ellipse = imagecolorallocate($image, rand(0, 200), rand(0, 200), rand(0, 200));
    imagefilledellipse($image, $x, $y, $fill_size, $fill_size, $col_ellipse);
    imageellipse($image, $x, $y, $outline_size, $outline_size, $col_ellipse);
    // Ring of radius 150 centered at (575, 375).
    $x = (int) round(575 + 150 * cos($angle));
    $y = (int) round(375 + 150 * sin($angle));
    $col_ellipse = imagecolorallocate($image, rand(0, 200), rand(0, 200), rand(0, 200));
    imagefilledellipse($image, $x, $y, $fill_size, $fill_size, $col_ellipse);
    imageellipse($image, $x, $y, $outline_size, $outline_size, $col_ellipse);
}
// Cover the inner part of the first ring with a white disc.
$col_ellipse = imagecolorallocate($image, 255, 255, 255);
imagefilledellipse($image, 175, 175, 275, 275, $col_ellipse);
// Send the finished image to the browser as a PNG.
header('Content-Type: image/png');
imagepng($image);
?></pre>
<p>The following image shows the final result of the above PHP code.</p>
<figure class="post_image"><img alt="Using PHP GD with Trigonometric functions" src="https://cms-assets.tutsplus.com/cdn-cgi/image/width=800/uploads/users/1251/posts/31972/image/ellipse_circles.png" loading="lazy" width="820px" height="620px" class="resized-image resized-image-desktop" srcset="https://cms-assets.tutsplus.com/cdn-cgi/image/width=1600/uploads/users/1251/posts/31972/image/ellipse_circles.png 2x"><img alt="Using PHP GD with Trigonometric functions" src="https://cms-assets.tutsplus.com/cdn-cgi/image/width=630/uploads/users/1251/posts/31972/image/ellipse_circles.png" loading="lazy" width="650px" height="493px" class="resized-image resized-image-tablet" srcset="https://cms-assets.tutsplus.com/cdn-cgi/image/width=1260/uploads/users/1251/posts/31972/image/ellipse_circles.png 2x"><img alt="Using PHP GD with Trigonometric functions" src="https://cms-assets.tutsplus.com/cdn-cgi/image/width=360/uploads/users/1251/posts/31972/image/ellipse_circles.png" loading="lazy" width="380px" height="290px" class="resized-image resized-image-mobile" srcset="https://cms-assets.tutsplus.com/cdn-cgi/image/width=720/uploads/users/1251/posts/31972/image/ellipse_circles.png 2x"></figure>
<h2>Exponential and Logarithmic Functions</h2>
<p>PHP also has some exponential and logarithmic functions. The <code class="inline">exp($value)</code> function will return the constant <strong>e</strong> raised to the power of float <code class="inline">$value</code>. Similarly, you can calculate the logarithm of a given number to any base using the <code class="inline">log($arg, $base)</code> function. If the <code class="inline">$base</code> is omitted, the logarithm will be calculated using the natural base <strong>e</strong>. If you want to calculate the logarithm of a number to base 10, you can simply use the function <code class="inline">log10($arg)</code>.</p>
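<p>As a small illustration of how these functions relate to each other, the sketch below rounds each result to four decimal places so that tiny floating-point differences don't obscure the values.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
// exp(1) is the constant e itself.
echo round(exp(1), 4)."\n";      // 2.7183
// log() with no base is the natural logarithm, so it undoes exp().
echo round(log(exp(2)), 4)."\n"; // 2
// The second argument selects the base: 2 to the power 3 is 8.
echo round(log(8, 2), 4)."\n";   // 3
// log10() also works for values smaller than 1.
echo round(log10(0.01), 4)."\n"; // -2
?></pre>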
<p>One more function that you might find useful is <code class="inline">pow($base, $exp)</code>, which returns <code class="inline">$base</code> raised to the power of <code class="inline">$exp</code>. Some of you might prefer to use the <code class="inline">**</code> operator. The expression <code class="inline">$a**$b</code> will give the same result as <code class="inline">pow($a, $b)</code>. However, you have to be careful about operator precedence when the base is a negative literal. The <code class="inline">**</code> operator binds more tightly than the unary minus, so <code class="inline">-1**0.5</code> is evaluated as <code class="inline">-(1**0.5)</code> and gives you <strong>-1</strong>, which is probably not what you intended. Writing <code class="inline">(-1)**0.5</code> or <code class="inline">pow(-1, 0.5)</code> gives the expected value, NaN.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
echo log(1000, M_E); // 6.9077552789821
echo log(1000); // 6.9077552789821
echo log(1000, 10); // 3
echo log10(1000); // 3
echo pow(121, -121); // 9.6154627930786E-253
echo pow(121, 121); // 1.0399915443694E+252
echo pow(121, 1331); // INF
?></pre>
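<p>The following sketch shows how a negative base behaves with the <code class="inline">**</code> operator and with <code class="inline">pow()</code>. It uses <code class="inline">var_dump()</code> so that the type of each result is visible.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
// ** is evaluated before the unary minus, so this is -(1 ** 0.5).
var_dump(-1 ** 0.5);    // float(-1)
// With parentheses, the base really is -1 and the result is NaN.
var_dump((-1) ** 0.5);  // float(NAN)
// pow() receives -1 as its first argument, so it matches the line above.
var_dump(pow(-1, 0.5)); // float(NAN)
?></pre>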
<h2>Other Useful Mathematical Functions</h2>
<h3>Rounding Numbers</h3>
<p>There are a lot of other important mathematical functions as well. You can round fractions or decimal numbers up to the nearest integer using the <code class="inline">ceil(float $value)</code> function. This will convert both 2.1 and 2.9 to 3. Similarly, you can round fractions or decimal numbers down to the nearest integer using the <code class="inline">floor(float $value)</code> function. It will change both 2.1 and 2.9 to 2.</p>
<p>These functions are handy when the result of a calculation has to be a whole number. Let's say you need to know how many people a hall can accommodate based on its area. Your final answer after the division will most probably be a decimal number, but you can't divide people into fractions, so the right answer is the floor of the calculated value.</p>
<p>You will often want to round a number to the nearest integer, whether that is up or down. For example, you might want to change 2.1 to 2 but 2.9 to 3. This can be done easily using the <code class="inline">round($value, $precision, $mode)</code> function. The <code class="inline">$precision</code> parameter determines the number of decimal places to round to. The default value of 0 rounds to a whole number, although the returned value is still a float. The third <code class="inline">$mode</code> parameter is used to determine what happens if the number you want to round lies exactly in the middle. You can use it to specify if 3.5 should be changed to 3 or 4.</p>
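<p>Here is a brief sketch of <code class="inline">round()</code> with different precision and mode values. It relies on the built-in <code class="inline">PHP_ROUND_HALF_*</code> constants for the third parameter.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
echo round(2.1)."\n";        // 2
echo round(2.9)."\n";        // 3
echo round(3.14159, 2)."\n"; // 3.14
// Halfway cases: the default mode rounds away from zero.
echo round(3.5)."\n";                         // 4
echo round(3.5, 0, PHP_ROUND_HALF_DOWN)."\n"; // 3
// Round halfway cases to the nearest even or odd value instead.
echo round(2.5, 0, PHP_ROUND_HALF_EVEN)."\n"; // 2
echo round(2.5, 0, PHP_ROUND_HALF_ODD)."\n";  // 3
?></pre>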
<h3>Minimum and Maximum</h3>
<p>PHP also has two functions called <code class="inline">min($values)</code> and <code class="inline">max($values)</code> to help you determine the lowest and highest values in a set or array of numbers. These functions can accept different kinds of parameters, such as two arrays or an array and a string. You should take a look at the <a href="https://php.net/manual/en/function.min.php" rel="external" target="_blank">documentation</a> to see how such values are compared.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
$hall_width = 120;
$hall_length = 180;
$per_person_area = 35;
$hall_capacity = floor($hall_width*$hall_length/$per_person_area);
// Output: The hall can only accommodate 617 people.
echo 'The hall can only accommodate '.$hall_capacity.' people.';
$water_tank_volume = 548733;
$bucket_volume = 305;
$buckets_needed = ceil($water_tank_volume/$bucket_volume);
// Output: We will need 1800 buckets of water to completely fill the tank.
echo 'We will need '.$buckets_needed.' buckets of water to completely fill the tank.';
$marks = [49, 58, 93, 12, 30];
// Output: Minimum and maximum obtained marks in the exam are 12 and 93 respectively.
echo 'Minimum and maximum obtained marks in the exam are '.min($marks).' and '.max($marks).' respectively.';
?></pre>
<h3>Integer Division</h3>
<p>You can also perform integer division in PHP using the <code class="inline">intdiv($dividend, $divisor)</code> function. In this case, only the integral part of the quotient is returned after a division. Similarly, you can get the floating-point remainder after the division of two arguments using the <code class="inline">fmod($dividend, $divisor)</code> function. The returned value has the same sign as the <code class="inline">$dividend</code> and will always be less than the <code class="inline">$divisor</code> in magnitude.</p>
<p>There are some other useful functions like <code class="inline">is_nan($value)</code>, <code class="inline">is_finite($value)</code> and <code class="inline">is_infinite($value)</code> which can be used to determine if the value is a number and, if it is a number, whether it is finite or infinite. Remember that PHP considers any value that is too big to fit in a float to be infinite. So <code class="inline">is_finite()</code> will return <code class="inline">true</code> for 100 factorial but <code class="inline">false</code> for 1000 factorial.</p>
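<p>A quick sketch of both functions, using <code class="inline">var_dump()</code> so that the return types are visible:</p>
<pre class="brush: php noskimlinks noskimwords"><?php
// intdiv() discards the fractional part and returns an integer.
var_dump(intdiv(17, 5));  // int(3)
var_dump(intdiv(-17, 5)); // int(-3)
// fmod() works with floats and keeps the sign of the dividend.
var_dump(fmod(17.5, 5));  // float(2.5)
var_dump(fmod(-17.5, 5)); // float(-2.5)
?></pre>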
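<p>The factorial claim can be checked with a sketch like the following. PHP has no built-in factorial function, so <code class="inline">factorial_float()</code> is a hypothetical helper written for this example. It multiplies floats so that very large results overflow to <code class="inline">INF</code>.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
// Hypothetical helper: computes n! as a float, so results that are
// too large for a float become INF instead of raising an error.
function factorial_float(int $n): float {
    $result = 1.0;
    for ($i = 2; $i <= $n; $i++) {
        $result *= $i;
    }
    return $result;
}
var_dump(is_finite(factorial_float(100)));    // bool(true)
var_dump(is_finite(factorial_float(1000)));   // bool(false)
var_dump(is_infinite(factorial_float(1000))); // bool(true)
// The square root of a negative number is not a real number.
var_dump(is_nan(sqrt(-1)));                   // bool(true)
?></pre>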
<h2>Generating Random Numbers in PHP</h2>
<p>Random numbers prove to be quite useful in a number of situations. You can use them to generate "random" data for your application or to spawn enemy instances in games, etc. It is very important to remember that none of the functions we discuss in this section generate cryptographically secure random numbers. These functions are only meant to be used in situations where security is not an issue, like showing random greeting texts to repeat visitors or for generating random statistical data.</p>
<p>The functions <code class="inline">rand($min, $max)</code> and <code class="inline">mt_rand($min, $max)</code> generate random integers between the given values, including the <code class="inline">$min</code> and <code class="inline">$max</code> values themselves. When the functions are called without any parameters, they generate random numbers between 0 and <code class="inline">getrandmax()</code> or <code class="inline">mt_getrandmax()</code> respectively. You can <code class="inline">echo</code> these values to see the maximum possible random number that the functions can generate on your platform.</p>
<p>The function <code class="inline">mt_rand()</code> was introduced as a replacement for the older <code class="inline">rand()</code> that is about four times faster. In PHP 7, it returns false if <code class="inline">$max</code> is less than <code class="inline">$min</code>, while PHP 8.0 and later throw a <code class="inline">ValueError</code> instead. Starting from PHP 7.1.0, <code class="inline">rand()</code> has actually been made an alias of <code class="inline">mt_rand()</code>. The only difference is that <code class="inline">rand()</code> still doesn't give an error if <code class="inline">$max</code> is less than <code class="inline">$min</code>, to maintain backward compatibility.</p>
<p>Here is a loop that generates a million random values between 0 and 100. Since there are 101 possible values, each one should appear roughly 9,900 times. As you can see, the values 0, 50 and 100 are each generated close to 10,000 times, with slight fluctuations.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
$rand_values = [];
$sum = 0;
$count = 1000000;
for($i = 0; $i < $count; $i++) {
$rand_values[$i] = mt_rand(0, 100);
$sum += $rand_values[$i];
}
$count_frequency = array_count_values($rand_values);
// Output: 100 was randomly generated 9969 times.
echo '100 was randomly generated '.$count_frequency[100].' times.';
// Output: 50 was randomly generated: 9994 times.
echo '50 was randomly generated: '.$count_frequency[50].' times.';
// Output: 0 was randomly generated: 10010 times.
echo '0 was randomly generated: '.$count_frequency[0].' times.';
// Output: Average of random values: 49.97295
echo 'Average of random values: '.($sum/$count);
?></pre>
<p>Both these functions also have their own seeder functions, called <code class="inline">srand()</code> and <code class="inline">mt_srand()</code>, to provide a seed for the random number generators. Keep in mind that you should call <code class="inline">srand()</code> or <code class="inline">mt_srand()</code> only once in your program. Reseeding with the same value before every call to <code class="inline">rand()</code> or <code class="inline">mt_rand()</code> will give you the same "random" number every time.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
srand(215);
echo rand()."\n"; // 330544099
srand(215);
echo rand()."\n"; // 330544099
srand(215);
echo rand()."\n"; // 330544099
echo rand()."\n"; // 138190029
echo rand()."\n"; // 1051333090
echo rand()."\n"; // 1219572487
?></pre>
<h2>Final Thoughts</h2>
<p>PHP comes with a lot of <a href="http://www.php.net/manual/en/ref.math.php" rel="external" target="_blank">built-in functions</a> that should meet all your day-to-day computation needs. You can also combine these functions to implement slightly more complicated calculations like GCD, LCM and factorials yourself.</p>
<p>There are just a couple of things you should remember when using these functions. For example, the value returned by functions like <code class="inline">floor()</code> and <code class="inline">ceil()</code> is a whole number, but its type is still float. Similarly, all trigonometric functions expect their angles to be given in radians. You will get incorrect results if you pass them an angle value that you wanted to be treated as a degree measure. So make sure you check the return value and expected arguments of all these functions in the <a href="http://php.net/manual/en/function.floor.php" rel="external" target="_blank">documentation</a>.</p>
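<p>As a rough illustration, here is one way such helpers could be written. The function names <code class="inline">gcd()</code>, <code class="inline">lcm()</code>, and <code class="inline">factorial()</code> are hypothetical and defined only for this sketch. The factorial helper assumes a 64-bit platform, where 20! is the largest factorial that fits in an integer.</p>
<pre class="brush: php noskimlinks noskimwords"><?php
// Hypothetical helper: greatest common divisor using Euclid's algorithm.
function gcd(int $a, int $b): int {
    $a = abs($a);
    $b = abs($b);
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}
// Hypothetical helper: least common multiple, built on top of gcd().
function lcm(int $a, int $b): int {
    if ($a === 0 || $b === 0) {
        return 0;
    }
    // Divide before multiplying to keep the intermediate value small.
    return intdiv(abs($a), gcd($a, $b)) * abs($b);
}
// Hypothetical helper: integer factorial, limited to 0 through 20
// (assuming 64-bit integers) so that the result never overflows.
function factorial(int $n): int {
    if ($n < 0 || $n > 20) {
        throw new InvalidArgumentException('Expected a value from 0 to 20.');
    }
    $result = 1;
    for ($i = 2; $i <= $n; $i++) {
        $result *= $i;
    }
    return $result;
}
echo gcd(48, 36)."\n";   // 12
echo lcm(4, 6)."\n";     // 12
echo factorial(10)."\n"; // 3628800
?></pre>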
</div>2018-10-12 12:06:59 UTC2018-10-12 12:06:59 UTCMonty Shokeen