**Introduction**

Probability and statistics are core branches of mathematics. They deal with random events and with the collection, interpretation, display, and analysis of numerical data. The origin of probability can be traced to the study of gambling and insurance in the 17^{th} century. Today it is an indispensable subject and an important tool in both the natural and the social sciences. Statistics, on the other hand, traces its origin to the development of the census in ancient times. However, much of its development as a scientific discipline took place in the 19^{th} century, in the study of economies, populations, and moral actions. Today, statistics is used as a mathematical tool for data analysis. The purpose of this study is to trace the history of probability and statistics and the changes that produced the discipline as it is studied today.

**The Early Development of Probability**

In this part, the study of probability is traced from its founders, Pascal and Fermat, who exchanged letters on the subject in 1654, to the 19^{th} century, when Laplace developed his theory of probability. The discussion covers the meaning and the applications of mathematical probability theory as they evolved over the years.

The credit for founding mathematical probability goes to Blaise Pascal and Pierre de Fermat. They solved the problem of points, the puzzle of dividing the stakes equitably when a game is stopped before either player has enough points to win. The problem had been discussed in earlier centuries, and the solutions they offered in 1654 are still considered correct today. They also settled other questions about fair odds in games of chance. Their ideas became popular after Christiaan Huygens published them in *De ratiociniis in ludo aleae* in 1657. In the following century, other scholars, such as James Bernoulli, Abraham De Moivre, and Pierre Rémond de Montmort, developed mathematical tools powerful enough to calculate the odds in complicated games. Thomas Simpson and De Moivre also used the theory to compute fair prices for annuities and insurance policies. Bernoulli’s *Ars conjectandi*, published in 1713, laid the philosophical foundation for broader applications. He brought the philosophical idea of probability into the mathematical theory and devised rules for combining the probabilities of arguments. He also proved his theorem, which states that it is morally certain that the frequency with which an event occurs in a large number of trials approximates its probability.

Bernoulli advanced this theorem, which Poisson later named the ‘law of large numbers’, as a justification for using observed frequencies as probabilities, which were then to be combined by his rules to settle practical questions. His ideas attracted mathematical and philosophical attention. However, gambling, whether in games or in lives, remained the foremost source of new ideas in probability theory in the first half of the 18^{th} century. In the second half of the 18^{th} century, new ideas developed in many fields. Experts in geodesy and astronomy devised methods for reconciling observations, while students of probability theory applied probability to explain those observations. These developments inspired Pierre-Simon Laplace’s invention of the method of inverse probability and his work on the central limit theorem, which built on Bernoulli’s law of large numbers. The process culminated in Adrien-Marie Legendre’s publication of the method of least squares in 1805 and in the probabilistic rationales for least squares developed by Carl Friedrich Gauss and Laplace between 1809 and 1811. These ideas were brought together in Laplace’s treatise on probability, *Théorie analytique des probabilités*, published in 1812.

**Games of Chance**

Games of chance were an area of interest in ancient times, as they are today. Many medieval authors, both mathematicians and mystics, enumerated the ways in which various games could come out. However, Kendall (1956) and David (1962) noted that these enumerations did not list equally likely cases, so they could not be used to calculate odds. Cardano, in *Liber de ludo aleae*, written around 1560, formulated the principle that the stakes in an equitable wager must be in proportion to the number of ways in which each player can win. He also applied the principle to find fair odds for wagering with dice. Galileo was another precursor; he enumerated the possible outcomes of a throw of three dice to show that the faces add up to ten more readily, and hence more frequently, than they add up to nine. However, neither of these works was published until after Huygens’s treatise had appeared.
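Galileo’s comparison can be checked by direct enumeration. The following is a minimal Python sketch, not a reconstruction of his own reasoning; the function name `ways_to_sum` is hypothetical and chosen only for illustration. It lists all 216 equally likely ordered outcomes of three fair dice and counts how many add up to a given total.

```python
from itertools import product


def ways_to_sum(target, dice=3, faces=6):
    """Count the ordered outcomes of `dice` fair dice whose faces add up to `target`."""
    outcomes = product(range(1, faces + 1), repeat=dice)
    return sum(1 for roll in outcomes if sum(roll) == target)


total = 6 ** 3  # number of equally likely ordered outcomes
print(ways_to_sum(9), "of", total)   # ways to throw nine
print(ways_to_sum(10), "of", total)  # ways to throw ten
```

Counting ordered outcomes gives 25 ways to throw nine and 27 ways to throw ten, which is why ten appears slightly more often.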

Galileo did not solve the problem of points, although it was the most salient mathematical challenge of the time. Italian mathematicians such as Pacioli, Cardano, Tartaglia, and Peverone also discussed the problem in the late 15^{th} and 16^{th} centuries, but they could not solve it. Tartaglia concluded that it was unsolvable, arguing that the problem was judicial rather than mathematical, so that whatever division was made would give cause for litigation. Fermat and Pascal, by contrast, each developed a solution and supported it with a convincing argument.

Fermat solved the problem by listing the courses that further play might take. Suppose that Peter and Paul have staked equal amounts of money on being the first to win three points, and that the game stops when Peter lacks two points and Paul lacks only one. Paul should get more than Peter because he has more points, but how much more? If two more games were played, there would be four possibilities. First, Paul wins both the first and the second game. Second, Paul wins the first and Peter the second. Third, Peter wins the first and Paul the second. Fourth, Peter wins both. In the first three cases Paul wins the match, and in the fourth case Peter wins it. By Cardano’s principle, the stakes should be divided in the same proportion, three parts for Paul and one for Peter, so Paul receives three-fourths of the whole stake.
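Fermat’s enumeration can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Fermat’s own notation: the function name `fermat_share` and its parameters are hypothetical, and each game is assumed to be equally likely to go to either player.

```python
from fractions import Fraction
from itertools import product


def fermat_share(paul_lacks, peter_lacks):
    """Paul's fair share of the stakes, found by listing every continuation.

    At most paul_lacks + peter_lacks - 1 further games settle the match, so
    every sequence of that length is listed, even when the match would
    already be decided before the last game is played.
    """
    games = paul_lacks + peter_lacks - 1
    sequences = list(product(("Paul", "Peter"), repeat=games))
    paul_wins = sum(1 for seq in sequences if seq.count("Paul") >= paul_lacks)
    return Fraction(paul_wins, len(sequences))


print(fermat_share(1, 2))  # Paul lacks one point, Peter lacks two
```

For the case in the text, three of the four listed sequences favor Paul, so the function returns 3/4. Listing all sequences of the same length is what keeps the cases equally likely, which is the condition Cardano’s principle requires.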

Pascal, on the other hand, preferred the method of expectation, which relies on the concept of equity rather than on Cardano’s principle. His method enabled him to solve the problem even when the players lack so many points that listing all the possibilities would be impractical. Pascal’s argument rests on the premise that, because the players have equal chances of winning the next point, they have an equal right to expect to win it. If Paul wins it, he wins the whole stake; this entitles him to half of the whole stake. If Peter wins it, the players are tied, and each is entitled to half of the remaining half. Hence, Paul is entitled to three-fourths of the entire stake. By extending this backward reasoning through mathematical induction, Pascal solved the problem for any number of points that either player might lack. In doing so he relied on the recursive properties of the arithmetic triangle, now known as Pascal’s triangle. For instance, if Paul lacks two points and Peter lacks four, their shares are found by adding the numbers in the base of the triangle. Hence, the ratio of Paul’s share to Peter’s share is (1+5+10+10) to (5+1), that is, 26 to 6, or 13 to 3.
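Pascal’s backward reasoning translates naturally into a recursion. The sketch below is a modern illustration, not Pascal’s own formulation; the function names `pascal_share` and `triangle_base_counts` are hypothetical. The first function averages the shares in the two positions that can follow the next point, and the second reads the same answer off the base of the arithmetic triangle.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def pascal_share(paul_lacks, peter_lacks):
    """Paul's fair share, by backward reasoning from the next point."""
    if paul_lacks == 0:
        return Fraction(1)  # Paul has already won the whole stake
    if peter_lacks == 0:
        return Fraction(0)  # Peter has already won the whole stake
    # Each player is equally likely to win the next point, so the share is
    # the average of the shares in the two positions that can follow.
    return (pascal_share(paul_lacks - 1, peter_lacks)
            + pascal_share(paul_lacks, peter_lacks - 1)) / 2


def triangle_base_counts(paul_lacks, peter_lacks):
    """Split the base of the arithmetic triangle into Paul's and Peter's parts."""
    n = paul_lacks + peter_lacks - 1           # games that settle the match
    base = [comb(n, k) for k in range(n + 1)]  # 1 5 10 10 5 1 when n = 5
    # Entry k counts the sequences in which Paul wins exactly k of the n games;
    # Paul takes the match whenever k is at least the number of points he lacks.
    return sum(base[paul_lacks:]), sum(base[:paul_lacks])


print(pascal_share(1, 2))          # the case solved by Fermat's listing
print(pascal_share(2, 4))          # Paul lacks two points, Peter lacks four
print(triangle_base_counts(2, 4))  # the two sums taken from the base
```

The recursion never lists individual sequences, which is why it remains practical when many points are lacking. For the example in the text it gives 13/16 for Paul, in agreement with the ratio 26 to 6.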

**Pascal’s arithmetic triangle**

1 1 1 1 1 1

1 2 3 4 5

1 3 6 10

1 4 10

1 5

1

It is worth noting that the combinatorial knowledge underlying Pascal’s reasoning was richer, but less streamlined, than the combinatorics now taught in schools as a foundation for probability. According to Edwards (1987), Pascal organized this knowledge and integrated it with his solutions to these problems. His work was published in 1665 as *Traité du triangle arithmétique*, three years after his death.

**The Weighing of Evidence and Opinion**

Probability, in the sense of the weighing of evidence and opinion, was an important topic in the 17^{th} century. However, it was not discussed in the letters between Fermat and Pascal or in Huygens’s treatise. In fact, the word probability did not appear in these scholars’ work; in their analysis, the number between zero and one that we now call probability was regarded only as the proportion of the stakes due to a player. It was not isolated as a measure of belief. Nevertheless, the new ideas were soon applied to problems of probability. Pascal used them in his argument for believing in God, and they were taken up in the Port-Royal Logic. In this way he used the theory to weigh the possibilities in his own life.

James Bernoulli established the connection between Huygens’s theory and probability in *Ars conjectandi*, which was published posthumously in 1713. He argued that a probability is a part of complete certainty, just as a player’s share is a part of the total stake in a game of chance. If complete certainty is taken as a full unit, the probability of an event is a number between zero and one. Following Huygens’s rules, the probability is given by the ratio of the number of favorable cases to the total number of cases.

James Bernoulli brought the broader traditional ideas of probability into the mathematical theory by formulating rules for combining the probabilities of arguments, similar to Hooper’s rules. He hoped that these rules would make probability a widely used practical tool: probabilities would be derived from observed frequencies and then combined to reach business, personal, and judicial decisions. His theorem supported this program by showing that it is morally certain that the frequency of an event in a large number of trials approximates its probability. Although Bernoulli proved his theorem rigorously, he offered only a disappointingly large bound on the number of trials required for moral certainty that the frequency would lie in the vicinity of the probability.
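Bernoulli’s theorem can be illustrated numerically. The sketch below is a modern illustration rather than Bernoulli’s own calculation: the function name `prob_within`, the probability 3/5, and the tolerance 1/50 are assumptions chosen for the example. It computes the exact binomial probability that the observed frequency falls within the tolerance of the true probability, for increasing numbers of trials.

```python
from fractions import Fraction
from math import ceil, exp, floor, lgamma, log


def prob_within(n, p, eps):
    """Exact probability that the frequency k/n lies within eps of p in n trials."""
    lo = max(0, ceil(n * (p - eps)))   # smallest count k with k/n >= p - eps
    hi = min(n, floor(n * (p + eps)))  # largest count k with k/n <= p + eps
    log_p, log_q = log(p), log(1 - p)
    total = 0.0
    for k in range(lo, hi + 1):
        # Binomial probability of exactly k successes, computed in logarithms
        # so that the large binomial coefficients do not overflow.
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log_p + (n - k) * log_q)
        total += exp(log_pmf)
    return total


p, eps = Fraction(3, 5), Fraction(1, 50)  # illustrative values only
for n in (100, 1000, 5000):
    print(n, round(prob_within(n, p, eps), 4))
```

The probability rises toward one as the number of trials grows, which is the sense in which the observed frequency becomes a morally certain guide to the probability. Exact fractions are used for the bounds so that counts lying exactly on the boundary are not lost to rounding.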

Nicholas Bernoulli and De Moivre improved on James Bernoulli’s result. In 1733, De Moivre accurately estimated the number of trials required by using a series expansion of what is now known as the normal density. Their improvements are regarded as a precursor of today’s theory of confidence intervals: De Moivre’s approximation allows us to state the degree of confidence, given an observed frequency, that the true probability lies within certain bounds.
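The quality of the normal approximation can be checked against an exact binomial calculation. The following sketch uses the modern textbook form of the approximation, with a continuity correction, and not De Moivre’s series expansion; the function names and the values of `n`, `p`, and `d` are assumptions made for the example.

```python
from math import comb, erf, sqrt


def exact_prob(n, p, d):
    """Exact binomial probability that the count lies within d of n*p."""
    mean = n * p
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if abs(k - mean) <= d)


def normal_prob(n, p, d):
    """Normal approximation to the same probability, with a continuity correction."""
    sigma = sqrt(n * p * (1 - p))
    z = (d + 0.5) / sigma
    return erf(z / sqrt(2))  # equals 2*Phi(z) - 1 for the standard normal


n, p, d = 1000, 0.5, 30  # illustrative values only
print(exact_prob(n, p, d), normal_prob(n, p, d))
```

For these values the two results agree closely, which shows how the normal density turns a long binomial sum into a single evaluation.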

Surprisingly, Bernoulli’s rules for combining probabilities did not achieve the goal of making probability a tool for judicial and everyday affairs. They were analyzed and discussed in books, but they were not applied in practice. However, his program had a huge philosophical impact, placing probability at the core of the mathematical theory. His theorem was also debated in the context of speculation about the ratio of boys to girls among births.

**The Combination of Observations**

Although probability had a theoretical place in mathematics by 1750, its application was still limited to questions of equity. Probability was not yet used in data analysis, not even in life insurance and annuities. Simpson used mortality statistics and De Moivre used theoretical mortality curves, but no one knew how to use probabilistic methods to fit models to data as modern demographers do. Probability and data analysis were brought together by work on the combination of observations in geodesy and astronomy. In the 18^{th} century, combining observations meant reconciling inconsistent equations, in which the observations supplied the numbers that served as coefficients of the unknown quantities in linear equations.
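Reconciling inconsistent equations can be illustrated with the method of least squares, which this essay credits to Legendre (1805). The sketch below is a modern illustration, not a reconstruction of any historical procedure; the function name `fit_line` and the four observations are invented for the example. Four observations give four equations of the form a + b·x = y in two unknowns, which cannot all hold at once, and least squares picks the values of a and b that minimize the sum of the squared discrepancies.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the inconsistent equations a + b*x = y."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Solve the two normal equations:
    #   n*a  + sx*b  = sy
    #   sx*a + sxx*b = sxy
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


# Hypothetical observations: no single line passes through all four points.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))
```

The four equations are replaced by two normal equations that always have a solution when the x values are not all equal, so the inconsistency among the observations is absorbed into the residuals.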

**Laplace’s Synthesis**

The combination of observations brought into probability theory the central idea of mathematical statistics: analyzing data by fitting models to it. It also established the two main methods for such fitting, linear estimation and Bayesian analysis. Laplace’s treatise on probability was published in 1812, with further editions in 1814 and 1820. The picture of probability theory it presented was entirely different from the one that existed in 1750. On the philosophical side, Laplace interpreted probability as rational belief, with inverse probability as its foundation. On the mathematical side, the treatise offered the method of generating functions, techniques for evaluating posterior probabilities, and a central limit theorem. On the applied side, games of chance were still present, but they were overshadowed by problems of data analysis and by Bayesian methods for combining the probabilities of judgments. According to Shafer (1978) and Zabell (1988), these Bayesian methods replaced the non-Bayesian techniques of Bernoulli and Hooper.

For almost a generation, Laplace’s views dominated probability. In time they gave way to other perspectives that brought a different understanding of the subject. To the empiricists of the 19^{th} century, the error distributions at the core of his approach appeared to have a frequency interpretation that was incompatible with his philosophy, and his own rationale for least squares, based on the central limit theorem, seemed to make his concept of inverse probability superfluous. By the late 19^{th} and early 20^{th} centuries, the empiricist view was dominant, and frequency was adopted as the real foundation of probability. Laplace’s synthesis was powerful in mathematical probability and focused on the direct application of probability. Today, probability is a branch of pure mathematics and plays a major role in mathematical and applied statistics and in many of the sciences.

**Statistics**

Statistical analysis is an intriguing discipline, especially when it is first introduced in high school mathematics. Statistics is usually presented as new and modern, so learners are often surprised to discover that the discipline can be traced back to antiquity. In addition, probability helped to clarify the ambiguities of the subject before it became a mathematical theory in the early 16^{th} century. The word statistics comes from the Italian word “statista”, which derives from the Latin “status”, meaning the state. The word ‘statistic’ was not used until the 18^{th} century, when Achenwall introduced it. The Chinese recorded statistical facts relating to births, taxes, population, and wealth as early as 2300 BC, while the Egyptians used such facts in 3050 BC when building the pyramids (Stringfellow and Stringfellow 1).

**Works Cited**

Edwards, A.W.F. 1987. *Pascal’s Arithmetic Triangle*. London: Griffin Publishers. Print.

David, F.N. 1962. *Games, Gods, and Gambling*. London: Griffin Publishers. Print.

Kendall, Maurice G. 1956. ‘The beginnings of a probability calculus.’ *Biometrika* 43, 1-14. Reprinted in Pearson and Kendall (1970).

Shafer, Glenn. 1978. ‘Non-additive probabilities in the work of Bernoulli and Lambert.’ *Archive for History of Exact Sciences* 19, 309-370.

Zabell, Sandy L. 1988. ‘The probabilistic analysis of testimony.’ *Journal of Statistical Planning and Inference* 20, 327-354.

Stringfellow, Thomas, and Emma L. Stringfellow. 2010. ‘A Brief History of Statistics: A Contribution for the Enrichment of High School Mathematics.’ *School Science and Mathematics* 61 (1), 1–4.