The Probability/Statistics Object Library

Author(s):

Kyle Siegrist

Java, like most modern programming languages, follows an object-oriented paradigm, as opposed to the procedural paradigm of older languages. In object-oriented programming (OOP), the basic programmatic elements are classes of objects that are defined by their properties and methods. A class of objects can be sub-classed by modifying its properties and methods or by adding new ones. An object can be passed, as a parameter, to another object.

The object-oriented paradigm, if not the particular terms of its jargon, should be clear to any teacher of mathematics, for it is the same paradigm found in abstract mathematical structures. My thesis in this section is that object-oriented programming can be pedagogically valuable to students of mathematics, particularly when the programming is centered on mathematical objects. I will use examples from the Probability/Statistics Object Library (PSOL) to make the case for this thesis.

An abstract probability distribution on a set S of real numbers is implemented in the PSOL as an abstract Java class. The probability mass function (or probability density function) f, and the domain S, are left unspecified, but then other quantities of interest (cumulative distribution function, quantile function, mean, variance, simulated value, etc.) can be computed from this function. These computations form the methods of the object and are simply the Java implementations of standard definitions in probability theory. For example, the mean of a discrete distribution on a countable set S with probability mass function f is given by

(1) $\mu = \sum_{x \in S} x \, f(x)$.
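To make the idea concrete, here is a minimal sketch of how such an abstract class might look. It is not the PSOL source: the names `DistributionSketch`, `FairDieSketch`, `minValue`, `maxValue`, and `density` are hypothetical, and the domain S is assumed to be a finite range of integers so that the sums can be written as simple loops. The point is only that the generic `mean` method is a direct transcription of definition (1), the sum of x f(x) over S.

```java
// Hypothetical sketch; names are illustrative and are NOT the PSOL API.
// Assumption: the domain S is a finite range of integers.
abstract class DistributionSketch {
    // Endpoints of the domain S = {minValue, ..., maxValue}; left unspecified here.
    abstract int minValue();
    abstract int maxValue();

    // Probability mass function f; left unspecified here.
    abstract double density(int x);

    // Generic mean, computed directly from definition (1): sum of x * f(x) over S.
    double mean() {
        double sum = 0.0;
        for (int x = minValue(); x <= maxValue(); x++) {
            sum += x * density(x);
        }
        return sum;
    }

    // Generic variance: sum of (x - mean)^2 * f(x) over S.
    double variance() {
        double mu = mean();
        double sum = 0.0;
        for (int x = minValue(); x <= maxValue(); x++) {
            double d = x - mu;
            sum += d * d * density(x);
        }
        return sum;
    }

    // Generic cumulative distribution function: sum of f(t) over t <= x.
    double cdf(int x) {
        double sum = 0.0;
        for (int t = minValue(); t <= Math.min(x, maxValue()); t++) {
            sum += density(t);
        }
        return sum;
    }
}

// A concrete distribution only has to supply S and f: here, one fair die.
class FairDieSketch extends DistributionSketch {
    @Override int minValue() { return 1; }
    @Override int maxValue() { return 6; }
    @Override double density(int x) {
        return (x >= 1 && x <= 6) ? 1.0 / 6.0 : 0.0;
    }
}

class GenericMeanDemo {
    public static void main(String[] args) {
        DistributionSketch die = new FairDieSketch();
        System.out.println("mean     = " + die.mean());
        System.out.println("variance = " + die.variance());
        System.out.println("cdf(3)   = " + die.cdf(3));
    }
}
```

The fair die inherits `mean`, `variance`, and `cdf` without writing any of them, which mirrors the way a theorem about an abstract structure applies to every instance of that structure.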

On the other hand, the binomial distribution with parameters n and p governs the number of successes in n Bernoulli trials with success probability p on each trial. For example, the number of aces in the Dice Experiment has this distribution, where n is the number of dice and p is the probability of rolling an ace with a single die. The binomial distribution is implemented in the PSOL as a subclass of the abstract distribution class, by specifying the set of possible values S = {0, 1, ..., n} and the probability mass function f:

(2) $f(k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k \in \{0, 1, \ldots, n\}$.

Many of the generic methods of the abstract distribution class are then overridden (replaced) in the binomial distribution class with the appropriate special closed formulas. For example, as every student of probability knows, the mean of the binomial distribution is simply

(3) $\mu = n p$.

Thus, the method for computing the mean in the abstract distribution class, which implements (1), is overridden in the binomial distribution class by a method that implements (3).
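The overriding described above can be sketched as follows. Again, this is an illustration under stated assumptions rather than the PSOL code: `DistributionSketch` is a trimmed-down, hypothetical abstract class repeated here so the block stands alone, and `BinomialSketch` and `meanBySummation` are invented names. The subclass supplies S and the binomial probability mass function, then replaces the generic summation for the mean with the closed form np.

```java
// Hypothetical sketch; names are illustrative and are NOT the PSOL API.
abstract class DistributionSketch {
    abstract int minValue();
    abstract int maxValue();
    abstract double density(int x);

    // Generic mean from definition (1): sum of x * f(x) over S.
    double mean() {
        double sum = 0.0;
        for (int x = minValue(); x <= maxValue(); x++) {
            sum += x * density(x);
        }
        return sum;
    }
}

class BinomialSketch extends DistributionSketch {
    private final int n;     // number of trials
    private final double p;  // success probability on each trial

    BinomialSketch(int n, double p) {
        if (n < 0 || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need n >= 0 and 0 <= p <= 1");
        }
        this.n = n;
        this.p = p;
    }

    // The set of possible values S = {0, 1, ..., n}.
    @Override int minValue() { return 0; }
    @Override int maxValue() { return n; }

    // Binomial probability mass function (2).
    @Override double density(int k) {
        if (k < 0 || k > n) return 0.0;
        return choose(n, k) * Math.pow(p, k) * Math.pow(1.0 - p, n - k);
    }

    // Closed form (3) overrides the generic summation.
    @Override double mean() { return n * p; }

    // The inherited summation, kept reachable only so the two can be compared.
    double meanBySummation() { return super.mean(); }

    // Binomial coefficient as a product, in floating point to avoid overflow.
    private static double choose(int n, int k) {
        double c = 1.0;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i;
        }
        return c;
    }
}

class BinomialMeanDemo {
    public static void main(String[] args) {
        // Number of aces when 5 fair dice are rolled: n = 5, p = 1/6.
        BinomialSketch aces = new BinomialSketch(5, 1.0 / 6.0);
        System.out.println("closed form (3): " + aces.mean());
        System.out.println("summation (1):   " + aces.meanBySummation());
    }
}
```

The two printed values should agree up to floating-point rounding, which is itself a small numerical check of the identity that the sum in (1) reduces to np for the binomial distribution.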

A random variable is implemented in the PSOL as an object that contains both a distribution object and a data object. A random variable can be passed to a graph or table to display information about the distribution or empirical data. In particular, note that the graph and table in the Dice Experiment are not "hard-wired" for this particular applet -- the graph and table are general components that can be used with any random variable.
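A rough sketch of this composition follows. All names (`RandomVariableSketch`, `DataSketch`, `TableSketch`, and the trimmed `DistributionSketch`) are hypothetical and are not taken from the PSOL, and the text table merely stands in for a graphical component. What the sketch tries to show is that the table depends only on the random variable's general interface, so it is not tied to any one experiment.

```java
// Hypothetical sketch; names are illustrative and are NOT the PSOL API.
abstract class DistributionSketch {
    abstract int minValue();
    abstract int maxValue();
    abstract double density(int x);

    // Simulate one value by accumulating f until a uniform random number is passed.
    int simulate() {
        double u = Math.random();
        double cumulative = 0.0;
        for (int x = minValue(); x < maxValue(); x++) {
            cumulative += density(x);
            if (u < cumulative) return x;
        }
        return maxValue();
    }
}

class FairDieSketch extends DistributionSketch {
    @Override int minValue() { return 1; }
    @Override int maxValue() { return 6; }
    @Override double density(int x) {
        return (x >= 1 && x <= 6) ? 1.0 / 6.0 : 0.0;
    }
}

// Empirical data: a count for each possible value.
class DataSketch {
    private final int min;
    private final int[] counts;
    private int size;

    DataSketch(int min, int max) {
        this.min = min;
        this.counts = new int[max - min + 1];
    }

    void add(int x) { counts[x - min]++; size++; }
    int size() { return size; }

    double relativeFrequency(int x) {
        int i = x - min;
        if (size == 0 || i < 0 || i >= counts.length) return 0.0;
        return (double) counts[i] / size;
    }
}

// A random variable holds both a distribution object and a data object.
class RandomVariableSketch {
    private final String name;
    private final DistributionSketch distribution;
    private final DataSketch data;

    RandomVariableSketch(String name, DistributionSketch distribution) {
        this.name = name;
        this.distribution = distribution;
        this.data = new DataSketch(distribution.minValue(), distribution.maxValue());
    }

    // Simulate one value from the distribution and record it in the data.
    void sample() { data.add(distribution.simulate()); }

    String name() { return name; }
    DistributionSketch distribution() { return distribution; }
    DataSketch data() { return data; }
}

// A general-purpose table: it accepts any random variable.
class TableSketch {
    static String render(RandomVariableSketch v) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-8s %-10s %-10s%n", v.name(), "f(x)", "data"));
        for (int x = v.distribution().minValue(); x <= v.distribution().maxValue(); x++) {
            sb.append(String.format("%-8d %-10.4f %-10.4f%n",
                    x, v.distribution().density(x), v.data().relativeFrequency(x)));
        }
        return sb.toString();
    }
}

class RandomVariableDemo {
    public static void main(String[] args) {
        RandomVariableSketch score = new RandomVariableSketch("Score", new FairDieSketch());
        for (int i = 0; i < 1000; i++) score.sample();
        System.out.print(TableSketch.render(score));
    }
}
```

Passing a random variable built on a different distribution to `TableSketch.render` would require no change to the table code, which is the sense in which such a component is not hard-wired.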

The main point I want to make is that the object-oriented structure of the components parallels the underlying mathematical theory. I believe that designing an object model for an area of mathematics, and then programming the objects in this model, leads to a deep understanding of the mathematics, just as rigor and proof do.

Kyle Siegrist, "The Probability/Statistics Object Library - Object-Oriented Programming & Abstract Mathematics," Convergence (October 2004)

JOMA