— CH. 1 · THE FUNCTION BEHIND THE CHANCE —

Random variable

  • A random variable is a mathematical function, not a rolling die or a spinning coin. It maps outcomes in a sample space to values, most often real numbers. Consider the possible upper sides of a flipped coin: heads and tails. A random variable might map heads to -1 and tails to +1. In its strict definition the term refers neither to randomness nor to variability; it describes a measurable function in the sense of measure theory. The domain is the set of all possible outcomes, and the range is a measurable space, often a subset of the real numbers. According to George Mackey, Pafnuty Chebyshev was the first person to think systematically in terms of random variables.

  • Mathematicians distinguish discrete from continuous random variables by their probability distributions. A discrete random variable takes values in a countable set, and its distribution is described by a probability mass function, which assigns a probability to each value in the image. For example, the number of children a person has is a non-negative integer. One can compute the probability of an individual value such as zero or two, and summing the probabilities of 0, 2, 4, ... (an infinite sum) gives the chance of an even number of children. A continuous random variable, in contrast, takes values in an interval of real numbers. Such a variable almost never takes an exact prescribed value c, yet it assigns positive probability to intervals, however small. A spinner that chooses a horizontal direction is a concrete case: its values are bearings in degrees clockwise from North. Any single bearing has probability zero of being selected, but any range of values, such as [0, 180], has positive probability.

  • The systematic use of this concept traces back to Pafnuty Chebyshev, who according to George Mackey was the first person to think systematically in terms of random variables. The rigorous axiomatic setup came later, through measure theory. In that setting a random variable is defined as a measurable function from a sample space, equipped with a probability measure, to a measurable space. The definition yields a pushforward measure, called the distribution of the random variable, which is a probability measure on the set of all possible values. Two random variables can have identical distributions and still differ significantly as functions; for example, they may be independent of each other. The underlying probability space is a technical device used to guarantee that random variables exist. Sometimes mathematicians dispense with this space altogether and work directly with probability distributions.

  • Statisticians characterize the behavior of a random variable with parameters known as moments. The expected value, denoted E[X] and also called the first moment, represents the average value of the variable. Once the average is known, one can ask how far values typically deviate from it; the variance and standard deviation answer this question about spread. More generally, the problem of moments asks: for a given class of random variables X, find a collection of functions f_i such that the expectation values E[f_i(X)] fully characterize the distribution of X. Moments can be defined only for real-valued (or complex-valued) functions of a random variable. A variable that is not itself real-valued still allows moments of real-valued functions of it. Consider a categorical variable X taking nominal values such as red, blue, or green. The Iverson bracket [X = green] is a real-valued function equal to one if X is green and zero otherwise, and its expected value and other moments can be determined. Intuitively, the expected value is the average over an infinite population whose members are individual evaluations of X.

  • Random variables can be equal in several different senses: equal, equal almost surely, or equal in distribution. Equality in distribution means the two variables have the same distribution function; they need not be defined on the same probability space. Two random variables with equal moment generating functions have the same distribution, which gives a method of checking equality in distribution for certain functions of random variables. Almost sure equality holds when the probability that the two variables differ is zero. For practical purposes this notion is as strong as actual equality. Actual equality requires the two variables to be identical as functions on the same underlying space. This last notion is typically the least useful, because the underlying measure space is rarely characterized explicitly. The distinctions matter: two variables that are equal in distribution but not almost surely can have different covariances with a third variable. A sequence of random variables may also converge to a random variable, and convergence likewise comes in several senses. The law of large numbers and the central limit theorem are the central results about such convergence.

  • Practical uses extend into fields such as graph theory and natural language processing, where random elements model variation in non-numerical data structures. A random word may be represented as a random integer that indexes a vocabulary. Alternatively, it may be represented as a random indicator vector whose length equals the size of the vocabulary: the only values with positive probability are the vectors with a single one and zeros elsewhere, and the position of the one indicates the chosen word. A random sentence of given length n may be represented as a vector of n random words. A random graph on n vertices may be represented as an n-by-n matrix of random variables whose values specify adjacency. A random function can be represented as a collection of random variables, one for each point of its domain. A stochastic process is a random function of time, a random vector is a random function of an index set such as 1, 2, ..., n, and a random field is a random function on any set, typically time, space, or a discrete set. Because the component variables are defined on the same underlying probability space, they are able to covary. This flexibility supports modeling complex systems in computer science and discrete mathematics.
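
The representations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-word vocabulary is hypothetical, words are drawn uniformly, and each edge of the undirected graph is present independently with probability p. None of these choices comes from the text.

```python
import random

# Hypothetical vocabulary, for illustration only.
vocabulary = ["red", "blue", "green", "cat"]

def random_word_index(rng):
    """A random word as a random integer index (assumed uniform)."""
    return rng.randrange(len(vocabulary))

def indicator_vector(index, size):
    """The same word as an indicator vector with a single one."""
    return [1 if i == index else 0 for i in range(size)]

def random_sentence(n, rng):
    """A random sentence of length n as a vector of n random words."""
    return [random_word_index(rng) for _ in range(n)]

def random_graph(n, p, rng):
    """A random undirected graph as an n-by-n 0/1 adjacency matrix.

    Assumption: each edge is present independently with probability p.
    """
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            edge = 1 if rng.random() < p else 0
            adj[i][j] = adj[j][i] = edge
    return adj

rng = random.Random(0)  # seeded so that runs are repeatable
idx = random_word_index(rng)
vec = indicator_vector(idx, len(vocabulary))
sentence = random_sentence(5, rng)
graph = random_graph(4, 0.5, rng)
```

Whatever the seed, `vec` contains exactly one 1, at position `idx`, and `graph` is symmetric with a zero diagonal.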

Common questions

What is a random variable in mathematics?

A random variable is a mathematical function that maps outcomes in a sample space to values, most often real numbers. Despite its name, the term denotes a measurable function in the sense of measure theory rather than randomness or variability itself.
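
As a minimal sketch of this definition, the coin example from the chapter can be written as an ordinary function. The fair-coin probabilities and the helper name `distribution` are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Assumption: a fair coin, so each outcome has probability 1/2.
sample_space = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

def X(outcome):
    """The random variable: a plain function from outcomes to real numbers."""
    return {"heads": -1, "tails": 1}[outcome]

def distribution(rv, space):
    """Push the probability on `space` forward through `rv`."""
    dist = defaultdict(Fraction)  # missing values start at Fraction(0)
    for outcome, p in space.items():
        dist[rv(outcome)] += p
    return dict(dist)

coin_distribution = distribution(X, sample_space)
```

Here `coin_distribution` assigns probability 1/2 to each of the values -1 and 1. All of the randomness lives in the probabilities attached to the sample space; `X` itself is deterministic.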

Who first thought systematically about random variables?

According to George Mackey, Pafnuty Chebyshev was the first person to think systematically in terms of random variables. The rigorous axiomatic setup came later, through measure theory.

How do discrete and continuous random variables differ?

A discrete random variable takes values in a countable set, while a continuous random variable takes values in an interval of real numbers. A discrete variable assigns a probability to each individual value, whereas a continuous variable assigns probability zero to every single value and positive probability only to intervals.
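
The contrast can be made concrete with the chapter's two examples. The sketch below rests on assumptions the text does not state: the number of children is modeled as Poisson with an illustrative mean of 2.0, and the spinner is taken to be uniform over bearings from 0 to 360 degrees.

```python
import math

# Assumption: number of children ~ Poisson(lam); lam = 2.0 is illustrative only.
lam = 2.0

def pmf(k):
    """Probability mass function: P(N = k) for the assumed Poisson model."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# P(N is even) is an infinite sum over k = 0, 2, 4, ...; the terms beyond
# k = 100 are negligible for this lam, so the sum is truncated there.
p_even = sum(pmf(k) for k in range(0, 100, 2))

# Assumption: the spinner's bearing is uniform on [0, 360) degrees.
def spinner_prob(a, b):
    """P(a <= bearing <= b) for 0 <= a <= b <= 360."""
    return (b - a) / 360.0

p_half = spinner_prob(0, 180)   # an interval: positive probability
p_point = spinner_prob(90, 90)  # a single bearing: probability zero
```

Under these assumptions `p_even` is about 0.509, matching the closed form (1 + exp(-2 * lam)) / 2 for a Poisson variable, while `p_half` is 0.5 and `p_point` is 0.0.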

What are moments used for in statistics?

Statisticians use moments, such as the expected value and the variance, to characterize the behavior of a random variable. The expected value represents the average value of the variable and is the first moment.
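
A short sketch of these quantities for finite distributions follows. The helper `expectation` is a hypothetical name, and the color probabilities are invented for illustration. The last lines show the indicator (Iverson bracket) construction the chapter describes for a categorical variable.

```python
from fractions import Fraction

def expectation(f, dist):
    """E[f(X)] for a finite distribution given as {value: probability}."""
    return sum(f(x) * p for x, p in dist.items())

# Fair coin mapped to -1 and +1, as in the chapter's example.
coin = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
mean = expectation(lambda x: x, coin)                     # first moment
variance = expectation(lambda x: (x - mean) ** 2, coin)   # spread about the mean

# Hypothetical categorical variable; these probabilities are made up.
colors = {"red": Fraction(1, 2), "blue": Fraction(3, 10), "green": Fraction(1, 5)}

def is_green(c):
    """Iverson bracket [X = green]: 1 if the value is green, else 0."""
    return 1 if c == "green" else 0

p_green = expectation(is_green, colors)
```

For the coin, `mean` is 0 and `variance` is 1. The colors have no mean of their own, but the indicator does: its expected value `p_green` is 1/5, the probability of green.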

When do two random variables have equal distributions?

Two variables have equal distributions when they share the same distribution function, even if they are not defined on the same probability space. Equality in distribution does not require them to be identical as functions, or even equal almost surely.
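
A small sketch shows how far apart two equally distributed variables can be. It assumes a fair coin and takes Y = -X and Z = X as illustrative choices; the helper names `law`, `mean`, and `cov` are hypothetical.

```python
from fractions import Fraction

# Assumption: a fair coin as the underlying probability space.
omega = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

# Random variables stored as outcome -> value tables.
X = {"heads": -1, "tails": 1}
Y = {w: -x for w, x in X.items()}  # Y = -X
Z = dict(X)                        # a third variable, here equal to X

def law(rv):
    """Distribution of rv as {value: probability}."""
    out = {}
    for w, p in omega.items():
        out[rv[w]] = out.get(rv[w], 0) + p
    return out

def mean(rv):
    return sum(rv[w] * p for w, p in omega.items())

def cov(a, b):
    """Covariance of two variables defined on the same space."""
    return sum((a[w] - mean(a)) * (b[w] - mean(b)) * p for w, p in omega.items())

same_law = law(X) == law(Y)                                    # equal in distribution
p_differ = sum(p for w, p in omega.items() if X[w] != Y[w])    # P(X != Y)
cov_xz = cov(X, Z)
cov_yz = cov(Y, Z)
```

Here `same_law` is True while `p_differ` is 1, so X and Y are equal in distribution but differ on every outcome. Their covariances with Z also disagree: `cov_xz` is 1 and `cov_yz` is -1.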