Friday, October 17, 2014

Summation Notation: Sigma IS Sum

The time has come to discuss actual mathematical operations, specifically summation notation. This post features the capital version of your soon-to-be favorite Greek letter, sigma (Σ).

Summation notation is also known as sigma notation, or weird squiggly line from hell, depending on your preference. In statistics, summation notation is unavoidable. Getting comfortable with summation notation, and reviewing the math concepts that go along with it, will allow you to understand and use the formulas involved in fundamental statistical concepts.

Summation notation is helpful (and inevitable) when you are working with a sequence of numbers. A sequence of numbers is an ordered list of numbers. When you are working with a dataset, you are working with sequences of numbers. A sequence of n numbers can be denoted as $\{x_1, x_2, x_3, \ldots, x_n\}$.

So, consider the sequence: {1, 4, 2, 7, 5}. For this sequence the following will be true:

$n = 5$
$x_1 = 1$
$x_2 = 4$
$x_3 = 2$
$x_4 = 7$
$x_5 = 5$
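
For readers who like to follow along in code, here is a minimal Python sketch of the same sequence. Python is an assumption on my part (the post itself uses no code), and the helper name `x_sub` is hypothetical. The one wrinkle is that Python lists count from 0, while the subscripts above count from 1.

```python
# The example sequence stored as a Python list.
x = [1, 4, 2, 7, 5]

# n is the number of values in the sequence.
n = len(x)


def x_sub(i):
    """Return x_i using the 1-based subscript from the text (hypothetical helper)."""
    return x[i - 1]  # Python lists are 0-based, so x_1 lives at x[0]


# Print every value alongside its 1-based subscript.
for i in range(1, n + 1):
    print(f"x_{i} = {x_sub(i)}")
```

Keeping the subtraction inside one small helper means the rest of the code can use the same index values as the notation.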

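In summation notation an expression (also referred to as a function or formula) is evaluated for given values in a sequence. The results of that expression are added together, hence the term summation notation. The following breaks down the basic features in summation notation:

$$\sum_{i=1}^{n} x_i$$

Σ: the summation symbol, meaning "add up what follows"
i = 1 (written below Σ): the index and the value it starts at
n (written above Σ): the value of the index at which we stop
$x_i$: the expression that is evaluated at each value of the index
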
Using our sequence from before, {1, 4, 2, 7, 5}, the above summation notation turns into:

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 1 + 4 + 2 + 7 + 5 = 19$$

Given that i = 1, we start with the first observation (which equals 1). The expression to be evaluated is simply the value at the given index. We then move on and add the value of the expression for observation 2 (which equals 4), and so forth up until observation number n, which is the last observation in our sequence.
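
The walk through the index can be sketched as a loop. This is a minimal Python illustration, not the only way to do it; the variable names are my own, and the built-in `sum` is used at the end only as a cross-check.

```python
x = [1, 4, 2, 7, 5]
n = len(x)

total = 0
for i in range(1, n + 1):  # i runs from 1 up to and including n
    total += x[i - 1]      # the expression is just x_i (lists are 0-based)

print(total)  # 19

# Python's built-in sum() performs the same addition in one call.
assert total == sum(x)
```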

It may have been a while since you took a math course. You may have hoped you would never have to use math again. That's ok, you're ok, it will all be ok. Before diving into summation notation it is important to take it back to basics: Order of operations. You may remember a little something like this from your math days:

PEMDAS
Parentheses
Exponents
Multiplication
Division
Addition
Subtraction
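
One detail the acronym hides: multiplication and division share the same rank and are worked left to right, and the same goes for addition and subtraction. Python follows these same rules, so a small sketch (the numbers are arbitrary examples of my own) shows how much the grouping matters:

```python
# Multiplication happens before addition.
no_parens = 2 + 3 * 4        # 2 + 12

# Parentheses force the addition to happen first.
with_parens = (2 + 3) * 4    # 5 * 4

# Exponents come before multiplication.
exponent_first = 2 * 3 ** 2  # 2 * 9

# Division and multiplication share a rank: work left to right.
left_to_right = 8 / 2 * 4    # (8 / 2) * 4

print(no_parens, with_parens, exponent_first, left_to_right)
```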

Summation notation is addition in wolf's clothing. What does this mean? It means that despite its intimidating display summation notation is a basic operation. Why is this important? It is important because order of operations can make seemingly similar expressions mean completely different things. Order of operations is like punctuation for math. I would argue PEMDAS is easier but that is because I have some serious anxiety stemming from commas and semicolons.

Example time!

We will stick with our previous sequence: {1, 4, 2, 7, 5}.

Consider the following:

$$\sum_{i=1}^{n} x_i + 2$$

Your initial inclination may be to expand this expression in the following way:

$$(1 + 2) + (4 + 2) + (2 + 2) + (7 + 2) + (5 + 2) = 29$$

Your initial inclination is not your friend. The above expansion would be appropriate for the following expression:

$$\sum_{i=1}^{n} (x_i + 2)$$

The important difference here is the parentheses. With parentheses, the "+2" is included within the summation. The order of operations dictates that anything within parentheses is computed first, so 2 is added to every value before the values are summed. When there are no parentheses, the proper expansion is:

$$(1 + 4 + 2 + 7 + 5) + 2 = 21$$

In this case, 2 is added after the summation is carried out.
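
The two readings are easy to compare in code. A minimal Python sketch, with variable names of my own choosing:

```python
x = [1, 4, 2, 7, 5]
n = len(x)

# No parentheses: sum the sequence first, then add 2 once.
added_after = sum(x) + 2

# Parentheses: add 2 to every value, then sum the results.
added_inside = sum(xi + 2 for xi in x)

print(added_after)   # 21
print(added_inside)  # 29

# Adding 2 inside the sum adds it n times, not once.
assert added_inside == sum(x) + 2 * n
```

The gap between the two answers is exactly 2 added n - 1 extra times, which is why the mistake gets worse as the dataset grows.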

When dealing with summations, it is helpful to remember that basic algebra may save you time and computation errors. Examine the following two formulas:




Notice anything? The formulas certainly look similar. In fact, they provide us with the same results. Let's look at the expansions:



By factoring out the common denominator, we can simplify this to:


In summation form this would be:


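The same idea can be checked numerically. This is a minimal Python sketch under an assumption: the divisor `c = 2` is a hypothetical constant picked for illustration, standing in for whatever common denominator appears in every term. Dividing each value and then summing gives the same result as summing once and then dividing.

```python
import math

x = [1, 4, 2, 7, 5]
c = 2  # hypothetical common denominator, chosen only for illustration

# Divide every value by c, then add up the results.
divide_each = sum(xi / c for xi in x)

# Add up the values first, then divide the total by c once.
divide_once = sum(x) / c

print(divide_each, divide_once)

# isclose guards against floating-point rounding with other divisors.
assert math.isclose(divide_each, divide_once)
```

The second form does one division instead of n, which is the time and error saving that factoring buys you.
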
Oftentimes, we are not just working with a single sequence. When we have data on multiple variables, we are working with multiple sequences. Let's say we now have two variables. The data can be represented by two sequences:

X = {1, 4, 2, 7, 5}
Y = {2, 5, 3, 6, 1}

The same rules apply to summation notation with two variables. Consider this:

$$\sum_{i=1}^{n} x_i y_i$$

This results in the following expansion:

$$(1)(2) + (4)(5) + (2)(3) + (7)(6) + (5)(1) = 2 + 20 + 6 + 42 + 5 = 75$$

The values from sequence X and sequence Y that share the same index were multiplied together. These products were then summed. When working with multiple sequences it may be helpful to arrange values in a table. A table for sequences X and Y might look like this:

i      x_i    y_i    x_i * y_i
1      1      2      2
2      4      5      20
3      2      3      6
4      7      6      42
5      5      1      5
Sum    19     17     75

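Pairing values by index is what Python's built-in `zip` does, so the sum of products can be sketched in a few lines. The variable names are my own. The last lines check a common slip: the sum of the products is not the product of the sums.

```python
X = [1, 4, 2, 7, 5]
Y = [2, 5, 3, 6, 1]

# Multiply the values that share an index.
products = [xi * yi for xi, yi in zip(X, Y)]

# Then add up those products.
sum_of_products = sum(products)

print(products)         # [2, 20, 6, 42, 5]
print(sum_of_products)  # 75

# Summing each sequence first and then multiplying is a different calculation.
product_of_sums = sum(X) * sum(Y)
assert sum_of_products != product_of_sums
```
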
That covers the basics of summation notation. If you have grown to love our dear friend sigma, never fear, this is only the beginning of what's sure to be a beautiful (unavoidable) friendship.


