A little joke for you: A reporter was writing an article on the new Common Core math standards and decided to interview some people who used math professionally. She chose a mathematician, engineer, and statistician. Her last question was one she thought would be just for giggles: “What is 1 + 1?” The responses were not what she expected. The mathematician actually started writing out a proof for her and handed her several pages that were incomprehensible to the lay person but ended with “2.” The engineer started asking a bunch of questions about how each 1 was measured, how exact the measurement was, etc. The statistician just answered, “What would you like the answer to be?”
This joke illustrates a perception that many have of statistics. And it is true that numbers can be used in misleading ways, even when those numbers are accurate. However, if you know which numbers to look for and which questions to ask, you are less likely to be fooled and more likely to see the true meaning of the numbers.
“Wait. Isn’t this a theology and culture blog? I was told there would be no math,” you say. Numbers, and particularly statistics, are used constantly in our culture to describe ourselves and others and to make arguments. 58% of people believe in Bigfoot; 4 out of 5 doctors agree you should limit red meat intake; a majority of people have seen Game of Thrones; there has been a 145% increase in school shootings since prayer was banned in schools; 104 people have died from pool noodle injuries this year; 1.5 million new jobs were created this quarter. These are all stats I made up, but they are examples of the ways people use numbers to make a point, shape perceptions, scare you with the evening news, or sell you something.
Math is so often presented in a way that is difficult to understand, and that is frustrating. This is because the books are written by mathematicians for whom the math you are learning is elementary and thus “beneath” them. Add to that the fact that mathematicians are usually naturals for whom your math lesson is intuitive, and you have a recipe for unpleasant experiences and a desire to avoid math lessons like this one. I assure you that I am no mathematician, that my knowledge of stats was hard won, and that I really want to give you useful tools in a way that is approachable and interesting.
A disclaimer: all numbers presented as examples are made up unless otherwise noted. I’m just presenting theory here so that you can apply it.
One of the more important questions one should ask of any statistic presented is, “How was the data acquired?” If a statistic purports to tell you something about a larger group of people or a society, then the people polled (or the data source) and the method matter.
The sample source should be representative of the group the study purports to describe. For example, if one were to study race relations in the American general public and conduct a study which polled people on how many friends of a different race they have, very different data will result from polling people at a university basketball practice arena vs. a renaissance faire vs. the Westside Mall in Atlanta vs. a CNN website poll. All of these places are frequented by demographics that are not representative of the total population. A study must find a sample that is representative of the entire population it purports to represent; a “random sample” is the term used for this.
In particular, website polls are the worst for getting accurate info. Firstly, the poll is affected by the demographics of the website clientele. Secondly, only a certain subset of that website’s clientele will participate in the poll (usually those who feel strongly about the poll subject). Thirdly, trolls, both individual and organized, are attracted to online polls. 4chan and its brethren love to mess with online polls.
If a poll is conducted, how were the questions worded? For our interracial friendships poll, were the questions something like “How many people of another race do you know?” or “How many friends of a different race do you have?” or “How many close friends of a different race do you have?” or “How many friends of a different race have you interacted with outside of work/school/business transactions in the past 2 weeks?” The last question is by far the best for getting information about actual friendships, not just acquaintances. But just remember, some people will still lie, especially if they think they should have more (or fewer).
For more objective data that can vary in quality or severity, such as jobs created or people made ill, one should ask how wide a net was cast. For example, the business section of NPR announces how many jobs were created each month, but it does not mention what quality of jobs were created. The CDC may announce how many people were infected by West Nile Virus, but rarely how many of them were sickened to the point that they missed work or were hospitalized, and for how long.
Experimental procedure is incredibly important for any sort of experimental study. Medical studies of experimental medicines, for example, should be double-blind, so that neither the patient nor the medical professional administering the treatment knows whether the patient is getting the experimental drug, a commonly used drug, or a placebo. When this double-blind procedure is not used, the “placebo effect” often occurs, where patients experience psychosomatic effects based upon their hopes and fears. When testing for side effects of different materials, pesticides, noise pollution, etc., it is important that real-world doses be included in the experiments. This is because the dose makes the poison. Caffeine is incredibly poisonous, but you like it in your coffee and soda. Tylenol will destroy your liver if you take more than the correct dosage. Fluoride at less than 1 part per million in drinking water prevents cavities, but 2 ppm leads to tooth discoloration, and 4 ppm leads to health problems. You breathe water vapor all the time. In the winter, many people even add it to their air. But you wouldn’t dream of trying to breathe pure water. So, when presented with data from an experiment, some critical thinking about these and other procedures is required.
The most commonly used statistic is an “average” (technically the “mean”). People are pretty familiar with this: “The average household income in such and such county is $52,670.” The average is found by adding up all the data and then dividing by the number of things added. The problem with averages is that they are easily thrown off by extreme individuals or groups. For example, suppose Warren Buffett lived in your county. The average income might be $457,892 per household, but you know that the vast majority of people are not making anywhere near that much. Because of this sensitivity to “outliers” (the technical term for extreme individuals or groups), the average (stat use) can be very misleading and may not really describe the average (colloquial use) individual. Therefore, some questions should be asked when presented with an average.
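A toy sketch of this outlier effect (all incomes are invented, in the spirit of the disclaimer above):

```python
# Toy illustration of how one extreme value drags the mean.
# All numbers are invented, including the billionaire's fortune.
incomes = [48_000, 52_000, 55_000, 61_000, 47_000]

mean_without = sum(incomes) / len(incomes)
incomes_with_buffett = incomes + [12_000_000_000]  # one billionaire moves in
mean_with = sum(incomes_with_buffett) / len(incomes_with_buffett)

print(round(mean_without))  # 52600 -- a believable neighborhood figure
print(round(mean_with))     # about 2 billion -- describes no actual household
```

One added value out of six is enough to multiply the “average” by almost forty thousand.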
Questions you should be asking about an “average:”
How did they treat outliers? It is not unusual for a certain number or small percentage of the lowest and highest values to be ignored when calculating the average. Doing so can help prevent the average from being thrown off by outliers.
How far from the average (stat use) can something be and still be average (colloquial use)? That is, how “spread out” is the data? Suppose you were told the average number of vacation days companies in America give is 16. Do the vast majority of people get about 16 days, or do a pretty decent number of people get anywhere from 5 to 25 days? How many days should be considered unusually stingy or generous of an employer compared to others?
The value used to measure how spread out data is is called the “standard deviation” (not to be confused with the “standard error,” which measures the uncertainty of an estimate such as a sample average). Roughly speaking, this is the typical distance of each data point from the average. Standard deviations are useful for knowing how unusual or usual an individual is. If you look at the diagram of IQ scores, you can see that about 2/3 of all data points fall within 1 standard deviation on either side of the average, about 95% are within 2 standard deviations, and 99.7% are within 3 standard deviations. This is true for any data that has that nice bell shape (we call that a “normal distribution” of data). Generally speaking, anything more than 3 standard deviations from the average is considered an outlier. And a rough rule of thumb is that if two values are more than one standard deviation from one another, then they are likely statistically different. Suppose for our vacation example the standard deviation is 5 days. This indicates that while the average number of vacation days is 16, if you were to be offered only 12 days a year, it would seem to be in line with what many other companies offer, because about 2/3 of companies offer between 11 and 21 days. If the standard deviation were 1, then you might want to negotiate for more vacation, because 99.7% of companies would offer between 13 and 19 days. But…
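To make the vacation example concrete, here is a small sketch using Python's standard library; the ten companies and their offers are invented:

```python
import statistics

# Hypothetical vacation-day offers from ten companies (made up).
days = [16, 11, 21, 14, 18, 16, 12, 20, 15, 17]

avg = statistics.mean(days)
sd = statistics.pstdev(days)  # population standard deviation

# For bell-shaped data, roughly 2/3 of values fall within 1 SD of the average.
within_one_sd = sum(avg - sd <= d <= avg + sd for d in days)
print(avg, round(sd, 1), within_one_sd)  # 16, about 3.0, 6 of the 10 companies
```

With an average of 16 and a standard deviation around 3, an offer of 12 days sits just inside the “normal” range; an offer of 5 would be an outlier worth negotiating over.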
How “normal” is the data?
Take household income, for example. It is not normal, because most of the population is clustered within a few tens of thousands of dollars of each other while a small number have incomes in the hundreds of thousands or millions of dollars. According to the US Census, the average (stat use) household income in PA is $69,629, but if we look at a graph that shows the distribution of income, we see that most people are making less than that. The very rich “skew” the curve (like those 2 nerdy kids always did in school). If you look at the graph, you can see this skewness. For this reason, you should also be asking, “What is the median?” The median is the number in the middle of all the data. In the case of PA household income, the median is $52,267, which is a lot closer to what most people are making. If the median and the mean are very different, you have skewness due to a small number of very high or very low values, and you know the average (stat term) is not very average (colloquial term).
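A quick sketch of the mean/median gap, using made-up household incomes with a few very rich outliers:

```python
import statistics

# Made-up incomes: most clustered together, a couple very high (right-skewed).
incomes = [30_000, 42_000, 48_000, 52_000, 55_000,
           58_000, 64_000, 300_000, 900_000]

avg = statistics.mean(incomes)
med = statistics.median(incomes)
print(round(avg))  # about 172,000 -- pulled way up by the rich
print(med)         # 55,000 -- much closer to the typical household
```

The big gap between mean and median is exactly the skewness warning sign described above.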
Another way data can fail to be normal is if there are 2 or more large groupings (“modes” in stat speak). Take the number of children per household: there are a large number of households with 0 children, a few with 1, many with 2, many with 3, a small number with more, and the Duggars. This makes for a very different looking graph. You can see my graph of totally made up data that shows this kind of grouping. There are 2 distinct groups here because there really are 2 groups: people without children and people with children. The average will not tell you that.
Often we try to compare groups to one another. The average life span of a smoker is 75 vs. 80 for non-smokers. The average prison term for white shoplifters is 8 months, but 13 months for blacks. People in 1975 watched TV an average of 3.2 hours a day, compared to 4 hours a day in 2014. The average number of autism cases per 1000 people is 3.2 for those vaccinated and 2.8 for those who were not. Christians donate to charity an average of 5.8% of their income and unchurched people donate 3.6%. You get the picture. There are some really pertinent questions to be asked here.
Firstly, are the values being compared actually different? Just because 2 averages differ a bit in value, are they really that different? If I poll 500 people I might get one average, but the next time I poll another 500 people my average will be a bit different. Even if I do it a 3rd time, my average will again be a bit different, because I’m not polling the same 500 people. One hopes that these are all good, random samples, but even so there will be slight differences just due to randomness. This is why “error” comes into play. Often you will see a margin of error displayed on a graph in USA Today or on CNN. The margin of error is a value calculated to show where the true value for the whole population is likely to be, based on the sample size and how spread out the data is; it is derived from the standard deviation and shrinks as the sample grows. For opinion polls (are you in favor of or against medical marijuana, which candidate will you vote for, etc.) there is another equation we won’t get into. The basic idea is that if two values are within each other’s margins of error, they are statistically equal. This is why one candidate in a presidential race can have 52% of the vote according to a poll and the other 47%, but because there is a +/-3% margin of error, they could very well be tied and the race is too close to call. There is a lot more that goes into margins of error, and usually there is still a 5% chance that the true value for the population falls outside the margin.
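The polling case can be sketched with the common textbook formula for a 95% margin of error on a proportion, 1.96 × √(p(1−p)/n); the sample size of 1000 is an assumption:

```python
import math

# 95% margin of error for a poll proportion p with sample size n,
# using the standard textbook formula (n = 1000 is assumed here).
def margin_of_error(p, n):
    return 1.96 * math.sqrt(p * (1 - p) / n)

n = 1000
leader, trailer = 0.52, 0.47
moe = margin_of_error(leader, n)
print(round(moe * 100, 1))  # about 3.1 points, i.e. the familiar "+/-3%"

# The two candidates' ranges overlap, so the race is "too close to call."
overlap = (leader - moe) < (trailer + margin_of_error(trailer, n))
print(overlap)  # True
```

Note how the formula divides by n: quadrupling the sample size halves the margin of error, which is why bigger (good, random) samples give tighter answers.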
Secondly, are things being fairly compared? Are we comparing apples to apples or apples to oranges? The only real difference between the two samples should be the trait in question. Suppose we are asking about the religious beliefs of graduate students compared to the general public, but we poll people of all ages from the general public. Older generations have different religious beliefs than younger generations as it is, so comparing multiple generations from the general public to grad students, who are generally in their 20’s, will muddy the results. A fairer comparison would include only people of the same age, so that the primary difference between the groups is that some are graduate students and some are not.
And thirdly, correlation is not causation. Suppose we find that as ice cream sales rise, assault arrests also rise. Should we then conclude that ice cream consumption causes people to beat each other up? No. There is another player: heat. When it gets hot, people want ice cream. When it gets hot, more people are out and about and tempers are shorter. You should also be aware that there are some pretty weird correlations out there between things that seem to have no connection at all.
To bring a lot of these concepts together, suppose a bank was accused of giving higher interest rates to Latinos than to whites. The local paper does some investigative journalism and finds that the average interest rate for whites is 5.6% and 8.2% for Latinos. Sounds pretty bad, right? The bank counters that this is not a good comparison, because the finances of the Latino population in general are not as good as those of the white population in general, and thus the Latinos who apply for loans have, on average, lower incomes and credit scores than the bank’s average white customers. To avoid a lawsuit, the paper runs another article, but this time only uses customers who make $45-55k a year and have a credit score of 700-725. The interest rates are now 4.1% for whites and 4.75% for Latinos. So are those rates actually different? That depends on the margin of error. The gap is 0.65%, so if the margin of error on each average is less than about 0.33%, the two ranges do not overlap and the rates really are different.
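The overlap check is the same one used for the presidential poll; here is a sketch with an invented margin of error of 0.3 points on each average:

```python
# Checking the interest-rate gap with the "do the ranges overlap?" test.
# The margin of error (0.3 points on each average) is invented for illustration.
white_avg, latino_avg = 4.1, 4.75
moe = 0.3

# If the top of one range sits below the bottom of the other,
# the two averages are statistically different.
different = (white_avg + moe) < (latino_avg - moe)
print(different)  # True: 4.4 < 4.45, barely
```

Notice how close the call is: had the margin of error been 0.35 instead, the ranges would overlap and the paper could not claim a real difference.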
Finally, someone may just present you with raw numbers. 13 people died this year from shark attacks, or 4 people contracted West Nile Virus in the county last summer, or 10,258 children broke a bone while riding bikes in 2010. 9 people on death row were exonerated in 2009 due to the use of DNA testing. 568 murders in the US were committed last year by illegal immigrants. With raw numbers it is important, in addition to asking some of the previous critical thinking questions about data collection, to also put those numbers in perspective. Ask questions like “Out of how many total?” and “Out of how many people who were at risk?” and “How does that compare to other things?” and “How does that compare to the general population?” Take the children breaking a bone while biking example. How many millions of children ride bikes each year (25 million at least)? That would mean about 0.04% of bike riders broke a bone. How does that compare to other childhood activities like playing on playgrounds or sports?
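Turning the raw count into a rate is one line of arithmetic (the numbers are the invented ones from the text):

```python
# Putting a raw count in perspective (numbers invented, as in the text).
broken_bones = 10_258
child_riders = 25_000_000  # "25 million at least"

rate = broken_bones / child_riders
print(f"{rate:.2%}")  # 0.04%
```

A scary-sounding five-digit number becomes four hundredths of a percent once you ask “out of how many?”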
Consider the Source
Consider who is supplying these numbers. Those with a financial interest, or who are fanatic believers, are more likely to be uncritical of data that agrees with them, to ask loaded questions in polls, to use biased samples, to mislead with numbers, and to just outright lie. Therefore you should investigate which party conducted the study, pay attention to who is publishing the numbers, and question most carefully numbers that come from such sources. This is not to say that Greenpeace or the NRA or Americans for a Better Life, etc. cannot produce truthful, quality statistics (though it is unlikely), but that you should be careful with them. University, non-partisan, and government organizations usually produce the better-done studies and databases, but still use your judgment and keep a healthy dose of critical skepticism as you consider what they are telling you.
So, there you have it. Everything you need to know about statistics. I admit that it is a lot, and I probably only gave you just enough to be dangerous. But given the way stats are thrown around, and the way we use them to inform personal decisions and policy for governments and corporations, we need some basic grounding. I do hope you got something out of this that will help you. If you want to learn more, http://www.robertniles.com/stats/ is a decent site for “lay” people, and if you wanted to really go for it, many universities offer free online courses in stats each semester.