Comments on: Study: We're getting used to the taste of spam
The Pew Internet and American Life Project turns up some surprising data about people's attitudes toward junk e-mail.
December 27, 2009 7:40 AM PST
December 26, 2009 2:17 PM PST
December 26, 2009 11:19 AM PST
What last year was a rate of say 50 SPAM messages a day is now less than 10 per week...
The critical factor is not really the number of people involved in the sample, but how the sample is selected in the first place. It is true that you'll generally need at least 30 at an absolute minimum to have a good level of accuracy, and this does increase when the population you are dealing with is much bigger, but 1,421 people is probably a completely reasonable number. So, the size of the sample is important, but as long as the population is big, you can get a reasonably accurate result even when your sample is much less than 1% of the total.
Let me try an example to explain why the selection of the sample is more important than the actual number of people, and show you how close you tend to get even with a small sample.
Let's say there were only 300,000 e-mail users in the world, and an even 200,000 e-mail users don't like spam, and the other 100,000 don't care. That's about 67% who don't like it, and 33% who don't care.
Now, assuming I chose e-mail users totally at random, I have a 2 in 3 chance to pick an e-mail user who doesn't like spam, and a 1 in 3 chance to pick a user who doesn't care. This is now a probability problem, and the law of large numbers states that as I make more trials of events with constant probabilities, my results will tend toward the actual probabilities as the number of trials grows.
So, let's try it with a sample of 30 using a 6-sided die. Values of 1-4 will be considered a spam-hater, and values of 5-6 will be considered someone who doesn't care. SH will be the label for the spam haters, and IN will be the label for the indifferent.
- #1: A spam hater. SH: 1 (100%); IN: 0 (0%)
- #2: A spam hater. SH: 2 (100%); IN: 0 (0%)
- #3: A spam hater. SH: 3 (100%); IN: 0 (0%)
- #4: A spam hater. SH: 4 (100%); IN: 0 (0%)
- #5: Indifferent. SH: 4 (80%); IN: 1 (20%)
- #6: A spam hater. SH: 5 (83%); IN: 1 (17%)
- #7: A spam hater. SH: 6 (86%); IN: 1 (14%)
- #8: Indifferent. SH: 6 (75%); IN: 2 (25%)
- #9: Indifferent. SH: 6 (67%); IN: 3 (33%)
(See, only nine rolls, and the true probabilities have already been revealed. This is largely because the example is so simple, but even with stranger percentages, you can see how random selection narrows in on it very quickly).
- #10: Indifferent. SH: 6 (60%); IN: 4 (40%)
- #11: A spam hater. SH: 7 (64%); IN: 4 (36%)
- #12: Indifferent. SH: 7 (58%); IN: 5 (42%)
- #13: A spam hater. SH: 8 (62%); IN: 5 (38%)
- #14: A spam hater. SH: 9 (64%); IN: 5 (36%)
- #15: A spam hater. SH: 10 (67%); IN: 5 (33%)
- #16: A spam hater. SH: 11 (69%); IN: 5 (31%)
- #17: Indifferent. SH: 11 (65%); IN: 6 (35%)
- #18: Indifferent. SH: 11 (61%); IN: 7 (39%)
- #19: A spam hater. SH: 12 (63%); IN: 7 (37%)
- #20: A spam hater. SH: 13 (65%); IN: 7 (35%)
- #21: Indifferent. SH: 13 (62%); IN: 8 (38%)
- #22: A spam hater. SH: 14 (64%); IN: 8 (36%)
- #23: A spam hater. SH: 15 (65%); IN: 8 (35%)
- #24: Indifferent. SH: 15 (63%); IN: 9 (37%)
- #25: A spam hater. SH: 16 (64%); IN: 9 (36%)
- #26: A spam hater. SH: 17 (65%); IN: 9 (35%)
- #27: A spam hater. SH: 18 (67%); IN: 9 (33%)
- #28: A spam hater. SH: 19 (68%); IN: 9 (32%)
- #29: A spam hater. SH: 20 (69%); IN: 9 (31%)
- #30: A spam hater. SH: 21 (70%); IN: 9 (30%)
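The die-rolling experiment above is easy to repeat in code. This is only a rough sketch: `spam_hater_share` is a made-up helper name, the seeds are arbitrary, and Python's pseudo-random generator stands in for the physical die. It uses the same rule as above (1-4 is a spam hater, 5-6 is indifferent) and shows how the share behaves as the sample grows.

```python
import random

def spam_hater_share(num_rolls, seed=None):
    """Roll a fair six-sided die num_rolls times and return the
    fraction of rolls that came up 1-4 (counted as spam haters)."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    haters = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) <= 4)
    return haters / num_rolls

if __name__ == "__main__":
    # The true share is 4/6, about 66.7%. Small samples wander around
    # it; larger samples tend to land closer.
    for n in (30, 300, 3000, 30000):
        print(f"{n:>6} rolls: {spam_hater_share(n, seed=n):.1%} spam haters")
```

Changing the seed gives a different sequence of rolls, so a sample of 30 will not always end at 70% the way mine did.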
Now, it wouldn't be uncommon for a survey like this to have a margin of error of +/- 3%, so even with this imperfect example, it came very close to the reality after only 30 samples. It is possible that I could somehow have picked a sample of 30 that were almost all on one side or the other, which is why there is a margin of error and a confidence level. It is rare that you see the mainstream media actually report both of these, however. They really should, since it only takes an extra sentence, but they don't.
This is what they mean, however. After taking a sample, I apply some statistical tests and equations that tell me how close to the mean I should be with a certain level of confidence. In other words, I can calculate a margin of error for a confidence level of 95%, which is the most typical. Let's just say that the margin of error at a confidence level of 95% for the above example was +/- 4%; I didn't actually calculate this--I'm just explaining what it means.
This means that 95% of the samples I could have picked randomly would end up having 66% to 74% spam haters. (That's 70% - 4% to 70% + 4%; my result plus and minus the margin of error). And that's only with a sample size of 30, and I got that close.
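The +/- 4% above is only a placeholder. For anyone who wants to check real numbers, here is a rough sketch using the standard normal-approximation formula for a proportion, z * sqrt(p * (1 - p) / n). It assumes a simple random sample; `margin_of_error` is a name made up for illustration, and z = 1.96 is the usual value for a 95% confidence level.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p
    from a simple random sample of size n (z=1.96 -> 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

if __name__ == "__main__":
    # My die example: 70% spam haters in a sample of 30.
    print(f"n=30,   p=0.70: +/- {margin_of_error(0.70, 30):.1%}")
    # A sample the size of the survey's, using the worst case p=0.50.
    print(f"n=1421, p=0.50: +/- {margin_of_error(0.50, 1421):.1%}")
```

Under these assumptions a sample of 30 works out to roughly +/- 16%, a good deal wider than the placeholder, while 1,421 works out to about +/- 2.6%. The survey's reported figure is a little wider than that, which may reflect weighting or other design adjustments that this simple formula ignores.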
So, you see, a sample size of 1,421 will be far more accurate than mine. And as you can see, they reported the margin of error as +/- 3.2%; pretty common, but the article does not tell you the confidence level. The actual report published by those who conducted the survey says they used a 95% confidence level.
So, I've done a lot to show that the sample size itself is only so important. The most critical factor is how the sample was collected.
If this had been a web poll on the side of a web site, this would have been a bad method. This is a voluntary poll where anyone can submit their opinion; this isn't truly random. Anyone who feels more passionately about a subject is more likely to respond to such a poll, and thus spam haters would be more likely to respond, skewing the results away from the truth.
The nature of the site may also provide a bias; if the site tends to be frequented by technically-adept or technically-challenged people, it is also more likely to skew the results.
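The self-selection problem with a voluntary web poll can be put into numbers. This is a sketch with made-up figures: the response rates below are purely hypothetical, `observed_share` is a name invented for illustration, and the true share of 2 in 3 is borrowed from my earlier example.

```python
def observed_share(true_share, hater_response_rate, indifferent_response_rate):
    """Share of spam haters among poll respondents when the two groups
    choose to respond at different rates."""
    haters = true_share * hater_response_rate
    indifferent = (1 - true_share) * indifferent_response_rate
    return haters / (haters + indifferent)

if __name__ == "__main__":
    # Hypothetical: 10% of spam haters bother to answer the web poll,
    # but only 2% of the indifferent do.
    skewed = observed_share(2 / 3, 0.10, 0.02)
    print(f"poll shows {skewed:.1%} spam haters (true share is 66.7%)")
    # If both groups respond at the same rate, the bias disappears.
    fair = observed_share(2 / 3, 0.05, 0.05)
    print(f"equal response rates: {fair:.1%}")
```

Note that a bigger sample does nothing to fix this; the skew comes from who answers, not from how many.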
Fortunately, these guys did it right--their survey was close to completely random. They used a telephone survey with a random dialer, so the odds of any one person being selected were roughly equal. It's not perfect: not everyone has telephone service, more than one Internet user may share a single line, and someone could always refuse to provide an answer. All of these could skew the results a little further from the +/- 3.2%, but they aren't likely to do too much damage. In the end, while not a perfectly random sample, it is probably close enough to be as accurate as they say.
we HATE it and WE ARE SICK OF IT!
This study of 1421 people is moronic & not worth publishing.
Given their margin of error, it is entirely possible that the true difference from year to year is as small as 3.6%. (This assumes that their previous study had the same margin of error.) That is enough to indicate the annoyance level almost certainly did drop, but probably not by a large margin.
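The arithmetic behind that 3.6% can be sketched as follows. The +/- 3.2% margin is the one reported for the survey; the 10-point observed drop is a placeholder chosen so the arithmetic reproduces the 3.6% figure, not a number quoted from the study, and `true_drop_range` is a name made up for illustration.

```python
def true_drop_range(observed_drop, moe_before, moe_after):
    """Worst-case bounds, in percentage points, on the real change
    between two surveys, given each survey's margin of error."""
    slack = moe_before + moe_after  # both surveys off in opposite directions
    return observed_drop - slack, observed_drop + slack

if __name__ == "__main__":
    # Assumed: a 10-point observed drop, +/- 3.2 points in both years.
    low, high = true_drop_range(10.0, 3.2, 3.2)
    print(f"true drop somewhere between {low:.1f} and {high:.1f} points")
```

Simply adding the two margins is the most pessimistic reading; for two independent samples the combined margin is usually taken as the square root of the sum of the squares, which is somewhat smaller.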
The study appears to have been conducted well, but the headline is blatant sensationalism; the study does not say we are getting used to spam. That conclusion is not supported by the study's data, and I doubt the study's author(s) intended to make that point except for perhaps some idle speculation.
This is a very good example of how people, and the media, misuse or misinterpret statistics, and why people should learn at least the basics behind them. We can't trust the media to properly interpret them for us; really, you can rarely trust anyone with any sort of agenda, even the agenda of getting your attention, to properly interpret a statistic for you.
You are always stuck thinking for yourself, and it's better to do that from a position of knowledge and understanding.
I've looked over the survey, and I would say the science and methodology trumps your intuition. Of course, you're making the assumption that the survey's data supports the conclusion that people are becoming less annoyed with spam. That's not what the data says.
That's why I said in my previous post that the headline, and much of the wording of this article, is misleading. The actual study is more careful in its wording.
The point is that the *overall* opinion against spam has diminished. It doesn't necessarily mean that we spam-haters are getting any more tolerant of it.
Think about it. Every year they do this study, there's a year's worth of new Internet users they might catch. One interesting thing the study noted was that the 30 and older group had almost no change in their opinion of spam, but the younger crowd dropped by nearly 10 points. So, opinions haven't dropped across the board, and the study doesn't in any way suggest that.
No, this article made certain poor assumptions about what the data says. Then again, journalists do this all the time. Statistics should be a required course for journalists, with heavy emphasis on analyzing results.
I do blame the study's author for the opening paragraph, however--this may have helped the author of this article make the unwarranted assumptions.
"A year after the CAN-SPAM Act became law, email users say they are receiving
slightly more spam than before, but they are minding it less. More than half of
internet users still consider spam to be a big problem, yet the ill effects of spam on
email habits and the overall internet experience have declined."
The study does support most of these statements, but only on an overall basis. The line about "minding it less," however, is misleading. Still, the author is more cautious in the rest of the report.
The study also states that pornographic e-mail had been the greatest irritant, and since that was falling, that could explain much of the difference by itself.
- The SPAM problem isn't getting better!
- by wbenton April 13, 2005 9:50 AM PDT
- This story doesn't relate the truth of things.
SPAM is still on a constant increase, usurping ever more bandwidth.
The reports, and the feelings of the people who were interviewed, do not reflect the true state of the situation.
Many young people don't require a fixed permanent E-mail address and thus they find it easy to change their E-mail address whenever their E-mail inbox gets too full of SPAM for them to put up with. But for corporate users, that's a different story. Their jobs depend on them getting E-mails and thus changing their E-mail address isn't as easy as one might think.
Likewise, many Spam filtering products have been introduced on the market which only hide the true story from the end-user... but many times, with many of the spam filtering products offered, important and necessary things get filtered out as well.
So I agree that they don't notice it so much today... but that doesn't necessarily mean that things are getting better.
The amount of SPAM sent out, which was the original problem in the first place, continues to grow.
Thus the perception that things aren't that bad is just that... a perception... and a wrong perception at that.
I think the true story needs to be re-written about this. ISPs continue to allow spoofed SPAM through their sites, but they're not held accountable for their users' actions.
To resolve that problem, ISPs that don't take concrete action to block spoofed SPAM on their networks should have either their internet licenses revoked or at least their entire IP subnet range completely blacklisted for anywhere from 3 days to one month, depending on whether it was their first warning, second warning or umpteenth warning.
That would bring the amount of SPAM down regardless of whether users see it or not.
Ask any internet backbone router provider for stats on bandwidth usage and compare that with the millions or even billions of SPAM messages sent out per day and you'll see that SPAM eats away at the internet's responsiveness. It slows everything down.
So how about a re-write or follow-up on this story with the correct information this time?