Using Statistics to Mislead

Most of you don't know much about me, because I haven't shared that much here; I don't post much about my personal life, and I'm not inclined to because I'm kind of a target for some of y'all.  

I'm not complaining about this; just explaining why it's unlikely that anyone here would know that, when it comes to math and statistics, I'm formidable.  I'm not saying this to be arrogant: it's not anything that I take credit for.  I just happen to have a facility for mathematics and statistics.  I find them boring but I'm extremely good with them and when I teach my students about research and research methodology, I often do so with an eye towards getting them to distrust what they see in mainstream media; even if not intentionally deceptive (which it sometimes is), the media uses math, graphs and statistics to misrepresent reality from time to time.

I expect this from the mainstream media, and I try to teach my students to look beyond the simple.

But I don't expect it here.

Two decades ago, I was working for a congressional candidate and doing statistical analysis for him.  We'd received some polling data.  I don't remember the exact details, but it came down to something along the lines of: 35% favoring his position, 32% favoring his opponent's, with 33% undecided.  This was a small difference, but he wanted me to present the data differently.  He wanted me to remove the undecided from the numbers so he could claim that he had 52% support for his point of view.
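For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The 35/32/33 split is only my rough recollection, as noted above, and the function name is made up for illustration.

```python
def share_excluding_undecided(favor, oppose):
    """Percent support for `favor` once undecided respondents are dropped."""
    decided = favor + oppose
    return 100 * favor / decided


# Approximate poll numbers as recalled in the text (percent of all respondents).
favor, oppose, undecided = 35, 32, 33

print(round(share_excluding_undecided(favor, oppose), 1))  # 52.2
```

Dropping a third of the respondents turns a 3-point lead among everyone polled into an apparent majority.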

This was a statistical lie, and I told him so, and told him that if he were going to pull that, he'd have to have someone else working on the data analysis because I wouldn't do it.  This sort of crap pisses me off, and offends me because it violates the truth.  This is why I'm deeply disappointed to have seen the following:

In a Front Page Diary, we have the following graph:

Note the low end of that scale: it puts 10,200 at the bottom of the scale, making this look like a huge disparity between the two.
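To put a rough number on the distortion, here is a small sketch. The two values are approximate readings off the graph (roughly 11,750 and 10,800 votes per delegate), not exact figures, and the function name is hypothetical; the 10,200 baseline is the bottom of the scale noted above.

```python
def bar_height_ratio(a, b, baseline=0):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


# Approximate values read off the graph (assumption, not exact data).
clinton, obama = 11_750, 10_800

true_ratio = bar_height_ratio(clinton, obama)
apparent_ratio = bar_height_ratio(clinton, obama, baseline=10_200)

print(f"{true_ratio:.2f}")      # 1.09
print(f"{apparent_ratio:.2f}")  # 2.58
```

With a zero baseline the taller bar is about 9% taller; with the baseline at 10,200 it is drawn more than two and a half times as tall.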

This is what that graph should look like (Updated from the comments thread: mine didn't work for some reason):

There are arguments to be made that the current system is unfair and undemocratic.  I happen to agree that it is.  I have no idea who would have won this nomination had the system been different, nor do I care at this point.  Right now, this is the system we're using and we probably do need to reform it, but the only meaningful way of doing so is to do it through the actual evidence.

I have no problem with arguments to change the system, but they shouldn't be based on deceptive graphs and they shouldn't be based on using numbers to present information with false spin.

This is the sort of thing I expect from the Heritage Foundation, not from Jerome.  





Re: Using Statistics to Mislead (2.00 / 5)

Personally, I think a difference of nearly 1,000 voters per delegate is pretty big, but I guess different graphs take different perspective on how much individual voters actually matter.  


2004 swing state margins: PA-2%, OH-2%, IA-1%, WI-0.5%, MI-3%, FL-5%, NM-1%; Alienating 50% of the party is a luxury we can't afford.
by BPK80 on Sun May 25, 2008 at 07:03:29 AM EST

Re: Using Statistics to Mislead (2.00 / 25)

Personally, I think a difference of nearly 1,000 voters per delegate is pretty big, but I guess different graphs take different perspective on how much individual voters actually matter.

Which is fine.  Go ahead and think that.  My problem is with presenting a graph that makes it look like a 3:1 difference.


I'm only a click away
by juliewolf on Sun May 25, 2008 at 07:09:28 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 3)

Agreed.  But it's not uncommon for graphs to use baselines other than zero for various purposes.  With your familiarity with statistics and graphs, I'm sure you have seen several.


2004 swing state margins: PA-2%, OH-2%, IA-1%, WI-0.5%, MI-3%, FL-5%, NM-1%; Alienating 50% of the party is a luxury we can't afford.
by BPK80 on Sun May 25, 2008 at 07:13:08 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 10)

I've seen many, and they are often misleading.


I'm only a click away
by juliewolf on Sun May 25, 2008 at 07:17:54 AM EST
[ Parent ]

Re: Using Statistics to Mislead (1.60 / 5)

And sometimes they are not.  


2004 swing state margins: PA-2%, OH-2%, IA-1%, WI-0.5%, MI-3%, FL-5%, NM-1%; Alienating 50% of the party is a luxury we can't afford.
by BPK80 on Sun May 25, 2008 at 07:24:52 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 7)

In this case it is misleading to use a non-zero baseline.  Do you agree?


Consider that everything which happens, happens justly, and if thou observest carefully, thou wilt find it to be so. -Marcus Aurelius
by Blue Neponset on Sun May 25, 2008 at 07:40:58 AM EST
[ Parent ]

Re: Using Statistics to Mislead (1.66 / 12)

It depends on what the presenter is attempting to prove.  

If the goal is to make it seem that Hillary received 15x as many voters per delegate, then that would be misleading.

If the graph is trying to stress how large of a popular difference inheres in a nearly 1,000 voter per delegate disparity, then the graph accomplishes it.  


2004 swing state margins: PA-2%, OH-2%, IA-1%, WI-0.5%, MI-3%, FL-5%, NM-1%; Alienating 50% of the party is a luxury we can't afford.
by BPK80 on Sun May 25, 2008 at 08:11:18 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 18)

I'm sorry, but that's bs.  The whole point of a graph is to visually represent a proportional difference between values to give the person looking at the graph a quick idea of how much the difference is.  

Jerome's Graph fails to do that.  Period.  

Having a graph that shows the difference properly would have been effective enough.  Why the need to slide the baseline above zero?  All it does is mislead the reader of the graph.  Thus the creator of the graph's intention is not to show that there is a difference, but to misrepresent how big the difference really is.

Why?  Why would someone who operates a blog in the "reality" based community use such a graph to make a point?  It only weakens his argument by making it look biased.

Nice.


by herenow on Sun May 25, 2008 at 11:31:29 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 6)

It depends on what the presenter is attempting to prove.

This is exactly the opposite attitude that scientists try to use when presenting and appraising data.

Data should be presented as neutrally as possible. To the degree that data are presented in a way that influences their appraisal is the degree to which the data are presented fraudulently.

Of course, data are dishonestly presented all the time, even in science, but that doesn't make it any better.

Fyi, here's what the graph looks like with all of the methods of calculating the popular vote (as I posted in the original thread):


by randomscientist on Sun May 25, 2008 at 12:16:07 PM EST
[ Parent ]

And here's... (2.00 / 2)

What the popular vote counts for:

0

I've seen pretty graphs showing correlations between party representation in Congress and the number of sunspots. Pretty, yes -- meaningful, no.

by Twin Planets on Sun May 25, 2008 at 01:37:16 PM EST
[ Parent ]

Re: And here's... (none / 0)

According to most political theories, the popular vote is one of the foundations of representative legitimacy.

For me, 2000 resulted in an illegitimate presidency that was not based on the will of the people.

And for me, Obama is the legitimate nominee because he has an unsurpassable popular vote lead as well as the most delegates.

Back then most Democrats believed as I do; now I wonder why most don't anymore.


"Another problem we have...is that in election years we behave somewhat as primitive peoples do at the time of the full moon." --Harry Truman
by Ernst on Mon May 26, 2008 at 08:03:05 AM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

I'm glad that you reposted it; it didn't get the attention it should have last time.

I really think somebody should do a calculation based upon proportional distribution per state instead of per district. I think that would largely remove any discrepancy in the votes per delegate.

It should be an easy fix for this problem. It's also good to note that the difference has never been so low; winner-take-all contests create a far bigger distortion between delegates and votes. It's important to note that it's all about improving the system further.


"Another problem we have...is that in election years we behave somewhat as primitive peoples do at the time of the full moon." --Harry Truman
by Ernst on Mon May 26, 2008 at 08:10:04 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

Yay - even more bullshit justifications from Clinton supporters! It's supposedly good that the graph deceives, since the diarist wanted it to deceive. (And since it was done in the service of Clinton, of course, of course.)

If 1,000 voters per delegate were such a huge disparity, then Jerome wouldn't have needed the deceiving graph at all. The very idea of an honest graph is to represent numbers as analogous lengths or areas.

But for you their purpose is to deceive, so of course you liked Jerome's graph. What a surprise.


by Aris Katsaris on Sun May 25, 2008 at 02:11:25 PM EST
[ Parent ]

And Like I Pointed Out (2.00 / 3)

the MATH just doesn't work. If you multiply the votes per delegate by the delegates allotted according to the front page of MyDD, it doesn't make any sense. Whether you count FL & MI or not, neither metric gives you a believable vote total.

To me, the graph looks like Clinton is getting 11750 per delegate, to Obama's 10800. Work it out. It's senseless.


by RNinNC on Sun May 25, 2008 at 07:28:13 AM EST
[ Parent ]

Re: And Like I Pointed Out (none / 0)

Your calculation, as I understand it, just is wrong in how it attempts to capture the differential.

Here's how the calculation behind the graph works, presumably.

For each state, there is a total vote and a total number of pledged delegates. The number of votes per delegate is determined by taking that total vote and dividing by the number of delegates.

This is done both for primary states and caucus states. After this calculation, the delegates are sorted into Obama delegates and Hillary delegates. The number of votes per delegate for both Obama and Clinton are calculated by taking the average of the votes per delegate for all the delegates on each side.

If you do the calculation like this -- which makes perfect sense -- you do not expect that the calculation you have, namely taking the total number of delegates for each side and multiplying them by the total number of votes per delegate for that side, will result in general in the total number of votes for that side.

That calculation is just a confusion of what's really going on.
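
A minimal sketch of the point, using entirely made-up numbers for one primary state and one caucus state (none of these figures come from the actual contests, and the function names are hypothetical). It assigns each delegate the votes-per-delegate figure of the state it came from, as described above, and then checks whether multiplying back recovers the candidate's actual vote total.

```python
# Hypothetical data for one candidate; none of these numbers are real results.
states = [
    # state-wide votes and delegates, then the candidate's own votes and delegates
    {"votes": 1_000_000, "delegates": 100, "cand_votes": 450_000, "cand_delegates": 45},  # primary
    {"votes": 20_000, "delegates": 20, "cand_votes": 13_000, "cand_delegates": 14},  # caucus
]


def implied_votes(states):
    """Sum of each delegate's state-level votes-per-delegate figure."""
    return sum(s["cand_delegates"] * s["votes"] / s["delegates"] for s in states)


def average_votes_per_delegate(states):
    """Average votes per delegate across all of the candidate's delegates."""
    return implied_votes(states) / sum(s["cand_delegates"] for s in states)


actual_votes = sum(s["cand_votes"] for s in states)

print(round(average_votes_per_delegate(states)))  # 7864
print(implied_votes(states))  # 464000.0
print(actual_votes)  # 463000
```

The two totals differ here only because the caucus state's delegates (14 of 20) are not exactly proportional to the candidate's 65% of the vote. If every state split its delegates exactly in proportion to votes, the two figures would match.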


by frankly0 on Sun May 25, 2008 at 12:05:51 PM EST
[ Parent ]

Re: And Like I Pointed Out (none / 0)

Just to clarify:

Why is the average votes per delegate so much lower than one might "expect", based on dividing the total vote by the total number of delegates?

Because that average is brought way down by the pledged delegates from caucus states.


by frankly0 on Sun May 25, 2008 at 12:13:04 PM EST
[ Parent ]

Re: And Like I Pointed Out (none / 0)

Actually, as I thought about it a bit more, it may be that the votes per delegate might be the same either way it might be calculated -- I'd have to go through the calculation to see if one can be derived from the other (which I'm not going to bother to do). If you used the median, though, instead of the average, the two calculations would certainly be different in general.


by frankly0 on Sun May 25, 2008 at 04:47:43 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

Indeed intellectual dishonesty is not uncommon.


by french imp on Sun May 25, 2008 at 01:39:42 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 2)


Agreed.  But it's not uncommon for graphs to use baselines other than zero for various purposes.  With your familiarity with statistics and graphs, I'm sure you have seen several.

There is only one purpose of using baselines other than zero: to draw focus to (and magnify) the pattern above the baseline actually used. There are honest reasons to do so, if the pattern itself is the focus: for example, an ac signal on a dc carrier.

There is nothing special in this simple bar graph - no pattern to be amplified for better viewing. The sole purpose of this graph is to distort the difference between the two numbers posted. And no amount of hand-waving or spin can change that.


It is not because I cannot explain that you won't understand. It is because you won't understand that I cannot explain. - Elie Wiesel
by Sumo Vita on Sun May 25, 2008 at 06:41:19 PM EST
[ Parent ]

Jerome presented a graph (2.00 / 3)

from another site which he provided a link to first, with a host of other graphs to compare and contrast.

Calling him out publicly to label as deceitful with the intent to mislead is way over the line and totally inappropriate.


by phoenixdreamz on Sun May 25, 2008 at 08:43:30 AM EST
[ Parent ]

Re: Jerome presented a graph (2.00 / 7)

Then he should have just left it as a link, instead of intentionally posting one of them over here, in an effort to try to prove a point.


Users who are excessively bashing the Democratic Party, or being Republican trolls, will be banned.
by Massadonious on Sun May 25, 2008 at 08:57:24 AM EST
[ Parent ]

Re: Jerome presented a graph (none / 0)

True.  More than two sentences of accompanying explanatory text might have done wonders too.


Unseen, in the background, Fate was quietly slipping the lead into the boxing glove.
by fogiv on Sun May 25, 2008 at 01:22:01 PM EST
[ Parent ]

Re: Jerome presented a graph (2.00 / 6)

"Jerome presented a graph from another site which he provided a link to first, with a host of other graphs to compare and contrast.

Calling him out publicly to label as deceitful with the intent to mislead is way over the line and totally inappropriate"

If you had clicked that link, the chance you would have been deceived could have increased.

Of the "host of other graphs to compare and contrast" the only one that did not have a zero base line was the one that Jerome posted. It was positioned after/under many charts that had a zero base line.

So, to be charitable I will assume Jerome was himself deceived and in his haste to diary good news for Senator Clinton accidentally widened the circle of deception.


by My Ob on Sun May 25, 2008 at 09:19:16 AM EST
[ Parent ]

Re: Jerome presented a graph (2.00 / 1)

Well, that's the graph that appears on the main page.  And, it's the only graph in the set that shows a disparity in such a visually striking way.  So, I think people can draw their own conclusions as to why that particular graph is gracing the front page.


by rfahey22 on Sun May 25, 2008 at 11:46:17 AM EST
[ Parent ]

Re: Jerome presented a graph (1.33 / 12)

yep.   this diarist is just GRANDSTANDING. get over it I say to her. She already posted all this information in the original diary thread. and yes, slandering the front-pager - really not kosher.


by swissffun on Sun May 25, 2008 at 11:59:14 AM EST
[ Parent ]

Two things (2.00 / 15)

She already posted all this information in the original diary thread. and yes, slandering the front-pager - really not kosher.
(1) Please prove that I posted all this information in the original diary thread.  

(2) you have just accused me of an actionable offense.  Before you go throwing around words like "slander" you should make sure you understand what they actually mean.  


I'm only a click away
by juliewolf on Sun May 25, 2008 at 12:02:59 PM EST
[ Parent ]

Re: Jerome presented a graph (2.00 / 4)

Slander denotes a spoken falsehood, so you're doubly wrong.  You meant libel, which is a written falsehood and still inapplicable.  The diarist wrote that the graph in Jerome's front page post is misleading, which is absolutely true.


by deminva on Sun May 25, 2008 at 02:38:58 PM EST
[ Parent ]

Re: Jerome presented a graph (2.00 / 1)

Jerome is a smart guy. Either he didn't take the time to examine the graph properly, in which case he was ridiculously careless in his rush to make a point, or he did see how misleading it was, and he posted it anyway.

When an oil company front-man releases fake information to show that global climate change isn't real, I absolutely blame them. But I also blame the media who are either too stupid or too eager to tell the "controversy" story to point out the flaws in the science. Both are at fault for misleading the public, both should be criticized, and both should lose some of the public trust. This is no different.


Walberg Watch - Following Radical Conservative Rep. Tim Walberg in MI-07
by Fitzy on Sun May 25, 2008 at 12:42:50 PM EST
[ Parent ]

Re: Jerome presented a graph (none / 0)

In the critical thinking textbook I use for my introductory logic course (Parker and Moore), an example identical to this is used in the informal fallacy and rhetoric chapter as an instance of a visual fallacy and distortion.  There the authors reproduce a bar graph from USA Today published during the Terri Schiavo debacle.  The poll asked "should Schiavo be taken off of life support?" and sorted the answers according to party affiliation.  When they published the results they numbered the vertical axis in such a way as to visually give the impression of a massive disparity between Democrats and Republicans on the issue (by numbering each unit:  1, 2, 3, 4... 11, 12, etc.) when, in fact, there was about a 7% disparity between the parties on this matter.  They were either trying to suggest that there was more of a controversy here than there was or that Republicans care more about human life.

This graph does exactly the same thing and is an instance of the sort of shenanigans and spin that have driven so many of us up the wall during this campaign.  I doubt that Jerome was deliberately trying to mislead, but Juliewolf is absolutely right here.  This is a visually dishonest graph.  Moreover, these sorts of tactics do Clinton a disservice as they give people the impression that her supporters aren't intellectually honest in their arguments.


by Philoguy on Sun May 25, 2008 at 03:35:07 PM EST
[ Parent ]

Re: Jerome presented a graph (none / 0)

Let me join the chorus:  

The graph posted on the front page was terribly misleading.  

Bravo for this diary.


by Bargeron on Sun May 25, 2008 at 03:15:46 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

You must be joking, do you recall what an X and Y axis is for, or are your students really that stupid to think only in visual terms. Jay Cost made the graph, take it up with him.

Besides, this is all a distraction from the main point. That delegates are not evenly gotten in the current set-up, as a measure of votes. Seems pretty un-democratic to me.


by Jerome Armstrong on Sun May 25, 2008 at 06:45:00 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 6)

You must be joking, do you recall what an X and Y axis is for, or are your students really that stupid to think only in visual terms.
Well, that's a disappointing response.

Jay Cost made the graph, take it up with him.
You used it to present it as evidence of something.  I think that's your responsibility, not his.

Besides, this is all a distraction from the main point. That delegates are not evenly gotten in the current set-up, as a measure of votes. Seems pretty un-democratic to me.
This system is not designed to be democratic and I think it needs to be redesigned in order to be more Democratic.  

But you should really focus on real evidence as to why the system is undemocratic, such as the involvement of unelected superdelegates in the race.  When you try to present a very small difference between the vote per delegate numbers of Clinton and Obama as though it's a large one, you do a disservice to the case for reform.  

You may not believe this, but I have a bit of respect for you (I loved "Crashing the Gates"), but every time something like this happens, that respect wavers a bit and I have to question whether or not it's well-placed.

So yeah, let's reform the nomination process, but let's do so based on relevant facts about the system, not on misleading graphs.


I'm only a click away
by juliewolf on Sun May 25, 2008 at 09:17:09 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 8)

do we really have to go over why amalgamating 60 different vote counts from 60 different contests with 60 different sets of rules does not a useful measure make?

Each contest had different rules. My mom wanted to vote for Obama, but she's not a registered Democrat and our state has closed primaries. If she had lived twenty miles south, she would have counted. Combining the results from contests with different rules and pretending it's more fair doesn't make any sense.


The primaries are over!
Focus on McCain
by really not a troll on Sun May 25, 2008 at 07:10:59 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 2)

It's symbolic and a better indicator of people's will than the fruits of an even more state-to-state incongruent pledged delegate apportionment system, which clouds the determination of will with distortions like caucus results and a variable augmenting the voice of certain voters having a district with Democratic past voting behavior.


2004 swing state margins: PA-2%, OH-2%, IA-1%, WI-0.5%, MI-3%, FL-5%, NM-1%; Alienating 50% of the party is a luxury we can't afford.
by BPK80 on Sun May 25, 2008 at 07:16:00 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 4)

Both pledged delegate count and "popular vote" counts are misleading. The idea that sixty different state-level snapshots taken over a five-month span with different rules in each contest is in any way representative of the democratic electorate (either via pledged delegates or popular vote) is absurd.

The difference is that while "popular vote" is merely symbolic, delegates actually determine the nominee.

The best way to determine the will of democrats across the country is probably polls. For all their inaccuracy, they at least follow uniform rules across the country and represent a narrow, recent band of time.


The primaries are over!
Focus on McCain
by really not a troll on Sun May 25, 2008 at 07:26:26 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

And right now, Obama is leading almost comfortably.  The rejoinder would be that polls taken at this point are meaningless, blah blah blah, so I'll save some of you the trouble of typing it.


by ReillyDiefenbach on Sun May 25, 2008 at 12:03:52 PM EST
[ Parent ]

I look at it a different way. (2.00 / 9)

It shows Barack Obama knew the rules, and was ready from Day One.  That's how he netted seven more delegates from Illinois (53) than Clinton did from more delegate-rich New York (46).  It's how Obama netted one fewer delegate from Georgia (36) than Clinton did from California (37).  It's how Obama netted more delegates from Idaho (12) than Clinton did from New Jersey (11).  It's how Obama netted the same number of delegates from Virginia (25) as Clinton did from the Texas (4), Ohio (9), and Pennsylvania (12) primaries combined.

Hillary Clinton picked up this concept in West Virginia and Kentucky, but by then it was too late.  Barack Obama was ready from Day One.


by Brad G on Sun May 25, 2008 at 12:02:31 PM EST
[ Parent ]

Re: I look at it a different way. (none / 0)

More from Idaho than from New Jersey, that's a hoot!


by ReillyDiefenbach on Sun May 25, 2008 at 12:04:50 PM EST
[ Parent ]

Obama: An experienced, vetted leader. (none / 0)

I like the ring of that. Of course, that's what Hillary claimed and look how poorly it worked out for her. I guess Obama should stick with his existing tactics which have worked well enough to win him the nomination.

It's a shame Hillary is so inexperienced that she didn't know the rules of the Democratic primary. Just goes to show what a lousy president she'd make.


by edg1 on Sun May 25, 2008 at 04:38:37 PM EST
[ Parent ]

Re: Obama: An experienced, vetted leader. (none / 0)

On that, we disagree.  I actually think Hillary Clinton would make a good President -- certainly one much better than her husband, John McCain, and the current occupant of the White House.  She just got really unlucky by having to run against a truly gifted campaigner whose experiences and compelling message matched what the voters saw as the nation's needs of the time.  Barack Obama picked the right time and the right place to run.

I doubt Barack Obama would have done this well in 1996, 2000, or even 2004.  His message simply wouldn't have fit in with the nation's mood at the time.


by Brad G on Mon May 26, 2008 at 11:04:11 AM EST
[ Parent ]

Re: I look at it a different way. (2.00 / 1)

I agree with this -- Obama showed exactly the skills I would want in a president -- understanding a complex and arcane process, and figuring out a way to utilize it to the best advantage and greatest efficiency of effort.  Clinton -- despite an enormous initial financial advantage and a team of veteran political operatives with a wealth of experience -- has lost the nomination because of her inability to win caucuses, and apparently to understand that these losses are significant.  This pretty directly undermines the whole electability argument -- she ran a campaign that left a huge opening for Obama and Obama used it to win.  And though you may not like it, he has in fact won this and he's waiting at this point until the primaries are over out of politeness and to avoid alienating the Hillary voters he'll want to get back for November.  


by Headlight on Sun May 25, 2008 at 08:30:46 PM EST
[ Parent ]

Re: I look at it a different way. (none / 0)

I think by just not competing in states when there is proportional allocation and two candidates, she learned the hard way how your opponent can rack up huge delegate gains on the cheap.


by Brad G on Sun May 25, 2008 at 09:46:27 PM EST
[ Parent ]

When that 1,000 votes represents... (none / 0)

... less than a one percent difference, as in this case, it's a lot less significant.


Ignorance is weakness. Get strong.
by tbetz on Sun May 25, 2008 at 02:28:29 PM EST
[ Parent ]

Argh, misread the numbers... (none / 0)

... lost a zero, so it looks like about 8% difference.


Ignorance is weakness. Get strong.
by tbetz on Sun May 25, 2008 at 02:34:19 PM EST
[ Parent ]

Re: Argh, misread the numbers... (none / 0)

Which is actually a significant difference: smaller than with winner-take-all elections but larger than we should want. A change to proportional distribution at the state level instead of the district level would reduce the difference to more acceptable levels.

Luckily the difference was not large enough to affect the election this time, so it's still an academic problem.


"Another problem we have...is that in election years we behave somewhat as primitive peoples do at the time of the full moon." --Harry Truman
by Ernst on Mon May 26, 2008 at 08:53:30 AM EST
[ Parent ]

Re: Argh, misread the numbers... (2.00 / 1)

I favor a change to proportional on a statewide level.  But that's not the reason for the disparity.  I saw a comparison from after Super Tuesday showing that Clinton actually received a few more pledged delegates under the current system than if we had a statewide proportional system.  The reason for the disparity is that Obama did exceedingly well in the caucuses, which have much lower turnout than primaries.


John McCain vows to overturn Roe
by soccerandpolitics on Mon May 26, 2008 at 07:59:57 PM EST
[ Parent ]

Re: Argh, misread the numbers... (none / 0)

Ah, I stand corrected.

Then I guess the district-level distribution doesn't lead to a significant difference, because the large number of districts makes sure the rounding errors are evenly distributed among the candidates.


"Another problem we have...is that in election years we behave somewhat as primitive peoples do at the time of the full moon." --Harry Truman
by Ernst on Tue May 27, 2008 at 04:15:44 AM EST
[ Parent ]

Re: Argh, misread the numbers... (2.00 / 1)

Exactly.  Poblano did an analysis on DKos of this sometime after Super Tuesday, when about half of the delegates had been selected.  IIRC, Clinton had about 4 or so more pledged delegates than she would have been entitled to under a strict statewide proportional system.  Maybe this has changed since all the more recent elections.  But, as you surmise, the large number of districts and the randomness of who is advantaged makes it unlikely for any candidate to gain a really significant advantage.


John McCain vows to overturn Roe
by soccerandpolitics on Wed May 28, 2008 at 12:53:06 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

image link is broken


The primaries are over!
Focus on McCain
by really not a troll on Sun May 25, 2008 at 07:08:20 AM EST

Re: Using Statistics to Mislead (none / 0)

nevermind, it works now. thanks!


The primaries are over!
Focus on McCain
by really not a troll on Sun May 25, 2008 at 07:11:26 AM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

Weird-- I updated it to use yours anyway :)


I'm only a click away
by juliewolf on Sun May 25, 2008 at 07:12:54 AM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

cool :-)
glad to have been of service
The primaries are over!
Focus on McCain
by really not a troll on Sun May 25, 2008 at 09:11:54 AM EST
[ Parent ]

Re: Using Statistics to Mislead (1.33 / 3)

so the numbers are different how exactly?


by zerosumgame on Sun May 25, 2008 at 11:56:39 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

The visual (which is why you use a graph to begin with) is different.

The visual that Jerome used looked like a huge disparity, the actual difference is not as large as it appeared.


Like the nominee, don't like the nominee... Our nominee is still better than John McCain...
by JenKinFLA on Sun May 25, 2008 at 01:15:22 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

Uprated because missing the obvious is not the same thing as attacking another user or being a Freeper.


by letterc on Sun May 25, 2008 at 05:27:30 PM EST
[ Parent ]

Julie, I don't think even Heritage Foundation (2.00 / 7)

would post something that laughable!

I got a real belly laugh when I finally saw how he had the X axis set up.  If he wanted to make it REALLY SUPER-DUPER GINORMOUS impressive, he could have set the origin at 10,600.  Maybe drawn a little bird's nest or diving board at the top of the red column.


by Dumbo on Sun May 25, 2008 at 08:10:29 AM EST

AND! ... (2.00 / 1)

Somebody needs to make a blue and red graph of the current state of the delegate race to show how far ahead Obama is.  I suggest we set the origin point at 1700 delegates.

Latest count: Obama 1972, Clinton 1780.


by Dumbo on Sun May 25, 2008 at 08:14:25 AM EST
[ Parent ]

Re: AND! ... (2.00 / 16)

As the placement of the x-axis clearly shows, Clinton has negative delegates!


by SupremeCourt on Sun May 25, 2008 at 11:56:12 AM EST
[ Parent ]

Re: AND! ... (2.00 / 1)

Major reverse-distortion mojo!


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 11:58:40 AM EST
[ Parent ]

That's a great visual (2.00 / 1)

and clearly shows the huge disparity that exists. Makes it hard to understand why Clinton is still in the race. ;)


by casperr on Sun May 25, 2008 at 02:00:13 PM EST
[ Parent ]

Re: AND! ... (2.00 / 1)

Hey SupremeCourt, I used this image and credited you in my post here:

http://www.mydd.com/story/2008/5/25/17479/7977

I'd mojo you if I could.  :)


John McCain the flip-flopper...
by chinapaulo on Sun May 25, 2008 at 06:13:56 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 4)

I have taught research methodology classes to college students as well.

There is no particular way that this graph "should look like."  Using a 0 base is not some kind of golden rule for making graphs, such that any deviation from it is always wrong.  Nor does deviating from a 0 base indicate that Cost is somehow not presenting the "actual evidence," as you say.  Graphs like Cost's are actually found frequently in both the social and natural sciences when dealing with phenomena in which small discrepancies across categories are very meaningful.  

To draw a loose parallel between this choice, which is a common practice in many fields, and your boss's desire to actually manipulate his data is itself very deceptive - maybe this diary should be called "Using False Analogies to Mislead."


John McCain: Extending SCHIP would be an "unfunded liability."
by Fuzzy Dunlop on Sun May 25, 2008 at 09:44:07 AM EST

Re: Using Statistics to Mislead (2.00 / 13)

There is no particular way that this graph "should look like."

True.  But there are ways to present the information in a false light.  This is one such method.

Using a 0 base is not some kind of golden rule for making graphs, such that any deviation from it is always wrong.

True.  But if what you're comparing is a count of a set of factors, you should be starting at the baseline for that count.   I.e., zero.  If you do not do this, you should indicate why you have chosen not to do so.

Nor does deviating from a 0 base indicate that Cost is somehow not presenting the "actual evidence," as you say.

In this case, it really does.  Suppose you had a pie chart showing 75% for one group and 25% for another.  Then, you discover that the counting factor was started at 10,000, with one group getting 10,075 and the other getting 10,025.  You can include a legend in small print which indicates that the factor in question is n-10,000, so the graphic is not technically deceptive, but that doesn't make it a realistic representation of the information in question.


I'm only a click away
by juliewolf on Sun May 25, 2008 at 11:51:12 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

"True.  But if what you're comparing is a count of a set of factors, you should be starting at the baseline for that count.   I.e., zero.  If you do not do this, you should indicate why you have chosen not to do so."

That is just ridiculous.  For example, if I wanted to make a graph showing Manny Ramirez's career home run total (498 I believe) and the players he might pass in the next season on the all time list, why would I start this graph at 0?  I would want to show in more detail exactly how many home runs each player has, so I would pick a different scale and make it clearly labeled.  There is nothing magical about "counting."

"In this case, it really does.  Suppose you had a pie chart showing 75% for one group and 25% for another.  Then, you discover that the counting factor was started at 10,000, with one group getting 10,075 and the other getting 10,025.  You can include a legend in small print which indicates that the factor in question is n-10,000, so the graphic is not technically deceptive, but that doesn't make it a realistic representation of the information in question."

This is a terrible analogy, as of course a proportion of a total that subtracts 10,000 from that total would be deceptive.  But this is not the case here at all.


John McCain: Extending SCHIP would be an "unfunded liability."
by Fuzzy Dunlop on Sun May 25, 2008 at 02:26:13 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

"This is a terrible analogy, as of course a proportion of a total that subtracts 10,000 from that total would be deceptive.  But this is not the case here at all."

How could it be deceptive if it's clearly labeled what's being measured?

What kind of idiots do you think people who read graphs are?


I'm only a click away
by juliewolf on Sun May 25, 2008 at 04:08:25 PM EST
[ Parent ]

Re: Using Statistics to Mislead (1.00 / 2)

Oh please.  Scaling a graph in a way that doesn't use a 0 is absolutely common and accepted practice in many fields and the axes tell the whole story simply and clearly.  Your ridiculous example would not be found in any academic or journalistic work.    

You are now just constructing strawmen, probably because your position about the absolute necessity of "counting" graphs utilizing a 0 is simply indefensible.

I know all I need to know at this point about your supposed expertise in statistics.  It's just too bad that you have managed to pull the wool over the eyes of so many others here.


John McCain: Extending SCHIP would be an "unfunded liability."
by Fuzzy Dunlop on Sun May 25, 2008 at 04:24:39 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

Lie Factor = size of effect shown in the graph divided by size of effect shown in data.  If the lie factor of a graph is greater than 1, the graph is exaggerating the size of the effect.


by edg1 on Sun May 25, 2008 at 04:48:48 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

Yeah, except what is the "effect shown in data?" The lie factor idea assumes that the effect shown in data is a proportional difference between two categories, but this is not always the salient aspect of the data for a given research question.

That's why it is far from uncommon, at the highest levels of academic practice, to use graphs that don't employ a zero.  What does that tell you?

This debate is getting tiring.  I think I'm done with it.


John McCain: Extending SCHIP would be an "unfunded liability."
by Fuzzy Dunlop on Sun May 25, 2008 at 05:06:44 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

Hide rated for insulting another user. You should have quit before this comment. For what it was worth, you had already made your point.


by letterc on Sun May 25, 2008 at 05:31:19 PM EST
[ Parent ]

Oh please yourself (none / 0)

I don't normally repeat myself, but I'll make you the exception.

Yes, scaling from a non-zero base is common - but only when there's significance to the range called into focus. There's nothing significant about the simple difference between two integers that should have necessitated a bar graph to begin with.

The reason for its use is quite apparent though, in the unscrupulous employment of the non-zero scale. This is done to magnify the relatively insignificant difference between the two counts. At first glance, it appears as though HRC's popular votes won per delegate are more than twice Obama's.

Which I need hardly remind you, is a blatant lie. Instead of empty insults, perhaps you'd care to point out why graphing with a non-zero base was felt to be necessary at all.


It is not because I cannot explain that you won't understand. It is because you won't understand that I cannot explain. - Elie Wiesel
by Sumo Vita on Mon May 26, 2008 at 02:45:16 AM EST
[ Parent ]

that's not the same as using a graph to (none / 0)

fudge the perception


-- be excellent to each other
by kindthoughts on Sun May 25, 2008 at 12:52:07 PM EST
[ Parent ]

Re: Using Statistics to Mislead (none / 0)

I teach statistics as well (that's all I teach), and I both agree and disagree with you.  I agree that there is no inherently "right" way to make a graph.  I also agree that the research question may help determine the scale of a graph, and there are definitely plenty of cases when an axis doesn't start at 0.  

However, in almost every case I've ever seen (and published myself), there is an effort to clarify that the origin did not start at 0 so that readers are not misled.

So, I think the diarist is being a little extreme in calling out Jerome, but I think you're being a little extreme, also, by claiming that there are no rules for graph making (in fact, I teach a section on this in my courses, and I point students to a book called 'How to Lie with Statistics' for details on how to make good graphs.)  


by slynch on Mon May 26, 2008 at 01:38:40 AM EST
[ Parent ]

The "Lie Factor" in Jay Cost's graph (2.00 / 14)

I posted this as a comment on Jerome's post but thought I'd put it here too.

Edward Tufte's book The Visual Display of Quantitative Information is widely considered the best book ever published on the subject.  He introduced the "Lie Factor" of graphs as follows:

Lie factor =

    size of effect shown in graphic
    -------------------------------
    size of effect in data

What is the lie factor in Jay Cost's graph?  

In the Cost graph, the difference = about 40 millimeters.

In the "integrity version," the difference = about 5 millimeters.

(Note: these are rough estimates: I copied each graph onto OpenOffice.org's Draw program, zoomed to 100 percent, and measured manually.  The graphs are about the same size but not exactly.)

So

The "Lie Factor" = 8

To quote Tufte, "Lie Factors greater than 1.05 or less than .95 indicate substantial distortion, far beyond minor inaccuracies in plotting."  

Note: I'm not accusing Jerome of lying.  Rather, I'm accusing Jay Cost of lying, based on a sound and established method of demonstrating lies in graphic representations of data.  
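
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the ratio used above. It assumes the rough 40 mm and 5 mm measurements and treats the zero-based "integrity version" as the faithful picture of the data; the function name `lie_factor` is made up for this illustration.

```python
def lie_factor(effect_in_graphic: float, effect_in_data: float) -> float:
    """Ratio of the effect shown in the graphic to the effect in the data."""
    if effect_in_data == 0:
        raise ValueError("effect in data must be non-zero")
    return effect_in_graphic / effect_in_data


# Rough on-screen measurements quoted in this comment (millimeters).
cost_graph_gap_mm = 40.0   # gap between the two bars in the Cost graph
zero_based_gap_mm = 5.0    # the same gap in the zero-based "integrity version"

factor = lie_factor(cost_graph_gap_mm, zero_based_gap_mm)
print(f"Lie Factor = {factor:.1f}")
```

Because the two graphs are only roughly the same size, the result is an estimate, not an exact figure.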


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 10:08:34 AM EST

SNAP! (2.00 / 1)

Well done!


Unseen, in the background, Fate was quietly slipping the lead into the boxing glove.
by fogiv on Sun May 25, 2008 at 01:41:34 PM EST
[ Parent ]

That book is absolutely fantastic (2.00 / 1)

It's a classic.  Thanks for mentioning it.


by semiquaver on Sun May 25, 2008 at 02:49:20 PM EST
[ Parent ]

Another way to calculate the Lie Factor (2.00 / 1)

It might be objected that measuring the graphs visually is too crude a way to calculate the Lie Factor. Well, you can also calculate a Lie Factor from how large the difference is as a percent of the total range each graph displays.  

Cost/Armstrong graph:
Total data represented = 1600
Difference = 1000
Difference as % of total = 63%

Integrity version:
Total data represented= 12000
Difference = 1000
Difference as % of total = 8%

Lie factor =

63%
----
8%

or about 7.5 (using the unrounded percentages)
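
The same calculation as a short Python sketch, assuming the round numbers above (a difference of 1000, a displayed range of 1600 on the truncated graph, and 12000 on the zero-based one). The name `share_of_axis` is just a label for this illustration.

```python
def share_of_axis(difference: float, axis_span: float) -> float:
    """Fraction of the displayed vertical range taken up by the difference."""
    return difference / axis_span


difference = 1000.0        # approximate gap in voters per delegate
truncated_span = 1600.0    # range displayed when the axis starts near 10,200
zero_based_span = 12000.0  # range displayed when the axis starts at 0

truncated_share = share_of_axis(difference, truncated_span)    # 0.625
zero_based_share = share_of_axis(difference, zero_based_span)  # about 0.083

factor = truncated_share / zero_based_share
print(f"{truncated_share:.1%} / {zero_based_share:.1%} = {factor:.2f}")
```

Note that the difference cancels out of the ratio, so under these assumptions the factor is simply 12000 / 1600: it depends only on how much of the axis was cut off.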


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 03:42:35 PM EST
[ Parent ]

A better calculation than mine (none / 0)

See here.  Jesse calculates a Lie Factor of 4.64.  Jesse's calculation is more rigorous than mine, which was a bit lazy.  But it still shows an astonishingly poor representation.  

Interestingly, MS Excel (which was probably used to create the graph) defaults to a Lie-Factor-Friendly form.  Which just shows why spreadsheet programs are not a solution.


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 05:23:56 PM EST
[ Parent ]

size effect in data... (none / 0)

I don't think that is quite right. The size of the effect in the data is not necessarily equal to the difference relative to the full value. De-meaning data is a valid practice.

A more appropriate lie factor would come from looking at the distribution by state of voters per delegate, and plotting the caucus vs primary voters per delegate on a scale that would include 95% of the state voters-per-delegate values. This would still give the figure Jerome highlighted a substantial lie factor, but much less than 8.

I'm too lazy to do the calculations myself (the voters/delegate numbers), but I have seen Wisconsin mentioned as having 15,000 voters per delegate (huge turnout, early state), while Michigan would have less than 5,000 voters per delegate if it were seated in full.

Actually, that gives an axis range of around 10,000, which gives us a lie factor in the 7-8 range again.


by letterc on Sun May 25, 2008 at 05:43:28 PM EST
[ Parent ]

That's a good point. (none / 0)

An external blog post says:

It looks like Clinton gets around 11,750 votes per delegate and Obama gets around 10,800. This is around a 13.2% difference in the data.

The size of the effect on the graph, however, shows a 61.3% difference between the two numbers. That's a Lie Factor of around 4.64!

This is a better calculation than mine, which was sloppy.


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 07:10:49 PM EST
[ Parent ]

Almost a double digit lie factor! (none / 0)


It is not because I cannot explain that you won't understand. It is because you won't understand that I cannot explain. - Elie Wiesel
by Sumo Vita on Mon May 26, 2008 at 11:47:23 AM EST
[ Parent ]

The first graph is right (1.50 / 2)

only in the sense that it colors Sen. Clinton's votes red.


by notme54 on Sun May 25, 2008 at 10:18:35 AM EST

Re: Using Statistics to Mislead (2.00 / 2)

Whoa; out of approximately 30 million votes cast, for approximately 4000 delegates in total, means approximately 75K votes per delegate

How on earth do you arrive, Jerome or anyone, at a number like 12K per delegate?

Indeed, a difference of 1000 votes or so per delegate using real numbers amounts to about a 2 percent difference; it is almost statistically insignificant

Jay Cost got his numbers wrong somewhere; please show me how 4000 goes into 30 million only about 12K times

Republicans use numbers to lie to us; I don't expect it from progressives

THANK you for this diary, julie;  I was appalled and disturbed by Jerome's deceptive FPP

recommended, before I get kicked out of here for daring to question Our Great Leader

Direct democracy is fine, just not on the website called "My Direct Democracy"

Go figure


by fightbull on Sun May 25, 2008 at 10:23:40 AM EST

Re: Using Statistics to Mislead (none / 0)

Whoops; never do math in the morning

7500 votes per delegate, not 75K
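
A quick check of the corrected figure in Python, assuming the round numbers quoted above (about 30 million votes and about 4000 delegates):

```python
total_votes = 30_000_000   # "approximately 30 million votes cast"
total_delegates = 4_000    # "approximately 4000 delegates in total"

votes_per_delegate = total_votes / total_delegates
print(f"{votes_per_delegate:,.0f} votes per delegate")
```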


by fightbull on Sun May 25, 2008 at 12:49:39 PM EST
[ Parent ]

Here's the plan: (none / 0)

1.  Calculate 30,000,000/4,000

2.  ?????????

3.  Profit!


Check out McCain.
by you like it on Sun May 25, 2008 at 01:00:04 PM EST
[ Parent ]

Re: Here's the plan: (none / 0)

I knew the underwear gnomes were behind this...Probably in league with the crab people.


John McCain is surprisingly bad for this country
by minnesotaryan on Sun May 25, 2008 at 02:31:27 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 4)

And what is more, Obama leads in the popular vote under most calculations, and all fair calculations, by about 1 and a half percent, which is about the 1000 vote difference "per delegate"

Denial is one thing; outright lying to us is another, Clinton fans


by fightbull on Sun May 25, 2008 at 10:25:39 AM EST

Re: Using Statistics to Mislead (none / 0)

Actually, Obama more than makes up the 1000 v/d difference in his delegate lead alone - which is why, as you point out, he does have the popular vote lead.


It is not because I cannot explain that you won't understand. It is because you won't understand that I cannot explain. - Elie Wiesel
by Sumo Vita on Mon May 26, 2008 at 11:50:56 AM EST
[ Parent ]

I'm interested in knowing the numerical (none / 0)

difference. As a "low informational" person, the origin (0, 10,200) allows me to see that the voters/pledged delegate ratio for Clinton is ~11,800 whereas for Obama it is ~10,800, a difference of ~1000 voters. The fixed origin (0, 0) shows there's a difference, but I cannot make out the numerical difference without superimposing detailed scaling.

Actually, for somebody who wants to know the difference, the second graph is visually deceptive, as it fudges the difference pretty well.

To refer to the first graph as a statistical lie, please provide statistical/mathematical proof that it is such.

Thanks


by louisprandtl on Sun May 25, 2008 at 10:32:37 AM EST

h/t to map (2.00 / 2)

So you don't see any problem with this graph?  You don't think it is misleading?  It's from a comment downthread.

http://www.mydd.com/comments/2008/5/25/65822/1936/29#29

I guess you're right, a delegate count from zero is deceptive visually because it "fudges the difference" making it look as though Clinton can win the nomination when she really can't.  From now on we should use only map's graph when discussing the delegate race.


Check out McCain.
by you like it on Sun May 25, 2008 at 01:17:11 PM EST
[ Parent ]

Unfortunately you didn't understand (none / 0)

what Jerome was trying to relate. He is trying to relate the fact that Clinton had to win more voters per pledged delegate than Obama had to. He is calling for Dem primary reform.

He is not making a point whether Clinton is electable in terms of delegate count as of today which is the point of the histogram that you referred to.


by louisprandtl on Sun May 25, 2008 at 01:35:13 PM EST
[ Parent ]

I agree that reform is needed. (none / 0)

Anyone who has been paying attention knows that the rules are arcane and outdated.  That is irrelevant to whether or not the graph is misleading.

How is the graph that I pointed out any less legitimate?  It shows the numeric difference between Clinton's and Obama's delegate counts quite well.  Its ability to show numeric differences was the justification that you cited in defense of the front paged graph.

Be fair.  Both graphs are misleading because they manipulate the origin of the y-axis.  What reason is there to start the graph at 10,200?


Check out McCain.
by you like it on Sun May 25, 2008 at 01:46:44 PM EST
[ Parent ]

Re: I'm interested in knowing the numerical (none / 0)

To relate the absolute difference one should just use a data table.  Humans interpret graphs in relative terms.


by vann on Sun May 25, 2008 at 07:06:18 PM EST
[ Parent ]

Well in most conferences that I've been to (none / 0)

Jerome's graph for differential representation using an origin other than (0,0) is good enough. As a human being, I look at the y-axis and derive my conclusions from the graph. Nothing more, nothing less.


by louisprandtl on Sun May 25, 2008 at 09:29:50 PM EST
[ Parent ]

Re: Well in most conferences that I've been to (none / 0)

Do you disagree with my article?  If so, why?

It's a direct application of Tufte's principle.  Do you disagree with that principle?  If so, on what grounds?

Be forthright, please.


by vann on Mon May 26, 2008 at 01:55:03 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

Excellent diary.

Please vote for transparency in data:
http://www.mydd.com/story/2008/5/25/105757/862


We care about politics because we know politics matters for people's lives and opportunities.
by politicsmatters on Sun May 25, 2008 at 11:24:15 AM EST

Re: Using Statistics to Mislead (2.00 / 32)



Lost rate and rec for issuing a '1' to a trollish comment. The troll, not so much.

by map on Sun May 25, 2008 at 11:45:16 AM EST

Major reverse-distortion mojo! n/t (2.00 / 2)


Obama leads the popular vote too
by kellogg on Sun May 25, 2008 at 11:57:16 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 3)

lol.


Obama Citizen Ad Videos
by lovingj on Sun May 25, 2008 at 11:57:58 AM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 3)

They say a picture is worth a thousand words...


by Roberta on Sun May 25, 2008 at 11:58:14 AM EST
[ Parent ]

Fucking destroyed. (2.00 / 1)


Commissar: Canadian Gal; Proletariat Policemen: ragekage, Lord Hadrian. "For the Proletariat!"
by Lord Hadrian on Sun May 25, 2008 at 12:45:32 PM EST
[ Parent ]

Perfect example. (2.00 / 1)

Hillarious.


Check out McCain.
by you like it on Sun May 25, 2008 at 01:19:45 PM EST
[ Parent ]

Re: Using Statistics to Mislead (2.00 / 1)

Obama leads in pledged delegates.  He leads in popular vote (not that it matters).  It seems to me that any of these graphs illustrates the efficiency and effectiveness of the Obama campaign.
     Stand right here.  Hold your hand over your left eye.  Now, squint.  No, tilt your head back.  See it?  See it?  Stand on one leg.  Lean to the right.  Hold this piece of paper over your right eye.  See it?  No?  Try this.....
RULE    a principle or regulation governing conduct, action, procedure, arrangement, etc.: the rules of chess.

to be superior or preeminent in (a specific field or group); dominate by superiority; hold sway over: For centuries, England ruled the seas.

According to the rules, Obama rules.


We have nothing to fear but fear itself. And clowns.
by haremoor on Sun May 25, 2008 at 11:55:36 AM EST

Re: Using Statistics to Mislead (2.00 / 8)



Lost rate and rec for issuing a '1' to a trollish comment. The troll, not so much.

by map on Sun May 25, 2008 at 11:59:58 AM EST

Re: Using Statistics to Mislead (2.00 / 4)

Oh.... my mistake.   That wasn't misleading.



Lost rate and rec for issuing a '1' to a trollish comment. The trol