I've found a flaw in my analysis.
It turns out that I ignored most posts with zero comments. Including those posts significantly changes the picture. I'm not sure that my theory is wrong, but I need to do some more analysis in the database. There's a really good chance I was thinking too fast.
That's a good story, but how can we test it? A split test where group A sees comments as they are now and for group B all comments are hidden would probably do the trick. If people in the test group (B) vote (up or down) more often than people in the control group (A), we can be pretty certain that displaying comments is a drag on our reputation-based economy.
Short of doing an experiment, we can check the database:

The x-axis is the number of comments and the y-axis is score. I've limited the number of comments to 5 since that's the maximum we display, and I'm only looking at positively scored posts. (For the curious, I've selected posts with ViewCount > 0 because during one test I tried dividing score by views and got a divide-by-zero error. I haven't investigated that yet.)
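For those who want to follow along, a SEDE query roughly like this one would produce the breakdown. Consider it a sketch rather than the exact query I ran; the filters are reconstructed from the description above:

```sql
-- Average score by comment count for positively scored, viewed posts.
-- Sketch: a reconstruction of the analysis, not the original query.
SELECT CommentCount,
       AVG(CAST(Score AS FLOAT)) AS AvgScore,
       COUNT(*)                  AS NumPosts
FROM Posts
WHERE Score > 0          -- only positively scored posts
  AND ViewCount > 0      -- avoids the divide-by-zero mentioned above
  AND CommentCount <= 5  -- 5 is the maximum number of comments we display
GROUP BY CommentCount
ORDER BY CommentCount;
```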
One way to interpret the graph is that showing even one comment costs the author of a good post an average of 1.75 in score, and the second comment reduces voting by a tiny bit more. The third and subsequent comments seem to give some of that back, but likely that's because both commenting and voting are functions of views. When you expand the graph to look at longer comment threads, I think the trend is clear:

Beyond about 15 comments, the data gets messy since there are so few posts that attract that much attention. And those posts tend to be the very best on the site and naturally get upvoted often as well. Since views include both active users and people who find the post via Google (and can't vote), factoring in ViewCount does not clear up the picture.
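For the record, "factoring in ViewCount" means something like the query below. Again a sketch, with threads capped at 15 comments since that's where the data gets messy:

```sql
-- Votes per view by comment count, capping long threads at 15 comments.
-- Sketch only: this normalization didn't clarify the trend.
SELECT CASE WHEN CommentCount > 15 THEN 15 ELSE CommentCount END AS Comments,
       AVG(CAST(Score AS FLOAT) / ViewCount) AS ScorePerView,
       COUNT(*)                              AS NumPosts
FROM Posts
WHERE Score > 0
  AND ViewCount > 0  -- required: ViewCount is the divisor
GROUP BY CASE WHEN CommentCount > 15 THEN 15 ELSE CommentCount END
ORDER BY Comments;
```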
Now what?
I can think of a few alternate explanations (good posts get fewer comments, or comments are more likely to be purged on good posts). A split test should be able to disprove the theory that comments cost votes. If we see evidence of causation and not simply correlation, we should consider displaying comments harmful because they steal votes. Any algorithm for displaying comments should err on the side of hiding them. If the test shows no behaviour difference, then we can't safely change our system until we better understand the dynamic between comments and voting.
Appendix
To address the comments:
If your theory is correct, then long answers should also cost votes because by the time people get to the end they're far away from the voting buttons. Does the data say anything about that? – Monica Cellio
I assumed that one line of a post is about 50 characters. That may or may not be accurate since some lines are much shorter (specifically code and embedded images). Mostly I had to bin the data and that's as good a way as any. Here's how body length correlates to score (assuming a positive score):

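A query roughly like this would reproduce the binning. It's a sketch: I'm assuming answers only, and note that Body includes HTML markup, so the character counts (and therefore the ~50-character lines) are approximate:

```sql
-- Average score by approximate line count, assuming ~50 characters per line.
-- Sketch: Body includes HTML markup, so these bins are only approximate.
SELECT LEN(Body) / 50             AS ApproxLines,
       AVG(CAST(Score AS FLOAT))  AS AvgScore,
       COUNT(*)                   AS NumAnswers
FROM Posts
WHERE PostTypeId = 2  -- answers only (an assumption on my part)
  AND Score > 0       -- assuming a positive score, as above
GROUP BY LEN(Body) / 50
ORDER BY ApproxLines;
```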
As you can see, when a one-liner answer is any good, it's much more likely to be upvoted than a longer answer. Not all sites show this pattern. On English.SE, short answers are much less valued on the whole than longer ones:

I happen to think length is correlated with quality in general. Great one-liners are common on Stack Overflow, but are much harder to pull off on sites such as ELU. However, I'm not sure we are looking at the same effect. The real problem with comments, in terms of post voting, is that reading comments causes a mental context switch. The drop-off in score between 0 and 1 comments seems to hold on every site I've checked.
Does it matter if comments cost people reputation? I'm not sure I care, though lower reputation users might more. – ben is uǝq backwards
Yes. I see voting as the currency of the Stack Exchange economy. New users need votes to be able to do things on the site. Equally concerning is that votes are the primary measure of post quality. If fewer people vote on answers, it's harder for readers to separate good answers from poor ones. We spend a considerable amount of effort making sure people vote fairly, so if we have a systematic bias in our system, it would be good to know about it.
I'm not sure that I agree with your analysis of the relationship between comments and votes. A question that needs clarification often gets a comment. All custom close messages are comments. Trying to point out where something is wrong in an answer (often in the critical early visibility) is a comment. All of these are issues that cost votes and comments are the symptoms of the problem (low vote scores are also a symptom), not the causes. – MichaelT
There are several comments that make more or less the same point. Determining cause and effect is often quite difficult when querying a database. The purpose of looking at past data is to see if there's any indication that a split test is worthwhile. I'm not at all sure what the results of such a test might be.
What is worth noting on this point is that if we were certain that comments pointed to legitimate problems with a post, then I'd agree that comments are more likely to be symptoms of the problem than causes. In the course of examining comment quality, I found that comments can be divided into three categories:
- Valuable meta-information about a post.
- Suggested edits, answers, or new questions.
- Comments that could be flagged "not constructive", "obsolete", or "too chatty".
Quite a few comments in the third category ought to correlate with increased score ("+1", "Thanks for the answer.", etc.). Without measuring user interaction directly, we can't really know what the effect of hiding comments will be.
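To get a rough feel for how common those comments are, one could pattern-match the comment text. The patterns below are illustrative stand-ins, not a real classifier:

```sql
-- Rough count of likely "too chatty" comments by text pattern.
-- Sketch: the LIKE patterns are illustrative, not the real categorization.
SELECT COUNT(*) AS LikelyChatty
FROM Comments
WHERE Text LIKE '+1%'
   OR Text LIKE 'Thanks%'
   OR Text LIKE 'Thank you%';
```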