Men beat women in self-citations

Does it really imply some foul play?

There are still tons of highly annoying, bogus, feminist propagandist texts about the "subtle discrimination of women in STEM", such as this today's article by a fat Indian American feminist, but I chose another one that was released on the first day of August. The Washington Post has printed a weird story about "women in science":

New study finds that men are often their own favorite experts on any given subject
which builds on a preprint by Molly, Shelley, Jenny, and two co-authors who are nominally male:
Men set their own cites high: Gender and self-citation across fields and over time
Christopher Ingraham's article in the Washington Post starts with a picture of a self-confident man pointing to himself. The main message is that men cite themselves (the same person) more often than women do – some "index" is 1.5-1.7 times higher for men than for women – so it proves that there is something unfair about men's behavior or some discrimination against women or something like that.

Is it true?

Assuming that you're not a complete moron, once you think about this question for at least 5 seconds, you must immediately realize: Men have contributed way over 90% of insights to science (and technology) so they're surely more likely to be cited even without any foul play, right? Molly and pals surely can't be talking about the overall number of self-citations: that ratio would have to be closer to 10-to-1, because the overall citation count of men beats the overall citation count of women by a similarly clear factor and the two ratios shouldn't be too different.

Note that between 1779 and 2011, they find that about 10% of citations are self-citations. There are various natural null hypotheses in which this fraction – 10% – should survive even if you truncate the dataset in certain (but not all) ways.

So clearly, they must have considered some ratios to compare the men and women, right? And indeed, they did. And the ratio removes the basic huge and obvious asymmetry – the fact that scientists are mostly men. But there are still very different ratios that one could consider. A slightly sensible ratio that could be correlated with "foul play" would be the fraction of a researcher's collected citations that are self-citations.

Let me say that even these "smarter" ratios wouldn't prove foul play. Men could be more likely to self-cite because their work could be more focused and original while the women's work could be more derivative and jumping from one place to another. So even if they collected the same number of citations, 1 man and 1 woman could have very different numbers of self-citations even if all the citations were completely meritocratic.

But what Molly, Shelley, Jenny et al. have done is vastly more stupid. It is just spectacularly stupid. They have counted the number of citations per authorship. So their observation is simply that in an average paper with a male (co-)author, there is a greater number of references pointing to older papers written by the same (co-)author than in a paper with a female co-author.
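To make the difference between the two kinds of ratios concrete, here is a minimal sketch. Everything in it is hypothetical: the `Researcher` record, the two labels "A" and "B", and all the numbers are invented for illustration and are not taken from the preprint. It also assumes single-author papers, so that one paper equals one authorship.

```python
from dataclasses import dataclass


@dataclass
class Researcher:
    """Hypothetical per-researcher totals (illustrative numbers only)."""
    name: str
    papers: int            # number of papers (= authorships, single-author assumption)
    self_citations: int    # references in those papers that point to own earlier papers
    total_citations: int   # all citations collected, self-citations included


def self_citations_per_authorship(r: Researcher) -> float:
    """The per-authorship count: self-citations divided by number of papers."""
    return r.self_citations / r.papers


def self_citation_share(r: Researcher) -> float:
    """The alternative ratio: fraction of collected citations that are self-citations."""
    return r.self_citations / r.total_citations


# Two invented researchers with the same 10% self-citation share
# but different output and different total citation counts.
a = Researcher("A", papers=20, self_citations=30, total_citations=300)
b = Researcher("B", papers=10, self_citations=10, total_citations=100)

for r in (a, b):
    print(r.name, self_citations_per_authorship(r), self_citation_share(r))
```

In this toy case both researchers have the same share of self-citations among their collected citations, yet the per-authorship count is 1.5 for A and 1.0 for B. The per-authorship number alone cannot distinguish "cites oneself disproportionately" from "has more earlier, more cited work to refer to".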

Holy crap: The "alternative" explanation to their "discrimination" is that
men are simply more likely to write a sufficiently ground-breaking paper that will have to be cited.
When they do so, they are forced to cite themselves, too. When Einstein writes a followup paper that depends on relativity, which woman should he credit for relativity? Maybe Mileva but you need to believe some conspiracy theories for that conclusion. Moreover, Mileva hasn't written any papers. He must cite himself because he's Einstein, he's the guy who found relativity. And a large majority of the authors of important papers were men. This makes men more likely to be cited – and also more likely to be cited by themselves.

In the previous paragraph, I wrote the word "alternative" inside quotation marks because this adjective is really silly. What I wrote isn't really an alternative explanation of the observation. It's the default, obvious, first explanation that a sensible person must offer. The claim that a higher number of citations proves some foul play is a highly alternative and in this case utterly idiotic conspiracy theory. It's analogous to the policy of arresting Olympic athletes as soon as they win the race and arrive to receive their medal. They must have taken banned substances or something else. Well, couldn't they have won because they are better? Haven't you thought of that?

The Washington Post does partly admit – in a sentence that is hidden near the end of the article and almost invisible – that the gap may have legitimate reasons. But the only legitimate reason they mention is that 1 male researcher usually writes a greater number of papers than 1 female researcher, especially in some part of the career (their preferred example is when women often get pregnant etc.). So with a (somewhat reasonably) fixed fraction of his or her previous papers that are self-cited, it's obvious that the male researcher will create a greater number of self-citations than his female counterpart, too.

If the probability is \(p\) to cite any given (average) previous paper of yours, the \(k\)-th paper will contain approximately \(pk\) self-citations. Note that \(pk\) is proportional to \(p\), which could be the same for men and women, but it's also proportional to \(k\), so obviously people with a greater number of papers will have a higher number of self-citations, too.
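The scaling with \(k\) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are made up, the value \(p = 0.1\) and the career lengths 10 and 16 are arbitrary choices and not figures from the preprint, and the \(k\)-th paper is taken to have exactly \(k-1\) earlier papers available to cite, so the expected count is \(p(k-1)\approx pk\).

```python
def expected_self_citations(p: float, k: int) -> float:
    """Expected self-citations in the k-th paper (1-indexed).

    Assumes each of the k - 1 earlier papers is cited independently
    with the same probability p, giving p * (k - 1), roughly p * k.
    """
    return p * (k - 1)


def mean_self_citations_per_paper(p: float, n_papers: int) -> float:
    """Average expected self-citations per paper over a career of n_papers."""
    total = sum(expected_self_citations(p, k) for k in range(1, n_papers + 1))
    return total / n_papers  # equals p * (n_papers - 1) / 2


# Hypothetical values: same citing probability p, different output.
p = 0.1
shorter = mean_self_citations_per_paper(p, 10)   # about 0.45
longer = mean_self_citations_per_paper(p, 16)    # about 0.75
print(shorter, longer, longer / shorter)          # ratio is 15/9, about 1.67
```

With the same \(p\) for both careers, the per-paper average differs only because the number of papers differs. A gap of this kind says nothing about how eagerly either researcher cites himself or herself.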

So the whole "effect" that Molly, Shelley, Jenny et al. "discovered" could easily be explained by the larger number of papers that one male researcher writes. So what's your justification for this picture of an arrogant self-loving male? But there are several other totally different possible attributions and they make virtually no attempt to find the correct attribution of the "effect" to the possible "causes". Despite this fact, the references to "discrimination" and "what to do about it" are written even in the very abstract of their preprint.

If you think about it, there are really lots of additional "totally legitimate" differences between men and women that may contribute – and probably do contribute – to the ratio of the indices 1.5-1.7 showing an asymmetry between men and women. The gender gap between "papers per researcher" is a really obvious cause. But even if you had a collection of male and female researchers who write the same total number of papers per person; and who collect the same total number of citations, there could still be totally justifiable reasons why the men could have a higher number of self-citations. Men could be more focused and more patient than women, and so on.

You may also imagine that the self-citations are somewhat analogous to "leveraging" in trading. It's a strategy to make many other people work on that theory or problem. But a trader with a higher "leveraging" isn't necessarily a worse or less fair trader. It's just a different strategy.

It's obvious that there are many differences between men and women that are either positively or negatively correlated with their "self-citation count per authorship". A basic observation is that the total effect of all these differences obviously implies that as a group, men are better in research than women – because they self-evidently are. So whenever you find some interpretation of the differences that sounds in the opposite way, it's either due to a mistake of yours or it is a cherry-picked, not so important exception.

Molly, Shelley, Jenny, and their would-be male co-authors have written a piece of atrocious pseudoscience. The only point you may be certain about is their bias and their ideological goal to say something bad about men in science relatively to women in science, to hurt men and (less importantly) help women, to make men feel guilty, to demonize, stigmatize, or criminalize men, at least those who realize that feminism is a pile of feces, and perhaps even the intelligent women who realize that feminism is a pile of feces. The fact that all the logic is absolutely wrong and the argumentation is complete crap doesn't bother them at all. These days, crooks like that always find a fellow scumbag in some Washington Post who will write a positive review of their absolutely atrocious piece of pseudoscientific crap. In a sane world, they would be immediately fired and beg on the street. We don't live in a sane world.

I have argued that the actual main reason for the gap they have found is that men are simply better in science than women, even after the selection, and this key difference isn't really removed from the quantity that they want to interpret morally.

Note that even if you pick just the people above some threshold to be scientists (imagine people with IQ above 125) so that the number of men will already exceed the number of women, the average citation count and success of this selected subgroup of people will still show an advantage for men. Why? Because the distribution for women is already quickly decreasing near the threshold – so most of the women will be just slightly above the threshold – while there will be many men "way above the threshold" who will improve the score even after the selection.

In fact, the "selection of the research community by the fair gender-blind threshold" actually makes the ratio of the average man's citations and the average woman's citations higher than what you would get without any filter at all!

But even if "men are better in science than women" weren't the reason for the observed gap in some ensemble of male and female researchers, and even if one could prove that the gap is due to the men's egotist, self-confident, macho attitude, well, it would still not imply that men are fundamentally worse than they look. Why? Simply because the egotist, self-confident, macho attitude is often absolutely essential for progress in science – and in many other human activities. The history of science offers very many examples.

The obsession of feminist and similar ideologically driven crooks similar to Molly, Shelley, and Jenny – and their nameless male assistants (note that I am improving the visibility of the female co-authors, they should appreciate me for that) – with attempts to deny the obvious, namely that men have been and will remain the drivers of an overwhelming majority of the scientific and technological progress, not to mention other types of progress, is absolutely stunning. If Molly, Shelley, and Jenny dedicated the same efforts to laundry instead of quantitative science – something they have no chance to master – the clothes would be really nice and clean.

