The variability hypothesis, also known as the greater male variability hypothesis, is the hypothesis that males generally display greater variability in traits than females do.
It has often been discussed in relation to human cognitive ability, where some studies appear to show that males are more likely than females to have either very high or very low IQ test scores. In this context, there is controversy over whether such sex-based differences in the variability of intelligence exist, and if so, whether they are caused by genetic differences, environmental conditioning, or a mixture of both.
Sex differences in variability have been observed in many abilities and traits, including physical, psychological and genetic ones, across a wide range of sexually dimorphic species.
The notion of greater male variability, at least with respect to physical characteristics, can be traced back to the writings of Charles Darwin. When he expounded his theory of sexual selection in The Descent of Man and Selection in Relation to Sex, Darwin cited some observations made by his contemporaries. For example, he highlighted findings from the Novara Expedition of 1861–67 where "a vast number of measurements of various parts of the body in different races were made, and the men were found in almost every case to present a greater range of variation than the women" (p. 275). To Darwin, the evidence from the medical community at the time, which suggested a greater prevalence of physical abnormalities among men than women, was also indicative of men's greater physical variability.
Although Darwin was curious about sex differences in variability throughout the animal kingdom, variability in humans was not a chief concern of his research. The first scholar to carry out a detailed empirical investigation of human sex differences in variability in both physical and mental faculties was the sexologist Havelock Ellis. In his 1894 publication Man and Woman: A Study of Human Secondary Sexual Characters, Ellis dedicated an entire chapter to the subject, entitled “The Variational Tendency of Men”. In this chapter he posits that “both the physical and mental characters of men show wider limits of variation than do the physical and mental characters of women” (p. 358). Ellis documents several studies that support this assertion (see pp. 360–367), and
The publication of Ellis's Man and Woman led to an intellectual dispute about the variability hypothesis between Ellis and the renowned statistician Karl Pearson, whose critique of Ellis's work was both theoretical and methodological. After dismissing Ellis's conclusions, Pearson "presented his own data to show that it was the female who was more variable than the male". Ellis wrote a letter to Pearson thanking him for the criticisms, which would allow him to present his arguments "more clearly & precisely than before", but did not yield his position regarding greater male variability.
Support for the greater male variability hypothesis grew during the early part of the 20th century. During this period, the attention of researchers shifted towards studying variability in mental abilities, partly due to the advent of standardised mental tests (see the history of the intelligence quotient), which made it possible to examine intelligence with greater objectivity and precision.
One advocate of greater male variability during this time was the American psychologist Edward Thorndike, one of the leading exponents of mental testing, who played an instrumental role in the development of today's Armed Services Vocational Aptitude Battery (ASVAB). In his 1906 publication Sex in Education, Thorndike argued that while mean-level sex differences in intellectual ability appeared to be negligible, sex differences in variability were clear. Other influential proponents of the hypothesis at this time were the psychologists G. Stanley Hall and James McKeen Cattell. Thorndike believed that variability in intelligence could have a biological basis and suggested that this could have important implications for achievement and pedagogy. For example, he postulated that greater male variation could mean "eminence and leadership of the world's affairs of whatever sort will inevitably belong oftener to men." In addition, since the number of women who fall within the extreme top end of the intelligence distribution would be inherently smaller, he suggested that educational resources should be invested in preparing women for roles and occupations that require only a mediocre level of cognitive ability.
By examining the case records of 1,000 patients at the Clearing House for Mental Defectives, Leta Hollingworth determined that, although men outnumbered women in the clearing house, the ratio of men to women decreased with age. Hollingworth explained this to be the result of men facing greater societal expectations than women. Consequently, deficiencies in men were often detected at an earlier age, while similar deficiencies in women might not be detected because less was expected of them. Therefore, deficiencies in women would be required to be more pronounced than those in men in order to be detected at similar ages.
Hollingworth also attacked the variability hypothesis theoretically, criticizing its underlying logic. She argued that the hypothesis was flawed because: (1) it had not been empirically established that men were more anatomically variable than women, (2) even if greater anatomical variability in men were established, this would not necessarily mean that men were also more variable in mental traits, (3) even if it were established that men were more variable in mental traits, this would not automatically mean that men were innately more variable, (4) variability is not significant in and of itself, but rather depends on what the variability consists of, and (5) any possible differences in variability between men and women must also be understood with reference to the fact that women lack the opportunity to achieve eminence because of their prescribed societal and cultural roles. Hollingworth also criticized the argument that greater variability automatically meant greater range.
In an attempt to examine the validity of the variability hypothesis, while avoiding intervening social and cultural factors, Hollingworth gathered data on birth weight and length of 1,000 male and 1,000 female newborns. This research found virtually no difference in the variability of male and female infants, and it was concluded that if variability "favoured" any sex it was the female sex. Additionally, along with the anthropologist Robert Lowie, Hollingworth published a review of literature from anatomical, physiological, and cross-cultural studies, in which no objective evidence was found to support the idea of innate female inferiority.
The 21st century has witnessed a resurgence of research on gender differences in variability, with most of the emphasis on humans. The results vary based on the type of problem, but some recent studies have found that the variability hypothesis is true for parts of IQ tests, with more men falling at the extremes of the distribution. Publications differ as to the extent and distribution of male variability, including on whether variability can be shown across various cultural and social factors.
A 2007 meta-analysis found that males are more variable on most measures of quantitative and visuospatial ability, but drew no conclusions about the cause.
A 2008 analysis of test scores across 41 countries published in Science concluded that "data shows a higher variance in boys' than girls' results on mathematics and reading tests in most OECD countries", the results implying that "gender differences in the variance of test scores are an international phenomenon". However, it also found that several countries failed to exhibit a gender difference in variance.
A 2008 study reviewed the history of the hypothesis that general intelligence is more biologically variable in males than in females and presented data which the authors claim "in many ways are the most complete that have ever been compiled [and which] substantially support the hypothesis".
A 2009 study in developmental psychology examined non-cognitive traits including blood parameters and birth weight as well as certain cognitive traits, and concluded that “greater intrasex phenotype variability in males than in females is a fundamental aspect of the gender differences in humans”.
Recent studies indicate that greater male variability in mathematics persists in the U.S., although the ratio of boys to girls at the top end of the distribution is reversed in Asian Americans. A 2010 meta-analysis of 242 studies found that males have an 8% greater variance in mathematical abilities than females, which the authors indicate is not meaningfully different from an equal variance. Additionally, they find several datasets indicate no or a reversed variance ratio.
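To illustrate how a variance ratio relates to representation in the tails of a distribution, the following minimal sketch may help. It is not taken from any of the cited studies. It assumes that both groups are normally distributed with identical means, which real test-score data need not satisfy, and the function name `tail_ratio` and its parameters are hypothetical.

```python
from statistics import NormalDist


def tail_ratio(variance_ratio: float, cutoff_sd: float) -> float:
    """Share of the wider group above a cutoff, divided by the reference group's share.

    Assumptions (illustrative only): both groups are normal with mean 0,
    the reference group has SD 1, and the cutoff is given in reference SDs.
    """
    reference = NormalDist(mu=0.0, sigma=1.0)
    # The wider group's SD is the square root of the variance ratio.
    wider = NormalDist(mu=0.0, sigma=variance_ratio ** 0.5)
    share_reference = 1.0 - reference.cdf(cutoff_sd)
    share_wider = 1.0 - wider.cdf(cutoff_sd)
    return share_wider / share_reference


if __name__ == "__main__":
    # A variance ratio of 1.08 corresponds to the "8% greater variance" figure.
    for cutoff in (1.0, 2.0, 3.0):
        print(cutoff, round(tail_ratio(1.08, cutoff), 3))
```

Under these assumptions the ratio grows as the cutoff moves further from the mean, so a small variance difference has its largest relative effect at the extremes. Whether such a difference is practically meaningful is a separate question that the sketch does not address.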
A 2014 review found that males tend to have higher variance on mathematical and verbal abilities but females tend to have higher variance on fear and emotionality; however, the differences in variance are small and without much practical significance, and the causes remain unknown. A 2005 meta-analysis found greater female variability on the standard Raven's Progressive Matrices, and no difference in variability on the advanced progressive matrices, but also found that males had a higher average general intelligence. This meta-analysis, however, was criticized for bias on the part of its authors and for poor methodology.
A 2016 study by Baye and Monseur examined twelve databases from the International Association for the Evaluation of Educational Achievement and the Program for International Student Assessment to analyse gender differences from an international perspective between 1995 and 2015, and concluded, "The 'greater male variability hypothesis' is confirmed." This study found that, on average, boys showed 14% greater variance than girls in science, reading, and math test scores. In reading, boys were significantly represented at the bottom of the score distribution, whereas in maths and science they featured more at the top.
The results of Baye and Monseur have been both replicated and criticized in a 2019 meta-analytical extension published by Helen Gray and her associates, which broadly confirmed that variability is greater for males internationally but found significant heterogeneity between countries. They also found that policies leading to greater female participation in the workforce tended to increase female variability and, therefore, decrease the variability gap. They further point out that Baye and Monseur had themselves observed a lack of international consistency, lending more support to a cultural hypothesis.
A 2018 meta-analysis of over 1 million individuals failed to find consistent evidence for greater male variability, concluding that "Simulations of these differences suggest the top 10% of a class contains equal numbers of girls and boys in STEM, but more girls in non-STEM subjects." However, the datasets from universities were discarded from the analysis because they were considered possibly biased and insufficient; only data from schools were used.
In October 2020, with respect to brain morphometry, researchers reported "the largest-ever mega-analysis of sex differences in variability of brain structure"; they stated that they "observed significant patterns of greater male than female between-subject variance for all subcortical volumetric measures, all cortical surface area measures, and 60% of cortical thickness measures. This pattern was stable across the lifespan for 50% of the subcortical structures, 70% of the regional area measures, and nearly all regions for thickness." The authors emphasize, however, that this as yet has no practical interpretive meaning, says nothing about causation, and requires further examination and replication.
In 2021, two meta-analyses on preference measurement in experimental economics found strong evidence for greater male variability in cooperation (variance ratio: 1.30, 95% CI [1.22, 1.38]), time preferences (1.15 [1.08, 1.22]), risk preferences (1.25 [1.13, 1.37]), dictator game offers (1.18 [1.12, 1.25]) and transfers in the trust game (1.28 [1.18, 1.39]).
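For readers unfamiliar with the statistic, a variance ratio with a confidence interval could be computed from two samples roughly as sketched below. This is an illustrative normal-theory approximation on the log scale, not the procedure used by the cited meta-analyses, which pool estimates across many studies. The function name `variance_ratio_ci` and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist, variance


def variance_ratio_ci(group_a, group_b, level=0.95):
    """Return (ratio, lower, upper) for var(group_a) / var(group_b).

    Assumption: both samples are roughly normal, so the log of each sample
    variance has an approximate sampling variance of 2 / (n - 1).
    """
    n_a, n_b = len(group_a), len(group_b)
    ratio = variance(group_a) / variance(group_b)
    # Standard error of the log variance ratio under the normal approximation.
    se_log = math.sqrt(2.0 / (n_a - 1) + 2.0 / (n_b - 1))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    # Simulated (not real) data: same mean, slightly different spread.
    sample_a = [rng.gauss(0.0, 1.1) for _ in range(2000)]
    sample_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    print(variance_ratio_ci(sample_a, sample_b))
```

An interval that excludes 1 indicates a variance difference larger than sampling noise alone would be expected to produce under this approximation. The approximation can be poor for small or strongly non-normal samples.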
A 2022 analysis of a large database on energy expenditure in adult humans found that "even when statistically comparing males and females of the same age, height, and body composition, there is much more variation in total, activity, and basal energy expenditure among males".
The variability hypothesis has continued to spur controversy within academic circles.
In a 1992 paper titled "Variability: A Pernicious Hypothesis," Stanford Professor Nel Noddings discussed the social history which she argued explains "the revulsion with which many feminists react to the variability hypothesis."
One of the most prominent incidents occurred in 2005 when then-Harvard President Larry Summers addressed the National Bureau of Economic Research Conference on the subject of gender diversity in the science and engineering professions, saying: "It does appear that on many, many different human attributes –– height, weight, propensity for criminality, overall IQ, mathematical ability, scientific ability –– there is relatively clear evidence that whatever the difference in means –– which can be debated –– there is a difference in the standard deviation, and variability of a male and a female population." His remarks caused a backlash; Summers faced a no-confidence vote from the Harvard faculty, prompting his resignation as President.
In a similar incident in 2017, Google software engineer James Damore was fired immediately after posting an internal memo on diversity (see Google's Ideological Echo Chamber) suggesting possible innate biological factors including greater male variability to help explain the underrepresentation of women in hi-tech jobs.
That same year, a mathematics research paper presenting a possible evolutionary explanation for the variability hypothesis was peer-reviewed, accepted, and formally published in The New York Journal of Mathematics. Three days later that article was suddenly removed without explanation or discussion, and replaced by an unrelated article by different authors. This caused widespread debate in the scientific community and international publicity.