A paper titled "WP:Clubhouse? An exploration of Wikipedia’s gender imbalance", to be presented next month at WikiSym 2011 by a team from the University of Minnesota, was posted online on August 11. The team of seven researchers became interested in the imbalance after the January 31 New York Times front-page article on Wikipedia's gender gap (see earlier Signpost coverage: January 31, February 7) and sought a more data-driven analysis of the issue, as opposed to the by now traditional "here is a random 'male'-article, here is a 'female'-article, they are different lengths" approach.
Accompanied by a press release and audio/video summaries from the university, the paper has been widely covered by external media sources—see In the news.
The study confined itself to editors who self-disclosed their gender via a userbox on their user pages or through their user preferences. As the paper notes, this may have introduced a bias, and the gender as self-reported by users (and in particular vandal accounts) may not always reflect the truth.
Area | Percentage of women editing |
People | 10.7% |
Arts | 10.4% |
Philosophy | 8.3% |
Religion | 7.1% |
Health | 7.1% |
History | 6.7% |
Science | 5.2% |
Geography | 3.7% |
After the Wikimedia Board of Trustees last week published a letter threatening to withdraw direct funding from those chapters that do not conform to a number of criteria, including expectations on transparency, most discussion on the matter was on the internal-l mailing list, a private list now used for WMF-chapter communications (see also last week's "News and notes"). The news came just weeks after new fundraising agreements had been signed with several chapters, which require them to submit a budget to the WMF to have access to the funds. According to Wikimedian David Gerard, "quite a lot" of chapters complained about aspects of the letter, while none enthusiastically welcomed it. This week the discussion spilled over into the public mailing list, foundation-l, opening it up to the wider Wikimedian community, who responded with a number of viewpoints.
“There is no desire or agenda to take away power and autonomy from chapters. But there is a strong moral duty to note that financial controls, reporting requirements, transparency, and evaluation of effectiveness are always at the top of our agenda.” (Jimmy Wales, writing on foundation-l)
Some were critical of chapters' apparent resistance to the pro-transparency message. "What chapters seem to want is for the WMF to sign over the trademarks they need to do their own fundraising, and then simply hand over a portion of the WMF's own revenue on top of that. ... there's nothing particularly 'normal' or 'fair' about it", wrote Kirill Lokshin, an arbitrator on the English Wikipedia. Nathan agreed that the Foundation's position is understandable, noting that it has responsibilities to donors and that "any misuse of funds by a chapter using Wikimedia marks would reflect back on the Foundation" in any case. "At least criteria are to be put in place now [which is better] than never. For chapters in good order they should not be an issue", wrote FT2.
There was also sympathy for the chapters. "Being on the board of a small nonprofit organization is both incredibly fun and rewarding and also totally not fun and thankless" commented Wikipedia founder Jimmy Wales. Wikimedia Australia president John Vandenberg had numbers to show that chapters are influential in driving fundraising (and hence in supporting the Foundation itself), wrote David Gerard. Wikimedia UK member Chris Keating and French Wikimedian Anthere agreed with the sentiment that chapters are valuable institutions in terms of both fundraising and their ability to provide "local partnerships with institutions they know about". Likewise, Jimmy Wales added that he believes chapters should be "innovative, creative, and independent".
As a result, some of the pro-chapter support spilled over into direct criticism of the WMF Board's methods, if not their aims. For example, Gerard described the letter and its aftermath as representing "a potentially catastrophic failure of volunteer liaison". BirgitteSB went further, suggesting that attempts to centralise control over chapters could suppress their diversity. Among the solutions suggested were "a simple and non-controlling framework of accountability and responsibility" (Jimmy Wales) and a "well-developed grants program" that would prioritise the retention of low overheads (Phoebe Ayers).
At The Amaz!ng Meeting 2011 (an annual US conference on science, skepticism, and atheism), Susan Gerbic gave a talk on "guerrilla skepticism on Wikipedia and how important that is as skeptics for us to get the message out there". She suggested that skeptics should seek to redress a perceived imbalance in the presentation of the skepticism–religion divide on Wikipedia.
Despite assurances from Gerbic that "it's not vandalism, which it kinds of sounds like, because we are totally following the rules", concern has already been expressed that editors may attempt to give otherwise neutral articles a pro-skeptic slant. Although in the past there have been crackdowns on religious POV-pushing (most notably the Scientology arbitration case), Gerbic was clear that what has been left behind is not sufficiently pro-skeptic, describing the "skeptical content" on Wikipedia as "not very good". A YouTube video of Gerbic's talk and an accompanying blog post are available.
In unrelated news, the Foundation for Apologetic Information & Research (FAIR), a non-profit organization that specializes in Mormon apologetics, has said it intends to be more active in Mormonism topics on Wikipedia. Church News, an authorized news site of the LDS Church, carried complaints from a FAIR-sponsored conference that evangelical Christian editors (who have different religious beliefs) have "taken editorial control over several high-profile LDS articles" and that "if you show up on one of those articles, you will very likely, with 99 percent probability, have your edits reverted". The Deseret News, a newspaper owned by the LDS Church, had already touched on the subject earlier this year (Signpost coverage: "Mormon newspaper examines struggles about Mormon topics on Wikipedia").
Building up steam after last week and boosted by the release of a research paper on the topic (see Signpost article), many news outlets have covered the Wikipedian gender-gap issue. Discover Magazine reported with tongue in cheek that "Wikipedia’s a sausage fest, study says", while the Hindustan Times reported that "Wiki is an all male reality", but did not expand greatly on previous coverage. This week, "Men call the shots on Wikipedia, say researchers" in TG Daily had probably the most detailed coverage of any news outlet, including research figures and quotes from several researchers. By contrast, KSTP-TV ran only the bare-bones article "Researchers say Wikipedia has gender bias", focussing solely on research conducted by the University of Minnesota (which put female participation at 16% of editors).
The A.V. Club Chicago, the Chicago-specific branch of the American entertainment website The A.V. Club, has published a humorous analysis of the 226 English Wikipedians who identify as coming from the city. "Chicago's Wikipedians: a look at the people you've probably plagiarized" goes through the user pages of these Wikipedians ("it’s crazy to think that random human volunteers [could build] Wikipedia into the hulking hive of not-citable knowledge it is today") to build up a picture of the average editor from Chicago. It acknowledged that it was relying on users' self-descriptions being accurate, and on registered users being a representative sample of the body of editors that work out of Chicago, since it could not easily determine which anonymous editors it could reasonably include.
Starting off with the discovery that among these 226 editors are "a filmmaker, a cartographer, a financial engineer, a handful of Russians, a schizophrenic, and a gay pastor in the United Church Of Christ", the article continues by confirming many of the biases in editor composition that have hit headlines over the years. Of the editors sampled, 96% of those who stated a gender identified as male, whilst of those who gave a statement of their religious views, Christianity was by far the most common. Despite being a humorous take on editor composition, the article still reserved praise for the editorship. "Thirty percent of those who list their education are still in school... But before people freak out about using a high schooler’s handiwork on college research papers... [these are] exceptionally bright kids, many of whom make very specific contributions to topics they truly seem to get." On a more humorous note, the article concluded that editing Wikipedia is "a learning experience! And what kind of Britannica-spooning encyclo-scrooge could deprive youngsters of that?"
This week we returned to the always energetic WikiProject Oregon. When we first interviewed them in June 2009, the project was proud of their successful collaboration system, a WordPress blog, and their tendency to blur Wikipedia with real life. While some of these efforts have slowed a bit recently and several project members have found themselves living far from their Oregon roots, the project has nonetheless continued to foster discussion and churn out Good Articles. The project is currently home to 22 Featured Articles, 4 Featured Lists, and 88 Good and A-class articles. We interviewed Peteforsyth (Pete), tedder, Aboutmovies, and Jsayre64. Pete even offered to write an introduction for this article, so I'll turn it over to the experts:
Last December, WikiProject Oregon created its 10,000th article, Spruce Production Division, which was displayed on the Main Page as a DYK at the end of December and became a Good Article in February. Share with us the story behind this article. Was its incubation different from most articles created on Wikipedia?
One unique feature about WikiProject Oregon is the amount of collaboration that occurs in the real world. Project members meet for "Wiki Wednesdays" each week and three RecentChangesCamp unconferences have been hosted in Portland. What do you do at these meetups? Is there something unique about Oregon that fosters these kinds of offline collaborations? Would this work in other states?
When we first interviewed WikiProject Oregon, we were amazed by the project's double collaboration of the week system. Is the collaboration still as strong today as it was then? Are there any limitations to the collaboration of the week/fortnight/month concept?
The Oregon Portal is a Featured Portal. How much effort goes into building and maintaining a Featured Portal? What role does the portal serve as a component of WikiProject Oregon?
Does WikiProject Oregon collaborate with any other projects? Have you considered annexing the wayward WikiProject California?
What are the project's most pressing needs? How can a new contributor help today?
Anything else you'd like to add?
Next week's article will be very animated. Until then, draw your own conclusions in the archive.
Five lists were promoted:
One topic was promoted:
No articles were promoted to featured status last week.
The request for arbitration submitted nearly two weeks ago over user conduct issues related to abortion-related articles has now been accepted. Nineteen users are involved parties.
The request was submitted by Steven Zhang after formal mediation failed to produce results. He stated there were some remaining content issues involved in the dispute (the titles of abortion-related articles are a focal point of the issue), but stressed that user conduct is the impediment to progress. MastCell agreed, writing that "the underlying problem isn't what to call these articles ... There's clearly no One True Naming Convention for the pro-choice/pro-life articles. The real problem is the unreasoning intransigence with which the naming dispute has been litigated."
Arbitrator response was tepid at first, with a smattering of comments, opposes, and recusals; but in the end the case was accepted, with five arbitrator supports, one oppose, and three recusals.
Sven Manguard was the first to make it to the evidence page, presenting a plea that the "canvassing" attempts in the earlier steps in the dispute resolution process—where most users reacted poorly to being brought in—be avoided this time.
As of time of writing, seven editors have submitted evidence, related to issues as diverse as page moves, user conduct, and image selection.
Will Beback joked that the lack of submissions to the case "may set a record"; as it stands, four users have submitted evidence in the last week:
The case workshop was much more active, with concerns about scope. This was followed by an unsuccessful attempt to have the case closed. Arbitrator David Fuchs acknowledged "the frustration of parties who aren't exactly sure how this is being cleaved or what's being dealt with; this case has suffered from ... everyone yelling about everything else and being a badly-framed request with lots of people wanting lots of things", but said he didn't "think it's pointless to develop a sort of 'best principles' result without sanctions—given that this is a novel approach anyhow". Newyorkbrad is the drafting arbitrator. Mirroring Fuchs' comments, he agreed that "it may be that this case winds up with a reaffirmation of general principles, and guidelines for dispute resolution when there are allegations those principles have been violated, rather than with findings and sanctions against particular editors. ... But we will see what other evidence comes in, and then as drafter I anticipate being able to do something useful with the case, even though I was not the biggest proponent of splitting the request into two cases precisely as was done."
On 9 August, the deadline for submitting evidence to this case was extended to 15 August. In doing so, drafter Roger Davies suggested that the deadline may have to be pushed back further to allow "time to respond to new evidence submitted". As of time of writing, nine editors have submitted on-wiki evidence.
The question of how easy it is to "fork" Wikimedia wikis, or, indeed, to merely mirror their content on another site, was posed this week on the wikitech-l mailing list by Wikimedian David Gerard. The concept is also related to that of backups, since a Wikipedia fork could provide a useful restore point if Wikimedia server areas were affected by simultaneous technical failure, such as that caused by a potent hacking attempt.
During the discussion, Lead Software Architect Brion Vibber suggested that the Wikimedia software setup could be easily recreated, as could page content. Instead, he said, the major challenge would lie in "being able to move data around between different sites (merging changes, distributing new articles)", potentially allowing users of other sites to feed back improvements to articles whilst also receiving updates from Wikimedia users. So far, at least one site (http://wikipedia.wp.pl/) has been successful in maintaining a live copy of Wikimedia wikis, lagging behind the parent wiki it tries to mirror by only minutes. No site has yet implemented an automated procedure for pushing edits made by its users upstream to its parent wiki, however. Other contributors suggested that few external sites would have the facility to host their own copy of images, or to keep in line with Wikimedia's strict policy on attribution.
In unrelated news, there were also discussions about making pageview statistics more accessible to operators of tools and apps (also wikitech-l). In particular, the current reliance on the external site http://stats.grok.se to collate data was noted. As MZMcBride wrote, "currently, if you want data on, for example, every article on the English Wikipedia, you'd have to make 3.7 million individual HTTP requests to [the site]".
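To illustrate why bulk access matters here, the following is a minimal sketch of tallying page views from a single bulk file rather than issuing one HTTP request per article. The four-field line format (`project title views bytes`), the function name `tally_pageviews`, and the sample data are all assumptions made for illustration; they are not described in the mailing list discussion.

```python
from collections import Counter
from typing import Iterable


def tally_pageviews(lines: Iterable[str], project: str = "en") -> Counter:
    """Sum view counts per page title for one project.

    Assumes each line looks like: "<project> <title> <views> <bytes>".
    Lines that do not match this hypothetical format are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        proj, title, views, _size = parts
        if proj != project or not views.isdigit():
            continue  # skip other projects and non-numeric counts
        counts[title] += int(views)
    return counts


# Invented sample data, standing in for one downloaded bulk file.
sample = [
    "en Main_Page 1200 48000000",
    "en Oregon 35 900000",
    "de Oregon 4 100000",
    "en Oregon 5 120000",
    "malformed line",
]
totals = tally_pageviews(sample)
print(totals.most_common(2))
```

With data in this shape, one pass over a file covers every article at once, which is the kind of access the discussion suggests tool operators currently lack.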
Although hampered by a lack of data points, anecdotal evidence collected over the past fortnight pointed to a slowdown in the speed of uploading files to Wikimedia wikis. The problem made mass API uploading very difficult, and, as a result, a bug was opened. "An upload that should take minutes is taking hours", wrote one commenter. Another pinpointed Wikimedia servers as the bottleneck: during a test, uploads to the Internet Archive had been over ten times quicker. As it became clear that the problem was affecting a large number of users and data collected seemed to show a dramatic decrease in upload speeds earlier this year, significant resources were devoted to the issue. WMF technicians Chad Horohoe, Roan Kattouw, Sam Reed, Rob Lanphier and Asher Feldman have all worked on the problem.
Once the upload chain was determined as "User → Europe caching server → US caching server → Application server (Apache) → Network File System → Wikimedia server MS7", members of the operations team worked to profile where the bottleneck was occurring. Unfortunately, an error introduced by the profiling meant that uploads were in fact blocked for several minutes. Then, on 12/13 August, the problem was pinpointed and fixed: a module for helping optimise network connections, Generic Receive Offload (GRO), had in fact been slowing them down. According to WMF bugmeister Mark Hershberger, smaller data packets were being collated into much larger ones. The new packets were then too large to be handled effectively by other parts of the network infrastructure. Although there are still some reports of slowness, test performance has increased by a factor of at least three. In the future, more data on upload speed is likely to be collected to provide a benchmark against which efficiency can be tested.
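The packet-merging effect described above can be pictured with a toy model. This is only a sketch under stated assumptions: the 1500-byte MTU, the 65536-byte merge cap, the 1448-byte packet size, and the function names `coalesce` and `oversized` are illustrative choices, not details reported by the operations team, and the code does not reproduce how GRO is actually implemented.

```python
from typing import List

MTU_BYTES = 1500       # assumed link MTU; not stated in the report
MAX_COALESCED = 65536  # assumed cap on a merged packet; illustrative only


def coalesce(packet_sizes: List[int], limit: int = MAX_COALESCED) -> List[int]:
    """Greedily merge consecutive packets until the next one would exceed `limit`."""
    merged: List[int] = []
    current = 0
    for size in packet_sizes:
        if current and current + size > limit:
            merged.append(current)  # close the current merged packet
            current = 0
        current += size
    if current:
        merged.append(current)  # flush the final merged packet
    return merged


def oversized(packet_sizes: List[int], mtu: int = MTU_BYTES) -> List[int]:
    """Return the packets that are larger than the assumed MTU."""
    return [size for size in packet_sizes if size > mtu]


incoming = [1448] * 100  # hypothetical stream of small packets
merged = coalesce(incoming)
print(len(incoming), "packets in,", len(merged), "packets after merging")
print(len(oversized(merged)), "merged packets exceed the assumed MTU")
```

In this model, no incoming packet exceeds the assumed MTU, but every merged packet does, which mirrors the reported symptom of merged packets being too large for other parts of the network to handle well.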
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.
This week, the Foundation's Rob Lanphier reiterated that the Foundation is having problems hiring a new Data Analysis engineer and a software developer. Know someone who might be interested? Link them to the details.