Mapping content gaps on Wikimedia

Collating information to help bridge content gaps




Wikipedia and other Wikimedia projects have large gaps in information. It is estimated that Wikipedia could cover over 100 million topics; English Wikipedia, the largest Wikipedia, currently has 6,799,014 articles. There are very large gaps in many topic areas: for example, less than 20% of biographies on English Wikipedia are about women and non-binary people.

But how do we map what subjects are missing from Wikimedia? One way is to collate existing sources, such as databases and reference works, and to document expert knowledge.




Section 1: Context




The value for Wikimedia

Some parts of Wikimedia are already doing this work. Women in Red is a project that writes about women and non-binary people on Wikipedia, and uses lists compiled by organisations and by crowdsourcing to map who is missing from Wikipedia.


Collecting information from a wide range of sources and groups is extremely important in order to include as many topics and viewpoints as possible; for more information see Standpoint Theory. There is a wide range of people with expert knowledge, including academics, professionals, people with (tacit) lived experience, activists and policy makers. Sharing an overview of a topic area is an easy and quick way for people with knowledge to start contributing to Wikimedia projects. Note: not everyone who says they have in-depth knowledge of a topic does, e.g. people who believe in conspiracy theories. Wikipedia:Reliable sources can provide some guidance.


Traditionally Wikipedia has found it hard to engage experts in sharing information on Wikimedia projects; retention rates for in-person training workshops are below 1%. Working with experts to compile topics provides an opportunity to separate knowledge sharing from editing skills, which are very often an unrealistic barrier to entry. It allows us to work with experts to quickly collect information on many topics in a way that can be used by Wikimedia projects in 300 languages.




The value for others

Experts

People are often very happy to share information with Wikipedia; they recognise it as a valuable source of knowledge and understanding of their area of work.

People using Wikimedia as a resource

Compiling data can also help others who use Wikimedia in their work. For example, only one fifth of experts interviewed in the media worldwide are women. Could this change if journalists had easy-to-access lists of all experts in specific areas? In turn, this would make more reference sources available on women experts, allowing us to write about them on Wikipedia.




Examples

Working with experts

Several projects have been run to collate information from different groups with knowledge of a topic area. Each takes a different approach and collects information in a different way, based on the audience.

COVID related topics

Wikiproject COVID 19 Main messages

Wikimedians worked with UN agencies to compile important messages, missing topics, and reference sources related to the COVID pandemic. This brought in people with extensive subject matter expertise on a very fast-moving topic that is very popular on Wikipedia.

Sexuality topics

Wikidata: Switched On: Working with experts to collate a worldwide database of sexuality topics

Wikimedians worked with experts on sexuality and sex education at a UN conference to map which sexuality topics and reference materials are missing from Wikimedia projects, creating a resource useful to both Wikimedia and for the experts to use and contribute to.

FindingGLAMs

Meta:FindingGLAMs

Wikimedia Sverige (Sweden), UNESCO and the Wikimedia Foundation are working to build a truly worldwide database of cultural heritage institutions and their collections on Wikipedia. They have worked with government delegations to UNESCO remotely to collate datasets on cultural heritage institutions.

Icelandic women

Wikidata:Icelandic Women

Wikimedians collaborated with Icelandic institutions to compile on Wikidata the most exhaustive list of Icelandic women available online, adding over 1,900 new women to Wikidata. This work automatically feeds into the work of Women in Red, which uses Wikidata.

Working with a community

Mapping the open movement

Github:Mapping the open movement

Wikimedians ran a social media campaign to map organisations working in the open movement on Wikidata.



Section 2: Information




  1. Scope: approaches to mapping a topic
  2. Types of information: kinds of information to collate
  3. Identifying sources: which sources or people may hold the information




Scope

There are several ways to approach mapping a topic area and many valuable kinds of information to collect. Wikimedia projects have different notability rules, so collecting all the information available makes it useful to every Wikimedia project.




Types of information

There are several kinds of information that can be mapped to help improve knowledge available on Wikimedia projects including:




Information sources

There are many ways to collect this information, and it can be done both online and in person:




Section 3: Preparation




This section explains the process of collecting the information:

  1. Communication: creating messages when asking others to help compile the information.
  2. Ways to collect information: tools used to collect information
  3. Campaign / Workshop: Working with others to compile the information




Communication

When asking people to take part in sharing their knowledge it is important to be clear and make contribution easy.

Clarity

Ease of use




Ways to collect information

There are several options for collecting information:

Collaborative document

Online collaborative documents, like Google Docs and Google Sheets, are easier for people to use than a wiki page. Many projects have used Google Docs and Google Sheets to collate information on a subject.

Survey

Another option is creating a form, like Google Forms, which people fill in individually and which feeds into a spreadsheet. This is good for getting structured feedback, but bad at avoiding repeated information and at creating a sense of community and shared work.
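Because people filling in a form individually cannot see each other's answers, the resulting spreadsheet often contains the same entry several times with small differences in spelling. The following is a minimal sketch of one way to merge such repeats, assuming the responses have already been exported as a list of text strings. The function names and the sample responses are hypothetical, and the matching is deliberately simple: it only ignores differences in capitalisation and spacing.

```python
import unicodedata


def normalise(name: str) -> str:
    """Build a comparison key: Unicode-normalised, case-folded, single-spaced."""
    text = unicodedata.normalize("NFKC", name)
    return " ".join(text.casefold().split())


def merge_duplicates(responses: list[str]) -> list[str]:
    """Keep the first spelling of each entry and drop later repeats and blanks."""
    seen: set[str] = set()
    unique: list[str] = []
    for response in responses:
        key = normalise(response)
        if key and key not in seen:
            seen.add(key)
            unique.append(response.strip())
    return unique


# Hypothetical form responses, for illustration only.
responses = ["Women in Red", "women in red ", "FindingGLAMs", "", "Women  in Red"]
unique_responses = merge_duplicates(responses)
print(unique_responses)
```

Entries that differ by more than case or spacing (for example a misspelled name) would still need to be checked by a person.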

Wiki page

Wiki pages are significantly more difficult to use and take more time to learn than the other options. They are only realistic for existing Wikimedia community members or others who are familiar with using a wiki.




Section 4: Collecting information




This section describes the process of collecting information:

  1. Option 1: Online collaboration: collating information online
  2. Option 2: Workshop: how workshops can be run with people with knowledge to collate information.




Option 1: Online collaboration

Coming soon

Examples of online or remote collaboration include FindingGLAMs and Mapping the Open Movement.

An offline version of this work can also be run as a pinboard or other interactive display at a conference or other physical event. See Wikidata:Switched On as an example.




Option 2: Workshop

A workshop can be run as a conference session or as a standalone event. It can also be mixed with other activities, like sharing knowledge on a pinboard using post-it notes. An example of this activity is the Switched On Conference session. Here is a suggested structure for a workshop:

  1. An introduction to Wikipedia, describing what Wikipedia is, its audience and how it fits into a larger context, e.g. the Sustainable Development Goals. It is often helpful to use Wikipedia as a catch-all term to save time explaining Wikimedia vs Wikipedia.
  2. Examples of projects that can be done with Wikipedia, including sharing text, images and data, and supporting Wikipedia by hosting events and promoting initiatives. Also describe the value of collating data, e.g. the FindingGLAMs project, which created a worldwide database of cultural heritage institutions.
  3. A workshop to collate the experts' knowledge, asking people to share what they know about: people; projects and organisations; books, publications and databases; topics; videos; main messages; etc. Ask people to write down answers for each category on post-it notes for several minutes, then have everyone come together to compile the information on a large sheet of paper, where people can see the information growing as it is added.
  4. The messages can be displayed on a table to allow everyone at the conference to contribute their knowledge.




Section 5: Processing the information




This section outlines the steps to process and understand the information collected.

  1. Processing the information: transforming the information into a usable format
  2. Analysis: understanding patterns, mistakes and themes in the information collected.




Processing the information

Sort the information into categories, or organise it in another way that helps others understand it, and import it into a spreadsheet or list.
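As an illustration of this step, here is a minimal sketch that groups transcribed notes by category and writes them out as CSV text, which can be opened in any spreadsheet program. It assumes each note has already been typed up as a (category, entry) pair. The category names follow the ones used in the workshop description, while the entries, the column names and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical transcribed notes as (category, entry) pairs.
notes = [
    ("Projects and organisations", "Example Institute"),
    ("Topics", "Example topic"),
    ("Projects and organisations", "Another Example Organisation"),
]


def group_by_category(items):
    """Collect entries under their category, trimming stray whitespace."""
    grouped = defaultdict(list)
    for category, entry in items:
        grouped[category.strip()].append(entry.strip())
    return dict(grouped)


def to_csv(grouped):
    """Return CSV text with one row per entry, sorted by category then entry."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "entry"])
    for category in sorted(grouped):
        for entry in sorted(grouped[category]):
            writer.writerow([category, entry])
    return buffer.getvalue()


grouped = group_by_category(notes)
csv_text = to_csv(grouped)
print(csv_text)
```

Sorting the rows is a design choice to make the spreadsheet easier to scan; the original order could be kept instead if it carries meaning.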




Analysis

It can be very helpful to analyse the information you’ve collected before sharing it more widely, to find out whether any issues have occurred, such as obviously missing or incomplete information. When sharing the information, include your notes on it to help others understand what to consider when using it.
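A simple first pass at this kind of check can be automated. The sketch below assumes the collected information is held as a list of rows with "category", "entry" and "source" fields; these field names, the sample rows and the function names are all hypothetical. It reports which rows have empty fields and how many entries each category received, which can hint at areas that were under-represented.

```python
from collections import Counter

# Hypothetical collected rows, for illustration only.
rows = [
    {"category": "Topics", "entry": "Example topic", "source": "Workshop"},
    {"category": "Topics", "entry": "", "source": "Workshop"},
    {"category": "", "entry": "Example organisation", "source": "Pinboard"},
    {"category": "Videos", "entry": "Example video", "source": ""},
]

REQUIRED_FIELDS = ("category", "entry", "source")


def find_incomplete(rows):
    """Return (row index, list of missing field names) for each incomplete row."""
    problems = []
    for index, row in enumerate(rows):
        missing = [f for f in REQUIRED_FIELDS if not row.get(f, "").strip()]
        if missing:
            problems.append((index, missing))
    return problems


def count_by_category(rows):
    """Count rows per category; rows without a category are grouped together."""
    return Counter(row["category"].strip() or "(uncategorised)" for row in rows)


incomplete = find_incomplete(rows)
counts = count_by_category(rows)
print(incomplete)
print(counts)
```

The output of a check like this can be included in the notes shared alongside the information.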


Example: Switched On workshop




Section 6: Sharing the information




  1. Making the information available: to the Wikimedia community and the people who share information.
  2. Helping people find the information: people can only use the information if they are aware of it.
  3. Sharing your process: so other people can learn from it.




Making the information available

Once the information has been collected, it can be imported into Wikimedia projects in many different ways; look at existing projects for ideas on how to use the data collected. It is also important to provide people with the information in a format that they can use, e.g. don’t expect people to be able to use the Wikidata query service, or to have the time or interest to learn it.




Helping people find the information

People can only use the information if they are aware of it and know where to find it, so tell the people who can use it. Places where you can tell people that the information is available include:




Sharing your process

Sharing your process allows people to understand how you collected the information, to see what can be done to build on your work, and to learn how to run their own projects.

Process

This includes information about who you worked with, the steps you took, how and why you did them that way, and any resources you created for the work, so others can use them.

Example: The Switched On workshop

The workshop was a conference session to educate people about Wikipedia's role in sexuality education and to crowdsource people's knowledge on the topic. The presentation used in the workshop is available here. The workshop had four stages:

  1. An introduction to Wikipedia, describing what Wikipedia is, its audience and how it fits into the Sustainable Development Goals. Wikipedia was used as a catch all term to save time explaining Wikimedia vs Wikipedia.
  2. Examples of projects that UN agencies can do with Wikipedia, including sharing text, images and data, and supporting Wikipedia by hosting events and promoting initiatives. The value of collating data was also described, using the FindingGLAMs project to create a worldwide database of cultural heritage institutions as an example.
  3. A workshop to collate the experts' knowledge, asking people to share what they know about: people; projects and organisations; books, publications and databases; topics; videos; main messages. People were asked to write down answers for each category on post-it notes for several minutes, and then everyone came to the front to compile the information on a large sheet of paper. People could see the information growing as it was added. Note: a less developed version of this workshop was run at a UNFPA event in Nairobi, Kenya; the results have been combined.
  4. The messages were displayed on a table to allow everyone at the conference to contribute their knowledge. Around 20% of the information was added outside of the workshop.

Suggest improvements to the process

What did you learn from the work you did? What improvements could be made in the process, either generally or for working with a specific audience?

Example: The Switched On workshop

The most important improvements to make for the next version of the workshop are to explain the value of crowdsourcing the knowledge in the room earlier in the presentation, and to ask people to write as clearly as possible.

Suggest follow on work

What could be done to build upon the work you’ve done to map this topic area? What has mapping this topic area highlighted as missing? What else could be done in this area?

Example: The Switched On workshop

400+ ideas and suggestions were created in half an hour by 20 people; this volume of information demonstrates the breadth of content available in this area. The majority of the people attending the conference were from Europe and North America, meaning that local knowledge from other areas of the world was not included.