The Bank of English is a representative subset of the 4.5-billion-word COBUILD corpus, a collection of English texts. These are mainly British in origin, but content from North America, Australia, New Zealand, South Africa and other Commonwealth countries is also included.
Most of the texts are written English, collected from websites, newspapers, magazines and books, but there is also a large spoken component drawn from radio, TV and informal conversations. The Bank of English totals 650 million running words. Copies of the corpus are held at both HarperCollins Publishers and the University of Birmingham. The version at Birmingham can be accessed for academic research.
The Bank of English forms part of the Collins Word Web, together with the French, German and Spanish corpora.