Distributionalism was a general theory of language and a discovery procedure for establishing the elements and structures of language based on observed usage. It can be seen as an elaboration of structuralism that takes a more computational approach. Originally applied mostly to understanding phonological processes and phonotactics, distributional methods were later extended to lexical semantics and provide the basis for the distributional hypothesis of meaning. Current machine-learning approaches that learn the semantics of words from text in the form of word embeddings are grounded in distributional theory.
Distributionalism can be said to have originated in the work of the structuralist linguist Leonard Bloomfield and was more clearly formalised by Zellig S. Harris.[1][2] The theory emerged in the United States in the 1950s as a variant of structuralism, the mainstream linguistic theory at the time, and dominated American linguistics for some time.[3] The use of "distribution" as a technical term for a component of a discovery procedure was likely first made by Morris Swadesh in 1934[4] and then applied to the principles of phonematics, to establish which observed sounds of a language constitute allophones of a single phoneme and which should be kept as separate phonemes.[5] According to Turenne and Pomerol, distributionalism was in fact a second phase in the history of linguistics, following that of structuralism, as it was dominant mainly from 1935 to 1960.[6] It is considered one of the scientific foundations of Noam Chomsky's generative grammar and had considerable influence on language teaching.
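The allophone test mentioned above can be illustrated with a small sketch (toy data and hypothetical function names, not Harris's actual procedure): two sounds whose sets of observed environments never overlap are in complementary distribution, and are therefore candidates for allophones of a single phoneme.

```python
# Toy illustration of the distributional test for allophones (hypothetical data).
# Two sounds are in complementary distribution if their sets of observed
# environments (here, the preceding and following segment) never overlap.

def environments(corpus, sound):
    """Collect the (previous, next) contexts in which `sound` occurs."""
    envs = set()
    for word in corpus:
        for i, seg in enumerate(word):
            if seg == sound:
                prev = word[i - 1] if i > 0 else "#"          # '#' marks a word boundary
                nxt = word[i + 1] if i < len(word) - 1 else "#"
                envs.add((prev, nxt))
    return envs

def complementary(corpus, a, b):
    """True if sounds a and b never share an environment."""
    return environments(corpus, a).isdisjoint(environments(corpus, b))

# Toy data modelled on English: aspirated [pʰ] word-initially, plain [p] after [s].
corpus = [["pʰ", "ɪ", "n"], ["s", "p", "ɪ", "n"], ["pʰ", "æ", "t"]]
print(complementary(corpus, "pʰ", "p"))  # True: candidates for one phoneme /p/
```

In a real discovery procedure the environments would be drawn from a large transcribed corpus, but the logic of the test is the same: overlap in distribution argues for separate phonemes, disjoint distribution for allophony.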
Distributionalism has much in common with structuralism. However, it appeared in the United States at a time when the theses of Ferdinand de Saussure were only just beginning to be known in Europe, so distributionalism must be considered an original theory relative to Saussureanism.
The behaviourist psychological theories that allowed the birth of distributionalism are reminiscent of Pavlov's work on animals. According to these theories, human behaviour is entirely explainable, and its mechanics can be studied; the study of reflexes, for example, was expected to make it possible to predict certain behaviours. Leonard Bloomfield argued that language, like behaviour, could be analysed as a predictable mechanism, explicable by the external conditions of its occurrence.
"Mechanism", "inductive method" and "corpus" are key terms of distributionalism.
Bloomfield called his thesis "mechanism" and opposed it to mentalism: for him, speech cannot be explained as an effect of thoughts (intentions, beliefs, feelings). One must therefore be able to account for linguistic behaviour and the hierarchical structure of the messages conveyed without any assumptions about speakers' intentions and mental states.[7]
From the behaviourist perspective, a given stimulus corresponds to a given response. For distributionalists, however, meaning is unstable, dependent on the situation, and not observable; it must therefore be eliminated as an element of language analysis. The only regularity is morphosyntactic: the structural invariants of morphosyntax allow the language system to be reconstructed from an analysis of its observable elements, the words of a given corpus.
The main idea of distributionalism is that linguistic units "are what they do",[8] meaning that the identity of a linguistic unit is defined by its distribution. Zellig Harris considered meaning too intuitive to be a reliable ground for linguistic research: language use has to be observed directly, examining all the environments in which a unit can occur. Harris advocated a distributional approach, since "difference of meaning correlates with difference of distribution".[9]
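Harris's correlation between distribution and meaning can be sketched computationally (a toy illustration with an invented four-sentence corpus, not Harris's own method or a real embedding model): each word is represented by counts of the words occurring near it, and similarity of distribution is then measured with cosine similarity.

```python
import math
from collections import Counter

# Toy sketch of the distributional hypothesis: a word's meaning is
# approximated by the contexts in which it occurs (illustrative corpus).

corpus = [
    "the cat drinks milk", "the dog drinks water",
    "the cat chases mice", "the dog chases cats",
]

def context_vector(word, sentences, window=2):
    """Count the words co-occurring with `word` within `window` positions."""
    counts = Counter()
    for sent in sentences:
        tokens = sent.split()
        for i, t in enumerate(tokens):
            if t == word:
                for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                    if j != i:
                        counts[tokens[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)  # Counter returns 0 for missing keys
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

cat, dog, milk = (context_vector(w, corpus) for w in ("cat", "dog", "milk"))
print(cosine(cat, dog) > cosine(cat, milk))  # True: cat and dog share more contexts
```

Modern word embeddings replace these raw co-occurrence counts with dense vectors learned by machine learning, but the underlying assumption is the same distributional one: words with similar distributions receive similar representations.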