Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores (from a few tens of cores to thousands or more). Manycore processors are used extensively in embedded computers and high-performance computing.
Manycore processors are distinct from multi-core processors in being optimized from the outset for a higher degree of explicit parallelism, and for higher throughput (or lower power consumption) at the expense of latency and lower single-thread performance.
The broader category of multi-core processors, by contrast, are usually designed to efficiently run both parallel and serial code, and therefore place more emphasis on high single-thread performance (e.g. devoting more silicon to out of order execution, deeper pipelines, more superscalar execution units, and larger, more general caches), and shared memory. These techniques devote runtime resources toward figuring out implicit parallelism in a single thread. They are used in systems where they have evolved continuously (with backward compatibility) from single core processors. They usually have a 'few' cores (e.g. 2, 4, 8) and may be complemented by a manycore accelerator (such as a GPU) in a heterogeneous system.
Cache coherency is an issue limiting the scaling of multicore processors. Manycore processors may bypass this with methods such as message passing, scratchpad memory, DMA, partitioned global address space, or read-only/non-coherent caches. A manycore processor using a network on a chip and local memories gives software the opportunity to explicitly optimise the spatial layout of tasks (e.g. as seen in tooling developed for TrueNorth).
Manycore processors may have more in common (conceptually) with technologies originating in high-performance computing such as clusters and vector processors.
GPUs may be considered a form of manycore processor having multiple shader processing units, and only being suitable for highly parallel code (high throughput, but extremely poor single thread performance).
A number of computers built from multicore processors have one million or more individual CPU cores. Examples include:
Quite a few supercomputers have over a million of even over 5 million CPU cores. When there are also coprocessors, e.g. GPUs used with, then those cores are not listed in the core-count, then quite a few more computers would hit those targets.