The following is not specific to any one multicore design; rather, it is a basic overview of multicore architecture. Although manufacturers' designs differ from one another, all multicore architectures share certain basic features. The basic configuration of a microprocessor is shown in Figure 2.
Closest to the processor is Level 1 (L1) cache, a very fast memory that stores data the processor uses frequently. Level 2 (L2) cache is just off-chip; it is slower than L1 cache but still much faster than main memory, and it is larger than L1 cache and serves the same purpose. Main memory is much larger and slower than cache and is used, for example, to store a file currently being edited in Microsoft Word. Most systems have between 1GB and 4GB of main memory, compared to approximately 32KB of L1 and 2MB of L2 cache. Finally, when data is not located in cache or main memory, the system must retrieve it from the hard disk, which takes orders of magnitude more time than reading from the memory system.
Figure 2: Generic Modern Processor Configuration
Figure by Bryan Schauer
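To make the cost of missing in each level concrete, the following is a minimal sketch of an average access time calculation for a hierarchy like the one in Figure 2. The function name `average_access_time` and all hit rates and latencies are hypothetical values chosen for easy arithmetic, not measurements of any real processor.

```python
def average_access_time(levels, backing_latency_ns):
    """Return the expected latency of one memory access, in nanoseconds.

    levels: list of (hit_rate, latency_ns) pairs ordered from the fastest
        level (L1) outward. hit_rate is the fraction of accesses reaching
        that level which are satisfied there.
    backing_latency_ns: latency of the last level, which always hits
        (main memory in this sketch).
    """
    expected = 0.0
    reach_probability = 1.0  # fraction of accesses that get this far
    for hit_rate, latency_ns in levels:
        # Every access that reaches a level pays that level's latency.
        expected += reach_probability * latency_ns
        # Only the misses continue to the next, slower level.
        reach_probability *= 1.0 - hit_rate
    expected += reach_probability * backing_latency_ns
    return expected


# Hypothetical figures: L1 hits 90% at 1 ns, L2 hits 90% at 10 ns,
# main memory answers everything else at 100 ns.
hierarchy = [(0.9, 1.0), (0.9, 10.0)]
amat = average_access_time(hierarchy, backing_latency_ns=100.0)
print(f"average access time: {amat:.2f} ns")
```

With these assumed numbers the average comes out to about 3 ns even though main memory is a hundred times slower than L1, which is why small, fast caches pay off. Adding a disk level with a much larger latency would show how quickly rare misses to disk dominate the average.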
If we set two cores side by side, we can see that a method of communication
between the cores, and to main memory, is necessary. This is usually
accomplished using either a single communication bus or an interconnection
network. The bus approach is used with a shared memory model,
whereas the interconnection network approach is used with a distributed memory model.
The shared and distributed memory models are depicted in Figure
3. Beyond approximately 32 cores the bus becomes overloaded with
processing, communication, and contention, which diminishes
performance; a communication bus therefore has
limited scalability.
Figure 3: (a) Shared Memory Model, (b) Distributed Memory Model 
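The scalability limit of a shared bus can be illustrated with a deliberately simple saturation model. This is a sketch under stated assumptions: each core generates a fixed amount of traffic, the bus has a fixed capacity, and arbitration overhead is ignored. The function names and the traffic units are hypothetical, and the capacity of 32 units is chosen only so that the model saturates near the core count mentioned above.

```python
def delivered_bandwidth(cores, demand_per_core, bus_capacity):
    """Total traffic the shared bus carries: total demand, capped at capacity."""
    return min(cores * demand_per_core, bus_capacity)


def per_core_bandwidth(cores, demand_per_core, bus_capacity):
    """Share of the bus each core receives once the bus is divided evenly."""
    return delivered_bandwidth(cores, demand_per_core, bus_capacity) / cores


# Hypothetical units: each core wants 1.0 unit of traffic and the bus
# can carry 32.0 units in total.
for n in (8, 32, 64, 128):
    share = per_core_bandwidth(n, demand_per_core=1.0, bus_capacity=32.0)
    print(f"{n:4d} cores -> {share:.2f} units per core")
```

Up to 32 cores every core receives its full demand; past that point the per-core share halves each time the core count doubles. A real bus degrades sooner and less cleanly because of arbitration and contention, which this model leaves out.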
Multicore processors seem to answer the deficiencies of single-core
processors by increasing bandwidth while decreasing power
consumption. Table 1, below, compares a single-core processor and a
multicore processor (8 cores in this case) used by the Packaging
Research Center at Georgia Tech. With the same source voltage
and multiple cores running at a lower frequency, we see an almost tenfold
increase in bandwidth while the total power consumption is reduced
by a factor of four.
Table 1: Single Core vs. Multicore 
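The two ratios quoted above can be combined into a single bandwidth-per-watt figure. The sketch below uses only those rounded ratios (roughly ten times the bandwidth, one quarter of the power), not the actual entries of Table 1; the function name `efficiency_gain` is hypothetical.

```python
def efficiency_gain(bandwidth_ratio, power_ratio):
    """Return the improvement in bandwidth per watt.

    bandwidth_ratio: multicore bandwidth divided by single-core bandwidth.
    power_ratio: multicore power divided by single-core power.
    """
    return bandwidth_ratio / power_ratio


# Rounded ratios from the text: "almost tenfold" bandwidth,
# power "reduced by a factor of four".
gain = efficiency_gain(bandwidth_ratio=10.0, power_ratio=0.25)
print(f"bandwidth per watt improves by roughly {gain:.0f}x")
```

Because the bandwidth increase is described as almost tenfold, the true figure would be somewhat under the 40x that these rounded inputs give.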