No doubt many FAH-Addict readers were excited when we published information last year regarding Bulldozer, AMD's next-gen CPU architecture. Unfortunately, in retrospect, it would seem that Fudzilla, from whom we sourced the information, thoroughly butchered what they received from AMD before publishing their article.

I used the term "core" because it is the closest equivalent term currently being used; it is a single core in the same manner as that of a Phenom or Nehalem, but the author at Fudzilla was probably confused by the core (what AMD refers to as a Bulldozer "module") having two cores of its own. What we consider to be a quad-core processor today would look more like this:

SMT implementations such as Intel's Hyper-Threading technology normally increase performance while only increasing the silicon die size by roughly 5%, representing a nearly "free" performance increase. However, there are also downsides to implementing SMT in lieu of adding more processing cores. Most notable is the potential to actually lower overall performance. Even in the best of situations, although the computer recognizes two or more processing cores, a virtual core can only provide 10%-20% (possibly 30% according to some) of what a second physical core would provide.
AMD's Bulldozer architecture, in my opinion, seems to be an attempt to strike the best possible balance between a virtual core and a second physical core. The second core's Integer and Scheduler units are much larger than the addition that HT requires, representing ~33% of the size of the entire module, but this is still significantly less than adding a second full core. Because both of the cores are physical, there is no potential performance penalty, and adding the second core offers 180% of the performance of the first core on its own.
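As a rough back-of-the-envelope check of that trade-off, the Python sketch below compares the extra throughput gained per unit of extra die area for each approach. It uses only the approximate figures quoted above, plus two assumptions of my own that are not from AMD: that a full second core costs 100% more area for 100% more throughput, and that a module without the second core's units would be the remaining ~67% of the module. The function name `gain_per_area` is made up for illustration.

```python
# Back-of-the-envelope comparison: extra throughput per unit of extra die area.
# All inputs are the rough figures quoted in the article, not measurements.

def gain_per_area(extra_throughput, extra_area):
    """Extra throughput divided by extra area, both as fractions of one core."""
    return extra_throughput / extra_area

# SMT: ~5% more area, worth 10%-30% of what a second physical core would add.
# Assumption: a second physical core adds 100% throughput for 100% more area.
smt_low = gain_per_area(0.10, 0.05)
smt_high = gain_per_area(0.30, 0.05)
full_core = gain_per_area(1.00, 1.00)

# Bulldozer: the second core's units are ~33% of the whole module.
# Assumption: without them, the module would be the remaining ~67%.
bulldozer_extra_area = 0.33 / (1.0 - 0.33)
bulldozer = gain_per_area(1.80 - 1.00, bulldozer_extra_area)

print(f"SMT:              {smt_low:.2f} to {smt_high:.2f}")
print(f"Bulldozer module: {bulldozer:.2f}")
print(f"Full second core: {full_core:.2f}")
```

Under these assumptions the Bulldozer module lands between SMT and a full second core, which fits the idea of a middle ground, but the exact ratios should not be read as anything more than an illustration.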
While I think most will agree that this is a brilliant concept, in my opinion Bulldozer has something else to offer that folders like you and me should find very interesting.
If you take another look at the first diagram, you will notice that even though each Integer core has its own scheduler, both FPUs share a single Floating-Point scheduler. Normally each core has access to its own 128-bit FPU, which is also found in all Core2, Nehalem, and Phenom/II/Athlon II processor cores. The difference is that when a core of a Bulldozer module accesses the FP Scheduler to perform an operation on the FPU, it also checks whether or not the other core is currently using its own FPU. If the neighbour core is not performing a Floating-Point operation, then the first core is free to use BOTH 128-bit FPUs as if they were a single 256-bit FPU! Using this capability, the core can perform up to 4 Double-Precision or 8 Single-Precision operations in a single clock!
To put it into perspective, a 256-bit FPU on its own, running at only 2 GHz, represents a theoretical maximum of 8 GFLOPS DP, or 16 GFLOPS SP:
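4 DP or 8 SP operations per clock
2 GHz = 2,000,000,000 Hz
1 Hz = 1 clock per second
Therefore, FLOPS = 4 (DP) or 8 (SP) * 2,000,000,000 = 8,000,000,000 (DP) or 16,000,000,000 (SP)
GFLOPS = FLOPS / 1,000,000,000 = 8 (DP) or 16 (SP)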
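The arithmetic above can also be written as a few lines of Python. This is only a sketch of the theoretical peak: it assumes one operation per operand slot per clock (64 bits for Double-Precision, 32 bits for Single-Precision) and ignores everything that limits real-world throughput. The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(fpu_width_bits, operand_bits, clock_ghz):
    """Theoretical peak GFLOPS, assuming one operation per operand slot per clock."""
    ops_per_clock = fpu_width_bits // operand_bits  # e.g. 256 // 64 = 4 DP slots
    return ops_per_clock * clock_ghz                # GHz * ops/clock = GFLOPS

# Both 128-bit FPUs combined into one 256-bit unit at 2 GHz
print(peak_gflops(256, 64, 2.0))  # Double-Precision -> 8.0
print(peak_gflops(256, 32, 2.0))  # Single-Precision -> 16.0

# A single 128-bit FPU at the same clock, for comparison
print(peak_gflops(128, 64, 2.0))  # Double-Precision -> 4.0
```

Plugging in a different clock speed shows how the theoretical ceiling scales; the 2 GHz figure is just the example used above, not an announced Bulldozer clock.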
Finally, I'm sure that many people will be pleased to know that AMD's official stance is that Bulldozer-based CPUs WILL be available to the market in 2011, and to quote AMD's director of server product marketing, "not on December 31, either!".
Note: After initially releasing information to the public, AMD has announced that from now on, their references to the number of cores in Bulldozer products will be as individual cores, not Bulldozer modules. I.e. they would refer to the image above as having 8 cores.
Contributed by: Berserker29
KaySL
On: 01/05/10














