[XCSSA] Old computers

xcssa@xcssa.org xcssa@xcssa.org
Wed, 23 Aug 2006 00:19:55 -0500


On 8/19/06, xcssa-admin@xcssa.org <xcssa-admin@xcssa.org> wrote:
> Hmmm.. I'm not so big on the dual and quad core procs.  With AMD's
> HyperTransport system (an on-die "4 port" CPU/BUS/Memory IO switch).. you
> really only want one proc per die to maximize CPU/Mem/BUS xfer speed.

Actually I just read something about this, and what you describe is
exactly the problem.  The memory subsystem is so slow relative to
on-chip core operations that it becomes the bottleneck.  With two
cores each running at half the speed, each core places less demand on
the memory bus, and I suppose the memory subsystem does a better job
of feeding two sequential streams than one stream at twice the speed,
at least for the speeds we're talking about.

http://storagemojo.com/?page_id=207:

``For example, Intel's new Core 2 processors can issue up to four
instructions per clock cycle. On a 2GHz processor, that is up to 8
billion instructions per second. Dual-core probably comes close to
doubling that number - although actual instructions per clock are
typically 2-3. Do the math: 2 Ghz = 0.5 nanosecond clock. With dual
processors averaging a total of 5 instructions per clock you get 10
instructions per nanosecond. The very fastest RAM, which few of us
use, is about 5ns. So every memory access means a 20 clock cycle
stall. A disk with a 10ms access means a 20,000,000 clock cycle stall.

This huge I/O access cost is one of the key factors that led Intel to
de-emphasize clock speed and focus on dual-core processors to grow
performance. The storage couldn't keep up with the CPU.''
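
To check the arithmetic in that quote, here is a minimal Python
sketch.  The constants come straight from the quoted passage; the
function names (stall_cycles, lost_instructions) are just mine for
illustration.  Note that with these inputs a 5 ns access works out to
10 clock cycles at 2 GHz rather than the 20 the article states, so the
article may be assuming something it doesn't spell out.  The disk
figure does come out to 20,000,000.

```python
# Back-of-the-envelope stall arithmetic using the figures in the quote.
CLOCK_HZ = 2e9           # 2 GHz core clock, i.e. 0.5 ns per cycle
RAM_LATENCY_S = 5e-9     # "very fastest RAM ... about 5ns"
DISK_LATENCY_S = 10e-3   # "a disk with a 10ms access"
INSTR_PER_CLOCK = 5      # dual-core total instructions per clock


def stall_cycles(latency_s, clock_hz=CLOCK_HZ):
    """Clock cycles that elapse while waiting latency_s seconds."""
    return round(latency_s * clock_hz)


def lost_instructions(latency_s, ipc=INSTR_PER_CLOCK, clock_hz=CLOCK_HZ):
    """Instructions that could have issued during the stall."""
    return stall_cycles(latency_s, clock_hz) * ipc


if __name__ == "__main__":
    print("RAM stall cycles: ", stall_cycles(RAM_LATENCY_S))
    print("Disk stall cycles:", stall_cycles(DISK_LATENCY_S))
    print("Instructions lost per RAM access:",
          lost_instructions(RAM_LATENCY_S))
```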

Also, electrical stability is an issue at the higher clock speeds.

> I mean if you
> simply MUST have multiple processors I suppose more is usually better
> (computationally).. but from a system design perspective and RAM/BUS access
> perspective.. a two socket single core will outperform a same speed single
> socket dual core any time (it would seem anyway since you're not sharing the
> hypertransport switch on the die).

Unless there's a lot of overlap due to locality; that is, if the
second-level cache is shared and the two cores touch some of the same
data, the second core avoids a cache miss.  I suppose the TLB could be
shared too.  It's basically the same idea as having one shared DNS
server for an organization instead of individual ones.  Also, if the
cores need to talk to each other, staying on-chip is faster.  I am
speculating here, but I think dual-socket single-core is more
expensive in terms of motherboard real estate and number of
interconnects.  But yes, the dual-socket setup should generally be
faster.
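
A toy Python model of the shared-cache point, purely as a sketch: two
cores read mostly the same addresses, and we count misses with one
shared LRU cache versus two private caches of half the size.  The
LRUCache class, count_misses function, cache sizes and address
streams are all made up for illustration.  The model ignores
coherence traffic, contention and real cache geometry, so it says
nothing about actual hardware numbers.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative LRU cache that only counts misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)    # mark as most recently used
            return
        self.misses += 1
        self.lines[addr] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used


def count_misses(streams, shared, capacity=64):
    """Total misses when the cores take turns issuing one access each."""
    if shared:
        one = LRUCache(capacity)
        caches = [one] * len(streams)       # every core uses the same cache
    else:
        caches = [LRUCache(capacity // len(streams)) for _ in streams]
    for step in zip(*streams):
        for cache, addr in zip(caches, step):
            cache.access(addr)
    unique = {id(c): c for c in caches}.values()
    return sum(c.misses for c in unique)


# 16 addresses common to both cores, plus 8 private addresses each.
core0 = list(range(16)) + list(range(100, 108))
core1 = list(range(16)) + list(range(200, 208))

if __name__ == "__main__":
    print("shared: ", count_misses([core0, core1], shared=True))
    print("private:", count_misses([core0, core1], shared=False))
```

With a shared cache the common addresses only miss once, while the
private caches each have to fetch their own copy.
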
-- 
"If you're not part of the solution, you're part of the precipitate."
Unix "guru" for rent or hire -><- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066  151D 0A6B 4098 0C55 1484