What follows is a write-up I did on computer components. It
has been edited from the original format for several reasons. For
those wishing to locate the original article, it is available here
in pdf format.
Part 1: Processor Theory
Part 2: Hardware
Part 3: Thread Response
The following covers processors and motherboards, so let's dive in:
Processor Theory: A little bit about why certain products are used
The processor and the motherboard make up a large part of a computer's
performance. Motherboards with cheap parts can break more often and
have significantly lower performance for the end user. In most cases
buying "brand name" products over generics can be a bit silly. Why pay
the extra $2 for Pepto Bismol when the Equate bottle has the same exact
active ingredients? Why buy faded, ripped jeans from the Gap for $65
when you can do the same job in 15 minutes with a $10 pair from
WalMart?
Sometimes though, the brand does make a difference. Why buy Bungie when
HAL Laboratory is available? Why buy "Sam's Brand Cola" with Coca Cola
and Pepsi available? Sometimes the brand does make a difference. This
is even more true in the computer world, where strong brands can create
a good computing experience and weak brands can create a bad computing
experience. Over the years that I've worked with hardware I've
developed a pretty short list of brands that I prefer to buy from, as
well as products where going "generic" is acceptable.
In the processor market there is only one real choice for the gamer or
average user and that's AMD. AMD products tend to be:
Cheaper: This was true in the days of AthlonXp, when comparable AMD
products were anywhere from 1/3 to 2/3 cheaper than the equivalent
Intel processor.
More powerful: This is related to cheaper. Take today for example. The
Athlon64 2800+ costs about $115 online. The closest Intel in price is
the 2.4GHz Prescott, weighing in at $120. The closest processor in
terms of overall performance is the 3GHz Prescott at $170. The closest
processor from Intel in terms of performance and feature support is the
Pentium4 630, which weighs in at $210. $210 in AMD processors gets you
the latest Socket 939 3200+ at $198, or the Socket 754 3400+ at $165.
Lower Heat Output: Feel like roasting your leg? A 2.6GHz Northwood
processor puts out more heat than either an AthlonXp 2500+ overclocked
to 2.2GHz (default speed 1.83GHz) or an Athlon64 754 3200+ overclocked
to 2.4GHz (default speed 2.0GHz). The AMD processors already decimate
the Intel performance at stock speed, and when brought up to a point
where heat output is identical, there is little comparison. Keep in
mind that this refers to Pentium processors based on Northwood;
processors based on Prescott are even hotter.
In the current processor landscape, Intel only has one product worth
looking at: the Pentium-M, which is better known from the Centrino
brand name. The Pentium-M is essentially a remix of the Pentium3 built
with mobile applications in mind. The good news is that
A: the Pentium-M's power/heat ratio is in line with Athlon64.
B: the Pentium-M's instructions per clock cycle (IPC) performance is
right in line with AthlonXp.
C: the Pentium-M runs extremely cool.
D: the Pentium-M does not require Hyperthreading.
The bad news for end users is
A: the Pentium-M does not support X86-64.
The bad news for Intel is that
A: the Pentium-M doesn't require Hyperthreading.
Okay, at this point I should probably explain Hyperthreading.
Hyperthreading is the term Intel uses to describe the technology that
allows a Pentium4 processor to be seen as two processors by the
computer system. Intel advertises Hyperthreading as a performance
boost, like you are getting 2 processors for the price of 1. While not
an outright lie, it is important to understand some factors behind
Hyperthreading.
The Pentium4 has a small problem. It is an extremely fast processor.
Given the amount of marketing by Intel claiming that speed=power, this
does not sound like a problem, and it is easy to question why being
fast is a problem. Understand two items:
1: Being fast, in and of itself, is not a bad thing.
2: The ends do not justify the means.
The problem is in how Intel got the Pentium4 design to be fast. If you
want an engineer's side of the story, read over at Ars Technica for a
more in-depth view of the Pentium design. The short story is that the
Intel Pentium4 design has an extremely long processing pipeline. Much
like the factory at Ford Motor Company, a computer processor does its
job in an assembly-line fashion. The Pentium4 has many different steps
in the assembly line that each do just a little bit of work.
The more assembly line units there are in the processor, the less work
each unit has to perform, and the "faster" the instruction can move
through each section, which is what allows the high clock speed. The
cost shows up in "IPC", or Instructions Per Clock, which refers to the
amount of work that can be done each time the processor completes a
cycle of the clock. The Intel Pentium4, by design, cannot process many
instructions per clock cycle. But, because the Pentium4 design is so
speedy, the design should be able to make up for the amount of
processing work that is lost in simplifying each step.
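As a rough sketch of the trade-off described above, throughput can be
modeled as clock speed multiplied by IPC. The function name and the
numbers below are made up for illustration and are not measurements of
any real Pentium4 or Athlon64.

```python
def instructions_per_second(clock_hz, ipc):
    """Rough throughput model: clock cycles per second times instructions per clock."""
    return clock_hz * ipc


# Hypothetical chips; the numbers are invented purely for illustration.
long_pipeline = instructions_per_second(clock_hz=3.0e9, ipc=1.0)   # high clock, low IPC
short_pipeline = instructions_per_second(clock_hz=2.0e9, ipc=1.5)  # lower clock, higher IPC

print(long_pipeline, short_pipeline)
```

In this toy model the 2GHz chip keeps pace with the 3GHz chip, which is
why clock speed alone says little about performance.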
The problem is, much as in an assembly line in a car factory, things
break. Think back to Ford Motor Company for a second. When a product
breaks, Ford simply shuts down that part of the line until the broken
product is fixed or replaced. Meanwhile, the work that is supposed to
occur is either shifted to another assembly line or halted at that
point.
Computer processors don't have that luxury. When a part of the process
fails, the entire instruction has to be started over. This is where
things like Branch Prediction come in, which is supposed to guess how
an instruction is going to be processed. The better the branch
prediction, the less likely part of the assembly line will break.
The problem with the Pentium4 design is that there is a significant
performance hit for missed instructions or aborted processing lines.
There is also the problem that, depending on the instruction being
processed, anywhere from 25% to 75% of the processor may not be doing
anything. Hence the requirement of HyperThreading.
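To put a rough number on the performance hit, here is a minimal sketch
that assumes a wrong branch guess costs about one pipeline's worth of
cycles. The function name, the 20% branch share, the 5% miss rate, and
the two pipeline lengths are all assumptions picked for illustration,
not figures for any actual processor.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once pipeline flushes are counted."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Assumed workload: 20% of instructions are branches, 5% of those are guessed wrong.
short_pipe = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=10)
long_pipe = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=20)

print(f"short pipeline: {short_pipe:.2f} cycles per instruction")
print(f"long pipeline:  {long_pipe:.2f} cycles per instruction")
```

With the same guessing accuracy, the longer assembly line pays more for
every wrong guess because there is more work in flight to throw away.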
HyperThreading allows for certain types of processes to be computed at
the same time. So, if one of the processes breaks and has to be
restarted, at least the processor completed SOMETHING.
The actual real-life impact of HyperThreading is that it allows the
processor to be more efficient in some multiple application
environments. If you do a lot of multi-tasking that uses different
types of applications, the 2 logical processors can boost perceived
performance. However, if you do a lot of multi-tasking with
applications that are of the same type and use the same instructions to
process, you will suffer a performance decrease. The same goes for an
application that is written with 2 real processors in mind: sometimes
you will get a boost of performance, and sometimes you will get a
performance hit.
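The different-versus-same application point can be sketched with a toy
model. It assumes a made-up processor with one execution unit per
instruction type ("int", "fp") and two threads that each try to issue
one instruction per cycle. This is only an illustration of the idea,
not how Intel's hardware actually schedules work, and it only shows the
"no gain" case rather than an outright decrease.

```python
def smt_cycles(thread_a, thread_b):
    """Count cycles for two threads sharing one execution unit per instruction type."""
    a, b = list(thread_a), list(thread_b)
    cycles = 0
    while a or b:
        busy = set()  # unit types already claimed during this cycle
        for queue in (a, b):
            # A thread issues its next instruction only if that unit is still free.
            if queue and queue[0] not in busy:
                busy.add(queue.pop(0))
        cycles += 1
    return cycles


mixed = smt_cycles(["int"] * 4, ["fp"] * 4)   # threads need different units
same = smt_cycles(["int"] * 4, ["int"] * 4)   # threads fight over one unit

print("different instruction types:", mixed, "cycles")
print("same instruction type:", same, "cycles")
```

When the two threads want different units they overlap and finish
sooner; when they want the same unit they simply take turns, so the
second logical processor buys nothing in this model.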
Intel prefers to describe their Hyperthreading technology with the term
SMT, for Simultaneous Multi Threading. The intent is to indicate that
the processor is capable of processing two processor threads at once
instead of just the one that is typical of just about all other
processors. However, Intel's implementation of SMT is drastically
different from SMP, or Symmetric Multi Processing. SMP refers to having
multiple physical processors installed and available to the operating
system and program. This allows the operating system and program full
access to all of the system components.
The inefficiency of the Pentium4 design plays into Intel's hands for
promoting SMT. With between 25-75% of the processor not doing
anything, even enabling just part of that wasted power to be used
enables the Pentium4 designs to be perceived as more powerful. However,
with the thread and usage restrictions on HyperThreading, SMT is no
substitute for SMP, and users should understand that SMT's only real
function in the Pentium4 processor is to make a horrible design seem
somewhat decent.
Going back to the original talk about Pentium-M, the work Intel's
Israeli team has put into the Pentium-M has improved the IPC to AMD
levels, and Intel's SMT implementation is not necessary to keep the
processor busy. However, since Intel made the mistake with the Pentium4
of promoting Hyperthreading as "the next big thing" and essential for
"high-end" computers, Intel simply... well, to put it like this, Intel
boxed themselves into a corner. Intel promoted higher clock speeds and
HyperThreading as the future of the desktop, completely counter to
AMD's claims about higher performance, real dual processor systems, and
dual-core processors.
Because of this Intel is in a bad position today. Its only competitive
product to the Athlon64 doesn't support X86-64. And due to its Pentium4
advertising, Intel simply can't back down, admit it was wrong, and get
back on track with meaningful products.
Perceived Speed Analogy
Think of it like this: Imagine you have two vehicles.
One is a Ferrari
One is a dump truck.
Your task is to take a load of dirt and move it from one location to
another.
The Ferrari, while much faster than the dump truck, can carry much less
dirt. So although the Ferrari can get to the destination and back
faster than the dump truck, it is going to take much longer overall
than the dump truck to move all of the dirt. However, it is possible to
create situations where the superior speed of the Ferrari will allow it
to deliver more dirt to a location than the dump truck.
Take a step back and look at that again.
The Ferrari is the Intel processor. Fast, small, but it can't haul a
lot of dirt.
The Dump Truck is the AMD processor. It has a slower top speed, but it
can haul a lot of dirt.
The dirt is the actual instruction issued to the processor.
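The analogy can be worked through with some quick arithmetic. The
capacities and round trip times below are invented for the sketch (one
unit of dirt per Ferrari trip, twenty per dump truck trip) and do not
correspond to any real benchmark.

```python
import math


def total_minutes(load, capacity, round_trip_minutes):
    """Time to move the whole load: number of trips times minutes per round trip."""
    trips = math.ceil(load / capacity)
    return trips * round_trip_minutes


# Hypothetical vehicles: (capacity in units of dirt, minutes per round trip).
FERRARI = (1, 2)
DUMP_TRUCK = (20, 6)

big_job_ferrari = total_minutes(100, *FERRARI)
big_job_truck = total_minutes(100, *DUMP_TRUCK)
small_job_ferrari = total_minutes(1, *FERRARI)
small_job_truck = total_minutes(1, *DUMP_TRUCK)

print("100 units:", big_job_ferrari, "vs", big_job_truck, "minutes")
print("1 unit:", small_job_ferrari, "vs", small_job_truck, "minutes")
```

The dump truck wins easily on the big pile, while the Ferrari wins when
there is only a single small load to deliver.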
For most average tasks the AMD processor can plow through more code
than the Intel processor can. But this isn't always true. Technologies
like Altivec, 3Dnow, 3Dnow2, MMX, SSE, SSE2, and SSE3 allow a processor
to handle some types of instructions in one cycle rather than in
multiple passes. Properly optimizing for these types of instructions
can allow some processors to complete some tasks faster. The most
prominent example is Altivec, which allowed Apple computers to run many
graphic editing programs much faster than near identical programs on
Windows computers. However, optimizing for one of these special
instructions can be a double-edged sword. If very few people use your
instruction set, like AMD's 3Dnow, your technology could fall by the
wayside.
To fit this into our load of dirt analogy, this is like giving our
vehicles a shortcut to a destination. The more shortcuts that are made
available to our vehicles, the less time it takes to deliver the load.
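The shortcut idea can be sketched by counting how many instructions it
takes to add two lists of numbers when each instruction handles one
value versus four values at a time. The `add_arrays` helper and the
4-wide figure are assumptions for the example; plain Python only
simulates the counting and does not use any of the instruction sets
named above.

```python
def add_arrays(a, b, lanes):
    """Add two equal-length lists, handling `lanes` elements per simulated instruction."""
    result = []
    instructions = 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1  # one simulated instruction per chunk
    return result, instructions


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]

scalar_sum, scalar_count = add_arrays(a, b, lanes=1)
wide_sum, wide_count = add_arrays(a, b, lanes=4)

print("one value at a time:", scalar_count, "instructions")
print("four values at a time:", wide_count, "instructions")
```

Both runs produce the same answer; the wide version just gets there in
fewer steps, which is the whole appeal of these extensions.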
Another difference that can affect perceived performance is the type of
instruction itself. Going back to our load of dirt, let us divide it up
into several small piles. What we need to do is have these piles taken
to a number of different locations. In such a case, the Intel processor
has a raw speed advantage over the AMD processor. A common example of
this in practice is media encoding, where Intel has long dominated.
This is due in large part to the relatively small size of media files
and the number of operations that need to be performed. Intel's
domination has lost ground as media encoders programmed to run in a
full 64bit mode have become more common, allowing for more work to be
done in each process. Another factor is the presence of larger files,
which tend to erase the speed difference as more work is done on each
file.
64bits
One of the buzzwords surrounding home computing today is 64bits. What
is 64bits and what does it mean to the home computing user?
Currently there are 4 64bit designs available to the average computer
user. Those are
1: PowerPC 64 : represented by IBM's PowerPC 970
2: IA-64 : represented by Intel's Itanium series
3: X86-64 : represented by AMD's Athlon64 and Sempron64 processors
4: EM64T : represented by Intel Celeron w/ EM64T, Pentium4 w/ EM64T,
and Pentium-D
However, there are a multitude of other 64bit processors available,
which include architectures like Sparc, MIPS, POWER and Alpha. These
64bit processors tend to be extremely expensive and can only be
purchased with proprietary UNIX Operating Systems geared towards server
or database usage.
These processors are generally known and referred to as RISC, which
stands for Reduced Instruction Set Computing, and are used in dedicated
environments where the computer only has one or two tasks to complete.
The average business or home client wants to run more than just one or
two programs on each system, which makes RISC systems poor choices for
general computer use. The primary computing architecture that targets
the multiple needs of the home user is known as CISC, for Complex
Instruction Set Computing. The most popular CISC architecture in use
when home computers became available was X86, or Intel Architecture 32
(IA-32). Older users are familiar with the original home computers sold
as 286, 386, and 486, and the variants such as DX and SX. Intel ran
into a slight problem with the x86 chips, though, as they did not own
the architecture.
Competitors such as Cyrix and AMD were free to use the x86 architecture
as they wished. Intel, who wished to separate themselves from the
inferior products that Cyrix and AMD were known for (keep in mind this
was before K7 launched, when Intel actually had some sense), created
the brand name Pentium and developed new technologies such as MMX that
allowed them to drastically separate the Intel processors from other
X86 knockoffs. One of the primary advancements made by Intel was the
expansion of x86 from a 16bit processor to a 32bit processor.
Unfortunately, X86 isn't a great architecture. It's actually screwed
up, from what engineers say. There are several parts of the X86 design
that are irrelevant to today's computer needs, and overall the
architecture has long been thought to be one with a short life-span.
Intel, realizing that X86 was turning into a dead end architecture,
wrote a completely new architecture known as IA-64 for, you guessed it,
Intel Architecture 64. IA-64 was designed as a 64bit RISC type
processor from inception. And in some types of computing, such as
floating point operations, the IA-64 processors are unchallenged.
However, Intel ran into what many would refer to as a small problem.
That problem was revealed to be a proprietary software company known as
Microsoft. Microsoft's flagship products were all based on IA-32, and
almost all of the products available to the average home user ran only
on IA-32. One of the few exceptions was a tiny company called Apple,
whose products ran on PowerPC. One of many issues that have plagued
IA-64 is its lack of any relationship with IA-32. IA-32 applications,
such as Windows 2000, WordPerfect, or Half-Life, needed to run in an
emulation mode which was too slow to be of use. The Register, an online
tech newspaper, termed the Intel Itanium the Itanic, named after the
Titanic. Nobody wanted to purchase a new architecture product that ran
their existing programs at less than desirable speeds.
AMD had a different plan for 64bit. Like Intel did with the original
X86 design, AMD extended the capabilities of X86. The new architecture
was nick-named AMD64 and tried to clean up some of the legacy items X86
just didn't need. The architecture was adopted as X86-64, the long
story short being that X86-64 is fully compatible with IA-32.
That brings us up to the present, where it gets more convoluted.
Intel on X86-64
Intel, under pressure from Microsoft, created their own version of
X86-64 that is named EM64T, for Extended Memory 64 Technology.
Intel simply promotes X86-64 as an enhanced memory technology that
allows users to directly address more than 4 gigs of RAM. Much like the
Hyperthreading already discussed, while not an outright lie, it's not
the truth either.
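For reference, the 4 gig figure falls straight out of the arithmetic:
every extra address bit doubles the amount of memory a processor can
point at. The 40 bit width below is only an example picked for the
sketch; how many address bits a given X86-64 chip actually wires up
varies by model.

```python
GIB = 2 ** 30  # bytes in one gibibyte
TIB = 2 ** 40  # bytes in one tebibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB, "GiB with 32 address bits")
print(addressable_bytes(40) // TIB, "TiB with 40 address bits (example width)")
```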
Yes, one of the benefits of X86-64 is that it allows you to directly
address several terabytes of memory. However, the architecture is more
than just a memory boosting technology. X86-64 brings to the table more
registers and a wider execution path for code. As any modern game
developer can tell you, having more bits to work with means more room
to get creative.
Just ask Capcom what they could do with
64bits in a console known as Gamecube and a game known as RE4.
Ask HAL Laboratory what they could do with 64bits in a MIPS based
console known as the N64.
Ask Naughty Dog what they can do with a 128bit MIPS design in a console
known as the PS2.
Ask Retro Studios what they could do with a 64bit PPC architecture and
a game known as Metroid Prime.
The X86-64 design is supposed to give the same kind of processor
freedom to developers who work under X86. As an added benefit, X86-64
code easily ports over to PowerPC 64. Just ask around at
http://icculus.org/ on how long it takes to convert PowerPC 64 code to
X86-64 code and back.
Some developers who you may have heard of by name, such as Tim Sweeney,
John Carmack, and Gabe Newell, have all stated that content creation
for future game engines will be 64bit processor only. You may not
necessarily NEED 64bits to play that content back, but you will need it
to create content.
Basically, don't buy Intel's B.S. that X86-64 is only good for getting
your RAM count above 4 gigs. It's going to be used for a lot more, and
to think otherwise is probably not a good idea.