Fade To Irrelevance: Rig Recommendations

Processor and Motherboard

Processor Theory

by: je.saist

What follows is a write-up I did on computer components.  It has been edited from the original format for several reasons.  For those wishing to locate the original article, it is available here in pdf format.

Part 1: Processor Theory
Part 2: Hardware
Part 3: Thread Response

The following covers processors and motherboards, so let's dive in:


Processor Theory :  A little bit about why certain products are used

The processor and the motherboard make up a large part of a computer's performance. Motherboards with cheap parts can break more often and deliver significantly lower performance for the end user.  In most cases buying "brand name" products over generics can be a bit silly. Why pay the extra $2 for Pepto Bismol when the Equate bottle has the exact same active ingredients?  Why buy faded, ripped jeans from the Gap for $65 when you can do the same job in 15 minutes with a $10 pair from WalMart?

Sometimes though, the brand does make a difference.  Why buy Bungie when Hal Laboratories is available?  Why buy "Sams Brand Cola" with Coca Cola and Pepsi available?  This is even more true in the computer world, where strong brands can create a good computing experience and weak brands can create a bad one. Over the years that I've worked with hardware I've developed a pretty short list of brands that I prefer to buy from, as well as products where going "generic" is acceptable.

In the processor market there is only one real choice for the gamer or average user and that's AMD.  AMD products tend to be:

Cheaper: This was true in the days of the AthlonXp, when AMD products were anywhere from 1/3 to 2/3 cheaper than a comparable Intel processor.

More powerful: related to cheaper. Take for example today. The Athlon64 2800+ costs about $115 online.  The closest Intel in price is the 2.4ghz Prescott, weighing in at $120. The closest processor in terms of overall performance is the 3ghz Prescott at $170.  The closest processor from Intel in terms of performance and feature support is the Pentium 630, which weighs in at $210.  $210 in AMD processors gets you the latest Socket 939 3200+ at $198, or the Socket 754 3400+ at $165.

Lower Heat Output: Feel like roasting your leg? A 2.6ghz Northwood processor puts out more heat than an AthlonXp 2500+ overclocked to 2.2ghz (default speed 1.83ghz) and an Athlon64 754 3200 overclocked to 2.4ghz (default speed 2.0ghz). The AMD processors already decimate the Intel performance at stock speed, and when brought up to a point where heat output is identical, there is little comparison.  Keep in mind that this refers to Pentium processors based on Northwood, as processors based on Prescott are even hotter.
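
The price gap in the "More powerful" item can be worked out as simple arithmetic. The sketch below uses only the street prices quoted above; the function name and layout are my own illustrative choices, and the prices themselves will have drifted since this was written.

```python
# Street prices quoted in the text above (US dollars).
ATHLON64_2800 = 115
INTEL_MATCHES = {
    "2.4ghz Prescott (closest in price)": 120,
    "3ghz Prescott (closest in performance)": 170,
    "Pentium 630 (closest in performance and features)": 210,
}

def premium_percent(intel_price, amd_price=ATHLON64_2800):
    """Return how much more the Intel part costs, as a percentage of the AMD price."""
    return (intel_price - amd_price) / amd_price * 100

for name, price in INTEL_MATCHES.items():
    print(f"{name}: ${price}, {premium_percent(price):.0f}% over the Athlon64 2800+")
```

Going by these figures, matching the 2800+ on performance costs roughly half again as much on the Intel side.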


In the current processor landscape, Intel only has one product worth looking at, the Pentium-M which is better known from the Centrino brand name. The Pentium-M is essentially a remix of the Pentium3 built with mobile applications in mind.  The good news is that

A: the Pentium-M's power/heat ratio is in line with Athlon64. 
B: the Pentium-M's instructions per clock cycle (IPC) performance is right in line with AthlonXp
C: the Pentium-M is extremely cool
D: the Pentium-M does not require Hyperthreading.


The Bad news for end users is

A: the Pentium-M does not support X86-64
 

The Bad news for Intel is that

A: the Pentium-M doesn't require Hyperthreading



Okay, at this point I should probably explain Hyperthreading.  Hyperthreading is the term Intel uses to describe the technology that allows a Pentium4 processor to be seen as two processors by the computer system.  Intel advertises Hyperthreading as a performance boost, like you are getting 2 processors for the price of 1.  While not an outright lie, it is important to understand some factors behind Hyperthreading.

The Pentium4 has a small problem.  It is an extremely fast processor.  Given the amount of marketing made by Intel that speed=power, this does not sound like a problem, and it is easy to question why being fast is a problem.  Understand two items

Being fast, in and of itself, is not a bad thing.
The ends do not justify the means.

The problem is in how Intel got the Pentium4 design to be fast. If you want an engineer's side of the story, read over at Ars Technica for a more in-depth view of the Pentium design.  The short story is that the Intel Pentium4 design has an extremely long processing pipeline. Much like the factory at Ford Motor Company, a computer processor does its job in an assembly-line fashion.  The Pentium4 has many different steps in the assembly line that each do just a little bit of work.

The more assembly line units there are in the processor, the less work each unit has to perform, and the "faster" the instruction can move through each section.

The flip side is measured as "IPC", or Instructions Per Clock: the amount of work that can be done each time the processor completes a cycle of the clock.  The Intel Pentium4, by design, cannot process many instructions per clock cycle.  But, because the Pentium4 design runs at such high clock speeds, it should be able to make up for the processing work that is lost in simplifying each step.
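
The trade-off between clock speed and IPC can be put as a rough rule of thumb: work done per second is about IPC multiplied by clock speed. The sketch below is a simplified model, and the IPC figures in it are made-up numbers chosen only to illustrate the idea, not measurements of any real processor.

```python
def instructions_per_second(ipc, clock_ghz):
    """Rough throughput model: instructions per clock times clocks per second."""
    return ipc * clock_ghz * 1e9

# Hypothetical chips: one with a high clock and low IPC, one the other way around.
long_pipeline = instructions_per_second(ipc=1.0, clock_ghz=3.0)
short_pipeline = instructions_per_second(ipc=1.5, clock_ghz=2.0)

print(f"high clock, low IPC : {long_pipeline:.2e} instructions/s")
print(f"low clock, high IPC : {short_pipeline:.2e} instructions/s")
```

In this made-up case the two designs do the same amount of work per second despite a full 1ghz gap in clock speed, which is why clock speed alone tells you very little.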

The problem is, much as in an assembly line in a car factory, things break.  Think back to Ford Motor Company for a second. When a product breaks, Ford simply shuts down that part of the line until the broken product is fixed or replaced. Meanwhile, the work that is supposed to occur is either shifted to another assembly line or halted at that point.

Computer processors don't have that luxury.  When a part of the process fails, the entire instruction has to be started over. This is where you get into things like Branch Prediction, which is supposed to guess how an instruction is going to be processed. The better the branch prediction, the less likely part of the assembly line will break.

The problem with the Pentium4 design is that there is a significant performance hit for missed instructions or aborted processing lines.  There is also the problem that, depending on the instruction being processed, anywhere from 25% to 75% of the processor may not be doing anything.  Hence the requirement of HyperThreading.  HyperThreading allows for certain types of processes to be computed at the same time.  So, if one of the processes breaks and has to be restarted, at least the processor completed SOMETHING.
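
To see why a long pipeline makes a wrong guess so expensive, here is a minimal sketch of the usual back-of-the-envelope model. It assumes a mispredicted branch throws away roughly one pipeline's worth of work; the pipeline lengths, branch fraction, and miss rate are all hypothetical values, not figures for any actual chip.

```python
def average_cycles_per_instruction(pipeline_depth, branch_fraction=0.2,
                                   mispredict_rate=0.05, base_cpi=1.0):
    """Simplified model: a mispredicted branch throws away roughly one
    pipeline's worth of work, so the penalty grows with pipeline depth."""
    flush_penalty = pipeline_depth  # assumption: penalty ~ number of stages
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty

for depth in (10, 20, 30):  # hypothetical pipeline lengths
    cpi = average_cycles_per_instruction(depth)
    print(f"{depth}-stage pipeline: {cpi:.2f} cycles per instruction on average")
```

With the same branch predictor, the longer line loses more work every time the guess is wrong, so the deeper design needs either better prediction or a higher clock just to break even.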

The actual real-life impact of HyperThreading is that it allows the processor to be more efficient in some multiple application environments.  If you do a lot of multi-tasking that uses different types of applications, the 2 logical processors can boost perceived performance.  However, if you do a lot of multi-tasking with applications that are of the same type and use the same instructions to process, you will suffer a performance decrease.  The same goes if you run an application that is written with 2 real processors in mind: sometimes you will get a boost of performance, and sometimes you will get a performance hit.
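
A toy model makes the "different types of work" point easier to see. The sketch below is not how a real Pentium4 schedules anything; it simply assumes one execution unit per kind of instruction and lets two threads issue in the same cycle only when they want different units. All names are hypothetical.

```python
def cycles_to_finish(thread_a, thread_b):
    """Toy model: each cycle, each thread tries to issue its next instruction.
    Each execution unit can serve only one thread per cycle."""
    a, b = list(thread_a), list(thread_b)
    cycles = 0
    turn = 0  # alternate priority so neither thread starves
    while a or b:
        cycles += 1
        first, second = (a, b) if turn == 0 else (b, a)
        used_unit = None
        if first:
            used_unit = first.pop(0)
        # The second thread only issues if it needs a different unit.
        if second and second[0] != used_unit:
            second.pop(0)
        turn = 1 - turn
    return cycles

int_work = ["int"] * 8
float_work = ["float"] * 8

print("different kinds of work:", cycles_to_finish(int_work, float_work), "cycles")
print("same kind of work      :", cycles_to_finish(int_work, list(int_work)), "cycles")
```

In this model the mixed workload finishes in half the cycles, while two threads fighting over the same unit get no benefit at all. The toy only shows the boost disappearing; it leaves out the sharing overheads that can turn it into an outright loss.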

Hyperthreading is Intel's implementation of what is generally called SMT, for Simultaneous Multi Threading. The intent is to indicate that the processor is capable of processing two threads at once instead of just one, which is typical of just about all other processors.  However, Intel's implementation of SMT is drastically different from SMP, or Symmetric Multi Processing. SMP refers to having multiple physical processors installed and available to the operating system and program.  This allows the operating system and program full access to all of the system components.

The inefficiency of the Pentium4 design plays into Intel's hands for promoting SMT.  With between 25-75% of the processor not doing anything, enabling even part of that wasted power to be used allows the Pentium4 designs to be perceived as more powerful. However, with the thread and usage restrictions on HyperThreading, SMT is no substitute for SMP, and users should understand that SMT's only real function in the Pentium4 processor is to make a horrible design seem somewhat decent.




Going back to the original talk about Pentium-M, the work Intel's Israeli team has put into the Pentium-M has improved the IPC to AMD levels, and Intel's SMT implementation is not necessary to keep the processor busy.  However, since Intel made the mistake with the Pentium4 of promoting Hyperthreading as "the next big thing" and essential for "high-end" computers, Intel simply...

well, to put it like this, Intel boxed itself into a corner.  Intel promoted higher clock speeds and HyperThreading as the future of the desktop.  This ran completely counter to AMD's claims about higher performance, real dual processor systems, and dual-core processors.

Because of this Intel is in a bad position today. Its only competitive product to the Athlon64 doesn't support X86-64. And due to its Pentium4 advertising, Intel simply can't back down, admit it was wrong, and get back on track with meaningful products.




Perceived Speed Analogy

Think of it like this: Imagine you have two vehicles.

One is a Ferrari

One is a dump truck. 


Your task is to take a load of dirt and move it from one location to another.

The Ferrari, while much faster than the dump truck, can carry much less dirt.  So although the Ferrari can get to the destination and back faster than the dump truck, it is going to be much slower overall than the dump truck will be when the dirt is moved.  However, it is possible to create situations where the superior speed of the Ferrari will allow it to deliver more dirt to a location than the dump truck.


Take a step back and look at that again. 

The Ferrari is the Intel processor.
Fast, small, but it can't haul a lot of dirt.

The Dump Truck is the AMD processor.
It has a slower top speed, but it can haul a lot of dirt.

The dirt is the actual instruction issued to the processor.


For most average tasks the AMD processor can plow through more code than can the Intel processor. But this isn't always true.

Technologies like Altivec, 3Dnow, 3Dnow2, MMX, SSE, SSE2, and SSE3 allow a processor to handle some types of instructions in one cycle rather than in multiple passes.  Properly optimizing for these types of instructions can allow some processors to complete some tasks faster.  The most prominent example is Altivec, which allowed Apple computers to run many graphic editing programs much faster than near identical programs on Windows computers. However, optimizing for one of these special instructions can be a double-edged sword. If very few people use your instruction set, like AMD's 3Dnow, your technology could fall by the wayside.

To fit this into our load of dirt analogy, this is like giving our vehicles a shortcut to a destination.  The more shortcuts that are made available to our cars, the less time it takes to deliver the load.
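
The "shortcut" can be shown by counting instructions. The sketch below is only an instruction-count model, assuming an extension whose single instruction works on four values at once (for example, four 32-bit values packed into one 128-bit register); the function name and the data size are hypothetical.

```python
import math

def instructions_needed(element_count, lanes=1):
    """Count the add instructions needed to process every element when each
    instruction handles `lanes` elements at once."""
    return math.ceil(element_count / lanes)

pixels = 1000  # hypothetical amount of data to process
print("one element per instruction  :", instructions_needed(pixels))
print("four elements per instruction:", instructions_needed(pixels, lanes=4))
```

The catch from the paragraph above still applies: the shortcut only helps if the program was actually written or compiled to take it.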


Another difference that can affect perceived performance is the type of instruction itself. Going back to our load of dirt, let us divide it up into several small piles.  What we need to do is have these piles taken to a number of different locations.  In such a case, the Intel processor has a raw speed advantage over the AMD processor.  A common example of this in practice can be media encoding, where Intel has long dominated. This is due in large part to the relatively small size of media files and the number of operations that need to be performed. Intel's domination has lost ground as media encoders programmed to run in a full 64bit mode have become more common, allowing for more work to be done in each process.  Another factor is the presence of larger files, which tend to erase the speed difference as more work is done on each file.
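
The whole dirt-hauling analogy, including the small-pile case, fits in a few lines of arithmetic. This is a minimal sketch; the capacities and round-trip times are numbers I made up for illustration and are not meant to map onto real processor figures.

```python
import math

def hauling_time(total_load, capacity, round_trip_minutes):
    """Total time to move a load: number of trips times the time per trip."""
    trips = math.ceil(total_load / capacity)
    return trips * round_trip_minutes

# Hypothetical vehicles: (capacity in cubic yards, minutes per round trip)
ferrari = (0.25, 5)
dump_truck = (10, 15)

for load in (20, 0.2):  # one big pile, then one small pile
    f = hauling_time(load, *ferrari)
    d = hauling_time(load, *dump_truck)
    print(f"{load} cubic yards: Ferrari {f} min, dump truck {d} min")
```

With the big pile the dump truck wins easily, but when the job is a single small pile the truck's capacity goes unused and the Ferrari's faster round trip wins.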





64bits

One of the buzzwords surrounding home computing today is 64bits.  What is 64bits and what does it mean to the home computing user?

Currently there are 4 64bit designs available to the average computer user.  Those are

1: PowerPC 64  : represented by IBM's PowerPC 970
2: IA-64 : represented by Intel's Itanium series
3: X86-64 : represented by AMD's Athlon64 and Sempron64 processors
4: EMT : represented by Intel Celeron EMT, Pentium4 w/ EMT, and Pentium-D


However, there are a multitude of other 64bit processors available, which include architectures like Sparc, MIPS, POWER and Alpha. These 64bit processors tend to be extremely expensive and could only be purchased with proprietary UNIX Operating Systems geared towards server or database usage.  These processors are generally known and referred to as RISC, which stands for Reduced Instruction Set Computing, and are used in dedicated environments where the computer only has one or two tasks to complete.

The average business or home client wants to run more than just one or two programs on each system, which makes RISC systems poor choices for general computer use. The primary computing architecture that targets the multiple needs of the home user is known as CISC, for Complex Instruction Set Computing.  The most popular CISC architecture in use when home computers became available was X86, later known as Intel Architecture 32 (IA-32). Older users are familiar with the original home computers sold as 286, 386, and 486, and the variants such as DX and SX.  Intel ran into a slight problem though with the x86 chips, as they did not own the architecture.  Competitors such as Cyrix and AMD were free to use the x86 architecture as they wished.  Intel, who wished to separate themselves from the inferior products that Cyrix and AMD were known for (keep in mind this was before K7 launched, when Intel actually had some sense), created the brand name Pentium and developed new technologies such as MMX that allowed them to drastically separate the Intel processors from other X86 knockoffs.  One of the primary advancements made by Intel was the expansion of x86 from a 16bit processor to a 32bit processor.

Unfortunately, X86 isn't a great architecture. From what engineers say, it's actually pretty screwed up. There are several parts of the X86 design that are irrelevant to today's computer needs, and overall the architecture has long been thought to be one with a short life-span.

Intel, realizing that X86 was turning into a dead end architecture, wrote a completely new architecture known as IA-64 for, you guessed it, Intel Architecture 64.  IA-64 was designed as a 64bit RISC type processor from inception.  And in some types of computing, such as floating point operations, the IA-64 processors are unchallenged.  However, Intel ran into what many would refer to as a small problem.  That problem was revealed to be a proprietary software company known as Microsoft.  Microsoft's flagship products were all based on IA-32, and almost all of the products available to the average home user ran only on IA-32.  One of the few exceptions was a tiny company called Apple, whose products ran on PowerPC.  One of many issues that have plagued IA-64 is its lack of any relationship with IA-32.  IA-32 applications, such as Windows 2000, Word Perfect, or Half-Life, needed to run in an emulation mode which was too slow to be of use. The Register, an online tech newspaper, dubbed the Intel Itanium the Itanic, after the Titanic.  Nobody wanted to purchase a new architecture product that ran their existing programs at less than desirable speeds.


AMD had a different plan for 64bit.  Like Intel did with the original X86 design, AMD extended the capabilities of X86.  The new architecture was nick-named AMD64 and tried to clean up some of the legacy items X86 just didn't need.  The architecture was adopted as X86-64, the long story short being that X86-64 is fully compatible with IA-32.

That brings us up to the present, where it gets more convoluted.




Intel on X86-64

Intel, under pressure from Microsoft, created their own version of X86-64 that is named EMT (formally EM64T, for Extended Memory 64 Technology).  Intel simply promotes X86-64 as an enhanced memory technology that allows users to directly address more than 4gigs of RAM.  Much like the Hyperthreading already discussed, while not an outright lie, it's not the truth either.

Yes, one of the benefits of X86-64 is that it allows you to directly address several terabytes of memory.  However, the architecture is more than just a memory boosting technology.  X86-64 brings to the table more registers and a wider execution path for code.  As any modern game developer can tell you, having more bits to work with means more room to get creative.
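
The memory side of the argument is just powers of two, as the short sketch below shows. The 40-bit and 48-bit rows are assumptions on my part: example widths for a processor that only wires up part of a 64-bit address, since actual limits vary from chip to chip.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits

GIB = 2 ** 30
TIB = 2 ** 40

print("32-bit addresses:", addressable_bytes(32) // GIB, "GiB")
# Assumption: 40 and 48 bits are example widths for a partial implementation
# of a 64-bit address; actual limits vary by processor.
print("40-bit addresses:", addressable_bytes(40) // TIB, "TiB")
print("48-bit addresses:", addressable_bytes(48) // TIB, "TiB")
print("64-bit addresses:", addressable_bytes(64) // TIB, "TiB")
```

The 4gig ceiling falls straight out of the 32-bit row, and even a partial 64-bit address gets you into terabyte territory.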

Just ask Capcom what they could do with 64bits in a console known as Gamecube and a game known as RE4.

Ask Hal Laboratories what they could do with 64bits in a MIPS based console known as the N64.

Ask Naughty Dog what they can do with a 128bit MIPS design in a console known as the PS2.

Ask Retro Studios what they could do with a 64bit PPC architecture and a game known as Metroid Prime.

The X86-64 design is supposed to give the same kind of processor freedom to developers who work under X86. As an added benefit, X86-64 code easily ports over to PowerPC 64. Just ask around at http://icculus.org/ on how long it takes to convert PowerPC 64 code to X86-64 code and back.


Some developers you may have heard of by name, such as Tim Sweeney, John Carmack, and Gabe Newell, have all stated that content creation for future game engines will be 64bit processor only.  You may not necessarily NEED 64bits to play that content back, but you will need it to create content.

Basically, don't buy Intel's B.S. that X86-64 is only good for getting your ram count above 4gigs.  It's going to be used for a lot more, and to think otherwise is probably not a good idea.







