Do I need ECC or non-ECC Memory?
Let's start by looking at a few terms used when describing
ECC memory. Non-ECC, or non-parity, memory is fine for most
systems not running a server.
ECC SDRAM Guide
ECC or Non-parity?
You may have to decide whether you want ECC or non-parity.
ECC can find and correct some memory errors, but it comes
with a performance price: it will slow your system by about
2%. Fortunately, memory errors are rare in today's memory
chips, so most average users don't have a need for ECC. If
you're planning to use your system as a server or other "mission-critical"
machine, we recommend ECC. If you're looking for maximum speed,
we recommend non-parity.
What is ECC SDRAM?
ECC (error correction code) SDRAM is memory that is able
to detect and correct some SDRAM errors without user intervention.
ECC SDRAM replaced parity memory, which could only detect,
but not correct, SDRAM errors.
What are Parity and ECC (Error Checking and Correction)?
Early on, RAM was not as stable as it is today.
Irregularities could corrupt or alter the data in memory
in ways that often led to a system crash or hard
disk data damage. This problem was first solved with Parity
RAM. Through additional or modified chips, it added an extra
bit to each byte of RAM, which verified the validity of each
byte. If the data did not check out properly, your computer
would typically halt to avoid further problems. ECC added
a further process to the cycle. Instead of merely checking
the bytes, it uses additional check bits to correct most errors.
It is fairly popular with the CAD crowd, as it helps maintain
strict accuracy. For most consumers, however, it is not
necessary due to the low rate of errors in today's memory,
and it actually involves a slight performance hit.
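The parity idea described above can be sketched in a few lines of Python. This is only an illustration, not how any particular memory controller is built: the function names are made up for the example, and even parity is assumed (the scheme could just as well use odd parity). It shows that one flipped bit is caught, but the check cannot say which bit flipped, and two flipped bits slip through unnoticed.

```python
def parity_bit(byte):
    """Even-parity bit for an 8-bit value: 1 if the byte has an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Model one parity-protected byte: the 8 data bits plus the 9th parity bit."""
    return byte & 0xFF, parity_bit(byte)


def check(byte, stored_parity):
    """True if the byte read back still agrees with the parity bit written with it."""
    return parity_bit(byte) == stored_parity


data, parity = store(0b10110010)

one_flip = data ^ 0b00001000   # a single bit changes in memory
two_flips = data ^ 0b00011000  # two bits change in memory

print("clean read passes:  ", check(data, parity))
print("one flip detected:  ", not check(one_flip, parity))
print("two flips detected: ", not check(two_flips, parity))
```

Because parity can only say "something is wrong," the safest reaction is to halt, which is the behavior described above.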
What causes SDRAM errors?
Per Dell, "Memory errors are characterized as hard or
soft. Hard errors are caused by defects in the silicon or
metalization of the SDRAM package, and are usually permanent
once they manifest. Soft errors are caused by charged particles
or radiation, and are transient. In the past, soft errors
were primarily caused by alpha particles, but that failure
mode has been mostly eliminated today by strict quality control
of the packaging material by SDRAM vendors. Currently the
primary source of soft errors in SDRAM is electrical disturbance
caused by cosmic rays, which are very high-energy subatomic
particles originating in outer space."
What happens when an SDRAM crash occurs?
When main memory crashes, all data in memory is lost. The
larger the amount of main memory on the computer, the greater
the possibility of nonrecoverable data loss.
What kind of errors can ECC SDRAM correct?
Most ECC SDRAM can correct single-bit errors and can detect,
but not correct, larger errors. Thus, errors larger
than 1 bit will still crash the computer.
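To make "correct one bit, detect two" concrete, here is a small sketch of an extended Hamming code protecting 4 data bits with 4 check bits. It is a toy: the 4-bit word size and the function names are assumptions chosen to keep the example short, while real ECC modules typically protect 64 data bits with 8 check bits. The principle is the same.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit codeword (list of bits).

    Index 0 holds an overall parity bit; indexes 1..7 are the classic
    Hamming positions, with check bits at 1, 2, 4 and data at 3, 5, 6, 7.
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (nibble, status); nibble is None when the error cannot be corrected."""
    code = list(code)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position  # XOR of the positions of all set bits
    overall_ok = sum(code) % 2 == 0

    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd number of flips: treat it as a single-bit error at `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Parity looks even but the syndrome is non-zero: two bits flipped.
        return None, "uncorrectable"

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
single = list(word); single[6] ^= 1
double = list(word); double[2] ^= 1; double[7] ^= 1

print(decode(word))
print(decode(single))
print(decode(double))
```

In a real system the "uncorrectable" case is what leads to the crash described above: the hardware knows the data is bad but cannot repair it.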
Chipkill was invented to augment ECC DRAM. Large server manufacturers
have implemented additional error correcting hardware capabilities
with a technology known as Chipkill. Per Dell, "Chipkill
correct is the ability of the memory system to withstand a multibit
failure within a SDRAM device, including a failure that causes
incorrect data on all data bits of the device. These methods
rely on the chip set and hardware architecture of the system
and cannot be achieved through software upgrades."
So what is the possibility of data loss?
The data shown below illustrates the results of an IBM analysis
comparing server outages due to memory failures of parity, ECC
and Chipkill-equipped servers.
In summary, the following outage rates were identified:
The 32MB parity memory-equipped server experienced
7 outages per 100 servers over 3 years.
The 1GB ECC memory-equipped server experienced
9 outages per 100 servers over 3 years.
The 4GB Chipkill-equipped server experienced
6 outages per 10,000 servers over 3 years.
It can be seen that the Chipkill-equipped server had an outage
rate roughly 150 times lower than the regular ECC SDRAM server
(6 per 10,000 versus 9 per 100), despite having four times the
memory. Also, remember that the more system memory a computer
has, the more likely it will crash due to a memory error.
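The comparison between these figures can be checked with a little arithmetic. The sketch below uses only the three outage rates quoted above. The per-gigabyte normalization is an added assumption (that outage risk scales linearly with memory size, with 1GB taken as 1024MB); it is not part of the IBM analysis and is included only to illustrate the point about memory size.

```python
from fractions import Fraction

# Outages per server over 3 years, as quoted above.
outages = {
    "32MB parity": Fraction(7, 100),
    "1GB ECC": Fraction(9, 100),
    "4GB Chipkill": Fraction(6, 10_000),
}

# Installed memory in MB for each configuration.
memory_mb = {"32MB parity": 32, "1GB ECC": 1024, "4GB Chipkill": 4096}


def ratio(a, b):
    """How many times higher the outage rate of configuration a is than b."""
    return outages[a] / outages[b]


def per_gb(name):
    """Outages per server per GB of memory (assumes risk scales linearly with size)."""
    return outages[name] / Fraction(memory_mb[name], 1024)


print("ECC vs Chipkill:", float(ratio("1GB ECC", "4GB Chipkill")))
for name in outages:
    print(name, "outages per server per GB:", float(per_gb(name)))
```

Under that linear assumption, the tiny parity configuration looks far worse per gigabyte than its raw outage rate suggests, which fits the remark that larger memories are more exposed to errors.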
What about speed?
I could find no conclusive evidence that ECC SDRAM performs
any slower than non-ECC SDRAM. Both Dell and IBM stated in
their referenced articles that there was no speed penalty for using
a Chipkill-enhanced server instead of an ECC memory-equipped
server without Chipkill.
So who should buy ECC SDRAM?
First, the average user should be frequently saving data to
their hard drive, so the likelihood of catastrophic memory
failure should be small and therefore ECC memory would be
overkill. Second, if you are thinking of running a server,
you definitely want to have a working RAID disk array, as
your hard drives are much more likely to fail than your memory.
Third, if you want to run a server, there is no reason not
to have ECC memory if your motherboard supports it. Currently
ECC SDRAM only costs a little bit more than regular SDRAM.
We hope this ECC SDRAM Guide has been useful to you. Crucial
makes finding the right ECC a simple process. We've catalogued
all the information you need into one easy-to-use, searchable
database: the Crucial Memory Selector. Our Memory
Selector is the most complete of its kind. It has more than
87,000 memory upgrades for more than 15,000 different servers,
printers, cameras, desktops, notebooks and more. All you have
to do is select what system you have and our Memory Selector
tells you what upgrades will work for you. And we guarantee
the memory you buy will be perfectly compatible with your
system or you will get your money back.
Use the Memory Selector to select the right ECC
for your particular system.