Recently, we had a server crash and then a number of issues arose. One problem was that an add-in 3ware RAID card was detected prior to the SATA drive. As a result, the RAID array became /dev/sda while the SATA system drive became /dev/sdb. Once I sorted that out by modifying the modprobe configuration, I still had issues with the system.
As it turned out, the initial crash was due to a memory issue. So what to do? I did not have memory on hand for this system but needed to get it back in operation. That’s when I ran across the memory mirroring feature in the BIOS.
Using PXEBoot, I booted into our memory check tool, memcheck. Memcheck was failing, so we had some bad RAM. However, we did not have any replacement RAM in stock, as this server is a one-off used for internal purposes.
While digging in the BIOS to confirm some drive settings, I noticed this motherboard has advanced memory RAS (reliability, availability, and serviceability) features. In short, the system has memory mirroring capabilities. Think of it as RAID 1 for RAM. So, I enabled memory mirroring and booted into memcheck again.
Guess what? Memcheck passed. So I booted into the system again and the crashes and segfaults were gone.
This is the type of thing you never really see in the books … using memory mirroring to mitigate a RAM issue until you can fix the problem. The drawback is that the system only sees half of the RAM with mirroring enabled, so the 4GB only appears as 2GB. But this system can typically run with 2GB of RAM with few performance issues. So in the short term, we will have half the normal RAM but a stable system.
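If you want to confirm from a running Linux system that mirroring really did halve the visible RAM, something like the following rough sketch could help. It reads the MemTotal line from /proc/meminfo and compares it to what you would expect. The function names are my own for illustration, and the 4GB figure is just this server's installed RAM. Keep in mind that MemTotal normally reads a little lower than the physical amount because the kernel reserves some memory for itself.

```python
from pathlib import Path


def mem_total_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:        2097152 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def usable_with_mirroring(installed_gb):
    """With mirroring on, half of the installed RAM holds the mirror copy."""
    return installed_gb / 2


if __name__ == "__main__":
    installed_gb = 4  # assumption: the amount physically installed in this box
    print(f"Expected with mirroring: {usable_with_mirroring(installed_gb):.0f} GB")

    meminfo = Path("/proc/meminfo")  # Linux only
    if meminfo.exists():
        seen_gib = mem_total_kib(meminfo.read_text()) / 1024**2
        print(f"Kernel currently sees about {seen_gib:.2f} GiB")
```

On this server, the second number should come out a bit under 2 GiB with mirroring enabled.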
This type of trick could be useful at many dedicated server providers. Their mid to upper-end servers may support memory mirroring. It could be a good way to get around a pesky RAM issue until it can be evaluated properly.