Solved: Dell T5400 overheat problem(?)

May 31, 2013 at 10:04:14
Specs: Windows 7 Ultimate 64, Xeon dual core/16 GB Ram
I have a Dell T5400 that was a refurb I picked up about a year and a half ago. It has a single dual-core Xeon processor, 16 GB of RAM, and an Nvidia Quadro 4600. I'm an architect, so I do a decent amount of rendering and CAD work. Everything was great for the first few months. Then, one morning after a Windows update, I lost the ability to use dual monitors. This went on and off for a few months.

Last summer, having given up on the second monitor, and after another Windows update, I suddenly got the BSOD. I tried a number of times to get it back up but couldn't. The 1234 lights indicated a thermal issue and major errors. I let it sit for a few hours and came back to it. I was able to get it up again and ran a restore. Everything was perfect and my second monitor was back! The next morning when I restarted, back to the same issue. I realized that Windows Update had reinstalled all of its updates and was causing the same old problems. I restored again, and again, all was good...until I unintentionally updated Windows again.

So last week I started working and five minutes into the day the box shut down. I had noticed that the fans didn't seem to be running, so I took the side panel off. After letting it cool awhile and after much cussing and wasted time I was able to restore and work the rest of the day. I had a guy come in to take a look over the weekend. He went through everything - hardware and software - and couldn't find an issue. BIOS and Windows updates are current. He cleaned everything up and installed new thermal compound. It ran great for the next 5-6 hours.

Monday morning, five minutes in and BSOD. Took the side panel off, restarted, and worked for the next 8 hours with no issues. When I remove the side, I get the "thermal solution" error, but everything runs fine, just REALLY loud. So...What the heck is going on and how do I fix it???





#1
May 31, 2013 at 10:09:42
I should mention that I later learned that the dual monitor issue was some sort of conflict with a driver that Windows was forcing on the video card. That seems to have been resolved.

I should also add that I now get a message about "failure in Dimm 3 or 4" at startup, but we ran memory scans and everything came up clean.



#2
May 31, 2013 at 10:48:47
You do NOT have to accept any updates from Microsoft. I recommend you set Windows Update to download updates but not install them automatically. You can also choose to download ONLY Windows updates, and you can mark any update so it is not offered again.

Go to Windows update and change the settings there.

Generally speaking, you don't need to update drivers for hardware unless there is a specific reason. If that is the case, then get the hardware drivers from the manufacturer.

All computers need physical maintenance. Open the case and look. You will probably find a lot of dust/dirt inside. Blow it out with canned compressed air from the office supply/computer store, or with an air compressor. Be sure to blow out the power supply from both ends. This is messy and best done outside.

Contacts on RAM can get corroded due to the use of different metals. Snap each stick of memory in and out 4 or 5 times to burnish the contacts.

Be sure all the above is performed with the computer unplugged or if the power supply has a switch, turn it off.

DO NOT use a vacuum cleaner. Electrostatic discharge can kill your hardware.



#3
May 31, 2013 at 10:58:43
Thanks, Othehill. I had my update set as you suggest, but then while loading Office a few months ago I inadvertently accepted the "let Microsoft update" option. I have since changed it back. I also blew out the case, fans, heat sinks, and anything else I could see. The inside of the box is pretty clean. I saw something on another post about the case fan having a built-in heat sensor and the potential for failure there. Just wondering if that could be my problem.


#4
May 31, 2013 at 11:18:51
Right now I don't think we know whether some of the symptoms described are hardware or software.

If you get the BSOD again let us know the error code, particularly the first set of figures. If there is any file mentioned let us know that too.

I too have had issues with RAM edge connectors and usually clean them with a pencil eraser and pop them in and out a few times to ensure the sockets are cleaned too. I've had similar problems with SATA HD connections (power and signal), also video cards - so it might be worth doing the same with those too.

Always pop back and let us know the outcome - thanks



#5
May 31, 2013 at 11:24:44
I'll give that a shot Monday morning and let you know how it works. Using a pencil eraser reminds me of slot cars when I was a kid...


#6
May 31, 2013 at 11:56:06
"He cleaned everything up and installed new thermal compound"

Proper thermal compound application is critical. Is the "he" who re-applied the compound experienced at doing so? According to the following, your CPU is a "Dual-core Intel® Xeon® 5200 series".

http://www.dell.com/downloads/ap/pr...

And according to Arctic Silver, the compound should be applied using the vertical line method. If the horizontal line or surface spread method was used, that may explain why you're still having temperature related issues. How about posting some temp readings - system, CPU, GPU, HDD?

http://www.arcticsilver.com/pdf/app...

Has the RAM ever been tested with memtest86?

http://www.memtest86.com/

One other thing, you mentioned getting blue screen errors. How about posting the STOP code & error message(s)?



#7
May 31, 2013 at 12:50:47
Thanks, Rider,

Yes, he is experienced. I was otherwise occupied when he was applying the compound so I cannot verify how it was applied.

That is correct, I currently have a single Xeon 5200 series cpu. I just purchased dual 5460s and will be upgrading as soon as we get the current issue sorted out.

Yes, the RAM was tested with memtest86. It ran for a couple hours and did not turn up any errors.

I will post the stop code and error message next time it happens.



#8
May 31, 2013 at 14:08:45
✔ Best Answer
Running memtest for only 2 hours may not show a bad range. You should boot from the memtest86 media and let it run overnight.


#9
June 3, 2013 at 05:56:40
Ok, got to the office this morning and pulled and cleaned DIMM 3 and DIMM 4 as suggested. These were the two that were coming up as "previous failure" at startup. Ran for about five minutes, started CPUID to watch the temps and got the BSOD.

A problem has been detected...

MEMORY_MANAGEMENT

blah, blah, blah...

STOP: 0x0000001A (0x0000000000005003, 0xFFFFF70001080000, 0X000(got cut off in display), 0x0005eda0036BD00)
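For anyone who wants to look the error up, here is a rough Python sketch of splitting a STOP line into the bug-check code and its parameters. It is only an illustration: the function name `parse_stop_line` and the lookup table are made up for this example, and only the first two parameters from the line above are used because the third was cut off on screen. The name for 0x1A is the one shown on the blue screen itself.

```python
import re

# Bug-check name as it appeared on the blue screen in this thread.
KNOWN_CODES = {0x1A: "MEMORY_MANAGEMENT"}

# Only the first two parameters are used; the rest were cut off in the display.
STOP_LINE = "STOP: 0x0000001A (0x0000000000005003, 0xFFFFF70001080000)"


def parse_stop_line(line):
    """Return (bug_check_code, [parameters]) as integers from a STOP line."""
    hex_values = re.findall(r"0[xX]([0-9A-Fa-f]+)", line)
    if not hex_values:
        raise ValueError("no hexadecimal values found in STOP line")
    code, *params = (int(value, 16) for value in hex_values)
    return code, params


code, params = parse_stop_line(STOP_LINE)
name = KNOWN_CODES.get(code, "unknown")
print(f"Bug check 0x{code:X} ({name})")
for index, value in enumerate(params, start=1):
    print(f"  parameter {index}: 0x{value:X}")
```

The code plus the first parameter are the two values worth searching for in Microsoft's bug check reference.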

According to CPUID, the temp in the processor and RAM did not exceed 130 degrees F. I restarted and the processor is currently showing about 110 and the RAM is at 150-170. Graphics card is at 150.
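Since most component temperature specs are quoted in Celsius, here is a small Python sketch that converts the Fahrenheit readings above. The dictionary labels are just for this example, and the RAM entry uses the top of the 150-170 range.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


# Readings reported in this post, in degrees Fahrenheit.
readings_f = {
    "CPU (after restart)": 110,
    "CPU/RAM peak before BSOD": 130,
    "Graphics card": 150,
    "RAM (upper reading)": 170,
}

for label, temp_f in readings_f.items():
    temp_c = fahrenheit_to_celsius(temp_f)
    print(f"{label}: {temp_f} F = {temp_c:.1f} C")
```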



#10
June 3, 2013 at 08:12:48
170F (77C) is cosy but not usually enough to shout about. I assume the case sides are on because cooling is often poor when they are off.

RAM still sounds most likely but are you using the latest Win7 service pack?


#11
June 3, 2013 at 09:04:26
That is correct, case sides were on. The funny thing is, with them off I can run for 8 hours+ with no issues...other than going deaf from the fans running full speed.

So, I ran the Windows Memory Scan function. Bad idea. She locked up - black screen - and would not restart. Just beeped at me with lights 3 and 4 on. Had a meeting to go to, so I shut it off and left it alone for about an hour and a half. When I came back it started just fine and ran the test. It requested that I insert the Windows Repair Disk - don't have one - and reboot. I rebooted without it and it came up just fine. However, after about five minutes it shut down again. I wasn't next to it this time, so I'm not sure what the error code was this time.



#12
June 3, 2013 at 09:05:07
Sorry, not sure if I have the latest service pack or not.


#13
June 3, 2013 at 09:24:54
Here's how to find out:
http://pcsupport.about.com/od/windo...

You should be on SP1; it's been around for some time.


#14
June 3, 2013 at 09:55:22
Yes, I am on SP 1


#15
June 3, 2013 at 14:25:47
That's fine - it was related to something I read.


#16
June 3, 2013 at 18:21:45
See the link below. This error may be related to hardware drivers. Hardware may be attempting to access memory in the wrong range.

If you are running any hardware in a compatibility mode I suggest you try disconnecting it. Update your motherboard drivers using the latest from the manufacturer's site.

http://msdn.microsoft.com/en-us/lib...



#17
June 12, 2013 at 06:22:54
Guys,

Thanks for all of your help! It turned out to be a bad RAM module. I replaced it and have had no further issues...knock on silicone...



#18
June 12, 2013 at 09:26:26
Good news then. Thanks for taking the trouble to let us know.
