Summarizing the Nvidia problems with laptop chips overheating
Last month Nvidia disclosed that due to a manufacturing flaw, some of their laptop computer graphics processors and chipsets are overheating and failing. This is a brief summary of the story for those who missed it.
Not all of the flawed processors and chipsets are failing, but the frequency of failure is unclear. Nvidia put it this way:
"Certain notebook configurations with GPUs and MCPs manufactured with a certain die/packaging material set are failing in the field at higher than normal rates. To date, abnormal failure rates with systems other than certain notebook systems have not been seen."
The day after the announcement, Humphrey Cheung at tgdaily noted that "significant quantities" of Nvidia chips are overheating and failing.
Two ways that failures manifest themselves are not being able to start the computer and, of course, a blank screen. Dell said that failure symptoms include multiple images, random characters on the screen, and lines on the screen. HP lists not detecting wireless networks as a sign of failure, along with the wireless adapter not appearing in the Windows Device Manager. They also note that if the "battery charge indicator light does not turn on when the battery is installed and the AC adapter is connected" it may be due to this Nvidia problem.
The problem has existed for a while. CNET blogger Brooke Crothers says that HP has known about this since November 2007. At The INQUIRER, Charlie Demerjian wrote about this problem back in April of 2007. Last month, Mr. Demerjian offered a fascinating explanation of what's going on in his article Nvidia plays the meltdown blame game. In it he says "...this problem hasn't cropped up in desktop parts yet, but it most assuredly will."
Today, the Wall Street Journal had a story about dissatisfaction with the way Nvidia has dealt with this issue, Chip Problems Haunt Nvidia, PC Makers. The article notes that "Nvidia hasn't recalled the affected chips or identified which models have problems." Nvidia's failure to publicly identify the problematic hardware strikes me as inexcusable. According to The INQUIRER, All Nvidia G84 and G86s are bad.
Are You Affected?
The only laptop vendors to step up to the plate so far have been Dell and HP.
Owners of 24 HP laptop computer models need to be concerned. See HP Pavilion dv2000/dv6000/dv9000 and Compaq Presario v3000/v6000 Series Notebook PCs - HP Limited Warranty Service Enhancement and HP Limited Warranty Service Enhancement. I can't tell which of these two items is the more recent since HP doesn't date stamp them.
Owners of 15 Dell laptop computers are affected, including models in the Inspiron, Latitude, Precision, Vostro, and XPS lines. Dell owners should read NVIDIA GPU Update: Dell to Offer Limited Warranty Enhancement to All Affected Customers Worldwide.
What To Do If You Are Affected
The solutions offered by both HP and Dell boil down to running the fan all the time to prevent the Nvidia hardware from getting too hot.
Both companies offer a BIOS update. HP seems to have an updated BIOS for all affected machines; Dell has one for 10 of their 15 affected models.
HP describes the BIOS update thusly:
"HP has identified a hardware issue with certain HP Pavilion dv2000/dv6000/dv9000 and Compaq Presario V3000/V6000 series notebook PCs, and has also released a new BIOS for these notebook PCs... The new BIOS release for your notebook PC is preventative in nature to reduce the likelihood of future system issues. The BIOS updates the fan control algorithm of the system, and turns the fan on at low volume while your notebook PC is operational."
A very different perspective on the BIOS update is offered by Charlie Demerjian in The INQUIRER:
"If you look at the HP page, the prophylactic fix they offer is to more or less run the fan all the time. Once again, for the non-engineers out there, fan running eats a lot of power, so this destroys the battery life of notebooks. Basically, people bought a machine with a battery life of X, and now it is Y to prevent meltdown from a bum part. It doesn't fix anything, it just makes the failures take longer, hopefully past the warranty period, at a huge battery life cost. Fire up your class actions people, you got shafted."
Both Dell and HP have extended the warranty on affected machines by one year.
Other Steps To Take
If you own a laptop computer with Nvidia chips and you haven't registered it with the hardware vendor, I suggest doing so. This way they can contact you if need be, and it can only help grease the wheels should you need warranty repair.
Some motherboards have thermometers for measuring and reporting the temperature. Try to contact the hardware vendor to see if they offer software that you can use to watch the internal temperature. I use the free HD Tune to watch the temperature in hard disks but the hard disk might be nowhere near the Nvidia chips. The System Information for Windows program can also display some temperatures. Still, the best monitoring is probably with software from the motherboard or computer manufacturer, if they offer it.
Be aware of where the vents are and make sure they aren't blocked. Also, check for dust on the fan and remove any that's there. Go to the Power options in the Control Panel and make sure that all the available power management facilities are being used. They include powering down the hard disk after a period of inactivity as well as CPU power management. The Thinkpad T42 that I'm writing this on also offers PCI Bus power management.
And, of course, the most important advice of all: back up your important files to some place outside your computer. Locally resident backups on an external hard disk or a USB flash drive are a great starting point.
Update August 20, 2008: A reader with a ThinkPad T61 laptop computer wrote to tell me that the fan runs all the time. I haven't seen anything about Lenovo in terms of this Nvidia problem but the computer in question has an NVIDIA Quadro NVS 140M.
Update September 10, 2008: A lawsuit broke out. See Lawsuit alleges Nvidia hid chip defects.
See a summary of all my Defensive Computing postings.
Michael Horowitz is an independent computer consultant and the author of several classes on Defensive Computing. He is a member of the CNET Blog Network, and is not an employee of CNET. Disclosure. 





Modern operating systems use lots of 3D animation on the desktop, so Nvidia has reduced the 2D part of its graphics cards and emulates much of it on the 3D part. This shows in the significant increase in power draw from the 7 series to the 8 series GeForce: in desktop operation they need roughly twice as much power.
I know some people who use Dell notebooks, and can compare them with other notebooks of the same age: Dell notebooks have a very tight and poor thermal design. They are not designed to get rid of much heat passively; they need the fans. This is one of the reasons for some of their spectacular failures. Back to the topic. The current GeForce chips seem to tolerate heat less well than the previous great series (the 6 and 7 are nearly immortal) and are back to the normal standard. To all desktop and notebook users I can only say: watch the card's temperatures; most monitoring programs can read out the internal sensors. In some cases notebook designers did not account for the extra 20 watts of power the chip needs in desktop operation. On the good side: on battery the notebook mostly doesn't draw much power and goes into a low-power profile, so it mostly dies when it is running on the cord. At home on your desk you can simply put the notebook on a cooling pad (a platform with two big, slow fans); this keeps it cool and out of trouble. It may even be less noisy, as the internal fans don't have to spin up as much.
To The Inquirer: it's rubbish about the extra juice the fans draw. A high-performance fan draws 0.5 amps. That means 6 watts at most, and about 2 watts in low-noise mode. The GeForce 8 and 9 series draw about 38 watts in the desktop versions and around 25 watts in the mobile versions. The CPU and chipset, with another 25 watts, are a big player too. So you have around 60 watts, and the low-noise fan adds 2 watts, which reduces your battery runtime by 3%. Does that matter?
Sorry for my bad English; I am a German student of technical computer science.
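The fan arithmetic in the comment above can be checked with a short sketch. It assumes a 12 V fan supply, which the comment does not state but which is implied by its 0.5 A to 6 W figure; the function names are made up for illustration.

```python
# Assumption: a 12 V fan supply, implied by the comment's 0.5 A -> 6 W figure.
FAN_SUPPLY_VOLTS = 12.0


def fan_power_watts(current_amps, volts=FAN_SUPPLY_VOLTS):
    """Electrical power drawn by the fan: P = V * I."""
    return volts * current_amps


def runtime_loss_fraction(base_watts, extra_watts):
    """Fraction of battery runtime lost when the load rises by extra_watts.

    Runtime is battery energy divided by power draw, so the new runtime
    is base / (base + extra) of the old one.
    """
    return 1.0 - base_watts / (base_watts + extra_watts)


max_fan = fan_power_watts(0.5)
loss = runtime_loss_fraction(60.0, 2.0)
print(f"Fan at full speed: {max_fan:.1f} W")
print(f"Runtime lost to a 2 W fan on a 60 W system: {loss:.1%}")
```

With the comment's own numbers this comes out at roughly 3%, in line with the estimate given.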
http://forums12.itrc.hp.com/service/forums/bizsupport/questionanswer.do?threadId=1260549
http://discussions.apple.com/thread.jspa?threadID=1478474&tstart=0
https://spreadsheets.google.com/ccc?key=pV-CKzYqbB6dLQRx8wpC2aw
1. HP admits to me that my laptop is affected by the bum NVidia chipsets.
2. Because it is out of warranty, and not yet on the list of affected laptops, I have to pay 300.00 dollars to fix my laptop.
3. Exasperation
So, since I utilize my laptop for work and school, it is imperative that I get my laptop fixed. I pay the 300 dollars to have it fixed and they tell me that a box will arrive at my door the following morning. This was 3 days ago and the box still hasn't arrived. The 300.00 dollars have already been deducted from my account. I called HP to find out what the hold-up was. On my first attempt I was told that since their system was down they couldn't tell me the status of my order and to call back tomorrow. Unwilling to take the hint, I call a second time immediately after. I am told that my order has been delayed on their end due to Hurricane Ike and that their Houston office, which sends out the boxes, is closed due to the evacuations. They said they sent a request to them and that I should call back in 48 hours to find out the ETA of the box.
Now, let me get this straight,
First, I am told by HP that their system is down and they can't answer my questions.
Second, I am told that there is a lag time on their end because of Hurricane Ike and that their Houston offices are closed, but that they sent a request to them for confirmation on the ETA.
Now, am I daft? Why would you send a request to an office you just said was closed due to evacuations because of Hurricane Ike?
I digress. Now, not only did I pay 300.00 for the repair, but I have to call them two days from now just to find out what their time schedule looks like. I am a little scared at this point of even sending my laptop to them even if the box showed up this second, as I may never see it again.
So, the question is: what is really going on at HP behind the scenes? It seems to me something massive is happening that we don't know about, causing them to just be uncaring. Maybe I should just try to get a refund on the repair and have it independently fixed before the company files for bankruptcy and I lose my 300.00 for good.
Anyway, I just thought I should let you know that the retailer sent my laptop to HP for repairs and it stayed there for 6 (six!) weeks ("parts shortage" they said, what about that?). So, I had it back after 7 weeks.
To answer your last question... I don't think there's anything happening behind the scenes... Probably not even a company like HP can afford to properly handle this kind of problem?!
I called HP and they said that my particular model is not included in the additional year warranty extension, which I think is complete BS. I have seen numerous posts online from people with 9000 series laptops with the Intel chip and the SAME problem. He said it is limited to models with the AMD processor. I have the Nvidia 7600 GPU. I am beyond pissed. I work from home and this is my only computer. I've also had the high capacity battery go faulty just after the warranty, and the number 9 key on the keyboard popped up and no longer works very well. I've never had issues like this with laptops before.
If anyone has had any success getting this issue resolved without having to pay the $300+ cost please help!
You're kidding, right???
Yeah, HP stepped up to the plate all right. They carefully stepped up to the plate like it was a viper, touched the corner of the plate, and then jumped back.
They are not stepping up to the plate, they are just trying to make it look like they are stepping up to the plate.
Wayne Sallee
Wayne@WayneSallee.com
Sorry if this has already been posted but it links to another part of the HP forum where hundreds of customers are experiencing similar problems considered to be with the NVIDIA graphics chip. Look at the post by Santos on 29th November 2008 for links to external information.
http://forums13.itrc.hp.com/service/forums/bizsupport/questionanswer.do?threadId=1191277
We cannot let this matter go quiet as HP would like; we have all been sold a dud and they really need to take FULL responsibility.
Regards
Celticprince (UK)
So before I call Dell the next time, I would love to know if there is even a fixed GPU out there and they just tried to give me an old one...
I recommend the free program "SpeedFan", which can provide some tweaking of fan control, but mainly it will display component core and ambient temps on your task tray. This can give you some heads up when to stuff your laptop in the fridge for a few minutes. If I run anything graphics intensive I need a "good" cooling platform or it will fry. I am now waiting for another mainboard for my Tecra M4 tablet (the 4th one with an Nvidia chip fried) and plan to beef up the heat sink or airflow. Does anyone have suggestions? I'm also wondering about the gateway? Intel 087 chip just downstream (airflow-wise) from the video chip. It appears to have a thin adhesive copper heatsink/shield on it. I've measured this chip's temperature and it's over 130 deg F after just 20 seconds of operation. I'm thinking of gluing (with some fairchild glue) some fins on this puppy.
How are these chips mounted? They appear to be glued down. I can't see how they could be flow soldered. Do they have compression contacts beneath them? It sure would be nice to be able to change one and not have to replace the whole mainboard, which costs about $500 from Toshiba. Something is really wrong here.
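Temperatures in this thread are quoted in both Fahrenheit and Celsius, so a small conversion sketch may help when comparing readings such as the 130 deg F measurement above. The function name is made up for illustration.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius: C = (F - 32) * 5 / 9."""
    return (deg_f - 32.0) * 5.0 / 9.0


reading_f = 130.0  # the chip temperature reported in the comment above
print(f"{reading_f:.0f} deg F is about {fahrenheit_to_celsius(reading_f):.1f} deg C")
```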
These jerks aren't getting away with it.
-Stacey
I'd appreciate anyone's input.
All one has to do are the following steps:
1. Remove the motherboard from the laptop and remove the heat sink from the CPU and GPU. Someone who has done this previously can do it in ten to twenty minutes. If you haven't taken it apart previously there is a good chance you can find the service manual online. For my HP the service manual has both pictures and instructions. Take a few pictures as you take it apart to help with reassembly.
2. Shield the motherboard with several layers of foil. Open the foil where the Nvidia GPU is by cutting around it with a finger nail or knife. Lay several coins around the perimeter of the hole in the foil to seal up the hole.
3. Place the motherboard on a flat surface in a well lit area.
4. Use a heat gun, the type you can buy at any Walmart or Harbor Freight, to heat the chip to 375 F for about a minute, as measured by a meat or candy thermometer held slightly above the chip. There are also videos on YouTube showing people using higher quality heat guns, which would be a good idea if you have access to one.
5. Place a small metal weight with a flat bottom on the center of the Nvidia GPU. I use one that weighs 100 grams that came as a calibration weight with an electronic pocket scale. Be extremely careful not to touch any of the tiny resistors on the top of the Nvidia GPU; the solder will still be molten and they will be easily knocked out of place.
6. Use a file, a Dremel tool, or sandpaper to smooth the top and bottom of a penny that is older than 1982, when they were still made out of copper.
7. Wait about five minutes for the chip to cool.
8. Pull the thermal tape off of the heat sink where it contacts the GPU.
9. Place a small amount of high quality heat sink grease just on the actual chip of the GPU. Smear it around so that it covers just the chip in the center of the GPU. If you get grease on the perimeter of the GPU, carefully wipe it back off or there is a good chance it will cause a short.
10. Place a small amount of high quality heat sink grease on the heat sink above where the GPU makes contact.
11. Carefully clean the heat sink and the CPU and place a small amount of high quality heat sink grease evenly on the CPU.
12. Center the penny on top of the Nvidia GPU.
13. Tighten the heat sink back down on top of the GPU and CPU. Make sure that the penny is clamped tightly enough that it will not move easily.
14. Reinstall the motherboard and reassemble the computer. Someone who is comfortable taking laptops apart and putting them back together can complete this entire operation in an hour or two.
This corrects both the black screen where the computer won't start at all and laptops where the wireless stops working. If just the wireless is not working, it is a risky enough operation that I would suggest waiting until the computer stops working altogether. This procedure fixed all issues with my computer. I would also suggest using a laptop cooling pad in the future and thoroughly cleaning the fan and fins on the heat sink.
I can't guarantee that this will work for every computer, but it will probably fix a large percentage of them. It definitely gives one a great feeling to revive one of these dead motherboards. If you heat the chip too much it could cause damage, and if you don't heat it enough it might need a second attempt. Good luck!
Problems:
1) I tried computer graphics programs written in C and C++, but they never worked on my laptop. The screen goes blank (screen collapse) whenever I try executing a program.
2) I have an Advanced Computer Graphics subject in my current semester, in which we have to do some practical assignments in Visual Studio 6.0 using OpenGL. For that I did some initialization with GLUT:
download it from:
http://www.xmission.com/~nate/glut/glut-3.7.6-bin.zip
a) I had copied glut32.dll in c:/windows/system directory
b) I had copied glut32.lib in c:/program files/microsoft visual studio/vc98/lib directory
c) I had copied glut.h in c:/program files/microsoft visual studio/vc98/include/Gl
And the moment I refreshed once, my laptop screen collapsed (blurred).
Now, after the second problem arose, the display device has crashed (it shows a random display). When I consulted a technician, he said that the graphics control IC has been damaged. He also said that it is the same problem for the whole series (AMD and Nvidia combination).
So I would like you to investigate it and do the needful.
I would recommend that you execute the following programs on the same machine:
And try the initialization that I have done with glut files and Visual Studio 6.0.
This site is a database of other people who have had the same problem with their HP laptops and what they have done to get them fixed, replaced or refunded.
This is bad, because they also use the affected NVidia chip (9600M GT series) which is also overheating very fast.
I'm not sure that NVidia alone is guilty here. Clearly HP has not even tested any basic 3D demo program, which would have shown that the cooling is inadequate: even when the fans are constantly running at maximum speed, the GPU will overheat to more than 80°C in ALL OpenGL and OpenCL animations in less than 1 minute. The PC will then power off abruptly, and won't be able to reboot immediately without crashing and switching off again just as abruptly (you'll have to wait for about 10 minutes).
Why do notebooks crash? The cooling vents are incorrectly placed (only under the notebook) instead of on the side or on the top (such as above the keyboard). Even when the notebook is placed on a hard surface like a desk, the airflow is restricted too much.
HP (like other notebook vendors) is trying to make stupid economies in the CPU and GPU mounting process: they use substandard heat pipes, the vents are much too small, and in general all the cooling is extremely bad. They even forget to connect all the heat pipes to the vents: you can see glued aluminium surfaces going nowhere except to the plastic surface at the back of the PC, notably for the north bridge. And there are absolutely no heat pipes for the memory chips.
Only the (Intel) CPU is correctly piped to the vents. But the fans run too slowly, are too small, and do not move enough air.
Many PCs (most notebooks, in fact) are affected: you can't use them to run any 3D application for more than 1 minute. Almost all common games cause the PC to crash.
And there's not even a way to reduce the GPU frequency in the BIOS settings so that it will not overheat too fast and so that software thermal regulation will work as expected. The overheating that starts at time t in the internal cores of the GPU, the CPU, or the north bridge takes about 1 minute to reach the surface of the chip, and even the best cooling systems will not be able to dissipate this heat. Clearly, the thermal sensors within the NVidia GPU chips are badly placed if they can't detect the overheat condition rapidly and immediately regulate the working frequency (automatically, in hardware, rather than depending on external software).
NVidia asserts that it makes fast GPUs, but even the GPUs for notebooks (M series) are affected, because they use incorrect thermal material for their casing. But PC manufacturers are also not respecting the thermal requirements.
This is scandalous. PC makers are just lying: their notebooks cannot work reliably with the supplied 3D chip. What is even worse is that they don't even allow users to reduce the frequency of the GPU directly in the BIOS, for safe operation.
Because the overheating can seriously damage your PC, it can die suddenly after running a basic demo or some rich-media ad banners when browsing the web. The GPU is also overheating when viewing some videos with the standard codecs supplied with some media players (including windows Media Player, Real Networks, QuickTime, or Adobe Flash).
Aren't there any independent test labs that will clearly tell consumers "don't buy this PC", and that will inform all online PC shops so that they stop delivering them?
Why put a GPU in a notebook if it's unusable for anything other than basic Office applications, and if it can even fail to boot Windows when the initial temperature is above 40°C, or when the room temperature is just above 20°C and still below 40°C with normal humidity? If these notebooks are made for office apps, then remove the accelerated GPUs, or use less powerful chips.
- by verdyp December 28, 2009 9:20 PM PST
- I have already had similar problems with ATI chips (on an AMD notebook from Acer, and on an Intel-based notebook from Asus); it also affects PCs built by HP with an Intel CPU + nVidia GPU...
Who tests the PCs? Are they just tested by opening a Google search page in Internet Explorer, reading some PDFs, or working in Word docs? For me, every function that does not work safely as advertised should be completely disabled in the hardware, as if this hardware were not present. We would then see that our PCs are now under-equipped (and really perform slower than previous generations).
NEVER BUY A NOTEBOOK TO RUN ANY GAME. AVOID PLAYING ANY HD VIDEO ON THEM, even if your Internet connection speed allows it. NEVER PLAY VIDEOS in full screen mode (unless it uses a reduced resolution). All these actions can cause your notebook to become permanently damaged, and all your data lost (unless you extract its internal hard disk and can connect it to another PC to restore the data)! Never use a notebook from within its carrying bag: you really need a flat, hard desk surface, without any fabric on it. Make sure that the room temperature never exceeds about 21°C (don't use it in a car, or outdoors on sunny days).
This means that notebooks should not be used at all during the two warmest months of summer (use your notebook only 10 months a year!), or you should buy and use an energy-hungry air conditioner in the room where you work!
And always use an external backup solution with today's notebooks. All this means that today's notebooks are in reality no longer made for mobile use. If you are going to use it indoors, why buy a lower-quality and more costly notebook? Use a desktop PC instead: it will be cheaper, and you will have alternative solutions for its cooling (with a notebook there are no simple solutions for customers, since you void the warranty if you break the seals to install much better internal cooling)!
(In fact from all I read above, it seems that ONLY the Intel Mobile GPUs are safe on notebooks, and that you should also never use any AMD CPU on a notebook, not even the mobile models).
And if your GPU can reach 80°C in some rare condition, stopping the animation immediately should allow the fans to bring the temperature down by at least 10°C in less than 2 seconds. And the normal working temperature should never exceed 50°C, to which your PC should be able to return in less than 30 seconds. If this is not true, the cooling system is clearly defective.
If after stopping the animation immediately, you still see the temperature growing by more than 3°C, the temperature sensors are badly placed, and this is a serious manufacturing defect in the GPU/CPU that cannot be solved, even with the best cooling system: this serious bug affects NVIDIA 9600M series of GPUs (due to incorrect packaging), used in HP notebooks.
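The rules of thumb in the two paragraphs above can be written down as a small check over a logged temperature trace. This is only a sketch: the thresholds are the ones quoted in this comment, while the function name, the trace format, and both sample traces are made up for illustration and are not measurements from any real notebook.

```python
def check_cooldown(samples, drop_c=10.0, drop_within_s=2.0,
                   normal_c=50.0, normal_within_s=30.0, max_rise_c=3.0):
    """Check a cool-down trace against the comment's rules of thumb.

    samples: list of (seconds_since_load_stopped, temp_celsius),
    sorted by time, with the first sample taken when the load stops.
    """
    start_temp = samples[0][1]
    peak_temp = max(temp for _, temp in samples)
    return {
        # Did the temperature fall by at least drop_c in under drop_within_s?
        "fast_drop": any(sec < drop_within_s and start_temp - temp >= drop_c
                         for sec, temp in samples),
        # Was it back at or below normal_c in under normal_within_s?
        "back_to_normal": any(sec < normal_within_s and temp <= normal_c
                              for sec, temp in samples),
        # Did it avoid rising more than max_rise_c after the load stopped?
        "no_overshoot": peak_temp - start_temp <= max_rise_c,
    }


# Hypothetical traces, invented for this example.
healthy = [(0.0, 80.0), (1.5, 69.0), (10.0, 58.0), (25.0, 49.0)]
suspect = [(0.0, 80.0), (1.5, 84.0), (10.0, 78.0), (40.0, 60.0)]

print(check_cooldown(healthy))
print(check_cooldown(suspect))
```

A trace that fails the first two checks would point at the cooling system by this comment's reasoning, while a failed overshoot check would point at sensor placement.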
Using an external cooling stand for your notebook does not help at all: the cooling vents are too small, and it won't work if it runs off the battery. A lot of effort is needed from PC manufacturers to ensure a responsible use of energy; if a notebook can consume more energy than it can dissipate as heat, this is a serious design defect. Clearly PC makers are lying with false performance reports or false capabilities. But who pays for the defects? Only the customers.
Make the PC a bit more expensive if this is needed for getting correct and safe cooling. But don't make dangerous economies in the manufacturing.
(Note: software thermal control tools will not work to correct this, as they don't react fast enough, and the nVidia or ATI GPUs will even fail if their main clock gets suspended for too long, that is, for the time needed to let the temperature come down.)