nVidia/ATI GPU2 clients Folding clients that use GPUs on graphics cards running the GPU2 client


  #1 (permalink)  
Old 15th February, 2009, 10:02 PM
dabaerman's Avatar
Member
 
Join Date: October 2001
Location: Auburn, WA
Posts: 2,101

nans detected

I'm starting to see a lot of this on one vid card, an MSI 9800GT OC Edition @ 660MHz. It happens on projects 5750 and 5771; so far just those two.
Any ideas as to what might be causing this? I am heading for the Folding Community pages too.

Ron
__________________
Folding 24/7 with AMD QuadCore CPU's & nVidia Cuda Powered GPU's

AOA Team fah
  #2 (permalink)  
Old 16th February, 2009, 03:38 AM
ThunderRd's Avatar
Irreverent Query Chairman
 
Join Date: June 2007
Location: NYC native in northern Thailand
Posts: 2,232

Is it overclocked and/or hot?
__________________
#1: Tt Armor, ASUS Maximus Extreme, QX9650@4.1G, 8G Corsair Dominator GT DDR3-2000, Corsair HX1050, H2O-Swiftech, Gigabyte GTX470/Arctic Accelero Xtreme Plus II, Intel 520 SSD, Kingston SSD, 2xRaptor 150G RAID0, Win 7 Pro 64
#2: Tt Shark, ASUS P5Q Pro Turbo, Q6600@3.8G, 4G HyperX-1600, Corsair HX850, CoolerMaster V10, 2xASUS 9600GT, 2xRaptor 74G RAID0, OCZ Vertex 4 SSD, Gentoo/siduction Linux [64-bit]
#3, #4: Opteron 170@2.75G nude, A8N-SLI Deluxe, Gentoo

AOA Folding @HomeOur sister site: www.gamersonlinux.com
  #3 (permalink)  
Old 16th February, 2009, 01:09 PM
Chief Systems Administrator
 
Join Date: September 2001
Location: Europe
Posts: 13,075

NaN is a floating point error, where the result is Not a Number (hence the abbreviation). Typically this happens when the results of a calculation rely on a previous calculation, and there's been an error somewhere. For example, dividing 0 by 0 yields NaN because the result is undefined (dividing a non-zero number by 0 gives infinity instead), and every later calculation that uses that value comes out as NaN too.

That suggests that the card isn't functioning properly and you're getting errors in its calculations. Heat, power supply problems (either main PSU or on-card power), card RAM issues and a dying card could all cause this kind of error!
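
To illustrate how a single bad result spreads through later calculations, here is a minimal Python sketch. It is not how the GPU2 core checks its results; the `step_results` values and the `first_non_finite` helper are made up for the example.

```python
import math

inf = float("inf")

# IEEE 754: indeterminate operations such as inf - inf (or 0/0 on a GPU)
# produce NaN. Plain Python raises ZeroDivisionError for 1.0 / 0.0,
# so infinity is created explicitly here.
nan = inf - inf

def first_non_finite(values):
    """Return the index of the first NaN or infinity, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None

# Hypothetical per-step results with one bad value at index 2.
step_results = [1.25, 0.75, nan, 0.5]

# Every running total computed after the bad value is NaN as well.
running, totals = 0.0, []
for r in step_results:
    running += r
    totals.append(running)

print(totals)                    # [1.25, 2.0, nan, nan]
print(first_non_finite(totals))  # 2
print(nan == nan)                # False: NaN is not equal even to itself
```

Because NaN never compares equal to anything, a check like `math.isnan` or `math.isfinite` is needed to spot it; an ordinary equality test will not.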
  #4 (permalink)  
Old 16th February, 2009, 06:08 PM
ThunderRd's Avatar
Irreverent Query Chairman
 
Join Date: June 2007
Location: NYC native in northern Thailand
Posts: 2,232

With folding it's usually the overclock and the heat it brings. Ron, I'd put the card at factory specs, point a fan at it and see if your problems continue. I've looked at those project numbers and can't see anyone reporting anything unusual about them. 90% of the time it's our equipment, and not Stanford.

I've produced about two dozen 5750s and several 5771s already without incident, FWIW.
  #5 (permalink)  
Old 16th February, 2009, 10:43 PM
dabaerman's Avatar
Member
 
Join Date: October 2001
Location: Auburn, WA
Posts: 2,101

A quick look at my settings: card is factory OC @ 660MHz, mem @ stock 950MHz and shaders @ default 1738MHz, temp of 67°C. NaNs on 7 consecutive 5771's, completed #8, choked on the next 3, then got a 5754 that got NaNs @ 49%. All of this on Sunday. No issues today.
A visit to the Folding Community points to many projects in the 57xx to 5772 range with NaN issues. The list covers all nVidia cards on all chipsets and processors, i.e. Intel, AMD and SiS.
I have completed many 5750's and 5771's on 4 other nVidia cards (2x 9800GT, 1x 9600GT and 1x 9600GSO) with no issues. I will add my details to the FC database: AMD AM2 5000 BE @ 3.0GHz (200x15), Gigabyte GA-MA770-DS3, 2GB G-Skill DDR2-800, 2.0V, Enermax Liberty 450W PSU, all in a Koolance 601B (Chieftech Server Case).

Ron

Last edited by dabaerman; 17th February, 2009 at 02:29 AM.
  #6 (permalink)  
Old 17th February, 2009, 03:57 AM
ThunderRd's Avatar
Irreverent Query Chairman
 
Join Date: June 2007
Location: NYC native in northern Thailand
Posts: 2,232

What driver are you using on that box? Is it the same as the others?

Certainly doesn't seem like it's too hot or oc'd too much. Wonder why it's happening in batches like that, though. I could see the odd NaN if everything is ok, but 7 in a row seems to indicate something else.

We could take a look at the run-clone-gen numbers to see if there is any correlation. Do you still have the logs?
  #7 (permalink)  
Old 17th February, 2009, 04:29 PM
dabaerman's Avatar
Member
 
Join Date: October 2001
Location: Auburn, WA
Posts: 2,101

TR, took a close look at the log file. It was 5 in a row:
5750 R-4 C-147 G-106 2%
I rebooted and it loaded:
5771 R-5 C-228 G-156 14%
5771 R-5 C-228 G-156 5%
5771 R-5 C-228 G-156 5%
5771 R-5 C-228 G-156 3%
5771 R-5 C-228 G-156 100%; completed 10 in a row, then:
5752 R-9 C-069 G-88 49%
5752 R-9 C-069 G-88 100%, then:
5749 R-1 C-011 G-125 67%
5749 R-1 C-011 G-125 100%
5749 R-12 C-311 G-110 100% and completed 6 more so far.

it was MONDAY


Ron
  #8 (permalink)  
Old 17th February, 2009, 08:34 PM
ThunderRd's Avatar
Irreverent Query Chairman
 
Join Date: June 2007
Location: NYC native in northern Thailand
Posts: 2,232

All in all, that doesn't seem terribly extraordinary since the same WU was drawn multiple times. I guess it's a bit excessive, but you ran 10 successful completions after that, so who knows? I did some digging at the forum; there are some complaints about 57xx, but I wouldn't say it's more than normal.
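
To see how much of this comes from the same WU being redrawn, the entries Ron listed in post #7 can be tallied by project/run/clone/gen. This is a rough Python sketch: the lines are copied from that post, anything below 100% is assumed to be a failed attempt, and the `ENTRY` pattern and `tally_failures` function are hypothetical helpers for that hand-typed format, not a parser for the real FAHlog.

```python
import re
from collections import Counter

# Entries copied from post #7: project, run, clone, gen, percent reached.
LOG = """\
5750 R-4 C-147 G-106 2%
5771 R-5 C-228 G-156 14%
5771 R-5 C-228 G-156 5%
5771 R-5 C-228 G-156 5%
5771 R-5 C-228 G-156 3%
5771 R-5 C-228 G-156 100%
5752 R-9 C-069 G-88 49%
5752 R-9 C-069 G-88 100%
5749 R-1 C-011 G-125 67%
5749 R-1 C-011 G-125 100%
5749 R-12 C-311 G110 100%
"""

# Hypothetical pattern for the hand-typed lines above (the dash is optional).
ENTRY = re.compile(r"(\d+)\s+R-?(\d+)\s+C-?(\d+)\s+G-?(\d+)\s+(\d+)%")

def tally_failures(text):
    """Count attempts that stopped short of 100%, keyed by (project, run, clone, gen)."""
    failures = Counter()
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip anything that is not a WU entry
        project, run, clone, gen, pct = (int(g) for g in m.groups())
        if pct < 100:  # assumption: below 100% means the attempt failed
            failures[(project, run, clone, gen)] += 1
    return failures

failures = tally_failures(LOG)
for (project, run, clone, gen), count in failures.most_common():
    print(f"P{project} R{run} C{clone} G{gen}: {count} failed attempt(s)")
print("total failures:", sum(failures.values()))
```

On those entries the tally comes to 7 failed attempts spread over 4 distinct WUs, with 4 of the 7 on the single WU 5771 R5 C228 G156.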

What happened to r5 c228 g156, r9 c069 g88, and r1 c011 g125 when they got to 100%? Did they fail or upload successfully?

I'd still do what I said: open the case, point a house fan at the card, and run it like that for several days; see if you get any more failures. If you do, we'll take more drastic measures.

Last edited by ThunderRd; 17th February, 2009 at 08:38 PM.
  #9 (permalink)  
Old 18th February, 2009, 06:14 AM
dabaerman's Avatar
Member
 
Join Date: October 2001
Location: Auburn, WA
Posts: 2,101

Oh, they finished and uploaded successfully. It was 7 failures, and then at one point they were reloaded and finished. Some were the same 57xx but different R-C-G. As for drivers: ForceWare 180.60 (nv4_dsp 6.14.11.8060). I'll check; I know I used the current CUDA drivers at the time, but I'm sure it's the same driver on all nv cards.

Ron
  #10 (permalink)  
Old 3rd March, 2009, 03:05 AM
dabaerman's Avatar
Member
 
Join Date: October 2001
Location: Auburn, WA
Posts: 2,101

A little more information. Seems like the majority of failures are with the 511 point projects. I've made sure the drivers were up to date, lowered the core and shader speed and watched the temps. Still have a few that don't finish, and they are all over the place, from 4% to 92% when they fail. More later.

Ron


All times are GMT +1.


Copyright ©2001 - 2010, AOA Forums