SMP folding client uses one core on new kernel build

I'll post this here, and if we come up with any ideas or solutions I'll ask Aedan to move the thread to the folding page.

My Xubuntu machine, like my 2 Ubuntu servers, runs a 2.6.26.8 kernel patched with Ingo Molnar's RealTime patch, and all 3 boxes fold with the SMP client problem-free. I had a little free time on the weekend so I did some experimenting with the latest stable source [2.6.29.3, patched in the same way]. Using the same .config file as I used to build the other kernel, I created .debs for the image and headers. Everything installed fine and seemed to work fine, with the exception of folding: the new kernel does not utilize both cores on an Opteron 185. It runs one core at 100% while the other simply idles. Reinstalling the client did nothing. Booting into the old kernel, folding runs fine again.

I noticed that while setting up menuconfig there are a fair number of new options [some experimental]. I did not change any of the default settings on these, simply importing the .config file from the old kernel into the new build.

I know that the question is a bit esoteric, but what is the consensus on where to begin to look [in the kernel configuration] for what is preventing full core utilization, perhaps something related to SMP task scheduling or the like? Or am I overlooking something else?

Also, is there anyone out there who could either confirm or deny what I've experienced by building a kernel on 2.6.29.3 and trying the SMP client? I'm wondering if there's a bug. There are some reports on the forum of folks having trouble with 2.6.28, which is the standard distribution kernel in 9.04 Jaunty [the current Ubuntu/Xubuntu release]. I may have just proven my own argument for rolling release distros rather than scheduled release [read: rushed?] distros like Ubuntu. |
Does the kernel include MPICH2 support? |
Well, when the client is running, there is an mpiexec process active with 4 fahcore children. In the old kernel, the 4 fahcore processes add up to about 90% or so of CPU use. In the new build only two fahcore processes show CPU use, although all 4 are present. Since there is a running mpi process, I'd say yes, it does support it. There is no kernel option for it though, I've just looked at .config. |
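One way to put numbers on the "some fahcore processes busy, others idle" observation is to sample /proc/stat twice and compare the per-core counters. The following is only a rough Python sketch, assuming the usual Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration and are not part of any folding or kernel tool.

```python
def parse_cpu_times(stat_text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} from /proc/stat text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 user nice system idle iowait ...".
        # The aggregate "cpu" line and non-CPU lines are skipped.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Only the first 8 counters are summed; guest time is already
        # included in user time, so adding it would double count.
        values = [int(v) for v in fields[1:9]]
        total = sum(values)
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        times[fields[0]] = (total - idle, total)
    return times


def per_core_usage(before_text, after_text):
    """Return {cpu_name: percent_busy} between two /proc/stat snapshots."""
    before = parse_cpu_times(before_text)
    after = parse_cpu_times(after_text)
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta_total = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total else 0.0
    return usage


def read_proc_stat(path="/proc/stat"):
    """Read one snapshot of the counters (Linux only)."""
    with open(path) as f:
        return f.read()
```

To use it, call read_proc_stat(), wait a second or so with the client running, call it again, and pass both snapshots to per_core_usage(). On a healthy dual-core SMP run both cpu0 and cpu1 should sit near 100; in the situation described above one of them would stay near 0.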
Interesting, I'd agree. |
I've had some issues with the 2.6.29 kernel on my Gentoo install as well. Mine is a little different from yours in that I can only get 25% load on all cores regardless of what is set. I believe it is a task scheduler issue but do not have the knowledge to work it out. The good news is that I have enlisted some help; I'll meet up with him tonight and see if we can sort it out. I'll point out your issue to him as well. |
Hmmm...if I have some time I may try the next unstable revision and see if the problem goes away [2.6.30]. I'll let you know if I find anything. |
I wasn't able to get with my friend last night. Maybe tonight. I still feel they are trying to make the kernel too interactive, sacrificing overall performance for interactivity. Since the 2.6.23 kernel my rigs have gotten quite a bit more powerful, but the time it takes to get certain tasks done has not improved much. I can do more things at the same time, but that is not important to me as I hardly ever have more than a couple of windows open. |
Thanks for keeping me in mind ;) |
I've got a few different distributions on my rig at the moment and am currently working in rPath Linux with a 2.6.29 kernel, so I figured I would test it with some folding. Everything is fine under rPath. I noticed you carried your config over from the 2.6.28 kernel to the 2.6.29 kernel, which is generally not a good idea, although sometimes it will work. I pretty much stick to configuring the kernel manually unless it is just a revision bump, from 2.6.28-r3 to 2.6.28-r4 as an example. I won't be able to get back into Gentoo until tomorrow to go into the kernel config. Edit: I know it probably isn't as simple as SMP being disabled in the kernel, but it's worth a check I guess. Post a copy of your config file and I'll browse through it. |
Here are the relevant sections of the .config file. I'll try it again later in the week from scratch. Also, I still want to see if the newest source will behave differently, but don't have a clear enough head to do it right now. I'll keep you informed. I wish it were as simple as SMP being disabled, but under processor type it's definitely enabled. Remember, the old kernel has been running fine; it's only the new build that acts up, with the same options. Code: # |
I was only able to get a few minutes with my bro last night. He says to recompile the kernel without oldconfig and the problem will be fixed. I'll get more specifics from him tonight. Rolling release distros have their benefits and pitfalls as well. Staying up to date is easier, but things will break from time to time due to dependencies, and I've seen 3 kernel updates come in one week through Gentoo, which tells me someone somewhere isn't being real careful with the code they are writing. |
That's interesting. Wonder why it's any different from importing the same settings? I'll try it as soon as I have time. |
As new features are added, or if an option's name changes, an old config will leave any option it doesn't know about unset. I went through my kernel config and made a few changes to group related components, and now have a 100% working 2.6.29 Gentoo kernel. I can throw my config to you as a starting point. You will still have to go through it manually, as it is set up for my rig and slimmed down quite a bit. |
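To see which options an imported config knows nothing about, the old and new .config files can be compared directly. Below is a rough Python sketch, assuming the standard .config line formats ("CONFIG_FOO=value" and "# CONFIG_FOO is not set"); parse_config and config_diff are illustrative names, not part of the kernel build system.

```python
import re

# The two line shapes found in a kernel .config file.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; explicitly unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_diff(old_text, new_text):
    """Return (added, removed, changed) option names, each sorted."""
    old, new = parse_config(old_text), parse_config(new_text)
    added = sorted(set(new) - set(old))      # only in the new config
    removed = sorted(set(old) - set(new))    # dropped or renamed since the old one
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed
```

Feeding it the old kernel's .config and the one written by the new build gives a short list to review by hand: the "added" names are the new options that only have default values, and the "removed" names are the ones most likely to have been renamed.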
Cliff- OK, that makes some sense. I should be able to get through it as I did with my current 2.6.28 RT kernel. Mine is down to about 10MB from the stock Xubuntu kernel of 23MB, and I can probably get it a bit smaller if I learn a bit more about the options. In some cases I was forced to use some common sense to set them, and could be wrong. What did you do with the new "experimental" options? There are a good number of them. Maybe you could post your file, as you said, and I can take a look. Cheers for the help. |
To be honest I try to stick with a pretty non-experimental kernel, so I will browse through the menu and, if I find something that may improve my system, give it a run. I really haven't seen any performance benefits in the last year kernel-wise. Here's the CPU portion of my config. The NUMA options will have to be changed for your system to boot. Code: CONFIG_64BIT=y |
2.6.29.6 and 2.6.31-rc7 do the same thing. I finally had time to play around with it, and the result is the same. Using the same .config file, without make oldconfig, the second core refuses to light up on the SMP client. Compiling a vanilla kernel, without the RT patch, on either revision is fine, running both cores normally. I'm not sure if this is a bug and have no experience reporting bugs. Any thoughts on whether I should or not, and how and where to report it? |
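I usually go to the distribution forums and start a thread when I have kernel issues; they will either point you to a bug report that may have an answer, or have you submit a new bug report. It sounds like it refuses to set either SMP or the number of CPUs in the kernel config. Does the running kernel show that it is actually picking them up? Use zcat /proc/config.gz | grep -E 'CONFIG_SMP|CONFIG_NR_CPUS' to check (the file is gzip-compressed, so plain cat only prints binary data, and it only exists if the kernel was built with CONFIG_IKCONFIG_PROC). |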
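The same check can be scripted. This is only a sketch in Python, assuming the running kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC); read_running_config and smp_settings are hypothetical helper names.

```python
import gzip


def read_running_config(path="/proc/config.gz"):
    """Return the running kernel's config as text (file is gzip-compressed)."""
    with gzip.open(path, "rt") as f:
        return f.read()


def smp_settings(config_text):
    """Return the values of CONFIG_SMP and CONFIG_NR_CPUS.

    A value of None means the option does not appear at all;
    'n' means it is present but explicitly not set.
    """
    wanted = ("CONFIG_SMP", "CONFIG_NR_CPUS")
    found = {name: None for name in wanted}
    for line in config_text.splitlines():
        line = line.strip()
        for name in wanted:
            if line.startswith(name + "="):
                found[name] = line.split("=", 1)[1]
            elif line == "# %s is not set" % name:
                found[name] = "n"
    return found
```

Calling smp_settings(read_running_config()) on both the old and the new kernel and comparing the two results would show quickly whether the SMP options really made it into the new build.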
I'll try the Ubuntu forum and see what they say... |
I wish I could be of more help; I just don't know kernels very well, and the RT kernel patch was just never very appealing to me. I had the same problem on the 2.6.25 kernel a while back, and the fix for me was to just move back to a 2.6.23 kernel. I still run a 2.6.23 kernel on my server. It works just the way it should, so I haven't felt the need to do anything with it. |
That's OK, it's more of an exercise in learning for me, I may stick with the one that's working well for me and wait for the next major revision to try again. I wish I could open some dialog with the RT developers but they are a rather secretive bunch...I reckon that I'd get lost in the first few words anyway ;) As for the RT patch, I have found it to be noticeably lighter and quicker on its feet with desktop apps than the standard release kernel. It also is slightly faster in frame times in folding; don't know why that should be though. |
Copyright ©2001 - 2010, AOA Forums