Folding Command line arguments
Missileman
Orlando, Florida Icrontian
Okay guys. I'm looking at more stuff on -forcesse, -forceasm and such.
Over at the ABXZone a guy posted that they had done extensive testing and found the best way to set the switches for highest production.
Here's how it goes.
AMD XP processor: -forceasm only
P4 Processor: -advmethods only
P4 HT Processor: -advmethods -local
AMD 64 Processors: -advmethods -forceSSE
He claimed that the -advmethods switch has the server try for a Gromacs or double Gromacs WU first. Without -advmethods, the client requests a Tinker first.
He claimed Tinkers ran faster on AMD XP processors using 3DNow!, and that SSE shouldn't be forced because it was much slower.
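For anyone scripting their client shortcuts, the ABXZone list above can be written as a small lookup table. This is only a sketch of that one poster's recommendations, not an official setting. The CPU keys (`amd_xp`, `p4`, `p4_ht`, `amd_64`) and the function name `flags_for` are made up here for illustration.

```python
# Flags per CPU family, copied from the ABXZone recommendations quoted above.
# The dictionary keys are hypothetical labels, not anything the client knows about.
ABXZONE_FLAGS = {
    "amd_xp": ["-forceasm"],
    "p4": ["-advmethods"],
    "p4_ht": ["-advmethods", "-local"],
    "amd_64": ["-advmethods", "-forceSSE"],
}


def flags_for(cpu):
    """Return the argument string to append to the client shortcut."""
    return " ".join(ABXZONE_FLAGS[cpu])


for cpu in ABXZONE_FLAGS:
    print(cpu, "->", flags_for(cpu))
```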
I tried something. My Barton system was about 250 steps through a 400-step, 244-point Tinker. It had been averaging 5:44 per step with SSE forced. I took it down and changed it to -forceasm. It is now averaging 4:58 per step. Over 400 steps that is a big difference (about 5 hours). It's at step 327 now and maintaining the faster pace. How much slower would a Gromacs WU run without SSE? How often would it get Gromacs versus Tinkers?
Just some questions cause I am going to change the office machines tomorrow and want to set them the best way.
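The step-time numbers in the post can be checked with a few lines of arithmetic. This sketch uses only the figures quoted above (5:44 and 4:58 per step, 400 steps, 244 points). The points-per-week figure is an assumption: it pretends the machine folds the same kind of WU around the clock with no downtime.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def to_seconds(mmss):
    """Convert a 'minutes:seconds' step time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def hours_saved(slow, fast, steps):
    """Hours saved over `steps` steps when the step time drops from slow to fast."""
    return (to_seconds(slow) - to_seconds(fast)) * steps / 3600


def points_per_week(points, steps, step_time):
    """Rough PPW, assuming identical WUs back to back (an idealization)."""
    wu_seconds = steps * to_seconds(step_time)
    return points * SECONDS_PER_WEEK / wu_seconds


print(round(hours_saved("5:44", "4:58", 400), 1))  # 5.1 hours over a full WU
print(round(hours_saved("5:44", "4:58", 150), 1))  # 1.9 hours over the last 150 steps
print(round(points_per_week(244, 400, "5:44")))    # 1072
print(round(points_per_week(244, 400, "4:58")))    # 1238
```

So the 46 seconds per step works out to roughly five hours on a full 400-step WU, or a bit under two hours on the 150 steps that were left after the switch.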
Comments
2.) John_D (Ageek) asked to make sure you guys all know this and this seems like a good place to say it.
-advmethods: OK, here is what it is and what it does. -advmethods is a switch that will get you WUs that are in the last stage of development. For the most part they are stable, but there are batches of WUs that get out that can freeze your computer and early-end even on a stock system. It is NOT a switch that will get you Gromacs only. That's not what it's for. Since the introduction of the switch, most of the time it was a way to receive Gromacs, but I remember a time when Stanford was testing out genomes and they were being passed out to people with -advmethods. Tinkers have been passed out as well.
3.) -forcesse (also written -forceSSE) is like 30% faster than -forceasm. But as has been reported lately with the newest Gromacs cores, even these switches are not needed, as the client will automatically use SSE optimizations. In fact Stanford recommends not using -forcesse or -forceasm constantly on any machine. If you are forcing it on a machine because the client stops using the optimizations due to instability, then something needs to change. The machine is unstable and needs to be clocked down some.
So here is my correct way to set up AMDs and Intels.
32-bit AMDs: -forcesse -verbosity 9
Intel/64-bit AMDs: -forcesse -advmethods -verbosity 9
HT computers should have the -local switch as well, since they will need 2 clients. I am not sure if this switch is entirely needed anymore, but it doesn't hurt.
64-bit AMDs have SSE2, so they can definitely benefit from the double Gromacs WUs just like the P4s can. Tinkers or double Gromacs would be best for them.
P3s: I haven't tested them with the new Tinkers and haven't read anything on them. They have a higher IPC than the P4s, so I would think they might do better on the Tinkers than the P4s. Someone might want to test this out. I would say no -advmethods on P3s.
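The setup described in this post can be boiled down to a few rules. This is a rough sketch of those rules as read from the post, not anything Stanford publishes. The function name `john_d_flags` and its parameters are hypothetical, and keeping -forcesse on a P3 is an assumption, since the post only says to leave -advmethods off there.

```python
def john_d_flags(vendor, bits=32, hyperthreading=False, pentium3=False):
    """Build the flag list from the rules in this post.

    vendor is "amd" or "intel" (hypothetical labels for this sketch).
    """
    flags = ["-forcesse"]
    # Intel and 64-bit AMD get -advmethods; the post suggests leaving it off on P3s.
    if (vendor == "intel" or bits == 64) and not pentium3:
        flags.append("-advmethods")
    # HT machines run 2 clients, so each one gets -local.
    if hyperthreading:
        flags.append("-local")
    flags += ["-verbosity", "9"]
    return flags


print(" ".join(john_d_flags("amd")))                        # -forcesse -verbosity 9
print(" ".join(john_d_flags("amd", bits=64)))               # -forcesse -advmethods -verbosity 9
print(" ".join(john_d_flags("intel", hyperthreading=True)))  # -forcesse -advmethods -local -verbosity 9
```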
I don't have -advmethods on any of my Athlon machines and I still get mostly Gromacs WUs. There are just more Gromacs than Tinkers, and Stanford can get more info from Gromacs than Tinkers. They are just faster than Tinkers, so they are doing more Gromacs than Tinkers.
If anyone didn't understand that, speak up. :)
One thing that's kinda odd is that some WUs that are passed out without any flags are also passed out with the -advmethods flag. Not all WUs, but some are in both areas. Those are WUs that pretty much have no problems on any platform.
If I have trouble getting a WU on a machine I'll add -advmethods or something like that and that seems to help sometimes.
One thing that can't be explained is the Tinker I was originally talking about in this thread. Even though it is not supposed to use any optimizations, it did run significantly faster after changing it to -forceasm. I know the log doesn't show anything being turned on and such. This was a big WU and nothing changed on the machine. Nobody was using it. It was just folding. To lose almost a minute per frame is a big jump, considering it was more than halfway through the WU. It did finish right on the projected time of the faster pace. Something I need to ponder for a while.
The only reason why they made it so that SSE had to be forced on was because on some AMDs, SSE was not implemented right and since gromacs was such a stress test, some computers were crashing.
If it works faster that is great ...I'm certain that John_D has pointed this out on numerous posts. However, my concern is with stability at this point.
The clients are meant to be run with no arguments unless there are issues that need to be addressed. IMO ppl begin to run into various problems when implementing the arguments ...not to say that the arguments cause problems only that the risk is greater.
If you are comfortable with the arguments and they give you more production then by all means use them. If you develop any issues make us aware and we can do what it takes to try and resolve them. Sometimes the issues are caused by local problems and sometimes by the fault of the WU or core itself. That is where the troubleshooting begins.
An example: I've run -advmethods for at least 1 entire year on my 20+ rigs in lab with no issues due to that argument that I am aware of.
Happy Folding!!!
csimon
I'm running a P3, with the console client running as a service. The only tag it's running is -service. At 933 MHz, it gets about 17-18 minutes a frame for the Tinker WU it has been getting recently. I would compare it to this machine, but as I said, I can't even remember the last time this machine got one. Tinkers seem to be the only WUs that PIII gets. I think it's pretty close to, if not as fast as, this machine (I guess I should say 1/2 of this machine). If I remember correctly, this computer had ABOUT 14 mins/frame or something, so the PIII running Tinkers isn't too bad at all. EM calculates ~330 PPW. I don't exactly remember how it does with Gromacs WUs. It gets only Tinkers, kinda like my own machine gets only Gromacs.
-Rick