Folding Command line arguments

Missileman · May 2004

Okay guys. I'm looking at more stuff on -forcesse, -forceasm and such.

Over at the ABXZone a guy posted that they had done extensive testing and found the best way to set the switches for highest production.

Here's how it goes.

AMD XP processor: -forceasm only

P4 Processor: -advmethods only

P4 HT Processor: -advmethods -local

AMD 64 Processors: -advmethods -forceSSE

He claimed that the advmethods switch has the server try for a gromac or double gromac first. Without advmethods, it requests a tinker first.

He claimed tinkers ran faster on AMD XP processors using 3DNow!, and that SSE shouldn't be forced because it was much slower.

I tried something. My Barton system was about 250 steps through a 400 step, 244 point tinker. It had been averaging 5:44 per step with SSE forced. I took it down and changed it to forceasm. It is now averaging 4:58 per step. Over 400 steps that is a big difference (about 5 hours). It's at step 327 now and maintaining the faster pace. How much slower would a gromac run without SSE? How often would it get gromacs versus tinkers?
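For anyone who wants to check the numbers, here is a minimal sketch of the arithmetic using only the step times and step counts quoted in the post above. The function names are made up for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '5:44' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_saved(slow: str, fast: str, steps: int) -> int:
    """Seconds saved over `steps` steps when the per-step time drops from slow to fast."""
    return (to_seconds(slow) - to_seconds(fast)) * steps


# Figures from the post: 5:44 per step before, 4:58 per step after,
# 400 steps in the WU, switch made at about step 250.
whole_wu = time_saved("5:44", "4:58", 400)
remaining = time_saved("5:44", "4:58", 400 - 250)

print(f"Per step: {to_seconds('5:44') - to_seconds('4:58')} s")   # Per step: 46 s
print(f"Whole WU: {whole_wu / 3600:.2f} h")                        # Whole WU: 5.11 h
print(f"Remaining 150 steps: {remaining / 3600:.2f} h")            # Remaining 150 steps: 1.92 h
```

So the gap is 46 seconds per step, a little over 5 hours across a full 400 step WU, or just under 2 hours for the 150 steps that were left when the switch was changed.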

Just some questions cause I am going to change the office machines tomorrow and want to set them the best way.

mmonnin · May 2004

1.) Tinkers have NO, absolutely NONE, optimizations. It won't matter if you run -forceasm, -forcesse, or nothing. Since the core doesn't have optimizations, there is nothing to turn on. Use -verbosity 9 to get the maximum detail in your log file. When SSE or 3DNow! is on, the log will say so. Tinkers say nothing.

2.) John_D (Ageek) asked to make sure you guys all know this and this seems like a good place to say it.
-advmethods: OK, here is what it is and what it does. Advmethods is a switch that will get you WUs that are in the last stage of development. For the most part they are stable, but batches of WUs get out that can freeze your computer and cause early ends even on a stock system. It is NOT a switch that will get you gromacs only. That's not what it's for. Since the introduction of the switch, most of the time it has been a way to receive gromacs, but I remember a time when Stanford was testing out genomes and they were being passed out to people with advmethods. Tinkers have been passed out as well.

3.) -forcesse (can be written -forcesse or -forceSSE) is like 30% faster than -forceasm. But as has been reported lately with the newest gromacs cores, even these switches are not needed, as the client will automatically use SSE optimizations. In fact, Stanford recommends not constantly using -forcesse or -forceasm on any machine. If you are forcing the flag because the machine stops using the optimizations due to instability, then something needs to change. It's unstable and needs to be clocked down some.

So here is my correct way to set up AMDs and Intels.

32-bit AMDs: -forcesse -verbosity 9

Intel/64-bit AMDs: -forcesse -advmethods -verbosity 9

HT computers should have the -local switch as well, since they will need 2 clients. I am not sure if this switch is entirely needed anymore. But it doesn't hurt.
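The setup above can be sketched as a small lookup table. This is only an illustration of the recommendations in this post, not anything from Stanford or the client itself: the CPU labels, the `RECOMMENDED_FLAGS` name and the `flags_for` helper are all made up, and only the flags themselves come from the thread.

```python
# Flags per CPU class, exactly as recommended in the post above.
# The dictionary keys are hypothetical labels chosen for this sketch.
RECOMMENDED_FLAGS = {
    "amd32": ["-forcesse", "-verbosity", "9"],
    "intel": ["-forcesse", "-advmethods", "-verbosity", "9"],
    "amd64": ["-forcesse", "-advmethods", "-verbosity", "9"],
}


def flags_for(cpu: str, hyperthreading: bool = False) -> list[str]:
    """Return the recommended switches for a CPU class.

    Adds -local for HT machines, since they run 2 clients.
    """
    flags = list(RECOMMENDED_FLAGS[cpu])  # copy so the table is not modified
    if hyperthreading:
        flags.append("-local")
    return flags


print(" ".join(flags_for("amd32")))
# -forcesse -verbosity 9
print(" ".join(flags_for("intel", hyperthreading=True)))
# -forcesse -advmethods -verbosity 9 -local
```

Append the resulting switches to whatever command or shortcut you already use to start the client.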

64-bit AMDs have SSE2, so they can definitely benefit from the double gromacs WUs just like the P4s can. Tinkers or double gromacs would be best for them.

P3s: I haven't tested them with the new tinkers and haven't read anything on them. They have a higher IPC than the P4s, so I would think they might do better on the tinkers than the P4s. Someone might want to test this out. I would say no -advmethods on P3s.

I don't have advmethods on any of my Athlon machines and I still get mostly Gromacs WUs. There are just more gromacs than tinkers, and Stanford can get more info from gromacs than tinkers. They are just faster than tinkers, so they are doing more gromacs than tinkers.

If anyone didn't understand that, speak up. :)

primesuspect · May 2004

I understand it, but I've always seen an advantage to not using any flags whatsoever (excepting -service).. Since my farm consists of so many machines at a variety of remote locations, and they are all production machines used for daily work, I can't risk anything beta or potentially unstable on any of them. So with no flags, I still managed to squeeze into the top 5 ranking.

mmonnin · May 2004

That's something people like you have to take into consideration. The machines have to be on and can't be brought down by something that has not been through full testing. That's understandable, most certainly.

One thing that's kind of odd is that some WUs that are passed out with the advmethods flag are also passed out without any flags. Not all WUs, but some are in both areas. Those are WUs that have pretty much no problems on any platform.

csimon · May 2004

primesuspect wrote:

I understand it, but I've always seen an advantage to not using any flags whatsoever (excepting -service).. Since my farm consists of so many machines at a variety of remote locations, and they are all production machines used for daily work, I can't risk anything beta or potentially unstable on any of them. So with no flags, I still managed to squeeze into the top 5 ranking.

I'm doing the same ...no flags.
If I have trouble getting a WU on a machine I'll add -advmethods or something like that and that seems to help sometimes.

Missileman · May 2004

Well I understand what you all have said. It actually looks like no arguments would be best. The Barton finished that 244 Tinker WU and got a Gromac. The log shows it is running with the SSE boost even though it has forceasm on. According to the Stanford boards this is correct, as forceasm just forces optimizations on whenever available. The core should now detect SSE first over 3DNow+ and use it. It obviously does this correctly on a Barton. I will check the others at the office to see. They currently have no arguments at all.

One thing that can't be explained is the Tinker I was originally talking about in this thread. Even though it is not supposed to use any optimizations, it did run significantly faster after changing it to forceasm. I know the log doesn't show anything being turned on and such. This was a big WU and nothing changed on the machine. Nobody was using it. It was just folding. Gaining almost a minute per frame is a big jump, considering it was more than halfway through the WU. It did finish right on the projected time of the faster pace. Something I need to ponder for a while.

mmonnin · May 2004

For AMDs, -forceasm forces 3DNow! and -forcesse should force SSE. But with the new gromacs cores, as I mentioned, neither is needed. The core will automatically use SSE.

The only reason why they made it so that SSE had to be forced on was because on some AMDs, SSE was not implemented right and since gromacs was such a stress test, some computers were crashing.

csimon · May 2004

Missileman wrote:

Well I understand what you all have said. It actually looks like no arguments would be best. The Barton finished that 244 Tinker WU and got a Gromac. The log shows it is running with the SSE boost even though it has forceasm on. According to the Stanford boards this is correct, as forceasm just forces optimizations on whenever available. The core should now detect SSE first over 3DNow+ and use it. It obviously does this correctly on a Barton. I will check the others at the office to see. They currently have no arguments at all.

One thing that can't be explained is the Tinker I was originally talking about in this thread. Even though it is not supposed to use any optimizations, it did run significantly faster after changing it to forceasm. I know the log doesn't show anything being turned on and such. This was a big WU and nothing changed on the machine. Nobody was using it. It was just folding. Gaining almost a minute per frame is a big jump, considering it was more than halfway through the WU. It did finish right on the projected time of the faster pace. Something I need to ponder for a while.

If it works faster that is great ...I'm certain that John_D has pointed this out on numerous posts. However, my concern is with stability at this point.

The clients are meant to be run with no arguments unless there are issues that need to be addressed. IMO ppl begin to run into various problems when implementing the arguments ...not to say that the arguments cause problems only that the risk is greater.

If you are comfortable with the arguments and they give you more production then by all means use them. If you develop any issues make us aware and we can do what it takes to try and resolve them. Sometimes the issues are caused by local problems and sometimes by the fault of the WU or core itself. That is where the troubleshooting begins.

An example: I've run -advmethods for at least 1 entire year on my 20+ rigs in lab with no issues due to that argument that I am aware of.

Happy Folding!!!

csimon

Medlock · May 2004

My HT P4 runs with -advmethods -local and -verbosity 9. A while ago I took off -advmethods upon learning that the switch does not actually request gromacs WU's. Ran it that way for almost a week, no tinkers. Same even with the switch. I put the argument back because it didn't seem to hurt anything. I can't really remember the last time this computer got a tinker, with or without the switch. I'm going to take off -verbosity 9 because I don't need it anymore. I only added it because it had some odd problem... It seemed to be folding at exactly half of its usual speed, even though the Task Manager had reported 100% processor usage. I went and changed the settings, so that they both used 96% of the processor and I made sure that both clients were running on different logical processors in the Task Manager. (CPU 0 and CPU 1) Problem Solved.

I'm also running a P3, with the console client running as a service. The only tag it's running is -service. At 933 MHz, it gets about 17-18 minutes a frame for the tinker WU it has been getting recently. I would compare it to this machine, but as I said, I can't even remember the last time this machine got one. Tinkers seem to be the only WUs that the PIII gets. I think it's pretty close to, if not as fast as, this machine (I guess I should say 1/2 of this machine). If I remember correctly, this computer had ABOUT 14 mins/frame or something, so the PIII running tinkers isn't too bad at all. EM calculates ~330 PPW. I don't exactly remember how it does with gromacs WUs. It gets only tinkers, kind of like my own machine gets only gromacs.

-Rick

mmonnin · May 2004

That's the Assignment Server doing its job. It will try to give Gromacs to P4s since they don't do as well with tinkers. If the P4 has advmethods on, it will give the P4 some Double Gromacs when available. This is all assuming that your performance fraction is high enough.

Medlock · May 2004

Hehe I think I need a new server then. These p520s, 524s, and 924s are weak lol. Makes sense though.
