Forcing only Tinkers?
TBonZ
Ottawa, ON Icrontian
After getting SM23 set up and contributing, I noticed a huge difference between Gromac and Tinker. My box, a 2000+, is folding a Gromac and SM23 is folding a Tinker. The Tinker is worth 241 points and completes in 43 hours, while the Gro on the faster box, worth 240 points, is set to take 70 hours to complete. This is a huge discrepancy between completion time and point value.
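To put that gap in numbers, here is a quick points-per-hour comparison using only the figures quoted above (241 points in 43 hours versus 240 points in 70 hours). The function name is made up for illustration, and this is just simple arithmetic, not anything the folding client reports.

```python
def points_per_hour(points: float, hours: float) -> float:
    """Return the average points earned per hour of folding."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return points / hours


# Figures taken from the post: Tinker on SM23, Gromac on the 2000+.
tinker = points_per_hour(241, 43)
gromac = points_per_hour(240, 70)

print(f"Tinker: {tinker:.2f} points/hour")   # Tinker: 5.60 points/hour
print(f"Gromac: {gromac:.2f} points/hour")   # Gromac: 3.43 points/hour
print(f"Tinker pays about {tinker / gromac:.2f}x as much per hour")  # about 1.63x
```

So on these two units the Tinker earns roughly 60% more per hour, even though it is running on the slower box.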
Can Tinker be forced?
Comments
The points were set on a P4.
AMD boxes make a killing on the big Tinkers. They are better at math.
I am happy just to get them once in a while.
Actually... on my AMD boxes I have no flags and I have yet to get a single Gromac.
I use -advmethods on my P4s but I get mostly Tinkers on those too. :bawling:
Here, best as I can figure out, is why:
Semi-monolithic code, i.e. a single thread (true monolithic) or a small number of threads (multi-threaded but not hyper-threaded) in each process's code, can branch many ways. The more possible branches at any one intersection of the decision tree, the more a short pipe helps. The CPU can eval each branch faster with short pipes. With many threads per process, threads can be partway through the longer pipe, with each thread dedicated to evaluating one possible decision-tree branch.
Think of a modern, very complex game situation. Many choices can be made, and an interactive game has many possible choices allowed for in each situation. IF the decision tree allows for rules of "if NOT this then NOT that", and each eval is in a thread, this is hyperthreaded decision-tree processing. If the game has rules only for the consequences of possible actions based on what IS done, then the game tends to have fewer possible branches that can each be run as its own thread. The player chooses one action each time the situation changes.
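One rough way to picture the "short pipe helps with branchy code" argument is the textbook estimate that each mispredicted branch costs about one pipeline refill. The sketch below is only a toy model. The stage counts, branch fraction, and misprediction rate are illustrative assumptions, not measurements of any real AMD or P4 chip, and it ignores clock speed and cache entirely.

```python
def cycles_per_instruction(base_cpi: float,
                           branch_fraction: float,
                           mispredict_rate: float,
                           pipeline_stages: int) -> float:
    """Toy estimate: every mispredicted branch flushes the pipe,
    costing roughly one cycle per pipeline stage."""
    return base_cpi + branch_fraction * mispredict_rate * pipeline_stages


# Assumed workload: 20% of instructions are branches, 10% of those mispredict.
# Assumed pipes: a "short" 10-stage pipe and a "long" 20-stage pipe.
short_pipe = cycles_per_instruction(1.0, 0.20, 0.10, 10)
long_pipe = cycles_per_instruction(1.0, 0.20, 0.10, 20)

print(f"short pipe: {short_pipe:.2f} cycles/instruction")  # 1.20
print(f"long pipe:  {long_pipe:.2f} cycles/instruction")   # 1.40
```

Under these made-up numbers the longer pipe needs more cycles per instruction on the same branchy code, so it has to make up the difference in raw clock speed.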
Now, think folding:
Folding uses many successive frames to get a result. They are based primarily on evaluating positive consequences. Negative consequences are dealt with by Early-Ends if severe, and humans have to eval the weight of a negative consequence. Wrong hypotheses in essence yield Early-Ends, with minor clues as to WHY the WU ends. Once the scientists know more about a certain form of cancer, or a certain form of avian flu interaction, results both negative and positive can be matched to a knowledge base and rules derived. Then each step can be compared to what is known, and each step can be its own thread. THERE, the P4 will have an advantage, but it is a future advantage, because of the way coding is done now and traditionally has been done for decades.
UNLESS the P4 is very fast: the study Anand Shimpi did on AMD versus Intel actually shows that until AMD breaks the 3.2 GHz ACTUAL barrier per pipe, a fast Intel can be almost exactly as effective as an AMD running at a slower absolute speed per pipe. BUT, the longer pipe is meant for many-threaded processes. Intel, based on what is being done with 64-bit work, tried for absolute speed. Heat is a formidable barrier to normal, fairly inexpensive boxes really being stable, and lots of the heat is generated by large amounts of very FAST (not excessively oversped, just fast) cache RAM. Increase cache at the L2 and L1 levels, and with everything packed so close together, one area heating up heats the other areas.
Intel had to have a lot of cache at high speed in order to be able to pend threads, because each thread DID take longer to go through the CPU pipe. AMD, wanting to be able to handle ANY code, did not tune for 64-bit hyperthreaded code as much. Less cache was used and needed, and older code DOES do better on a shorter pipe.
BUT, here is how I got to the Intel box handling many threads at once better at very high speed:
I typically have 40-300 processes running on the Prescott box. This is an early-generation Prescott: not the earliest, but not the latest as of the last six months. When I have 41 processes and 1346 threads running, the box folds fine. When the box has 200 processes and over 2000 threads running (note the change in ratio between the first set of numbers and the second; I do have code sets that run more monolithically to add to the mix), the monolithic processes slow down but are still stable. When I additionally run really old code on top of this, the CPU stumbles badly. Many-branched folding breaks. Tinkers, OTOH, just fold slower. AND, because they are LESS efficient on a P4 than on an AMD of the same speed, they also take longer until the P4's speed is greater than 3.4 GHz. Then the sheer speed of cache and CPU makes them faster on the P4, but at a huge cost in heat.
Oh happy day!
Seems odd to me, but I haven't been folding too long. It doesn't seem to match the frame number for a p1084.
Stacy
I wouldn't wish a big Gromac on anybody.
With the recent upswing in large Gro units my production has taken a dump.
Thunderbird at 1.2 GHz: P1136 Tinker, started 12/13 about 8am, currently at 89%
Very telling.
The majority of my rigs are P4s, and with all the Gromacs I estimate my production is down about 25% vs. when I was getting nothing but Tinkers.
This thing just won't die. It's now the 18th and it's only at 89%.