Edit: 16 x Dual 1.8GHz PPC G5s and 16 x Dual 2.0GHz PPC G5s now folding.
I have only set each with a single instance of the Terminal/Console, so I assume they will show up as a single CPU each.
Regarding the efficiency of the FAH configuration, do you mean you are running 2 FAH jobs on an HT CPU or on a dual-core CPU? My experience with HT P4 CPUs is that it is more efficient to run a single FAH job. But if you mean dual-core, yeah, I am also running two instances in parallel on the dual-core processors.
We are running strictly on single-core CPUs; the dual-core CPUs will not come in until after Xmas.
As for running 1 versus 2 instances, I find a gain of approximately 20% by running 2 instances, unless of course it is a QMD protein, in which case we would run 1 instance per PC.
The following are my 7 dedicated boxen at present (14 processors):
Opteron 170
P4 830
P4 840XE
P4 920
P4 940
E6400
E6600
All run with 2GB of RAM.
I overclock 20-30% depending on heat output; I prefer stability over summer, as it can get into 50C-plus ambient territory.
I run with big packets, and only run the -local flag for the two Consoles.
Datsun 1600
Ah, ok. So it seems that enabling bigpackets will result in 2 to 3 times the points per day per processor! Thanks for the info! If there is anyone with dedicated folding machines, looks like now is the time to make sure you've got bigpackets enabled!
Hey Shal,
Thanks for waking me up. I tested two jobs on a single HT CPU. It is indeed faster to run two FAH instances at once.
It was 52 minutes for 1% computation (Gromacs core) when running a single job. When I started another one (both of them Gromacs core), the time to complete 1% became 87 minutes for the first job. So completing two jobs in parallel takes about 16.3% less time than running them sequentially on an HT CPU (87 minutes versus 2 x 52 = 104). :bigggrin:
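For anyone who wants to check the arithmetic on those two timings, here is a rough sketch in Python. It assumes both jobs slow down equally when run together (only the first job's 87 minutes was actually measured), and the function name is made up purely for illustration.

```python
def ht_comparison(single_min, paired_min, jobs=2):
    """Compare running `jobs` work units back to back vs. side by side.

    single_min: minutes per 1% with one job running alone
    paired_min: minutes per 1% for each job when all run together
                (assumption: every job slows down equally)
    """
    sequential = jobs * single_min   # advance every job by 1%, one after another
    parallel = paired_min            # advance every job by 1%, all at once
    time_saved = (sequential - parallel) / sequential
    throughput_gain = sequential / parallel - 1
    return time_saved, throughput_gain


saved, gain = ht_comparison(52, 87)
print(f"time saved: {saved:.1%}, throughput gain: {gain:.1%}")
```

With 52 and 87 minutes this gives about 16.3% less wall time, or about 19.5% more work per hour, which lines up with the roughly 20% gain Shal reported.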
nice research!
I wonder if the L2 cache or FSB speed of the processor makes a difference when using 2 instances on an HT processor.
What's your proc? FSB speed / cache?
If anyone else is out there running 2 instances of FAH on a dedicated HT proc, please post how long it takes to complete 1% of a Gromacs WU, and your processor stats!
(I've got a few, but none of them are on gromacs right now... )
The test was done with Northwood P4 3.06GHz (512K L2, 533 FSB) running at 3.45GHz@600 FSB, on an MSI GNB-Max board (i7205 chipset) with dual channel 1GB DDR300 memory.
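To put those overclock numbers in perspective for the FSB question, here is a small sketch. It assumes the quoted 533 and 600 FSB figures are quad-pumped (effective) rates, so the base clock is FSB / 4; the multiplier is inferred from the quoted speeds rather than stated in the post, and the function name is hypothetical.

```python
def overclock_summary(stock_ghz, stock_fsb, oc_ghz, oc_fsb):
    """Return (clock gain, FSB gain, implied multiplier) for an overclock.

    Assumption: FSB figures are quad-pumped effective rates in MHz,
    so the base clock is FSB / 4. The multiplier is inferred, not quoted.
    """
    clock_gain = oc_ghz / stock_ghz - 1
    fsb_gain = oc_fsb / stock_fsb - 1
    multiplier = round(oc_ghz * 1000 / (oc_fsb / 4))
    return clock_gain, fsb_gain, multiplier


clock_gain, fsb_gain, mult = overclock_summary(3.06, 533, 3.45, 600)
print(f"clock +{clock_gain:.1%}, FSB +{fsb_gain:.1%}, multiplier x{mult}")
```

So the 52 and 87 minute timings came from a chip running roughly 13% over stock on both core clock and FSB, which is worth keeping in mind when comparing against stock machines.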
In short, yes.
In the current run of Gromacs it doesn't seem to matter much, but when the QMDs were out, they would cripple a machine because they took sooooo much memory bandwidth (and wouldn't readily give it back up).
This was on a dual Xeon (2.9 currently) with HT. The FSB was bumped from 100 to 223.
On this set-up I could run 1 QMD per proc, with a standard WU on the other thread.
Jon and Sally could prob. give you the info as most of their farm is HT Intels. (IIRC)
With the QMD proteins, a 600 series P4 would run 2 instances of QMDs with 1GB of RAM. The 2MB of cache not only made them run a lot quicker, but also allowed the second instance to run at the same time; the cache on the CPU has a big effect on the times of the memory-hungry proteins.
Datsun 1600
I think we need to start working out a priority list for parts...seems like we have enough smart people here that we can figure it out. I'll have a go, knowing full well that it isn't quite right...maybe...
I'm moving this to a new thread, too important to bury in the challenge here! :headbange
And a bloody ripper CPU that has turned out to be. That and its mini-me 530 brother.
Comments
Of course my production is in the ****ter. It looks like I have had a couple of boxes go offline in my absence. Sometimes you can't win.
Hi mirage,
We are running strictly on single-core CPUs; the dual-core CPUs will not come in until after Xmas.
As for running 1 versus 2 instances, I find a gain of approximately 20% by running 2 instances, unless of course it is a QMD protein, in which case we would run 1 instance per PC.
Shal
NICE!!!
that's 32 instances of FAH - GREAT JOB!!!!!
no kidding, we're doing great! Let's keep this momentum going!
Active processors (within 50 days): 257
ehhh, I dunno... I rolled out to those machines before the thread started...
anyways, it looks like we've got some T93 team members really stepping up! Don't think we'll need to count those machines of mine!
E - Encourages
A - Awesome
M - Momentum
:bigggrin: thanks Q...
I think today I'll try to answer all helpdesk calls either in acrostic or haiku.
(I've done it before!)
I loaded up the server that is in our apartment; it should be crunching away right now! :bigggrin:
nice! added.
every little bit helps. added.
NICE! added.
Also added 2 x Dual Cores... Guess that means 4.
So... mark me down for 6 CPUs!!!
nice!! added.