Performance hit every 25 seconds.


csimon
10 Jan 2004, 5:08am
Ok ...I'm just noticing this but perhaps it's been happening all along.
I notice that while folding, my performance meter drops from 100% to 0% about every 25 seconds or so. Can anyone else verify whether this is or isn't happening to them?
If it isn't, does anyone have any suggestions as to what may be causing this?

using FAH4Console.exe -forcesse and core_78 v1.55
wtf? :fold:

a2jfreak
10 Jan 2004, 5:22am
what does the actual F@H executable do? (in the processes tab)

Slick
10 Jan 2004, 5:30am
I believe the program is saving the progress to disk every time the usage drops down. I am not sure, though.

EOC_Jason
10 Jan 2004, 5:31am
Do you have some sort of system monitoring script?

mmonnin
10 Jan 2004, 5:39am
I got something similar when a frame went by.

That's also why you run a genome in the background, which I realized I didn't set up after I got my NF7.

TBonZ
10 Jan 2004, 6:02am
I believe the program is saving the progress to disk every time the usage drops down. I am not sure, though.

That is absolutely correct, the drop intervals are due to checkpoint writes.

csimon
10 Jan 2004, 8:34am
what can I do to eliminate that?

Here is my client.cfg:

[settings]
username=csimon
team=93
asknet=no
machineid=1
local=22

[http]
active=no
host=localhost
port=8080
usereg=no

[clienttype]
type=1

[core]
checkpoint=15

If I set my checkpoint to checkpoint=0, will that suppress the function?
never mind ...no matter what I set it to, it still does it.

TheBaron
10 Jan 2004, 8:42am
That's also why you run a genome in the background, which I realized I didn't set up after I got my NF7.

wait ... huh? care to elaborate on this?

csimon
10 Jan 2004, 8:43am
That is absolutely correct, the drop intervals are due to checkpoint writes.
Can I prolong or eliminate the intervals?

csimon
10 Jan 2004, 8:59am
Do you have some sort of system monitoring script?
just coolmon but it does it with coolmon off and I have no scheduled tasks.

csimon
10 Jan 2004, 9:02am
what does the actual F@H executable do? (in the processes tab)
drops to 0% then goes right back up to 100% again ...only for an instant every 25 secs approx.
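
One way to put a number on "every 25 secs approx" is to log CPU readings and measure the gap between the starts of successive dips. The following is only a minimal Python sketch: the function name `dip_intervals`, the 10% threshold, and the sample readings are all made up for illustration and do not come from any F@H tool.

```python
def dip_intervals(samples, threshold=10.0):
    """Return the gaps (in seconds) between the starts of successive CPU dips.

    samples   -- list of (seconds, cpu_percent) readings in time order
    threshold -- readings below this percentage count as a dip (assumed value)
    """
    starts = []
    in_dip = False
    for t, cpu in samples:
        if cpu < threshold and not in_dip:
            # First reading of a new dip: remember when it began.
            starts.append(t)
            in_dip = True
        elif cpu >= threshold:
            in_dip = False
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Hypothetical readings, one per second, with a dip to 0% at t = 0, 25 and 50.
samples = [(t, 0.0 if t % 25 == 0 else 100.0) for t in range(60)]
print(dip_intervals(samples))
```

If the gaps come out close to constant, the dips are tied to something periodic in the work itself rather than to random background activity.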

t1rhino
10 Jan 2004, 3:20pm
Install another instance. :D

TBonZ
10 Jan 2004, 3:21pm
Sorry Chris, it was late and I completely missed the 25 sec thing, brain fart I guess. :scratch:

I just looked at my client and it's behaving the exact same way. lsevald brought this up a long time ago when Gro's were fairly new, but I cannot remember if the intervals were this short or if the intervals were strictly between checkpoints, which would be a matter of minutes, not seconds.

This would be a good topic to post at the community forum as I am now very interested in why this is happening.

mmonnin
10 Jan 2004, 3:56pm
I still have V3.25 and it does it only when finishing a frame. The checkpoint setting is not in V3, but the checkpoints are at a minimum of every 3 minutes, so this 25 sec thing can't be that.

You are finishing a frame every 25 seconds, are you? :)

TBonZ
10 Jan 2004, 4:09pm
I'm also using V3.25 and my checkpoints with this protein are 13-14 minutes.

a2jfreak
10 Jan 2004, 4:31pm
I just checked and it's doing it on my system too.
I'm fairly certain that my system did not do this before I switched from core v.1.5.4 to 1.5.5. Anyone else who is experiencing this, which core version are you using? If the consensus seems to be v1.5.5, then I think I might go back to v.1.5.4 just to double check.

csimon
11 Jan 2004, 5:04pm
marc or terry can you post a copy of your v3.25 client.cfg ...something may be missing from v4.0 ...I'm mainly looking for cpuusage=

mmonnin
11 Jan 2004, 5:18pm
[settings]
username=mmonnin
team=93
asknet=no
machineid=1
local=371

[http]
active=no
host=localhost
port=8080
usereg=no
usepasswd=yes

[clienttype]
type=1

[core]
priority=96
cpuusage=100
disableassembly=no
ignoredeadlines=no

This is running at low priority since there is a genome in the background at idle.

csimon
11 Jan 2004, 5:20pm
notice anything odd?

[settings]
username=csimon
team=93
asknet=no
machineid=1
local=28

[http]
active=no
host=localhost
port=8080
usereg=no

[clienttype]
type=1

[core]
checkpoint=30

I'm left to assume that with the new 4.0 client, [core] settings are written to client.cfg only if they are set to something other than the default. I manually added cpuusage=100 and it made no difference.

maybe it is the core? or core + v4client
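
The two [core] sections posted above can be compared mechanically instead of by eye. This is a rough Python sketch using the standard configparser module; the helper names `core_settings` and `compare_keys` are invented for this example, and it only reports which keys appear in one file and not the other. It says nothing about what the client does with them.

```python
import configparser

# [core] sections copied from the two client.cfg files posted in this thread.
V325_CFG = """
[core]
priority=96
cpuusage=100
disableassembly=no
ignoredeadlines=no
"""

V4_CFG = """
[core]
checkpoint=30
"""


def core_settings(cfg_text):
    """Parse INI-style text and return the [core] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return dict(parser["core"]) if parser.has_section("core") else {}


def compare_keys(first, second):
    """Return (keys only in first, keys only in second), each sorted."""
    return sorted(set(first) - set(second)), sorted(set(second) - set(first))


only_v325, only_v4 = compare_keys(core_settings(V325_CFG), core_settings(V4_CFG))
print("only in v3.25:", only_v325)
print("only in v4.0: ", only_v4)
```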

t1rhino
11 Jan 2004, 6:30pm
delete your client.cfg and start the console with the -config switch.
re-enter team #93, userid=t1rhino,
cpuusage=100 :D

mmonnin
11 Jan 2004, 6:41pm
Yeah I think what t1rhino said will fix it.;)

I have v1.55 core.

t1rhino
11 Jan 2004, 7:31pm
I have core 1.54. How do I get core 1.55?

Straight_Man
11 Jan 2004, 7:52pm
Well, as far as genome, I am not sure. As to the Gromacs client, look in a file called client.cfg; the checkpoint interval is the number of minutes given after checkpoint=. HOWEVER, what you see is probably not that checkpoint var. What you might be seeing, which was done in the newer core and client together, is that the workunit.cp file is revised after a frame with Tinkers and after a percent or frame with Gromacs WUs. What you are seeing, imho, is a read/write cycle by frame: the completed frame is written to the .cp file (work progress archive) and the next frame is read in. It is the core that is pulling the load, and it does not draw as much load while writing to and reading from the HD as it does while calculating.

If you play with this, you will have to disassemble the client and the core and build in a workspace in RAM to store a percentage worth of work in. Essentially you will be losing up to one percent every time the client or core hangs this way. They did this simply so the client would not have to backtrack if, say, Windows crashed, as opposed to someone doing an orderly shutdown before rebooting and Windows not hanging. The new core has never backtracked more than 1 percent on anything I have let it run, and same for Tinkers, except there it now is never more than one frame, so most of this is in the client, though the core does not exit. The client would have to be rewritten, and Guha at Folding did most if not all the work on the current core and client, or co-ordinated that. Talk to Guha, that is the best way to find out, if you have a hyper-stable box.

One reason I think this is so is that I get the same pattern out of my F@Hs here, a very tiny drop in time for part of a percent, like you do, and it does not relate to the checkpoint= numbers, as it is a tiny short time, and the checkpoint in client.cfg on both clients (Linux and Windows, same release versions but appropriate for the O/S as to client) is set to 15 MIN cycles. Most folks would rather take 4% of the time to write and lose less unsaved work than have to recalc if the box dies or locks, because they have other than perfectly stable boxes. In theory, if you wanted the client to grab much more RAM, you could put the accumulation into RAM for 1-5% of work instead of 1-500 FRAMES of work (depending on the size of the WU; the new biggest ones run 500 steps per percent, and I do not know if those are using one FRAME per step or not), but I think you might have fun rewriting what would be needed to get your client to go always-on all the time without giving it a priority that would override normal use of the computer. My graphical client is set to run CPU 100%, and except for writing a chunk to HD and then setting up the next portion of work in RAM, it does. But, with the new client and core, I have never lost 15 min of work.... Even by trying.... :D

John.
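
The mismatch between the checkpoint= setting and the dips is easy to show with the numbers from this thread. A small Python sketch, assuming the value after checkpoint= is in minutes as described above; the function name `checkpoint_seconds` and the 15-minute fallback are assumptions made for this example, not documented client behaviour.

```python
import configparser

# [core] section as posted earlier in the thread.
CLIENT_CFG = """
[core]
checkpoint=15
"""

OBSERVED_DIP_SECONDS = 25  # approximate interval reported in the thread


def checkpoint_seconds(cfg_text, default_minutes=15):
    """Return the checkpoint interval in seconds.

    default_minutes is an assumed fallback used when [core] or checkpoint=
    is missing from the text.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    minutes = parser.getint("core", "checkpoint", fallback=default_minutes)
    return minutes * 60


interval = checkpoint_seconds(CLIENT_CFG)
print("checkpoint interval:", interval, "s")
print("ratio to observed dips:", interval / OBSERVED_DIP_SECONDS)
```

With checkpoint=15 the interval is 900 seconds, which is 36 times longer than the roughly 25 seconds between dips, so the setting alone cannot account for them.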

t1rhino
11 Jan 2004, 7:53pm
Just started watching task manager, and it seems that cpu usage goes down to zero at the completion of each step.

csimon
11 Jan 2004, 8:09pm
this is what I get from a fresh config

gtghm
11 Jan 2004, 9:47pm
Mine does this too; however, I doubt that it's anything to worry about.

Straight_Man
11 Jan 2004, 11:07pm
When data is handed off to the client to be written to either HD or a client workspace in RAM for accumulating steps equal to a "write to HD chunk", the core suspends until the new step is in RAM, basically. The core is what pumps your CPU usage, when it is actively calculating results. Nothing to worry about, normal for this software set.

John.
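
To see why a brief pause per step is nothing to worry about, the compute/pause cycle described above can be turned into a rough average. This Python sketch is purely illustrative: the function name `average_cpu_percent` is made up, the 25-second compute phase comes from the interval reported in this thread, and the half-second pause is a guess, not a measurement.

```python
def average_cpu_percent(compute_seconds, pause_seconds):
    """Average CPU usage over one cycle of full-load compute followed by an idle pause."""
    cycle = compute_seconds + pause_seconds
    if cycle <= 0:
        raise ValueError("cycle length must be positive")
    return 100.0 * compute_seconds / cycle


# Assumed numbers: 25 s of calculation, then a 0.5 s pause while the step is handed off.
print(round(average_cpu_percent(25.0, 0.5), 1))
```

Under those assumed numbers the dip costs only a couple of percent of CPU time, even though it shows up as a momentary drop to 0% on the meter.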

csimon
12 Jan 2004, 9:02pm
strange but this doesn't seem to be happening on any instances running on my P4's ...only the one instance on the AMD xp3000+/400

csimon
13 Jan 2004, 2:25am
:clap: magnificent ...it must have been the particular protein ...and now my processor is running hotter than hell on the new wave of gromacs. :clap: