I have ...scary, isn't it?
Seems like, when it's all said and done, the program is only getting about 25% CPU usage.
Straight_Man · Geeky, in my own way · Naples, FL · Icrontian
edited March 2004
Well, they say proteins fold in tiny fractions of a second, and this WU appears to have that part down pat... they ARE folding, regular as a heartbeat. Might have to do with the CPU being so dang effective that the WU leaves lots to SPARE for other things. But I think the CPU monitor was actually calibrated for a SINGLE-pipe CPU, so to IT, each pipe WOULD look like half of a 6 GHz CPU... the CPU monitor is confused, methinks.
BUT, the heartbeat shows that 3 GHz is overkill for that WU. Or the box was busy doing other things when the snapshot was taken? Or both?
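The point about the monitor treating each hyperthreaded pipe as half the chip can be sketched in a few lines. This is a minimal illustration with assumed numbers, not how Task Manager actually samples: overall CPU % is just the average across logical CPUs, so one client pegging one of two logical CPUs reads as 50%.

```python
# Sketch: how overall CPU % relates to per-logical-CPU load on a
# hyperthreaded P4 that Windows sees as 2 logical CPUs.

def overall_usage(per_logical_loads):
    """Overall CPU % as a monitor reports it: the mean across logical CPUs."""
    return sum(per_logical_loads) / len(per_logical_loads)

print(overall_usage([100, 0]))    # one client pegging one logical CPU -> 50.0
print(overall_usage([100, 100]))  # two clients, one per logical CPU -> 100.0
```

Which is why a single non-HT-aware client can never push the graph past 50% on an HT box, no matter how hard it works.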
I would have thought that was an abnormality too, until I checked. I've always had steady upper-90s CPU usage with other cores. I checked Task Manager and had this. It has never acted this way before. Norton AV, ZoneAlarm, and EM-DC were the only programs running in the background.
I've got one running on mine right now, but with the other two instances going on different Gromacs WUs I don't notice any oscillations. I do see my disk getting accessed quite often, though.
Leonardo · Wake up and smell the glaciers · Eagle River, Alaska · Icrontian
edited April 2004
Two instances of P935_LG2 in Water running. Both the physical CPU and the second virtual (hyperthreaded) CPU are averaging about 65% usage, according to MBM5. System 1:
I wonder if these are only being distributed to P4s...
Has anyone with an A64 got one yet? I haven't.
Leonardo · Wake up and smell the glaciers · Eagle River, Alaska · Icrontian
edited April 2004
Good question, Bill. I've not received any on my Barton system (No. 2). Neither have any of my P4s at the office downloaded any. If it makes any difference, the office machines are all non-hyperthreaded P4s, with the exception of a P3-based Celeron 850. Two of the office machines are Gateway Centrino 1300 MHz laptops - very nice laptops, but slooowww folders.
I just read that someone ran 5 clients with these WUs and CPU usage still wasn't 100% all the time. I do not recommend this, but someone else said they got an 18% increase in PPD from running a 3rd client, and the 4th added only 2%.
I am on my second P936 using "Extra SSE2 Boost" on my laptop (A64). I guess they're not only for P4s.
[09:05:35] Assembly optimizations on if available.
[09:05:35] Entering M.D.
[09:05:41] Protein: p936_fkfe2_all
[09:05:41]
[09:05:41] Writing local files
[09:05:44] Extra SSE2 boost OK.
[09:05:45] Writing local files
[09:05:45] Completed 0 out of 100000 steps (0)
[09:10:32] Writing local files
[09:10:32] Completed 1000 out of 100000 steps (1)
[09:15:19] Writing local files
[09:15:19] Completed 2000 out of 100000 steps (2)
[09:20:06] Writing local files
[09:20:06] Completed 3000 out of 100000 steps (3)
[09:24:53] Writing local files
[09:24:53] Completed 4000 out of 100000 steps (4)
[09:29:39] Writing local files
[09:29:39] Completed 5000 out of 100000 steps (5)
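From the timestamps in the log above, each 1000-step chunk takes about 4 minutes 47 seconds, which lets you extrapolate a finish time for the whole 100,000-step WU. A rough sketch (it assumes the exact [HH:MM:SS] "Completed N out of M steps" format shown, and ignores midnight rollover):

```python
# Rough ETA from the FAH log excerpt: parse "Completed N out of M steps"
# timestamps and extrapolate linearly.
import re
from datetime import datetime, timedelta

log = """\
[09:05:45] Completed 0 out of 100000 steps (0)
[09:10:32] Completed 1000 out of 100000 steps (1)
[09:29:39] Completed 5000 out of 100000 steps (5)
"""

pat = re.compile(r"\[(\d\d:\d\d:\d\d)\] Completed (\d+) out of (\d+) steps")
points = []
for line in log.splitlines():
    m = pat.search(line)
    if m:
        t = datetime.strptime(m.group(1), "%H:%M:%S")
        points.append((t, int(m.group(2)), int(m.group(3))))

(t0, s0, total), (t1, s1, _) = points[0], points[-1]
per_step = (t1 - t0) / (s1 - s0)     # wall time per MD step
eta = per_step * (total - s0)        # time for the full WU
print(f"{per_step * 1000} per 1000 steps, full WU in about {eta}")
# roughly 4:47 per 1000 steps, so just under 8 hours for the whole WU
```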
I don't know what the difference between P935 and P936 is, but the CPU Usage History graph in Task Manager stays at 100%.
They were supposed to go to P4s and the beta team only. I got one on one of my Athlons yesterday as well. Don't know if it's an assignment server (AS) error or whether they are sending them out to everyone.
I did notice that it was taking all of the CPU, though. Not sure if it's just the SSE2 part that doesn't keep it at 100%. Leo said his P4 was at 100% with 2 clients, for HT. Maybe they fixed it.
Leonardo · Wake up and smell the glaciers · Eagle River, Alaska · Icrontian
edited April 2004
In the thread where I just posted 100% usage, I was reporting on P936 Double Gromacs. Come to think of it, though, I did have a 935 a few days ago. If I remember correctly, it too was using 100% CPU capacity, which on a hyperthreaded machine shows as 50%; the other work unit took the additional 50%.
Double Gromacs WUs use some SSE2, so they will stress a computer more. P935 is very complex as well. Core_78's new version uses some SSE2 too.
Folding, if my occasional reports to Vijay are a sample of what he got, is tuning cores 78 and 79 for some or most SSE2 use so they can do more complex calculations. One reason for this is that the research is going into finer and finer detail, and modeling many factors at once is needed to detect what works for what without harming the body in adverse ways while killing what needs killing. Think about how long it took supercomputers to calculate the Genome. We now have boxes that, working in smaller clusters, calculate as fast as those supercomputers did.
The simpler WUs rule out single factors, or indicate they need to be explored further as factors in more complex models. We folders with very fast boxes are giving Folding a way to accomplish that more complex folding.
BUT, researchers still have many single-factor experiments that are needed to rule out things that will not work -- things that might, but need to be ruled out or put on the back burner for now and explored later in combination with other factors.
So far, cancer treatment has relied on toxics. Folks have been injected or intravenously treated with substances that the treaters have to take extreme precautions with, and those toxics can accumulate. Doctors' oaths say, in essence, to do no more harm while fixing the problem -- so the obvious harm-causers need to be identified and eliminated, as well as having the problems fixed. Tinkerings indicate the need for further research, or something just as valuable: the elimination of dead ends, so that what works can be deeply explored.
To explore these processor-effectiveness effects, they need a benchmark machine that reflects how complex and simple WUs behave on today's mainstream processors. The P4s do not run non-SSE or non-SSE2 WUs at full effectiveness; that is one reason the benchmark machine needs to be a P4. It also needs to be fast, so that many WUs can be benchmarked quickly, new projects can be tested and pushed out to the mainstream faster, and the time expended by the distributed network -- including the core boxes that handle results -- is used more effectively. Power is expensive in California, and lower-voltage boxes tend to give more calculation per watt of the nominal 110 or 230 V AC coming in. They are also tuning for more calculations per watt of incoming utility power versus computing capacity yielded and actually used for good results. So the benchmark machine is being upgraded massively from what it was.
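The "calcs per watt" idea above is just a ratio, but it makes the trade-off concrete. A tiny sketch with made-up numbers (the PPD and wattage figures below are purely hypothetical, not measurements of any real box):

```python
# Toy "points per watt" comparison -- all numbers are made up for illustration.

def points_per_watt(ppd, watts):
    """Points-per-day produced per watt drawn at the wall."""
    return ppd / watts

p4_3ghz   = points_per_watt(200, 150)  # fast, but power-hungry
laptop_pm = points_per_watt(120, 35)   # slower in absolute terms, far more efficient
print(p4_3ghz, laptop_pm)
```

With these assumed figures the laptop produces fewer points per day but well over twice the points per watt, which is exactly the kind of efficiency the tuning aims at.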
MY guess is that Stanford University is putting money into this infrastructure expansion, and that incoming students are interested in exploring a high-powered distributed network. They can get hands-on time and experience credit in IT by helping administer Folding's distributed network. Look at the trends: IBM is selling time on big, powerful servers remote from the customer's offices. IBM gets real-time product testing, proof that performance meets design specs, and money from the computing-time sales. Folding gets better analysis of increasingly complex work this way.
And SSE2 calculations are VERY important to this increasingly complex modeling. It takes many vector calculations to model the movement of molecular chains, and proteins are not simple chains -- they fold differently based on their environment. Pure integer arithmetic (ALU calculations) can only capture coarse effects. Larger amounts of more efficient FPU, or vector, calculation can handle the multivariable combinations that need to be evaluated. They allow chemical variances, in combination, to alter motion in 3D -- and that takes complex vector calculations showing the cumulative effects of many factors at once.
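The "cumulative effects of many factors at once" point can be shown in miniature: each atom's net motion is the vector sum of many pairwise 3D interactions, which is exactly the kind of math SSE2 packs two doubles at a time. This toy spring model is purely illustrative -- it is not the Gromacs force field, and the positions and force law are invented for the example:

```python
# Toy illustration: an atom's net force is the cumulative 3-D vector sum
# of its interactions with every other atom (NOT the real Gromacs core).

def add3(a, b):   return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def sub3(a, b):   return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def scale3(a, k): return (a[0] * k, a[1] * k, a[2] * k)

def pairwise_force(p_i, p_j, k=1.0):
    """Toy spring-like force pulling atom i toward atom j."""
    return scale3(sub3(p_j, p_i), k)

def net_force(i, positions):
    """Sum the vector contributions of every other atom on atom i."""
    total = (0.0, 0.0, 0.0)
    for j, p_j in enumerate(positions):
        if j != i:
            total = add3(total, pairwise_force(positions[i], p_j))
    return total

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(net_force(0, positions))  # cumulative pull from both neighbours
```

Every component here is a multiply-and-add over small fixed-size vectors, the operation SIMD units like SSE2 accelerate.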
Comments
John D.
KingFish
http://forum.folding-community.org/viewtopic.php?t=7385&highlight=
I just checked my XP2500+ Barton and it has one. No SSE2 support.
It doesn't show any kind of "extra boost" in the log file.
I thought the Double Gromacs WUs would only be distributed to processors that support SSE2? I guess not.