Problem getting new WU.
Esso
Stockholm, Sweden
Hi,
Yesterday my folding stopped due to a problem with one of the servers.
It didn't respond to the request for a new WU.
Initially I had trouble just reporting the completed WU, but after a while it went through.
Nevertheless, I couldn't get a new WU, and that halted my folding.
I solved the problem by copying FAH504-Console.exe to a new directory
and reconfiguring the folding parameters.
After that it requested a new WU from a different folding server (171.64.122.120).
So please make sure that your current folding servers are up and running.
Here is the log from the failing server, 171.64.122.112 (just 15 minutes ago):
[15:55:23] Loaded queue successfully.
[15:55:23] + Benchmarking ...
[15:55:25] The benchmark result is 6024
[15:55:25] - Preparing to get new work unit...
[15:55:25] - Autosending finished units...
[15:55:25] + Attempting to get work packet
[15:55:25] Trying to send all finished work units
[15:55:25] - Will indicate memory of 1023 MB
[15:55:25] + No unsent completed units remaining.
[15:55:25] - Connecting to assignment server
[15:55:25] - Autosend completed
[15:55:25] Connecting to http://assign.stanford.edu:8080/
[15:55:27] Posted data.
[15:55:27] Initial: 40AB; - Successful: assigned to (171.64.122.112).
[15:55:27] + News From Folding@Home: Welcome to Folding@Home
[15:55:27] Loaded queue successfully.
[15:55:27] Connecting to http://171.64.122.112:8080/
[15:55:31] - Couldn't send HTTP request to server
[15:55:31] (Got status 503)
[15:55:31] + Could not connect to Work Server
[15:55:31] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[15:55:39] ***** Got a SIGTERM signal (2)
[15:55:39] Killing all core threads
Comments
~FA
At least mine was only down for a couple of hours, until I had enough and created a new folding directory.
I'm using an Opty-165 (dual-core processor) with two folding directories, configured as units 1 and 2 respectively.
When unit 2 couldn't proceed, I halted it and created a new folding directory, configured as unit 3,
so that it would not get the same WU that couldn't be reported.
In this way my computer keeps on folding.
Later, people who had problems reporting their finished WUs can do so once Stanford fixes the server.
Please add -oneunit when doing this; then you will not be assigned a new WU again for this folding directory ...
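The workaround described here can be sketched as a short script. This is only an illustration: the directory names are examples, not from the post, and while -oneunit is quoted above, -configonly (the classic console's "configure and exit" flag) is an assumption about how the reconfiguration step was done.

```shell
# Sketch of the "fresh directory" workaround for the classic 5.04 console
# client. OLD_DIR is the directory stuck waiting on the failing server;
# NEW_DIR is the fresh one that will fetch from a different server.
OLD_DIR="$HOME/fah/unit2"
NEW_DIR="$HOME/fah/unit3"

mkdir -p "$NEW_DIR"
if [ -f "$OLD_DIR/FAH504-Console.exe" ]; then
    cp "$OLD_DIR/FAH504-Console.exe" "$NEW_DIR/"
fi
echo "prepared $NEW_DIR"

# In the new directory: reconfigure (new machine ID, e.g. 3), then fold:
#   ./FAH504-Console.exe -configonly
#   ./FAH504-Console.exe
# Keep the old directory around and run it with -oneunit, so it only
# reports its stuck WU once the server returns, without fetching another:
#   ./FAH504-Console.exe -oneunit
```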
It has been down for ~36 hours now.
I received this timeless WU (249 points) from (171.64.122.120).
So I'm currently running two of those ...
[12:21:36]
[12:21:36] - Couldn't get size info for dyn file: work/wudata_01.dyn
[12:21:36] Starting from initial work packet
[12:21:36]
[12:21:36] Protein: p1112_L939_K12M_nat_min1_355K
[12:21:36] - Run: 45 (Clone 54, Gen 20)
Edit,
Stanford should improve the way WUs are assigned: if a client can't report to, or get new WUs from, a failing server, it should switch to another server after three failed attempts.
I mean, if 10% of all folding machines are stopped because of this, they will lose a lot of computing power.
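The failover policy suggested here could look something like the sketch below. fetch_wu is a stub that always fails, standing in for the client's real work-request logic (which is not public in the post); the two addresses are the servers named above.

```shell
# Sketch of "switch servers after 3 failed attempts".
fetch_wu() {            # placeholder for "request a WU from server $1";
    false               # stubbed to always fail for this illustration
}

assign_wu() {
    for server in 171.64.122.112 171.64.122.120; do
        for attempt in 1 2 3; do
            if fetch_wu "$server"; then
                echo "got WU from $server"
                return 0
            fi
        done
        echo "3 attempts to $server failed; trying next server"
    done
    echo "all servers failed"
    return 1
}

assign_wu || true
```

With the stub, the loop walks through both servers and reports that all failed; with a real request in fetch_wu, a single dead work server would cost at most three attempts instead of halting the client.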
Also some folders might be upset, but I will not be frustrated because ...
I'm now up to EIGHT machines just twiddling their digital thumbs...
Esso is right, though: just one server being down like this (a timeless server) costs Stanford money, because it takes hundreds of GHz of computing power and leaves it idle.
~FA
Cribbed from the Folding Community Forums:
For those of you using the console version, there is a quick guide to making changes on our main FAH page. Look for the item named Reconfiguring The FAH Console.
LINKY
bikerboy
Though I have yet to thank Pette Broad for the advice, we changed about 4 comps using all our usual flags. At least we don't have 10 procs sitting idle now; it was becoming extremely frustrating.
I'm sure it is not just an overload of WUs affecting server 112. It had been running quite well for a long time, and the problem started about three or so days ago. As Mudd said, it's FUBAR.
Jon
EDIT: nope, it's 200 PPD now that no one is using the computer. Not too shabby for a GbGromacs WU on an old Athlon, is it!
I guess that the folding was talking to me. You need to get up and fix this ... :tongue2:
I changed to the standard client, as Leonardo did, and it's now folding p2505, worth 200 points.
Will report the ppd in this post later.
Now it's time for coffee, make that black, very black.
Server 171.64.122.112 has now been down for ~54 hours.
Edit,
p2505 is doing 15m47s to 15m57s per frame depending on load: ~183 ppd (~366 ppd using two cores).
Timeless projects execute at ~153 ppd (~306 ppd using two cores), so this is even better ...
The P3-500 MHz needs 97m per frame running p2502, so one Opty-165 core is ~6 times faster.
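As a quick sanity check of these figures, assuming the usual 100 frames per work unit (an assumption; the post doesn't state the frame count):

```shell
# Verify the ppd and speedup figures from the post.
awk 'BEGIN {
    points  = 200                  # p2505 credit
    frame_s = 15*60 + 47           # 15m47s per frame, best case
    wu_h    = 100 * frame_s / 3600 # hours per work unit at 100 frames
    printf "%.0f ppd per core\n", points * 24 / wu_h
    printf "%.1fx vs P3-500\n", (97*60) / frame_s  # 97m/frame on the P3
}'
```

That comes out to 182 ppd per core at the best-case frame time, consistent with the ~183 ppd quoted, and a ~6.1x speedup over the P3-500, matching the "~6 times faster" estimate.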