How often should this site poll stats for folding???
Straight_Man
Geeky, in my own way · Naples, FL · Icrontian
I think that as Stanford gets more and more fast boxes folding, they will have problems with increasing bandwidth costs. We as team folders can help by not expecting really fast updates at our sites.
Let's say a site grabs the 7 MB user stats text file every three hours. That is 8 downloads, or 56 MB of load on Folding's admin network, every day from that one site. How many team sites are out there??? How many grab stats many times daily??? My guess is 500-700 minimum, not counting individual CoolMon users. So, how many GIG go out just for stats per DAY???? Let's take 600 sites worldwide, and I bet I am being conservative:
So, 56 MB times 600 = 33,600 MB (about 33.6 GB) daily that the stats server serves just to sites so we can all see updates. That does not count extra bandwidth from individual hits on Folding's stats server. I think we should cut down the number of times per day we demand to see updates here, and I wonder what each of you thinks. So the poll starts with a minimum of 3 hours between fetches from Stanford, then 6 hours, then 9, then 12. The options go in units of three because that is how often Stanford is updating their text file for us today.
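For anyone who wants to play with the numbers, here is a minimal sketch of the same arithmetic for each poll option. The 7 MB file size and the 600 sites are the guesses from this post, not measured figures, and `daily_load_mb` is just a name made up for the example.

```python
FILE_MB = 7    # size of the user stats text file (figure from the post)
SITES = 600    # guessed number of team sites (figure from the post)


def daily_load_mb(poll_interval_hours, file_mb=FILE_MB, sites=SITES):
    """Total MB per day served to all sites at the given poll interval."""
    polls_per_day = 24 / poll_interval_hours
    return polls_per_day * file_mb * sites


# Compare the four poll options: 3, 6, 9 and 12 hours between fetches.
for hours in (3, 6, 9, 12):
    mb = daily_load_mb(hours)
    print(f"every {hours:>2} h: {mb:>8,.0f} MB/day (~{mb / 1000:.1f} GB)")
```

Under these assumptions, going from 3 hours to 6 hours halves the daily total, and 12 hours cuts it to a quarter.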
What do you think???
John.
Comments
Keebler should know the way it works, though, as he maintains it.
Second, any drop in load on the stats server makes it more likely that Folding will not have to massively upgrade the stats server to handle increases in stats volume as more and more folks fold.
Side benefit for this site: it does not have to process and fetch stats as often that way. I figured a poll plus a thread could make this more understandable and obvious, not just to this site's users but to others also.
John.
Right now I think it's at 2.5 hours, slightly staggered so it won't hit right when Stanford is updating and everyone else is checking. I guarantee you, we don't pull that much off of Stanford every day. Not as much as we could.
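To illustrate the staggering idea, here is a rough sketch of one way to offset each fetch. It is not how this site's stats script actually works: the random offset and the 10 minute cap are assumptions made up for the example, and only the 2.5 hour base interval comes from the post.

```python
import random

BASE_INTERVAL_S = 2.5 * 3600   # 2.5 hours between fetches (from the post)
MAX_JITTER_S = 10 * 60         # hypothetical cap: up to 10 minutes of offset


def next_poll_delay(rng=random):
    """Seconds to wait before the next fetch: base interval plus a random offset."""
    return BASE_INTERVAL_S + rng.uniform(0, MAX_JITTER_S)


# Example: one delay value; a scheduler would sleep this long, then fetch.
print(f"next fetch in {next_poll_delay() / 60:.1f} minutes")
```

A fixed offset would also keep a site off the top of the hour; the random version just avoids many sites settling on the same offset.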
Stats sites like EOC really help because all they have to do is pull both links once and they have everything they need.
If you want to help, look at a site like theirs or statsman.
Yup, I thought my guess of 600 was way low. Google pulled 32,800 hits on "Folding +User +Stat", which turned up forum after forum with a stats page (not just sigs). It takes one site to start a trend, so why not one of the top ten teams????
Reductio ad absurdum: If the stats were eliminated altogether, would Folding participation go up, go down, or stay the same?
It costs companies money to do payroll calculations every pay period, but they would soon lose every employee they had if they stopped doing it.
And most of these companies only pay their employees twice a month not every single hour...
John.
I think a lot of people's folding is driven by stats, and anything that keeps more people interested in folding is good...
And the message on the Folding site says nothing about bandwidth being a problem. It says people are downloading the stats INCORRECTLY, using the CGI pages rather than just downloading the text files (they even suggest the text files and say that is the right way to do it), so I don't see the problem with updating....
Camman makes a good point - if bandwidth alone was the problem, I think Stanford would update the stats for all of the hundreds of thousands of Folders less often than every 2.5 hours.
I Fold for my parents, who both have significant health problems, and for those like them. I wouldn't stop even if there were no stats at all. I do think having frequent stat updates builds interest, resulting in more people Folding - and more raw data for Stanford to work with.
Another project keebler?
(Referring to myself)
I'm hoping the guy wins the Lottery, so he can drop out of school and work for S-M full-time.
(Just don't lose that ticket!)
But prof's got a valid point about the folks that are just in it for the glory of points wanting to see their stats updated often - I don't think there are too many of THAT type on this team, though.
I think that pretty much invalidates the poll as 2.25MB a day is nothing.
If it helps, the user stats file is organized this way:
The user with the highest total points is on top. (To match that order, bubble sort a local user name table, keyed on total accumulated points, taken from the active table used for sigs, and use it as the update-match work table.)
The fields in Folding's stats text file are TAB delimited, and each record (field set) is terminated by CR/LF or LF. Record 1 is the time and date stamp. Record 2 is the field header names.
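A minimal sketch of reading a file laid out that way, assuming only what is described above (TAB-delimited fields, record 1 a time stamp, record 2 the header names). The function name `parse_stats` and the sample column names are made up for illustration; the real header names may differ.

```python
import io


def parse_stats(stream):
    """Return (timestamp, headers, rows) from a TAB-delimited stats file."""
    # Strip either CR/LF or LF line endings.
    lines = (line.rstrip("\r\n") for line in stream)
    timestamp = next(lines)              # record 1: time and date stamp
    headers = next(lines).split("\t")    # record 2: field header names
    # Remaining records: one dict per user, keyed by header name.
    rows = [dict(zip(headers, line.split("\t"))) for line in lines if line]
    return timestamp, headers, rows


# Hypothetical sample data; not the real file contents.
sample = (
    "Mon Jan 1 00:00:00 2003\r\n"
    "name\tscore\twu\tteam\r\n"
    "alice\t900\t30\t1\r\n"
    "bob\t500\t12\t1\r\n"
)
timestamp, headers, rows = parse_stats(io.StringIO(sample))
print(timestamp, headers, len(rows))
```

Values are left as strings here; a real script would convert the numeric columns once it knows which ones they are.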
So, for one easy parsing:
Sort an SQL output table of local users by points accumulated, high to low (bubble sort, as above, keyed on total accumulated points).
Use the user names from the table just made to seek user matches. They will occur in close to the same order as the Folding master user stats file, so you can avoid individual queries and build a table in 2-3 passes instead of searching the whole file once per user. Build a site-local sig data update table with this set of passes.
Now, top to bottom, pass through the table three times, not updating data that matches. Or use CSV (CVS??? I have seen BOTH used in different parts of the world), which uses byte-by-byte diffing to determine what needs replacing and leaves identical bytes alone, to sync in a single pass and replace the old table with the changes. Use the end result of the sync as the table used for sigs.
That is a top-level, quick and dirty parsing spec. The reason for multiple passes in the first run through what the local server grabs is to catch order mismatches: you are using the local table to generate the order, while the new table is in order as of the text file's time and date stamp, which is record one of the file. Record two is the headers telling what data is in the corresponding fields.
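Here is a rough sketch of the matching step. Note that it swaps in a different technique from the ordered multi-pass walk described above: it puts the local user names in a set and filters the stats rows in a single pass, so row order does not matter. The function names and the `name` field are hypothetical, and rows are assumed to already be parsed into dicts.

```python
def build_update_table(local_users, stats_rows, name_field="name"):
    """One pass over the stats rows, keeping only users tracked locally."""
    wanted = set(local_users)
    return {row[name_field]: row for row in stats_rows if row[name_field] in wanted}


def changed_rows(old_table, new_table):
    """Users whose row differs from the stored copy (new users included)."""
    return {name: row for name, row in new_table.items() if old_table.get(name) != row}


# Hypothetical data for illustration only.
local_users = ["alice", "carol"]
stats_rows = [
    {"name": "alice", "score": "900"},
    {"name": "bob", "score": "500"},
    {"name": "carol", "score": "120"},
]
old_table = {
    "alice": {"name": "alice", "score": "850"},
    "carol": {"name": "carol", "score": "120"},
}
new_table = build_update_table(local_users, stats_rows)
updates = changed_rows(old_table, new_table)
print(sorted(updates))
```

Only the rows in `updates` would need to be written back to the sig table. If the same user name can appear on more than one row (for example on different teams), the last row wins in this sketch, so a real version would probably need to key on team as well.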
John.
I thought there would not be a lot who were worried and needed updates for their own needs or wants, and that most of us use the points system to tell us when our own boxes are folding slow or fast, and I wanted to see if others felt as we did. The poll was valid: I wanted to probe why folks here look at points and want points, hoping other admins on other sites would do the same thing. Since this is S-M, I figured a thread could bring out real needs also.
The real load on this site's server is partly background work that has to be done. Overall resources available are, in essence, feed plus admin bandwidth, and as the site becomes more popular the feed's share will need to grow and the background share to drop, unless we want to throw hardware improvement funds into the site. So, if we trim the background work to use minimal resources (processing capacity per second), the site will stay stable longer under heavier feed loads, without needing to upgrade the physical server, which has been done three times including the initial server build, AFAIK. Benefits are long term here as well as short term. I care enough to want a stable site server here, and one way to get that is to minimize background work here.
Users drive a forum site to a large degree (within the constraints of reality, which is why there are rules). I wanted real figures as to how many thought what, and to see if the wants were really for 2.5 hour updates overall, or if less often could work either now or later. So far, a vast majority would accept updates half as often.... The poll and thread did what I wanted them to do. I think I timed the poll part to close in two weeks.
Also, if folks later ask why we do not get points more often, this thread will serve as the answer. Folding at Stanford, as more and more people fold, will take manifoldly more load than each individual site: it takes the sum of the load of all users' and sites' requests. Hardware is expensive, even if older and slower machines that draw less power are also used for single tasks. And power in CA is getting hyper-expensive. So, taking load off there lets them use more of a finite budget to get results out into the world faster.
Geeky1 even showed us this: he runs an older box to serve printing, and routers, hubs, and small switches are cheap. I got my broadband access and firewall router for $33.50 after rebate, by determining needs and concentrating money on the boxes and not the routing.
Part of what bogs the site down is the limited pipe the site works through. As load increases, servers break because equal-priority processes conflict for resources in any one second, and de-emphasizing whatever can be de-emphasized will let processes run less often. That is one reason Linux and alternative OSes are used for servers: they can run on limited resources and are so modular that you can set up LANs behind firewalls, with older boxes performing smaller sets of tasks and networking their work to a web feed server.
John.