How often should this site poll stats for folding???

Straight_Man Geeky, in my own way Naples, FL Icrontian
edited January 2004 in Folding@Home
I think that as more and faster boxes fold for Stanford, they will have problems with increasing bandwidth costs. We as team folders can help by not expecting really fast updates at our sites.

Let's say a site grabs the 7 MB users update txt file every three hours. That is 56 MB of load on Folding's admin net every day from that one site. How many team sites do we have out there? How many grab stats many times daily? My guess is 500-700 minimum, not counting individual coolmon users. So, how many gigs just for stats per DAY? Let's take 600 sites worldwide, and I bet I am being conservative:

So, 56 MB times 600 = 33,600 MB daily that the stats server serves just to sites so we can all see updates.... That does not count extra bandwidth from individual hits on Folding's stats server. I think we should cut down the number of times per day we demand to see updates here, and I wonder what each of you thinks. So the poll starts with a minimum of 3 hours between polls to Stanford, then 6 hours, then 9, then 12. The options are in units of three because that is how often Stanford is updating their text file for us today.
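
A quick sketch to check the arithmetic above. The 7 MB file size, the three-hour interval, and the 600-site guess come from this post; the function name `daily_load_mb` is only illustrative, and the 6, 9, and 12 hour rows simply apply the same formula to the other poll options.

```python
def daily_load_mb(file_mb, hours_between_polls, sites=1):
    """Return MB pulled per day when each site fetches the file at a fixed interval."""
    polls_per_day = 24 / hours_between_polls
    return file_mb * polls_per_day * sites


if __name__ == "__main__":
    # Compare the poll options: 3, 6, 9 and 12 hours between fetches.
    for hours in (3, 6, 9, 12):
        per_site = daily_load_mb(7, hours)
        total = daily_load_mb(7, hours, sites=600)
        print(f"every {hours:>2} h: {per_site:6.1f} MB per site, "
              f"{total:8.0f} MB for 600 sites")
```

Under these guesses, the total scales directly with the number of polls per day, so doubling the interval halves the load.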

What do you think???

John.

Comments

  • a2jfreak Houston, TX Member
    edited January 2004
    I voted for 12 because I don't look to my sig for the most up-to-date information so there's no reason it needs to be extremely accurate, though I seriously doubt S-M changing the frequency will make a difference in the overall bandwidth usage of F@H. It's like a drop in the ocean.
  • edited January 2004
    This is a no-brainer. Resources that can be spent on finding a cure shouldn't be squandered on the ability to have up-to-the-moment statistics. Twice-daily updates should be enough.
  • Enverex Worcester, UK Icrontian
    edited January 2004
    Good thread, but I have a feeling that our stats sigs do not pull even 1 MB (or even 300k) from the server, as all the system needs to do is pull up the teams page, and from that it has all the information for every user's score, WU and rank.

    Keebler should know the way it works though as he maintains it.
  • pseudonym Michigan Icrontian
    edited January 2004
    12 is fine with me. That'll just keep me from clicking the link all the time rather than doing my homework :)
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited January 2004
    What I want to do is post something similar on the Folding@Home community forums. The 33 gigs I talked about? That is about 15,000 big WUs' worth of traffic (about 2-2.5 MB both ways per big WU, varies).... Other admins from other sites also read things here.... Hopefully setting an example will carry over to others when they see the sheer numbers involved. :D

    Second, any drop in load on the stats server makes it more likely that Folding will not have to massively upgrade it to handle the growing volume of stats as more and more folks fold.

    A side benefit for this site: it does not have to fetch and process stats as often that way. I figured a poll plus a thread could make this more understandable and obvious not just to this site's users but to others also.

    John.
  • mmonnin Centreville, VA
    edited January 2004
    I don't think 600 teams do this. Most likely not even 100 do. Remember, it takes someone skilled in programming to do this. It took lsevlad a long time to get something like ours put together.

    Right now I think it's at 2.5 hours, slightly staggered so it won't hit right when Stanford is updating and everyone else is checking. I guarantee you, we don't pull that much off of Stanford every day. Not as much as we could.
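
    A minimal sketch of that kind of staggered schedule, not the site's actual script. The 2.5 hour interval is from the post above; the random offset, the 10 minute window, and the name `next_poll_time` are assumptions made up for illustration.

```python
import random
from datetime import datetime, timedelta


def next_poll_time(last_poll, interval_hours=2.5, max_stagger_minutes=10, rng=random):
    """Return the next poll time: the base interval plus a small random offset.

    The offset keeps this site from fetching at the exact moment every other
    site does. The 10 minute window is a hypothetical value, not a known setting.
    """
    stagger = timedelta(minutes=rng.uniform(0, max_stagger_minutes))
    return last_poll + timedelta(hours=interval_hours) + stagger


if __name__ == "__main__":
    last = datetime(2004, 1, 15, 12, 0)
    print("next poll at", next_poll_time(last))
```

    A fixed offset would work just as well; the random one only spreads several sites out without any coordination between them.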

    Stats like EOC really help because all they have to do is pull both links once and they have everything they need.

    If you want to help, look at a site like theirs or Statsman.
  • t1rhino Toronto
    edited January 2004
    There should be official stats mirror sites where other sites can get the stats from. :)
  • csimon Acadiana Icrontian
    edited January 2004
    t1rhino wrote:
    There should be official stats mirror sites where other sites can get the stats from. :)
    agreed.
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited January 2004
    Well, the file that Vijay said to load for users was 7.56 MB actual size in download: pure text, no hypertext code, one line per user for all users, four data fields per line. It goes in its entirety to everyone who requests it. My figure for sites was based not on full mirrors, but on individual teams tracking their stats from this user file at user level and showing them in some form, be it sigs, a team-only stats page (by user) updated 5-10 times a day, or whatever internal use the file is put to. The full-service mirrors calculate most of what they show; I like EOC, Statsman, and zerothelements stats for different things. But many Linux admins can code PHP and CGI scripts, since scripts are in part what their boxes run on. Let me google something, will be right back....

    Yup, thought my guess of 600 was way low, Google pulled 32,800 hits on "Folding +User +Stat" which got forum after forum with a stats page (not just sigs). It takes one site to start a trend, and why not one of the top ten teams????
  • profdlp The Holy City Of Westlake, Ohio
    edited January 2004
    I voted every three hours. The mild strain on the server due to frequent stat updates is more than offset by the increased interest following the stats brings. In other words, many more people start Folding - and stick with it - due to the competition aspect.

    Reductio ad absurdum: If the stats were eliminated altogether, would Folding participation go up, go down, or stay the same?

    It costs companies money to do payroll calculations every pay period, but they would soon lose every employee they had if they stopped doing it. :wave:
  • edited January 2004
    profdlp wrote:

    It costs companies money to do payroll calculations every pay period, but they would soon lose every employee they had if they stopped doing it. :wave:

    And most of these companies only pay their employees twice a month not every single hour...
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited January 2004
    Our server is not strained, Folding's stats server at Stanford is strained. Per Vijay Pande.

    John.
  • Camman NEW! England Icrontian
    edited January 2004
    Ageek wrote:
    Folding's stats server at Stanford is strained. Per Vijay Pande.

    John.

    I think a lot of people's folding is driven by stats, and anything that keeps more people interested in folding is good...


    And the message on the Folding site says nothing about bandwidth being a problem. It says people are downloading the stats INCORRECTLY, using the CGI pages rather than just downloading the text files (they even suggest it and say it's the right way to do it), so I don't see the problem with updating....
  • profdlp The Holy City Of Westlake, Ohio
    edited January 2004
    tefleming wrote:
    And most of these companies only pay their employees twice a month not every single hour...
    They could save money by only doing payroll once a year, instead of 24 times. I doubt the idea would be popular.

    Camman makes a good point - if bandwidth alone was the problem, I think Stanford would update the stats for all of the hundreds of thousands of Folders less often than every 2.5 hours.

    I Fold for my parents, who both have significant health problems, and for those like them. I wouldn't stop even if there were no stats at all. I do think having frequent stat updates builds interest, resulting in more people Folding - and more raw data for Stanford to work with.
  • edited January 2004
    12 for me. I don't look at the sig info very often.
  • Linc Owner Detroit Icrontian
    edited January 2004
    I seem to recall Lasse mentioning that the stats system does not use the text file, as that was created after our system (I think). In any case, I will review the scripts when I get a minute (they're not on this system) and see precisely what files it calls and what it does with them, but I would venture that our stats system is fairly benign in the bandwidth department. Percentage-wise it may be 1/2 or 1/3 of the usage, but that may be the literal equivalent of splitting a hair.
  • mmonnin Centreville, VA
    edited January 2004
    If it doesn't use the stats text file, it would take some work to make it do so, wouldn't it?

    Another project keebler?
  • Linc Owner Detroit Icrontian
    edited January 2004
    mmonnin wrote:
    Another project keebler?
    It'll have to wait in line with the rest of them :ninja:
  • mmonnin Centreville, VA
    edited January 2004
    Sure, I know there are other things in line already.
  • profdlp The Holy City Of Westlake, Ohio
    edited January 2004
    mmonnin wrote:
    Another project keebler?

    :topic: (Referring to myself)

    I'm hoping the guy wins the Lottery, so he can drop out of school and work for S-M full-time.

    (Just don't lose that ticket!) :eek3:
  • GHoosdum Icrontian
    edited January 2004
    I'm in it for the cause, not the points, so you can guess how I voted.

    But prof's got a valid point about the folks that are just in it for the glory of points wanting to see their stats updated often - I don't think there are too many of THAT type on this team, though.
  • Linc Owner Detroit Icrontian
    edited January 2004
    The sig/milestone system uses approximately 2.25 MB of bandwidth per day. I think Stanford can handle that :)
  • csimon Acadiana Icrontian
    edited January 2004
    Yeah, I say we keep things the way they are unless we get notified by Stanford.
  • Enverex Worcester, UK Icrontian
    edited January 2004
    The sig/milestone system uses approximately 2.25 MB of bandwidth per day. I think Stanford can handle that :)

    I think that pretty much invalidates the poll as 2.25MB a day is nothing.
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited January 2004
    OK, if every admin at a site that pulls stats checks and tries to help, the load on Folding's stats server will drop (load here meaning its total feed bandwidth for handling requests plus its internal needs). When it serves just the text file as a file, it is not using the HTTP port that handles direct requests, among other things. The stats server can and will FTP the .txt file.

    If it helps, the user stats file is organized this way:

    The user with the highest total points is on top. (On our side, bubble sort a user name table taken from the active table used for sigs, with local accumulated total points as the sort key, to serve as an update-match work table.)

    Fields in the Folding stats txt file are TAB delimited, and each record ends with CR/LF or LF. Record one is the time and date stamp. Record two is the field header names.

    So, for one easy way to parse it:

    Sort an SQL output table of our users by accumulated points, high to low (the bubble sort described above, keyed on total accumulated points).

    Use the user names from that table to seek matching users. They will occur in close to the same order as the Folding master user file, so you can avoid individual queries and build a table in 2-3 passes instead of searching the whole file once per user. Build a site-local sig data update table with this set of passes.

    Now pass through the table three times, top to bottom, not updating data that matches. Or use CVS (I have seen it written CSV too, but CSV is the comma-separated file format), which diffs old against new to determine what needs replacing and leaves identical content alone, to sync in a single pass and replace the old table with the changes. Use the end result of the sync as the table used for sigs.

    That is a quick and dirty top-level parsing spec. The reason for multiple passes in the first run through what the local server grabs is to catch order mismatches: you are using the local table to generate the order, while the new table is in order as of the text file's time and date stamp, which is record one of the file. Record two is the headers telling what data is in the corresponding fields.
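
    A rough sketch of the spec above, under stated assumptions. It swaps the multi-pass ordered matching for a single pass with a set lookup, which avoids a search per user in the same way. The layout (record one a time stamp, record two TAB-delimited header names, one record per user) is as described; the function names, the header names in the sample, and the sample rows are all made up, and it assumes each wanted user name appears only once in the file.

```python
def select_team_rows(lines, team_users, name_field):
    """Read the stats text once and keep only the rows for our own users.

    lines      -- iterable of text lines (an open file works)
    team_users -- user names taken from the local sig table
    name_field -- header name of the user-name column (taken from record two)
    """
    lines = iter(lines)
    timestamp = next(lines).rstrip("\r\n")            # record one: time and date stamp
    headers = next(lines).rstrip("\r\n").split("\t")  # record two: field header names
    wanted = set(team_users)
    found = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != len(headers):
            continue  # skip blank or malformed records
        row = dict(zip(headers, fields))
        if row[name_field] in wanted:
            found[row[name_field]] = row
            if len(found) == len(wanted):
                break  # every local user located; no need to read further
    return timestamp, found


def changed_rows(old_table, new_table):
    """Return only the rows that are new or differ from the local table."""
    return {name: row for name, row in new_table.items()
            if old_table.get(name) != row}


if __name__ == "__main__":
    # Hypothetical sample data; the real header names come from record two of the file.
    sample = [
        "time and date stamp goes here\r\n",
        "name\tscore\twu\tteam\r\n",
        "alice\t900\t30\t1\r\n",
        "bob\t500\t20\t2\r\n",
        "carol\t100\t5\t1\r\n",
    ]
    stamp, rows = select_team_rows(sample, ["alice", "carol"], "name")
    print(stamp, sorted(rows))
```

    Only the rows returned by `changed_rows` would then need to be written back to the local sig table.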

    John.
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited January 2004
    Well, telling Stanford what load we put on the server each day would preempt the need and let other admins compare. This poll was put up in part to show how few are really worried about points first as a motivator. Essentially, the points system also lets users check quickly that machines ARE working for their IDs and that they are not hung, so it doubles as an error-checking data provider.

    I thought there would not be many who were worried and needed updates for their own needs or wants, and that most of us use the points system to tell when our own boxes are folding slow or fast, and I wanted to see if others felt as we did. The poll was valid: I wanted to probe why folks here look at points and want points, hoping admins on other sites would do the same thing. Since this is S-M, I figured a thread could bring out real needs also.

    The real load on the server for this site is partly background work that has to be done. Overall resources available are in essence the feed plus admin bandwidth available, and as the site becomes more popular the feed needs will have to grow and the background work drop, unless we want to throw hardware improvement funds into the site. So, if we improve the background work to use minimal resources (process capacity per second), the site will be stable longer under heavier feed loads without needing to upgrade the physical server, which has been done three times including the initial server build, AFAIK. Benefits are long term here as well as short term. I care enough to want a stable site server here, and one way to get that is to minimize background work here.

    Users drive a forum site to a large degree (within the constraints of reality, which is why we have the rules). I wanted real figures on how many thought what, and to see whether people really wanted 2.5 hour updates overall or whether less often could work either now or later. So far, a vast majority would accept updates half as often.... The poll and thread did what I wanted them to do. I think I timed the poll part to close in two weeks.

    Also, if folks later ask why we do not get points more often, this thread will serve as the answer. As more and more people fold, Folding at Stanford will take manifoldly more load than any individual site: it will take the sum of the load of all users' and sites' requests. Hardware is expensive, even if older and slower machines that draw less power are used for single tasks. And power in CA is getting hyper-expensive. So, taking load off there lets them use more of a finite budget to get results out to the world faster.

    Geeky1 even showed us this: he runs an older box to serve printing, and routers, hubs, and small switches are cheap. I got my broadband access and firewall router for $33.50 after rebate.... :D It comes from determining needs, and concentrating money on the boxes and not the routing.

    Part of what bogs a site down is the limited pipe it works through. As load increases, servers break because equal-priority processes conflict for resources in any one second, and de-emphasizing the things that can be de-emphasized lets processes run less often. That is one reason Linux and alternative OSes are used for servers: they can run on limited resources and are so modular that you can set up LANs behind firewalls, with older boxes performing smaller sets of tasks and networking their work to a web feed server.

    John.
  • witenoiz 19,356 miles East of Kansas City, MO Member
    edited January 2004
    I think we should poll the server once a day!! :type: And I think basketball scores should only be shown at the half and again at the end of the game. :Pwned: Come to think of it we could do without football scores until the game is over since everyone knows the score all the time they are watching. :clap: I'll have to give baseball a little thought - maybe once every 3 innings :Pwned: Jack