Tuesday, July 24, 2012

Success Factors for Distributed Computing Projects

We have talked generally about success factors for distributed computing projects for weeks now.  We've also talked about success factors for all interactive projects, as well as sub-dividing it all into success factors for astronomy and ecology projects.  We even dove into the correlation statistics on them.  So now it's time to do the same thing for Distributed Computing.

First let's look at the data broadly.  The most significant result is no result...nearly fifty percent of the success factors have zero correlation to either the popularity or scholarly success of these projects.  This is not unexpected, and it is part of the reason why we separated results for distributed computing projects in the first place.  But it still merits discussion.  Looking closely at the data, none of these items shows any difference between projects in how they were ranked on that factor.  In other words, each project addresses these factors in the same way or they just aren't expressed at all.  So only a limited number of factors differentiate the projects, which limits the potential factors for success.  A positive interpretation is that project designers only need to focus on a few key elements to make a highly successful project, since most will not impact the result.  However, it could also be interpreted as a sign that further research is needed on success factors for distributed computing projects beyond what we've done here.  While I do promise to address that in future posts, let's see what the data we do have says.
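To make the "no result" point concrete, here is a minimal sketch of why a factor that every project scores identically cannot produce a correlation at all.  It assumes the correlations are Pearson coefficients (the usual spreadsheet default), and the scores in it are made up purely for illustration; they are not the project data behind this post.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's r, or None when either series has no variance."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    var_x = sum(d * d for d in dx)
    var_y = sum(d * d for d in dy)
    if var_x == 0 or var_y == 0:
        # Every project scored the same, so the coefficient is undefined.
        return None
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five projects (illustration only).
identical_factor = [3, 3, 3, 3, 3]   # every project handles the factor the same way
varied_factor = [1, 2, 3, 4, 5]      # projects differ on this factor
success = [10, 20, 30, 40, 50]       # made-up success measure

print(pearson(identical_factor, success))  # None, i.e. "n/a"
print(pearson(varied_factor, success))
```

A factor with no variation carries no information about success either way, which is why it is safer to read those entries as "cannot tell" rather than "does not matter".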

Now let's look at what affects the scholarly impact of Distributed Computing projects.  By far the largest correlation, almost 1:1, is Providing Data Access.  This confused me for a while and made me double-check my data to make sure it wasn't just one outlier or a series of unreliable data points causing the trend.  And that's what it turned out to be.  The only project that does anything different in this aspect than other projects is SETI@Home, and the overall success of that project is likely what produces the correlation.  So it may be a real factor, but there is not enough data here to prove it.  I also need to review the other projects more closely on this factor to make sure the lack of differentiation seen in my experience truly holds up.  That may be a future post as well, and I would encourage anyone who sees differences from their experience to let me know in the comments below.
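One simple way to check whether a single project is driving a correlation is to drop each project in turn and recompute.  The sketch below does that with a Pearson coefficient (my assumption about the statistic) on made-up numbers in which only the last project differs on the factor; it is not the actual data set, and the function names are just for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's r, or None when either series has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    var_x = sum(d * d for d in dx)
    var_y = sum(d * d for d in dy)
    if var_x == 0 or var_y == 0:
        return None
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(var_x * var_y)


def leave_one_out(xs, ys):
    """Recompute r once per project, each time with that project removed."""
    results = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        results.append(pearson(rest_x, rest_y))
    return results


# Hypothetical scores: four projects identical on the factor, one outlier.
factor = [1, 1, 1, 1, 5]
success = [10, 12, 9, 11, 60]

full_r = pearson(factor, success)
dropped = leave_one_out(factor, success)

print("all projects:", full_r)
for i, r in enumerate(dropped):
    print("without project", i, "->", r)
```

With these numbers the full correlation is close to 1, but removing the one outlying project leaves a factor with no variation, so the correlation disappears entirely.  That is the pattern to watch for before trusting a near-perfect result from a small sample.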

Be Audacious has a very similar problem.  While two projects differ on this factor (and not just one), it's still a minor differentiation and nothing I can build a statistical argument on.  So we will also need to disregard that item for lack of evidence.  So what does that leave us with?  The correlation chart of success factors for scholarly success below shows the answer:

Success Factor             Correlation (Scholarly Success)
Entertain                  n/a
Reward                     0.006933365
Challenge                  n/a
Educate                    n/a
Motivate the User          n/a
Create a Community         n/a
Interact in Real Time      0.006933365
Provide Feedback           0.699078158
Offer Excitement           0.515624856
Encourage Dialogue         0.444842746
Provide Data Access        0.967159265
Allow for Errors           n/a
Be Audacious               0.705797165
Stay Focused               n/a
Make it Convenient         n/a
Make Learning Easy         0.567665542
Make Participating Easy    0.071321992

The importance of Providing Feedback sticks out the most to me.  This data point has significant variation, as project designers work to varying degrees to provide timely updates on the project and its results.  Some use e-mail newsletters, some update information on their web site, some provide usage statistics in the program interface, and others use combinations of these techniques.  This helps keep projects popular by keeping them in the participant's (and the public's) eye, and keeps people motivated by showing the benefits they provide to science.  But how is this a scholarly success factor?

The important thing to remember is that distributed computing projects typically have a single goal (or set of goals) that doesn't change over time.  Researchers design the project for a very particular problem and use the brute force of everyone's computers to solve it.  So there is no benefit to a project only half-completed...the problem remains unsolved and the work already done gets discarded.  If there are not enough participants to see a project to its end, there will be no final result, no academic papers, and all the time will be wasted.  So even though this is a popularity measure, it is vital to scholarly success.

Speaking of popular success, below is the correlation chart for those success factors:

Success Factor             Correlation (Popular Success)
Entertain                  n/a
Reward                     -0.075083869
Challenge                  n/a
Educate                    n/a
Motivate the User          n/a
Create a Community         n/a
Interact in Real Time      -0.075083869
Provide Feedback           0.021533374
Offer Excitement           0.231458187
Encourage Dialogue         0.052357277
Provide Data Access        0.274326387
Allow for Errors           n/a
Be Audacious               0.144201148
Stay Focused               n/a
Make it Convenient         n/a
Make Learning Easy         0.053787176
Make Participating Easy    0.077344773

The important idea on this chart is the need to "Offer Excitement".  This should not be a surprise when you think of how projects gain popularity in the first place.  Most bring people in through existing citizen science portals (such as OpenScientist) that advertise them or, more likely, through popular press articles about the project.  The press loves an exciting story and will focus most on projects with the best narrative.  So in some ways it is not directly related to what we consider scientific success.  Except for one important thing.

Remember that from a participant's point of view, none of these projects takes significant time, energy, or resources.  There is minimal operational time and many even share the same infrastructure (such as the BOINC platform).  So projects can't differentiate themselves on ease of use or on a network infrastructure effect.  Instead they compete on how exciting a project is.  The more exciting the project, the more participants will join and the more computing cycles will be performed.  So excitement is a necessary component of project success, bringing in a critical mass of participants and ensuring enough computer time will be devoted to the project.
