Monday, July 23, 2012

Diving into the Statistics...Ecology

One thing I've learned over the last few weeks is how widely the important success factors vary across different types of citizen science projects. There is a lot of diversity out there and no single formula defines success for everyone. Hours staring at Excel spreadsheets and wild regressions taught me that. But once you organize things into neat piles and keep the apples with the apples, things start coming into focus.

Up next are the ecology projects, looking first at measures of public popularity. Given the nature of the science, I expected them to be significantly different from the astronomy projects. Fortunately the data seem to bear this out.

Motivate the User -0.12342609
Create a Community -0.112233026
Interact in Real Time -0.27786182
Provide Feedback -0.230619358
Offer Excitement -0.179413057
Encourage Dialogue -0.33835345
Provide Data Access -0.236025844
Allow for Errors 0.100673012
Be Audacious -0.237861864
Stay Focused 0.379573448
Make it Convenient 0.20049689
Make Learning Easy 0.149773189
Make Participating Easy 0.27213335

Very quickly we see that the most important factors for success all appear to come from the Keep it Simple family: Stay Focused, Make it Convenient, Make Learning Easy, and Make Participating Easy. This is the first time we are seeing a whole group of items together, and it's likely there is a strong correlation amongst them. As discovered earlier, the success factors aren't completely distinct; while they have their own definitions there is certainly potential for overlap. They are also based on general concepts and not on rigidly-defined criteria. So a correlation between these is not unexpected.
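For anyone who wants to re-sort the numbers themselves, here is a minimal Python sketch that ranks the popularity values from the table above. The dictionary just copies those values; the name `rank_factors` is my own invention for illustration and not part of any tool used in the original analysis.

```python
# Values copied from the popularity table above (ecology projects).
popularity = {
    "Motivate the User": -0.12342609,
    "Create a Community": -0.112233026,
    "Interact in Real Time": -0.27786182,
    "Provide Feedback": -0.230619358,
    "Offer Excitement": -0.179413057,
    "Encourage Dialogue": -0.33835345,
    "Provide Data Access": -0.236025844,
    "Allow for Errors": 0.100673012,
    "Be Audacious": -0.237861864,
    "Stay Focused": 0.379573448,
    "Make it Convenient": 0.20049689,
    "Make Learning Easy": 0.149773189,
    "Make Participating Easy": 0.27213335,
}


def rank_factors(correlations):
    """Return (factor, value) pairs sorted from most positive to most negative."""
    return sorted(correlations.items(), key=lambda item: item[1], reverse=True)


ranked_popularity = rank_factors(popularity)

# Print the ranking with a fixed-width label column and signed values.
for factor, value in ranked_popularity:
    print(f"{factor:<26}{value:+.3f}")
```

Sorted this way, the four Keep it Simple factors land in the top four slots, with Stay Focused first.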

Looking more closely at Keep it Simple, it makes some sense for these to be strong success factors for ecology projects. Unlike astronomy or meteorology projects, many ecology projects can be performed with minimal time, investment, or training. Of the ones we've reviewed, many just require taking notes in your backyard or a nearby area, and any identification is either 1) relatively simple for the lay-person, or 2) aided by tools provided by the project. So participating doesn't require biology training or expertise, which reduces the need for extensive education.

As long as designers work to keep the projects simple and easy to participate in, they can expect to be successful. Part of this is staying focused on a particular species or problem...this focus helps keep things simple and ensures that minimal training is required. Only a few simple things need to be taught. But even then, making the learning easy produces well-trained volunteers who are able to participate and who won't get frustrated trying to learn by themselves. Making it convenient, by allowing measurements in one's backyard and on relatively infrequent timetables, also makes participation easy. All these factors work together to help people get involved, stay involved, and tell their friends to become involved.

Surprisingly (or not so surprisingly), the factors that matter for academic success (as measured through Google Scholar citations) are quite different from those that matter for popular success.

Entertain 0.372179967
Reward -0.090097108
Challenge 0.595873844
Educate 0.527128459
Motivate the User 0.524441676
Create a Community 0.671781181
Interact in Real Time 0.50987457
Provide Feedback 0.409862612
Offer Excitement 0.194700875
Encourage Dialogue 0.461991758
Provide Data Access 0.504180285
Allow for Errors 0.221117582
Be Audacious 0.28583062
Stay Focused -0.141373398
Make it Convenient -0.406432517
Make Learning Easy -0.214070343
Make Participating Easy -0.062347621

The strongest, and to me one of the most interesting, success factors is creating a community. This fits well with the highest-ranking projects (Great Backyard Bird Count and Christmas Bird Count), which have both created very strong communities around the project. For one thing, a community of volunteers who support one another and encourage each other to stay involved keeps users strongly motivated to work hard for the project. But while community is important for remaining popular and bringing in new participants, it is also very important to scientific success. This also correlates with "Interact in Real Time," where project participants work with one another and keep up a regular dialogue about the project. My thinking is that the training component, in which experienced users teach new birdwatchers how to participate, significantly improves the quality and quantity of results. It keeps people in the field longer and helps ensure the data is accurate. It also teaches new participants tips on spotting more birds, which increases the number of available data points. These are all important parts of developing strong data sets.

Finally, I find it interesting that the success factors for a publicly popular project are actually negatively correlated with scientific success. In other words, keeping a project simple can actually hurt its chances of creating scientifically important results. Why is this? Well, consider that the simplicity involved in these projects may over-simplify the situation and not allow enough meaningful data to be collected. Some of this may introduce uncontrolled variables when people can work on their own and on their own time...this lack of rigor can increase errors and make the data unreliable. At the same time, there may not be enough flexibility in the data...much of science is not just looking for data on the problem you understand, but also looking for new problems and unexpected connections. Making a project too simple can stymie that type of discovery.
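To make that sign flip concrete, here is a rough Python sketch that lines up the two tables from this post and checks, factor by factor, whether the popularity and citation values point in opposite directions. The values are copied from the tables; the variable names and the `pearson` helper are my own, and using a Pearson coefficient across the thirteen shared factors is an assumption for illustration only, not necessarily how the original spreadsheet numbers were produced.

```python
from math import sqrt

# Popularity values (first table in this post).
popularity = {
    "Motivate the User": -0.12342609, "Create a Community": -0.112233026,
    "Interact in Real Time": -0.27786182, "Provide Feedback": -0.230619358,
    "Offer Excitement": -0.179413057, "Encourage Dialogue": -0.33835345,
    "Provide Data Access": -0.236025844, "Allow for Errors": 0.100673012,
    "Be Audacious": -0.237861864, "Stay Focused": 0.379573448,
    "Make it Convenient": 0.20049689, "Make Learning Easy": 0.149773189,
    "Make Participating Easy": 0.27213335,
}

# Google Scholar citation values (second table in this post).
citations = {
    "Entertain": 0.372179967, "Reward": -0.090097108,
    "Challenge": 0.595873844, "Educate": 0.527128459,
    "Motivate the User": 0.524441676, "Create a Community": 0.671781181,
    "Interact in Real Time": 0.50987457, "Provide Feedback": 0.409862612,
    "Offer Excitement": 0.194700875, "Encourage Dialogue": 0.461991758,
    "Provide Data Access": 0.504180285, "Allow for Errors": 0.221117582,
    "Be Audacious": 0.28583062, "Stay Focused": -0.141373398,
    "Make it Convenient": -0.406432517, "Make Learning Easy": -0.214070343,
    "Make Participating Easy": -0.062347621,
}

# Only compare factors that appear in both tables.
shared = [f for f in popularity if f in citations]
flipped = [f for f in shared if popularity[f] * citations[f] < 0]
same_sign = [f for f in shared if popularity[f] * citations[f] > 0]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson([popularity[f] for f in shared], [citations[f] for f in shared])

print(f"{len(flipped)} of {len(shared)} shared factors change sign")
print("Same sign in both tables:", same_sign)
print(f"Correlation between the two columns: {r:.2f}")
```

Of the thirteen factors that appear in both tables, twelve change sign; only Allow for Errors is positive in both.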

So those are my initial thoughts on Ecology projects.  But what about Distributed Computing projects?  Find out about the final scientific area being explored in my next post.

