May 26, 2015

Elastic Community Health Metrics: Why *You* Should Share Your Stories

By Sonja Winter

Sonja Winter is the Founder and Principal Data Scientist at Winter Statistics, a boutique data consulting firm headquartered in Amsterdam. After earning her master's degree in Developmental Psychology, she realized statistics was her greatest passion, so she turned the experience gained during her master's research into the basis for her thriving big data analysis startup. Her goal is to ensure that fundamental knowledge informs applied data analysis and vice versa, giving her clients the best of both worlds. Even better, she makes extensive use of the ELK stack to help her clients get insights from their data.

Hi all! I'm Sonja Winter, a data scientist/statistician, currently helping Elastic analyze their community data. In my daily work, I use the ELK stack to explore the data I work with, so I am excited to now be getting insights from the ELK stack into the data from the same community that helped build it. This post will be the first in a series of short updates on Elastic’s community development and engagement.

Any open source project thrives on the engagement and contributions of its community, and the ELK stack is no different. It is therefore important to understand what engages and motivates this community. Hosting meetups is a popular way to bring the community together to share knowledge, and for a project's developers to hear what their users praise and where they struggle. So how do you get more people to register for your meetup events? The analysis I will show you today suggests that getting someone from your community to speak to your community at a meetup event is associated with a higher registrants-to-member ratio.

The fine folks at Elastic have been monitoring meetup activity for all events taking place since October 2011. Since then, a total of 649 Elastic-related events have been organised and tracked via meetup.com, and many more user group meetings take place than are reflected on the meetup.com site alone. For 138 of these events, we have data on the origin of the event's speaker, and we know that the event was hosted by an Elastic-focused meetup group.

In total, 83 events had community speakers (60.1%) and 55 events had Elastic employee speakers (39.9%):

As a proxy for the popularity of an event, I computed the registrants-to-member ratio, which equals the number of registrants for an event divided by the total number of members of that meetup group. A ratio of 1 indicates that as many people registered for the event as the group has members. A ratio < 1 indicates that fewer people registered than the group has members, and a ratio > 1 indicates that more people registered than the group has members. The box-plot figure below shows that the average registrants-to-member ratio (the thick line in each box) is higher when the speaker is a community member than when the speaker is from Elastic Inc. More specifically, for a community speaker, the average ratio is 0.69, meaning that for every 100 members, 69 people register for the event. For an Elastic speaker, the average ratio is 0.58, meaning that for every 100 members, 58 people register. Thus, on average, this data indicates that an event with a community speaker attracts 11 more registrants per 100 members than an event with a company speaker.
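The ratio described above can be sketched in a few lines of Python. This is only an illustration: the function names and the three example events are hypothetical and are not taken from the data set used in this analysis.

```python
def registrants_to_member_ratio(registrants: int, members: int) -> float:
    """Registrants for one event divided by the group's total membership."""
    if members <= 0:
        raise ValueError("members must be positive")
    return registrants / members


def per_100_members(ratio: float) -> int:
    """Express a ratio as a whole number of registrants per 100 members."""
    return round(ratio * 100)


# Hypothetical events as (registrants, group members); not data from the analysis.
events = [(69, 100), (29, 50), (120, 100)]
for registrants, members in events:
    ratio = registrants_to_member_ratio(registrants, members)
    print(f"{registrants}/{members}: ratio={ratio:.2f}, "
          f"per 100 members={per_100_members(ratio)}")
```

The third example event has a ratio above 1, which is the case where more people registered than the group has members.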

To explore whether this difference is also statistically significant, I set out to fit a Poisson regression. Since a Poisson regression is used for count data, I first multiplied the ratio by 100 and rounded to the nearest integer. This number now represents the number of registrants per 100 members. The data is overdispersed (z = 3.85, p < .001; see also the histogram below), meaning that the variance of the data is larger than its mean. You can also see this in the box-plot by looking at the outliers (the dots). Because of this, I chose to perform a negative binomial regression analysis instead. The results indicate that the difference between a community and a company speaker is not statistically significant (Est. = 0.19, 95% CI = [-0.17, 0.53], p = .291). This is not surprising, as the variance around the mean is very large (you can see this in the box-plot by looking at the length of the whiskers [lines] and the dots that indicate very high values). To make the estimate more meaningful, we exponentiate it, which results in an incidence rate ratio of about 1.21. In other words, for every 1 registrant per 100 members for an event with a company speaker, about 1.21 people per 100 members register for an event with a community speaker.
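Two of the steps above are easy to reproduce by hand, and the sketch below does so using only the Python standard library. The counts are hypothetical, and the variance-to-mean ratio is a rough descriptive check rather than the formal overdispersion test reported above. The estimate and confidence bounds are the ones quoted in this post.

```python
import math
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by the mean; values well above 1 suggest overdispersion."""
    return variance(counts) / mean(counts)


def incidence_rate_ratio(estimate: float) -> float:
    """Exponentiate a log-scale regression coefficient."""
    return math.exp(estimate)


# Hypothetical registrants-per-100-members counts; not the actual event data.
counts = [12, 35, 40, 58, 61, 75, 90, 140, 310]
print(f"variance/mean = {dispersion_ratio(counts):.1f}")

# Estimate and 95% confidence bounds as reported in the post.
for label, value in [("estimate", 0.19), ("lower", -0.17), ("upper", 0.53)]:
    print(f"{label}: exp({value}) = {incidence_rate_ratio(value):.2f}")
```

Exponentiating the bounds gives an interval of roughly 0.84 to 1.70 around the ratio of 1.21. Because that interval includes 1, it tells the same story as the non-significant p-value.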

Can a statistically non-significant result still be meaningful? I think in this case, it can be. Having, on average, 11 extra registrants per 100 members (an increase of 11 percentage points), just by having a speaker from the community instead of from your company, is an easy win in my eyes.

Editor's Note: Many thanks to Sonja for this awesome analysis! We have a ton of resources available for people who'd like to speak at Elastic meetups or who are already sharing their stories at user groups worldwide. Head on over to our Elastic User Group discussion forum to learn more about how we can support you in your community knowledge sharing quest.