Using Elasticsearch to build a CRM

Jilles van Gurp is the lead architect for Inbot, a Berlin-based company building a mobile CRM solution for field sales people. The Inbot customer relations assistant automatically captures your customer communication from the apps that you use, provides real-time insights into the sales process, and helps you collaborate on customer projects with your team members directly from your mobile phone.

At Inbot, we are creating a mobile-first CRM that relies heavily on Elasticsearch to do all the data heavy lifting and analytics. Mobile-first in this case means the primary user experience runs in an environment that is typically not controlled by enterprise IT departments: the user's smartphone. This is the same device that sales people use to communicate with their customers and consequently the ideal place to gather data for CRM purposes.

Being on the phone and close to where data is generated enables Inbot to automate CRM data gathering and relieves the user of redundant and tedious data entry tasks. As data enters our system, contextual reports about the sales activity adapt to it right away. This seamless experience effectively turns what used to be a monthly or quarterly chore for sales representatives into actionable insights generated straight from their sales activity in real time.

Why Elasticsearch?

Elasticsearch is a perfect fit for our use case since it provides real-time analytics through aggregations. Aggregations are a game-changing feature in our industry and an obviously useful tool for the type of reports that are needed in a CRM product. Few other databases provide this feature, and most products that do tend to be specialized data mining products or specialist tools such as Apache Hadoop that are used by data scientists. Elasticsearch aggregations make it possible to get real-time answers to complex questions such as: how many of my leads converted in the last quarter, what was the average time from prospect to sale, and who sells the most in my team? Additionally, Elasticsearch can store massive amounts of data and provides very flexible ways of querying and ranking information.
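To make the "who sells the most in my team" question concrete, here is a minimal sketch of what such an aggregation request could look like, written as a Python function that builds the JSON search body. The field names (team_id, stage, owner_id, days_to_close) are assumptions made up for this illustration and are not Inbot's actual schema.

```python
import json


def team_sales_aggregation(team_id):
    """Build a search body that groups a team's closed deals by seller.

    All field names here are hypothetical, chosen only for illustration.
    """
    return {
        "size": 0,  # only aggregation results are needed, not the hits
        "query": {
            "bool": {
                "must": [
                    {"term": {"team_id": team_id}},
                    {"term": {"stage": "sale"}},
                ]
            }
        },
        "aggs": {
            # One bucket per seller; terms buckets are ordered by
            # document count (descending) by default.
            "by_owner": {
                "terms": {"field": "owner_id", "size": 10},
                "aggs": {
                    # Average time from prospect to sale, per seller.
                    "avg_days_to_close": {"avg": {"field": "days_to_close"}}
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(team_sales_aggregation("team-1"), indent=2))
```

The resulting body would be sent to the search endpoint of an index holding deal documents. The first bucket in the response is then the team member with the most closed deals.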

Elasticsearch as database

Over the course of last year, we re-built the Inbot backend on top of Elasticsearch.

Prior to this, we had a prototype backend based on MySQL that was struggling in several areas.

When I joined Inbot in early 2014, through the acquisition of my location-based search company Localstream, the initial plan was to complement the Inbot data platform with a new analytics engine based on Elasticsearch aggregations.

However, in May 2014 we decided to rebuild the complete backend on top of Elasticsearch and use it as a database as well. From past experience with Localstream, where we used Elasticsearch as a database, and Nokia, where we (ab)used Lucene as a key-value store, I already knew that Elasticsearch could easily handle hundreds of millions of small JSON objects, even on very modest servers.

At the time, using Elasticsearch as a database was still a somewhat controversial thing to do. The recommended practice was to have a separate datastore and then pump the data from there to Elasticsearch to expose it for search. The reasoning was that in case of data loss, you could always simply rebuild your indices from your primary store. This was also before the infamous Call Me Maybe article was published, which brought up some resiliency-related issues in Elasticsearch. The developer team has responded brilliantly to this, and as of the latest 1.5.x release most of the issues have been addressed. The upcoming 2.0 release promises to deliver additional resilience-related enhancements.

The reasons we decided to focus on Elasticsearch as a storage layer were very simple:

  • We needed a simple, scalable data storage solution with advanced, real-time querying capability such as Elasticsearch provides.
  • Given that we were going to rely heavily on its querying abilities to provide real-time reporting, search, and other features, any Elasticsearch outage would effectively mean Inbot was unavailable, regardless of what our primary data store was going to be.
  • When Elasticsearch goes down, it needs to be rebuilt from a known good source of data. Whether that is a backup or a live database does not really matter. Until it is rebuilt, Inbot remains unavailable.
  • Having a separate database and synchronizing that with Elasticsearch continuously adds latency and complexity to our architecture.
  • Elasticsearch is designed from the ground up to be highly scalable and available. This makes a cluster outage both rare and recoverable provided good backup strategies are in place. Inbot backs up all our data in raw JSON form several times a day so that in the case of such an outage we can recover simply by reindexing all the data.
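As an illustration of the last point, recovering from a raw JSON backup largely comes down to feeding the documents back through the bulk API. The sketch below only shows how a bulk request body could be assembled. The index name, the id field, and the sample document are assumptions for this example, and older Elasticsearch releases would also expect a _type in the action line.

```python
import json


def to_bulk_body(index_name, documents):
    """Turn backed-up JSON documents into a newline-delimited bulk body.

    Each document becomes two lines: an action line saying where to index
    it, followed by the document source itself. Assumes (hypothetically)
    that every document carries its identifier in an "id" field.
    """
    lines = []
    for doc in documents:
        action = {"index": {"_index": index_name, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    backup = [{"id": "c1", "name": "Ada"}, {"id": "c2", "name": "Grace"}]
    print(to_bulk_body("contacts", backup), end="")
```

In practice the backup would be read in batches, with each batch posted to the _bulk endpoint, so that a large restore does not have to fit into a single request.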

The renewed Inbot platform, launched in November 2014, stores documents that represent contacts, comments, people, teams, customer accounts, deals, conversions, and other business objects in the CRM domain. A typical user will have thousands of contacts, tens or even hundreds of thousands of activities, and hundreds of customers and leads that each have a deal history. All of this data is stored in Elasticsearch in a way that facilitates our reporting and querying needs.

Using Elasticsearch for CRM

A lot of features in the Inbot app are powered directly by Elasticsearch. A key feature of the app is to highlight contacts and customer accounts that are relevant based on the user's recent activity. This feature is powered by a function score query that we use to rank search results, for example when searching contacts by name. The scoring function relies on an aggregation query that calculates the most relevant contacts for the user based on recent activity. The result is an experience that always emphasizes the contacts that are most relevant to the user. When searching, users can find most people they care about with one or two characters, because people they communicate with less are ranked lower.
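The following sketch shows one way such a ranking could be expressed. It takes per-contact activity counts (for example, taken from a terms aggregation over recent activities) and turns them into weights in a function_score query. This is an assumption about the general shape of the approach, not Inbot's actual query. The field names name and contact_id and the weighting formula are hypothetical.

```python
def contact_search_query(name_prefix, activity_counts):
    """Build a name search that boosts recently active contacts.

    activity_counts maps a contact id to its number of recent activities.
    """
    name_query = {"match_phrase_prefix": {"name": name_prefix}}
    if not activity_counts:
        # Nothing to boost with, so fall back to the plain name query.
        return {"query": name_query}

    # One weighted function per active contact; more activity means a
    # larger weight. Sorting keeps the generated query deterministic.
    functions = [
        {"filter": {"term": {"contact_id": contact_id}}, "weight": 1 + count}
        for contact_id, count in sorted(activity_counts.items())
    ]
    return {
        "query": {
            "function_score": {
                "query": name_query,
                "functions": functions,
                "score_mode": "sum",       # combine the matching functions
                "boost_mode": "multiply",  # scale the text relevance score
            }
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(contact_search_query("jo", {"c1": 1, "c2": 5}), indent=2))
```

One filter per contact keeps the example easy to read, but it makes the query grow with the number of active contacts. A real system would probably cap the list at the top few contacts returned by the aggregation.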

Most CRM functionality in Inbot is accessible via a commenting feature that is integrated into the application. Users can leave notes on accounts and contacts to each other or for themselves. These notes can contain hashtags, at-tags and other metadata. This provides an easy mechanism for users to group contacts together with a hashtag.

Inbot also uses tags to drive the sales pipeline. Users can design their own sales funnel and pick hashtags such as #prospect or #sale that are associated with a particular stage. To help users select a deal stage tag or other tags, we use context suggesters when they are searching or leaving notes on contacts.
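As a rough illustration of how hashtags in a note could be mapped to a funnel stage before indexing, consider the sketch below. The regular expression, the example funnel, and the rule that the furthest stage wins are all assumptions made for this example, not a description of how Inbot implements it.

```python
import re

# A hashtag is a "#" followed by word characters (hypothetical rule).
HASHTAG = re.compile(r"#(\w+)")

# Hypothetical user-defined funnel, ordered from earliest to latest stage.
FUNNEL = ["prospect", "offer", "sale"]


def extract_hashtags(note):
    """Return the lowercased hashtags in a note, in order of appearance."""
    return [tag.lower() for tag in HASHTAG.findall(note)]


def deal_stage(note, funnel=FUNNEL):
    """Return the furthest funnel stage tagged in the note, or None."""
    tags = set(extract_hashtags(note))
    stage = None
    for candidate in funnel:
        if candidate in tags:
            stage = candidate  # later stages overwrite earlier ones
    return stage


if __name__ == "__main__":
    print(extract_hashtags("Met Ada, now a #Prospect for #berlin"))
    print(deal_stage("Was a #prospect, sent the #offer today"))
```

The extracted tags can then be stored as a separate field on the note document, which makes them available for filtering, aggregations, and suggestions.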

Since all the user's data is stored and indexed straight into Elasticsearch, providing reports and analytics is as simple as running aggregation queries and presenting the results. CRM reports include breakdowns of activity by team member, deal stage, and period. For example: how many deals were closed in January, how many deals moved from the prospect to the offer stage, and what is the sales forecast for May? These are all questions that can be answered from the data Inbot stores in Elasticsearch.
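A report such as "deals closed per month, broken down by team member" could be sketched with a date histogram aggregation as shown below. The field names (team_id, stage, closed_at, owner_id) are hypothetical. The example uses the interval parameter of the 1.x-era API; newer Elasticsearch releases use calendar_interval instead.

```python
def closed_deals_per_month(team_id):
    """Build a search body that counts closed deals per month and seller."""
    return {
        "size": 0,  # the report only needs the aggregation buckets
        "query": {
            "bool": {
                "must": [
                    {"term": {"team_id": team_id}},
                    {"term": {"stage": "sale"}},
                ]
            }
        },
        "aggs": {
            "per_month": {
                "date_histogram": {"field": "closed_at", "interval": "month"},
                "aggs": {"by_member": {"terms": {"field": "owner_id"}}},
            }
        },
    }


def monthly_counts(response):
    """Flatten a search response into {month label: number of deals}."""
    buckets = response["aggregations"]["per_month"]["buckets"]
    return {bucket["key_as_string"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    import json

    print(json.dumps(closed_deals_per_month("team-1"), indent=2))
```

The monthly_counts helper relies on the standard bucket layout of a date histogram response, where each bucket carries a formatted key and a document count. The per-member breakdown sits in the nested by_member buckets of each month.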


Inbot has been available in the iOS app store since the beginning of the year. Currently, we are very busy supporting our initial customers and adding more features. We took a big bet on Elasticsearch and this has worked really well for us. The number of users, contacts, and activities that we have in Elasticsearch is growing rapidly. So far, performance has been as planned and expected. Elasticsearch answers most queries in milliseconds, and typical API requests are handled in well under 50 to 100 ms.

Overall, we have been very pleased with Elasticsearch, how well it performs, and how it is evolving. Since we started using it, most releases have included new features that we ended up using, as well as performance- and resilience-related enhancements.