The Challenge

How do you run campaigns that quickly identify the right key influencers from a database of 400 million?

The Solution

By using Elasticsearch real-time analytics to search petabytes of data and return results in seconds

Case study highlights

Improve operational efficiency across the company

  • Accelerate database updates by 6x
  • Speed up search query response time by 4x
  • Scale users by 4x without adding more hardware

Leverage search power to generate revenue

  • Empower mission-critical search with high performance
  • Maintain a competitive edge
  • Launch new services to double revenue

Leveraging Search to Target People with "Klout"

Klout measures social media influence. Collecting data from multiple social networks – YouTube, Facebook, Twitter and many more – the company calculates a “Klout Score” for prominent “influencers” in any particular market, and uses this information to help customers such as McDonald's, Audi and Red Bull target influential people in their markets.

Search is a key function for Klout. The company developed an internal targeting system that searches petabytes of data every day. A team of Klout business analysts uses it daily to build marketing campaigns for customers.

"This targeting system generates 100% of our revenue," explains Felipe Oliveira, Director of Engineering, Backend, for Klout. "It is a very critical part of our business."

"In the previous system there would be targeting mistakes, and it was very difficult to determine why that happened. With Elasticsearch, we can run it through the Explain API to find out why a user was targeted. We can easily see when a campaign is configured wrong and fix it."

Jieren Chen, Senior Software Engineer
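As a rough illustration of the debugging workflow Chen describes, the sketch below builds an Explain API request, which asks Elasticsearch why a given document does or does not match a query. The index name, document ID, and field names (`influencers`, `user-123`, `topics`, `klout_score`) are hypothetical and not taken from Klout's system. The path follows the `/<index>/_explain/<id>` form used by recent Elasticsearch versions; older releases also included a type segment.

```python
import json


def build_explain_request(index, doc_id, query):
    """Return (method, path, body) for an Elasticsearch Explain API call.

    The request is only constructed here, not sent, so the sketch runs
    without a cluster. Any HTTP client could send it.
    """
    path = f"/{index}/_explain/{doc_id}"
    body = json.dumps({"query": query})
    return "GET", path, body


# Hypothetical campaign criteria: users tagged with a topic and
# scoring at or above a threshold. Field names are assumptions.
campaign_query = {
    "bool": {
        "filter": [
            {"term": {"topics": "coffee"}},
            {"range": {"klout_score": {"gte": 60}}},
        ]
    }
}

method, path, body = build_explain_request(
    "influencers", "user-123", campaign_query
)
print(method, path)
print(body)
```

The response to such a request reports whether the document matched and includes a breakdown of the match, which is the kind of information that would help trace a misconfigured campaign.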

The Pain of Updating Terabytes of Data on a Daily Basis

Prior to Elasticsearch, Klout used a MongoDB solution that did not scale. All the Klout scores are updated on a daily basis, and this became a challenge as Klout's user base expanded.

“We have to update a dataset of a few terabytes of data on a daily basis,” said Oliveira. “The throughput of Mongo was horrible.”

“We had this job that would scroll through the targeting database and run queries,” adds Jieren Chen, Senior Software Engineer at Klout. “This job was taking a full day to run. It was completely unacceptable. We needed something that was much faster.”

Search query response time was also slow. Klout business analysts search the Klout database of social media users to target the right influencers for their customers' campaigns. On the old system, a query response would often take several minutes, slowing productivity of the entire team.

Integrating Elasticsearch with Hadoop for Fast Query Results

Klout switched to Elasticsearch for multiple services. Most importantly, Elasticsearch is a critical component of the company's internal targeting system. Klout also provides an Elasticsearch-based search function on Klout.com and is introducing a new iPhone app that uses Elasticsearch.

Elasticsearch is used to search Klout scores across 400 million social media users in the Klout database. The data loaded on Elasticsearch includes everything Klout knows about the user – including topic areas they are influential in, and their Klout score – and it is used to target specific users for every campaign. In less than two years, Elasticsearch has enabled Klout to scale from 100 million to 400 million users, while reducing database update time from one day down to four hours, and delivering query results to the business analysts in seconds rather than minutes.
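To make the targeting step concrete, here is a minimal sketch of how a search request for influencers by topic and score might be assembled with the Elasticsearch query DSL. It is not Klout's actual query: the field names (`topics`, `klout_score`) and the helper `build_targeting_search` are assumptions made for illustration.

```python
import json


def build_targeting_search(topics, min_score, size=10):
    """Build a search request body that selects users influential in
    any of the given topics with a score of at least min_score,
    ordered from highest to lowest score.

    Field names are hypothetical; a real index mapping may differ.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter clauses match without affecting relevance scoring.
                "filter": [
                    {"terms": {"topics": list(topics)}},
                    {"range": {"klout_score": {"gte": min_score}}},
                ]
            }
        },
        "sort": [{"klout_score": {"order": "desc"}}],
    }


search_body = build_targeting_search(["coffee", "travel"], min_score=60, size=25)
print(json.dumps(search_body, indent=2))
```

A body like this would be sent to the index's `_search` endpoint. Using filter clauses rather than scored clauses is a design choice for this sketch, since campaign criteria here are yes/no conditions.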

Klout's multiple petabytes of data are stored in HBase, the Hadoop big data store, using Hadoop Distributed File System (HDFS) and connecting to Elasticsearch through the Hive data warehouse system.

“Elasticsearch has a very good integration with Hadoop,” says Oliveira. “It allows us to export a Hive table to an index on Elasticsearch very easily. HBase is a great data store, and it allows random access to the data, which Elasticsearch is perfect for. Elasticsearch fits very nicely into our data pipeline.”

Using Elasticsearch to Create New Revenue Streams

Klout will soon unveil a new self-service option that will allow customers to run campaigns automatically online, without having to talk to a Klout business analyst. Like the internal targeting system, this new service will depend on Elasticsearch.

“Launching a self-service option would not have been possible with our previous solution, because of the terrible performance,” Oliveira states. “The performance of Elasticsearch gives us a competitive edge, allowing us to create this type of self service tool.”

Currently, Klout can only create campaigns if it has people in house to handle them. A self-service version will provide a much more sustainable and scalable revenue model, allowing the company to tap into markets such as small and medium-sized businesses that were previously not economical to serve. Klout expects that by enabling this new service, Elasticsearch will help the company double current revenues.

Klout's benefits using Elasticsearch

Increased productivity

Elasticsearch reduced database updates from one day to four hours, and query response time has gone from minutes to seconds. This speed, coupled with Elasticsearch's ease of use, has made the Klout team twice as productive.

Reduced capital expenses

Elasticsearch's scalability allowed Klout to search 4x more data on the same hardware it was using with MongoDB, saving significantly on capital expenses.

Faster development

Klout's development process is much faster due to Elasticsearch features including REST APIs, easy integration with other technologies, and comprehensive, effective debugging tools.

Generate new revenue streams

Elasticsearch's performance is enabling Klout to launch a self-service option, expected to double the company's revenue.