The Challenge

Create a scalable, ultra-fast Internet discovery platform for products based on user interest

The Solution

Use Elasticsearch to power intelligent search and better inform recommendations to millions of customers a month

Case study highlights

Create a world-class customer experience

  • "Stumbles" deliver 30 million real-time recommendations per day
  • Intelligent search is key to providing fast and more informed recommendations
  • Update your searches immediately with newly posted content

Develop and scale easily

  • Build in intelligent search to scale with millions of users and interactions
  • Take advantage of powerful and flexible APIs for easy data integration
  • Rely on easy-to-use yet powerful solutions for your big data search and analytics needs

Powerful and intelligent search is key to enhanced customer experiences

StumbleUpon is the easiest way to discover new and interesting things from across the web. Every month millions of people turn to StumbleUpon to be informed, entertained and surprised by content and information recommended just for them. In addition, more than 80,000 brands, publishers and other marketers have used StumbleUpon's Paid Discovery Platform to tell their stories and promote their products and services.

As a customer "stumbles" through great web pages, they can tell StumbleUpon whether they like or dislike recommendations so they can be shown more of what they are looking for. StumbleUpon will show web pages based on that feedback as well as what similar Stumblers have liked or disliked.

In 2012, StumbleUpon set out to upgrade its search infrastructure to improve the overall customer experience, enhance performance and simplify platform management.

"The key to attracting millions of loyal followers is providing the best possible customer experience. Intelligent search is an essential component to creating that experience."

Dwayn Matthies, Platform Architect at StumbleUpon

Moving from Apache Solr to Elasticsearch

StumbleUpon previously used Apache Solr for its search requirements. However, as StumbleUpon grew to millions of users, Solr had trouble scaling and proved difficult to administer. "Apache Solr was not a good fit for us," Matthies explains. "It was very difficult to manage. We had issues with servers going down due to overload. It was simply not scalable."

When StumbleUpon decided to upgrade its search infrastructure, it began to look for alternative solutions. After attending an Elasticsearch training course and meeting Elasticsearch creator Shay Banon, StumbleUpon decided to implement Elasticsearch for all keyword search requirements because of its real-time capabilities and ease of use.

Elasticsearch powers 30 million "Stumble" recommendations per day

Today, Elasticsearch is deeply integrated into the core StumbleUpon recommendation engine, which handles 30 million "stumbles", or recommendations, per day.

The majority of Elasticsearch queries come from the StumbleUpon recommendation engine. Elasticsearch queries across all content categories and balances those category lookups against ongoing updates to the keyword search index, so queries stay fast and reflect new content in real time. Special-purpose indexes with different levels of search are also updated frequently.

StumbleUpon used to hold data in MySQL but now stores it in Elasticsearch to support more flexible queries and achieve very fast, horizontally scalable performance. StumbleUpon's main keyword search index has an average latency of 25 milliseconds and a throughput of 3,000 queries per second.
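To put these figures in perspective, the short sketch below derives two back-of-the-envelope numbers from them: the average recommendation rate implied by 30 million stumbles per day, and the approximate number of keyword queries in flight at any moment, using Little's law (concurrency = throughput × latency). It assumes traffic is spread evenly over the day and that the system is in steady state, neither of which the case study states; the function names are illustrative only.

```python
# Back-of-the-envelope arithmetic on the figures quoted in this case study.
# Assumptions (not from the source): uniform traffic over 24 hours, steady state.

SECONDS_PER_DAY = 24 * 60 * 60

STUMBLES_PER_DAY = 30_000_000   # recommendations served per day
AVG_LATENCY_S = 0.025           # 25 ms average latency, main keyword index
QUERIES_PER_SECOND = 3_000      # throughput, main keyword index


def average_rate(events_per_day: float) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


def queries_in_flight(throughput_qps: float, latency_s: float) -> float:
    """Little's law: mean concurrency = arrival rate * mean time in system."""
    return throughput_qps * latency_s


if __name__ == "__main__":
    print(f"Average stumbles per second: {average_rate(STUMBLES_PER_DAY):.0f}")
    print(
        "Approximate keyword queries in flight: "
        f"{queries_in_flight(QUERIES_PER_SECOND, AVG_LATENCY_S):.0f}"
    )
```

These are averages only; peak load would be higher than the even-spread figure suggests.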

Elasticsearch is also very easy to administer and manage. Adding nodes to a cluster is greatly simplified, and its flexible API approach makes it easy to integrate with additional indexes and data sources such as Redis, MySQL and HBase.

StumbleUpon's benefits from using Elasticsearch

Highly scalable and fast

Elasticsearch scales horizontally to millions of users and is lightning fast for lookups, which is essential for StumbleUpon to provide the best customer experience and higher-quality recommendations.

Simplified administration and management

Elasticsearch makes it easy for developers to manage the overall system, including adding new nodes, starting and stopping clusters, and setting up new indexes.

Flexible APIs ease integration

A unique API approach makes it easy for developers to integrate Elasticsearch with other data sources like Redis, MySQL and HBase.

Analytics provide greater insights

StumbleUpon uses the analytics capabilities in Elasticsearch to monitor and analyze trends in overall system performance.