How do you provide a social search recommendation engine that rivals the performance of Facebook and Google?
By using Elasticsearch to provide real-time search results from an index of more than 10 billion documents
Case study highlights
Boost the power of search performance
- Scale to tens of thousands of users generating 200 million documents per day
- Decrease search query latency by 5x
- Deliver real-time results that compete with the world's leading social networking platforms
Add value to search capabilities
- Customize searches based on data across multiple social network platforms
- Develop, test and roll out new features quickly
- Pioneer new ways to generate revenue from search recommendations
Customized searches enhance the social networking experience
Wajam combines the collective knowledge of a user's social networks to make recommendations based on the preferences of that user's friends and connections. The company provides a social search platform that collects data from multiple social networks and then serves customized recommendations to the user, alongside the regular search results from any major search platform.
"There is an information overload today, as people share more and more information on different social networks," says Alain Wong, Marketing Manager of Wajam. “We make sense of that data and create value from it. Anywhere people usually search – Google, TripAdvisor, eBay, Wikipedia and more – we can add recommendations from friends."
“We're uncovering hidden value in existing social networks," he adds. "Great search is a key element of creating that user experience."
The climb from 200 million to 10 billion documents
Just two years ago, hundreds of users searched fewer than one million documents per day. Today, tens of thousands of users download Wajam and generate more than 200 million documents every day. Sphinx, the company's previous search server, simply could not handle the exponential growth.
"Sphinx was not meant to handle the millions of requests for data we have," Wong recalls. "That was the main challenge."
"We could not serve all the requests. During the rush hour, we had to reduce the number of searches going through Wajam because it was too much for Sphinx," says Jerome Gagnon, Developer at Wajam.
Searches were too slow, at about 1.5 seconds, compared with user expectations of 200 milliseconds or less. Adding to the challenges, Sphinx was not designed to provide real-time or geo-located search results, and filtering was not adequate. Each time Wajam added a new node to scale the system, it was a time-consuming manual task to split and distribute data.
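To make the geo-location and filtering requirements concrete, here is a minimal sketch of what a geo-filtered search request can look like in Elasticsearch's JSON query DSL, built as a plain Python dictionary. It is an illustration only, not Wajam's actual query: the field names `content`, `location` and `author_id` and the helper `build_geo_filtered_query` are hypothetical, `location` is assumed to be mapped as a `geo_point`, and the `bool`/`filter` syntax follows current Elasticsearch releases rather than whichever version Wajam ran.

```python
import json


def build_geo_filtered_query(text, lat, lon, distance="5km", friend_ids=None):
    """Build an Elasticsearch search body (hypothetical field names).

    Full-text match on `content`, restricted to documents whose `location`
    lies within `distance` of (lat, lon), and optionally to documents
    authored by one of `friend_ids`.
    """
    # Filters narrow the result set without affecting relevance scoring.
    filters = [
        {
            "geo_distance": {
                "distance": distance,
                "location": {"lat": lat, "lon": lon},
            }
        }
    ]
    if friend_ids:
        # Assumed field: the ID of the friend who shared the document.
        filters.append({"terms": {"author_id": list(friend_ids)}})

    return {
        "query": {
            "bool": {
                "must": [{"match": {"content": text}}],
                "filter": filters,
            }
        }
    }


if __name__ == "__main__":
    body = build_geo_filtered_query("sushi", 45.50, -73.57, friend_ids=[101, 202])
    # The printed JSON could be sent as the body of a search request.
    print(json.dumps(body, indent=2))
```

Keeping the geographic and friend restrictions in the `filter` clause, rather than in `must`, means they limit which documents are returned but leave ranking to the text match alone.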
"We looked for a search engine better suited for our needs," Wong continues. “After researching several solutions including Apache Solr, it was clear that Elasticsearch had greater market traction with well-known adopters like Foursquare and SoundCloud."
"We used Elasticsearch support during development to help us validate our ideas as well as understand how we could do things even better. That was key to us scaling quickly."
Using Elasticsearch for real-time Hadoop search
Now, Wajam uses Elasticsearch to deliver superior search recommendations to tens of thousands of users per day. Elasticsearch integrates with the Hadoop Distributed File System (HDFS) via Wonderdog from Infochimps, providing real-time search over Wajam's Apache Hadoop data. This is key to competing with the likes of Google and Facebook, which set user expectations for high performance.
"Users are used to getting answers to a search right away," says Wong. "On Google, they get an answer in less than 200 milliseconds most of the time. If we are not able to match that, we would not be competitive with other search sites, and we would not be adding value for our users."
"Elasticsearch has allowed us to reduce response time from 1.5 seconds to under 200 milliseconds," Gagnon adds.
Scaling to 100+ nodes with Elasticsearch development support
As Wajam's data scaled from 200 million to 10 billion documents in less than two years, the team needed to decide quickly how best to scale its Elasticsearch deployment, so it tapped Elasticsearch Development Support to help plan and execute the right architectural decisions.
Elasticsearch ensured that Wajam had the resources it needed to make its Elasticsearch deployment successful at scale. This included helping the Wajam team validate and improve its designs, and providing solutions to problems the team encountered along the way.
Easily scalable to 100+ nodes
Elasticsearch allowed Wajam to easily scale from hundreds of users to tens of thousands of users every day, and grow from searching a data set of 200 million documents to 10 billion.
Ultra-fast query response
Elasticsearch has delivered a 5x improvement in Wajam's search query latency, enabling the company to meet user expectations for response times of 200 milliseconds or less.
Wajam must deliver true real-time search results to remain competitive with platforms such as Google and Facebook. Elasticsearch provides the real-time search capabilities required by Wajam's demanding user base.
New ways to generate revenue
Wajam will use Elasticsearch to develop new revenue channels based on relevant recommendations from the user's friends.