Center for Open Science
How do you enable scientists around the world to improve processes, share, and collaborate?
By using Elasticsearch to provide 5x faster search and an improved user experience
Case study highlights
Improve search performance
- Gain 5x faster query response time
- Handle searches across 215,000 documents
- Boost relevance of search results
Increase developer productivity
- Easily scale to meet any demand
- Add new search features in minutes
- Add search to prototypes in under one hour
Improving scientific research worldwide
The Center for Open Science (COS; http://cos.io) is a non-profit technology organization dedicated to improving the alignment between scientific values and scientific practices. Open source software developed by COS – such as the Open Science Framework (OSF; http://osf.io) – is provided free to all scientists worldwide with the goal of supporting new discoveries that can change the world.
Currently 6,000 scientists use OSF, which is part network of research materials, part version control system, and part collaboration software. It is a Web service that integrates with the scientist’s daily workflow, helps document and archive materials and data, facilitates data sharing, and enables transparency. With the OSF, users can create and manage scientific research projects, collaborate with other researchers, and make project data publicly accessible. In just over one year of operation, users have downloaded 215,000 documents available from the OSF.
"Solr was used in production as the search engine for OSF before implementing Elasticsearch," recalled Fabian von Feilitzsch, a summer intern (and soon to be full-time developer) at COS. "We are rapidly prototyping many new features and products. But Solr added a lot of complexity and overhead that prevented us from integrating search into our prototypes. It wasn’t worth the time."
Delivering 5x search performance
COS replaced Solr with Elasticsearch as the primary search engine for all content on OSF, internal and external. All registered users and components of projects are indexed by Elasticsearch and searchable through the search bar on osf.io, and through OSF’s API.
"We have an increasing amount of content in the system," said von Feilitzsch. "In addition, scientists are specialists that know exactly what type of content they want. Elasticsearch helps them find content on OSF much faster." von Feilitzsch points out that OSF searches in Solr took about 250 milliseconds, while Elasticsearch takes only 50 milliseconds – a 5x improvement in query response time.
"We tested Elasticsearch with 50,000 to 100,000 documents and did not see any slow down of performance," he added. "The key factor for us is speed of delivery to end-users. Elasticsearch delivers this value to us."Additionally, von Feilitzsch noted that easy scalability is a major advantage of Elasticsearch. COS can quickly spin up several Elasticsearch nodes to deliver infinite horizontal scalability.
Enabling plug-and-play flexibility
OSF is a highly modular system, and COS develops its tools and features with a plug-and-play design. As OSF evolves, the COS development team continuously adds and removes components, which requires easy interaction with the search engine. Elasticsearch meets this need for flexibility.
"Elasticsearch allows us to focus on what the data should look like, and not worry about whether the data is compatible with the search engine," said von Feilitzsch. "Elasticsearch removes any worries about the search engine as a consideration. There is very little configuration needed. It just works."
"With Elasticsearch, it is very easy to integrate search into any prototype," von Feilitzsch explained. "We recently prototyped a new service and integrated Elasticsearch in less than one hour. It is absurdly easy."
"Because search is so much simpler with Elasticsearch, it is easier to add additional features to the search engine and to keep up-to-date with changes happening on the backend of OSF," he added. "This was much more difficult with Solr, when we had to manually define everything that we were changing."
One example of a simple feature COS added easily with Elasticsearch is a quick filter on document types. It was a small addition to a single Elasticsearch query. Now, OSF can filter by any document type in 50 milliseconds.
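As an illustration of how small such an addition can be, the sketch below builds an Elasticsearch query body and optionally attaches a document-type filter. It is not COS's actual code: the function name `build_search_body` and the field names `title`, `description`, and `category` are hypothetical, and the body uses the current `bool` query syntax, which may differ from the Elasticsearch version OSF ran at the time.

```python
import json


def build_search_body(text, doc_type=None):
    """Return a query DSL body for a full-text search.

    Field names ("title", "description", "category") are assumptions
    made for this sketch, not the OSF index schema.
    """
    query = {
        "bool": {
            "must": [
                # Full-text match across the assumed text fields.
                {"multi_match": {"query": text, "fields": ["title", "description"]}}
            ]
        }
    }
    if doc_type is not None:
        # The "quick filter": one extra clause on an otherwise unchanged query.
        query["bool"]["filter"] = [{"term": {"category": doc_type}}]
    return {"query": query}


if __name__ == "__main__":
    body = build_search_body("replication study", doc_type="project")
    print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the JSON body of a search request. Because the filter is a separate clause, the unfiltered search keeps working unchanged when no document type is given.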
"It took us only 20 minutes to add the functionality in Elasticsearch," said von Feilitzsch. "This new feature adds more power to user searches – power we just couldn’t offer before on OSF."
Enhancing the user experience
"Elasticsearch is a means of delivering on the promise of providing better, more professional tools to scientists," said Andrew Sallans, Partnerships, Collaborations, and Funding at COS. "We are building this environment where people can store, manage, and share all of this content, but in order for them to want to use OSF, it must be a desirable environment to work in. Elasticsearch helps us improve the user experience for both the scientists producing the research and the users consuming the data."
By making new features easy to add, COS gains development productivity and improves the user experience at the same time. Faster search queries contribute as well, and Elasticsearch capabilities such as filtering and boosting make search results more relevant.
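To make the idea of boosting concrete, here is a minimal sketch of a query in which matches in one field count more toward relevance than matches in another. The field names `title` and `description`, the function name `build_boosted_query`, and the boost value are assumptions chosen for illustration. The source does not describe which fields or weights COS uses.

```python
def build_boosted_query(text, title_boost=3):
    """Return a query DSL body that weights title matches more heavily.

    "title^3" is Elasticsearch's per-field boost syntax; the field
    names and the default boost are hypothetical.
    """
    fields = [f"title^{title_boost:g}", "description"]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


if __name__ == "__main__":
    print(build_boosted_query("open science"))
```

With a boost like this, a document whose title matches the search text would generally rank above one that only mentions the text in its description.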
"We could have stayed with Solr, but it made our lives harder, and the user experience was not as good," von Feilitzsch concluded. "Elasticsearch raised the bar on both sides, bringing value to us, our users, and of course to our mission as a result."
Faster search query response
Elasticsearch delivered a 5x increase in search query speed, compared with the previous search engine.
Greater development productivity
Elasticsearch's plug-and-play flexibility enables COS developers to add new search features in minutes and add search capabilities to new prototypes in less than an hour.
New Elasticsearch nodes can be added as needed to meet OSF's growing demand, allowing the search cluster to scale horizontally.
Enhanced user experience
By streamlining the addition of new features, accelerating search speed, and improving the relevance of search results, Elasticsearch provides an enhanced user experience on OSF.