16 June 2017 User Stories

Addressing Human Security Issues: Cause Award Honoree IST Research

By Renuka Hermon

We introduced the Elastic Cause Awards this year to celebrate three Elastic community projects that are using the Elastic Stack to advance the greater good.

In this post, we talk with Michael Paley and the team from IST Research to discover how they are changing the way we interact with data (and each other) by connecting inspired organizations with the data they need to fight human security issues. Harnessing the power of user-driven research, their Pulse Platform uses technologies like Elasticsearch to make a positive impact by countering human trafficking and violent extremism, monitoring and evaluating development programs and civil affairs, and promoting humanitarian efforts.

Please note that this interview has been edited for length and clarity.

Can you start by telling us about what inspired you to develop this application you call the Pulse Platform?

Paley: The Pulse Platform was initially conceived to interact with hard-to-reach populations through the technology they already have in their pocket: their cell phone. This initial vision for SMS-based population engagement expanded as the internet and social media became more widely available and emerged as the primary mechanism to recruit and exploit individuals across the globe.

What exactly did you build? What inspired you to use Elastic products to do so?

Paley: The Pulse Platform consists of hardware and software spanning user interface, collection, storage, processing, analysis, and visualization to provide a comprehensive data aggregation and analysis system. Pulse provides real-time information collection across the internet and from hard-to-reach populations.

Digital internet monitoring is based on a web-scraping infrastructure, providing continuous capture of targeted selections of accounts, terms, content, and entire websites for prolonged periods of time. Active engagement involves direct communication with individuals in at-risk populations utilizing their mobile technology. Monitoring and engagement data are fused in a common database and custom analytics help identify bad actors and labor exploitation. The Elastic Stack’s ability to index, interpret, and monitor the overall enterprise has made us huge fans of Elastic products.

What impact does this application have and how is it making lives better or improving the greater good?

Paley: Pulse is a fully functional product. It has been used to analyze extensive internet data in support of the District Attorneys of New York and San Francisco as they prosecute sex trafficking cases, and to monitor and analyze ISIS recruitment activity. It has also been used to interact with local populations: in Afghanistan to understand teacher pay, in Uganda to identify the potential for overseas labor trafficking, and in Liberia to support Ebola tracking.

Would you mind describing a little of the technical architecture, which Elastic products you’re using, and their functional role?

Paley: We are currently using Elasticsearch 2.3.1, Kibana 4.5.4, and Logstash in our production environment. Our current solution is built in AWS (although we're looking at moving to Elastic Cloud in the near future). All data nodes sit behind an Elastic Load Balancer, and traffic runs through an HTTP auth proxy before hitting the cluster. We also have the master nodes behind a load balancer, which we set up for visualizing the cluster with Kopf.

We have a large pipeline that incorporates many open source technologies, including HDFS, HBase, Storm/Spark, Kafka, and ZooKeeper, that ultimately feed Elasticsearch the real-time data that allows us to perform analyses on breaking events around the world.

We currently utilize Kibana for all our data analysis, and incorporate all the different widgets, including geolocation maps, to help pinpoint epicenters correlating to data points, as well as graphs and pie charts to show comparisons between data points.

We also utilize a Marvel [monitoring] cluster hosted in Elastic Cloud to gather health statistics on our AWS-based clusters. We chose a Marvel cluster in Elastic Cloud as a starting point for migrating our other clusters to Elastic Cloud.

We currently have Logstash set up across our entire technology stack to gather logs in our Elasticsearch cluster should we need them to troubleshoot issues. We keep 7 days’ worth of logs for each technology we're using. We utilize Curator to automate the removal of data from our Elasticsearch cluster.
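For readers who want to picture the retention step, here is a minimal Python sketch of the selection logic that a tool like Curator automates. It is not IST Research's configuration. The daily `logstash-YYYY.MM.DD` index naming (Logstash's default pattern), the hypothetical `expired_indices` helper, and its parameters are all assumptions made for illustration.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, keep_days=7, prefix="logstash-"):
    """Return the daily indices whose date is more than `keep_days` before `today`.

    Assumption: indices are named <prefix>YYYY.MM.DD, one per day.
    Names that do not match this pattern are ignored, never returned.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a log index, e.g. ".kibana"
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    names = ["logstash-2017.06.16", "logstash-2017.06.09",
             "logstash-2017.06.08", ".kibana"]
    # The helper only selects names; deleting them is a separate step.
    print(expired_indices(names, today=date(2017, 6, 16)))
```

In this sketch an index dated exactly seven days before `today` is kept; whether the boundary day counts as inside the retention window is a design choice. The helper only reports candidates, so the deletion itself stays a separate, reviewable step.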

Can you give us a sense for your vision for where this application is heading and your current plans for its future?

Paley: We are currently in the process of hardening the Pulse Platform to fully realize it as a truly global information collection tool. We will continue to do innovative research and development funded by DARPA and other innovative research organizations to maintain cutting-edge functionality. We are deploying the Pulse Platform for more countries and new customers every month, and will continue to support a variety of operations in support of enhancing human security.

Michael Paley and David Cavitt share how the Pulse Platform was initially used to gather data for organizations like The Red Cross after the earthquakes in Haiti. Search became a natural extension of their technology and is now being used in cases of sex trafficking.

To learn more about their deployment, watch the presentation Paley delivered during the Elastic{ON} 2017 closing keynote.