Large volumes of technical and application data can be integrated within a short period of time.
Centralization of log and metric files
Information from different application layers is centralized to analyze all the end-to-end activities, as well as to monitor the operation of transactions in different environments for a cross-functional view.
Improved responsiveness to incidents
Using targeted analytics and app operator dashboards, teams are better able to detect anomalies and make use of automatic alerts.
The BPCE Group is the second largest banking group in France, and one of the top ten European banking groups.
It employs 106,500 people serving 31 million clients, 9 million of whom are members, and it finances more than 20% of the French economy. The BPCE Group offers its clients a complete range of products and services, including savings, cash, financing, insurance, and investment solutions. True to its cooperative status, it helps clients with their projects and builds a lasting relationship with them.
Processing and analytical power for a secure and scalable multi-business services platform
Founded in 2015, BPCE Infogérance et Technologies (BPCE-IT) is the shared entity for the IT subsidiaries of the BPCE Group. Created with an industrial rationale, it is an Economic Interest Grouping (EIG) that aims to consolidate infrastructure and pool purchases to optimize costs and improve the service quality of the Group’s entire IT production. It also offers value-added infrastructure services (messaging, videoconferencing, etc.) for users and information systems.
BPCE-IT brings together the IT management activities of six software publishers: IT-CE, i-BP, BPCE SA (IT Division), Natixis (financing, payments, and securities), Palatine, and Crédit Coopératif.
As part of the BPCE Group’s “Innov 2020” strategic plan, which seeks to improve collective efficiency by pooling investments and resources, and with regard to IT use, two objectives are clearly defined:
- To industrialize and secure the maximum use of data for performance and functional analytics (both in terms of infrastructure and applications)
- To improve the service offering for partners (publishers, internal clients such as Natixis, etc.) so that they gain flexibility, agility, and performance
To meet these challenges, BPCE-IT processes large volumes of log files related to infrastructure and applications, a source of information that had remained underused. The company chose the Elastic Stack for its capacity to process and analyze large amounts of heterogeneous data in real time. Additionally, Elastic Stack security features control access rights to the data stored in clusters. This has allowed BPCE-IT to secure its data and optimize its infrastructure and management costs.
A standard architecture based on the Elastic Stack has been implemented for optimized processing of vast volumes of data in real time. This system has greatly improved the responsiveness of the teams by allowing for cross-functional analysis of activities and detection of incidents as early as possible to anticipate and better prevent service interruptions or deterioration.
The BPCE Group’s experience with Elastic
The BPCE-IT Architecture and Security Divisions
Two teams have promoted the use of the Elastic Stack within the Group, starting with the open source version and then assessing the solution as a key component of the industrialized service offering orchestrated by BPCE-IT:
- The Architecture and Innovation Division, which is in charge of administering infrastructures as well as creating roadmaps for the transformation, experimentation, and implementation of IT solutions.
- The Information System Security Division, which is in charge of reinforcing SOC activities and of the evolution of SIEM (Security Information & Event Management).
Industrializing the use of log files and creating a multi-business service offer
BPCE-IT is therefore committed to developing and industrializing a range of services for the Group’s internal clients and partners (developers, integrators, various operational teams, and subsidiaries such as Natixis and others). Several challenges inherent to this project led the Group to collaborate with Elastic experts on shared platform objectives. These include the analysis of infrastructure and application log files (remote banking, cybersecurity, Web API, etc.), operational security, performance and optimization of IT operations, as well as centralized administration and support of clusters deployed on demand for various use cases.
Successfully evaluated on an initial 1,200 servers in the Group’s data centers, Elastic alerting features are now fully operational. They make it possible to tune the relevance of alerts according to a pre-established configuration and to identify issues that fly under the radar of the other monitoring solutions in place. Ultimately, the alert system, which currently sends notifications by email, will be integrated into the ticketing platform to manage anomalies, incidents, and requests for assistance.
Ensuring fast and secure access to large volumes of log files stored in a scalable cluster
BPCE-IT needed to find a solution quickly to manage a large set of log files from security equipment on a single platform. The goal was to improve incident response time and gain visibility into the state of the systems. The open source nature of the Elastic Stack, its scalable architecture, and its proven ability to search and analyze large volumes of log files in near real time attracted BPCE-IT’s interest. BPCE-IT therefore decided to purchase a Platinum Subscription with Elastic for its SOC (Security Operations Center) project and to expand log file collection to its entire Information System (IS). The decisive advantage was the Elastic Stack security features and, more specifically, the management of data access rights at scale, which is of strategic interest for BPCE-IT’s operational data. Elastic also integrates easily with a SIEM through Logstash, which ensures a log file collection chain regardless of the SIEM solution used. In addition, the Elastic Stack provides real-time analysis and ad hoc scanning of very large amounts of security data for threat hunting.
The Elastic Stack is very well integrated with our IS. The Stack has allowed us to reclaim log files and obtain real-time visibility on our security platforms, as well as carry out our Threat Hunting activities within the SOC.
Achieving proper operation of banking applications
The centralization of log files permits cross-sectional analysis to follow a transaction from end to end, thus making the incident resolution process more efficient. The nature of the problems encountered, however, is not always identical: some are related to the use of infrastructure while others are related to anomalies in the software delivered by partner publishers. The anomalies revealed by the content of application and technical log files are a valuable source of information to progressively achieve 100% software reliability, especially when it comes to new types of processing.
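As a rough illustration of this kind of cross-layer analysis, the sketch below groups centralized log records by a shared transaction identifier and orders each group in time, so that the first failing layer can be spotted. The field names (`txn_id`, `layer`, `ts`, `status`) and the sample records are hypothetical and do not reflect BPCE-IT's actual log schema or tooling.

```python
from collections import defaultdict

# Hypothetical, simplified log records coming from different application layers.
LOGS = [
    {"txn_id": "T1", "layer": "web", "ts": 1.00, "status": "ok"},
    {"txn_id": "T2", "layer": "web", "ts": 1.10, "status": "ok"},
    {"txn_id": "T1", "layer": "api", "ts": 1.05, "status": "ok"},
    {"txn_id": "T2", "layer": "api", "ts": 1.20, "status": "error"},
    {"txn_id": "T1", "layer": "db", "ts": 1.09, "status": "ok"},
]


def trace_transactions(records):
    """Group records by transaction id and sort each group by timestamp."""
    traces = defaultdict(list)
    for rec in records:
        traces[rec["txn_id"]].append(rec)
    return {txn: sorted(recs, key=lambda r: r["ts"]) for txn, recs in traces.items()}


def first_failure(trace):
    """Return the layer of the first failing step in a trace, or None."""
    for rec in trace:
        if rec["status"] != "ok":
            return rec["layer"]
    return None


traces = trace_transactions(LOGS)
for txn, trace in sorted(traces.items()):
    path = " -> ".join(r["layer"] for r in trace)
    print(txn, path, "first failure:", first_failure(trace))
```

In a real deployment the grouping would more likely be done with a query against the centralized indices; the point here is only that a common identifier across layers is what makes the end-to-end view possible.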
Elasticsearch is used to track the various client services in production, as well as to examine the so-called “out-of-production” activities, namely the acceptance and certification platforms used for software development. Development and maintenance teams, as well as operators, are therefore able to validate the operation of new services and to ensure that there is no regression or negative impact on overall function.
The analysis of application log files, viewed in various forms with Kibana, is also commonly used by various entities. The “Digital Factory,” an entity created as part of the convergence of systems to create a unique system from which all teams can benefit, is an example of an internal client benefiting from the services provided by BPCE-IT. Another example is the new entity 89C3 (BPCE in the Leet Speak language), which is in charge of developing and launching the production of applications related to the Company’s digital transformation.
Thanks to Alerting, we have made gains in terms of responsiveness, especially with regard to atypical response times and HTTP error codes from log files. Depending on the problem identified, the tool automatically sends details of the malfunction to the teams in charge. It also makes it possible to measure performance during scaling through a comprehensive diagnostic provided to the relevant services to improve link chains.
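The two conditions mentioned here, atypical response times and HTTP error codes, can be sketched as a simple rule over parsed access-log records. This is only an illustrative sketch in plain Python, not the Elastic alerting configuration used by BPCE-IT; the record fields, the choice to flag only 5xx statuses, and the `slow_factor` threshold relative to the median are all assumptions.

```python
from statistics import median

# Hypothetical parsed access-log records (status code and duration in ms).
REQUESTS = [
    {"path": "/login", "status": 200, "ms": 120},
    {"path": "/accounts", "status": 200, "ms": 110},
    {"path": "/transfer", "status": 500, "ms": 95},
    {"path": "/accounts", "status": 200, "ms": 130},
    {"path": "/statement", "status": 200, "ms": 2400},
    {"path": "/login", "status": 404, "ms": 100},
]


def find_alerts(requests, slow_factor=5.0):
    """Flag server errors (5xx) and responses far slower than the median."""
    baseline = median([r["ms"] for r in requests])
    alerts = []
    for r in requests:
        if r["status"] >= 500:
            alerts.append((r["path"], "http_error"))
        elif r["ms"] > slow_factor * baseline:
            alerts.append((r["path"], "slow_response"))
    return alerts


for path, reason in find_alerts(REQUESTS):
    print(f"ALERT {reason}: {path}")
```

The median is used as the baseline because a single very slow request barely moves it, whereas it would inflate a mean-based threshold.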
Since the number of Elastic clusters has increased along with the number of clients, a proactive and automated monitoring system has quickly become a necessity to improve responsiveness to indexing problems caused by various associated elements (Beats, Logstash, Kafka, Elasticsearch) of the solution. Elastic Stack alerting features can detect incidents in near real time to restore service and data availability even before operators become aware of such issues.
The alerting features also automate the daily aggregation of the most critical business data and redistribute the data into lighter indices with a longer lifespan. The performance and response times of Kibana dashboards have thus been improved, and the disk space required for certain indices has been reduced 300-fold.
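The idea behind this daily aggregation can be sketched as follows: raw events are collapsed into one summary document per day and per service, and only the summaries are kept long term. The event fields, dates, and the choice of count and average as summary metrics are hypothetical; the source does not describe which metrics BPCE-IT aggregates or how the job is configured.

```python
from collections import defaultdict

# Hypothetical raw events; in practice these would be far more numerous.
EVENTS = [
    {"ts": "2018-11-05T09:15:00", "service": "payments", "ms": 100},
    {"ts": "2018-11-05T17:40:00", "service": "payments", "ms": 300},
    {"ts": "2018-11-05T10:00:00", "service": "login", "ms": 50},
    {"ts": "2018-11-06T08:30:00", "service": "payments", "ms": 200},
]


def daily_rollup(events):
    """Collapse raw events into one summary document per (day, service)."""
    buckets = defaultdict(lambda: {"count": 0, "total_ms": 0})
    for e in events:
        day = e["ts"][:10]  # ISO date prefix, e.g. "2018-11-05"
        bucket = buckets[(day, e["service"])]
        bucket["count"] += 1
        bucket["total_ms"] += e["ms"]
    return [
        {
            "day": day,
            "service": service,
            "count": b["count"],
            "avg_ms": b["total_ms"] / b["count"],
        }
        for (day, service), b in sorted(buckets.items())
    ]


summaries = daily_rollup(EVENTS)
for doc in summaries:
    print(doc)
```

Because each summary replaces every raw event in its bucket, the space saving grows with event volume, which is consistent with the large reduction reported above.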
A business-oriented implementation strategy
Ensuring the operational security of its system with a solution easily integrated with the SIEM was the first step for BPCE-IT. Subsequently, the department began to process and analyze several silos of infrastructure and application log files from different business lines to exploit this data on a large scale. The objective was to build a shared, cross-functional log file analysis platform capable of managing data from a variety of sources.
The integration of log files into the system is now almost always planned for projects. The teams running the applications are now comfortable using the Kibana interface. Automated alerts are regularly implemented to improve responsiveness. BPCE-IT also uses Kibana to produce performance metrics and generate corporate reporting.
For a future service, a POC (Proof of Concept) with a machine learning plug-in has been validated to analyze the stability and use of the workstations in Caisse d’Epargne branches. The aim is, in particular, to anticipate incidents and shorten their resolution time by better identifying their frequency and causes while studying the behavior of the applications. Eventually, all of the Group’s online banking log files could be collected on this platform to study the use of its remote services.
Examples of Dashboards
Securing log file clusters for a diversified service offer on demand
Thanks to the scalability and breadth of Elastic Stack features, BPCE-IT fulfills its mission of implementing a diversified service offering and now strives to promote it while responding to the diverse needs of the Group’s many internal clients and partners.
With the successful deployment of several use cases in production, BPCE-IT is now injecting log files from software and various interfaces of the very active “Digital Factory” into their Elastic clusters. One particular objective is to be able to exploit this data with Elastic machine learning features to establish predictive analytics and to detect code anomalies or application interdependencies.
Application teams regularly ask us for performance or functional analyses to get a better understanding of what is happening in their environment and to make the most of data which is sometimes difficult to use. By industrializing log file processing, we are able to satisfy the variety of requests from our customers who in turn improve flexibility and agility.
BPCE-IT has validated a POC with Elastic Cloud Enterprise (ECE) to facilitate the piloting and implementation of all Elasticsearch clusters from a single console. The goal is also to offer all of the Group’s internal clients a premium service based on access to all the features included in Elastic's Platinum Subscription. BPCE-IT also intends to exploit the data of many open source clusters which will first need to be secured, since users are not always aware of the risks to which their often-sensitive data are exposed.
After operating for more than a year in a “cluster on demand” mode, we plan to deploy ECE into production during the first quarter of 2019 to offer a complete and identical service for all within the Group and to centralize the management of future app developments on this common management platform, while gradually migrating existing deployments.
The rollout of the Elastic Stack is set to continue with other projects planned for 2019. Another POC with machine learning is planned for the first quarter by the Information System Security Division. Its aim will be to improve the detection of banking cyber-fraud and data loss.
These various projects are in line with the Group’s “Innov 2020” strategic plan aimed at improving collective efficiency by industrializing and securing data, as well as enhancing the range of services offered to partners.