By akademiotoelektronik, 11/12/2022

Lessons from the field: How does artificial intelligence help cybersecurity professionals fight ransomware?

Since the start of the health crisis, companies, public organizations and associations have had to adapt quickly and massively deploy digital tools allowing them to ensure the continuity of their activities.

At the same time, these organizations had to face a real explosion in the number of cyber attacks. According to the French National Agency for the Security of Information Systems (ANSSI), the number of cyber attacks in France quadrupled in 2020, and the attacks are becoming increasingly sophisticated [1].

This figure is largely explained by a lack of awareness of cyber risks, insufficient control over information systems, non-compliance with basic IT hygiene measures, the shortage of cybersecurity experts and, to a certain extent, the larger attack surface created by the widespread adoption of remote work. All of these are weaknesses that cybercriminals exploit [2].

Among these cyber attacks, ransomware is growing particularly fast. Ransomware is malicious software that blocks access to a computer or to files by encrypting them, then demands that the victim pay a ransom to regain access [3].

French companies and public organizations are increasingly falling victim to this type of attack. According to ANSSI, ransomware attacks increased by 255% in just one year and are now the leading threat to businesses and public organizations in France. Many sectors of activity were affected by ransomware in France in 2020 [4]:

Sectors of activity affected by ransomware in France in 2020 - ANSSI

The attacks on the hospitals of Dax [5] and Villefranche-sur-Saône [6] clearly showed how critical this threat is, with serious consequences for care and for patient monitoring.

These cyber attacks also strike the business world, sometimes causing significant losses of revenue and disrupting the production systems of certain industrial companies.

For example, the losses suffered by Sopra Steria, hit by Ryuk [7] in October 2020, were estimated at around 50 million euros [8]. This phenomenon affects all companies, whatever their size and sector of activity: according to the sixth edition of the barometer published by the Club des Experts de la Sécurité de l'Information et du Numérique (CESIN), 20% of large French companies were victims of ransomware in 2020, and 30% of them employ at least 5,000 people [9]. These figures are almost certainly underestimates, but they give a sense of the scale of the phenomenon.

In 2020, a large French industrial group turned to the OpenStudio data science team to understand how cybercriminals had managed to break into the group's computer network and spread Sodinokibi across a large number of servers and workstations.

The Sodinokibi ransomware (also known as REvil and Sodin) was first detected in April 2019, during a ZETA attack [10] exploiting a vulnerability in Oracle WebLogic. It was developed and is marketed by former affiliates of GandCrab (a ransomware strain that first appeared in January 2018) who had bought the source code.

Infection generally involves downloading malware, sometimes hidden in the attachment of a booby-trapped email (phishing) or delivered through a link to a compromised website. A key characteristic of Sodinokibi is its strong ability to evade detection by antivirus systems.

Several indicators suggest that Sodinokibi is of Russian origin. It is sold as RaaS (ransomware as a service) within certain Russian-speaking cybercriminal communities, which gives affiliates the possibility of creating and distributing their own ransomware [11].

As part of a 3-day hackathon, the OpenStudio data science team set out to process and analyze millions of logs (event records) from the antivirus and the firewall of this large industrial group. The objective of the hackathon was to analyze the logs in order to retrace the attack and identify the points of vulnerability.

Faced with large volumes of data and with no precise idea of the form the attack might have taken, the OpenStudio data science team judged that artificial intelligence could be a suitable way to detect abnormal and unusual events.

Using unsupervised learning models, underlying structures were discovered in unlabeled data, making it possible to select suspicious logs to be analyzed by systems, network and cybersecurity experts.

Based on the scientific literature, it was clear to the OpenStudio data science team that the K-means algorithm (K-moyennes in French) was relatively well suited to detecting anomalies in network data [12]. This approach can also be automated in order to detect intrusive activity on computer systems and networks in real time [13] [14].

As a reminder, the K-means algorithm analyzes a dataset characterized by a set of descriptors in order to gather "similar" data into groups (or clusters). The similarity between two data points is inferred from the "distance" separating their descriptors [15].
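To make the notion of "distance" between descriptors concrete, here is a minimal sketch. It assumes Euclidean distance, which is the usual choice for K-means, and two purely hypothetical descriptors per log record (events per minute and number of distinct ports contacted); the article does not say which descriptors or which metric were actually used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical descriptors: (events per minute, distinct ports contacted).
record_a = (12.0, 2.0)
record_b = (14.0, 3.0)   # close to record_a
record_c = (90.0, 40.0)  # far from record_a

d_ab = euclidean(record_a, record_b)
d_ac = euclidean(record_a, record_c)

# A smaller distance means the two records are considered more similar.
print(d_ab < d_ac)
```

Under these assumptions, records a and b would tend to end up in the same cluster, while record c would not.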

The K-means algorithm made it possible to create two clusters: one corresponding to the data structures (or patterns) that make up the majority of the logs, and the other to minority data structures that can be considered unusual and abnormal.
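The two-cluster idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the team's implementation: the data points are synthetic, the two descriptors are hypothetical, and the helper names `kmeans` and `minority_cluster` are invented for the example.

```python
import math
import random
from collections import Counter

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels

def minority_cluster(points, labels):
    """Return the points belonging to the smallest cluster."""
    counts = Counter(labels)
    smallest = min(counts, key=counts.get)
    return [p for p, lab in zip(points, labels) if lab == smallest]

# Synthetic data: 20 "usual" records and 3 "unusual" ones (hypothetical descriptors).
usual = [(10 + i % 5, 2 + i // 5) for i in range(20)]
unusual = [(90 + i, 40 + i) for i in range(3)]
points = usual + unusual

centroids, labels = kmeans(points, k=2)
suspicious = minority_cluster(points, labels)
print(suspicious)
```

In this toy setting the minority cluster is what would be handed to the experts for review; on real logs, a small cluster is only a candidate for investigation, not proof of an attack.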

Visualization of the clusters created by the K-means algorithm

The data structures considered abnormal were then analyzed by systems, network and cybersecurity experts in order to assess how dangerous they were.

To characterize the clusters created by the K-means algorithm, the OpenStudio data science team then used the Random Forest algorithm (a forest of decision trees). Trained in a supervised manner on a newly labeled dataset, this algorithm [16] made it possible to identify the variables that discriminate between the clusters.
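The principle of ranking variables by how well they separate the clusters can be illustrated with a deliberately simplified stand-in for a Random Forest: a collection of one-split trees (decision stumps), each trained on a bootstrap sample and a random subset of variables, with importance measured as the average decrease in Gini impurity. The three variables and the synthetic data below are hypothetical, and a real Random Forest would grow much deeper trees.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_stump(rows, labels, features):
    """Best single split among `features`: (feature, threshold, impurity decrease)."""
    parent = gini(labels)
    best = (None, None, 0.0)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if parent - child > best[2]:
                best = (f, t, parent - child)
    return best

def stump_forest_importance(rows, labels, n_trees=200, seed=0):
    """Normalized impurity decrease per variable over bootstrapped stumps."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    importance = [0.0] * n_features
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]  # bootstrap sample
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        subset = rng.sample(range(n_features), max(1, int(n_features ** 0.5)))
        feature, threshold, gain = best_stump(sample_rows, sample_labels, subset)
        if feature is not None:
            importance[feature] += gain
    total = sum(importance) or 1.0
    return [v / total for v in importance]

# Hypothetical variables: events per minute, distinct ports, hour of day.
names = ["events_per_minute", "distinct_ports", "hour_of_day"]
rows, labels = [], []
for i in range(40):
    is_unusual = i % 8 == 0  # 5 records out of 40 carry the "unusual" cluster label
    ports = (40 + i % 3) if is_unusual else (2 + i % 4)
    rows.append((10 + i % 7, ports, i % 24))
    labels.append(1 if is_unusual else 0)

importance = stump_forest_importance(rows, labels)
for name, score in sorted(zip(names, importance), key=lambda x: -x[1]):
    print(name, round(score, 3))
```

In this synthetic example only `distinct_ports` truly separates the two groups, so it should come out on top; the other two variables only pick up small, spurious gains.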

Representation of the importance of variables in clustering

Data partitioning models based on unsupervised learning, such as the K-means clustering algorithm, made it possible to identify unusual and abnormal data structures. Once all of these structures had been labeled by systems, network and cybersecurity experts, it became possible to build a labeled dataset and to identify the variables that discriminate between these structures using supervised learning models such as the Random Forest algorithm.

Building a labeled dataset that includes data structures from proven or simulated attacks, and automating these artificial intelligence tools, could enable genuine monitoring of systems in near real time and alert the internal teams in charge of cybersecurity to potential attacks.
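As a rough idea of what such automation might look like, the sketch below scores incoming records one by one and raises an alert when a record is closer to an "unusual" reference point than to a "usual" one. Everything here is an assumption made for illustration: the centroids, the record identifiers, the two descriptors and the `monitor` helper are hypothetical, and a supervised model trained on the labeled dataset could take the place of the distance test.

```python
import math

def monitor(stream, usual_centroid, unusual_centroid):
    """Yield (record_id, descriptors) for records nearer to the unusual centroid."""
    for record_id, descriptors in stream:
        to_unusual = math.dist(descriptors, unusual_centroid)
        to_usual = math.dist(descriptors, usual_centroid)
        if to_unusual < to_usual:
            yield record_id, descriptors

# Hypothetical centroids, e.g. kept from an earlier clustering run.
usual_centroid = (12.0, 3.5)
unusual_centroid = (91.0, 41.0)

# Hypothetical incoming records: (identifier, descriptors).
incoming = [
    ("log-001", (11.0, 3.0)),
    ("log-002", (88.0, 37.0)),
    ("log-003", (13.0, 4.0)),
]

alerts = list(monitor(incoming, usual_centroid, unusual_centroid))
print(alerts)
```

Because `monitor` is a generator, it can consume records as they arrive rather than waiting for a complete batch, which is the property that matters for near real-time surveillance.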

Given the revenue generated by ransomware attacks and the growing number of attackers, made easier by the RaaS model, it is clear that the ransomware phenomenon will continue to grow in the years to come.

Given the scale and sophistication of this type of cyber attack, artificial intelligence appears increasingly necessary to help cybersecurity experts detect attacks whose consequences can be extremely damaging in the real world.

Kévin Cortial, Data Scientist at OpenStudio.

Jean-Luc Marini, Director of the AI Lab and of the OpenStudio Lyon agency.

  1. “Les cyberattaques ont été multipliées par quatre en 2020”, Zoom Sectoriel – Le chiffre, Bpifrance, 21 May 2021, https://www.bpifrance.fr/A-la-une/Actualites/Les-cyberattaques-ont-ete-multipliees-par-4-en-2020-52306
  2. “L’ANSSI et le BSI alertent sur le niveau de la menace cyber en France et en Allemagne dans le contexte de la crise sanitaire”, ANSSI, 17 December 2020, https://www.ssi.gouv.fr/actualites/
  3. “Les rançongiciels (ransomwares)”, Cybermalveillance.gouv.fr, 20 November 2019, https://www.cybermalveillance.gouv.fr/tous-nos-contenus/fiches-reflexes/rancongiciels-ransomwares
  4. “Cybersécurité, faire face à la menace : La stratégie française”, ANSSI, 18 February 2021, https://www.ssi.gouv.fr/actualites/
  5. “L’hôpital de Dax en partie paralysé par une attaque informatique”, Le Monde, 10 February 2021, https://www.lemonde.fr/pixels/article/2021/02/10/l-hopital-de-dax-en-partie-paralyse-par-une-attaque-informatique_6069430_4408996.html
  6. “Après celui de Dax, l’hôpital de Villefranche paralysé par un rançongiciel”, Le Monde, 15 February 2021, https://www.lemonde.fr/pixels/article/2021/02/15/apres-celui-de-dax-l-hopital-de-villefranche-paralyse-par-un-rancongiciel_6070049_4408996.html
  7. Ransomware-type malware first observed in August 2018.
  8. “Ransomware : Ryuk aurait empoché plus de 150 millions de dollars”, ZDNet, 8 January 2021, https://www.zdnet.fr/actualites/ransomware-ryuk-aurait-empoche-plus-de-150-millions-de-dollars-39915797.htm
  9. “Au moins 20% des entreprises françaises ont subi une attaque par rançongiciel l’an passé”, BFM Business, 10 February 2021, https://www.bfmtv.com/economie/au-moins-20-des-entreprises-francaises-ont-subi-une-attaque-par-rancongiciel-l-an-passe_AN-202102100290.html
  10. A ZETA (Zero Day Exploit Attack) is a targeted cyber attack based on a zero-day vulnerability, occurring on the very day a weakness is detected in a piece of software. The weakness is exploited before the software vendor makes a patch available.
  11. “Etat de la menace rançongiciel à l’encontre des entreprises et des institutions”, 4.2, CERT ANSSI, 1 March 2021, https://www.cert.ssi.gouv.fr/uploads/CERTFR-2021-CTI-001.pdf
  12. Münz, G., Li, S., & Carle, G. (2007). Traffic Anomaly Detection Using K-Means Clustering. https://www.semanticscholar.org/paper/Traffic-Anomaly-Detection-Using-K-Means-Clustering-Münz-Li/634e2f1a20755e7ab18e8e8094f48e140a32dacd
  13. Gu, Y., Li, K., Guo, Z., & Wang, Y. (2019). Semi-Supervised K-Means DDoS Detection Method Using Hybrid Feature Selection Algorithm. IEEE Access, 7, 64351-64365. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8717648
  14. Kumari, R., Sheetanshu, Singh, M. K., Jha, R., & Singh, N. K. (2016). Anomaly detection in network traffic using K-means clustering. 2016 3rd International Conference on Recent Advances in Information Technology (RAIT), 387-393. https://ieeexplore.ieee.org/document/7507933
  15. “K-means (ou K-moyennes)”, DAP (Data Analytics Post), https://dataanalyticspost.com/Lexique/k-means-ou-k-moyennes/
  16. “Random Forest”, DAP (Data Analytics Post), https://dataanalyticspost.com/Lexique/random-forest/