
Using ML for protection


The scope of ML usage in cybersecurity is huge, ranging from identifying anomalies and suspicious or unusual behaviour to detecting zero-day vulnerabilities and patching known ones. Dilek et al. [12] presented a comprehensive review of applications of ML techniques in this area.

Revathi and Malathi [13] presented a set of ML algorithms trained on the NSL-KDD intrusion detection dataset for misuse detection. Buczak et al. [14], in turn, focused on ML methods for network intrusion detection.
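As a minimal illustration of this kind of misuse detection, the sketch below trains a random forest on an NSL-KDD-style dataset. The file names, preprocessing and model choice are assumptions made for the example, not details taken from [13].

```python
# Illustrative misuse-detection classifier on NSL-KDD-style data.
# "KDDTrain.csv" / "KDDTest.csv" are assumed local, preprocessed copies of the
# dataset with a "label" column ("normal" vs. attack names) -- hypothetical paths.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

train = pd.read_csv("KDDTrain.csv")
test = pd.read_csv("KDDTest.csv")

# Binary target: 1 for any attack record, 0 for normal traffic.
y_train = (train["label"] != "normal").astype(int)
y_test = (test["label"] != "normal").astype(int)

# One-hot encode categorical features (protocol_type, service, flag).
X_train = pd.get_dummies(train.drop(columns=["label"]))
X_test = pd.get_dummies(test.drop(columns=["label"]))
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```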

Melicher et al. [15] proposed using NNs to check password guessing resistance. They compressed the model to hundreds of kilobytes and developed a client-side JavaScript tool. A similar experiment was conducted by Ciaramella et al. [16], who proactively checked password strength with NNs, namely multilayer perceptrons (MLPs) and single-layer perceptrons (SLPs). Notably, MLPs outperformed SLPs on the test datasets, with the best results obtained by an MLP with 10 layers.
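A minimal sketch of such a proactive check is shown below: a deep MLP classifies passwords as weak or strong from a handful of surface features. The feature set and the toy training data are illustrative assumptions; only the 10-layer depth echoes the result reported in [16].

```python
# Illustrative MLP-based password-strength check (not the exact model from [16]).
import string
import numpy as np
from sklearn.neural_network import MLPClassifier

def features(pw: str) -> list:
    # Simple surface features: length and character-class counts (assumed, not from [16]).
    return [
        len(pw),
        sum(c.islower() for c in pw),
        sum(c.isupper() for c in pw),
        sum(c.isdigit() for c in pw),
        sum(c in string.punctuation for c in pw),
    ]

# Toy training data: 1 = strong, 0 = weak (a real system needs a labelled corpus).
passwords = ["123456", "password", "qwerty", "Tr0ub4dor&3", "xK#9vL!q2mZ$", "correcthorse"]
labels    = [0,        0,          0,        1,             1,              0]

X = np.array([features(p) for p in passwords])
y = np.array(labels)

# A 10-layer MLP, mirroring the depth reported to work best in [16].
clf = MLPClassifier(hidden_layer_sizes=(32,) * 10, max_iter=2000, random_state=0)
clf.fit(X, y)
print(clf.predict([features("MyNewP@ssw0rd!")]))
```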

User and entity behaviour analytics (UEBA) uses ML to analyze behaviour logs and network traffic in real time and to respond appropriately in the event of an attack [17]. The response may involve forcing the user to log in again, blocking the attack, or assessing risk levels and alerting the company's information security officers so that they can take the necessary action.

Many ML and DL methods, such as ensemble learning, clustering and decision trees, are used to detect misuse, anomaly and hybrid cyber intrusions [18].

As mentioned in the Eugene Kaspersky Official Blog [19], Kaspersky detects 99% of cyber threats using ML technology. The time between the detection of suspicious behaviour on a protected device and the release of the corresponding new 'tablet' (i.e. cure) averages about 10 minutes.

DARPA collaborated with BAE Systems to develop a system that allows configuring sensors and applying protective measures 'at machine speed'. This initiative, called the CHASE (Cyber Hunting at Scale) program, seeks to develop automated tools to detect and characterize novel attack vectors, collect the right contextual data, and disseminate protective measures both within and across enterprises [20].

Cyberattacks performed by hacktivists often correlate with public opinion about high-profile news. Information gathered from social media can help predict such incidents using NLP and ML techniques [21].

Moreover, ML can be used to identify the author of a program. Rachel Greenstadt and Aylin Caliskan developed a system that can 'deanonymize' programmers [22] by analyzing source code or compiled binary files [23]. This makes identifying the developer of a malware sample considerably easier.

Another way to monitor systems and networks for malicious activity or policy violations is through an intrusion detection system (IDS). An intrusion prevention system (IPS) extends an IDS: it both detects intrusions and stops the detected incidents. Both kinds of systems use supervised and unsupervised ML techniques to detect point, contextual and collective anomalies [24].
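As a small illustration of the unsupervised case, the sketch below flags point anomalies in network flow records with an isolation forest; the flow features and values are synthetic placeholders rather than details from [24].

```python
# Illustrative unsupervised point-anomaly detection on network flow features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Columns: bytes sent, packets, duration (s) -- hypothetical flow features.
normal_flows = rng.normal(loc=[5000, 40, 2.0], scale=[1000, 10, 0.5], size=(500, 3))
suspicious = np.array([[250000, 900, 0.1]])   # e.g. a burst that looks like exfiltration

model = IsolationForest(contamination=0.01, random_state=0).fit(normal_flows)
print(model.predict(suspicious))              # -1 means the flow is flagged as anomalous
```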

The main task of firewalls [25] is to provide a network security system that monitors and controls incoming and outgoing network traffic. Firewalls allow or block traffic by comparing its characteristics with predefined patterns (i.e. firewall rules). Ucar and Ozhan [26] presented the results of automatic anomaly detection in a firewall rule repository based on ML and high-performance computing methods, such as Naive Bayes, kNN, Decision Table and HyperPipes. All six anomalous rules among the given 93 rules were detected by the system and verified by experts as anomalies. Firewalls filter traffic between servers, and there are also solutions aimed specifically at the content of web applications. A web application firewall (WAF) is deployed in front of web applications; it analyzes bi-directional web-based (HTTP) traffic and detects and blocks anything malicious [27]. A WAF prevents vulnerabilities in web applications from being exploited by outside threats. To implement such functionality in a WAF, developers use regular expressions, tokens, behavioural analysis, reputation analysis and ML technologies [27].
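The sketch below shows the general idea of classifying firewall rules as normal or anomalous from their categorical attributes. The rules, labels and the choice of a kNN classifier in scikit-learn are assumptions made for the example; the original study [26] used Weka implementations of Naive Bayes, kNN, Decision Table and HyperPipes.

```python
# Illustrative firewall-rule anomaly classification in the spirit of [26]:
# rules described by categorical attributes and labelled normal (0) or anomalous (1).
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

rules = pd.DataFrame(
    [
        # src_zone, dst_zone, service, action (invented example rules)
        ("lan", "wan", "http",  "allow"),
        ("lan", "wan", "https", "allow"),
        ("wan", "lan", "ssh",   "deny"),
        ("wan", "lan", "any",   "allow"),   # overly permissive, labelled anomalous
        ("lan", "dmz", "smtp",  "allow"),
    ],
    columns=["src_zone", "dst_zone", "service", "action"],
)
labels = [0, 0, 0, 1, 0]

clf = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),   # unseen attribute values are ignored
    KNeighborsClassifier(n_neighbors=3),
)
clf.fit(rules, labels)

candidate = pd.DataFrame([("wan", "dmz", "any", "allow")], columns=rules.columns)
print(clf.predict(candidate))                 # 1 would mean "looks anomalous"
```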

Among ML methods, predictive ones can also be used for data loss/leak prevention (DLP) to reduce the risk of breaches or leaks [28]. DLP software solutions allow us to set business rules that classify confidential and sensitive information so that it cannot be disclosed maliciously or accidentally by unauthorized end users. This can be done using supervised learning algorithms and two types of examples: positive examples (i.e. content that needs to be protected) and counterexamples (i.e. documents that are similar to the positive set but should not be protected).
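A minimal sketch of this positive-example/counterexample scheme is given below as a TF-IDF text classifier; the documents and the specific model are toy assumptions, not part of [28].

```python
# Illustrative DLP-style classifier: positive examples vs. counterexamples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

positives = [          # content that must be protected
    "customer SSN 123-45-6789 and credit card number on file",
    "confidential: unreleased quarterly revenue figures attached",
]
counterexamples = [    # similar-looking but non-sensitive documents
    "our public quarterly report is now available on the website",
    "please update the customer support phone number on the page",
]

X = positives + counterexamples
y = [1] * len(positives) + [0] * len(counterexamples)

dlp = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
dlp.fit(X, y)

outgoing = "attaching the confidential revenue figures, do not forward"
if dlp.predict([outgoing])[0] == 1:
    print("blocked: message classified as sensitive")
```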


Using ML in cyberattacks


This section describes how cyberattacks can succeed using ML. Automated vulnerability scanning is one of the most obvious and common tasks in a cyberattack. For example, CSRF is found in only 5% of applications, as reported in the 2017 OWASP Top 10, because most frameworks include CSRF defences [29]. Accordingly, Calzavara et al. presented Mitch [30], the first ML-based tool for the black-box detection of CSRF, which allowed the identification of 35 new CSRF vulnerabilities on 20 websites from the Alexa Top 10,000 and three previously undetected CSRF vulnerabilities in production software already analyzed with the state-of-the-art tool Deemon [31]. Mitch is a binary classifier that labels HTTP requests as sensitive or insensitive using a random forest over a 49-dimensional feature space. Compared to the heuristic classifiers BEAP [32] and CsFire [33], Mitch shows the best F1-score and precision (Table 1).
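The sketch below illustrates the shape of such a classifier: a random forest over 49 request features, evaluated with precision, recall and F1. The features and labels are random stand-ins, not Mitch's actual feature extraction [30].

```python
# Illustrative Mitch-style request classifier: random forest over 49 features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score

rng = np.random.default_rng(42)
X = rng.random((1000, 49))                     # 49 request features, as in [30]
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)      # synthetic "sensitive" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(precision_score(y_te, pred), recall_score(y_te, pred), f1_score(y_te, pred))
```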

Marketers use ML methods for profiling. Trustwave released an open source intelligence tool that uses face recognition to automatically track subjects across social media networks [34]. Facial recognition aids this process by removing false positives in the search results, making data review faster for a human operator.

Table 1. Validity measures for the tested classifiers (BEAP, CsFire, Mitch)

Classifier    Precision    Recall    F1
BEAP          0.30         0.89      0.45
CsFire        0.20         0.97      0.33
Mitch         0.78         0.67      0.72

Using collected data about the target, an attacker can hook a victim with specially crafted fake news. ML tools can help identify fake news, but researchers argue that the best way to achieve this is to first teach ML to generate fake news itself [35]. To that end, they created Grover, a model for controllable text generation. Four classes of articles were used in the study: human news, machine news, human propaganda and machine propaganda. Workers on Amazon Mechanical Turk rated each article, including its overall trustworthiness. In the case of propaganda, the score increased from 2.19 (out of 3) for articles created manually to 2.42 for articles created by a machine.

SNAP_R, introduced at DEF CON 24, is the world's first automated end-to-end spear-phishing campaign generator for Twitter [36]. While previous tools were based on Markov chain models, SNAP_R is based on a recurrent NN with an LSTM architecture. Using Twitter as an environment offers some advantages for automatically generating text. For example, the limited length of a post decreases the probability of grammatical errors. Moreover, Twitter links are often shortened, which allows masking of malicious domains. This, in turn, significantly increased the success rate from 5–14% for Markov chain-based tools [37, 38] to 30–66%, which is comparable to the 45% rate for manual spear-phishing [39].
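The core idea, a recurrent language model trained on a target's own posts and then sampled to produce personalised text, can be sketched as follows. The corpus, model size and training loop are toy placeholders and do not reflect the actual SNAP_R implementation [36].

```python
# Illustrative character-level LSTM language model for short post text.
import torch
import torch.nn as nn

corpus = "check out this great security write-up! new post on cloud security"
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, 32)
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h)          # next-character logits at every position

# Training pairs: predict each next character from the prefix so far.
ids = torch.tensor([stoi[c] for c in corpus]).unsqueeze(0)   # shape (1, T)
x, y = ids[:, :-1], ids[:, 1:]

model = CharLSTM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                                         # tiny demo loop
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```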


In most cases, attackers do not know the malware detection algorithm but can figure out which features it uses through carefully designed test cases against the black-box algorithm. MalGAN is a generative adversarial network-based algorithm that generates adversarial malware examples able to bypass black-box ML-based detection models. It can decrease the detection rate to nearly zero and make it hard for the retraining-based defence against adversarial examples to work [40]. The architecture of MalGAN is shown in fig. 3 [40].

Fig. 3. Architecture of MalGAN

The generator takes the malware feature vector and a noise vector and transforms the former into its adversarial version. The substitute detector is used to fit the black-box detector and provide gradient information to train the generator. Both nets are multi-layer feed-forward ANNs. Adversarial examples were tested against black-box detectors trained on 160-dimensional binary feature vectors representing system API calls; the tested detectors included random forest, logistic regression, decision trees, support vector machines and multi-layer perceptrons, as well as a voting-based ensemble of these algorithms. All these classifiers detect over 90% of the original samples, but on adversarial examples even the best results, shown by random forest and decision trees, are below 0.20%. Anti-malware vendors retrain detectors after discovering such undetected examples, but MalGAN needs only one epoch of retraining to bring the true positive rate back to 0%. Kawai et al. later proposed some performance improvements [41].
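A minimal sketch of this architecture is given below: a generator that can only add features to a 160-dimensional binary API-call vector, and a substitute detector that mimics the black-box model. Layer sizes and the noise dimension are assumptions, and training against the black box is omitted.

```python
# Illustrative MalGAN-style architecture (not the authors' code).
import torch
import torch.nn as nn

FEATURES, NOISE = 160, 20   # NOISE size is an assumption for the example

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURES + NOISE, 256), nn.ReLU(),
            nn.Linear(256, FEATURES), nn.Sigmoid(),
        )

    def forward(self, malware, noise):
        perturbation = self.net(torch.cat([malware, noise], dim=1))
        # Features are only ever added, never removed, so the original malware
        # functionality is preserved (element-wise max approximating the binary OR in [40]).
        return torch.maximum(malware, perturbation)

class SubstituteDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURES, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)          # probability that x is malware

gen, sub = Generator(), SubstituteDetector()
malware = torch.randint(0, 2, (8, FEATURES)).float()   # toy batch of feature vectors
adversarial = gen(malware, torch.rand(8, NOISE))
print(sub(adversarial).shape)
```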



Fig. 4. Generator architecture of PassGAN

Fig. 5. Discriminator architecture of PassGAN


Another example use case of GANs in cybersecurity is the password guessing attack. PassGAN is a new way of generating password guesses based on DL and generative adversarial networks [42]. The key difference of this approach is that the NNs do not need a priori knowledge of the structure of passwords, in contrast to approaches based on rules, Markov models [43] and FLA [15]. PassGAN uses the improved training of Wasserstein GANs (IWGAN) of Gulrajani et al. [44] with the Adam optimizer [45]. The generator and the discriminator in PassGAN are built from ResNets [46]. The architectures of the generator and the discriminator are shown in fig. 4 and fig. 5 [42], while the residual block is shown in fig. 6 [42].

Fig. 6. Architecture of a residual block in PassGAN
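A hedged sketch of a PassGAN-style generator built from such residual blocks is shown below. The sequence length, channel width, vocabulary size and residual scaling follow common IWGAN settings and are assumptions rather than values taken from [42].

```python
# Illustrative PassGAN-style generator: latent vector -> 1D-conv residual stack
# -> per-position distribution over password characters (cf. figs. 4 and 6).
import torch
import torch.nn as nn

SEQ_LEN, DIM, VOCAB, LATENT = 10, 128, 96, 128   # assumed sizes

class ResBlock(nn.Module):
    def __init__(self, dim=DIM):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReLU(), nn.Conv1d(dim, dim, kernel_size=5, padding=2),
            nn.ReLU(), nn.Conv1d(dim, dim, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return x + 0.3 * self.block(x)    # scaled residual connection

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(LATENT, SEQ_LEN * DIM)
        self.res = nn.Sequential(*[ResBlock() for _ in range(5)])
        self.out = nn.Conv1d(DIM, VOCAB, kernel_size=1)

    def forward(self, z):
        x = self.fc(z).view(-1, DIM, SEQ_LEN)
        x = self.res(x)
        return torch.softmax(self.out(x), dim=1)   # char distribution per position

fake = Generator()(torch.randn(4, LATENT))          # shape (4, VOCAB, SEQ_LEN)
```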

For maximum effectiveness, attackers will most likely combine several password-cracking tools, such as HashCat [47], John the Ripper [48], PCFGs [49], OMEN [50] and FLA [15], to mix different attack methods. For example, by combining the output of PassGAN with the output of HashCat Best64 [51], researchers were able to guess between 51% and 73% additional unique passwords compared to HashCat [47] alone.
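The 'additional unique passwords' figure amounts to a simple set difference between the guesses of the combined tools and those of HashCat alone, as in the sketch below; the password sets are placeholders.

```python
# Sketch of the "additional unique passwords" metric from the comparison above.
hashcat_hits = {"letmein", "dragon2", "qwerty1"}            # cracked by HashCat Best64
combined_hits = {"letmein", "dragon2", "qwerty1",
                 "m0nkey!!", "sunsh1ne*"}                   # cracked by HashCat + PassGAN

extra = combined_hits - hashcat_hits
print(f"{len(extra)} additional unique passwords "
      f"(+{100 * len(extra) / len(hashcat_hits):.0f}% over HashCat alone)")
```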

Traditional botnets wait for commands from the C&C server, but attackers now use automation to let compromised devices make decisions independently. Fortinet researchers predicted that cybercriminals will replace botnets with intelligent clusters of compromised devices called hivenets, which can leverage peer-based self-learning to target vulnerable systems with minimal supervision [52].

In the initial stages of an attack, attackers often face the challenge of bypassing captchas. Sivakorn et al. [53] designed a low-cost attack that uses DL technologies for the semantic annotation of images. The system requires about 19 seconds per challenge, with an accuracy of 70.78% for reCaptcha [54] and 83.5% for the Facebook image captcha. The system has to automatically identify which of the given images are semantically similar to the