TY - JOUR
AU - Asad, Muhammad
AU - Asim, Muhammad
AU - Javed, Talha
AU - Beg, Mirza O.
AU - Mujtaba, Hasan
AU - Abbas, Sohail
AB - Abstract With the advent of advanced wireless technology and contemporary computing paradigms, Distributed Denial of Service (DDoS) attacks on Web-based services have increased exponentially not only in number but also in sophistication; hence, detecting these attacks within the ocean of communication packets is extremely important. DDoS attacks were initially projected toward the network and transport layers. Over the years, attackers have shifted their offensive strategies toward the application layer. Application layer attacks are potentially more detrimental and stealthier because the attack traffic and the benign traffic flows are indistinguishable. The distributed nature of these attacks makes them difficult to combat, as they may affect tangible computing resources in addition to consuming network bandwidth. Moreover, smart devices connected to the Internet can be infected and used as botnets to launch DDoS attacks. In this paper, we propose a novel deep neural network-based detection mechanism that uses feed-forward back-propagation to accurately discover multiple application layer DDoS attacks. The proposed neural network architecture can identify and use the most relevant high-level features of packet flows with an accuracy of 98% on a state-of-the-art dataset containing various forms of DDoS attacks.

1. Introduction

The rapid expansion of communication infrastructure and online services brings with it a very real threat of denial of service to Internet users. Distributed Denial of Service (DDoS) attacks are one of the biggest challenges that security researchers and analysts face today and have grown into an existential threat for every Internet-centric organization.
A DDoS attack can be defined as an attack in which multiple compromised, orchestrated computer systems attack a target, such as a business server, website or other network resource, and can cause serious damage to a company's operability and availability. The flood of incoming messages, connection requests or malformed packets originating from these infected machines forces the target system to slow down or completely shut down, thereby denying services to legitimate users or systems. This network of infected machines, together with its command-and-control infrastructure, is called a botnet [1, 2]. As per Kaspersky Lab's research [3], in the second quarter of 2016, the longest DDoS attack lasted 291 hours, driven by botnets built from a large number of vulnerable Internet of Things (IoT) devices [4, 5]. The DDoS attack on the servers of Dyn, considered one of the world's biggest DNS providers, crippled major Web-based services such as Netflix, Twitter, Amazon and PayPal. The attack was launched through a network of interconnected devices connected to the Internet (from home routers to digital video recorders) infected with a special malware known as the 'Mirai' botnet, and the target servers were bombarded with traffic until they collapsed under the strain [6, 7]. Previously, DDoS attacks were mostly carried out against network and transport layer protocols, e.g. via ICMP, UDP and SYN flooding [8]. The intent of these attacks is to overwhelm the victim's network bandwidth and available resources with unwanted traffic in order to make it difficult or impossible for legitimate users to access them. Sufficient research has already been done to detect and mitigate network layer DDoS attacks [8, 9]. As a result, in recent years, attackers have shifted their offensive strategies toward the application layer [10].
Exploiting the vulnerabilities of application layer protocols, attackers using DDoS attacks are capable of creating the same level of impact as traditional network layer flooding DDoS attacks, at a much lower cost. In addition to the flooding pattern, application layer DDoS attacks also tend to consume server resources such as sockets, CPU, memory, disk/database bandwidth and I/O bandwidth [1]. Application layer DDoS attacks are perceived as stealthy, sophisticated and tougher to detect, as the attack traffic resembles genuine client traffic. In this paper, we present an Artificial Neural Network (ANN)-based DDoS detection method. Neural networks are information-processing models inspired by the human nervous system [11]. A neural network consists of numerous highly interconnected processing nodes (called neurons) that work simultaneously to solve a specified problem [12]. Neural networks learn by example, as humans do. These examples need to be selected carefully; otherwise training time is wasted or the network may work improperly. With their extraordinary ability to derive meaning from complex and indefinite data, neural networks can recognize and detect patterns that are exceptionally complicated to observe or detect by humans and even by traditional techniques [13]. Typically, an ANN has three types of layer: input, output and hidden. At the input layer, an ANN receives a sequence of normalized values of packet flow attributes representing normal or DDoS communication. The output signals from the first layer trigger the first hidden layer, activating the cells that recognize micro patterns. The subsequent hidden layers act as a bridge and learn to recognize increasingly coarse patterns, which by the final layer are coarse enough to classify flows as attack or benign.
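To make the layer mechanics above concrete, the computation performed by a single hidden unit (a weighted sum of its inputs passed through an activation) can be sketched as follows; the weights and input values are made up for illustration and are not taken from the paper:

```python
def relu(z):
    # ReLU activation: negative pre-activations are clamped to zero.
    return max(0.0, z)

def neuron(inputs, weights, bias):
    """One hidden unit: an affine combination of inputs followed by a ReLU."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(z)

# Illustrative normalized flow attributes and made-up "learned" weights.
flow = [0.8, 0.1, 0.5]
print(round(neuron(flow, [0.4, -0.2, 0.3], bias=-0.1), 2))  # 0.35: this micro pattern fires
print(neuron(flow, [-0.5, 0.2, -0.4], bias=0.0))            # 0.0: clamped by ReLU, does not fire
```

A hidden layer is simply many such units evaluated in parallel on the same inputs; the output layer replaces the ReLU with a softmax over the class scores.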
In this paper, we propose a deep neural network architecture, consisting of seven hidden layers and based on feed-forward back-propagation, that models multiple application layer DDoS attacks. The proposed approach protects Web-based services from application layer DDoS attacks by routing traffic through the proposed system and detecting malicious behavior. The malicious behavior of packets is detected by using pre-learned patterns within the application. The system can also detect malicious behavior in packets when an entirely new malicious pattern is being used; those patterns then serve as a secondary dataset to train the neural networks in the proposed system. Furthermore, we used the Canadian Institute for Cybersecurity dataset (CIC IDS 2017; a labeled dataset) [14] to train our system before validating our model for accuracy. The objectives of our research are twofold. Our work attempts the following: (i) apply a state-of-the-art deep learning technique to distinguish between benign network flows and DDoS attack flows; (ii) evaluate our model on realistic Internet traffic using accuracy as a performance metric. In order to achieve these objectives, this article makes the following contributions: a novel deep neural network that uses a feed-forward back-propagation architecture with seven hidden layers for classifying network flows as attack or normal flows; a technique for automatically tuning the hidden layers of the ANN to iteratively detect macro patterns in network flows; and a thorough evaluation of the proposed model using the state-of-the-art CIC IDS 2017 dataset for DDoS detection. The rest of the paper is organized as follows. Section 2 discusses some of the major application layer DDoS attacks. Section 3 gives an overview of the related literature. Our proposed deep learning methodology and detailed service architecture are described in Section 4. Section 5 elaborates the experiments on the CIC IDS 2017 dataset, giving a thorough validation of our model.
Finally, Section 6 concludes our work.

2. Background

In this section, we discuss some of the major application layer DDoS attacks. Broadly, there are three basic categories of DDoS attacks: volume-based attacks, which include UDP, ICMP and other spoofed-packet attacks; protocol-based attacks, which include SYN floods, fragmented packet attacks and smurf DDoS; and application layer attacks, which include low-and-slow attacks and GET/POST attacks. Volume-based attacks saturate the bandwidth of the targeted site, while protocol-based attacks consume actual server resources or those of intermediate communication equipment such as firewalls and load balancers. On the contrary, application layer attacks, along with the flooding pattern, consume the Web servers' resources. These attacks are difficult to detect since they comprise apparently legitimate requests.

2.1. Application layer DDoS attacks

Among the most important application layer DDoS attacks, our proposed work focuses on SlowHTTPTest, GoldenEye and Hulk. SlowHTTPTest: This is a powerful tool for launching attacks of the slow DoS category. With the SlowHTTPTest tool, three attacks can be performed: Slow Header, Slow Read and Slow Post. The following briefly describes these three attacks. Slow Header attack (Slowloris): In considering the ramifications of DoS attacks against particular services rather than flooding networks, a new concept emerged: allowing a single machine to take down another machine's Web server [8, 15] with minimal bandwidth and few side effects on related services and ports, while leaving all the victim Web server's remaining services intact. Slowloris was born from this concept, which is why it is relatively stealthy and dangerous in nature. Slowloris holds connections open by sending partial HTTP requests. It continues to send subsequent headers at regular intervals to keep the sockets from closing.
In this way, Web servers can be quickly tied up [16]. Slowloris makes a full TCP connection, not a partial one; however, it makes partial HTTP requests. This is not a TCP-based DoS attack, since it is the request that causes the DoS [15]. Slow Read attack: Slow Read is an attack that starts with a legitimate HTTP request from an attacker to a victim server, followed by a slow consumption of the HTTP response sent by the victim. As a result, the server keeps the connection open for a long period of time, resulting in a DoS attack. The success of the attack lies in the ability to maintain a number of active concurrent connections with a server. In Slow Read attacks, the TCP windowing functionality (used by the data receiver for flow control) is exploited to deplete the Web server's resources. At the beginning of the attack, the attacker sets the TCP window size (the number of bytes a receiver is willing to receive) to a relatively small quantity. As a result, the server assumes that the client is busy reading data and is forced to keep the connection open for a longer period of time. Many such requests can make the service unresponsive to genuine requests [17]. Slow HTTP POST attack (R-U-Dead-Yet): Also known as R-U-Dead-Yet (RUDY), this attack is very similar to Slowloris in that it sends HTTP POST commands slowly to bring down Web servers. The attacker sends a complete HTTP header that defines the 'content-length' field of the POST message body, just as it would be sent for benign traffic. It then sends the data to fill the message body at a rate of 1 byte every 2 minutes. Hence, the server waits for each message body to be completed; as the number of such slow POST connections grows, the Web server is effectively flooded [8, 18]. GoldenEye: GoldenEye is an HTTP or HTTPS application layer DoS testing tool.
GoldenEye uses a command line interface that leverages HTTP Keep-Alive connections paired with cache-control headers (Connection: Keep-Alive + cache) to consume all of the available sockets on the HTTP or HTTPS server. It uses random referrers and user agents to overwhelm the server's resources. The GoldenEye HTTP-based DDoS attack tool generates a series of random-sized packets within a specified range, and it is very difficult to predict its exact behavior and traffic-generation pattern [19, 20]. Hulk: Hulk (HTTP Unbearable Load King) is another application layer DDoS attack tool; it generates a unique HTTP GET request for each request, with randomly generated headers and URL parameter values. It uses referrer forgery and can bypass caching engines by directly hitting the server's resource pool [16]. Hulk has the ability to take down a server in a minute, as it directly affects the server's load. It generates TCP SYN floods and multi-threaded HTTP GET flood requests. It can hide the actual user agent and can send different patterns of attack requests, obfuscating the referrer for each request [20].

3. Related Work

After a comprehensive exploration of the available literature, we found that information theory, statistical models and machine learning are the three leading methods that form the basis of the majority of present-day detection techniques [21, 22] that concentrate on application layer DDoS attacks. Most of the existing mechanisms include attack prevention, detection and reaction. Attack prevention tries to filter incoming and outgoing traffic before the attack causes any damage. Attack reaction aims to minimize the loss caused by DDoS attacks. Existing approaches for DDoS attack detection include statistical methods and machine learning methods. Most of the statistical model-based approaches focus on improving the accuracy of intrusion detection [23, 24] and shortening detection time [23]. FIGURE 1.
DeepDetect neural network architecture for detecting DDoS attacks. On the contrary, the machine learning detection methods proposed for application layer DDoS attacks are based on anomaly detection and classification [25]. In addition, existing techniques are unable to distinguish whether abnormal network traffic is caused by genuine users or by DDoS attacks [26]. In [27], a DDoS detection system used a big data platform integrated with neural networks; the detection system is built on the open source big data computing framework Apache Spark and implemented using the R language. In [28], the authors found that Naive Bayes can perform very well when moderate dependencies exist in the data. That paper showed that the performance of the Naive Bayes classifier improves when redundant features are removed. In [29], a preliminary analysis of ADFA-LD is performed in an attempt to extract useful information for developing new host-based anomaly detection systems. A few typical features, such as length, common pattern and frequency, are analyzed against ADFA-LD. The experimental results show that acceptable performance can be achieved for a few types of attacks. In [25], a text-mining approach has been proposed to extract features that represent a user's HTTP request sequence using bigrams. The One-Class Support Vector Machine (SVM) algorithm is applied to the features extracted from normal users' HTTP request sequences. The One-Class SVM labels any newly seen instance that deviates from the normal trained model as an application layer DDoS instance. Yao et al. [30] presented a deep graph feature learning framework called DeepGFL, which extracts higher-order network flow features from lower-order (raw) features, forming a hierarchical graph representation to detect network attacks.
They proposed a graph-based feature learning algorithm to represent network flow relationships and a feature evaluation routine to choose the important features exposing the different patterns between benign and attack network flows. They adopted CICFlowMeter [14] to extract raw traffic. Experimentation was performed to evaluate the proposed method using random forest. We also compare our results with DeepGFL in Section 5. Jiang et al. [31] used back-propagation neural networks for anomaly detection, combining both the traffic features and the user behavior features extracted from Web server logs into a hybrid two-layer detection structure. The CIC IDS 2017 dataset is used for performance evaluation. However, the paper does not provide comprehensive results for individual attacks. The authors in [32] presented an approach to identify DDoS traffic with NetFlow feature selection and machine learning. They also used the CIC IDS 2017 dataset for the evaluation, along with real-world NetFlow logs provided by a large ISP, China Unicom. The experimentation was performed using random forest as the detector, and the results on CIC IDS 2017 show an accuracy of 99%. However, they use accuracy as the performance metric, which is not suitable for evaluating DDoS classifiers on imbalanced data. Vijayanand et al. [33] proposed an IDS with genetic algorithm-based feature selection and multiple SVM classifiers for wireless mesh networks. They used two standard intrusion datasets, ADFA-LD and CIC IDS 2017, for the evaluation process. However, their work focuses on different attacks such as jamming, black hole and grey hole attacks. Table 1. Training and validation loss values.
                     Dropout rate = 0.4   Dropout rate = 0.2
Training loss        0.1397               0.0583
Training F1 score    0.9581               0.9821
Validation loss      0.0814               0.0402
Validation F1 score  0.9709               0.9866

4. DeepDetect

In this section, we discuss our proposed scheme, called DeepDetect, which is an ANN-based DDoS detection technique. Typical ANN architectures consist of input, hidden and output layers; the input layer is fed with the patterns representing the characteristics of network flows, whereas the output layer classifies the flows as either benign or one of the aforementioned attacks. The hidden layers deal with the intermediate patterns contained within the flow in order to assist in the classification computation. FIGURE 2. Training loss at learning rate 0.01 and dropout rate 0.4, and at learning rate 0.001 and dropout rate 0.2. FIGURE 3. Service architecture: we structure our security-as-a-service architecture to subsume FloWatcher, FlowParser, the classifier and the alert generator. Table 2. Selected numerical features of a few sample flow records from the CIC IDS dataset.
Active mean  Active std  Active max  Active min  Idle mean      Idle std  Idle max       Idle min       Label
1997.0       0.0         1997.0      1997.0      85 800 000.0   0.0       85 800 000.0   85 800 000.0   Hulk
102 973.6    41 267.5    176 795.0   84 438.0    10 000 000.0   12 914.7  10 000 000.0   9 997 640.0    Benign
997.0        0.0         997.0       997.0       97 800 000.0   0.0       97 800 000.0   97 800 000.0   Hulk
1991.0       0.0         1991.0      1991.0      101 000 000.0  0.0       101 000 000.0  101 000 000.0  Hulk

For DeepDetect, we propose an ANN based on a feed-forward back-propagation architecture, as shown in Fig.
1, which includes the following:
- 1 input layer, for the selected 66 features and a bias factor;
- 7 hidden layers, initializing synaptic weights and connections;
- 1 output layer, yielding probabilities for the 5 classes (benign, DoS Slowloris, DoS SlowHTTPTest, DoS Hulk and DoS GoldenEye).
The size of the input layer has been chosen according to the number of selected features of the network flow. The number of neurons in the output layer is equal to the number of classes into which we classify the flows. The seven hidden layers have been carefully selected to represent the complicated sub-patterns and their combinations detected by each subsequent layer. The size of each hidden layer has been carefully tuned based on the number of combinations of sub-patterns at that particular intermediate stage. Table 3. A couple of categorical features for a sample of flow records in the CIC IDS dataset.

S#      Dest port  Flow duration  Label
324200  80         11016          DoS Hulk
496726  53         148139         BENIGN
475353  443        61238152       BENIGN
152979  80         117509395      DoS Hulk
471398  443        5189647        BENIGN
549215  53         354010         BENIGN

Table 4. Training system specifications.
System component  Value
CPU platform      2.5 GHz Intel Xeon E5 v2 (Ivy Bridge)
GPU               NVIDIA Tesla K80
CPU cores         4
GPU memory        24 GB of GDDR5
RAM               26 GB

In the proposed architecture, the information moves in only one direction, i.e. forward, from the input nodes through the hidden nodes to the output nodes. There are no cycles or loops in the network (see Fig. 1). One of the novelties of the proposed ANN architecture is the use of batch normalization, in which we normalize the input layer by adjusting and scaling the activations. Batch normalization is also applied to the subsequent hidden layers, with a batch size of 1024, to improve the learning rate (training). A ReLU nonlinearity with a dropout rate of 0.2 has been used in each hidden layer after each affine transformation. This controls the impact of the inaccuracy of each hidden layer on the output. The first layer consists of 66 neurons that are stacked vertically [36] and fed to a network of 8 fully connected layers, as shown in Fig. 1. The final layer has an output dimension equal to the number of classes, i.e. five. The probabilities at the final (output) layer range between 0 and 1, and the class with the highest probability value is selected as the label for the corresponding input values. Identifying the appropriate number of neurons in each hidden layer is considered a challenging task.
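As a rough illustration of the forward pass described above (66 input features flowing through seven fully connected hidden layers with ReLU activations into a five-way softmax), here is a minimal NumPy sketch. The exact hidden-layer widths are our assumption for illustration only; the paper states that sizes change by a factor of 2 between layers but does not list them, and batch normalization and dropout are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layer widths: 66 inputs, 7 hidden layers, 5 output classes.
layer_sizes = [66, 128, 256, 512, 256, 128, 64, 32, 5]

weights = [rng.normal(0, 0.1, (m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def softmax(z):
    # Numerically stable softmax over the class dimension.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    """Forward pass: affine transform + ReLU for hidden layers, softmax at the output."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ W + b)          # ReLU non-linearity
    return softmax(x @ weights[-1] + biases[-1])

flows = rng.random((4, 66))                      # 4 normalized flow records
probs = forward(flows)
print(probs.shape)                               # (4, 5): one probability per class per flow
```

Each row of `probs` sums to 1, and the class with the highest probability is taken as the predicted label, matching the decision rule described above.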
As a good starting point, we adopted the strategy of increasing and decreasing the number of neurons by a factor of 2 in subsequent layers, to make the input data linearly separable. Initially, the learning rate was set to 0.01, which yielded significant fluctuations, as shown in Fig. 2a; with a dropout rate of 0.4, the training loss only went down to 0.1397, as shown in Fig. 2b. To make the training loss more uniform, we reduced the learning rate to 0.001 and the dropout rate to 0.2; the network was underperforming at a dropout rate of 0.4 (see Table 1). We selected the reduced dropout rate of 0.2 by trial and error. Furthermore, we selected the softmax cross-entropy loss function as appropriate for the problem at hand.

4.1. Proposed service architecture

Countering DDoS is becoming more challenging as vast amounts of resources and techniques become increasingly available to hackers [8]. The proposed approach is deployed as a Web service in the cloud to protect websites from application layer DDoS attacks. Attacks and failures are inevitable, which is why it is important to understand the cloud environment under attack and plan for detection. Figure 3 illustrates the architecture of our approach, which consists of five modules that are briefly discussed below. CICFlowMeter: This is a network traffic flow generator written in Java. It offers flexibility in selecting features among its 80 statistical features, adding new ones and controlling the duration of the flow timeout. Within our infrastructure, CICFlowMeter runs as a separate service that captures network flows and extracts relevant features, which are written into respective directories to be analyzed. In a deployment of DeepDetect, the network traffic flows through CICFlowMeter, which extracts the relevant features from the flow for use by the classifier in DeepDetect. FloWatcher: This is a real-time flow-based monitor for 10 Gbit Ethernet [34].
It watches a directory for new flow files; whenever a flow file is created or changed, it extracts the flows and passes them to FlowParser. FlowParser: The JavaScript parser further filters the flow data received from FloWatcher. After filtering the noise from the flow data, the filtered data are sent on to the neural network classifier. Classifier: The neural network classifier in our infrastructure is pre-loaded with the trained model. It predicts labels for the flow data. Alert Generator: On the basis of the predictions made by the classifier, the alert generator generates alerts and forwards them to monitoring services. The alert generator also sends a captcha as a response: the traffic will be marked okay only if the client is able to solve the captcha; otherwise it will be blocked.

5. Evaluation and Experiments

The Canadian dataset (CIC IDS 2017), which contains approximately 80 features, has been used while designing our network architecture. Features in the dataset fall into two categories, categorical and numerical; the CIC IDS 2017 dataset contains mostly numerical features. The dataset was last modified on 7 June 2018 at 16:38. With the modifications, the shape of the dataset changed from 80 to 79 features, with 692 703 flows. Categorical features such as SrcIP, DestIP, SrcPort and DestPort are discarded to avoid overfitting of the classifier on particular addresses and ports. In addition, a few numeric features with max(f) - min(f) < ε, with ε = 0.00001, do not have enough variance to be significant for classification and have also been discarded. CIC IDS 2017 records 5 days of network traffic, both legitimate and malicious. The capture period started on Monday, 3 July, at 9:00 am.
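The feature-pruning rule described above (discard the categorical identifier features, then drop numeric features whose range falls below ε) can be sketched as follows; the feature names and sample values below are illustrative, not taken from the dataset:

```python
EPSILON = 0.00001
# Identifier features dropped to avoid overfitting on particular addresses and ports.
CATEGORICAL = {"SrcIP", "DestIP", "SrcPort", "DestPort"}

def select_features(columns):
    """columns: dict mapping feature name -> list of numeric values (None for categorical)."""
    kept = {}
    for name, values in columns.items():
        if name in CATEGORICAL:
            continue                               # discard address/port identifiers
        if max(values) - min(values) < EPSILON:
            continue                               # near-constant: too little variance
        kept[name] = values
    return kept

sample = {
    "SrcIP": None,                                 # categorical, dropped before values are read
    "FlowDuration": [11016.0, 148139.0, 354010.0], # varies, kept
    "FwdURGFlags": [0.0, 0.0, 0.0],                # zero variance, dropped
}
print(sorted(select_features(sample)))             # ['FlowDuration']
```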
During the 5-day recording, the DDoS attacks happened on Wednesday, 5 July 2017, in the morning and in the afternoon. We extracted the network traffic from Wednesday morning, since the DoS attacks (Slowloris, SlowHTTPTest, Hulk and GoldenEye) were launched on Wednesday morning. The Canadian dataset is a complete, fully labeled dataset with more than 80 network features extracted and calculated for benign and intrusive traffic using the CICFlowMeter software [14]. The creators of the Canadian dataset have presented the criteria needed for a benchmark dataset, namely complete network configuration and traffic, labeled dataset, complete interaction and capture, available protocol and attack diversity, anonymity, heterogeneity, feature set and metadata [35]. The dataset, built to satisfy these 11 criteria, covers common updated attacks such as DoS, DDoS, Brute Force, XSS, SQL Injection, Infiltration, PortScan and Botnet [14]. In Table 2, numerical features of flows in the CIC IDS dataset are displayed. In the first column, 'Active Mean' shows the mean time a flow was active before becoming idle; 'Active Std' shows the standard deviation of that time. Table 3 shows sample flow records with the categorical features of the CIC IDS dataset. For further details of the flow features, interested readers are referred to [36].

5.1. Experiments

In this section, we present the experimental results derived from the proposed neural network architecture. We trained our DeepDetect neural network model on NVIDIA Tesla K80 GPUs equipped with 24 GB of GDDR5 memory. Table 4 gives a comprehensive view of the training system specifications. Table 5. Distribution of flow classes in the CIC IDS dataset.
Attack        Training  Validation  Testing  Total    % age
Benign        281 397   70 350      87 936   439 683  63.59%
GoldenEye     6587      1647        2059     10 293   1.49%
Hulk          147 279   36 820      46 025   230 124  33.28%
SlowHTTPTest  3519      880         1100     5499     0.80%
Slowloris     3710      927         1159     5796     0.84%

Table 6. Parameters for training model.

Parameter      Function/value used
Optimizer      Adam
Loss function  Categorical cross entropy
Decay rate     5 × 10^-5
Dropout rate   0.2
Batch size     1024
Epochs         300
α              0.001
L2 reg         0

Table 7. Training results.
Evaluation metric    Score
Training loss        0.0583
Training F1 score    0.9821
Validation loss      0.0402
Validation F1 score  0.9866

5.2. Pre-processing

5.2.1. Cleaning the data

Firstly, all rows containing 'Not a Number' (NaN) or positive/negative infinity (Inf) values for any feature were removed, since sufficient data remained. Features with very low variance, which cause issues with MinMax scaling before input is fed to the neural network, were removed as well.

FIGURE 4. Categorical cross entropy loss and F1 score on the training set.

FIGURE 5. Categorical cross entropy loss and F1 score on the validation set.

Table 8. Confusion matrix of the DeepDetect random forest classifier on the test set.

Actual \ Predicted  Benign  GoldenEye  Hulk  SlowHTTPTest  Slowloris  Total
Benign              86 093  0          27    1815          1          87 936
GoldenEye           628     528        3     0             0          1159
Hulk                321     26         753   0             0          1100
SlowHTTPTest        322     0          0     45 703        0          46 025
Slowloris           587     0          0     17            1455       2059
Total count         87 951  554        783   47 535        1456       138 279

FIGURE 6. Multilabel classification ROC curve depicting DeepDetect random forest's performance as a classifier.

Table 9. Random forest training parameters.

Parameter         Value
n-estimators      100
max-depth         4
min-sample-split  2
random state      0
min-samples-leaf  2

Table 10. Confusion matrix for the DeepDetect neural networks model on the test set.
Actual \ Predicted  Benign  GoldenEye  Hulk    SlowHTTPTest  Slowloris  Total
Benign              86 281  13         1611    28            3          87 936
GoldenEye           14      2042       3       0             0          2059
Hulk                68      1          45 956  0             0          46 025
SlowHTTPTest        11      0          0       1085          4          1100
Slowloris           38      1          0       11            1109       1159
Total count         86 412  2057       47 570  1124          1116       138 279

FIGURE 7. Multilabel classification ROC curve depicting DeepDetect neural networks classifier performance.

Table 11. Performances of random forest classifier as compared to deep graph features after pruning.
Attack            DeepGFL [30] F1 score  Random forest F1 score  DeepDetect F1 score
Benign            0.98153                0.97895                 0.98938
DoS GoldenEye     0.72475                0.22194                 0.69483
DoS Hulk          0.94055                0.28922                 0.98092
DoS SlowHTTPTest  0.26953                0.96107                 0.54853
DoS Slowloris     0.24299                0.44014                 0.55394

5.2.2. Scaling data

Scaling plays a crucial role when training neural networks. All features should lie in the same range; to achieve this, various scaling techniques such as Mean Normalization or Standard Scaling can be used. After a comprehensive set of experiments, MinMax Scaling was chosen for this problem as it performs well. MinMax Scaling normalizes feature values into the range |$[0,1]$| using the formula \begin{equation*} z_i=\frac{x_i - \min(x)}{\max(x) - \min(x)}. \end{equation*} As can be seen in Table 5, the CIC dataset is highly imbalanced, which degrades the classifier's initial performance: the neural network tends to predict only the majority class.
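As a sketch, the MinMax formula above can be applied per feature with NumPy (the function and variable names here are illustrative, not taken from the paper's implementation):

```python
import numpy as np

def minmax_scale(X):
    """Scale each column of X into [0, 1] via (x - min) / (max - min)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    # Columns with zero range would divide by zero; such near-constant
    # (low-variance) features are dropped during cleaning, as described above.
    return (X - col_min) / (col_max - col_min)

# Toy flow features: two columns on very different scales.
flows = np.array([[10.0, 200.0],
                  [20.0, 400.0],
                  [30.0, 300.0]])
scaled = minmax_scale(flows)
# each column now spans exactly [0, 1]
```

Scaling with per-column minima and maxima keeps features on commensurate scales so no single large-valued feature dominates the gradient updates.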
The imbalanced data problem can be addressed by random oversampling of the minority class or random undersampling of the majority class, but neither technique is feasible in the DDoS scenario. Another solution is cost-sensitive learning, which handles imbalance through the cost associated with misclassification: misclassifying a minority-class instance incurs a higher cost than misclassifying a majority-class instance. As a result, the neural network learns to cater for all classes equally and generalizes well.

5.3. Training, validation and test set distribution

The labeled CIC IDS 2017 dataset is split into training and testing sets in an 8:2 ratio (80% training, 20% testing). The training data are fed into our neural network, with a further 20% of the training data held out as a validation set. The CIC dataset contains a list of flows labeled as benign (63.59% of all flows) or as one of the possible attacks (36.41% of all flows). The class distribution of flows is given in Table 5.

5.4. Model training parameters

The combination of parameters listed in Table 6 that produces the results presented in this paper is described as follows. A SoftMax output layer is used with the categorical cross entropy loss. The Adam optimizer combines the benefits of both the Adaptive Gradient Algorithm (AdaGrad) and Root Mean Square Propagation (RMSProp); unlike classical stochastic gradient descent, it does not maintain a single fixed learning rate but adapts it dynamically. The learning rate |$\alpha$| used with Adam is 0.001, with a decay rate of |$5\times 10^{-5}$|. During DeepDetect training, L2 regularization is set to 0 and dropout to 0.2, meaning that 20% of the neurons are turned off randomly on each pass. This not only controls the neural network's learning by preventing overfitting, but also acts as a multi-stacked classifier.
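A single Adam parameter update with the paper's settings (learning rate 0.001, decay 5 x 10^-5) can be sketched in NumPy as below; the beta1, beta2 and eps values are Adam's standard defaults, assumed here rather than stated in the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, decay=5e-5,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: per-parameter adaptive rates from bias-corrected
    running moments of the gradient."""
    lr_t = lr / (1.0 + decay * t)            # time-based decay of the base rate
    m = beta1 * m + (1 - beta1) * grad       # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)             # bias corrections for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr_t * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([1.0, -2.0])
m = np.zeros(2)
v = np.zeros(2)
theta, m, v = adam_step(theta, np.array([0.5, -0.5]), m, v, t=1)
# On the first step each parameter moves against its gradient by roughly lr.
```

Because the effective step per parameter is normalized by the running gradient variance, Adam adapts to dynamic changes in gradient scale, which is the property the text contrasts with classical stochastic gradient descent.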
By randomly turning off neurons using dropout, we train not just one model but multiple versions of it.

5.5. Analysis of results

The DeepDetect neural network is trained using the above-mentioned parameters and validated on the training split of the CIC IDS dataset. Training results, including training loss and F1 score and validation loss and F1 score at a dropout value of 0.2, are given in Table 7, which shows an F1 score of 0.9821 on the training set and 0.9866 on the validation set.

5.5.1. Tensor board graphs

The categorical cross entropy loss on the training set is shown in Fig. 4a. The training loss starts at 0.85 and goes down to 0.0583. Figure 4b shows the training F1 score, which is used as an alternative metric to evaluate neural network overfitting. The F1 score is a measure of accuracy whose value lies in |$[0,1]$|. When working with imbalanced classes, accuracy is not an appropriate evaluation metric, so we calculate precision and recall and use them to compute the F1 score. The F1 score starts at 0.62 and rises to 0.9821 during training. Categorical cross entropy on the validation set is shown in Fig. 5a. Evaluating the DeepDetect model on an independent validation set is extremely important: neural networks tend to overfit very easily, so we need to verify whether the network is improving on data it has not trained on. The validation loss and F1 scores are used as an early signal of overfitting: if the network starts showing poor performance on the validation set while still improving on the training set, it is simply memorizing the training set without generalizing and training must be stopped. The validation loss starts at 0.507 and goes down to 0.0402. Figure 5b shows the F1 score on the validation set, which starts at 0.9556 and improves to 0.986.

5.5.2. DeepDetect random forest classifier

Initially, we train a random forest model for classification of flows in DeepDetect.
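The validation-based stopping signal described in Section 5.5.1 can be sketched as a simple patience check; the loss sequences below are hypothetical, not the paper's recorded values:

```python
def should_stop(val_losses, patience=5):
    """Stop when the validation loss has not improved for `patience` epochs,
    an early signal that the network is overfitting the training set."""
    if len(val_losses) <= patience:
        return False
    best = min(val_losses[:-patience])
    # No epoch in the last `patience` beat the earlier best: stop training.
    return min(val_losses[-patience:]) >= best

# Validation loss keeps improving -> keep training.
improving = [0.507, 0.30, 0.20, 0.10, 0.06, 0.05, 0.045, 0.0402]
# Validation loss plateaus while training loss would keep falling -> stop.
plateau = [0.507, 0.30, 0.10, 0.11, 0.12, 0.12, 0.13, 0.12]
```

The check compares the recent window only against the earlier best, so ordinary epoch-to-epoch noise within the window does not trigger a premature stop.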
Random forests are an ensemble-learning technique for classification and regression. In the case of classification, a random forest acts as a meta-estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy; the decision-tree ensemble also controls overfitting. The parameters listed in Table 9 are used during training and prediction. The performance metric used for evaluating the random forest classifier is the F1 score. The DeepDetect random forest classifier yields an F1 score of 0.99 on the test set, and an overall F1 score of 0.97 with the 'average' parameter set to 'weighted'. Furthermore, to compare the performance of both DeepDetect classification models, we give the confusion matrix showing the actual and predicted DoS classes in Table 8. Consider R as row and C as column in Table 8. [R1, C1], i.e. [Benign, Benign], gives the number of instances for which the random forest classifier correctly predicted benign traffic as benign, i.e. 86 093 times. On the contrary, [R1, C2], i.e. [Benign, DoS GoldenEye], gives the number of instances where the random forest classifier labeled benign traffic as GoldenEye attack traffic, i.e. zero times. It can be seen that the DeepDetect random forest classifier confuses 1815 benign flows as SlowHTTPTest attacks. Furthermore, the performance of our random forest classifier is particularly affected by the misclassification of hundreds of attack flows as benign traffic, as can be seen from the first column; most notably, 628 of the GoldenEye DoS attacks are classified as benign. For comparing the performance of the two DeepDetect classifiers, we plot the extended ROC curves depicting multilabel classification using random forests in Fig. 6a and a zoomed-in version in Fig. 6b.
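Per-class F1 scores of the kind reported here can be recovered from any such confusion matrix (rows = actual classes, columns = predicted classes). The small matrix below is a toy example for illustration, not the paper's data:

```python
def per_class_f1(matrix):
    """Compute the per-class F1 from a square confusion matrix using
    F1_c = 2*TP / (2*TP + FP + FN), with TP on the diagonal."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # actual c, predicted elsewhere
        fp = sum(row[c] for row in matrix) - tp   # predicted c, actually elsewhere
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

toy = [[90, 10],
       [5, 95]]
f1 = per_class_f1(toy)  # [180/195, 190/205]
```

A weighted average of these per-class scores, with weights proportional to each class's actual count, corresponds to the 'weighted' averaging mentioned above.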
However, since the overall performance of the random forest classifier is exceptionally good, the area under the curve (AUC) is almost equal to one.

5.5.3. DeepDetect neural network classifier

Each flow in the CIC IDS dataset is labeled as a benign or attack flow, with its corresponding information treated as features, e.g. Active Min, Active Max, Active Std, etc. DeepDetect focuses on application layer DDoS attacks. To illustrate the performance of our DeepDetect neural network model, we construct the confusion matrix given in Table 10. The actual classes are listed in the rows and the predicted classes in the columns, along with the total prediction counts. [R1, C1], i.e. [Benign, Benign], shows that the classifier correctly predicted benign traffic as benign 86 281 times. On the contrary, [R1, C2], i.e. [Benign, DoS GoldenEye], shows that the DeepDetect classifier incorrectly classified benign traffic as DoS GoldenEye attack traffic 13 times. [R1, C3] shows that 1611 benign flows are incorrectly classified as DoS Hulk. [R1, C6], i.e. [R1, Total Count], gives the total count of all actual benign flows in the test set, i.e. 87 936. In a nutshell, the diagonal entries of the matrix represent instances where the classifier predicts the correct label. The matrix shows that the DeepDetect neural network classifier performs significantly better than the random forest classifier, except for the 1611 benign flows incorrectly classified as DoS Hulk attacks. The overall F1 score is 0.99 on the test set. The ROC curves for the DeepDetect neural network classifier for each class are shown in Fig. 7a, with a zoomed-in version in Fig. 7b. The AUC is so close to 1 that it is labeled as 1, showing the high level of accuracy achieved by our classifier.

5.5.4.
A comparative analysis of DeepDetect and DeepGFL

To show the effectiveness of DeepDetect in identifying application level DDoS attacks using the random forest and the neural network, we compute the F1 score for each classified class. We also compare our results with those of the recently proposed Deep Graph Feature Learning approach (DeepGFL) [30]. The comparison is shown in Table 11, with the best score highlighted for each class. The authors of DeepGFL evaluate and compare the performance of four different feature functions (Sum, Mul, Mean and Diff), in addition to the accuracy obtained using raw features, deep features, raw features after pruning and deep features after pruning. According to the obtained results, the overall performance of the DeepDetect neural network model is much better than that of DeepGFL and the random forest model, with the best F1 scores for three of the five classes. DeepGFL gives the best score for detecting GoldenEye at 0.725, with the DeepDetect neural network not far behind at 0.695, whereas the DeepDetect random forest model is best at detecting the SlowHTTPTest attack with an F1 score of 0.961.

6. Conclusions

In this paper, we proposed a novel deep neural network that uses a feed-forward back-propagation architecture with seven hidden layers for classifying network flows as attack or normal flows. The proposed approach protects services from application layer DDoS attacks by routing traffic through the proposed system and detecting malicious behavior. It can also detect malicious behavior from packets when an entirely new malicious pattern is used; such patterns would serve as a secondary dataset to retrain the neural networks in the proposed system. The proposed approach is evaluated using the state-of-the-art Canadian dataset (CIC IDS 2017) for DDoS detection. The experimental results demonstrated the accuracy in terms of precision and recall, collectively as the F1 score, which yielded a value of 0.99 on the test set.
As future work, we plan to extend the same approach to tackle other attack types such as UDP- and ICMP-based DDoS attacks.

References

1 Singh, K., Guntuku, S.C., Thakur, A. and Hota, C. (2014) Big data analytics framework for peer-to-peer botnet detection using random forests. Inform. Sci., 278, 488-497.
2 Liu, J., Lai, Y. and Zhang, S. (2017) FL-GUARD: a detection and defense system for DDoS attack in SDN. Proc. 2017 Int. Conf. Cryptography, Security and Privacy (ICCSP), pp. 107-111.
3 Stanciu, V. and Tinca, A. (2017) Exploring cybercrime: realities and challenges. Account. Manag. Inf. Syst., 16, 610-632.
4 Tariq, N., Asim, M., Al-Obeidat, F., Farooqi, Z., Baker, T., Hammoudeh, M. and Ghafir, I. (2019) The security of big data in fog-enabled IoT applications including blockchain: a survey. Sensors, 19, 1788.
5 Abbas, N., Asim, M., Tariq, N., Baker, T. and Abbas, S. (2019) A mechanism for securing IoT-enabled applications at the fog layer. J. Sens. Actuator Netw., 8, 16.
6 Perlroth, N. (2016) Hackers use new weapons to disrupt major websites across US. The New York Times, 21.
7 Kolias, C., Kambourakis, G., Stavrou, A. and Voas, J. (2017) DDoS in the IoT: Mirai and other botnets. Computer, 50, 80-84.
8 Zargar, S.T., Joshi, J. and Tipper, D. (2013) A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Surv. Tutorials, 15, 2046-2069.
9 Srivastava, A., Gupta, B., Tyagi, A., Sharma, A. and Mishra, A. (2011) A recent survey on DDoS attacks and defense mechanisms. Proc. Int. Conf. Parallel Distributed Computing Technologies and Applications, pp. 570-580.
10 Xie, Y. and Yu, S.-Z. (2009) Monitoring the application-layer DDoS attacks for popular websites. IEEE/ACM Trans. Netw., 17, 15-25.
11 Chen, Z. and Qian, P. (2009) Application of PSO-RBF neural network in network intrusion detection. Proc. 3rd Int. Symp. Intelligent Information Technology Application (IITA), pp. 362-364.
12 Meti, N., Narayan, D.G. and Baligar, V.P. (2017) Detection of distributed denial of service attacks using machine learning algorithms in software defined networks. Proc. 2017 Int. Conf. Advances in Computing, Communications and Informatics (ICACCI), pp. 1366-1371.
13 Shih, E., Cho, S.-H., Ickes, N., Min, R., Sinha, A., Wang, A. and Chandrakasan, A. (2001) Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks. Proc. 7th Int. Conf. Mobile Computing and Networking, pp. 272-287.
14 Sharafaldin, I., Lashkari, A.H. and Ghorbani, A.A. (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. Proc. 4th Int. Conf. Information Systems Security and Privacy (ICISSP), pp. 108-116.
15 Mahjabin, T., Xiao, Y., Sun, G. and Jiang, W. (2017) A survey of distributed denial-of-service attack, prevention, and mitigation techniques. Int. J. Distrib. Sens. Netw., 13, 1-33.
16 Sree, T.R. and Bhanu, S.M.S. (2016) HADM: detection of HTTP GET flooding attacks by using analytical hierarchical process and Dempster-Shafer theory with MapReduce. Secur. Commun. Netw., 9, 4341-4357.
17 Slowloris HTTP DoS. https://github.com/XCHADXFAQ77X/SLOWLORIS/ (accessed June 28, 2018).
18 Yevsieieva, O. and Helalat, S.M. (2017) Analysis of the impact of the slow HTTP DoS and DDoS attacks on the cloud environment. Proc. Scientific-Practical Conf. Problems of Infocommunications, Science and Technology (PIC S&T), pp. 519-523.
19 Behal, S. and Kumar, K. (2016) Measuring the impact of DDoS attacks on web services: a realtime experimentation. Int. J. Comp. Sci. Inf. Secur., 14, 323-330.
20 Behal, S. and Kumar, K. (2017) Characterization and comparison of DDoS attack tools and traffic generators: a review. Int. J. Netw. Secur., 19, 383-393.
21 Singh, K., Singh, P. and Kumar, K. (2017) Application layer HTTP-GET flood DDoS attacks: research landscape and challenges. Comput. Secur., 65, 344-372.
22 Mirkovic, J. and Reiher, P. (2004) A taxonomy of DDoS attack and DDoS defense mechanisms. ACM SIGCOMM Comp. Comm. Rev., 34, 39-53.
23 Oshima, S., Nakashima, T. and Sueyoshi, T. (2010) DDoS detection technique using statistical analysis to generate quick response time. Proc. Int. Conf. Broadband, Wireless Computing, Communication and Applications (BWCCA), pp. 672-677.
24 Kwon, D., Kim, H., An, D. and Ju, H. (2017) DDoS attack volume forecasting using a statistical approach. Proc. IFIP/IEEE Symp. Integrated Network and Service Management (IM), pp. 2015-2018.
25 Najafabadi, M.M., Khoshgoftaar, T.M., Calvert, C. and Kemp, C. (2005) A text mining approach for anomaly detection in application layer DDoS attacks. Proc. 13th Int. Florida Artificial Intelligence Research Society Conference (FLAIRS), pp. 312-317.
26 Zhong, R. and Yue, G. (2010) DDoS detection system based on data mining. Proc. 2nd Int. Symp. Networking and Network Security (ISNNS), pp. 2-4.
27 Hsieh, C.J. and Chan, T.Y. (2016) Detection DDoS attacks based on neural-network using Apache Spark. Proc. Int. Conf. Applied System Innovation (ICASI), pp. 1-4.
28 Domingos, P.M. and Pazzani, M.J. (1996) Beyond independence: conditions for the optimality of the simple Bayesian classifier. Proc. 13th Int. Conf. Machine Learning (ICML), pp. 105-112.
29 Xie, M. and Hu, J. (2013) Evaluating host-based anomaly detection systems: a preliminary analysis of ADFA-LD. Proc. 6th Int. Congress Image and Signal Processing (CISP), pp. 1711-1716.
30 Yao, Y., Su, L. and Lu, Z. (2018) DeepGFL: deep feature learning via graph for attack detection on flow-based network traffic. Proc. IEEE Military Communications Conference (MILCOM), pp. 579-584.
31 Jiang, J., Yu, Q., Yu, M., Li, G. and Chen (2018) ALDD: a hybrid traffic-user behavior detection method for application layer DDoS. Proc. 17th IEEE Int. Conf. Trust, Security and Privacy / 12th IEEE Conf. Big Data Science and Engineering (TrustCom/BigDataSE), pp. 1565-1569.
32 Hou, J., Fu, P., Cao, Z. and Xu, A. (2018) Machine learning based DDoS detection through NetFlow analysis. Proc. IEEE Military Communications Conference (MILCOM), pp. 1-6.
33 Vijayanand, R., Devaraj, D. and Kannapiran, B. (2018) Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection. Comput. Secur., 77, 304-314.
34 CIC IDS 2017, Canadian Institute for Cybersecurity. http://www.unb.ca/cic/datasets/flowmeter.html (accessed June 28, 2018).
35 Gharib, A., Sharafaldin, I., Lashkari, A.H. and Ghorbani, A.A. (2016) An evaluation framework for intrusion detection dataset. Proc. Int. Conf. Information Science and Security (ICISS), pp. 1-4.
36 Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I. and Ghorbani, A.A. (2017) Characterization of Tor traffic using time based features. Proc. 3rd Int. Conf. Information Systems Security and Privacy (ICISSP), pp. 253-262.

© The British Computer Society 2019. All rights reserved. For permissions, please email: journals.permissions@oup.com

TI - DeepDetect: Detection of Distributed Denial of Service Attacks Using Deep Learning JF - The Computer Journal DO - 10.1093/comjnl/bxz064 DA - 2020-07-17 ER -