# CYBERCRIME — CONFUSION MATRIX

**CYBERCRIME**

*Computer crime, or cybercrime, is any crime that involves a computer and a network. Net crime is the criminal exploitation of the Internet. A cyber-attack is an exploitation of computer systems and networks: it uses malicious code to alter computer code, logic, or data, and can lead to cybercrimes such as information and identity theft.*

*Intrusion detection systems (IDS), which monitor network traffic and identify malicious behaviour, have been extensively researched and used in traditional IT infrastructures. Such tools play a key role in understanding a cyber-attack that has occurred and can support a faster, more efficient incident response.*

**CONFUSION MATRIX**

*A confusion matrix is a performance measurement technique for machine learning classification problems. It is a simple table that shows how a classification model performs on test data for which the true values are known. The matrix records the actual and predicted classifications made by the classifier, and the performance of such systems is commonly evaluated using its data. Because it shows which kinds of mistakes are made, it gives a fuller picture than a single overall score. A confusion matrix is also known as an "error matrix". The following table shows the confusion matrix for a two-class classifier.*

| | Predicted positive | Predicted negative |
|---|---|---|
| **Actual positive** | TP | FN |
| **Actual negative** | FP | TN |

**● TP is the number of correct predictions that an instance is positive**

**● FN is the number of incorrect predictions that an instance is negative**

**● FP is the number of incorrect predictions that an instance is positive**

**● TN is the number of correct predictions that an instance is negative**
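The four counts above can be tallied directly from paired lists of actual and predicted labels. The following is a minimal sketch, not a reference implementation: the function name `confusion_counts`, the `positive` parameter, and the sample labels (1 = attack, 0 = normal traffic) are all assumptions made for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP and TN for a two-class problem.

    `positive` is the label treated as the positive class
    (an illustrative parameter, not part of any library).
    """
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1      # correctly predicted positive
        elif a == positive:
            fn += 1      # positive instance predicted as negative
        elif p == positive:
            fp += 1      # negative instance predicted as positive
        else:
            tn += 1      # correctly predicted negative
    return tp, fn, fp, tn


# Hypothetical labels: 1 = attack, 0 = normal traffic
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(actual, predicted)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```

With these made-up labels, one attack is missed (FN) and one normal event is flagged as an attack (FP).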

**Several standard terms have been defined for the 2 class matrix:**

**● The accuracy (AC) is the proportion of the total number of predictions that were correct. It is determined using the equation:**

$$AC = \frac{TP + TN}{TP + TN + FP + FN}$$

**● The recall or true positive rate (TPR), also called sensitivity, is the proportion of positive cases that were correctly identified, as calculated using the equation:**

$$TPR = \frac{TP}{FN + TP}$$

**● The false positive rate (FPR) is the proportion of negative cases that were incorrectly classified as positive, as calculated using the equation:**

$$FPR = \frac{FP}{TN + FP}$$

**● The true negative rate (TNR), also called specificity, is the proportion of negative cases that were classified correctly, as calculated using the equation:**

$$TNR = \frac{TN}{TN + FP}$$

**● The false negative rate (FNR) is the proportion of positive cases that were incorrectly classified as negative, as calculated using the equation:**

$$FNR = \frac{FN}{FN + TP}$$

**● The negative predictive value (NPV) is the proportion of negative predictions that were correct, that is, true negatives out of all true and false negatives, as calculated using the equation:**

$$NPV = \frac{TN}{TN + FN}$$

**● The positive predictive value (PPV), also called precision, is the proportion of positive predictions that were correct, that is, true positives out of all true and false positives, as calculated using the equation:**

$$PPV = \frac{TP}{TP + FP}$$
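The seven formulas above translate directly into code. The sketch below assumes the four counts are already known; the function names `ratio` and `confusion_metrics` and the example counts are hypothetical, chosen only to illustrate the arithmetic. A zero denominator yields `nan` here, which is one possible convention rather than a standard.

```python
def ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(tp, fn, fp, tn):
    """Compute the standard two-class metrics from the four counts."""
    return {
        "AC": ratio(tp + tn, tp + tn + fp + fn),  # accuracy
        "TPR": ratio(tp, fn + tp),                # recall / sensitivity
        "FPR": ratio(fp, tn + fp),                # false positive rate
        "TNR": ratio(tn, tn + fp),                # specificity
        "FNR": ratio(fn, fn + tp),                # false negative rate
        "NPV": ratio(tn, tn + fn),                # negative predictive value
        "PPV": ratio(tp, tp + fp),                # precision
    }


# Hypothetical counts for illustration only
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that TPR and FNR share the same denominator and sum to 1, as do TNR and FPR, so each pair carries the same information from opposite sides.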

**The types of error in a confusion matrix are:**

**● Type 1 error (FP):**

*We predicted yes, but the actual answer is no: a negative was wrongly predicted as positive. In a customer churn model, for example, we predicted that a customer would leave the network (churn), but they did not. This is known as a "Type 1 error" or "false alarm". In the case of cyber attacks, the system reports an attack when none is occurring.*

**● Type 2 error (FN):**

*We predicted no, but the actual answer is yes: a positive was wrongly predicted as negative. In the churn example, we predicted that a customer would stay, but they actually left the network. This is known as a "Type 2 error". In the case of cyber attacks, the system reports that no attack is happening when one actually is, so the attack goes unnoticed and can do serious damage. That is why the Type 2 error is the most dangerous here.*
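To make the two error types concrete for intrusion detection, the sketch below separates false alarms (false positives) from missed attacks (false negatives) in a short list of events. The function name `split_errors` and the event data are hypothetical; `True` stands for "attack" in both lists.

```python
def split_errors(actual, predicted):
    """Return the indices of false alarms (FP) and missed attacks (FN)."""
    pairs = list(enumerate(zip(actual, predicted)))
    # False positive: no attack occurred, but one was reported
    false_alarms = [i for i, (a, p) in pairs if not a and p]
    # False negative: an attack occurred, but none was reported
    missed_attacks = [i for i, (a, p) in pairs if a and not p]
    return false_alarms, missed_attacks


# Hypothetical events: True = attack, False = normal traffic
actual = [False, True, False, True, False]
predicted = [True, True, False, False, False]

false_alarms, missed_attacks = split_errors(actual, predicted)
print("False alarms (FP) at events:", false_alarms)
print("Missed attacks (FN) at events:", missed_attacks)
```

A false alarm costs analyst time, whereas a missed attack leaves a real intrusion unhandled, which is why the two lists are usually weighed differently.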

**CONCLUSION**

*In conclusion, the confusion matrix is widely used to evaluate classification models. It measures the performance of a model on a given set of test data and can only be built if the true values for that data are known. It is a square matrix in which one axis holds the actual values and the other the predicted values; either orientation may be used. Type I and Type II errors present different problems in the case of cyber attacks. A missed attack, in which the model predicts that no attack is happening when one actually is, is the most dangerous, and it is a false negative (Type II error).*