CYBERCRIME — CONFUSION MATRIX
CYBERCRIME
- Computer crime, or cybercrime, refers to any crime that involves a computer and a network. Net crime is the criminal exploitation of the Internet.
- A cyber-attack is an exploitation of computer systems and networks. It uses malicious code to alter computer code, logic or data, and can lead to cybercrimes such as information and identity theft.
- Intrusion detection systems (IDS), which monitor network traffic and identify malicious behaviour, have been extensively researched and used in traditional IT infrastructures.
- Such tools play a key role in understanding a cyber-attack that has occurred and can support a faster and more efficient incident response.
CONFUSION MATRIX
- A confusion matrix is a performance measurement technique for machine learning classification problems.
- It is a simple table that shows how a classification model performs on test data for which the true values are known.
- A confusion matrix contains information about actual and predicted classifications done by a classification system.
- Performance of such systems is commonly evaluated using the data in the matrix.
- Looking at the confusion matrix is a much better way to evaluate a classifier than relying on a single accuracy figure, because it shows which kinds of errors the model makes.
- A confusion matrix is also known as an “error matrix”.
- The following table shows the confusion matrix for a two-class classifier.
                      Predicted positive    Predicted negative
  Actual positive     TP                    FN
  Actual negative     FP                    TN
● TP is the number of correct predictions that an instance is positive
● FN is the number of incorrect predictions that an instance is negative
● FP is the number of incorrect predictions that an instance is positive
● TN is the number of correct predictions that an instance is negative
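The four counts can be tallied directly from paired actual and predicted labels. The following is a minimal Python sketch, not part of the original material; the function name `confusion_counts`, the label convention (1 = attack/positive, 0 = normal/negative) and the sample labels are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FN, FP, TN) for binary labels where 1 = positive and 0 = negative."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # attack correctly flagged
        elif a == 1 and p == 0:
            fn += 1  # attack missed
        elif a == 0 and p == 1:
            fp += 1  # normal traffic flagged as an attack
        else:
            tn += 1  # normal traffic correctly passed
    return tp, fn, fp, tn


# Hypothetical labels for eight network connections (1 = attack, 0 = normal).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```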
- Several standard terms have been defined for the 2 class matrix:
● The accuracy (AC) is the proportion of the total number of predictions that were correct. It is determined using the equation:
$$AC = \frac{TP + TN}{TP + TN + FP + FN}$$
● The recall or true positive rate (TPR), also called sensitivity, is the proportion of positive cases that were correctly identified, as calculated using the equation:
$$TPR = \frac{TP}{FN + TP}$$
● The false positive rate (FPR) is the proportion of negative cases that were incorrectly classified as positive, as calculated using the equation:
$$FPR = \frac{FP}{TN + FP}$$
● The true negative rate (TNR), also called specificity, is the proportion of negative cases that were classified correctly, as calculated using the equation:
$$TNR = \frac{TN}{TN + FP}$$
● The false negative rate (FNR) is the proportion of positive cases that were incorrectly classified as negative, as calculated using the equation:
$$FNR = \frac{FN}{FN + TP}$$
● The negative predictive value (NPV) is the proportion of negative predictions that were correct, that is, true negatives out of all predicted negatives (true negatives plus false negatives), as calculated using the equation:
$$NPV = \frac{TN}{TN + FN}$$
● The positive predictive value (PPV), also called precision, is the proportion of positive predictions that were correct, that is, true positives out of all predicted positives (true positives plus false positives), as calculated using the equation:
$$PPV = \frac{TP}{TP + FP}$$
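The formulas above can be checked with a small worked example. This is only a sketch under assumed numbers: the counts describe a hypothetical IDS evaluated on 1,000 connections and are invented purely for illustration, and the function name `metrics` is likewise an assumption rather than anything from the original material.

```python
def metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute the standard two-class metrics from the four confusion-matrix counts."""

    def ratio(num: int, den: int) -> float:
        # Return NaN when a denominator is zero (e.g. no positive cases at all).
        return num / den if den else float("nan")

    return {
        "AC": ratio(tp + tn, tp + tn + fp + fn),
        "TPR": ratio(tp, fn + tp),
        "FPR": ratio(fp, tn + fp),
        "TNR": ratio(tn, tn + fp),
        "FNR": ratio(fn, fn + tp),
        "NPV": ratio(tn, tn + fn),
        "PPV": ratio(tp, tp + fp),
    }


# Hypothetical counts: 50 real attacks among 1,000 connections.
m = metrics(tp=30, fn=20, fp=10, tn=940)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the accuracy is 0.970 even though 40% of the attacks are missed (FNR = 0.400), which suggests why accuracy alone can hide important errors when attacks are rare.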
- The two types of error in a confusion matrix are:
● Type 1 error (FP):
* We predicted yes, but the actual answer is no. In a churn model, for example, we predicted that a customer would leave the network, but they did not (not churn); that is, we wrongly predicted a negative as a positive. This is known as a “Type 1 error” or “false alarm”.
* In the case of cyber attacks, the model predicts that an attack is happening when in reality none occurs.
● Type 2 error (FN):
* We predicted no, but the actual answer is yes. In a churn model, we predicted that a customer would stay, but they actually left the network (churn); that is, we wrongly predicted a positive as a negative. This is known as a “Type 2 error”.
* In the case of cyber attacks, the model predicts that no attack is happening when in reality one is under way, so the attack goes undetected and can do serious damage.
* That is why the Type 2 error is the most dangerous one in this setting.
CONCLUSION
In conclusion, the confusion matrix is widely used to evaluate classification models. It is a matrix that summarizes the performance of a classification model on a given set of test data, and it can only be computed if the true values for the test data are known. It is a square matrix in which the columns represent the actual values and the rows the predicted values of the model, or vice versa, depending on the convention used. Type 1 and Type 2 errors present distinct problems in the case of cyber attacks. Of the two, the Type 2 error (a missed attack) is the most dangerous.