CONFUSION MATRIX

Ayush Parmar
6 min read · Jun 5, 2021

Role of the Confusion Matrix in Cyber Security


CYBER CRIME!!!

Cybercrime, also called computer crime, is the use of a computer as an instrument to further illegal ends, such as committing fraud, trafficking in child pornography and intellectual property, stealing identities, or violating privacy. Cybercrime, especially through the Internet, has grown in importance as the computer has become central to commerce, entertainment, and government.

CYBER SECURITY

Why has machine learning become so critical to cybersecurity?

Several reasons. With machine learning, cybersecurity systems can analyze patterns and learn from them to help prevent similar attacks and respond to changing behavior. It can help cybersecurity teams be more proactive in preventing threats and responding to active attacks in real time. It can reduce the amount of time spent on routine tasks and enable organizations to use their resources more strategically.

In short, machine learning can make cybersecurity simpler, more proactive, less expensive and far more effective. But it can only do those things if the underlying data that supports the machine learning provides the complete picture of the environment. As they say, garbage in, garbage out.

What is Confusion Matrix?

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

I wanted to create a “quick reference guide” for confusion matrix terminology because I couldn’t find an existing resource that suited my requirements: compact in presentation, using numbers instead of arbitrary variables, and explained both in terms of formulas and sentences.

Let’s start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):

  • There are two possible predicted classes: “yes” and “no”. If we were predicting the presence of a disease, for example, “yes” would mean they have the disease, and “no” would mean they don’t have the disease.
  • The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
  • Out of those 165 cases, the classifier predicted “yes” 110 times, and “no” 55 times.
  • In reality, 105 patients in the sample have the disease, and 60 patients do not.

Let’s now define the most basic terms, which are whole numbers (not rates):

  • true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
  • true negatives (TN): We predicted no, and they don’t have the disease.
  • false positives (FP): We predicted yes, but they don’t actually have the disease. (Also known as a “Type I error.”)
  • false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a “Type II error.”)

I’ve added these terms to the confusion matrix, and also added the row and column totals:
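
Filling in the counts, the matrix works out to TP = 100, TN = 50, FP = 10 and FN = 5, with row totals of 105 (actual yes) and 60 (actual no) and column totals of 110 (predicted yes) and 55 (predicted no). Here is a minimal Python sketch that rebuilds this exact table with scikit-learn; the label vectors are synthetic, arranged purely to reproduce those counts:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# 105 actual "yes" cases followed by 60 actual "no" cases
actual = np.array([1] * 105 + [0] * 60)
# Of the 105 "yes" cases, 100 are predicted yes (TP) and 5 no (FN);
# of the 60 "no" cases, 10 are predicted yes (FP) and 50 no (TN).
predicted = np.array([1] * 100 + [0] * 5 + [1] * 10 + [0] * 50)

# scikit-learn convention: rows = actual class, columns = predicted class;
# labels=[0, 1] puts the "no" class first in both dimensions.
cm = confusion_matrix(actual, predicted, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(cm)              # [[ 50  10]
                       #  [  5 100]]
print(tp, tn, fp, fn)  # 100 50 10 5
```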

This is a list of rates that are often computed from a confusion matrix for a binary classifier (a short code sketch computing all of them follows the list):

  • Accuracy: Overall, how often is the classifier correct?
      ◦ (TP+TN)/total = (100+50)/165 = 0.91
  • Misclassification Rate: Overall, how often is it wrong?
      ◦ (FP+FN)/total = (10+5)/165 = 0.09
      ◦ equivalent to 1 minus Accuracy
      ◦ also known as “Error Rate”
  • True Positive Rate: When it’s actually yes, how often does it predict yes?
      ◦ TP/actual yes = 100/105 = 0.95
      ◦ also known as “Sensitivity” or “Recall”
  • False Positive Rate: When it’s actually no, how often does it predict yes?
      ◦ FP/actual no = 10/60 = 0.17
  • True Negative Rate: When it’s actually no, how often does it predict no?
      ◦ TN/actual no = 50/60 = 0.83
      ◦ equivalent to 1 minus False Positive Rate
      ◦ also known as “Specificity”
  • Precision: When it predicts yes, how often is it correct?
      ◦ TP/predicted yes = 100/110 = 0.91
  • Prevalence: How often does the yes condition actually occur in our sample?
      ◦ actual yes/total = 105/165 = 0.64
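
All of these rates fall straight out of the four counts. Here is a short plain-Python sketch that reproduces the numbers above:

```python
# Rates from the example matrix: TP=100, TN=50, FP=10, FN=5.
TP, TN, FP, FN = 100, 50, 10, 5
total      = TP + TN + FP + FN          # 165
actual_yes = TP + FN                    # 105
actual_no  = TN + FP                    # 60
pred_yes   = TP + FP                    # 110

accuracy          = (TP + TN) / total   # 150/165 ~ 0.91
misclassification = (FP + FN) / total   # 15/165  ~ 0.09  (1 - accuracy)
tpr_recall        = TP / actual_yes     # 100/105 ~ 0.95  (sensitivity)
fpr               = FP / actual_no      # 10/60   ~ 0.17
tnr_specificity   = TN / actual_no      # 50/60   ~ 0.83  (1 - FPR)
precision         = TP / pred_yes       # 100/110 ~ 0.91
prevalence        = actual_yes / total  # 105/165 ~ 0.64
```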

Type I error

The first kind of error is the rejection of a true null hypothesis as the result of a test procedure. This kind of error is called a type I error (false positive) and is sometimes called an error of the first kind.

In terms of the classic courtroom example, where the null hypothesis is that the defendant is innocent, a type I error corresponds to convicting an innocent defendant.

Type II error

The second kind of error is the failure to reject a false null hypothesis as the result of a test procedure. This sort of error is called a type II error (false negative) and is also referred to as an error of the second kind.

In terms of the courtroom example, a type II error corresponds to acquitting a guilty defendant.

Crossover error rate

The crossover error rate (CER) is the point at which the Type I error rate and the Type II error rate are equal, and it is a common single-number measure of a biometric system’s effectiveness. A system with a lower CER value is more accurate than a system with a higher CER value.
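
For illustration, here is a minimal sketch of how a CER can be estimated numerically: sweep an acceptance threshold over match scores, compute the false positive and false negative rates at each step, and take the point where the two curves cross. The score distributions below are synthetic placeholders, not real biometric data:

```python
import numpy as np

rng = np.random.default_rng(0)
genuine  = rng.normal(0.7, 0.1, 1000)   # scores for genuine users (should be accepted)
impostor = rng.normal(0.4, 0.1, 1000)   # scores for impostors (should be rejected)

thresholds = np.linspace(0.0, 1.0, 501)
# Accept whenever score >= threshold.
fpr = np.array([(impostor >= t).mean() for t in thresholds])  # false accepts
fnr = np.array([(genuine  <  t).mean() for t in thresholds])  # false rejects

i = np.argmin(np.abs(fpr - fnr))        # threshold where the two rates meet
print(f"CER ~ {(fpr[i] + fnr[i]) / 2:.3f} at threshold {thresholds[i]:.2f}")
```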

False positive and false negative

In terms of false positives and false negatives, a positive result corresponds to rejecting the null hypothesis, while a negative result corresponds to failing to reject the null hypothesis; “false” means the conclusion drawn is incorrect. Thus, a type I error is equivalent to a false positive, and a type II error is equivalent to a false negative.

How is the Confusion Matrix useful in Cyber Security???

Since 1999, KDD’99 has been the most widely used data set for the evaluation of anomaly detection methods. This data set was prepared by Stolfo et al. and is built on the data captured in the DARPA’98 IDS evaluation program. DARPA’98 comprises about 4 gigabytes of compressed raw (binary) tcpdump data from 7 weeks of network traffic, which can be processed into about 5 million connection records of roughly 100 bytes each. For each TCP/IP connection, 41 features were extracted, both quantitative (continuous) and qualitative (discrete): 34 numeric and 7 symbolic.

To analyze the results, standard metrics have been developed for evaluating network intrusion detection. Detection Rate (DR) and false alarm rate are the two most widely used. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while the false alarm (false positive) rate is computed as the ratio between the number of normal connections incorrectly classified as attacks and the total number of normal connections.
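
In confusion matrix terms (attack = positive, normal = negative), DR is simply the recall on the attack class and the false alarm rate is the FPR. A minimal sketch, with made-up counts for illustration:

```python
# DR and false alarm rate from binary IDS counts (illustrative numbers only).
TP = 920   # attacks correctly flagged as attacks
FN = 80    # attacks missed (classified as normal)
FP = 45    # normal connections wrongly flagged as attacks (false alarms)
TN = 8955  # normal connections correctly passed

detection_rate   = TP / (TP + FN)   # recall on the attack class
false_alarm_rate = FP / (FP + TN)   # FPR on the normal class
print(f"DR = {detection_rate:.3f}, FAR = {false_alarm_rate:.4f}")
```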

In the KDD Cup 99, the criterion used to evaluate participant entries was the Cost Per Test (CPT), computed using the confusion matrix and a given cost matrix (a sketch of this computation appears below). A Confusion Matrix (CM) is a square matrix in which each row corresponds to an actual class and each column to a predicted class.

  • True Positive (TP): The number of attack records correctly detected as attacks.
  • True Negative (TN): The number of normal records correctly detected as normal.
  • False Positive (FP): The number of normal records incorrectly detected as attacks (false alarms).
  • False Negative (FN): The number of attack records incorrectly detected as normal (missed attacks).

Here, as defined above, rows correspond to actual categories, while columns correspond to predicted categories.

A confusion matrix thus records the actual and predicted classifications made by a classifier, and the performance of a cyber attack detection system is commonly evaluated from the data in this matrix.
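
To make CPT concrete, here is a hedged sketch under the usual definition: the sum of per-cell costs weighted by the confusion matrix counts, averaged over all test records. The 3-class confusion matrix and cost matrix below are hypothetical placeholders, not the actual KDD Cup 99 matrices:

```python
# Cost Per Test: CPT = (1/N) * sum_ij CM[i][j] * C[i][j],
# where CM counts (actual i, predicted j) pairs and C assigns a cost
# to each kind of decision. Both matrices below are made up.
import numpy as np

# Rows = actual class, columns = predicted class
# (classes: normal, probe, DoS -- a reduced, hypothetical example).
CM = np.array([
    [950,  30,  20],   # actual normal
    [ 15,  80,   5],   # actual probe
    [ 10,   5, 885],   # actual DoS
])
C = np.array([
    [0, 1, 2],         # costs when the record is actually normal
    [1, 0, 2],         # costs when the record is actually probe
    [2, 1, 0],         # costs when the record is actually DoS
])

cpt = (CM * C).sum() / CM.sum()   # total cost divided by number of tests
print(f"CPT = {cpt:.4f}")
```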

Thank you for your time!!!
