Confusion Matrix (%)

Matrix layout that quantifies the joint distribution between predicted and true categorical labels, with each cell normalized to show the percentage of samples of a given true label that received a given predicted label.

Processing

This brick generates a Normalized Confusion Matrix visualization. Unlike a standard confusion matrix that shows raw counts (e.g., "50 items"), this version displays percentages (e.g., "98%").

Specifically, it calculates each percentage relative to the True Label (the actual category), so the percentages for each true label sum to 100%. This is particularly useful when working with imbalanced data, because it shows how well the model performs for each category regardless of how many items that category contains.

For example, if you have 1000 "Non-Fraud" transactions and only 10 "Fraud" transactions, a raw count matrix can hide the model's failure to catch fraud, because the "Non-Fraud" counts dominate. The percentage view makes it clear if the model caught only 20% of the "Fraud" cases (2 of 10).
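The fraud example can be reproduced with a minimal sketch of normalization by true label. This is not the brick's own implementation: the function name `row_normalized_confusion` and the sample data are hypothetical, chosen only to illustrate the calculation described here.

```python
from collections import Counter


def row_normalized_confusion(y_true, y_pred):
    """Return (labels, matrix), where matrix[i][j] is the percentage of
    samples whose true label is labels[i] that were predicted as labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    # Count every (true, predicted) pair once.
    counts = Counter(zip(y_true, y_pred))
    matrix = []
    for t in labels:
        row_total = sum(counts[(t, p)] for p in labels)
        # A label that never occurs as a true label gets a row of zeros.
        row = [100.0 * counts[(t, p)] / row_total if row_total else 0.0
               for p in labels]
        matrix.append(row)
    return labels, matrix


# Hypothetical data: 1000 "Non-Fraud" and 10 "Fraud" transactions.
# The model flags only 2 of the 10 fraud cases.
y_true = ["Non-Fraud"] * 1000 + ["Fraud"] * 10
y_pred = ["Non-Fraud"] * 1000 + ["Fraud"] * 2 + ["Non-Fraud"] * 8

labels, matrix = row_normalized_confusion(y_true, y_pred)
for label, row in zip(labels, matrix):
    print(label, [f"{v:.1f}%" for v in row])
```

With this data, 1002 of 1010 predictions are correct (about 99.2% overall), yet the "Fraud" row shows that only 20% of fraud cases were caught and 80% were missed.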

Inputs

y true
The list or column containing the actual, correct labels (ground truth). These are the values you know are correct.
y pred
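The list or column containing the labels predicted by your model. It must have the same length as the y true input and correspond to it index by index: the prediction at position i is compared with the true label at position i.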
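The index-by-index requirement can be sketched as follows, assuming plain Python lists. The helper `paired_labels` is hypothetical and only illustrates the pairing; how the brick itself validates its inputs is not documented here.

```python
def paired_labels(y_true, y_pred):
    """Pair true and predicted labels position by position."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError(
            f"length mismatch: {len(y_true)} true vs {len(y_pred)} predicted"
        )
    return list(zip(y_true, y_pred))


# Position 2 is a misclassification: the true label is "dog", predicted "cat".
pairs = paired_labels(["cat", "dog", "dog"], ["cat", "dog", "cat"])
print(pairs)
```

If one input is shuffled or filtered independently of the other, the pairs no longer describe the same samples and the resulting matrix is meaningless, even when the lengths still match.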

Inputs Types

Input     Types
y true    List, DataSeries, NDArray, DataRecords, DataFrame
y pred    List, DataSeries, NDArray, DataRecords, DataFrame

You can check the list of supported types here: Available Type Hints.

Outputs

y true
The original list of true labels, passed through to the next brick unchanged.
y pred
The original list of predicted labels, passed through to the next brick unchanged.

Outputs Types

Output    Types
y true    List, DataSeries, NDArray, DataRecords, DataFrame
y pred    List, DataSeries, NDArray, DataRecords, DataFrame

You can check the list of supported types here: Available Type Hints.

Options

The Confusion Matrix (%) brick provides the following configurable options:

Labels Size
Controls the font size of the text on the X and Y axes (the category names).
Count Size
Controls the font size of the percentage numbers displayed inside the grid squares.
Color Schema
Determines the color palette used for the heatmap. Darker colors represent higher percentages (closer to 100%).
  • blues: Shades of blue.
  • yellowgreenblue: A gradient from yellow to green to blue.
  • lightmulti: A light, multi-colored scheme.
  • lighttealblue: Shades of teal and blue.
  • bluegreen: Shades of blue and green.
  • orangered: Shades of orange and red.
  • redpurple: Shades of red and purple.

Brick Info

version v0.1.4
python 3.11, 3.12, 3.13
requirements
  • shap>=0.47.0
  • numba>=0.56.0