
The as_visualizer() function automatically creates appropriate visualizers for model predictions on tasks with one or two features. By default, it uses ggplot2 for 1D and 2D visualizations. For 2D tasks, you can optionally specify type = "surface" to get interactive plotly surface plots.

Let’s start with the california_housing data set. The goal is to predict the median house value for California districts. We subset the data set to use only the features median_income and housing_median_age, and sample 2000 observations for faster rendering. The median_income feature is the median income in a block group, and housing_median_age is the median age of a house within a block.

task = tsk("california_housing")
task$select(c("median_income", "housing_median_age"))
task$filter(rows = sample(task$nrow, 2000))

We load the support vector machine learner for regression.

learner = lrn("regr.svm")

Now we create a visualizer object, using the plotly backend (type = "surface").

vis = as_visualizer(task, learner = learner, type = "surface")

First, the learner is trained on the entire task. After that, a grid is created over the two features and the model's predictions are computed for each grid point. The predictions are then visualized as an interactive surface plot.
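The train, grid, and predict steps described above can be sketched by hand with plain mlr3 calls. This is only an illustration of the idea, not the visualizer's actual implementation: the grid resolution n_points, the use of the observed feature ranges as grid limits, and the variable names are assumptions made for this sketch.

```r
library(mlr3)
library(mlr3learners)  # provides regr.svm (needs the e1071 package)

task = tsk("california_housing")
task$select(c("median_income", "housing_median_age"))
task$filter(rows = sample(task$nrow, 2000))

# Step 1: train the learner on the entire task
learner = lrn("regr.svm")
learner$train(task)

# Step 2: build an evenly spaced grid over the observed range of each feature
# (n_points is a hypothetical resolution chosen for this sketch)
n_points = 20
features = task$data(cols = task$feature_names)
grid = expand.grid(
  median_income = seq(min(features$median_income),
    max(features$median_income), length.out = n_points),
  housing_median_age = seq(min(features$housing_median_age),
    max(features$housing_median_age), length.out = n_points)
)

# Step 3: predict at every grid point
prediction = learner$predict_newdata(grid)

# expand.grid() varies the first variable fastest, so rows of z index
# median_income and columns index housing_median_age
z = matrix(prediction$response, nrow = n_points, ncol = n_points)
```

The matrix z holds one predicted median house value per grid point, which is the kind of input a surface or contour plot is drawn from.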

vis$plot()

We can add contour lines for the z dimension with add_contours().

vis = as_visualizer(task, learner = learner, type = "surface")
vis$add_contours()$plot()

We can add the training points to the plot using method chaining.

vis$add_training_data()$plot()

We can also flatten the surface into a 2D contour plot by setting flatten = TRUE.

vis$plot(flatten = TRUE)

To switch back to the surface plot, simply use flatten = FALSE (or omit the parameter since it’s the default).
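The introduction mentions that tasks with a single feature are supported as well. A minimal sketch of that case follows. It assumes that as_visualizer() is exported by the vistool package and that plot() is called the same way as in the 2D examples. The names task_1d and vis_1d are chosen for illustration only.

```r
library(mlr3)
library(mlr3learners)  # provides regr.svm (needs the e1071 package)
library(vistool)       # assumption: the package that exports as_visualizer()

# Keep a single feature so that a 1D visualizer is created
task_1d = tsk("california_housing")
task_1d$select("median_income")
task_1d$filter(rows = sample(task_1d$nrow, 2000))

# No type argument: the default ggplot2 backend is used
vis_1d = as_visualizer(task_1d, learner = lrn("regr.svm"))
vis_1d$plot()
```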

It is also possible to visualize classification tasks. We use the pima data set and impute the missing values with the feature means. We select the features insulin and mass and train a support vector machine for classification that predicts class probabilities.

task = tsk("pima")
task = po("imputemean")$train(list(task))[[1]]
task$select(c("insulin", "mass"))
learner = lrn("classif.svm", predict_type = "prob")

We create a visualizer object, using the default ggplot2 backend, and plot the predictions.

vis = as_visualizer(task, learner = learner)
vis$plot()

We can add (potential) decision boundaries to the plot using method chaining.

vis$add_boundary(values = c(0.3, 0.5, 0.7))$plot()
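To get a feeling for what the values 0.3, 0.5 and 0.7 mean, the predicted probabilities can be inspected directly. The sketch below assumes that the boundary values are levels of the predicted probability for the positive class (pos). That interpretation, the grid resolution of 30, and the variable names are assumptions made for this illustration, not a description of how add_boundary() works internally.

```r
library(mlr3)
library(mlr3learners)   # provides classif.svm (needs the e1071 package)
library(mlr3pipelines)  # provides po("imputemean")

task = tsk("pima")
task = po("imputemean")$train(list(task))[[1]]
task$select(c("insulin", "mass"))
learner = lrn("classif.svm", predict_type = "prob")
learner$train(task)

# Evenly spaced grid over the observed range of both features
features = task$data(cols = task$feature_names)
grid = expand.grid(
  insulin = seq(min(features$insulin), max(features$insulin), length.out = 30),
  mass = seq(min(features$mass), max(features$mass), length.out = 30)
)

# Predicted probability of the positive class at every grid point
prob_pos = learner$predict_newdata(grid)$prob[, "pos"]

# Share of grid points at or above each candidate threshold
thresholds = c(0.3, 0.5, 0.7)
share_pos = sapply(thresholds, function(threshold) mean(prob_pos >= threshold))
names(share_pos) = thresholds
share_pos
```

A higher threshold can only shrink the region classified as pos, so the shares never increase from 0.3 to 0.7. Each boundary line separates the grid points above a threshold from those below it.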

For classification tasks, add_training_data() supports setting different colors and shapes for the different classes.

vis$add_training_data(
  color = c(pos = "red", neg = "blue"),
  shape = c(pos = 17, neg = 19)
)$plot()

For surface plots, the same class-specific styling is supported.

vis_surface = as_visualizer(task, learner = learner, type = "surface")
vis_surface$add_training_data(
  color = c(pos = "red", neg = "blue"),
  shape = c(pos = 0, neg = 1)
)$plot()