Creating Modeling Packages with hardhat

The hardhat package provides infrastructure for building modeling packages with consistent interfaces. It standardizes preprocessing via mold() (training) and forge() (prediction), handling formula, XY, and recipe inputs uniformly.
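
As a minimal sketch of that round trip, the snippet below molds a training set and then forges a test set with the stored blueprint. The `mtcars` data and the 20/12 row split are illustrative assumptions, not something hardhat prescribes.

```r
library(hardhat)

# Illustrative split of a built-in dataset; the split itself is arbitrary.
train <- mtcars[1:20, ]
test <- mtcars[21:32, ]

# Training time: mold() preprocesses the data and records a blueprint.
processed <- mold(mpg ~ cyl + disp, train)
names(processed)

# Prediction time: forge() replays the same preprocessing on new data.
forged <- forge(test, processed$blueprint)
names(forged$predictors)
```

The blueprint is the only thing that has to travel from fitting to prediction, which is why the model constructor stores it.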

Quick Reference

| Task | Function |
|------|----------|
| Preprocess training data | mold(x, y) or mold(formula, data) |
| Preprocess prediction data | forge(new_data, blueprint) |
| Create model object | new_model(..., blueprint, class) |
| XY blueprint | default_xy_blueprint(intercept = TRUE) |
| Formula blueprint | default_formula_blueprint(intercept = TRUE) |
| Recipe blueprint | default_recipe_blueprint(intercept = TRUE) |
| Format numeric predictions | spruce_numeric(pred) |
| Format class predictions | spruce_class(pred_class) |
| Format probability predictions | spruce_prob(pred_levels, prob_matrix) |
| Validate univariate outcome | validate_outcomes_are_univariate(outcomes) |
| Validate prediction size | validate_prediction_size(pred, new_data) |

Package Architecture

Stage 1: Model Fitting

User → simple_lm() methods → bridge → implementation → constructor
         (formula/xy/recipe)    ↓           ↓              ↓
                            mold()    lm.fit()      new_model()

Stage 2: Model Prediction

User → predict.simple_lm() → bridge → implementation
              ↓                ↓            ↓
          forge()          switch()   predict_*_numeric()

Model Constructor

Create objects of your model class. Name: new_<model_class>().

new_simple_lm <- function(coefs, coef_names, blueprint) {
  if (!is.numeric(coefs)) {
    stop("`coefs` should be a numeric vector.", call. = FALSE)
  }
  if (!is.character(coef_names)) {
    stop("`coef_names` should be a character vector.", call. = FALSE)
  }

  new_model(
    coefs = coefs,
    coef_names = coef_names,
    blueprint = blueprint,
    class = "simple_lm"
  )
}
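
To see what such a constructor hands back, the sketch below calls `new_model()` directly. The coefficient value `1` and the name `"my_coef"` are placeholders chosen for illustration.

```r
library(hardhat)

# new_model() builds a classed list; the blueprint must be a hardhat blueprint.
manual_model <- new_model(
  coefs = 1,
  coef_names = "my_coef",
  blueprint = default_xy_blueprint(),
  class = "simple_lm"
)

# The object carries the requested class and the named elements passed in.
inherits(manual_model, "simple_lm")
names(manual_model)
```

Keeping type checks in the constructor, as `new_simple_lm()` does, means every code path that creates a model object goes through the same validation.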

Implementation Function

Core algorithm. Name: <model_class>_impl(). Returns named list of model elements.

simple_lm_impl <- function(predictors, outcomes) {
  lm_fit <- lm.fit(predictors, outcomes)
  coefs <- lm_fit$coefficients

  list(
    coefs = unname(coefs),
    coef_names = names(coefs)
  )
}
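
The implementation function can be exercised on its own, without hardhat, which makes it easy to test. The sketch below repeats the definition so it runs standalone and compares the result with `lm()`; the choice of `mtcars` columns is an assumption made for the example.

```r
# Same definition as above, repeated so the snippet runs on its own.
simple_lm_impl <- function(predictors, outcomes) {
  lm_fit <- lm.fit(predictors, outcomes)
  coefs <- lm_fit$coefficients
  list(coefs = unname(coefs), coef_names = names(coefs))
}

# lm.fit() does not add an intercept, so one is supplied as a column here.
X <- cbind(`(Intercept)` = 1, as.matrix(mtcars[, c("cyl", "disp")]))
y <- mtcars$mpg

fit <- simple_lm_impl(X, y)

# Compare against the coefficients from the formula interface of lm().
reference <- coef(lm(mpg ~ cyl + disp, data = mtcars))
all.equal(fit$coefs, unname(reference))
```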

Bridge Function

Connects user-facing methods to implementation. Converts mold() output to implementation format.

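simple_lm_bridge <- function(processed) {
  # This model supports a single outcome column only
  validate_outcomes_are_univariate(processed$outcomes)

  # mold() returns tibbles; lm.fit() needs a matrix and a vector
  predictors <- as.matrix(processed$predictors)
  outcomes <- processed$outcomes[[1]]

  fit <- simple_lm_impl(predictors, outcomes)

  # Store the blueprint so predict() can forge() new data later
  new_simple_lm(
    coefs = fit$coefs,
    coef_names = fit$coef_names,
    blueprint = processed$blueprint
  )
}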

User-Facing Fitting Function

Generic with methods for each interface. Each method calls mold() then the bridge.

simple_lm <- function(x, ...) {
  UseMethod("simple_lm")
}

simple_lm.default <- function(x, ...) {
  stop("`simple_lm()` is not defined for a '", class(x)[1], "'.", call. = FALSE)
}

simple_lm.data.frame <- function(x, y, intercept = TRUE, ...) {
  blueprint <- default_xy_blueprint(intercept = intercept)
  processed <- mold(x, y, blueprint = blueprint)
  simple_lm_bridge(processed)
}

simple_lm.matrix <- function(x, y, intercept = TRUE, ...) {
  blueprint <- default_xy_blueprint(intercept = intercept)
  processed <- mold(x, y, blueprint = blueprint)
  simple_lm_bridge(processed)
}

simple_lm.formula <- function(formula, data, intercept = TRUE, ...) {
  blueprint <- default_formula_blueprint(intercept = intercept)
  processed <- mold(formula, data, blueprint = blueprint)
  simple_lm_bridge(processed)
}

simple_lm.recipe <- function(x, data, intercept = TRUE, ...) {
  blueprint <- default_recipe_blueprint(intercept = intercept)
  processed <- mold(x, data, blueprint = blueprint)
  simple_lm_bridge(processed)
}
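
Because every method funnels through `mold()`, the bridge sees the same structure regardless of interface. The sketch below compares the XY and formula interfaces on `mtcars` (an illustrative dataset); the recipe interface is left out only because it would need the recipes package.

```r
library(hardhat)

# XY interface: predictors as a data frame, outcome as a vector.
xy <- mold(
  mtcars[, c("cyl", "disp")],
  mtcars$mpg,
  blueprint = default_xy_blueprint(intercept = TRUE)
)

# Formula interface: same model, expressed as a formula.
fm <- mold(
  mpg ~ cyl + disp,
  mtcars,
  blueprint = default_formula_blueprint(intercept = TRUE)
)

# Predictor columns agree, including the added intercept column.
names(xy$predictors)
names(fm$predictors)

# Outcome names differ: a bare vector y is stored as ".outcome".
names(xy$outcomes)
names(fm$outcomes)
```

This is why the bridge reads the outcome by position (`processed$outcomes[[1]]`) rather than by name.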

Prediction Implementation

One function per prediction type. Use spruce_*() for standardized output.

predict_simple_lm_numeric <- function(object, predictors) {
  coefs <- object$coefs
  pred <- as.vector(predictors %*% coefs)
  spruce_numeric(pred)  # Returns tibble with .pred column
}

Prediction Bridge

Converts forge() output and switches on type.

predict_simple_lm_bridge <- function(type, object, predictors) {
  type <- rlang::arg_match(type, "numeric")
  predictors <- as.matrix(predictors)

  switch(
    type,
    numeric = predict_simple_lm_numeric(object, predictors)
  )
}

User-Facing Predict Method

Call forge() with blueprint, then bridge, then validate.

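predict.simple_lm <- function(object, new_data, type = "numeric", ...) {
  # Enforces column names, types, and order against the training blueprint
  processed <- forge(new_data, object$blueprint)
  out <- predict_simple_lm_bridge(type, object, processed$predictors)
  # Errors unless there is one prediction row per row of new_data
  validate_prediction_size(out, new_data)
  out
}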
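
The fitting and prediction stages can be traced end to end without defining any S3 methods. The sketch below inlines the steps from the sections above; the `mtcars` split and the variable names are illustrative assumptions.

```r
library(hardhat)

train <- mtcars[1:20, ]
test <- mtcars[21:32, ]

# Fit: mold with an intercept column, then least squares via lm.fit().
bp <- default_formula_blueprint(intercept = TRUE)
processed <- mold(mpg ~ cyl + disp, train, blueprint = bp)
coefs <- lm.fit(
  as.matrix(processed$predictors),
  processed$outcomes[[1]]
)$coefficients

# Predict: forge with the stored blueprint, multiply, then standardize.
forged <- forge(test, processed$blueprint)
pred <- as.vector(as.matrix(forged$predictors) %*% coefs)
out <- spruce_numeric(pred)

# Errors if the number of predictions differs from the rows of new data.
validate_prediction_size(out, test)
out
```

Since `forge()` adds the intercept column recorded in the blueprint, the matrix product lines up with the fitted coefficients without any extra bookkeeping.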

mold() Details

Returns: predictors (tibble), outcomes (tibble), extras, blueprint.

Blueprint Options

| Blueprint | Key Options |
|-----------|-------------|
| default_xy_blueprint() | intercept |
| default_formula_blueprint() | intercept, indicators ("traditional", "none", "one_hot") |
| default_recipe_blueprint() | intercept |
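
As a rough illustration of the `indicators` option, the sketch below molds the same formula twice on `iris` (an illustrative dataset) and compares how the `Species` factor is represented.

```r
library(hardhat)

# "one_hot": one indicator column per factor level.
one_hot <- mold(
  Sepal.Length ~ Species,
  iris,
  blueprint = default_formula_blueprint(indicators = "one_hot")
)
ncol(one_hot$predictors)

# "none": the factor is passed through without expansion.
unexpanded <- mold(
  Sepal.Length ~ Species,
  iris,
  blueprint = default_formula_blueprint(indicators = "none")
)
is.factor(unexpanded$predictors$Species)
```

`indicators = "none"` suits models that handle factors natively, such as tree-based methods.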

Formula Special Behaviors

  • No intercept by default (unlike base R)
  • indicators = "none" keeps factors unexpanded
  • Multivariate outcomes: y1 + y2 ~ x1 + x2 (not cbind())
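
The intercept and multivariate behaviors listed above can be checked with a short sketch. The `iris` columns used here are arbitrary choices for illustration.

```r
library(hardhat)

# Two outcomes on the left-hand side, written with `+` rather than cbind().
processed <- mold(Sepal.Length + Sepal.Width ~ Petal.Length, iris)
names(processed$outcomes)

# No intercept column unless the blueprint asks for one.
"(Intercept)" %in% names(processed$predictors)

with_intercept <- mold(
  Sepal.Length + Sepal.Width ~ Petal.Length,
  iris,
  blueprint = default_formula_blueprint(intercept = TRUE)
)
"(Intercept)" %in% names(with_intercept$predictors)
```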

forge() Validation

Automatically validates that new data matches the training data:

  • Column names must match; a missing column is an error
  • Column types must be compatible; an incompatible type (for example, a double where a factor is expected) is an error
  • A character column is silently converted when training used a factor
  • Factor levels should be a subset of the training levels; a novel level is a lossy conversion that emits a warning and becomes NA
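
Each of these cases can be reproduced with a small sketch. The toy data frame and the `outcome_of()` helper are hypothetical, written only for this example; the blueprint uses `indicators = "none"` so the factor column stays visible.

```r
library(hardhat)

train <- data.frame(
  y = c(1, 2, 3, 4),
  x = c(1.5, 2.5, 3.5, 4.5),
  f = factor(c("a", "b", "a", "b"))
)
bp <- mold(
  y ~ x + f,
  train,
  blueprint = default_formula_blueprint(indicators = "none")
)$blueprint

# Hypothetical helper: report whether a call succeeds, warns, or errors.
outcome_of <- function(expr) {
  tryCatch(
    {
      expr
      "ok"
    },
    warning = function(w) "warning",
    error = function(e) "error"
  )
}

# Missing column `f`
missing_col <- outcome_of(forge(train["x"], bp))
# Double supplied where a factor is expected
wrong_type <- outcome_of(forge(data.frame(x = 1, f = 2), bp))
# Character supplied where a factor is expected
chr_for_fct <- outcome_of(
  forge(data.frame(x = 1, f = "a", stringsAsFactors = FALSE), bp)
)
# Factor level "c" never seen in training
novel_level <- outcome_of(forge(data.frame(x = 1, f = factor("c")), bp))

c(missing_col, wrong_type, chr_for_fct, novel_level)
```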

Spruce Functions

Standardize prediction output to tidymodels conventions:

| Function | Output Column |
|----------|---------------|
| spruce_numeric(pred) | .pred |
| spruce_class(pred_class) | .pred_class |
| spruce_prob(pred_levels, prob_matrix) | .pred_{level}, one column per level |
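
A short sketch of the three helpers follows, using made-up predictions. Note that `spruce_class()` expects a factor, and `spruce_prob()` takes the class levels and a probability matrix with one column per level.

```r
library(hardhat)

# Numeric predictions -> tibble with a `.pred` column.
num <- spruce_numeric(c(21.4, 18.9))

# Class predictions (must be a factor) -> `.pred_class`.
cls <- spruce_class(factor(c("a", "b")))

# Class probabilities -> one `.pred_<level>` column per level.
probs <- spruce_prob(
  pred_levels = c("a", "b"),
  prob_matrix = matrix(c(0.3, 0.7, 0.6, 0.4), ncol = 2, byrow = TRUE)
)

names(num)
names(cls)
names(probs)
```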

Validation Functions

| Function | Checks |
|----------|--------|
| validate_outcomes_are_univariate() | Single outcome column |
| validate_prediction_size() | Output rows == input rows |
| validate_outcomes_are_numeric() | Numeric outcomes |
| validate_predictors_are_numeric() | Numeric predictors |
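
The validators signal an error when a check fails and otherwise return quietly. The sketch below triggers each one with deliberately bad toy inputs; the `failed()` helper is hypothetical and exists only to turn an error into TRUE/FALSE.

```r
library(hardhat)

# Hypothetical helper: TRUE if evaluating `expr` raises an error.
failed <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

# Two outcome columns where one is required.
multi <- failed(validate_outcomes_are_univariate(data.frame(y1 = 1:3, y2 = 4:6)))

# Two predictions for three rows of new data.
size <- failed(validate_prediction_size(spruce_numeric(c(1, 2)), data.frame(x = 1:3)))

# A character predictor where numeric is required.
non_numeric <- failed(validate_predictors_are_numeric(data.frame(x = "a")))

# A numeric outcome passes.
numeric_ok <- failed(validate_outcomes_are_numeric(data.frame(y = 1:3)))

c(multi = multi, size = size, non_numeric = non_numeric, numeric_ok = numeric_ok)
```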

See Also

  • designing-tidy-r-functions: Function API design
  • r-metaprogramming: Expression manipulation (if customizing blueprints)
  • testing-r-packages: Testing patterns

Vignettes

Access detailed documentation via R:

# Open vignette in browser
RShowDoc("mold", package = "hardhat")    # Molding data for modeling
RShowDoc("forge", package = "hardhat")   # Forging data for predictions
RShowDoc("package", package = "hardhat") # Creating modeling packages

# Or browse all vignettes
browseVignettes("hardhat")
