sd2R


sd2R is an R package that provides a native, GPU-accelerated Stable Diffusion pipeline by wrapping the C++ implementation from stable-diffusion.cpp and using ggmlR as the tensor backend.

Overview

sd2R exposes a high-level R interface for text-to-image and image-to-image generation, while all heavy computation (tokenization, encoders, denoiser, sampler, VAE, model loading) is implemented in C++. It supports the SD 1.x, SD 2.x, SDXL, and Flux model families, and targets local inference on Linux with Vulkan-enabled AMD GPUs (with automatic CPU fallback via ggml), without relying on external Python tooling or web APIs.
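As a sketch of what the high-level interface looks like, a minimal text-to-image run using the pipeline functions shown later in this README might be (the model path, prompt, and output path are placeholders; this needs model weights and will not run without them):

```r
library(sd2R)

# Load a model checkpoint into a generation context
ctx <- sd_ctx("model.safetensors")

# A minimal two-node pipeline: generate, then write to disk
pipe <- sd_pipeline(
  sd_node("txt2img", prompt = "a lighthouse at dusk", width = 512, height = 512),
  sd_node("save", path = "lighthouse.png")
)

sd_run_pipeline(pipe, ctx)
```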

Architecture

The full stack, with no Python anywhere in the chain:

R  →  sd2R  →  ggmlR  →  ggml  →  Vulkan  →  GPU

Key Features

Pipeline Example

pipe <- sd_pipeline(
  sd_node("txt2img", prompt = "a cat in space", width = 512, height = 512),
  sd_node("upscale", factor = 2),
  sd_node("img2img", strength = 0.3),
  sd_node("save", path = "output.png")
)

# Save / load as JSON
sd_save_pipeline(pipe, "my_pipeline.json")
pipe <- sd_load_pipeline("my_pipeline.json")

# Run (`upscaler` is an upscaler context created separately,
# used by the "upscale" node)
ctx <- sd_ctx("model.safetensors")
sd_run_pipeline(pipe, ctx, upscaler_ctx = upscaler)

Implementation Details

CRAN Readiness

Installation

# Install ggmlR first (if not already installed)
remotes::install_github("Zabis13/ggmlR")

# Install sd2R
remotes::install_github("Zabis13/sd2R")

During installation, the configure script automatically downloads tokenizer vocabulary files (~128 MB total) from GitHub Releases. This requires curl or wget to be available on the PATH.
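To check up front whether the configure script will be able to fetch the files, you can probe for a downloader from R (a trivial check, not part of sd2R itself):

```r
# TRUE if either curl or wget is found on the PATH
has_downloader <- nzchar(Sys.which("curl")) || nzchar(Sys.which("wget"))
has_downloader
```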

Offline / Manual Installation

If you don’t have internet access during installation, download the vocabulary files manually and place them into src/sd/ before building:

# Download from https://github.com/Zabis13/sd2R/releases/tag/assets
# Files: vocab.hpp, vocab_mistral.hpp, vocab_qwen.hpp, vocab_umt5.hpp

wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_mistral.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_qwen.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_umt5.hpp -P src/sd/
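Before building, it is worth confirming that all four headers actually landed in src/sd/ (a quick sanity check run from the package source directory, not part of the package):

```r
vocab_files <- file.path("src/sd", c(
  "vocab.hpp", "vocab_mistral.hpp", "vocab_qwen.hpp", "vocab_umt5.hpp"
))
# Named logical vector: TRUE where the file exists
setNames(file.exists(vocab_files), basename(vocab_files))
```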

# Then build from the package source directory
R CMD INSTALL .

System Requirements

Benchmarks

FLUX.1-dev Q4_K_S — 10 steps

CLIP-L + T5-XXL text encoders, VAE. sample_steps = 10.

Test                        AMD RX 9070 (16 GB)   Tesla P100 (16 GB)   2x Tesla T4 (16 GB)
1. 768x768 direct           44.2 s                94.0 s               133.1 s
2. 1024x1024 tiled VAE      163.6 s               151.4 s              243.6 s
3. 2048x1024 highres fix    309.7 s               312.5 s              492.2 s
4. img2img 768x768 direct   29.6 s                51.0 s               73.5 s
5. 1024x1024 direct         163.0 s               152.2 s              243.3 s
6. Multi-GPU 4 prompts      n/a                   n/a                  284.9 s (4 img)

FLUX.1-dev Q4_K_S — 25 steps

CLIP-L + T5-XXL (Q5_K_M) text encoders, VAE. sample_steps = 25.

Test                AMD RX 9070 (16 GB)   2x Tesla T4 (16 GB)
768x768 direct      110.8 s               n/a
1024x1024 direct    n/a                   553.1 s
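The 10-step and 25-step runs make it possible to back out the marginal cost of one sampling step, since the fixed overhead (encoders, VAE, model load) cancels between them. Assuming the 110.8 s 768x768 figure is the RX 9070 run, the arithmetic is:

```r
# RX 9070, 768x768 direct: 44.2 s at 10 steps, 110.8 s at 25 steps
# (the column assignment of the 110.8 s figure is an assumption)
t10 <- 44.2
t25 <- 110.8
per_step <- (t25 - t10) / (25 - 10)   # marginal cost of one sampling step
fixed    <- t10 - 10 * per_step       # implied fixed overhead
c(per_step = per_step, fixed = fixed)
# per_step comes out around 4.4 s; fixed is near zero (slightly
# negative, i.e. within measurement noise)
```

The ~4.4 s/step estimate is in the same range as the ~3.9 s/step figure quoted in the model size comparison.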

Model size comparison

                              SD 1.5         Flux Q4_K_S
Diffusion params              ~860 MB        ~6.5 GB
Text encoders                 CLIP ~240 MB   CLIP-L + T5-XXL ~3.9 GB
Sampling per step (768x768)   ~0.1–0.3 s     ~3.9 s
Architecture                  UNet           MMDiT (57 blocks)

Examples

For a live, runnable demo see the Kaggle notebook: Stable Diffusion in R (ggmlR + Vulkan GPU).

See Also

License

MIT