Phala: Gemma-4 26B-A4B Uncensored (Heretic)

GPU TEE
Chat
phala/gemma-4-26b-a4b-uncensored
Created May 23, 2026 | 66K context | $0.15/M input tokens | $0.70/M output tokens
Intel TDX | NVIDIA CC

Uncensored "Heretic" variant of google/gemma-4-26B-A4B-it, created with Heretic v1.2.0 using the Arbitrary-Rank Ablation (ARA) method with row-norm preservation. Refusals drop from 100/100 to 11/100, with a KL divergence of 0.0499 relative to the base model. The base Gemma 4 26B A4B is a Mixture-of-Experts model with 25.2B total and 3.8B active parameters (8 of 128 experts active per token). It is a 30-layer transformer with hybrid local sliding-window (1024) and global attention, and it supports a 256K context window; this endpoint is served with a 66K context. The model is natively multimodal (text and images, variable aspect ratios) and strong on coding, reasoning, and function calling, with native system prompt support across 35+ languages. It is served on Phala in a TDX-attested H200 enclave with end-to-end ECDSA response signing, using a vLLM-compatible FP8-Static quantization by cloud19 (router excluded from quantization).
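To make the listed pricing concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. The rates ($0.15 per million input tokens, $0.70 per million output tokens) come from the listing above; the function name `estimateCostUsd` and the example token counts are illustrative assumptions, not part of any RedPill API.

```javascript
// Listed prices in USD per million tokens (taken from the model listing).
const INPUT_USD_PER_MILLION = 0.15;
const OUTPUT_USD_PER_MILLION = 0.70;

// Hypothetical helper: estimate the cost of one request in USD.
function estimateCostUsd(inputTokens, outputTokens) {
  const valid = (n) => Number.isFinite(n) && n >= 0;
  if (!valid(inputTokens) || !valid(outputTokens)) {
    throw new RangeError("Token counts must be non-negative finite numbers");
  }
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    1_000_000
  );
}

// Example: a 50,000-token prompt with a 2,000-token reply.
console.log(estimateCostUsd(50_000, 2_000).toFixed(4)); // "0.0089"
```

Output tokens cost roughly 4.7 times as much as input tokens at these rates, so long completions tend to dominate the bill even when prompts are large.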

Providers for Phala: Gemma-4 26B-A4B Uncensored (Heretic)

RedPill routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

phala
Total Context: 66K

API

RedPill provides an OpenAI-compatible completion API for all models and providers. You can call it directly or through the OpenAI SDK. Some third-party SDKs are also available.

fetch("https://api.redpill.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <YOUR-REDPILL-API-KEY>",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "phala/gemma-4-26b-a4b-uncensored",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  })
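})
  .then((response) => response.json()) // parse the JSON response body
  .then((data) => console.log(data)) // print the completion object
  .catch((error) => console.error(error)); // report network or parsing errors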
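The request above can be wrapped in a small reusable helper. The following is a sketch under stated assumptions: the helper names (`buildChatRequest`, `extractReply`, `chat`) are hypothetical, and the response is assumed to follow the OpenAI chat completion shape (`choices[0].message.content`), which the "OpenAI-compatible" description suggests but this page does not show. It relies on a global `fetch`, as found in browsers and Node.js 18+.

```javascript
const REDPILL_CHAT_URL = "https://api.redpill.ai/v1/chat/completions";
const MODEL_ID = "phala/gemma-4-26b-a4b-uncensored";

// Build the URL and fetch options for a single-turn chat request.
function buildChatRequest(apiKey, prompt) {
  return {
    url: REDPILL_CHAT_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: MODEL_ID,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Pull the assistant text out of a completion object.
// Assumes the OpenAI-style shape: choices[0].message.content.
function extractReply(completion) {
  const content = completion?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected response shape: no message content found");
  }
  return content;
}

// Send the request and return the reply text.
// fetchImpl is injectable so the function can be exercised without network access.
async function chat(apiKey, prompt, fetchImpl = fetch) {
  const { url, options } = buildChatRequest(apiKey, prompt);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return extractReply(await response.json());
}
```

A call would then look like `chat("<YOUR-REDPILL-API-KEY>", "What is the meaning of life?").then(console.log)`. Keeping request construction and response parsing in separate functions makes each piece easy to check on its own.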