Phala: Gemma-4 26B-A4B Uncensored (Heretic)
Uncensored "Heretic" variant of google/gemma-4-26B-A4B-it, created with Heretic v1.2.0 using the Arbitrary-Rank Ablation (ARA) method with row-norm preservation. Refusals drop from 100/100 to 11/100, with a KL divergence of 0.0499 relative to the base model. The base Gemma 4 26B A4B is a Mixture-of-Experts model with 25.2B total / 3.8B active parameters (8 of 128 experts active), built on a 30-layer transformer with hybrid local sliding-window (1024) + global attention and a 256K context window. It is natively multimodal (text + images, variable aspect ratios) and strong on coding, reasoning, and function calling, with native system prompt support and coverage of 35+ languages. Served on Phala in a TDX-attested H200 enclave with end-to-end ECDSA response signing, using a vLLM-compatible FP8-Static quantization by cloud19 (router excluded from quantization).
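The ratios behind the figures above are easy to check by hand. The short sketch below only restates numbers quoted in the description; the variable and function names are illustrative. The active-parameter share comes out higher than the active-expert share, presumably because non-expert weights (attention, embeddings) run for every token, though the description does not break this down.

```javascript
// Figures quoted in the model description; names here are illustrative only.
const totalParamsB = 25.2;   // total parameters, in billions
const activeParamsB = 3.8;   // parameters active per token, in billions
const activeExperts = 8;
const totalExperts = 128;
const refusalsBefore = 100;  // refusals out of 100 prompts, base model
const refusalsAfter = 11;    // refusals out of 100 prompts, Heretic variant

// Express part/whole as a percentage.
function percent(part, whole) {
  return (part / whole) * 100;
}

const activeParamShare = percent(activeParamsB, totalParamsB);    // roughly 15.1
const activeExpertShare = percent(activeExperts, totalExperts);   // 6.25
const refusalReduction = percent(refusalsBefore - refusalsAfter, refusalsBefore); // about 89

console.log(`Active parameter share: ${activeParamShare.toFixed(1)}%`);
console.log(`Active expert share: ${activeExpertShare.toFixed(2)}%`);
console.log(`Relative drop in refusals: ${refusalReduction.toFixed(0)}%`);
```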
Providers for Phala: Gemma-4 26B-A4B Uncensored (Heretic)
RedPill routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
API
RedPill provides an OpenAI-compatible completion API for all models and providers. You can call it directly or through the OpenAI SDK; some third-party SDKs are also available.
fetch("https://api.redpill.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <YOUR-REDPILL-API-KEY>",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "phala/gemma-4-26b-a4b-uncensored",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  })
})
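
The call above returns a promise and does not read the response. As a minimal sketch, the request can be wrapped so that the reply text is extracted and HTTP errors surface. The helper names (buildChatRequest, extractReply, ask) are hypothetical, and the response shape (choices[0].message.content) is assumed from the API being described as OpenAI-compatible rather than confirmed by this page. The optional system message relies on the native system prompt support mentioned in the model description.

```javascript
// Endpoint and model id are taken from the example above.
const REDPILL_URL = "https://api.redpill.ai/v1/chat/completions";
const MODEL_ID = "phala/gemma-4-26b-a4b-uncensored";

// Hypothetical helper: build the fetch() options for a single-turn chat request.
// systemPrompt is optional; when given, it is sent as a leading "system" message.
function buildChatRequest(apiKey, userContent, systemPrompt) {
  const messages = [];
  if (systemPrompt) {
    messages.push({ role: "system", content: systemPrompt });
  }
  messages.push({ role: "user", content: userContent });
  return {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: MODEL_ID, messages }),
  };
}

// Hypothetical helper: pull the assistant text out of a parsed response body.
// Assumption: the body follows the OpenAI chat completion shape.
function extractReply(completion) {
  const choice = completion?.choices?.[0];
  if (!choice || !choice.message) {
    throw new Error("Unexpected response shape: no choices[0].message");
  }
  return choice.message.content;
}

// Send the request and return the assistant's reply as a string.
async function ask(apiKey, userContent, systemPrompt) {
  const response = await fetch(
    REDPILL_URL,
    buildChatRequest(apiKey, userContent, systemPrompt)
  );
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return extractReply(await response.json());
}
```

A call would then look like ask("<YOUR-REDPILL-API-KEY>", "What is the meaning of life?").then(console.log), in any runtime that provides a global fetch.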