Build with Private AI.

Integrate RedPill's Private AI into your app with a simple API. Access dozens of AI models through one secure endpoint. No more juggling multiple AI APIs or worrying about data compliance.

Documentation|Get API Key

Key Features for Developers

Unified API for 60+ Models

One API key unlocks GPT-4, Claude, Llama, Mistral and more. No vendor lock-in - switch models or use Smart Router to auto-select the best model per request.

Privacy & Security Built-In

All API calls are processed inside confidential enclaves (trusted execution environments, or TEEs), so sensitive data you send to the API stays unreadable even to us. Ideal for healthcare, legal, or enterprise apps.

Simple SDKs & Docs

SDKs available in Python, JavaScript, and more. Robust REST API with clear documentation. Get started in minutes with our quickstart guides.

Example Use Cases

Add a confidential AI assistant to your app. Process user data with AI without storing it. Use RedPill as a secure backend for chatbots or automation.

Flexible Deployment

Enterprise options for dedicated private instances. Deploy on-prem or in your VPC for maximum control and compliance with your organization's policies.

Performance & Cost Controls

Smart Router picks an appropriate model for each request, which keeps costs down. Rate limits and flexible pricing tiers are available.

View Pricing

Just a few lines of code

YOUR CODE.
OUR PRIVACY.

Integrate private AI into your app with simple SDKs. OpenAI-compatible API means minimal code changes to switch from other providers.

View Full Docs

redpill-chat.js

// Node.js / JavaScript SDK
import RedPill from 'redpill-sdk';

const client = new RedPill({
  apiKey: process.env.REDPILL_API_KEY
});

// Simple chat completion
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Summarize this contract' }
  ]
});
console.log(response.choices[0].message.content);

// With streaming
const stream = await client.chat.completions.create({
  model: 'claude-3-opus',
  messages: [{ role: 'user', content: 'Write a haiku' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
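Because the request shape is the same for every model, switching models can be a one-string change. The sketch below illustrates that idea only. `buildChatRequest` is a hypothetical helper, not part of any RedPill SDK, and it assumes the OpenAI-style request body used in the sample above.

```javascript
// Hypothetical helper (an assumption, not a RedPill SDK function):
// builds an OpenAI-style chat request body so that the model name
// is the only thing that changes between providers.
function buildChatRequest(model, userPrompt, options = {}) {
  return {
    model,
    messages: [{ role: 'user', content: userPrompt }],
    ...options, // e.g. { stream: true }
  };
}

const prompt = 'Summarize this contract';

// Same prompt, two different models: only the first argument differs.
const gptRequest = buildChatRequest('gpt-4', prompt);
const claudeRequest = buildChatRequest('claude-3-opus', prompt, { stream: true });
```

Either object could then be passed to `client.chat.completions.create(...)` as in the sample above.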

Explore AI Models

From private models in GPU TEE to all your favorites.

Z.AI: GLM 5

New|GPU TEE

GLM-5 is an open-source foundation model built for complex systems engineering and long-horizon agent workflows. It delivers production-grade productivity for large-scale programming tasks, with performance aligned to top closed-source models, and is designed for expert developers building at the system level.

by phala|203K context|$1.20/M input|$3.50/M output

Intel TDX|NVIDIA CC

Z.AI: GLM 4.7 Flash

GPU TEE

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

by phala|203K context|$0.10/M input|$0.43/M output

Intel TDX|NVIDIA CC|BETA

Qwen: Qwen3 Embedding 8B

GPU TEE

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

by phala|33K context|$0.01/M input|$0.00/M output

Intel TDX|NVIDIA CC

Phala: Venice Uncensored 24B

GPU TEE

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing default safety and alignment layers typically found in mainstream assistant models.

by phala|33K context|$0.20/M input|$0.90/M output

Intel TDX|NVIDIA CC

Qwen: Qwen3 VL 30B A3B Instruct

GPU TEE

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results. For agentic use, it handles multi-image multi-turn instructions, video timeline alignments, GUI automation, and visual coding from sketches to debugged UI. Text performance matches flagship Qwen3 models, suiting document AI, OCR, UI assistance, spatial tasks, and agent research.

by phala|128K context|$0.20/M input|$0.70/M output

Intel TDX|NVIDIA CC

Sentence Transformers: all-MiniLM-L6-v2

GPU TEE

The all-MiniLM-L6-v2 embedding model maps sentences and short paragraphs into a 384-dimensional dense vector space, enabling high-quality semantic representations that are ideal for downstream tasks such as information retrieval, clustering, similarity scoring, and text ranking.

by phala|512 context|$0.005/M input|$0.00/M output

Intel TDX|NVIDIA CC
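The per-million-token prices on the model cards above translate into a per-request cost with simple arithmetic. The sketch below is a rough estimator, not an official billing formula. `estimateCostUSD` is a hypothetical helper, and the keys are display labels taken from the cards, not confirmed API model IDs.

```javascript
// Per-million-token prices in USD, copied from the model cards above.
// Keys are display labels (an assumption), not API model identifiers.
const PRICES_PER_MILLION = {
  'GLM 5': { input: 1.20, output: 3.50 },
  'GLM 4.7 Flash': { input: 0.10, output: 0.43 },
  'Venice Uncensored 24B': { input: 0.20, output: 0.90 },
};

// Hypothetical helper: estimates the cost of a single request in USD
// from its input and output token counts.
function estimateCostUSD(modelLabel, inputTokens, outputTokens) {
  const price = PRICES_PER_MILLION[modelLabel];
  if (!price) {
    throw new Error(`No price listed for model: ${modelLabel}`);
  }
  return (inputTokens / 1e6) * price.input + (outputTokens / 1e6) * price.output;
}

// Example: a 50,000-token prompt with a 2,000-token reply.
const cost = estimateCostUSD('GLM 4.7 Flash', 50000, 2000);
console.log(`Estimated cost: $${cost.toFixed(5)}`);
```

For that example the estimate works out to about $0.00586: $0.005 for input plus $0.00086 for output.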

View All Models

Start Building.

API Documentation

Comprehensive guides, API references, and tutorials to help you integrate RedPill into your applications. Try the interactive playground or get a free API key.

View Documentation|Get API Key

Developer Community

Join our Discord community to connect with other developers, get help with integration questions, and share what you're building with RedPill.

Join Discord|GitHub

Ready to experience private AI?

Try RedPill in our Private AI Playground - no signup needed. Your conversations stay encrypted and completely private.

Try RedPill Free

Private Chat

E2E Encrypted

Hi! I'm your private AI assistant. Ask me anything - your conversations are fully encrypted.

Zero data retention|TEE secured

Privacy-first AI solutions that keep your data secure and confidential.
