Build with Private AI.

Integrate RedPill's Private AI into your app with a simple API. Access dozens of AI models through one secure endpoint. No more juggling multiple AI APIs or worrying about data compliance.

Key Features for Developers

Unified API for 60+ Models

One API key unlocks GPT-4, Claude, Llama, Mistral, and more. No vendor lock-in: switch models at any time, or let Smart Router select a suitable model for each request.
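Because every model sits behind the same request shape, falling back from one model to another can be a loop over model ids. The sketch below is one possible way to do that, not RedPill's own mechanism. `completeWithFallback` and the injected `callModel` function are hypothetical names introduced for illustration; the model ids are the ones used in the SDK sample on this page.

```javascript
// Try each model id in order and return the first successful response.
// `callModel` is a hypothetical function supplied by the caller: it sends
// one chat request and rejects (throws) on failure.
async function completeWithFallback(callModel, models, messages) {
  const errors = [];
  for (const model of models) {
    try {
      const response = await callModel({ model, messages });
      return { model, response };
    } catch (err) {
      // Remember why this model failed, then move on to the next one.
      errors.push(`${model}: ${err.message}`);
    }
  }
  throw new Error(`All models failed: ${errors.join('; ')}`);
}
```

With the SDK client shown in the sample on this page, `callModel` could plausibly be `(req) => client.chat.completions.create(req)`, called with a list such as `['gpt-4', 'claude-3-opus']`.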

Privacy & Security Built-In

All API calls are processed inside confidential enclaves (trusted execution environments), so the sensitive data you send is unreadable even to us. Ideal for healthcare, legal, or enterprise apps.

Simple SDKs & Docs

SDKs available in Python, JavaScript, and more. Robust REST API with clear documentation. Get started in minutes with our quickstart guides.

Example Use Cases

Add a confidential AI assistant to your app. Process user data with AI without storing it. Use RedPill as a secure backend for chatbots or automation.

Flexible Deployment

Enterprise options for dedicated private instances. Deploy on-prem or in your VPC for maximum control and compliance with your organization's policies.

Performance & Cost Controls

Smart Router routes each request to an appropriate model, so you avoid paying for a larger model than the request needs. Rate limits and flexible pricing tiers are available.
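To see why model choice matters for cost, the arithmetic can be sketched with the per-million-token prices listed for GLM 5 and GLM 4.7 Flash in the model list on this page. This only illustrates the pricing math; it does not describe how Smart Router actually selects a model. The function name and the example token counts are assumptions.

```javascript
// Per-million-token prices in USD, copied from the model list on this page.
const PRICING = {
  'GLM 5': { inputPerM: 1.20, outputPerM: 3.50 },
  'GLM 4.7 Flash': { inputPerM: 0.10, outputPerM: 0.43 },
};

// Estimate the cost of one request from its input and output token counts.
function estimateCostUsd(modelName, inputTokens, outputTokens) {
  const price = PRICING[modelName];
  if (!price) throw new Error(`No pricing for model: ${modelName}`);
  return (inputTokens / 1e6) * price.inputPerM
       + (outputTokens / 1e6) * price.outputPerM;
}

// Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
for (const name of Object.keys(PRICING)) {
  console.log(name, estimateCostUsd(name, 10000, 2000).toFixed(5));
}
```

For this workload the estimate is roughly $0.019 on GLM 5 and under $0.002 on GLM 4.7 Flash, which is the kind of gap that per-request routing is meant to exploit.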

Just a few lines of code

YOUR CODE.
OUR PRIVACY.

Integrate private AI into your app with simple SDKs. OpenAI-compatible API means minimal code changes to switch from other providers.
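In practice, "OpenAI-compatible" usually means the same `/chat/completions` path, Bearer-token authentication, and JSON body that OpenAI clients already send, so switching providers mostly means changing the base URL and key. The sketch below assumes that convention holds here. The base URL is a deliberate placeholder (the real endpoint is not given on this page and should be taken from the official docs), and `buildChatRequest` is a hypothetical helper.

```javascript
// Placeholder only: NOT RedPill's real endpoint. Use the value from the docs.
const BASE_URL = 'https://api.example.invalid/v1';

// Build an OpenAI-style chat completion request for any compatible provider.
function buildChatRequest(baseUrl, apiKey, body) {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

const { url, init } = buildChatRequest(BASE_URL, 'YOUR_API_KEY', {
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Summarize this contract' }],
});
// Sending it is a single call: const res = await fetch(url, init);
console.log(url);
```

Keeping the base URL and key as parameters means the same helper can point at another OpenAI-style provider without touching the request body.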

View Full Docs
redpill-chat.js
// Node.js / JavaScript SDK
import RedPill from 'redpill-sdk';

const client = new RedPill({
  apiKey: process.env.REDPILL_API_KEY
});

// Simple chat completion
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Summarize this contract' }
  ]
});
console.log(response.choices[0].message.content);

// With streaming
const stream = await client.chat.completions.create({
  model: 'claude-3-opus',
  messages: [{ role: 'user', content: 'Write a haiku' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
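When the full text is needed after streaming (for logging or post-processing, say), the deltas can be accumulated instead of written straight to stdout. The sketch below assumes chunks have the same `choices[0].delta.content` shape as in the sample above. `collectStream` is a hypothetical helper, and `fakeStream` is a stand-in so the snippet runs without an API key.

```javascript
// Concatenate the text deltas of a streamed completion into one string.
// Accepts any async iterable whose chunks follow the shape in the sample.
async function collectStream(stream) {
  let text = '';
  for await (const chunk of stream) {
    // Some chunks carry no text (for example, role or finish markers).
    text += chunk.choices[0]?.delta?.content || '';
  }
  return text;
}

// Stand-in stream for local testing; a real one would come from the SDK.
async function* fakeStream() {
  yield { choices: [{ delta: { content: 'Hello' } }] };
  yield { choices: [{ delta: {} }] };
  yield { choices: [{ delta: { content: ', world' } }] };
}

collectStream(fakeStream()).then((text) => console.log(text));
```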

Explore AI Models

From private models running in a GPU TEE to all your favorites.

Z.AI: GLM 5
New|GPU TEE
GLM-5 is an open-source foundation model built for complex systems engineering and long-horizon agent workflows. It delivers production-grade productivity for large-scale programming tasks, with performance aligned to top closed-source models, and is designed for expert developers building at the system level.
by phala|203K context|$1.20/M input|$3.50/M output
Intel TDX|NVIDIA CC
Z.AI: GLM 4.7 Flash
GPU TEE
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.
by phala|203K context|$0.10/M input|$0.43/M output
Intel TDX|NVIDIA CC|BETA
Qwen: Qwen3 Embedding 8B
GPU TEE
The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.
by phala|33K context|$0.01/M input|$0.00/M output
Intel TDX|NVIDIA CC
Phala: Venice Uncensored 24B
GPU TEE
Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing default safety and alignment layers typically found in mainstream assistant models.
by phala|33K context|$0.20/M input|$0.90/M output
Intel TDX|NVIDIA CC
Qwen: Qwen3 VL 30B A3B Instruct
GPU TEE
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results. For agentic use, it handles multi-image multi-turn instructions, video timeline alignments, GUI automation, and visual coding from sketches to debugged UI. Text performance matches flagship Qwen3 models, suiting document AI, OCR, UI assistance, spatial tasks, and agent research.
by phala|128K context|$0.20/M input|$0.70/M output
Intel TDX|NVIDIA CC
Sentence Transformers: all-MiniLM-L6-v2
GPU TEE
The all-MiniLM-L6-v2 embedding model maps sentences and short paragraphs into a 384-dimensional dense vector space, enabling high-quality semantic representations that are ideal for downstream tasks such as information retrieval, clustering, similarity scoring, and text ranking.
by phala|512 context|$0.005/M input|$0.00/M output
Intel TDX|NVIDIA CC
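The embedding models listed above return dense vectors (384 dimensions for all-MiniLM-L6-v2), and the similarity scoring and ranking they are described as supporting is commonly done with cosine similarity. The sketch below shows that calculation on toy 3-dimensional vectors. The function names and the sample data are illustrative assumptions, and obtaining real vectors from the API is not shown.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank documents by similarity to a query vector, most similar first.
function rankBySimilarity(queryVector, docs) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real embeddings.
const ranked = rankBySimilarity([1, 0, 0], [
  { id: 'unrelated', vector: [0, 1, 0] },
  { id: 'close', vector: [0.9, 0.1, 0] },
]);
console.log(ranked.map((r) => r.id));
```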

Start Building.

API Documentation

Comprehensive guides, API references, and tutorials to help you integrate RedPill into your applications. Try the interactive playground or get a free API key.

Developer Community

Join our Discord community to connect with other developers, get help with integration questions, and share what you're building with RedPill.

Ready to experience private AI?

Try RedPill in our Private AI Playground - no signup needed. Your conversations stay encrypted and completely private.

Try RedPill Free
Private Chat
E2E Encrypted
AI
Hi! I'm your private AI assistant. Ask me anything - your conversations are fully encrypted.
Zero data retention|TEE secured