Fluent AI: Offline & Cloud LLM Android

Fluent AI: Offline & Cloud LLM

AI chat assistant with offline models - private, customizable, multimodal

Features & Capabilities

🤖 Fluent AI — Private Offline LLM + Claude, GPT-4 & Gemini

Run AI entirely on your device — no cloud, no account, no data sent anywhere. Then switch to Claude, GPT-4 or Gemini when you need more power. One app. Every AI. Always private.

✨ WHAT'S NEW IN v1.3

🏥 MEDICAL AI (MedGemma)
• Google's MedGemma 4B — clinical Q&A and biomedical text, 100% on-device
• Requires accepting Google's Health AI Developer Foundation Terms
• Not a substitute for professional medical advice

🤖 AGENTIC MODE
• On-device AI agent with 12 built-in skills
• Runs tasks autonomously: calendar events, web research, document digest, trip planning
• Agent Task Inspector — see every reasoning step in real time
• 3 free agent runs/day — no subscription needed to start
• Scheduled tasks available with Premium

⚡ LITERT MTP — UP TO 2× FASTER
• Gemma 4n E2B/E4B with Multi-Token Prediction on Android GPU
• Speculative decoding — more tokens per step, same quality
• Tok/s display measures decode-phase speed only for accurate results

👁️ ON-DEVICE VISION (Android)
• Attach photos using Gemma 4n — processed entirely on-device
• No image uploaded to any server, ever

🔒 PRIVACY FIRST
• Conversations stay on your device
• Optional local models = zero cloud data
• API keys encrypted with AES — never stored in plain text
• No mandatory account required

🧠 LOCAL AI MODELS
• GGUF / llama.cpp: Gemma 3/4, Qwen 3.5, Phi-4, Llama, DeepSeek R1, Nemotron, MedGemma
• LiteRT (Android GPU/NPU): Gemma 4n E2B/E4B — vision + MTP speculative decoding
• Apple MLX: Native Metal on Apple Silicon and iOS 18+ (A17 Pro+)
• Q5_K GPU acceleration on Qualcomm Adreno (alongside Q4_0)
• Device-aware model recommendations based on your RAM and chipset
• Browse, download, and manage models in-app — no sideloading needed
• Import custom GGUF from HuggingFace URL or device storage

☁️ CLOUD AI
• Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google)
• OpenRouter — 200+ models via a single API key
• Streaming, vision, and tool calling across all providers

🌐 ONLINE SERVERS
• Ollama Cloud and self-hosted Ollama
• LM Studio, vLLM, LocalAI, and any OpenAI-compatible /v1 API
• Multiple server profiles with per-profile encrypted auth headers

🎤 VOICE MODE
• 5 conversation modes: Normal, Interview, Learning, Storytelling, Translation
• Animated waveform, voice commands (speed, repeat, stop)
• Quick-capture mic button directly in the chat input bar

📚 KNOWLEDGE BASES (RAG)
• Import PDFs, TXT, and Markdown — AI references your docs when answering
• Semantic search for relevant context, topic and project organisation

🔧 POWER FEATURES
• Tool calling: Calculator, DateTime, Weather, Web Search, mem0 Memory
• MCP servers: GitHub, Slack, Notion, Supabase, and 20+ presets
• Code execution: Python, Bash, Node.js from code blocks (desktop + mobile JS)
• Model benchmarking: tok/s, TTFT, MMLU-50 quality score, shareable PNG cards
• Slash commands: /agent, /clear, /export, /voice, /template and more
• Per-chat thinking toggle for Qwen3, DeepSeek R1, Nemotron reasoning models
• URL context injection — paste a link, AI reads the page for context
• Polish Before Send — AI rewrites your draft before you hit send
• Continue button — resumes responses cut off at the token limit

📁 CHAT ORGANISATION
• Folders, tags, and cross-chat full-text search across every message
• HuggingFace model browser with bookmarks and memory fitness badges
• Conversation branching and message reactions

🌟 PREMIUM (OPTIONAL)
• Ad-free experience
• Scheduled agent tasks (recurring or one-time)
• Priority feature access and advanced analytics

📱 PERFECT FOR
✓ Privacy-focused users — local models, zero cloud data
✓ Android power users — LiteRT GPU/NPU with MTP acceleration
✓ Developers — benchmark GGUF, LiteRT, and MLX side-by-side
✓ Healthcare researchers — MedGemma on-device, no upload needed
✓ Students — knowledge bases for study documents and materials
✓ Professionals — agentic tasks, document Q&A, and tool calling

User Growth & Download Statistics

App
By:
ReadHeights Technologies Private Limited
Downloads:
9,023 838
Version:
1.3.2 Last updated: 2026-05-19
Version code:
73
Creation date:
2025-10-29
Publisher country:
IN IN
Permissions:
  • android.permission.RECORD_AUDIO Very high risk
  • android.permission.ACCESS_NOTIFICATION_POLICY Moderate risk
  • android.permission.BLUETOOTH Moderate risk
  • android.permission.BLUETOOTH_ADMIN Moderate risk
  • android.permission.BLUETOOTH_CONNECT Moderate risk
  • android.permission.REQUEST_IGNORE_BATTERY_OPTIMIZATIONS Moderate risk
  • com.google.android.gms.permission.AD_ID Moderate risk
  • android.permission.ACCESS_ADSERVICES_AD_ID Low risk
  • android.permission.ACCESS_ADSERVICES_ATTRIBUTION Low risk
  • android.permission.ACCESS_ADSERVICES_TOPICS Low risk
  • See more
Size:
107.89MB
Email:
re*****@gmail.com
URLs:
Website ,Privacy policy
Full description:
See detailed description
Source:
Google Play Store
Data ingested on:
2026-06-11
Compare stats and ranking:

Contact the developer

Chrome-Stats does not own this Android app. Please use these information below to contact the Android app developer.
Developed by:
ReadHeights Technologies Private Limited
Google Play Store
https://play.google.com/store/apps/details?id=com.readheights.fluentai
Email:
re*****@gmail.com
Website:
https://readheights.com/

Best Fluent AI: Offline & Cloud LLM Alternatives

Here are some Android apps that are similar to Fluent AI: Offline & Cloud LLM: