Model Selection Guide
Choose the right AI model for your needs.
Available Models & Providers
Astronox supports 6 AI providers with a combined total of 12 distinct models.
Gemini Models (Google)
Gemini 2.5 Flash (Recommended)
Best for: Most everyday tasks
Strengths:
- ⚡ Very fast (1–2 seconds)
- 💰 Free tier (60 req/min, 1,500 req/day)
- 🎯 Excellent balance of speed & quality
- 📸 Good image analysis
- 💻 Strong code generation
Specs:
- Context: 128k tokens (~100k words)
- Speed: ⚡⚡⚡
- Quality: ⭐⭐⭐⭐
- Cost: 💰 Free / $0.075 per 1M input tokens (paid)
Use when:
- General questions and productivity
- File management
- Image analysis
- Code generation
- Quick responses needed
Gemini 2.5 Flash Lite
Best for: Simple tasks, maximum speed
Strengths:
- ⚡⚡ Fastest (<1 second)
- 💰💰 Extremely low cost
- 🚀 Efficient for simple queries
Specs:
- Context: 128k tokens
- Speed: ⚡⚡⚡⚡
- Quality: ⭐⭐⭐
- Cost: 💰 Free / $0.0375 per 1M input tokens
Best for:
- Brief summaries
- Simple classifications
- Quick fact lookups
- Saving on quota
Gemini 2.5 Pro
Best for: Complex reasoning, research
Strengths:
- 💪 Most powerful Gemini
- 🧠 Better reasoning than Flash
- 📚 Handles large documents well
- ✍️ Superior writing quality
Specs:
- Context: 128k tokens
- Speed: ⚡⚡
- Quality: ⭐⭐⭐⭐⭐
- Cost: 💰💰 $1.25 per 1M input tokens (requires billing)
Use when:
- Complex analysis
- Research synthesis
- Writing tasks
- Advanced reasoning needed
OpenAI Models (GPT-5.2 Series)
GPT-5.2 (Recommended)
Best for: All-purpose tasks, professional use
Strengths:
- 🎯 Versatile for any task
- 💪 Powerful reasoning
- ⚡ Fast responses
- 💻 Excellent code generation
- 🧠 Advanced understanding
Specs:
- Context: 128k tokens
- Speed: ⚡⚡
- Quality: ⭐⭐⭐⭐⭐
- Cost: 💰💰 $3 per 1M input tokens
Use when:
- Production tasks
- Complex code
- Professional writing
- Advanced reasoning
- When you need the best
GPT-5.2-Codex
Best for: Code generation and optimization
Strengths:
- 💻 Optimized for code
- 📐 Better at code patterns
- 🔍 Superior code review
- ✨ Generates efficient code
Specs:
- Context: 128k tokens
- Speed: ⚡⚡
- Quality: ⭐⭐⭐⭐⭐ (for code)
- Cost: 💰💰 $3 per 1M input tokens
Use when:
- Generating code
- Debugging
- Code optimization
- Architecture design
- Technical documentation
Claude Models (Anthropic)
Claude Haiku 4.5 (Recommended)
Best for: Fast, efficient reasoning
Strengths:
- ⚡ Extremely fast
- 💡 Surprisingly capable
- 💰 Most affordable
- 🎯 Good reasoning for the price
- 🧠 Better understanding than Flash Lite
Specs:
- Context: 200k tokens (~150k words)
- Speed: ⚡⚡⚡
- Quality: ⭐⭐⭐⭐
- Cost: 💰 $0.80 per 1M input tokens
Use when:
- Fast reasoning needed
- Budget-conscious
- General assistance
- Most daily tasks
Claude Sonnet 4.5
Best for: Advanced reasoning and nuance
Strengths:
- 🧠 Superior reasoning
- ✍️ Better writing quality
- 🎯 Excellent understanding of nuance
- 📚 Great with long documents
- 🔍 Detailed analysis
Specs:
- Context: 200k tokens (~150k words)
- Speed: ⚡⚡
- Quality: ⭐⭐⭐⭐⭐
- Cost: 💰💰 $3 per 1M input tokens
Use when:
- Complex reasoning needed
- Writing and content
- Nuanced analysis
- Research synthesis
- When quality matters most
Azure OpenAI Models
Same as OpenAI (GPT-5.2 series), deployed on Azure infrastructure.
Benefits:
- 🔒 Enterprise security
- 🌍 Regional deployment
- 🏢 HIPAA/compliance ready
- ⚙️ Custom fine-tuning available
- 🔐 Your data stays in your region
Auto Provider Mode
Select "Auto" to let Astronox choose:
- Simple queries → Gemini Flash (fastest, free)
- Moderate tasks → Gemini Pro (balanced)
- Complex/reasoning → Claude Sonnet 4.5 (best reasoning)
- Code generation → GPT-5.2-Codex (code optimized)
- Fallback → Silent automatic retry with a different provider
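Conceptually, this routing is a classifier in front of a dispatch table. A minimal sketch, assuming a crude keyword heuristic and illustrative model identifier strings (neither is Astronox's actual implementation):

```python
# Sketch of Auto-mode routing. The classify_task heuristic and the model
# identifier strings are illustrative assumptions, not Astronox internals.

ROUTES = {
    "simple": "gemini-2.5-flash",      # fastest, free tier
    "moderate": "gemini-2.5-pro",      # balanced
    "reasoning": "claude-sonnet-4.5",  # best reasoning
    "code": "gpt-5.2-codex",           # code optimized
}
FALLBACK_ORDER = ["gemini-2.5-flash", "claude-haiku-4.5", "gpt-5.2"]

def classify_task(prompt: str) -> str:
    """Crude keyword heuristic standing in for a real classifier."""
    p = prompt.lower()
    if any(k in p for k in ("code", "function", "debug", "refactor")):
        return "code"
    if any(k in p for k in ("prove", "analyze", "reason", "why")):
        return "reasoning"
    return "simple" if len(p.split()) < 20 else "moderate"

def pick_model(prompt: str) -> str:
    return ROUTES[classify_task(prompt)]

def pick_with_fallback(prompt: str, unavailable: set[str]) -> str:
    """First choice if available, else walk the fallback order."""
    first = pick_model(prompt)
    if first not in unavailable:
        return first
    for m in FALLBACK_ORDER:
        if m not in unavailable:
            return m
    raise RuntimeError("no provider available")
```

The fallback walk mirrors the "silent automatic retry" bullet: if the chosen provider errors out, the request is re-sent elsewhere without surfacing the failure.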
Benefits:
- ✨ Best model for each task
- 💰 Optimized cost
- ⚡ Optimized speed
- 🎯 Optimized quality
Quick Comparison
| Task | Recommended | Why |
|---|---|---|
| General chat | Gemini Flash | Fast, free |
| Code | GPT-5.2-Codex | Code optimized |
| Reasoning | Claude Sonnet | Best reasoning |
| Budget | Claude Haiku | Cheapest capable |
| Enterprise | Azure GPT-5.2 | Security/compliance |
| Not sure? | Auto Mode | Automatic selection |
Cost Comparison (Per 1M Tokens)
| Model | Input Cost | Output Cost | Typical Cost/Request |
|---|---|---|---|
| Gemini Flash | $0.075 | $0.30 | ~$0.001/req |
| Flash Lite | $0.0375 | $0.15 | ~$0.0005/req |
| Gemini Pro | $1.25 | $5.00 | ~$0.005/req |
| Claude Haiku | $0.80 | $4.00 | ~$0.002/req |
| Claude Sonnet | $3.00 | $15.00 | ~$0.010/req |
| GPT-5.2 | $3.00 | $12.00 | ~$0.010/req |
| GPT-5.2-Codex | $3.00 | $12.00 | ~$0.010/req |
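The per-request estimates follow from a simple formula: cost = (input_tokens × input_price + output_tokens × output_price) ÷ 1,000,000. A quick calculator using the prices from the table (the token counts in the example are an assumption about a "typical" request):

```python
# Per-request cost from the per-1M-token prices in the table above.
PRICES = {  # model: (input $/1M tokens, output $/1M tokens)
    "gemini-flash": (0.075, 0.30),
    "claude-sonnet": (3.00, 15.00),
    "gpt-5.2": (3.00, 12.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: ~2,000 tokens in, ~500 tokens out
print(f"${request_cost('gemini-flash', 2000, 500):.6f}")  # → $0.000300
```

Note that output tokens dominate the bill for every model here, so long responses cost noticeably more than long prompts.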
How to Switch Models
1. Open Astronox Settings
2. Select a provider tab (Gemini, OpenAI, Claude, etc.)
3. Choose a model from the dropdown (if available)
4. Click Save
5. The next message uses the new model
Tips for Better Results
Choose Model for Task
- Simple task? → Gemini Flash or Haiku (fast, cheap)
- Complex task? → Gemini Pro, Sonnet, or GPT-5.2 (better reasoning)
- Code task? → GPT-5.2-Codex (code optimized)
Use Auto Mode for Best Results
Auto Mode intelligently picks the best model based on what you ask.
Gemini 2.5 Flash Lite
Strengths:
- ✅ Good for straightforward tasks
Specs:
- Max context: 32k tokens
- Speed: ⚡⚡⚡+ Excellent
- Quality: ⭐⭐⭐ Good
- Cost: 💰💰 Even cheaper than Flash
Use when:
- You need instant answers
- Simple file operations
- Quick questions
- Basic searches
- When cost is a major concern
Not ideal for:
- Complex multi-step tasks
- Detailed code generation
- Deep analysis
- Large context requirements
Gemini 2.5 Pro
Best for: Complex tasks, maximum quality
Strengths:
- 🎯 Highest reasoning quality
- 📚 Massive context (2 million tokens!)
- 🧠 Best for complex problems
- 📊 Superior analysis
- 💻 Advanced code generation
- 🖼️ Best image understanding
Specs:
- Max context: 2M tokens (~1.5M words)
- Speed: ⚡⚡ Good (3–5 seconds)
- Quality: ⭐⭐⭐⭐⭐ Excellent
- Cost: 💰💰 Higher (still generous free tier)
Use when:
- Complex multi-step workflows
- Large codebases analysis
- Detailed documentation review
- Advanced problem-solving
- Working with massive context
- Highest quality output needed
Example tasks:
- "Analyze my entire project and suggest architecture improvements"
- "Create a comprehensive backup and organization system"
- "Review these 50 files and find optimization opportunities"
- "Build a complete deployment pipeline"
Trade-offs:
- Slower responses (worth it for complex tasks)
- Higher API costs (but still reasonable)
- May be overkill for simple tasks
MintAI Models
Devstral 2 (Pro)
Best for: Pro subscribers, code-specialized tasks
Strengths:
- 💻 Optimized for code (Mistral AI's coding model)
- 🛠️ Excellent tool integration
- ⚡ Fast responses
- 🎯 Strong technical accuracy
- 🔑 Automatic key management (Pro)
Specs:
- Max context: 128k tokens
- Speed: ⚡⚡⚡ Excellent
- Quality: ⭐⭐⭐⭐ Very Good (especially code)
- Cost: 💰 Included with Pro subscription
Use when:
- You're a Pro subscriber
- Heavy code generation/review
- Technical automation tasks
- Development workflows
- Want alternative to Gemini
Availability:
- Requires active Pro subscription
- API key auto-fetched by Astronox
- No manual configuration needed
Model Comparison Table
| Feature | Flash | Flash Lite | Pro | Devstral 2 |
|---|---|---|---|---|
| Speed | ⚡⚡⚡ | ⚡⚡⚡+ | ⚡⚡ | ⚡⚡⚡ |
| Quality | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Cost | 💰 Low | 💰💰 Lowest | 💰💰 Medium | 💰 Subscription |
| Context | 128k | 32k | 2M | 128k |
| Code | Great | Good | Excellent | Specialized |
| Images | Good | Basic | Excellent | Good |
| Free Tier | ✅ Yes | ✅ Yes | ✅ Yes | ❌ Sub only |
Choosing the Right Model
Decision Tree
Need maximum quality or huge context?
├─ Yes → Gemini Pro
└─ No ↓
Working primarily with code?
├─ Yes (+ Pro) → Devstral 2
├─ Yes (no subscription) → Gemini Flash
└─ No ↓
Need instant responses? Simple tasks only?
├─ Yes → Gemini Flash Lite
└─ No → Gemini Flash (default)
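For readers who prefer code, the decision tree above can be expressed as a single function (purely illustrative; the boolean arguments are the tree's three questions):

```python
def choose_model(max_quality_or_huge_context: bool,
                 code_focused: bool,
                 has_pro: bool,
                 instant_simple: bool) -> str:
    """Mirror of the decision tree above; each branch matches one arrow."""
    if max_quality_or_huge_context:
        return "Gemini Pro"
    if code_focused:
        # Devstral 2 needs an active Pro subscription
        return "Devstral 2" if has_pro else "Gemini Flash"
    if instant_simple:
        return "Gemini Flash Lite"
    return "Gemini Flash"  # the default
```

Note the ordering matters: the quality/context question is asked first, so a huge-context coding task still lands on Gemini Pro rather than Devstral 2.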
By Task Type
📁 File Management
- Best: Gemini Flash
- Alternative: Flash Lite (simple), Pro (complex workflows)
💻 Code Generation
- Best: Devstral 2 (Pro) or Gemini Pro
- Alternative: Gemini Flash (very capable)
🖼️ Image Analysis
- Best: Gemini Pro
- Alternative: Gemini Flash (usually sufficient)
🔍 Search & Analysis
- Best: Gemini Flash (most cases), Pro (large datasets)
🤖 Automation Scripts
- Best: Devstral 2 or Gemini Pro
- Alternative: Gemini Flash
📊 Data Processing
- Best: Gemini Pro (large data), Flash (typical)
💬 General Questions
- Best: Gemini Flash Lite or Flash
🧠 Complex Problem-Solving
- Best: Gemini Pro
- Alternative: Devstral 2 (technical), Flash (simpler)
- Alternative: Devstral 2 (technical), Flash (simpler)
Switching Models
In Settings
1. Click Settings (⚙️) → General tab
2. Find the Model Selection dropdown
3. Select your preferred model:
   - Gemini 2.5 Flash (default)
   - Gemini 2.5 Flash Lite
   - Gemini 2.5 Pro
   - Devstral 2 (if Pro active)
4. Click Save Settings
Via Chat
"Switch to Gemini Pro"
"Use the fastest model"
"Change to Devstral 2"
Per-Conversation
Model selection applies to new messages in current conversation. You can:
- Switch mid-conversation (new responses use new model)
- Start new chat to reset context with different model
Performance Characteristics
Response Time
Typical latency (simple request):
- Flash Lite: 0.5-1s
- Flash: 1-2s
- Devstral 2: 1-2.5s
- Pro: 2-5s
Complex multi-step task:
- Flash Lite: Not recommended
- Flash: 5-15s
- Devstral 2: 5-20s
- Pro: 10-30s
Token Limits
What are tokens?
A token is roughly ¾ of a word (tokens ≈ words ÷ 0.75). Example: 1,000 words ≈ 1,333 tokens.
Model limits:
- Flash Lite: 32k (~24k words) - about 40 pages
- Flash: 128k (~96k words) - about 150 pages
- Devstral 2: 128k (~96k words) - about 150 pages
- Pro: 2M (~1.5M words) - entire novel + codebases
What counts toward limit:
- Your conversation history in current chat
- System prompt (~2k tokens)
- Your message + attachments
- AI's responses
When you hit limit:
- Older messages get truncated (auto-managed)
- Or start new conversation
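The ~0.75 words-per-token rule of thumb makes budget checks easy to script. A minimal sketch, assuming the ~2k-token system prompt mentioned above and shorthand model keys of my own choosing:

```python
# Rough token budgeting using the ~0.75 words-per-token rule of thumb.
# Model keys are shorthand; limits are from the "Model limits" list above.
LIMITS = {
    "flash-lite": 32_000,
    "flash": 128_000,
    "devstral-2": 128_000,
    "pro": 2_000_000,
}
SYSTEM_PROMPT_TOKENS = 2_000  # approximate overhead counted on every request

def estimate_tokens(word_count: int) -> int:
    """Words -> tokens: 1,000 words comes out to ~1,333 tokens."""
    return round(word_count / 0.75)

def fits(model: str, history_words: int, message_words: int) -> bool:
    """True if history + new message + system prompt fit the context window."""
    used = SYSTEM_PROMPT_TOKENS + estimate_tokens(history_words + message_words)
    return used <= LIMITS[model]
```

When `fits` returns False, the options are the same as above: let older messages be truncated automatically, or start a new conversation.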
Cost Considerations
Free Tier Limits (Gemini)
Flash & Flash Lite:
- 60 requests per minute
- 1,500 requests per day
- More than enough for typical use
Pro:
- 10 requests per minute
- 500 requests per day
- Still generous for power users
If you exceed free tier:
- Set up billing in Google AI Studio
- Pay per token (very affordable)
- Flash: ~$0.001 per request (typical)
- Pro: ~$0.005 per request (typical)
Pro (MintAI)
Subscription model:
- Fixed monthly/annual fee
- Unlimited requests (within reason)
- Access to multiple premium models
- No per-token billing
Cost comparison:
- Heavy users (100+ requests/day): Subscription may be cheaper
- Light users (<50 requests/day): Free tier likely sufficient
Quality Differences Explained
Understanding "Quality"
What varies between models:
- Reasoning depth: How well they understand complex requests
- Accuracy: Correctness of information and code
- Context awareness: Remembering earlier conversation
- Instruction following: Doing exactly what you asked
- Creativity: Generating novel solutions
Real-World Examples
Simple task: "List files in Downloads"
- All models: Nearly identical results
- Flash Lite is faster, no quality loss
Medium task: "Write a Python script to organize files by date"
- Flash/Devstral: Great scripts, maybe minor issues
- Pro: More robust error handling, better comments
Complex task: "Analyze my entire codebase and refactor for performance"
- Flash: Good insights, may miss patterns
- Pro: Comprehensive analysis, deeper understanding
- Devstral: Strong on code specifics
Image task: "Describe everything in this complex diagram"
- Flash: Identifies main elements
- Pro: Detailed analysis, relationships, subtle details
Accuracy & Reliability
Code Generation Accuracy
Rankings (by testing):
1. Gemini Pro (98% working code)
2. Devstral 2 (97% working code)
3. Gemini Flash (95% working code)
4. Flash Lite (90% working code)
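A "percent working code" figure like those above is typically measured by executing generated snippets and counting the ones that run. A toy harness (the `samples` list stands in for model output; real evaluations also check that the output is *correct*, not just that it runs):

```python
# Toy harness for a "percent working code" metric: run each generated
# snippet in an empty namespace and count the ones that don't raise.

def pass_rate(samples: list[str]) -> float:
    """Percentage of code samples that compile and execute without error."""
    passed = 0
    for src in samples:
        try:
            exec(compile(src, "<generated>", "exec"), {})
            passed += 1
        except Exception:
            pass  # snippet failed to compile or run
    return 100.0 * passed / len(samples)

# Hypothetical model outputs: two valid snippets, one syntax error.
samples = ["print('ok')", "x = 1 +", "y = [i * i for i in range(3)]"]
print(f"{pass_rate(samples):.0f}% working code")  # → 67% working code
```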
For production code: Use Pro or Devstral 2
For prototypes/scripts: Flash is fine
For learning examples: Any model works
Factual Accuracy
All models can occasionally:
- Hallucinate (make up information)
- Misunderstand context
- Provide outdated info
Mitigation:
- Verify critical information
- Ask model to cite sources
- Cross-check important decisions
- Use Pro for high-stakes tasks
Advanced Model Features
Function/Tool Calling
All models support Astronox's tool system:
- File operations
- System control
- Multi-step planning
Best at tool calling:
- Gemini Pro (most accurate)
- Gemini Flash (very reliable)
- Devstral 2 (good, code-focused)
- Flash Lite (basic support)
Multi-Modal (Vision)
Image understanding:
- Pro: Best detail and accuracy
- Flash: Very good for most images
- Devstral 2: Good, less specialized
- Flash Lite: Basic recognition
Supported image types: JPG, PNG, GIF, WebP
Code-Specific Features
Devstral 2 advantages:
- Optimized for code completion
- Better understanding of frameworks
- Strong at refactoring patterns
- Excellent at debugging
Best Practices
✅ Start with Flash
Default model (Flash) handles 90% of tasks excellently.
✅ Upgrade to Pro for:
- Large file analysis
- Complex multi-step workflows
- Detailed code architecture
- When Flash gives unsatisfactory results
✅ Use Flash Lite for:
- Quick questions
- Simple file listings
- Basic searches
- When speed matters most
✅ Try Devstral 2 if:
- You're a Pro subscriber
- Heavy code-focused workflows
- Want alternative perspective
- Exploring different model capabilities
✅ Monitor Your Usage
- Settings → API Keys → View Usage (future)
- Track if approaching free tier limits
- Upgrade to paid tier if needed
Troubleshooting Model Issues
"Rate limit exceeded"
Cause: You've hit the free tier limits.
Solutions:
- Wait for the rate limit to reset (1 minute or 1 day)
- Enable billing in Google AI Studio
- Switch to a different model or provider
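Since rate limits reset on a schedule, clients typically wait and retry with exponential backoff rather than failing outright. A generic sketch (`call_model` and `RateLimitError` are hypothetical placeholders, not Astronox or provider APIs):

```python
import time

class RateLimitError(Exception):
    """Placeholder for whatever rate-limit error the provider raises."""

def call_with_backoff(call_model, prompt, max_retries=4, base_delay=1.0):
    """Retry a rate-limited call, doubling the wait after each failure.

    `call_model` is a hypothetical callable taking a prompt and returning
    the model's response.
    """
    for attempt in range(max_retries):
        try:
            return call_model(prompt)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Per-minute limits usually clear after one or two retries; per-day limits will exhaust the retries, at which point switching provider (or enabling billing) is the only fix.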
"Model not available"
Solutions:
- Devstral 2: Check Pro subscription is active
- Gemini: Verify API key is valid
- Check internet connection
"Poor quality responses"
Try:
- Switch to higher-quality model (Pro)
- Provide more context in your prompt
- Break complex tasks into simpler steps
- Start new conversation (reset context)
"Too slow"
Solutions:
- Switch to Flash or Flash Lite
- Reduce conversation history (new chat)
- Simplify your request
- Check internet speed
Next: Configure your Settings to customize your Astronox experience.