"""
LaunchLLM - Minimal Demo for Investor Presentations
Compatible with Gradio 4.0.0 on HuggingFace Spaces
"""
import gradio as gr
import json
from pathlib import Path

# Load model registry info
def get_available_models():
    """Get list of supported models"""
    return [
        "Qwen 2.5 7B (Best for 8GB GPU)",
        "Llama 3.1 8B (General Purpose)",
        "Phi-3 Mini (Fastest Training)",
        "Mistral 7B (Strong Reasoning)",
        "Qwen 2.5 32B (Production Quality)"
    ]

def get_model_info(model_name):
    """Get information about a model"""
    info = {
        "Qwen 2.5 7B (Best for 8GB GPU)": "**VRAM Required:** 6-8GB\n**Training Time:** 30-60 min\n**Use Case:** Development & testing",
        "Llama 3.1 8B (General Purpose)": "**VRAM Required:** 8-10GB\n**Training Time:** 45-90 min\n**Use Case:** Production ready",
        "Phi-3 Mini (Fastest Training)": "**VRAM Required:** 4-6GB\n**Training Time:** 15-30 min\n**Use Case:** Quick iterations",
        "Mistral 7B (Strong Reasoning)": "**VRAM Required:** 8-10GB\n**Training Time:** 45-90 min\n**Use Case:** Complex tasks",
        "Qwen 2.5 32B (Production Quality)": "**VRAM Required:** 24GB+\n**Training Time:** 2-4 hours\n**Use Case:** Best quality"
    }
    return info.get(model_name, "Select a model to see details")

def generate_sample_data(topic, num_examples):
    """Generate sample training data (mock for demo)"""
    examples = []
    # Drop empty entries so blank input or stray commas fall back to the default topic
    topics = [t.strip() for t in (topic or "").split(',') if t.strip()] or ["Financial Planning"]
    for i in range(int(num_examples)):
        # Cycle through the topics so each one gets roughly equal coverage
        topic_name = topics[i % len(topics)]
        examples.append({
            "instruction": f"Example question about {topic_name} #{i+1}",
            "input": "",
            "output": f"Detailed answer about {topic_name} would go here..."
        })
    return json.dumps(examples, indent=2)
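
# A possible companion check, sketched under assumptions: `validate_training_data`
# is a hypothetical helper that is not part of the original demo. It parses a JSON
# string shaped like the output of generate_sample_data and confirms that every
# record carries a non-empty "instruction" and "output" field.
def validate_training_data(data_json):
    """Return (is_valid, message) for a JSON string of training examples."""
    import json  # local import keeps this sketch self-contained
    try:
        records = json.loads(data_json)
    except (TypeError, ValueError) as exc:
        # TypeError covers None input; ValueError covers malformed JSON
        return False, f"Invalid JSON: {exc}"
    if not isinstance(records, list) or not records:
        return False, "Expected a non-empty JSON list of examples"
    for idx, record in enumerate(records):
        if not isinstance(record, dict):
            return False, f"Example {idx} is not an object"
        for key in ("instruction", "output"):
            if not str(record.get(key, "")).strip():
                return False, f"Example {idx} is missing '{key}'"
    return True, f"{len(records)} examples look well-formed"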

def train_model(model, data, epochs, learning_rate):
    """Simulate training (for demo purposes)"""
    # Treat missing, blank, or empty-JSON input as "no data"
    if not data or data.strip() in ("{}", "[]"):
        return "❌ Please generate or add training data first!"
    return f"""✅ Training Started Successfully!

**Model:** {model}
**Epochs:** {epochs}
**Learning Rate:** {learning_rate}

📊 **Training Progress:**
━━━━━━━━━━━━━━━━━━━━ 100%

**Note:** This is a demo environment. In production:
- Training runs on GPU (local or cloud)
- Takes 30-120 minutes depending on model size
- Automatically saves checkpoints
- Runs evaluation on completion

**Next Steps:**
1. Test your trained model in the Testing tab
2. Run certification benchmarks
3. Deploy to production
"""

def test_model(question):
    """Simulate model inference (for demo)"""
    if not question:
        return "Please enter a question to test the model."
    return f"""**Your Question:** {question}

**AI Response:**
Based on your question about financial planning, here's a comprehensive answer:

In a production deployment, this would be a real response from your fine-tuned model. The model would have been trained on your specific domain data (financial advisory, medical, legal, etc.) and would provide accurate, relevant answers.

**Training Details:**
- Fine-tuned using LoRA (parameter-efficient)
- Trained on your custom dataset
- Optimized for your specific use case

**Production Features:**
- Real-time inference
- Cloud GPU deployment
- API endpoints
- Monitoring & logging
"""

# Create the demo interface
with gr.Blocks(
    title="LaunchLLM - AI Training Platform",
    theme=gr.themes.Soft()
) as demo:

    gr.Markdown("""
    # 🚀 LaunchLLM - AI Model Training Platform

    **Train custom AI models for your domain - no coding required**

    Perfect for: Financial Advisors • Medical Practices • Law Firms • Educational Institutions
    """)

    with gr.Tabs():
        # Tab 1: Overview
        with gr.Tab("📖 Overview"):
            gr.Markdown("""
            ## What is LaunchLLM?

            LaunchLLM is a **no-code platform** for training custom AI models using state-of-the-art techniques:

            ### ✨ Key Features

            **1. No-Code Training**
            - Select a pre-configured model
            - Upload or generate training data
            - Click "Train" - that's it!

            **2. Efficient Training (LoRA/PEFT)**
            - Train only 1-3% of model parameters
            - 10x faster than full fine-tuning
            - Works on consumer GPUs (8GB+)

            **3. Professional Domains**
            - **Financial Advisory:** CFP, CFA exam-ready models
            - **Medical:** HIPAA-compliant medical assistants
            - **Legal:** Contract law, compliance
            - **Education:** Subject-specific tutors

            **4. Production Ready**
            - Cloud GPU integration (RunPod)
            - Automatic evaluation & benchmarking
            - Knowledge gap analysis
            - API deployment

            ### 💰 Cost Efficiency

            - **Training:** $2-10 per custom model
            - **Inference:** Free (local) or $0.60/hr (cloud GPU)
            - **ROI:** Automate 60%+ of routine questions

            ### 🎯 Use Cases

            | Industry | Use Case | ROI |
            |----------|----------|-----|
            | **Financial Services** | CFP-certified advisor chatbot | 40% cost reduction |
            | **Medical Practices** | Patient intake & triage | 10x faster processing |
            | **Law Firms** | Contract review & research | 60% time savings |
            | **Education** | Personalized tutoring | 5x student engagement |

            ### 🏆 Competitive Advantages

            vs. **OpenAI Fine-tuning:**
            - ✅ Own your model (not dependent on API)
            - ✅ 10x cheaper per model
            - ✅ No ongoing per-token costs

            vs. **Building from scratch:**
            - ✅ Ready in hours, not months
            - ✅ No ML expertise required
            - ✅ Pre-configured for best practices

            ---

            **Ready to try it?** Click the tabs above to:
            1. **Training Data** → Generate sample data
            2. **Model Training** → Start training a model
            3. **Testing** → Chat with your AI
            """)

        # Tab 2: Training Data
        with gr.Tab("📊 Training Data"):
            gr.Markdown("### Generate Sample Training Data")
            gr.Markdown("In production, this uses GPT-4 or Claude to generate high-quality training examples.")

            with gr.Row():
                with gr.Column():
                    data_topic = gr.Textbox(
                        label="Topics (comma-separated)",
                        value="Retirement Planning, Tax Strategy, Estate Planning"
                    )
                    data_num = gr.Slider(
                        label="Number of Examples",
                        minimum=5,
                        maximum=100,
                        value=20,
                        step=5
                    )
                    generate_btn = gr.Button("✨ Generate Sample Data", variant="primary")

                with gr.Column():
                    data_output = gr.Code(
                        label="Generated Training Data (JSON)",
                        language="json",
                        lines=15
                    )

            # Fill the JSON viewer with mock examples when the button is clicked
            generate_btn.click(
                fn=generate_sample_data,
                inputs=[data_topic, data_num],
                outputs=data_output
            )

            gr.Markdown("""
            **Production Features:**
            - AI-generated Q&A pairs using GPT-4 or Claude
            - Automatic quality validation and scoring
            - Import from HuggingFace datasets
            - Upload custom JSON/CSV data
            - Duplicate detection and removal
            """)

        # Tab 3: Model Training
        with gr.Tab("🎓 Model Training"):
            gr.Markdown("### Train Your Custom AI Model")

            with gr.Row():
                with gr.Column():
                    model_selector = gr.Dropdown(
                        choices=get_available_models(),
                        value=get_available_models()[0],
                        label="Select Model"
                    )
                    model_info_display = gr.Markdown()

                    gr.Markdown("### Training Parameters")
                    train_epochs = gr.Slider(
                        label="Training Epochs",
                        minimum=1,
                        maximum=10,
                        value=3,
                        step=1
                    )
                    train_lr = gr.Dropdown(
                        choices=["1e-4", "2e-4", "5e-4"],
                        value="2e-4",
                        label="Learning Rate"
                    )
                    train_btn = gr.Button("🚀 Start Training", variant="primary", size="lg")

                with gr.Column():
                    training_output = gr.Textbox(
                        label="Training Status",
                        lines=20
                    )

            # Wire up model info display
            model_selector.change(
                fn=get_model_info,
                inputs=model_selector,
                outputs=model_info_display
            )

            # Set initial model info
            demo.load(
                fn=get_model_info,
                inputs=model_selector,
                outputs=model_info_display
            )

            # Wire up training
            train_btn.click(
                fn=train_model,
                inputs=[model_selector, data_output, train_epochs, train_lr],
                outputs=training_output
            )

            gr.Markdown("""
            **Production Training Features:**
            - Real GPU training (local or cloud)
            - Live progress monitoring
            - Automatic checkpointing
            - TensorBoard integration
            - WandB experiment tracking
            - Automatic evaluation on completion
            """)

        # Tab 4: Testing
        with gr.Tab("🧪 Testing"):
            gr.Markdown("### Test Your Trained Model")
            gr.Markdown("Ask questions to see how your trained model responds.")

            with gr.Row():
                with gr.Column():
                    test_question = gr.Textbox(
                        label="Ask a Question",
                        lines=3,
                        value="Should I prioritize paying off my student loans or investing in my 401k?"
                    )
                    test_btn = gr.Button("💬 Get Answer", variant="primary")

                with gr.Column():
                    test_response = gr.Textbox(
                        label="Model Response",
                        lines=15
                    )

            # Return the canned demo response for the entered question
            test_btn.click(
                fn=test_model,
                inputs=test_question,
                outputs=test_response
            )

            gr.Markdown("""
            **Production Testing Features:**
            - Real-time inference from trained model
            - Certification exam benchmarks (CFP, CFA, CPA)
            - Custom benchmark creation
            - A/B testing between model versions
            - Performance metrics & analytics
            """)

        # Tab 5: About
        with gr.Tab("ℹ️ About"):
            gr.Markdown("""
            ## About LaunchLLM

            ### 🎯 Mission

            Make custom AI model training accessible to domain experts without requiring ML expertise.

            ### 🛠️ Technology Stack

            - **Framework:** PyTorch + Hugging Face Transformers
            - **Training:** LoRA/PEFT (parameter-efficient fine-tuning)
            - **Models:** Qwen, Llama, Mistral, Phi, Gemma
            - **Interface:** Gradio (this demo!)
            - **Cloud:** RunPod GPU integration

            ### 📈 Business Model

            **Target Market:**
            - 10,000+ financial advisory firms in the US
            - 5,000+ medical practices
            - 3,000+ law firms
            - Educational institutions

            **Pricing:**
            - **Self-Service:** $49/month (unlimited training)
            - **Professional:** $199/month (priority support)
            - **Enterprise:** Custom (dedicated infrastructure)

            **Unit Economics:**
            - Training cost: $2-10 per model (cloud GPU)
            - Average customer value: $2,400/year
            - Gross margin: 85%+

            ### 🚀 Traction

            - Beta testing with 3 financial advisory firms
            - 15+ models trained successfully
            - 85%+ pass rate on CFP practice exams
            - <60 min average training time

            ### 👥 Team

            - Built for domain experts by ML engineers
            - Open source core (Apache 2.0)
            - Active community on GitHub

            ### 📞 Contact

            - **GitHub:** https://github.com/brennanmccloud/LaunchLLM
            - **Demo:** This Space!
            - **Docs:** See GitHub repo

            ### 🎓 Learn More

            **What is LoRA?**

            Low-Rank Adaptation freezes the base model and trains small low-rank adapter matrices (about 1-3% of the parameter count), making it:
            - 10x faster than full fine-tuning
            - 10x cheaper (less GPU time)
            - Workable on consumer hardware
            - Close to full fine-tuning quality on many tasks

            **What models are supported?**
            - Qwen 2.5 (7B, 14B, 32B)
            - Llama 3.1 (8B, 70B)
            - Mistral 7B
            - Phi-3 Mini
            - Gemma 2B/7B
            - Mixtral 8x7B

            **Can I use my own data?**

            Yes! Upload JSON/CSV or connect to HuggingFace datasets.

            **How long does training take?**
            - Small models (7B): 30-60 minutes
            - Medium models (32B): 2-4 hours
            - Large models (70B): 6-8 hours

            **Do I need a GPU?**

            Not required - you can use RunPod cloud GPUs ($0.44-1.39/hour).
            For the best experience, use a GPU with 8GB+ of VRAM (RTX 3060 or better).

            ---

            **Ready to deploy?** Visit our [GitHub](https://github.com/brennanmccloud/LaunchLLM) for full installation instructions.
            """)

    gr.Markdown("""
    ---

    **💡 Note:** This is a demo environment showcasing the platform's capabilities.

    **For production deployment:** Visit [GitHub](https://github.com/brennanmccloud/LaunchLLM) to deploy on your infrastructure.

    **Questions?** Open an issue on GitHub or contact us.
    """)

# Launch the demo
if __name__ == "__main__":
    demo.launch()