LaunchLLM / demo_app.py

Bmccloud22

Deploy simplified investor demo - Gradio 4.0.0 compatible

c27bcc3 verified about 1 month ago


"""
LaunchLLM - Minimal Demo for Investor Presentations
Compatible with Gradio 4.0.0 on HuggingFace Spaces
"""

import gradio as gr
import json
from pathlib import Path

# Load model registry info
def get_available_models():
    """Get list of supported models"""
    return [
        "Qwen 2.5 7B (Best for 8GB GPU)",
        "Llama 3.1 8B (General Purpose)",
        "Phi-3 Mini (Fastest Training)",
        "Mistral 7B (Strong Reasoning)",
        "Qwen 2.5 32B (Production Quality)"
    ]

def get_model_info(model_name):
    """Get information about a model"""
    info = {
        "Qwen 2.5 7B (Best for 8GB GPU)": "VRAM Required: 6-8GB\nTraining Time: 30-60 min\nUse Case: Development & testing",
        "Llama 3.1 8B (General Purpose)": "VRAM Required: 8-10GB\nTraining Time: 45-90 min\nUse Case: Production ready",
        "Phi-3 Mini (Fastest Training)": "VRAM Required: 4-6GB\nTraining Time: 15-30 min\nUse Case: Quick iterations",
        "Mistral 7B (Strong Reasoning)": "VRAM Required: 8-10GB\nTraining Time: 45-90 min\nUse Case: Complex tasks",
        "Qwen 2.5 32B (Production Quality)": "VRAM Required: 24GB+\nTraining Time: 2-4 hours\nUse Case: Best quality"
    }
    return info.get(model_name, "Select a model to see details")

def generate_sample_data(topic, num_examples):
    """Generate sample training data (mock for demo)"""
    examples = []
    # Drop empty entries so input like "a,,b" or "," falls back to the default topic
    topics = [t.strip() for t in (topic or "").split(",") if t.strip()]
    if not topics:
        topics = ["Financial Planning"]

    # Cycle through the topics until the requested number of examples is reached
    for i in range(int(num_examples)):
        topic_name = topics[i % len(topics)]
        examples.append({
            "instruction": f"Example question about {topic_name} #{i+1}",
            "input": "",
            "output": f"Detailed answer about {topic_name} would go here..."
        })

    return json.dumps(examples, indent=2)

def train_model(model, data, epochs, learning_rate):
    """Simulate training (for demo purposes)"""
    # The sample generator emits a JSON list, so treat "[]" as empty as well as "{}"
    if not data or data.strip() in ("{}", "[]"):
        return "❌ Please generate or add training data first!"

    return f"""✅ Training Started Successfully!

Model: {model}
Epochs: {epochs}
Learning Rate: {learning_rate}

📊 Training Progress:
━━━━━━━━━━━━━━━━━━━━ 100%

Note: This is a demo environment. In production:
- Training runs on GPU (local or cloud)
- Takes 30-120 minutes depending on model size
- Automatically saves checkpoints
- Runs evaluation on completion

Next Steps:
1. Test your trained model in the Testing tab
2. Run certification benchmarks
3. Deploy to production
"""

def test_model(question):
    """Simulate model inference (for demo)"""
    if not question:
        return "Please enter a question to test the model."

    return f"""Your Question: {question}

AI Response:
Based on your question about financial planning, here's a comprehensive answer:

In a production deployment, this would be a real response from your fine-tuned model. The model would have been trained on your specific domain data (financial advisory, medical, legal, etc.) and would provide accurate, relevant answers.

Training Details:
- Fine-tuned using LoRA (parameter-efficient)
- Trained on your custom dataset
- Optimized for your specific use case

Production Features:
- Real-time inference
- Cloud GPU deployment
- API endpoints
- Monitoring & logging
"""

# Create the demo interface
with gr.Blocks(
    title="LaunchLLM - AI Training Platform",
    theme=gr.themes.Soft()
) as demo:

    gr.Markdown("""
	# 🚀 LaunchLLM - AI Model Training Platform

	Train custom AI models for your domain - no coding required

	Perfect for: Financial Advisors • Medical Practices • Law Firms • Educational Institutions
	""")

    with gr.Tabs():
        # Tab 1: Overview
        with gr.Tab("📖 Overview"):
            gr.Markdown("""
            ## What is LaunchLLM?

            LaunchLLM is a no-code platform for training custom AI models using state-of-the-art techniques:

            ### ✨ Key Features

            1. No-Code Training
            - Select a pre-configured model
            - Upload or generate training data
            - Click "Train" - that's it!

            2. Efficient Training (LoRA/PEFT)
            - Train only 1-3% of model parameters
            - 10x faster than full fine-tuning
            - Works on consumer GPUs (8GB+)

            3. Professional Domains
            - Financial Advisory: CFP, CFA exam-ready models
            - Medical: HIPAA-compliant medical assistants
            - Legal: Contract law, compliance
            - Education: Subject-specific tutors

            4. Production Ready
            - Cloud GPU integration (RunPod)
            - Automatic evaluation & benchmarking
            - Knowledge gap analysis
            - API deployment

            ### 💰 Cost Efficiency

            - Training: $2-10 per custom model
            - Inference: Free (local) or $0.60/hr (cloud GPU)
            - ROI: Automate 60%+ of routine questions

            ### 🎯 Use Cases

            | Industry | Use Case | ROI |
            |----------|----------|-----|
            | Financial Services | CFP-certified advisor chatbot | 40% cost reduction |
            | Medical Practices | Patient intake & triage | 10x faster processing |
            | Law Firms | Contract review & research | 60% time savings |
            | Education | Personalized tutoring | 5x student engagement |

            ### 🏆 Competitive Advantages

            vs. OpenAI Fine-tuning:
            - ✅ Own your model (not dependent on API)
            - ✅ 10x cheaper per model
            - ✅ No ongoing per-token costs

            vs. Building from scratch:
            - ✅ Ready in hours, not months
            - ✅ No ML expertise required
            - ✅ Pre-configured for best practices

            ---

            Ready to try it? Click the tabs above to:
            1. Training Data → Generate sample data
            2. Model Training → Start training a model
            3. Testing → Chat with your AI
            """)

        # Tab 2: Training Data
        with gr.Tab("📊 Training Data"):
            gr.Markdown("### Generate Sample Training Data")
            gr.Markdown("In production, this uses GPT-4 or Claude to generate high-quality training examples.")

            with gr.Row():
                with gr.Column():
                    data_topic = gr.Textbox(
                        label="Topics (comma-separated)",
                        value="Retirement Planning, Tax Strategy, Estate Planning"
                    )
                    data_num = gr.Slider(
                        label="Number of Examples",
                        minimum=5,
                        maximum=100,
                        value=20,
                        step=5
                    )
                    generate_btn = gr.Button("✨ Generate Sample Data", variant="primary")

                with gr.Column():
                    data_output = gr.Code(
                        label="Generated Training Data (JSON)",
                        language="json",
                        lines=15
                    )

            generate_btn.click(
                fn=generate_sample_data,
                inputs=[data_topic, data_num],
                outputs=data_output
            )

            gr.Markdown("""
	Production Features:
	- AI-generated Q&A pairs using GPT-4 or Claude
	- Automatic quality validation and scoring
	- Import from HuggingFace datasets
	- Upload custom JSON/CSV data
	- Duplicate detection and removal
	""")

        # Tab 3: Model Training
        with gr.Tab("🎓 Model Training"):
            gr.Markdown("### Train Your Custom AI Model")

            with gr.Row():
                with gr.Column():
                    model_selector = gr.Dropdown(
                        choices=get_available_models(),
                        value=get_available_models()[0],
                        label="Select Model"
                    )
                    model_info_display = gr.Markdown()

                    gr.Markdown("### Training Parameters")

                    train_epochs = gr.Slider(
                        label="Training Epochs",
                        minimum=1,
                        maximum=10,
                        value=3,
                        step=1
                    )

                    train_lr = gr.Dropdown(
                        choices=["1e-4", "2e-4", "5e-4"],
                        value="2e-4",
                        label="Learning Rate"
                    )

                    train_btn = gr.Button("🚀 Start Training", variant="primary", size="lg")

                with gr.Column():
                    training_output = gr.Textbox(
                        label="Training Status",
                        lines=20
                    )

            # Wire up model info display
            model_selector.change(
                fn=get_model_info,
                inputs=model_selector,
                outputs=model_info_display
            )

            # Set initial model info
            demo.load(
                fn=get_model_info,
                inputs=model_selector,
                outputs=model_info_display
            )

            # Wire up training
            train_btn.click(
                fn=train_model,
                inputs=[model_selector, data_output, train_epochs, train_lr],
                outputs=training_output
            )

            gr.Markdown("""
	Production Training Features:
	- Real GPU training (local or cloud)
	- Live progress monitoring
	- Automatic checkpointing
	- TensorBoard integration
	- WandB experiment tracking
	- Automatic evaluation on completion
	""")

        # Tab 4: Testing
        with gr.Tab("🧪 Testing"):
            gr.Markdown("### Test Your Trained Model")
            gr.Markdown("Ask questions to see how your trained model responds.")

            with gr.Row():
                with gr.Column():
                    test_question = gr.Textbox(
                        label="Ask a Question",
                        lines=3,
                        value="Should I prioritize paying off my student loans or investing in my 401k?"
                    )
                    test_btn = gr.Button("💬 Get Answer", variant="primary")

                with gr.Column():
                    test_response = gr.Textbox(
                        label="Model Response",
                        lines=15
                    )

            test_btn.click(
                fn=test_model,
                inputs=test_question,
                outputs=test_response
            )

            gr.Markdown("""
	Production Testing Features:
	- Real-time inference from trained model
	- Certification exam benchmarks (CFP, CFA, CPA)
	- Custom benchmark creation
	- A/B testing between model versions
	- Performance metrics & analytics
	""")

        # Tab 5: About
        with gr.Tab("ℹ️ About"):
            gr.Markdown("""
	## About LaunchLLM

	### 🎯 Mission

	Make custom AI model training accessible to domain experts without requiring ML expertise.

	### 🛠️ Technology Stack

	- Framework: PyTorch + Hugging Face Transformers
	- Training: LoRA/PEFT (parameter-efficient fine-tuning)
	- Models: Qwen, Llama, Mistral, Phi, Gemma
	- Interface: Gradio (this demo!)
	- Cloud: RunPod GPU integration

	### 📈 Business Model

	Target Market:
	- 10,000+ financial advisory firms in US
	- 5,000+ medical practices
	- 3,000+ law firms
	- Educational institutions

	Pricing:
	- Self-Service: $49/month (unlimited training)
	- Professional: $199/month (priority support)
	- Enterprise: Custom (dedicated infrastructure)

	Unit Economics:
	- Training cost: $2-10 per model (cloud GPU)
	- Average customer value: $2,400/year
	- Gross margin: 85%+

	### 🚀 Traction

	- Beta testing with 3 financial advisory firms
	- 15+ models trained successfully
	- 85%+ pass rate on CFP practice exams
	- <60 min average training time

	### 👥 Team

	- Built for domain experts by ML engineers
	- Open source core (Apache 2.0)
	- Active community on GitHub

	### 📞 Contact

	- GitHub: https://github.com/brennanmccloud/LaunchLLM
	- Demo: This Space!
	- Docs: See GitHub repo

	### 🎓 Learn More

	What is LoRA?
	Low-Rank Adaptation trains only a small subset of model parameters (1-3%), making it:
	- 10x faster than full fine-tuning
	- 10x cheaper (less GPU time)
	- Works on consumer hardware
	- Same quality as full fine-tuning

	What models are supported?
	- Qwen 2.5 (7B, 14B, 32B)
	- Llama 3.1 (8B, 70B)
	- Mistral 7B
	- Phi-3 Mini
	- Gemma 2B/7B
	- Mixtral 8x7B

	Can I use my own data?
	Yes! Upload JSON/CSV or connect to HuggingFace datasets.

	How long does training take?
	- Small models (7B): 30-60 minutes
	- Medium models (30B): 2-4 hours
	- Large models (70B): 6-8 hours

	Do I need a GPU?
	Not required - you can use RunPod cloud GPUs ($0.44-1.39/hour).
	For best experience: 8GB+ GPU (RTX 3060 or better).

	---

	Ready to deploy? Visit our [GitHub](https://github.com/brennanmccloud/LaunchLLM) for full installation instructions.
	""")

    gr.Markdown("""
	---

	💡 Note: This is a demo environment showcasing the platform's capabilities.

	For production deployment: Visit [GitHub](https://github.com/brennanmccloud/LaunchLLM) to deploy on your infrastructure.

	Questions? Open an issue on GitHub or contact us.
	""")

# Launch the demo
if __name__ == "__main__":
    demo.launch()
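
The Overview and About text in the file above says LoRA trains roughly 1-3% of model parameters. The following is a rough back-of-the-envelope sketch of where a figure like that can come from, not part of `demo_app.py`. The layer count, hidden size, base parameter count, and the choice of four adapted matrices per layer are all hypothetical values picked for illustration, and every adapted weight is assumed to be a square `d_model x d_model` matrix.

```python
def lora_trainable_params(d_model: int, n_layers: int, rank: int,
                          matrices_per_layer: int) -> int:
    """Count the extra parameters LoRA adds, assuming square d_model x d_model weights."""
    # Each adapted weight W (d x d) gets two low-rank factors:
    # A with shape (rank, d) and B with shape (d, rank).
    per_matrix = rank * (d_model + d_model)
    return per_matrix * matrices_per_layer * n_layers


def trainable_fraction(lora_params: int, base_params: int) -> float:
    """Share of trainable LoRA parameters relative to the frozen base model."""
    return lora_params / base_params


if __name__ == "__main__":
    # Hypothetical 7B-class model: 32 layers, hidden size 4096,
    # LoRA applied to 4 attention projections per layer (assumed values).
    base = 7_000_000_000
    for rank in (16, 64):
        extra = lora_trainable_params(4096, 32, rank, 4)
        print(f"rank={rank}: {extra:,} trainable params "
              f"({trainable_fraction(extra, base):.2%} of base)")
```

Under these assumptions, rank 16 adds about 16.8M parameters (roughly 0.24% of the base model) and rank 64 adds about 67.1M (roughly 0.96%). The share grows with the rank and with the number of adapted matrices, so a 1-3% figure would correspond to larger ranks or to adapting more layers than this sketch does.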