# CompI Phase 3.E: Performance, Model Management & Reliability - Complete Guide

## **What Phase 3.E Delivers**

**Phase 3.E transforms CompI into a production-grade platform with professional performance management, intelligent reliability, and advanced model capabilities.**

### **Model Manager**

- **Dynamic Model Switching**: Switch between SD 1.5 and SDXL based on requirements
- **Auto-Availability Checking**: Intelligent detection of model compatibility and VRAM requirements
- **Universal LoRA Support**: Load and scale LoRA weights across all models and generation modes
- **Smart Recommendations**: Hardware-based model suggestions and optimization advice

### **Performance Controls**

- **xFormers Integration**: Memory-efficient attention with automatic fallback
- **Advanced Memory Optimization**: Attention slicing, VAE slicing/tiling, CPU offloading
- **Precision Control**: Automatic dtype selection (fp16/bf16/fp32) based on hardware
- **Batch Optimization**: Memory-aware batch processing with intelligent sizing

### **VRAM Monitoring**

- **Real-time Tracking**: Live GPU memory usage monitoring and alerts
- **Usage Analytics**: Memory usage patterns and optimization suggestions
- **Threshold Warnings**: Automatic alerts when approaching memory limits
- **Cache Management**: Intelligent GPU cache clearing and memory cleanup

### **Reliability Engine**

- **OOM-Safe Generation**: Automatic retry with progressive fallback strategies
- **Intelligent Fallbacks**: Reduce size → reduce steps → CPU fallback progression
- **Error Classification**: Smart error detection and appropriate response strategies
- **Graceful Degradation**: Maintain functionality even under resource constraints

### **Batch Processing**

- **Seed-Controlled Batches**: Deterministic seed sequences for reproducible results
- **Memory-Aware Batching**: Automatic batch size optimization based on available VRAM
- **Progress Tracking**: Detailed progress monitoring with per-image status
- **Failure Recovery**: Continue batch processing even if individual images fail

### **Upscaler Integration**

- **Latent Upscaler**: Optional 2x upscaling using the Stable Diffusion Latent Upscaler (see the sketch after this list)
- **Graceful Degradation**: Clean fallback when the upscaler is unavailable
- **Memory Management**: Intelligent memory allocation for upscaling operations
- **Quality Enhancement**: Professional-grade image enhancement capabilities
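For orientation, the optional upscaling path can be pictured with the standard `diffusers` latent upscaler. The sketch below shows 2x upscaling with graceful degradation when the upscaler cannot be loaded; the model IDs, prompt, and fallback behaviour are illustrative assumptions, not CompI's exact code.

```python
# Minimal sketch (not the exact CompI code): optional 2x latent upscaling
# with graceful fallback when the upscaler cannot be loaded.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=dtype
).to(device)

try:
    upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
        "stabilityai/sd-x2-latent-upscaler", torch_dtype=dtype
    ).to(device)
except Exception:
    upscaler = None  # graceful degradation: continue without upscaling

prompt = "a lighthouse at dawn, oil painting"
generator = torch.Generator(device=device).manual_seed(42)

if upscaler is not None:
    # Keep the base output in latent space, then upscale it 2x.
    low_res_latents = pipe(prompt, output_type="latent", generator=generator).images
    image = upscaler(
        prompt=prompt, image=low_res_latents,
        num_inference_steps=20, guidance_scale=0, generator=generator,
    ).images[0]
else:
    image = pipe(prompt, generator=generator).images[0]

image.save("output.png")
```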
---

## **Quick Start Guide**

### **1. Launch Phase 3.E**

```bash
# Method 1: Using launcher script (recommended)
python run_phase3e_performance_manager.py

# Method 2: Direct Streamlit launch
streamlit run src/ui/compi_phase3e_performance_manager.py --server.port 8505
```

### **2. System Requirements Check**

The launcher automatically checks the following (a probe sketch follows the list):

- **GPU Setup**: CUDA availability and VRAM capacity
- **Dependencies**: Required and optional packages
- **Model Support**: SD 1.5 and SDXL availability
- **Performance Features**: xFormers and upscaler support
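The real checks live in the launcher script; a minimal sketch of this kind of environment probe (the `check_environment` helper and its report keys are illustrative, not CompI's API) could look like this:

```python
# Illustrative environment probe (not the launcher's actual code).
import importlib.util
import torch

def check_environment() -> dict:
    report = {}
    # GPU setup: CUDA availability and VRAM capacity
    report["cuda"] = torch.cuda.is_available()
    if report["cuda"]:
        props = torch.cuda.get_device_properties(0)
        report["gpu"] = props.name
        report["vram_gb"] = round(props.total_memory / 1024**3, 1)
    # Optional performance features: present only if importable
    report["xformers"] = importlib.util.find_spec("xformers") is not None
    report["diffusers"] = importlib.util.find_spec("diffusers") is not None
    return report

if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```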
### **3. Access the Interface**

- **URL:** `http://localhost:8505`
- **Interface:** Professional Streamlit dashboard with real-time monitoring
- **Sidebar:** Live VRAM monitoring and system status

---

## **Professional Workflow**

### **Step 1: Model Selection**

1. **Choose Base Model**: SD 1.5 (fast, compatible) or SDXL (high quality, more VRAM)
2. **Select Generation Mode**: txt2img or img2img
3. **Check Compatibility**: System automatically validates model/mode combinations
4. **Review VRAM Requirements**: See memory requirements and availability status

### **Step 2: LoRA Integration (Optional)**

1. **Enable LoRA**: Toggle LoRA support
2. **Specify Path**: Enter path to LoRA weights (diffusers format)
3. **Set Scale**: Adjust LoRA influence (0.1-2.0)
4. **Verify Status**: Check LoRA loading status and compatibility (see the sketch below)
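Behind the UI, attaching diffusers-format LoRA weights follows the standard `diffusers` pattern sketched below; the weight path and scale value are placeholders, and the error handling is an assumption rather than CompI's exact loader.

```python
# Sketch of attaching diffusers-format LoRA weights to a loaded pipeline.
# The LoRA path below is a placeholder, not a real CompI asset.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

try:
    pipe.load_lora_weights("path/to/lora_weights")  # directory or .safetensors file
    lora_loaded = True
except Exception as err:
    lora_loaded = False  # e.g. wrong format or base-model mismatch
    print(f"LoRA disabled: {err}")

# The LoRA scale (the UI's 0.1-2.0 slider) is applied at call time.
image = pipe(
    "a watercolor fox in a forest",
    cross_attention_kwargs={"scale": 0.8} if lora_loaded else None,
).images[0]
```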
### **Step 3: Performance Optimization**

1. **Choose Optimization Level**: Conservative, Balanced, Aggressive, or Extreme
2. **Monitor VRAM**: Watch real-time memory usage in the sidebar
3. **Adjust Settings**: Fine-tune individual optimization features (see the sketch below)
4. **Enable Reliability**: Configure OOM retry and CPU fallback options
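The individual features correspond to standard `diffusers` pipeline switches. The `apply_optimizations` helper below is an illustrative sketch of wiring UI toggles to those switches, not CompI's actual implementation.

```python
# Illustrative mapping from UI toggles to standard diffusers optimizations.
def apply_optimizations(pipe, use_xformers=True, attention_slicing=True,
                        vae_slicing=True, vae_tiling=False, cpu_offload=False):
    if use_xformers:
        try:
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pass  # xFormers missing or incompatible: fall back silently
    if attention_slicing:
        pipe.enable_attention_slicing()
    if vae_slicing:
        pipe.enable_vae_slicing()
    if vae_tiling:
        pipe.enable_vae_tiling()
    if cpu_offload:
        # Moves submodules to the GPU only while they are needed (slower, low VRAM).
        pipe.enable_model_cpu_offload()
    return pipe
```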
### **Step 4: Generation**

1. **Single Images**: Generate individual images with full control
2. **Batch Processing**: Create multiple images with seed sequences (see the sketch below)
3. **Monitor Progress**: Track generation progress and memory usage
4. **Review Results**: Analyze generation statistics and performance metrics
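Conceptually, a seed-controlled batch derives one deterministic seed per image and frees the GPU cache between generations. The sketch below illustrates the loop; the `base_seed + i` convention and `generate_batch` helper are assumptions, not necessarily CompI's exact scheme.

```python
# Sketch of a seed-controlled, memory-aware batch loop.
import torch

def generate_batch(pipe, prompt, base_seed=1234, count=4, **kwargs):
    images, failures = [], []
    for i in range(count):
        seed = base_seed + i  # deterministic, reproducible seed sequence
        generator = torch.Generator(device=pipe.device).manual_seed(seed)
        try:
            image = pipe(prompt, generator=generator, **kwargs).images[0]
            images.append((seed, image))
        except Exception as err:
            failures.append((seed, str(err)))  # keep going if one image fails
        finally:
            if torch.cuda.is_available():
                torch.cuda.empty_cache()  # avoid memory accumulation between images
    return images, failures
```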
---

## **Advanced Features**

### **Model Manager Deep Dive**

#### **Model Compatibility Matrix**

```
SD 1.5:
  txt2img    - supported (512x512 optimal)
  img2img    - supported (all strengths)
  ControlNet - full support
  LoRA       - universal compatibility
  VRAM: 4+ GB recommended

SDXL:
  txt2img    - supported (1024x1024 optimal)
  img2img    - limited support
  ControlNet - requires special handling
  LoRA       - SDXL-compatible weights only
  VRAM: 8+ GB recommended
```

#### **Automatic Model Selection Logic**

- **VRAM < 6 GB**: Recommends SD 1.5 only
- **VRAM 6-8 GB**: SD 1.5 preferred, SDXL with warnings
- **VRAM 8 GB+**: Full SDXL support with optimizations
- **CPU Mode**: SD 1.5 only with aggressive optimizations (rule sketched below)
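Written as a plain function, that rule might look like the sketch below; the thresholds come from the list above, and the `recommend_model` helper is illustrative rather than CompI's API.

```python
# Illustrative model recommendation based on available VRAM.
import torch

def recommend_model() -> str:
    if not torch.cuda.is_available():
        return "SD 1.5 (CPU mode, aggressive optimizations)"
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if vram_gb < 6:
        return "SD 1.5"
    if vram_gb < 8:
        return "SD 1.5 preferred; SDXL possible with warnings"
    return "SDXL (full support with optimizations)"
```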
### **Performance Optimization Levels**

#### **Conservative Mode**

- Basic attention slicing
- Standard precision (fp16/fp32)
- Minimal memory optimizations
- **Best for**: Stable systems, first-time users

#### **Balanced Mode (Default)**

- xFormers attention (if available)
- Attention + VAE slicing
- Automatic precision selection
- **Best for**: Most users, good performance/stability balance

#### **Aggressive Mode**

- All memory optimizations enabled
- VAE tiling for large images
- Maximum memory efficiency
- **Best for**: Limited VRAM, large batch processing

#### **Extreme Mode**

- CPU offloading enabled
- Maximum memory savings
- Slower but uses minimal VRAM
- **Best for**: Very limited VRAM (<4 GB); see the preset sketch below
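One way to picture the four levels is as presets over the toggles from the `apply_optimizations` sketch earlier. The flag values below are inferred from the descriptions above and are not CompI's definitive configuration.

```python
# Assumed preset-to-toggle mapping, for illustration only.
OPTIMIZATION_PRESETS = {
    "conservative": dict(use_xformers=False, attention_slicing=True,
                         vae_slicing=False, vae_tiling=False, cpu_offload=False),
    "balanced":     dict(use_xformers=True,  attention_slicing=True,
                         vae_slicing=True,   vae_tiling=False, cpu_offload=False),
    "aggressive":   dict(use_xformers=True,  attention_slicing=True,
                         vae_slicing=True,   vae_tiling=True,  cpu_offload=False),
    "extreme":      dict(use_xformers=True,  attention_slicing=True,
                         vae_slicing=True,   vae_tiling=True,  cpu_offload=True),
}

# Usage (with the illustrative helper from earlier):
# pipe = apply_optimizations(pipe, **OPTIMIZATION_PRESETS["balanced"])
```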
### **Reliability Engine Strategies**

#### **Fallback Progression**

```
Strategy 1: Original settings (100% size, 100% steps)
Strategy 2: Reduced size (75% size, 90% steps)
Strategy 3: Half size (50% size, 80% steps)
Strategy 4: Minimal (50% size, 60% steps)
Final: CPU fallback if all GPU attempts fail
```
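In code, that progression amounts to catching CUDA out-of-memory errors and retrying with scaled-down settings. The sketch below reuses the percentages from the table; the `generate_with_fallback` helper is illustrative, not the actual reliability engine.

```python
# Illustrative OOM-safe generation with progressive fallback.
import torch

FALLBACKS = [(1.00, 1.00), (0.75, 0.90), (0.50, 0.80), (0.50, 0.60)]  # (size, steps)

def generate_with_fallback(pipe, prompt, width=512, height=512, steps=30):
    for size_scale, step_scale in FALLBACKS:
        try:
            return pipe(
                prompt,
                width=int(width * size_scale) // 8 * 8,   # keep dimensions divisible by 8
                height=int(height * size_scale) // 8 * 8,
                num_inference_steps=max(1, int(steps * step_scale)),
            ).images[0]
        except torch.cuda.OutOfMemoryError:
            torch.cuda.empty_cache()  # free what we can, then try the next strategy
    # Final strategy: move the pipeline to CPU and try once more
    # (in practice it would be reloaded in fp32 for CPU use).
    pipe = pipe.to("cpu")
    return pipe(prompt, width=width, height=height, num_inference_steps=steps).images[0]
```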
#### **Error Classification**

- **CUDA OOM**: Triggers progressive fallback
- **Model Loading**: Suggests alternative models
- **LoRA Errors**: Disables LoRA and retries
- **General Errors**: Logs and reports with context

### **VRAM Monitoring System**

#### **Real-time Metrics**

- **Total VRAM**: Hardware capacity
- **Used VRAM**: Currently allocated memory
- **Free VRAM**: Available for new operations
- **Usage Percentage**: Current utilization level (see the query sketch below)
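All four metrics can be read directly from PyTorch; a minimal sketch of the query (the `vram_stats` helper and its dictionary keys are illustrative) is:

```python
# Reading the sidebar's VRAM metrics directly from PyTorch.
import torch

def vram_stats(device_index: int = 0) -> dict:
    if not torch.cuda.is_available():
        return {"available": False}
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    used_bytes = total_bytes - free_bytes
    return {
        "available": True,
        "total_gb": total_bytes / 1024**3,
        "used_gb": used_bytes / 1024**3,
        "free_gb": free_bytes / 1024**3,
        "usage_pct": 100.0 * used_bytes / total_bytes,
    }
```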
#### **Smart Alerts**

- **Green (0-60%)**: Optimal usage
- **Yellow (60-80%)**: Moderate usage, monitor closely
- **Red (80%+)**: High usage, optimization recommended

#### **Memory Management**

- **Automatic Cache Clearing**: Between batch generations
- **Memory Leak Detection**: Identifies and resolves memory issues
- **Optimization Suggestions**: Hardware-specific recommendations

---

## **Performance Benchmarks**

### **Generation Speed Comparison**

```
SD 1.5 (512x512, 20 steps):
  RTX 4090: ~15-25 seconds
  RTX 3080: ~25-35 seconds
  RTX 2080: ~45-60 seconds
  CPU:      ~5-10 minutes

SDXL (1024x1024, 20 steps):
  RTX 4090: ~30-45 seconds
  RTX 3080: ~60-90 seconds
  RTX 2080: ~2-3 minutes (with optimizations)
  CPU:      ~15-30 minutes
```

### **Memory Usage Patterns**

```
SD 1.5:
  Base:       ~3.5 GB VRAM
  + LoRA:     ~3.7 GB VRAM
  + Upscaler: ~5.5 GB VRAM

SDXL:
  Base:       ~6.5 GB VRAM
  + LoRA:     ~7.0 GB VRAM
  + Upscaler: ~9.0 GB VRAM
```

---

## **Troubleshooting Guide**

### **Common Issues & Solutions**

#### **"CUDA Out of Memory" Errors**

1. **Enable OOM Auto-Retry**: Automatic fallback handling
2. **Reduce Image Size**: Use 512x512 instead of 1024x1024
3. **Lower Batch Size**: Generate fewer images simultaneously
4. **Enable Aggressive Optimizations**: Use VAE slicing/tiling
5. **Clear GPU Cache**: Use the sidebar "Clear GPU Cache" button (or the manual sketch below)
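If the button is not enough, the same cleanup can be done from Python with the generic PyTorch idiom below; it is not CompI-specific code.

```python
# Manual GPU cache cleanup (generic PyTorch idiom).
import gc
import torch

def clear_gpu_cache() -> None:
    gc.collect()                     # drop unreferenced Python objects first
    if torch.cuda.is_available():
        torch.cuda.empty_cache()     # release cached CUDA memory back to the driver
        torch.cuda.ipc_collect()     # reclaim memory from destroyed IPC handles
```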
#### **Slow Generation Speed**

1. **Enable xFormers**: Significant speed improvement if available
2. **Use Balanced Optimization**: Good speed/quality trade-off
3. **Reduce Inference Steps**: 15-20 steps are often sufficient
4. **Check VRAM Usage**: Ensure you are not hitting memory limits

#### **Model Loading Failures**

1. **Check Internet Connection**: Models download on first use
2. **Verify Disk Space**: Models require 2-7 GB of storage each
3. **Try Alternative Model**: Switch between SD 1.5 and SDXL
4. **Clear Model Cache**: Remove cached models and re-download

#### **LoRA Loading Issues**

1. **Verify Path**: Ensure LoRA files exist at the specified path
2. **Check Format**: Use diffusers-compatible LoRA weights
3. **Model Compatibility**: Ensure the LoRA matches the base model type
4. **Scale Adjustment**: Try different LoRA scale values

---

## **Best Practices**

### **Performance Optimization**

1. **Start Conservative**: Begin with the Conservative or Balanced level and adjust as needed
2. **Monitor VRAM**: Keep usage below 80% for stability
3. **Batch Wisely**: Use smaller batches on limited hardware
4. **Clear Cache Regularly**: Prevent memory accumulation

### **Model Selection**

1. **SD 1.5 for Speed**: Faster generation, lower VRAM requirements
2. **SDXL for Quality**: Higher resolution, better detail
3. **Match Hardware**: Choose the model based on available VRAM
4. **Test Compatibility**: Verify the model works with your use case

### **Reliability**

1. **Enable Auto-Retry**: Let the system handle OOM errors automatically
2. **Use Fallbacks**: Allow progressive degradation for reliability
3. **Monitor Logs**: Check run logs for patterns and issues
4. **Plan for Failures**: Design workflows that handle generation failures

---

## **Integration with CompI Ecosystem**

### **Universal Enhancement**

Phase 3.E enhances ALL existing CompI components:

- **Ultimate Dashboard**: Model switching and performance controls
- **Phase 2.A-2.E**: Reliability and optimization for all multimodal phases
- **Phase 1.A-1.E**: Enhanced foundation with professional features
- **Phase 3.D**: Performance metrics in workflow management

### **Backward Compatibility**

- **Graceful Degradation**: Works on all hardware configurations
- **Default Settings**: Optimal defaults for most users
- **Progressive Enhancement**: Advanced features when available
- **Legacy Support**: Maintains compatibility with existing workflows

---

## **Phase 3.E: Production-Grade CompI Complete**

**With Phase 3.E, CompI becomes a production-grade platform combining professional performance management, intelligent reliability, and advanced model capabilities.**

**Key Benefits:**

- **Professional Performance**: Industry-standard optimization and monitoring
- **Intelligent Reliability**: Automatic error handling and recovery
- **Advanced Model Management**: Dynamic switching and LoRA integration
- **Production Ready**: Suitable for commercial and professional use
- **Universal Enhancement**: Improves all existing CompI features

**CompI is now a complete, production-grade multimodal AI art generation platform!**