tommytracx committed
Commit f361dc7 · verified · Parent: 56b568d

Upload 4 files

Files changed (4)
  1. Dockerfile +25 -0
  2. README.md +183 -8
  3. app.py +225 -0
  4. requirements.txt +3 -0
Dockerfile ADDED
@@ -0,0 +1,25 @@
+ FROM python:3.11-slim
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     curl \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements and install Python dependencies
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application code
+ COPY . .
+
+ # Expose port
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/health || exit 1
+
+ # Run the application
+ CMD ["gunicorn", "--bind", "0.0.0.0:7860", "--workers", "1", "--timeout", "120", "app:app"]
README.md CHANGED
@@ -1,12 +1,187 @@
  ---
- title: Ollama Api
- emoji: 🌖
- colorFrom: gray
- colorTo: indigo
  sdk: docker
- pinned: false
- license: apache-2.0
- short_description: ollama-api
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Ollama API Space
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: purple
  sdk: docker
+ app_port: 7860
  ---

+ # 🚀 Ollama API Space
+
+ A Hugging Face Space that provides a REST API interface for Ollama models, allowing you to run local LLMs through a web API.
+
+ ## 🌟 Features
+
+ - **Model Management**: List and pull Ollama models
+ - **Text Generation**: Generate text using any available Ollama model
+ - **REST API**: Simple HTTP endpoints for easy integration
+ - **Health Monitoring**: Built-in health checks and status monitoring
+ - **OpenWebUI Integration**: Compatible with OpenWebUI for a full chat interface
+
+ ## 🚀 Quick Start
+
+ ### 1. Deploy to Hugging Face Spaces
+
+ 1. Fork this repository or create a new Space
+ 2. Upload these files to your Space
+ 3. Set the following environment variables in your Space settings:
+    - `OLLAMA_BASE_URL`: URL to your Ollama instance (e.g., `http://localhost:11434`)
+    - `ALLOWED_MODELS`: Comma-separated list of allowed models (optional)
+
+ ### 2. Local Development
+
+ ```bash
+ # Clone the repository
+ git clone <your-repo-url>
+ cd ollama-space
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Set environment variables
+ export OLLAMA_BASE_URL=http://localhost:11434
+
+ # Run the application
+ python app.py
+ ```
+
+ ## 📡 API Endpoints
+
+ ### GET `/api/models`
+ List all available Ollama models.
+
+ **Response:**
+ ```json
+ {
+   "status": "success",
+   "models": ["llama2", "codellama", "neural-chat"],
+   "count": 3
+ }
+ ```
+
+ ### POST `/api/models/pull`
+ Pull a model from Ollama.
+
+ **Request Body:**
+ ```json
+ {
+   "name": "llama2"
+ }
+ ```
+
+ **Response:**
+ ```json
+ {
+   "status": "success",
+   "model": "llama2"
+ }
+ ```
+
+ ### POST `/api/generate`
+ Generate text using a model.
+
+ **Request Body:**
+ ```json
+ {
+   "model": "llama2",
+   "prompt": "Hello, how are you?",
+   "temperature": 0.7,
+   "max_tokens": 100
+ }
+ ```
+
+ **Response:**
+ ```json
+ {
+   "status": "success",
+   "response": "Hello! I'm doing well, thank you for asking...",
+   "model": "llama2",
+   "usage": {
+     "prompt_tokens": 7,
+     "completion_tokens": 15,
+     "total_tokens": 22
+   }
+ }
+ ```
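+
+ For reference, here is a minimal Python client for this endpoint (a sketch; the base URL is a placeholder, substitute your Space's actual URL):
+
+ ```python
+ import requests
+
+ BASE_URL = "https://your-space-url.hf.space"  # placeholder: your Space's URL
+
+ payload = {
+     "model": "llama2",
+     "prompt": "Hello, how are you?",
+     "temperature": 0.7,
+     "max_tokens": 100,
+ }
+ # Timeout mirrors the 120s the server uses when calling Ollama
+ resp = requests.post(f"{BASE_URL}/api/generate", json=payload, timeout=120)
+ resp.raise_for_status()
+ data = resp.json()
+ print(data["response"])
+ ```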
+
+ ### GET `/health`
+ Health check endpoint.
+
+ **Response:**
+ ```json
+ {
+   "status": "healthy",
+   "ollama_connection": "connected",
+   "available_models": 3
+ }
+ ```
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ - `OLLAMA_BASE_URL`: URL to your Ollama instance (default: `http://localhost:11434`)
+ - `MODELS_DIR`: Directory for storing models (default: `/models`)
+ - `ALLOWED_MODELS`: Comma-separated list of allowed models (default: the list under Supported Models below)
+
+ ### Supported Models
+
+ By default, the following models are allowed:
+ - `llama2`
+ - `llama2:13b`
+ - `llama2:70b`
+ - `codellama`
+ - `neural-chat`
+
+ You can customize this list by setting the `ALLOWED_MODELS` environment variable; the sketch below shows the parsing logic.
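+
+ The allow-list is read once at startup in `app.py`; a minimal sketch of that same parsing, shown here for clarity:
+
+ ```python
+ import os
+
+ # Mirrors app.py: a comma-separated env var overrides the built-in default list
+ ALLOWED_MODELS = os.getenv(
+     "ALLOWED_MODELS",
+     "llama2,llama2:13b,llama2:70b,codellama,neural-chat",
+ ).split(",")
+
+ print(ALLOWED_MODELS)  # e.g. ['llama2', 'llama2:13b', ...]
+ ```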
+
+ ## 🌐 Integration with OpenWebUI
+
+ This Space is designed to work seamlessly with OpenWebUI. You can:
+
+ 1. Use this Space as a backend API for OpenWebUI
+ 2. Configure OpenWebUI to connect to this Space's endpoints
+ 3. Enjoy a full chat interface with your local Ollama models
+
+ ## 🐳 Docker Support
+
+ The Space includes a Dockerfile for containerized deployment:
+
+ ```bash
+ # Build the image
+ docker build -t ollama-space .
+
+ # Run the container
+ docker run -p 7860:7860 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 ollama-space
+ ```
+
+ ## 🔒 Security Considerations
+
+ - The Space only allows pulling models listed in `ALLOWED_MODELS`
+ - All API endpoints are publicly accessible (consider adding authentication for production use)
+ - The Space connects to your Ollama instance, so ensure proper network security
+
+ ## 🚨 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Connection to Ollama failed**: Check that Ollama is running and accessible
+ 2. **Model not found**: Ensure the model is available in your Ollama instance
+ 3. **Timeout errors**: Large models may take time to load; increase timeout values
+
+ ### Health Check
+
+ Use the `/health` endpoint to monitor the Space's status and Ollama connection; a polling sketch follows below.
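+
+ For automated monitoring, a simple polling loop works against this endpoint (a sketch; same placeholder URL as above):
+
+ ```python
+ import time
+
+ import requests
+
+ BASE_URL = "https://your-space-url.hf.space"  # placeholder: your Space's URL
+
+ while True:
+     try:
+         body = requests.get(f"{BASE_URL}/health", timeout=5).json()
+         print(body["status"], "models:", body.get("available_models"))
+     except requests.RequestException as exc:
+         print("health check failed:", exc)
+     time.sleep(30)  # matches the Dockerfile HEALTHCHECK interval
+ ```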
+
+ ## 📝 License
+
+ This project is open source and available under the MIT License.
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request.
+
+ ## 📞 Support
+
+ If you encounter any issues or have questions, please open an issue on the repository.
app.py ADDED
@@ -0,0 +1,225 @@
+ from flask import Flask, request, jsonify
+ import os
+ import subprocess
+ import json
+ import logging
+ from typing import Dict, Any, List
+ import requests
+
+ app = Flask(__name__)
+ logging.basicConfig(level=logging.INFO)
+
+ # Configuration
+ OLLAMA_BASE_URL = os.getenv('OLLAMA_BASE_URL', 'http://localhost:11434')
+ MODELS_DIR = os.getenv('MODELS_DIR', '/models')
+ ALLOWED_MODELS = os.getenv('ALLOWED_MODELS', 'llama2,llama2:13b,llama2:70b,codellama,neural-chat').split(',')
+
+ class OllamaManager:
+     def __init__(self, base_url: str):
+         self.base_url = base_url
+         self.available_models = []
+         self.refresh_models()
+
+     def refresh_models(self):
+         """Refresh the list of available models"""
+         try:
+             response = requests.get(f"{self.base_url}/api/tags", timeout=10)
+             if response.status_code == 200:
+                 data = response.json()
+                 self.available_models = [model['name'] for model in data.get('models', [])]
+             else:
+                 self.available_models = []
+         except Exception as e:
+             logging.error(f"Error refreshing models: {e}")
+             self.available_models = []
+
+     def list_models(self) -> List[str]:
+         """List all available models"""
+         self.refresh_models()
+         return self.available_models
+
+     def pull_model(self, model_name: str) -> Dict[str, Any]:
+         """Pull a model from Ollama"""
+         try:
+             response = requests.post(f"{self.base_url}/api/pull",
+                                      json={"name": model_name},
+                                      timeout=300)
+             if response.status_code == 200:
+                 return {"status": "success", "model": model_name}
+             else:
+                 return {"status": "error", "message": f"Failed to pull model: {response.text}"}
+         except Exception as e:
+             return {"status": "error", "message": str(e)}
+
+     def generate(self, model_name: str, prompt: str, **kwargs) -> Dict[str, Any]:
+         """Generate text using a model"""
+         try:
+             payload = {
+                 "model": model_name,
+                 "prompt": prompt,
+                 "stream": False
+             }
+             payload.update(kwargs)
+
+             response = requests.post(f"{self.base_url}/api/generate",
+                                      json=payload,
+                                      timeout=120)
+
+             if response.status_code == 200:
+                 data = response.json()
+                 return {
+                     "status": "success",
+                     "response": data.get('response', ''),
+                     "model": model_name,
+                     "usage": data.get('usage', {})
+                 }
+             else:
+                 return {"status": "error", "message": f"Generation failed: {response.text}"}
+         except Exception as e:
+             return {"status": "error", "message": str(e)}
+
+ # Initialize Ollama manager
+ ollama_manager = OllamaManager(OLLAMA_BASE_URL)
+
+ @app.route('/')
+ def home():
+     """Home page with API documentation"""
+     return '''
+     <!DOCTYPE html>
+     <html>
+     <head>
+         <title>Ollama API Space</title>
+         <style>
+             body { font-family: Arial, sans-serif; max-width: 800px; margin: 0 auto; padding: 20px; }
+             .endpoint { background: #f5f5f5; padding: 15px; margin: 10px 0; border-radius: 5px; }
+             .method { background: #007bff; color: white; padding: 2px 8px; border-radius: 3px; font-size: 12px; }
+             .url { font-family: monospace; background: #e9ecef; padding: 2px 6px; border-radius: 3px; }
+         </style>
+     </head>
+     <body>
+         <h1>🚀 Ollama API Space</h1>
+         <p>This Space provides API endpoints for Ollama model management and inference.</p>
+
+         <h2>Available Endpoints</h2>
+
+         <div class="endpoint">
+             <span class="method">GET</span> <span class="url">/api/models</span>
+             <p>List all available models</p>
+         </div>
+
+         <div class="endpoint">
+             <span class="method">POST</span> <span class="url">/api/models/pull</span>
+             <p>Pull a model from Ollama</p>
+             <p>Body: {"name": "model_name"}</p>
+         </div>
+
+         <div class="endpoint">
+             <span class="method">POST</span> <span class="url">/api/generate</span>
+             <p>Generate text using a model</p>
+             <p>Body: {"model": "model_name", "prompt": "your prompt"}</p>
+         </div>
+
+         <div class="endpoint">
+             <span class="method">GET</span> <span class="url">/health</span>
+             <p>Health check endpoint</p>
+         </div>
+
+         <h2>Usage Examples</h2>
+         <p>You can use this API with OpenWebUI or any other client that supports REST APIs.</p>
+
+         <h3>cURL Examples</h3>
+         <pre>
+ # List models
+ curl https://your-space-url.hf.space/api/models
+
+ # Generate text
+ curl -X POST https://your-space-url.hf.space/api/generate \
+      -H "Content-Type: application/json" \
+      -d '{"model": "llama2", "prompt": "Hello, how are you?"}'
+         </pre>
+     </body>
+     </html>
+     '''
+
+ @app.route('/api/models', methods=['GET'])
+ def list_models():
+     """List all available models"""
+     try:
+         models = ollama_manager.list_models()
+         return jsonify({
+             "status": "success",
+             "models": models,
+             "count": len(models)
+         })
+     except Exception as e:
+         return jsonify({"status": "error", "message": str(e)}), 500
+
+ @app.route('/api/models/pull', methods=['POST'])
+ def pull_model():
+     """Pull a model from Ollama"""
+     try:
+         data = request.get_json()
+         if not data or 'name' not in data:
+             return jsonify({"status": "error", "message": "Model name is required"}), 400
+
+         model_name = data['name']
+         if model_name not in ALLOWED_MODELS:
+             return jsonify({"status": "error", "message": f"Model {model_name} not in allowed list"}), 400
+
+         result = ollama_manager.pull_model(model_name)
+         if result["status"] == "success":
+             return jsonify(result), 200
+         else:
+             return jsonify(result), 500
+     except Exception as e:
+         return jsonify({"status": "error", "message": str(e)}), 500
+
+ @app.route('/api/generate', methods=['POST'])
+ def generate_text():
+     """Generate text using a model"""
+     try:
+         data = request.get_json()
+         if not data or 'model' not in data or 'prompt' not in data:
+             return jsonify({"status": "error", "message": "Model name and prompt are required"}), 400
+
+         model_name = data['model']
+         prompt = data['prompt']
+
+         # Forward any additional parameters (e.g., temperature, max_tokens) to Ollama
+         kwargs = {k: v for k, v in data.items() if k not in ['model', 'prompt']}
+
+         result = ollama_manager.generate(model_name, prompt, **kwargs)
+         if result["status"] == "success":
+             return jsonify(result), 200
+         else:
+             return jsonify(result), 500
+     except Exception as e:
+         return jsonify({"status": "error", "message": str(e)}), 500
+
+ @app.route('/health', methods=['GET'])
+ def health_check():
+     """Health check endpoint"""
+     try:
+         # Try to connect to Ollama
+         response = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5)
+         if response.status_code == 200:
+             return jsonify({
+                 "status": "healthy",
+                 "ollama_connection": "connected",
+                 "available_models": len(ollama_manager.available_models)
+             })
+         else:
+             return jsonify({
+                 "status": "unhealthy",
+                 "ollama_connection": "failed",
+                 "error": f"Ollama returned status {response.status_code}"
+             }), 503
+     except Exception as e:
+         return jsonify({
+             "status": "unhealthy",
+             "ollama_connection": "failed",
+             "error": str(e)
+         }), 503
+
+ if __name__ == '__main__':
+     app.run(host='0.0.0.0', port=7860, debug=False)
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ flask==2.3.3
+ requests==2.31.0
+ gunicorn==21.2.0