2.7 KiB
2.7 KiB
Monitoring & Observability
Der Middlelayer ist mit umfassendem Monitoring ausgestattet:
1. Structured Logging (Winston)
Konfiguration:
- Log-Level:
LOG_LEVEL(default:info) - Format: JSON in Production, farbig in Development
- Output: Console + Files (
logs/error.log,logs/combined.log)
Verwendung:
import { logger, logQuery, logError } from './monitoring/logger.js';
logger.info('Info message', { context: 'data' });
logQuery('GetProducts', { limit: 10 }, 45);
logError(error, { operation: 'getProducts' });
2. Prometheus Metrics
Endpoints:
GET http://localhost:9090/metrics- Prometheus MetricsGET http://localhost:9090/health- Health Check
Verfügbare Metriken:
Query Metrics
graphql_queries_total- Anzahl der Queries (Labels:operation,status)graphql_query_duration_seconds- Query-Dauer (Histogram)graphql_query_complexity- Query-Komplexität (Gauge)
Cache Metrics
cache_hits_total- Cache Hits (Label:cache_type)cache_misses_total- Cache Misses (Label:cache_type)
DataService Metrics
dataservice_calls_total- DataService Aufrufe (Labels:method,status)dataservice_duration_seconds- DataService Dauer (Histogram)
Error Metrics
errors_total- Anzahl der Fehler (Labels:type,operation)
Beispiel Prometheus Query:
# Query Rate
rate(graphql_queries_total[5m])
# Error Rate
rate(errors_total[5m])
# Cache Hit Ratio
rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))
3. Distributed Tracing
Features:
- Automatische Trace-ID-Generierung pro Request
- Span-Tracking für verschachtelte Operationen
- Dauer-Messung für Performance-Analyse
Trace-IDs werden automatisch in Logs und Metrics eingebunden.
Environment Variables
# Logging
LOG_LEVEL=info # debug, info, warn, error
# Metrics
METRICS_PORT=9090 # Port für Metrics-Endpoint
# Query Complexity
MAX_QUERY_COMPLEXITY=1000 # Max. Query-Komplexität
Integration mit Grafana
Prometheus Scrape Config:
scrape_configs:
- job_name: 'graphql-middlelayer'
static_configs:
- targets: ['localhost:9090']
Grafana Dashboard:
- Importiere die Metriken in Grafana
- Erstelle Dashboards für:
- Query Performance
- Cache Hit Rates
- Error Rates
- Request Throughput
Beispiel-Dashboard Queries
# Requests pro Sekunde
sum(rate(graphql_queries_total[1m])) by (operation)
# Durchschnittliche Query-Dauer
avg(graphql_query_duration_seconds) by (operation)
# Cache Hit Rate
sum(rate(cache_hits_total[5m])) / (sum(rate(cache_hits_total[5m])) + sum(rate(cache_misses_total[5m])))
# Error Rate
sum(rate(errors_total[5m])) by (type, operation)