OpenTelemetry Troubleshooting¶
This guide helps you diagnose and resolve common issues with OpenTelemetry integration in kibana-py, including trace instrumentation and log forwarding.
See also
For general observability setup and configuration, see OpenTelemetry Observability.
Quick Diagnosis¶
Check Telemetry Status¶
All examples display telemetry status at startup. Look for this output:
📊 OpenTelemetry Status:
✅ Traces: Enabled (endpoint: http://localhost:8200)
✅ Logs: Enabled (level: WARNING, loggers: kibana,my-app)
📡 Protocol: grpc
🔑 Authentication: Bearer token
Common Status Messages¶
| Status | Meaning | Action Required |
|---|---|---|
| | Log forwarding is working | None |
| | Log forwarding is disabled | Enable with KIBANA_OTEL_LOGS_ENABLED=true |
| | Some loggers configured | Check logger names |
| | Cannot connect to APM server | Check APM server and network |
Configuration Issues¶
Log Forwarding Not Enabled¶
Symptom: ❌ Logs: Disabled in status output
Causes:
KIBANA_OTEL_LOGS_ENABLED not set to true
Missing OpenTelemetry logs dependencies
Configuration precedence issues
Solutions:
Enable log forwarding:
export KIBANA_OTEL_LOGS_ENABLED=true
Check the .env file (if using elastic-start-local):
# In elastic-start-local/.env
KIBANA_OTEL_LOGS_ENABLED=true
Verify dependencies:
pip install kibana-py[observability]
Check configuration precedence:
Environment variables override .env files
elastic-start-local/.env overrides the local .env
Ensure no conflicting environment variables are set
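The precedence rules above can be sketched as a lookup that checks each source in order. This is a minimal illustration, not the loader kibana-py actually uses: `parse_env_text` and `resolve_setting` are hypothetical helpers, and the file contents are passed in as strings so the example runs anywhere.

```python
from typing import Mapping, Optional


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and '#' comment lines.

    Inline comments after a value are not handled in this sketch.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(
    name: str,
    environ: Mapping[str, str],
    start_local_env: Mapping[str, str],
    local_env: Mapping[str, str],
) -> Optional[str]:
    """Return the first value found, highest precedence first."""
    for source in (environ, start_local_env, local_env):
        if name in source:
            return source[name]
    return None


start_local = parse_env_text("# elastic-start-local/.env\nKIBANA_OTEL_LOGS_ENABLED=true\n")
local = parse_env_text("KIBANA_OTEL_LOGS_ENABLED=false\nKIBANA_OTEL_LOGS_LEVEL=INFO\n")

# No environment variable set: elastic-start-local/.env wins over the local .env
from_files = resolve_setting("KIBANA_OTEL_LOGS_ENABLED", {}, start_local, local)

# A conflicting environment variable overrides both files
from_environ = resolve_setting(
    "KIBANA_OTEL_LOGS_ENABLED",
    {"KIBANA_OTEL_LOGS_ENABLED": "false"},
    start_local,
    local,
)
print(from_files, from_environ)
```

In a real check, `os.environ` would be passed as the first source, which makes a stale exported variable easy to spot.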
Wrong Log Level Configuration¶
Symptom: Some logs missing in APM, traces work fine
Causes:
Log level set too high (e.g., ERROR when you want INFO logs)
Application using lower log levels than configured
Solutions:
Lower the log level temporarily:
export KIBANA_OTEL_LOGS_LEVEL=DEBUG # Forward all logs
Check your application’s log levels:
import logging
logger = logging.getLogger("my-app")
logger.info("This won't be forwarded if level is WARNING or higher")
logger.warning("This will be forwarded if level is WARNING or lower")
Use appropriate levels by environment:
# Development
export KIBANA_OTEL_LOGS_LEVEL=DEBUG
# Staging
export KIBANA_OTEL_LOGS_LEVEL=WARNING
# Production
export KIBANA_OTEL_LOGS_LEVEL=ERROR
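To see which records a given threshold lets through, the comparison can be sketched with the standard library's numeric levels. This assumes KIBANA_OTEL_LOGS_LEVEL behaves like an ordinary Python logging threshold; `would_forward` is a hypothetical helper, not part of kibana-py.

```python
import logging


def would_forward(record_level: str, configured_level: str) -> bool:
    """True if a record at record_level meets the configured threshold."""
    # Standard level names map to integers: DEBUG=10 up to CRITICAL=50
    record_no = getattr(logging, record_level.upper())
    threshold_no = getattr(logging, configured_level.upper())
    return record_no >= threshold_no


for configured in ("DEBUG", "WARNING", "ERROR"):
    forwarded = [
        name
        for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
        if would_forward(name, configured)
    ]
    print(f"KIBANA_OTEL_LOGS_LEVEL={configured}: {', '.join(forwarded)}")
```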
Incorrect Logger Names¶
Symptom: No logs forwarded despite correct level configuration
Causes:
Logger names don’t match the KIBANA_OTEL_LOGS_LOGGERS configuration
Typos in logger names
Using __name__ instead of explicit logger names
Solutions:
Check your logger names:
import logging
# Make sure this matches KIBANA_OTEL_LOGS_LOGGERS
logger = logging.getLogger("my-app")  # Should be in KIBANA_OTEL_LOGS_LOGGERS
Update logger configuration:
# Include your application's logger names
export KIBANA_OTEL_LOGS_LOGGERS=kibana,my-app,my-module
Forward all loggers by leaving the list empty, for testing only (not recommended for production):
# Forward all loggers (expensive)
export KIBANA_OTEL_LOGS_LOGGERS=""
Debug logger names:
import logging
# Print logger name to verify
logger = logging.getLogger("my-app")
print(f"Logger name: {logger.name}")
logger.error("Test message")
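A quick way to catch name mismatches is to compare candidate logger names against the configured list. The sketch below is an assumption about matching behaviour rather than kibana-py's documented logic: it treats an empty list as "all loggers" (as in the testing step above) and assumes dotted child loggers inherit from their parent. `parse_logger_list` and `is_covered` are hypothetical helpers.

```python
def parse_logger_list(value: str) -> list:
    """Split a comma-separated KIBANA_OTEL_LOGS_LOGGERS value into names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def is_covered(logger_name: str, configured: str) -> bool:
    """Check whether logger_name is covered by the configured list.

    Assumptions: an empty list means all loggers, and dotted children
    such as "my-app.db" are covered by their parent "my-app".
    """
    names = parse_logger_list(configured)
    if not names:
        return True
    return any(
        logger_name == name or logger_name.startswith(name + ".")
        for name in names
    )


configured = "kibana,my-app,my-module"
# "my_app" is a typo; "myproject.utils" is what __name__ might produce
for candidate in ("my-app", "my-app.db", "my_app", "myproject.utils"):
    status = "covered" if is_covered(candidate, configured) else "NOT covered"
    print(f"{candidate}: {status}")
```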
APM Server Connectivity Issues¶
APM Server Not Available¶
Symptom: ❌ APM server not available, telemetry disabled
Causes:
APM server not running
Wrong endpoint URL
Network connectivity issues
Firewall blocking connections
Solutions:
Check if APM server is running:
curl -X GET "http://localhost:8200/"
Expected response:
{
  "build_date": "2024-01-01T00:00:00Z",
  "build_sha": "abc123",
  "version": "8.x.x"
}
Verify endpoint configuration:
# Check current configuration
echo $OTEL_EXPORTER_OTLP_ENDPOINT
# Should be APM server URL, not Kibana URL
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200  # APM server
# NOT: http://localhost:5601  # This is Kibana
Test network connectivity:
# Test basic connectivity
telnet localhost 8200
# Test with curl
curl -v http://localhost:8200/
Check Docker/container setup:
# If using elastic-start-local
cd elastic-start-local
docker-compose ps  # APM server should be running
docker-compose logs apm-server
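When telnet is not installed, the same basic connectivity test can be done from Python with the standard library. This is a minimal sketch; `can_connect` is a hypothetical helper, and it only confirms that the TCP port accepts connections, not that the service behind it is a healthy APM server.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures
        return False


# 8200 is the APM server port used throughout this guide
reachable = can_connect("localhost", 8200)
print("APM server port reachable" if reachable else "Cannot reach localhost:8200")
```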
Authentication Failures¶
Symptom: ❌ APM authentication failed
Causes:
Invalid or expired APM secret token
Wrong authentication header format
Token permissions issues
Solutions:
Verify APM token:
# Check token is set
echo $ELASTIC_APM_SECRET_TOKEN
# Test authentication
curl -H "Authorization: Bearer $ELASTIC_APM_SECRET_TOKEN" \
  "http://localhost:8200/config/v1/agents"
Check header format:
# Correct format
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer your_token"
# NOT: "Authorization: Bearer your_token"  # Wrong format
Regenerate token (if using elastic-start-local):
cd elastic-start-local
./local-stack.sh -o stop
./local-stack.sh -o start  # This generates a new token
Test logs endpoint specifically:
curl -X POST "http://localhost:8200/v1/logs" \
  -H "Authorization: Bearer $ELASTIC_APM_SECRET_TOKEN" \
  -H "Content-Type: application/x-protobuf"
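The header format mistake described above (HTTP-style `Name: value` instead of `key=value`) can be caught before the exporter starts. The following is a simplified check written for this guide, not the parser the OpenTelemetry SDK uses; `parse_otlp_headers` is a hypothetical helper.

```python
def parse_otlp_headers(value: str) -> dict:
    """Parse a comma-separated key=value header string into a dict.

    Raises ValueError for entries that are not key=value, such as the
    HTTP-style "Authorization: Bearer token" form.
    """
    headers = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        key, separator, header_value = entry.partition("=")
        key = key.strip()
        if not separator or not key or ":" in key or " " in key:
            raise ValueError(f"expected key=value, got {entry!r}")
        headers[key.lower()] = header_value.strip()
    return headers


print(parse_otlp_headers("authorization=Bearer your_token"))

try:
    parse_otlp_headers("Authorization: Bearer your_token")
except ValueError as exc:
    print(f"Invalid header string: {exc}")
```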
Protocol Mismatch¶
Symptom: Connection timeouts or protocol errors
Causes:
APM server doesn’t support configured protocol
Network proxy issues
Port conflicts
Solutions:
Try different protocols:
# Try gRPC (default)
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
# Try HTTP if gRPC fails
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
Check APM server protocol support:
# Check APM server configuration
curl http://localhost:8200/config/v1/agents
Verify ports:
# Standard APM server ports
# 8200: HTTP/HTTPS
# 8200: gRPC (same port, different protocol)
netstat -an | grep 8200
Performance Issues¶
High Latency with Log Forwarding¶
Symptom: Application becomes slow when log forwarding is enabled
Causes:
Too many logs being forwarded
Synchronous log processing
Network latency to APM server
Solutions:
Reduce log volume:
# Only forward errors
export KIBANA_OTEL_LOGS_LEVEL=ERROR
# Limit to critical loggers
export KIBANA_OTEL_LOGS_LOGGERS=critical-module
Check log frequency:
import logging
logger = logging.getLogger("my-app")
items = list(range(1000))  # Example workload
# Avoid high-frequency logging in loops
for i in items:
    # DON'T do this - creates 1000 log entries
    logger.info(f"Processing item {i}")
# DO this instead - log summary
logger.info(f"Processing {len(items)} items")
# ... process items ...
logger.info(f"Completed processing {len(items)} items")
Monitor APM server performance:
# Check APM server response times
curl -w "@curl-format.txt" -o /dev/null -s "http://localhost:8200/"
# curl-format.txt:
# time_total: %{time_total}
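Where a noisy code path cannot be rewritten right away, log volume can also be capped centrally with a standard `logging.Filter`. The sketch below is one possible approach rather than a kibana-py feature: `RateLimitFilter` is a hypothetical class that passes at most a fixed number of records per time window and counts the rest as dropped.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Allow at most max_records per window_seconds; drop the rest."""

    def __init__(self, max_records: int, window_seconds: float, clock=time.monotonic):
        super().__init__()
        self.max_records = max_records
        self.window_seconds = window_seconds
        self.clock = clock  # Injectable so the filter can be tested
        self.window_start = clock()
        self.count = 0
        self.dropped = 0

    def filter(self, record: logging.LogRecord) -> bool:
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # Start a new window and reset the counter
            self.window_start = now
            self.count = 0
        if self.count < self.max_records:
            self.count += 1
            return True
        self.dropped += 1
        return False


logger = logging.getLogger("my-app.throttled")
logger.setLevel(logging.INFO)
rate_limit = RateLimitFilter(max_records=5, window_seconds=1.0)
logger.addFilter(rate_limit)

for i in range(1000):
    logger.info("Processing item %d", i)

print(f"Dropped {rate_limit.dropped} records")
```

A filter attached to a logger applies only to records logged directly on that logger, so it would need to be added to each noisy logger or to the forwarding handler instead.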
Memory Usage Issues¶
Symptom: High memory usage when log forwarding is enabled
Causes:
Large log messages
Log queue buildup
Memory leaks in log processing
Solutions:
Limit log message size:
import logging
logger = logging.getLogger("my-app")
# Avoid logging large objects
large_data = {"key": "x" * 10000}
logger.info("Processing data", extra={"data_size": len(str(large_data))})  # Good
# logger.info("Processing data", extra={"data": large_data})  # Bad - too large
Monitor log queue:
# Check if logs are being processed
import logging
# Enable debug logging to see queue status
logging.getLogger("opentelemetry").setLevel(logging.DEBUG)
Implement log sampling:
import random
import logging
logger = logging.getLogger("my-app")
# Sample high-frequency logs
if random.random() < 0.1:  # 10% sampling
    logger.debug("High frequency debug message")
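When some context from a large object is still useful, a bounded preview is a middle ground between logging the whole object and logging only its size. This is a minimal sketch; `summarize` is a hypothetical helper and the 50-character limit is an arbitrary choice.

```python
import logging


def summarize(value, max_chars: int = 200) -> str:
    """Return str(value), keeping at most max_chars characters plus a short suffix."""
    text = str(value)
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return f"{text[:max_chars]}... [{omitted} more chars]"


logger = logging.getLogger("my-app")
large_data = {"key": "x" * 10000}

logger.warning(
    "Processing data",
    extra={
        "data_preview": summarize(large_data, max_chars=50),
        "data_size": len(str(large_data)),
    },
)
```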
Application Integration Issues¶
Custom Attributes Not Appearing¶
Symptom: Basic logs work but custom attributes missing
Causes:
Incorrect extra parameter usage
Attribute name conflicts
Serialization issues with complex objects
Solutions:
Use the correct extra parameter:
import logging
logger = logging.getLogger("my-app")
# Correct way
logger.info(
    "Operation completed",
    extra={
        "operation_type": "create_connector",
        "duration_ms": 150,
        "success": True
    }
)
# Wrong way - attributes won't be forwarded
logger.info("Operation completed", {"operation_type": "create_connector"})
Avoid reserved attribute names:
import logging
logger = logging.getLogger("my-app")
# Avoid these reserved names ("message" and "asctime" are also rejected by logging)
reserved_names = ["name", "msg", "args", "levelname", "levelno", "pathname",
                  "filename", "module", "lineno", "funcName", "created", "msecs",
                  "relativeCreated", "thread", "threadName", "processName",
                  "process", "getMessage", "message", "asctime", "exc_info",
                  "exc_text", "stack_info"]
# Use prefixed names instead
logger.info(
    "User action",
    extra={
        "user_name": "john",  # Good
        "action_type": "login",  # Good
        # "name": "john",  # Bad - raises KeyError (conflicts with log record name)
    }
)
Serialize complex objects:
import json
import logging
logger = logging.getLogger("my-app")
complex_object = {"nested": {"data": [1, 2, 3]}}
# Serialize complex objects
logger.info(
    "Processing complex data",
    extra={
        "data_summary": json.dumps(complex_object),
        "data_size": len(str(complex_object))
    }
)
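Rather than maintaining the reserved-name list by hand, the names can be derived from a `LogRecord` instance, since the standard library rejects any `extra` key that already exists on the record (plus `message` and `asctime`). The sketch below illustrates that idea; `safe_extra` and the `app_` prefix are hypothetical choices, not part of kibana-py.

```python
import logging

# Attribute names present on every LogRecord, plus the two names that
# logging explicitly refuses in `extra` ("message" and "asctime").
_probe = logging.LogRecord("probe", logging.INFO, "probe.py", 0, "", None, None)
RESERVED_NAMES = frozenset(vars(_probe)) | {"message", "asctime"}


def safe_extra(extra: dict, prefix: str = "app_") -> dict:
    """Return a copy of extra with reserved keys renamed using prefix."""
    return {
        (f"{prefix}{key}" if key in RESERVED_NAMES else key): value
        for key, value in extra.items()
    }


logger = logging.getLogger("my-app")
attributes = safe_extra({"name": "john", "action_type": "login"})
print(attributes)
logger.warning("User action", extra=attributes)
```

Deriving the set at runtime also picks up attributes added in newer Python versions without changes to the code.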
Debugging Tools and Techniques¶
Enable Debug Logging¶
import logging
# Enable debug logging for OpenTelemetry
logging.getLogger("opentelemetry").setLevel(logging.DEBUG)
logging.getLogger("kibana.observability").setLevel(logging.DEBUG)
# Enable debug logging for your application
logging.getLogger("my-app").setLevel(logging.DEBUG)
# Configure console output
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
Test Log Forwarding Manually¶
#!/usr/bin/env python3
"""Test log forwarding functionality."""
import logging
import time
from examples.utils import configure_example_telemetry, print_telemetry_info
def test_log_forwarding():
    print("🧪 Testing Log Forwarding")
    print("=" * 40)
    # Configure telemetry
    traces_enabled, logs_enabled = configure_example_telemetry(logs_enabled=True)
    print_telemetry_info()
    if not logs_enabled:
        print("❌ Log forwarding not enabled")
        return False
    # Create test logger
    logger = logging.getLogger("test-app")
    # Test different log levels
    print("\n📝 Sending test logs...")
    logger.debug("Debug message - should only appear if level is DEBUG")
    logger.info("Info message - should only appear if level is INFO or lower")
    logger.warning("Warning message - should appear if level is WARNING or lower")
    logger.error("Error message - should appear if level is ERROR or lower")
    logger.critical("Critical message - should always appear")
    # Test structured logging
    logger.error(
        "Structured error message",
        extra={
            "error_code": "TEST_ERROR",
            "component": "log_forwarding_test",
            "test_id": "test-123",
            "timestamp": time.time()
        }
    )
    print("✅ Test logs sent")
    print("🔍 Check APM at: http://localhost:5601/app/apm")
    print("📋 Look for service: kibana-py-example")
    print("📝 Navigate to Logs tab to see forwarded messages")
    return True
if __name__ == "__main__":
    test_log_forwarding()
Validate APM Server Logs Endpoint¶
#!/bin/bash
# validate_apm_logs.sh
echo "🔍 Validating APM Server Logs Endpoint"
echo "======================================"
# Check APM server is running
echo "1. Checking APM server availability..."
if curl -s -f "http://localhost:8200/" > /dev/null; then
    echo "✅ APM server is running"
else
    echo "❌ APM server is not available"
    exit 1
fi
# Check authentication
echo "2. Testing authentication..."
if [ -z "$ELASTIC_APM_SECRET_TOKEN" ]; then
    echo "❌ ELASTIC_APM_SECRET_TOKEN not set"
    exit 1
fi
if curl -s -f -H "Authorization: Bearer $ELASTIC_APM_SECRET_TOKEN" \
    "http://localhost:8200/config/v1/agents" > /dev/null; then
    echo "✅ Authentication successful"
else
    echo "❌ Authentication failed"
    exit 1
fi
# Test logs endpoint
echo "3. Testing logs endpoint..."
if curl -s -f -X POST \
    -H "Authorization: Bearer $ELASTIC_APM_SECRET_TOKEN" \
    -H "Content-Type: application/x-protobuf" \
    "http://localhost:8200/v1/logs" > /dev/null; then
    echo "✅ Logs endpoint accessible"
else
    echo "❌ Logs endpoint not accessible"
    exit 1
fi
echo "✅ All APM server checks passed"
Monitor Log Forwarding Performance¶
#!/usr/bin/env python3
"""Monitor log forwarding performance."""
import logging
import time
from examples.utils import configure_example_telemetry
def performance_test():
    # Configure telemetry
    traces_enabled, logs_enabled = configure_example_telemetry(logs_enabled=True)
    if not logs_enabled:
        print("❌ Log forwarding not enabled")
        return
    logger = logging.getLogger("perf-test")
    # Test log forwarding performance
    num_logs = 100
    start_time = time.time()
    for i in range(num_logs):
        logger.warning(
            f"Performance test log {i}",
            extra={
                "test_id": "perf-test",
                "log_number": i,
                "batch_size": num_logs
            }
        )
    end_time = time.time()
    duration = end_time - start_time
    print("📊 Performance Results:")
    print(f" Logs sent: {num_logs}")
    print(f" Duration: {duration:.2f}s")
    print(f" Rate: {num_logs/duration:.1f} logs/sec")
    print(f" Avg time per log: {(duration/num_logs)*1000:.2f}ms")
if __name__ == "__main__":
    performance_test()
Environment-Specific Configurations¶
Development Environment¶
# .env for development
KIBANA_OTEL_ENABLED=true
KIBANA_OTEL_LOGS_ENABLED=true
KIBANA_OTEL_LOGS_LEVEL=DEBUG # See all logs
KIBANA_OTEL_LOGS_LOGGERS=kibana,my-app,debug
OTEL_SERVICE_NAME=my-app-dev
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
Staging Environment¶
# .env for staging
KIBANA_OTEL_ENABLED=true
KIBANA_OTEL_LOGS_ENABLED=true
KIBANA_OTEL_LOGS_LEVEL=WARNING # Important events only
KIBANA_OTEL_LOGS_LOGGERS=kibana,my-app
OTEL_SERVICE_NAME=my-app-staging
OTEL_EXPORTER_OTLP_ENDPOINT=https://staging-apm.company.com:8200
Production Environment¶
# .env for production
KIBANA_OTEL_ENABLED=true
KIBANA_OTEL_LOGS_ENABLED=true
KIBANA_OTEL_LOGS_LEVEL=ERROR # Errors only
KIBANA_OTEL_LOGS_LOGGERS=kibana,my-app,critical
OTEL_SERVICE_NAME=my-app-prod
OTEL_EXPORTER_OTLP_ENDPOINT=https://prod-apm.company.com:8200
Getting Help¶
Check Example Output¶
Run any example to see current configuration:
python examples/simple_index_connector.py
Look for the telemetry status output at the beginning.
Enable Verbose Logging¶
import logging
logging.basicConfig(level=logging.DEBUG)
Test with Console Export¶
from kibana import configure_opentelemetry
configure_opentelemetry(
    enabled=True,
    logs_enabled=True,
    console_export=True  # See logs in console too
)
Validate Configuration¶
from examples.utils import get_otel_config
config = get_otel_config()
print(f"Configuration: {config}")
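Beyond printing the configuration, a small lint over the environment variables named in this guide can flag the most common mistakes automatically. The sketch below is illustrative only: `lint_otel_env` is a hypothetical helper, and it assumes kibana-py expects the literal value `true` (case-insensitive) for KIBANA_OTEL_LOGS_ENABLED.

```python
from typing import Mapping
from urllib.parse import urlparse

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def lint_otel_env(env: Mapping[str, str]) -> list:
    """Return human-readable warnings for common misconfigurations."""
    warnings = []

    if env.get("KIBANA_OTEL_LOGS_ENABLED", "").lower() != "true":
        warnings.append("KIBANA_OTEL_LOGS_ENABLED is not set to true")

    level = env.get("KIBANA_OTEL_LOGS_LEVEL")
    if level is not None and level.upper() not in VALID_LEVELS:
        warnings.append(f"KIBANA_OTEL_LOGS_LEVEL has unknown level {level!r}")

    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    if endpoint and urlparse(endpoint).port == 5601:
        warnings.append(
            "OTEL_EXPORTER_OTLP_ENDPOINT uses port 5601 (Kibana); APM server is 8200"
        )

    headers = env.get("OTEL_EXPORTER_OTLP_HEADERS", "")
    if headers and "=" not in headers:
        warnings.append("OTEL_EXPORTER_OTLP_HEADERS should use key=value format")

    return warnings


example_env = {
    "KIBANA_OTEL_LOGS_ENABLED": "true",
    "KIBANA_OTEL_LOGS_LEVEL": "WARNING",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:5601",
}
for warning in lint_otel_env(example_env):
    print(f"⚠️ {warning}")
```

Passing `os.environ` instead of `example_env` checks the current shell's settings.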
Common Resolution Steps¶
Restart APM server (if using elastic-start-local):
cd elastic-start-local
./local-stack.sh -o stop
./local-stack.sh -o start
Clear environment variables and test with defaults:
unset KIBANA_OTEL_LOGS_ENABLED
unset KIBANA_OTEL_LOGS_LEVEL
unset KIBANA_OTEL_LOGS_LOGGERS
python examples/simple_index_connector.py
Test with minimal configuration:
export KIBANA_OTEL_ENABLED=true
export KIBANA_OTEL_LOGS_ENABLED=true
export KIBANA_OTEL_LOGS_LEVEL=DEBUG
export KIBANA_OTEL_LOGS_LOGGERS=""  # All loggers
python examples/simple_index_connector.py
Check APM server logs:
# If using elastic-start-local
cd elastic-start-local
docker-compose logs apm-server
See also
OpenTelemetry Observability - Observability setup and configuration
Log Forwarding Migration Guide - Log forwarding migration guide
Common Issues - General troubleshooting