Troubleshooting Guide
Copy page
Learn how to diagnose and resolve issues when something breaks in your Inkeep agent system.
Overview
This guide provides a structured methodology for debugging problems across different components of your agent system.
Step 1: Check the Timeline
The timeline is your first stop for understanding what happened during a conversation or agent execution. Navigate to the Traces sections to view in depth details per conversation. Within each conversation, you'll find an error card that is clickable whenever something goes wrong during agent execution.
What to Look For
- Execution flow: Review the sequence of agent actions and tool calls
- Timing: Check for delays or bottlenecks in the execution
- Agent transitions: Verify that transfers and delegations happened as expected
- Tool usage: Confirm that tools were called correctly and returned expected results
- Error cards: Look for red error indicators in the timeline and click to view detailed error information
Error Cards in the Timeline
Clicking on this error card reveals:
- Error type: The specific category of error (e.g., "Agent Generation Error")
- Exception stacktrace: The complete stack trace showing exactly where the error occurred in the code
This detailed error information helps you pinpoint exactly what went wrong and where in your agent's execution chain.
Copy Trace for Debugging
The Copy Trace button in the timeline view allows you to export the entire conversation trace as JSON. This is particularly useful for offline analysis and debugging complex flows.

What's Included in the Trace Export
When you click Copy Trace, the system exports a JSON object containing:
How to Use Copy Trace
- Navigate to the Traces section in the management UI
- Open the conversation you want to debug
- Click the Copy Trace button at the top of the timeline
- The complete trace JSON is copied to your clipboard
- Paste it into your preferred tool for analysis
This exported trace contains all the activities shown in the timeline, making it easy to share complete execution context with team members or support.
Step 2: Check SigNoz
SigNoz provides distributed tracing and observability for your agent system, offering deeper insights when the built-in timeline isn't sufficient.
Accessing SigNoz from the Timeline
You can easily access SigNoz directly from the timeline view. In the Traces section, click on any activity in the conversation timeline to view its details. Within the activity details, you'll find a "View in SigNoz" button that takes you directly to the corresponding span in SigNoz for deeper analysis.
What SigNoz Shows
- Distributed traces: End-to-end request flows across services
- Performance metrics: Response times, throughput, and error rates
Key Metrics to Monitor
- Agent response times: How long each agent takes to process requests
- Tool execution times: Performance of MCP servers and external APIs
- Error rates: Frequency and types of failures
Agent Stopped Unexpectedly
StopWhen Limits Reached
If your agent stops mid-conversation, it may have hit a configured stopWhen limit:
- Transfer limit reached: Check
transferCountIson your Agent or Project - agent stops after this many transfers between Sub Agents - Step limit reached: Check
stepCountIson your Sub Agent or Project - execution stops after this many tool calls + LLM responses
How to diagnose:
- Check the timeline for the last activity before stopping
- Look for messages indicating limits were reached
- Review your stopWhen configuration in Agent/Project settings
How to fix:
- Increase the limits if legitimate use case requires more steps/transfers
- Optimize your agent flow to use fewer transfers
- Investigate if agent is stuck in a loop (limits working as intended)
See Configuring StopWhen for more details.
Common Configuration Issues
General Configuration Issues
- Missing environment variables: Ensure all required env vars are set
- Incorrect API endpoints: Verify you're using the right URLs
- Network connectivity: Check firewall and proxy settings
- Version mismatches: Ensure all packages are compatible
MCP Server Connection Issues
- MCP not able to connect:
- Check that the MCP server is running and accessible
- 401 Unauthorized errors:
- Verify that credentials are properly configured and valid
- Connection timeouts:
- Ensure network connectivity and firewall settings allow connections
AI Provider Configuration Problems
-
AI Provider key not defined or invalid:
- Ensure you have one of these environment variables set:
ANTHROPIC_API_KEY,OPENAI_API_KEY, orGOOGLE_GENERATIVE_AI_API_KEY - Verify the API key is valid and has sufficient credits
- Check that the key hasn't expired or been revoked
- Ensure you have one of these environment variables set:
-
GPT-5 access issues:
- Individual users cannot access GPT-5 as it requires organization verification
- Use GPT-4 or other available models instead
- Contact OpenAI support if you need GPT-5 access for your organization
Credit and Rate Limiting Issues
-
Running out of credits:
- Monitor your OpenAI usage and billing
- Set up usage alerts to prevent unexpected charges
-
Rate limiting by AI providers:
- Especially common with high-frequency operations like summarizers
- Monitor your API usage patterns and adjust accordingly
Context Fetcher Issues
- Context fetcher timeouts:
- Check that external services are responding within expected timeframes