CLI Observability with Langfuse
Track LLM operations during the pull command with Langfuse
The Inkeep CLI includes built-in observability for tracking LLM operations during the pull command. This allows you to monitor costs, latency, and quality of AI-generated code across different LLM providers.
Overview
When you run inkeep pull, the CLI uses LLMs to generate TypeScript files for your agents, tools, and components. With Langfuse integration enabled, you can:
- Track token usage and costs across Anthropic, OpenAI, and Google models
- Monitor generation latency to identify slow operations
- View complete traces of multi-file code generation
- Analyze placeholder optimization impact on token savings
- Debug failed generations with full context
Setup
1. Create a Langfuse Account
Sign up for a free account at cloud.langfuse.com (EU region) or us.cloud.langfuse.com (US region).
2. Get API Keys
From your Langfuse dashboard:
- Navigate to Settings → API Keys
- Create a new API key pair
- Copy both the Secret Key (sk-lf-...) and Public Key (pk-lf-...)
3. Configure Environment Variables
Add these variables to your .env file in your project root:
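A minimal example, assuming the CLI reads the standard Langfuse SDK variable names (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASEURL); confirm the exact names in the CLI Reference if your version differs:

```bash
# .env — Langfuse credentials (variable names follow the standard Langfuse SDK conventions)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
# EU region shown; use https://us.cloud.langfuse.com if you signed up in the US region
LANGFUSE_BASEURL=https://cloud.langfuse.com
```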
4. Run Pull Command
Now when you run inkeep pull, all LLM operations will be traced to Langfuse:
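```bash
# Run from your project root so the CLI picks up the .env file
inkeep pull
```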
Viewing Traces
In Langfuse Dashboard
- Go to your Langfuse dashboard
- Navigate to Traces
- You'll see traces for each file generation operation
Trace Metadata
Each trace includes rich metadata:
| Field | Description | Example |
|---|---|---|
| fileType | Type of file being generated | agent, tool, data_component |
| placeholderCount | Number of placeholders used | 5 |
| promptSize | Size of the prompt in characters | 15234 |
| model | LLM model used | claude-sonnet-4-5 |
Example Trace Structure
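The exact contents depend on your project; as an illustrative sketch (the file and span names below are hypothetical), a single pull run produces one trace with a generation span per file:

```
inkeep pull (trace)
├── generate agents/support-agent.ts          fileType=agent, placeholderCount=5
│   └── llm generation (claude-sonnet-4-5)    tokens, cost, latency
├── generate tools/inventory-lookup.ts        fileType=tool
│   └── llm generation (claude-sonnet-4-5)
└── generate components/order-status.ts       fileType=data_component
    └── llm generation (claude-sonnet-4-5)
```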
Monitoring Strategies
Track Costs by Provider
Compare costs across different LLM providers:
- Filter traces by model in Langfuse
- View cumulative costs in the Usage dashboard
- Identify cost-saving opportunities
Optimize Generation Time
Find slow generation steps:
- Sort traces by duration
- Check if complex agents need longer timeouts
- Consider using faster models for simpler files
Analyze Token Savings
Monitor placeholder optimization impact:
- Look at the placeholderCount metadata; higher counts mean more token savings
- Useful for understanding efficiency gains
Troubleshooting
Traces Not Appearing
If traces are not showing up in your Langfuse dashboard:
- Check that Langfuse tracing is enabled, i.e. the Langfuse environment variables are present in your .env file
- Verify the API keys are set correctly and point to the region (EU or US) where you created them (see the check below)
- Re-run inkeep pull and check the CLI output for Langfuse-related errors
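A quick way to confirm the variables are visible to the CLI, assuming the standard Langfuse variable names used above:

```bash
# From your project root, confirm the variables are present in .env
grep '^LANGFUSE_' .env

# If the keys are exported in your shell instead, check without printing the secret
[ -n "$LANGFUSE_PUBLIC_KEY" ] && echo "public key is set" || echo "public key is NOT set"
[ -n "$LANGFUSE_SECRET_KEY" ] && echo "secret key is set" || echo "secret key is NOT set"
```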
Missing Metadata
If traces appear but lack metadata:
- Ensure you're using the latest CLI version
- Check that file type context is being passed correctly
- Report issues on GitHub
Privacy Considerations
What Data is Sent to Langfuse
- Prompt content: The full prompts sent to LLMs (includes your project data)
- Generated code: The TypeScript code generated by LLMs
- Model metadata: Model names, token counts, timings
- File metadata: File types, sizes, placeholder counts
What is NOT Sent
- Your API keys: LLM provider keys are never sent to Langfuse
- Other environment variables: Only Langfuse-specific vars are used
Self-Hosted Option
For complete control over your data, you can self-host Langfuse:
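A minimal sketch based on Langfuse's docker compose quickstart; once your instance is running, point the base URL variable at it instead of the cloud endpoint (the variable name assumes the standard Langfuse SDK convention):

```bash
# Clone and start Langfuse locally; see the self-hosting docs for production deployments
git clone https://github.com/langfuse/langfuse.git
cd langfuse
docker compose up -d

# Then, in your project's .env, point the CLI at your instance:
# LANGFUSE_BASEURL=http://localhost:3000
```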
See Langfuse self-hosting docs for details.
Best Practices
- Enable for development: Keep tracing on during development to catch issues early
- Disable in CI/CD: Turn off tracing for automated builds to avoid unnecessary traces (see the sketch after this list)
- Review weekly: Check Langfuse dashboard weekly to monitor costs and performance
- Set budgets: Configure spending alerts in your LLM provider dashboards
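For the CI/CD point above, one approach (assuming tracing only activates when the Langfuse keys are present) is simply not to provide those variables in your CI environment, or to unset them before the pull step:

```bash
# In the CI job, ensure the Langfuse keys are not exported before pulling
unset LANGFUSE_PUBLIC_KEY LANGFUSE_SECRET_KEY
inkeep pull
```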
Related Documentation
- SignOz Usage - OpenTelemetry tracing for runtime operations
- Langfuse Usage - Langfuse integration for agent runtime
- CLI Reference - Complete CLI command reference