Overview

Why Inkeep?

TL;DR

In short: quality.

Our #1 goal is to help our customers ship AI support experiences to their users with confidence.

For us, quality means developing our AI solution to:

  • Admit when it doesn't know and intelligently guide users to support channels
  • Consistently find the right content and not hallucinate
  • Provide rich citations that help users inspect answers
  • Leverage many sources while prioritizing authoritative content

It also means providing the end-to-end tooling needed for the entire support lifecycle. Beyond customer-facing AI assistants, Inkeep provides a copilot for support team members and actionable reporting for content and product teams.

We built our product and team to deliver best-in-class experiences for each of these steps.

That said, don't take our word for it! Use our free trial to test Inkeep with your toughest questions.

Hands-on Support

Quality for us also means working closely with every team we partner with to accomplish their goals. Consider us a partner who will work with you every step of the way.

Technical Deep Dive ✨

In our journey, we've talked to hundreds of companies who are eager to use generative AI to provide better self-serve experiences for their users. Many of them had experimented with creating their own LLM Q&A apps or tried other services, but often didn't ship because they felt the quality and reliability weren't there.

Ingesting content from many sources

Knowledge about technical products often lives in many places: documentation, GitHub, forums, Slack and Discord channels, blogs, StackOverflow, support systems, and elsewhere. Smartly ingesting all of this content, and keeping it up to date over time, quickly becomes the full-time task of a team of data engineers.

Inkeep addresses this by:

  • Automatically ingesting content from common public and private content sources with out-of-the-box integrations (while prioritizing them appropriately)

  • Frequently re-crawling your sources to find differences and keep your knowledge base up to date.
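One common way to implement change detection during a re-crawl, sketched below as an illustration rather than Inkeep's actual pipeline, is to fingerprint each page's content and re-ingest only what has changed. The `content_hash` and `pages_to_update` helpers here are hypothetical:

```python
import hashlib

def content_hash(text: str) -> str:
    """Stable fingerprint of a page's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def pages_to_update(previous: dict[str, str], crawled: dict[str, str]) -> list[str]:
    """Return URLs whose content is new or changed since the last crawl.

    `previous` maps URL -> stored content hash; `crawled` maps URL -> fresh text.
    """
    return [url for url, text in crawled.items()
            if previous.get(url) != content_hash(text)]
```

Only the pages returned by `pages_to_update` need to be re-embedded, which keeps recurring crawls cheap even for large sources.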

Finding the most relevant content

Retrieval-augmented generation (RAG) is the best way to use LLMs to answer questions about domain-specific content. At a high level, it involves taking a user question, finding the most relevant content, and feeding both to an LLM.
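The RAG loop just described can be sketched in a few lines. The toy word-overlap retriever and prompt builder below are illustrative stand-ins, not a production retrieval system:

```python
def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank passages by how many question words they share."""
    q_words = set(question.lower().split())
    ranked = sorted(corpus,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Feed the retrieved passages to the model alongside the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

Real systems replace the word-overlap ranking with embedding-based search, but the overall shape — retrieve, then generate from the retrieved context — is the same.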

RAG relies on finding the relevant documents and "chunks" within those documents needed to answer user questions. The problem is, popular ways of doing retrieval - like slicing up all content into n-character chunks - are often arbitrary and ineffective.

Retrieval becomes even more challenging as the number of documents and sources increases. More content can mean higher coverage of user questions, but it often also means more noise and the need for a more precise retrieval system.

Our retrieval and neural search engines address this by:

  • Accounting for time, author, source type, and other metadata that's important for prioritizing trustworthy content.

  • Using custom embedding and chunking strategies for each content source. The most effective embedding and chunking strategy for a Slack conversation is very different from one for a "How-to" article.

  • Combining semantic and keyword search to balance vector similarity with exact keyword matching.

  • Tailoring the embedding space to your specific organization and content. Out-of-the-box embedding models don't account for what we call the "semantic space" of your company and your products. For example, "Retrieval system" is much closer in semantic meaning to "Feature" for Inkeep than for other companies.

  • ...and more.
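As a rough illustration of blending semantic and keyword signals, a hybrid scorer can mix cosine similarity over embeddings with keyword overlap. The weighting and scoring below are hypothetical, not Inkeep's tuned system:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Vector similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(query_vec: list[float], doc_vec: list[float],
                 query_terms: list[str], doc_terms: list[str],
                 alpha: float = 0.7) -> float:
    """Blend semantic similarity with keyword overlap; alpha weights the semantic side."""
    semantic = cosine(query_vec, doc_vec)
    keyword = len(set(query_terms) & set(doc_terms)) / max(len(set(query_terms)), 1)
    return alpha * semantic + (1 - alpha) * keyword
```

In practice the keyword side is usually a proper lexical ranking function such as BM25, but the key idea is the same: neither signal alone is enough, so the final score combines both.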

If you're curious about our technical approach, join our newsletter where we share product updates and engineering deep-dives.

Minimizing hallucinations

Conversational large language models are trained to provide satisfying answers to users. Unfortunately, this makes them prone to providing answers that are unsubstantiated, i.e. "hallucinating". Dealing with hallucinations is notoriously difficult and a common blocker for many companies.

Here are some of the key ways in which we minimize hallucinations with our grounded-answer system:

  1. Retrieving the right content - When models are not given content that helps them answer a question, they are significantly more likely to hallucinate. That's in part why we focus so heavily on our search and retrieval engines.

  2. Providing citations - Citations give end-users easy ways to learn more and introspect answers. We include rich citations in our UI to make it easy to compare reference content, and we use citations to automatically evaluate, alert on, and fix drift from source material.

  3. Staying on topic - We've implemented a variety of protections to keep model answers on topic. For example, the bot won't answer questions unrelated to your company and will guard against giving answers that create a poor perception of your product.

  4. Rapidly experimenting at scale - We continuously test and evaluate our entire retrieval and LLM-stack against both historical and new user questions. This allows us to identify and adopt new techniques while monitoring for regressions.
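One way a grounding check along these lines can work is to verify that each answer sentence is supported by some retrieved chunk. The word-overlap heuristic below is a hypothetical simplification; real systems use stronger attribution models:

```python
def is_grounded(sentence: str, chunks: list[str], threshold: float = 0.5) -> bool:
    """Flag a sentence as grounded if enough of its words appear in some retrieved chunk."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    if not words or not chunks:
        return False

    def support(chunk: str) -> float:
        chunk_words = {w.strip(".,!?").lower() for w in chunk.split()}
        return len(words & chunk_words) / len(words)

    return max(support(c) for c in chunks) >= threshold
```

Sentences that fail the check can be suppressed, rewritten, or flagged for review instead of being shown to the user as-is.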

Incorporating feedback

Even with a best-in-class retrieval and grounded-answer system, feedback loops are the key to continuous improvement of model performance over time.

Our platform has built-in mechanisms for this, including:

  • Thumbs up/down feedback from end-users

  • "Edit answer" feature for administrators

  • The ability to batch test a set of test questions

  • Custom FAQs

We also provide usage, topical, and sentiment analysis on all user questions. Product and content teams often use these insights to prioritize content creation and product improvements that address root causes of user questions.

Production-ready service

To launch something confidently to end-users, it's essential to have:

  • Highly available, geo-distributed, low-latency search and chat services

  • API and UX monitoring

  • Continuous evaluation of search and chat results

Our platform already handles this at scale and answers hundreds of thousands of questions per month.

The Team

Our team is made up of engineers passionate about machine learning, data engineering, and user experiences. We're excited to solve the challenges in this space and help companies provide the best self-serve support and search experiences possible to their users.

We're fortunate to have the backing of reputable investors, including Y Combinator and Khosla Ventures.
