Large Language Models (LLMs) are a new technology that has taken the world by storm. AI chatbots such as ChatGPT can be controlled through natural language, which makes AI accessible and extremely useful to the general public.
Most people have probably experimented with one chatbot or another and made similar observations: the models seem to have an answer to almost every question, write detailed texts and even respond to follow-up questions. But there is one major problem: chatbots hallucinate, i.e. they invent content that sounds plausible but is factually wrong. In addition, because of the vast amount of training data, it is usually impossible to trace where a piece of information originally came from.
We tackled this problem and looked for a solution in which a chatbot builds its answers from a given domain and cites the sources within that domain from which the information was taken.
Such knowledge-based chatbots can be used in practically any domain, for example to chat in natural language with documentation, the contents of a database or a website, without the risk of hallucination. We also looked at related questions such as the use of open-source LLMs (e.g. Llama 2), scalability and cost management.
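To make the idea of domain-grounded answers with sources a little more concrete, here is a minimal, hypothetical sketch: a handful of domain documents are ranked for relevance to the question, and the most relevant passages are packed into a prompt that instructs the model to answer only from them and to cite the bracketed sources. The names (`Document`, `retrieve`, `build_prompt`) and the toy word-overlap ranking are illustrative assumptions, not our actual implementation, which would use an embedding model and a vector store.

```python
# Minimal sketch of a knowledge-grounded chatbot step (hypothetical helpers).
from dataclasses import dataclass

@dataclass
class Document:
    source: str   # e.g. a file name or URL within the domain
    text: str

def retrieve(question: str, corpus: list[Document], k: int = 3) -> list[Document]:
    """Toy relevance ranking by word overlap; a real system would use embeddings."""
    words = set(question.lower().split())
    scored = sorted(corpus, key=lambda d: -len(words & set(d.text.lower().split())))
    return scored[:k]

def build_prompt(question: str, docs: list[Document]) -> str:
    """Constrain the model to the retrieved passages and ask it to cite them."""
    context = "\n\n".join(f"[{d.source}]\n{d.text}" for d in docs)
    return (
        "Answer ONLY from the context below and cite the sources in brackets. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# The resulting prompt would then be sent to an LLM (hosted or open source,
# e.g. Llama 2); the bracketed source tags let the answer point back to the
# origin of the information within the domain.
```

This is only a sketch of the principle; the remainder of this article describes the components we actually used and the trade-offs involved.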