A Custom Chatbot Leveraging GPT Capabilities

By Diego Olaya – article on Medium

The ChatGPT chatbot, developed by OpenAI, has demonstrated the ability of AI to empower and support a diverse spectrum of users across various applications. Use cases range from enriching the learning experience at all educational levels and supporting programmers with debugging and code explanations to enabling content creators to craft immersive and captivating narratives. This article aims to bridge the gap between the often technical discussions about ChatGPT and the practical needs of businesses. To accomplish this, I first present the fundamentals of ChatGPT and then show a custom application that leverages its capabilities for a specific case. By the end of this article, you will understand what ChatGPT is and how it can be adapted to suit your unique requirements.

Understanding ChatGPT

Large language models (LLMs) have become an essential component of modern chatbots and virtual assistant technologies due to their ability to analyse and understand natural language. In essence, a language model is a probabilistic model trained on large volumes of text data with the purpose of learning patterns within language, such as syntax and semantics. Among the modelling strategies that have been proposed, neural network architectures such as recurrent neural networks and transformers have led to outstanding results. In particular, the transformer architecture achieves superior performance at a lower training cost than recurrent networks, largely because its self-attention mechanism processes all tokens of a sequence in parallel rather than one at a time.

Generative Pre-trained Transformer (GPT) models, such as ChatGPT, are built on the transformer architecture. Although these models share the same architecture, they differ in their intended usage. GPT models are general-purpose LLMs that can be applied to a broad range of natural language processing tasks, including text classification, question answering, and text summarisation. The ChatGPT model, by contrast, was additionally trained with reinforcement learning from human feedback to hold conversations, making it well suited to chatbot applications. It is worth noting that GPT-3 is one of the largest LLMs to date, with 175 billion parameters. OpenAI has not published a parameter count for ChatGPT, which was fine-tuned from a model in the GPT-3.5 series.

All put together, the ChatGPT model is an LLM built on a transformer architecture and reinforcement learning from human feedback, specially trained for text-based conversational applications such as chatbots and dialogue systems. Hence, unlike general-purpose GPT models, it is not designed to capture the wider range of linguistic phenomena or to support a larger set of language tasks. Moreover, the model’s training data is drawn from a vast amount of written content from the internet, up until 2021. As a result, the model cannot provide information about anything that happened after this period.

Creating a custom chatbot with GPT using your own data

As mentioned above, GPT models rely on pre-defined datasets during training. Since the training data does not necessarily contain information that can be directly applied to a particular context, customisation is required. This allows the models to learn and adapt to the unique context of an organisation. As a result, the models will generate more accurate and context-specific responses.

At present, the GPT-3.5 model cannot be fine-tuned, which restricts the extent to which it can be customised. However, other models within the GPT family can be tailored using one of the following strategies:

Using the GPT model as-is and providing context-specific information

This approach, also known as prompt engineering, consists of designing and optimising prompts that contain the information the model needs to formulate a response. A prompt is the set of instructions received by the model, which can take many forms, such as a question, statement, or command. For example, a company can include specific offers or products in the prompt so that a base GPT model has the information it needs to answer customer queries.

However, this method is limited by the maximum number of tokens the model can process in a single request (4096 tokens for the GPT-3.5 model). Tokens are chunks of text that can range in length from a single character to a whole word. The approach may therefore be impractical for large amounts of information spread across multiple text files, unless it is combined with other strategies.
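The token limit can be handled when the prompt is assembled. The following is a minimal sketch, not the implementation used in this article: the helper names, the budget reserved for the answer, and the rule of thumb of roughly four characters per token are all assumptions, and a real application would count tokens with the model's own tokeniser.

```python
# Hypothetical helpers for illustration. The 4-characters-per-token ratio is a
# rough approximation for English text, not the tokeniser used by the model.
MAX_TOKENS = 4096          # context limit quoted for the GPT-3.5 model
RESERVED_FOR_ANSWER = 500  # assumed budget left for the model's reply


def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly one token per four characters."""
    return max(1, len(text) // 4)


def build_prompt(context_snippets: list[str], question: str) -> str:
    """Add context snippets in order until the token budget is used up."""
    instructions = "Answer the question using only the context below."
    question_part = f"Question: {question}\nAnswer:"
    budget = MAX_TOKENS - RESERVED_FOR_ANSWER
    used = estimate_tokens(instructions) + estimate_tokens(question_part)

    selected = []
    for snippet in context_snippets:
        line = f"- {snippet}"
        cost = estimate_tokens(line)
        if used + cost > budget:
            break  # stop before the prompt would exceed the limit
        selected.append(line)
        used += cost

    context = "\n".join(selected)
    return f"{instructions}\n\nContext:\n{context}\n\n{question_part}"
```

Calling `build_prompt(["MbarQ is a Belgian start-up that provides AI services."], "Please tell me about MbarQ")` returns a single string that can be sent to the model as-is. Snippets that do not fit within the budget are silently dropped, which is why this strategy struggles with large document collections.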

Fine-tuning base GPT-3 models

Base GPT-3 models can be fine-tuned with custom data, which involves continuing their training on a smaller, context-specific dataset. This can lead to better performance on custom tasks without training from scratch. Additionally, the model can learn the right information from the data, thereby eliminating the need for custom prompts. This also saves tokens, as the prompt is reduced to the user’s query.

The process of fine-tuning involves creating a custom training dataset and launching a training job. Once the model is fine-tuned, it can be used by specifying it as a parameter within an API call. For instance, a company can use past support call centre dialogues to fine-tune a GPT-3 model to generate responses that align with its services and philosophy.
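The dataset step can be sketched as follows. The legacy fine-tuning workflow for base GPT-3 models took a JSONL file with one prompt/completion pair per line. The dialogues, the separator and stop markers, and the function names below are assumptions made for illustration and are not taken from a real project.

```python
import json

# Invented example dialogues; a real dataset would come from past support calls.
DIALOGUES = [
    {"customer": "Do you offer AI consulting?",
     "agent": "Yes, we advise on data and AI projects."},
    {"customer": "Can you help us build a chatbot?",
     "agent": "Yes, we can build one on top of your own data."},
]

SEPARATOR = "\n\n###\n\n"  # assumed marker for the end of each prompt
STOP = " END"              # assumed marker for the end of each completion


def to_training_records(dialogues: list[dict]) -> list[dict]:
    """Turn customer/agent turns into prompt/completion pairs."""
    return [
        {
            "prompt": d["customer"] + SEPARATOR,
            "completion": " " + d["agent"] + STOP,
        }
        for d in dialogues
    ]


def to_jsonl(records: list[dict]) -> str:
    """Serialise the records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"
```

The string returned by `to_jsonl(to_training_records(DIALOGUES))` would be saved to a file and supplied when the training job is launched. Consistent markers help the fine-tuned model recognise where a prompt ends and where its own answer should stop.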

The choice of implementation strategy depends on the use case and the costs associated with each method, as requirements such as the size of the search space and the level of utilisation vary depending on the application.

The custom chatbot case

The user is a large network of over 600 companies. The size of the network can make it challenging for sales personnel to identify the company within the group that matches a specific customer request. To address this issue and facilitate the interaction between sales personnel and their customers, a custom chatbot was built to support sales in finding the most suitable company for each case.

Since base GPT models lack knowledge of the user’s companies, a custom chatbot application was developed with the right information on each company’s offering and core competencies. The chatbot engages in a human-like fashion with the user and provides accurate information about the companies.

Figure 1 illustrates how the base GPT-3 model (text-davinci-003) fails to return the right information. The text highlighted in green is the model's answer to our query: “Please tell me about MbarQ”. For readers unfamiliar with MbarQ, it is a Belgian start-up that provides AI services and is not involved in the hospitality industry.

Figure 1. Base GPT-3 model response to: “Please tell me about MbarQ”.

Next, the base GPT-3 model was given information about MbarQ so that it could respond accurately to our query. As shown in Figure 2, incorporating this knowledge base returns a response that closely aligns with the vision and services of MbarQ.

Figure 2. Custom knowledge base GPT-3 model response to: “Please tell me about MbarQ”.

A custom chatbot was built leveraging the capabilities of the GPT-3 model from OpenAI and Power Apps, as illustrated in Figure 3. First, a dataset was created with descriptions of the services and capabilities of each company. This provides the necessary context-specific information to the base model and ensures accurate responses to queries. Finally, a custom flow and app were implemented within Power Apps to integrate the GPT-3 model and the custom data via a user-friendly interface. An example of the application is displayed in Figure 4.

Figure 3. Architecture of the custom chatbot.
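The article does not detail how the descriptions relevant to a given request are selected from the dataset. One simple possibility, shown here purely as an illustration and not as the implementation behind the chatbot, is to rank the descriptions by word overlap with the request and pass only the best matches to the model as context. The company names, descriptions, stop-word list, and function names are invented.

```python
import re

# Invented company descriptions for illustration only.
COMPANIES = {
    "Company A": "Cloud migration and infrastructure consulting.",
    "Company B": "Machine learning and data analytics services.",
    "Company C": "Payroll administration and HR outsourcing.",
}

# Small, assumed stop-word list so that filler words do not count as matches.
STOPWORDS = {"a", "and", "for", "in", "need", "of", "the", "to", "we", "with"}


def keywords(text: str) -> set[str]:
    """Lower-case the text and keep the alphabetic words that are not stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def rank_companies(request: str, companies: dict[str, str], top_k: int = 2) -> list[str]:
    """Return up to top_k company names, ordered by keyword overlap with the request."""
    request_words = keywords(request)
    scored = [
        (len(request_words & keywords(description)), name)
        for name, description in companies.items()
    ]
    # Highest overlap first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for overlap, name in scored[:top_k] if overlap > 0]
```

For the request "We need help with machine learning and data", only the second description shares keywords with the request, so only that company is returned. A production system would more likely rely on semantic similarity than on exact word matches, but the principle of narrowing the context before calling the model is the same.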

Final reflection

ChatGPT has transformed the way people interact with AI. It has brought AI closer to the general public and has inspired many people to apply its capabilities to a wide range of applications. The base capabilities of language models like GPT are powerful and versatile. With the appropriate customisation, they can unlock a whole new range of opportunities for companies.

One should not ignore the limitations, potential risks, and biases that make AI technologies prone to producing harmful content and misleading information. However, I strongly believe that stand-alone AI models, fine-tuned to meet specific business requirements, can serve as the basis for responsible AI solutions.

I would like to thank my colleagues at MbarQ for their valuable contributions to this article, especially Stefan Schoonbrood, Steven Van Goidsenhoven, and Pieter van der Deen.
