ChatGPT, the latest chatbot developed by OpenAI, has demonstrated how AI can empower and support a broad spectrum of users across many applications. Use cases range from enriching the learning experience at all educational levels and supporting programmers with debugging and code explanations to enabling content creators to craft immersive, captivating narratives. This article aims to bridge the gap between the often technical discussions about ChatGPT and the practical needs of businesses. To that end, I first present the fundamentals of ChatGPT and then show a custom application that leverages its capabilities for a specific case. By the end of this article, you will have a solid understanding of ChatGPT and how it can be adapted to your unique requirements.
Large language models (LLMs) have become an essential component of modern chatbots and virtual assistant technologies due to their ability to analyse and understand natural language. In essence, a language model is a probabilistic technique trained on large volumes of text data with the purpose of learning patterns within language, such as syntax and semantics. Among the modelling strategies that have been proposed, Neural Network architectures such as Recurrent Neural Networks and transformers have led to outstanding results. In particular, the transformer architecture achieves superior performance while maintaining lower computational costs thanks to its encoder-decoder structure and self-attention mechanisms.
Generative Pre-trained Transformer (GPT) models, such as the one behind ChatGPT, use the transformer architecture and incorporate reinforcement learning in their training. Although these models share the same architecture, they differ in their intended usage. GPT models are general-purpose LLMs, trained for a broad range of natural language processing tasks, including text classification, question answering, and text summarisation. ChatGPT, by contrast, was specifically trained to hold conversations, making it well suited for chatbot applications. It is worth noting that GPT-3 is one of the largest LLMs to date, with 175 billion parameters, compared to the 117 million parameters of the original GPT model.
Putting it all together, ChatGPT is an LLM built on the transformer architecture and trained with reinforcement learning from human feedback, specifically for text-based conversational applications such as chatbots and dialogue systems. As a consequence, it does not capture the wider range of linguistic phenomena and cannot support as broad a set of language tasks as general-purpose GPT models. Moreover, the model was trained on a vast amount of written content from the internet up until 2021; as a result, it cannot provide information about anything beyond that period.
Creating a custom chatbot with GPT using your own data
As mentioned above, GPT models rely on pre-defined datasets during training. Since the training data does not necessarily contain information that applies directly to a particular context, customisation is required. This allows the models to learn and adapt to the unique context of an organisation and, as a result, generate more accurate, context-specific responses.
At present, the GPT-3.5 model cannot be fine-tuned, which restricts the extent to which it can be customised. However, other models within the GPT family can be tailored using either of the following strategies:
Using the GPT model as-is and providing context-specific information
This approach, also known as prompt engineering, consists of designing and optimising prompts with information that the model uses to formulate a response. A prompt is the set of instructions the model receives, which can take many forms, such as a question, a statement, or a command. For example, a company can include its specific offers or products as context in the prompt, giving the base GPT model the information it needs to answer customer queries.
However, this method is limited to a maximum number of text chunks known as tokens, which can range in length from a single character to a word (a maximum of 4,096 tokens for the GPT-3.5 model). It may therefore be impractical when processing large amounts of information spread across multiple text files, unless combined with other strategies.
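As a minimal sketch of what such a context-injected prompt could look like, the helper below assembles company descriptions and a user question into a single prompt. The company description, model name, and function are illustrative assumptions, not part of the original application:

```python
def build_prompt(company_docs: dict[str, str], question: str) -> str:
    """Assemble a prompt that injects company descriptions as context."""
    context = "\n".join(f"- {name}: {desc}" for name, desc in company_docs.items())
    return (
        "Answer the customer's question using only the company "
        "information below.\n\n"
        f"Company information:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

docs = {"MbarQ": "Belgian start-up offering AI advisory and engineering services."}
prompt = build_prompt(docs, "Please tell me about MbarQ")

# The prompt would then be sent to a completion endpoint, e.g. with the
# openai package (requires an API key; shown for illustration only):
# import openai
# response = openai.Completion.create(
#     model="text-davinci-003", prompt=prompt, max_tokens=200, temperature=0
# )
# print(response["choices"][0]["text"])
```

Note that the entire context counts against the token limit on every call, which is why this approach scales poorly as the knowledge base grows.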
Fine-tuning base GPT-3 models
Base GPT-3 models can be fine-tuned with custom data, which involves continuing their training on a smaller, context-specific dataset. This can lead to better performance on custom tasks without training from scratch. Additionally, the model learns the relevant information directly from the data, eliminating the need for custom prompts. This also saves tokens, as the prompt is reduced to the user's query.
Fine-tuning involves creating a custom training dataset and launching a training job. Once the model is fine-tuned, it can be used by specifying it as a parameter in an API call. For instance, a company can use past support call-centre dialogues to fine-tune a GPT-3 model so that its responses align with the company's services and philosophy.
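A minimal sketch of how such a training dataset could be prepared, assuming the prompt-completion JSONL format used for GPT-3 fine-tuning; the example dialogues, separators, and file name are made up for illustration:

```python
import json

# Hypothetical past support dialogues, turned into prompt-completion pairs.
dialogues = [
    ("Which company in the group handles AI consulting?",
     "MbarQ provides AI advisory and engineering services."),
    ("Do you offer data warehousing expertise?",
     "Yes, several companies in the group specialise in data platforms."),
]

records = [
    # A fixed separator marks the end of the prompt, and a stop
    # sequence marks the end of each completion.
    {"prompt": f"{question}\n\n###\n\n", "completion": f" {answer} END"}
    for question, answer in dialogues
]

with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# The file would then be uploaded and a fine-tuning job launched, e.g.:
#   openai api fine_tunes.create -t training_data.jsonl -m davinci
# and the resulting model referenced by name in subsequent API calls.
```

The more representative the dialogue pairs are of real customer interactions, the closer the fine-tuned model's tone and content will match the company's services.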
The choice of implementation strategy depends on the use case and the costs associated with each method, as requirements such as the size of the search space and the expected utilisation vary by application.
The custom chatbot case
The client is a large network of over 600 companies, which can make it challenging for sales personnel to identify the company within the group that matches a specific customer request. To address this issue and facilitate the interaction between sales personnel and their customers, a custom chatbot was built to help sales find the most suitable company for each case.
Since base GPT models lack knowledge of the client's companies, a custom chatbot application was developed with accurate information on each company's offering and core competencies. The chatbot engages with the user in a human-like fashion and provides accurate information about the companies.
Figure 1 illustrates the limitation of the base GPT-3 model (text-davinci-003) in retrieving the right information. The text highlighted in green is the model's answer to our query: "Please tell me about MbarQ". For readers unfamiliar with MbarQ, it is a Belgian start-up that provides AI services and is not involved in the hospitality industry.
Next, the base GPT-3 model was fed information about MbarQ so that it could accurately respond to our query. As shown in Figure 2, incorporating this knowledge base returns a response that closely aligns with MbarQ's vision and services.
A custom chatbot was built leveraging the GPT-3 model from OpenAI together with Power Apps, as illustrated in Figure 3. First, a dataset was created with descriptions of each company's services and capabilities; this provides the necessary context-specific information to the base model and ensures accurate responses to queries. Then, a custom flow and app were implemented within Power Apps to integrate the GPT-3 model and the custom data behind a user-friendly interface. An example of the application is displayed in Figure 4.
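The step of selecting context-specific information for a given query could, for example, be sketched as a simple keyword-overlap ranking over the company descriptions. This scoring function and the example companies are simplified stand-ins; the actual application may select context differently:

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def rank_companies(company_docs: dict[str, str], query: str, top_n: int = 3) -> list[str]:
    """Rank companies by word overlap between the query and each description."""
    scored = [
        (len(tokens(query) & tokens(desc)), name)
        for name, desc in company_docs.items()
    ]
    # Highest overlap first; ties broken alphabetically by company name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]

docs = {
    "AlphaData": "data warehousing and analytics services",
    "MbarQ": "AI advisory and machine learning engineering services",
    "BetaWeb": "web design and e-commerce development",
}
best = rank_companies(docs, "Who can help with AI and machine learning?")
# best[0] == "MbarQ"
```

The top-ranked descriptions can then be injected into the prompt, keeping the context, and hence the token count, small even with 600+ companies in the dataset.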
ChatGPT has transformed the way people interact with AI. It has brought AI closer to the general public and has inspired many people to apply its capabilities to a wide range of applications. The base capabilities of language models like GPT are powerful and versatile, and a whole new range of opportunities for companies can be unlocked with the appropriate customisation.
One should not ignore the limitations, potential risks, and biases that make AI technologies prone to producing harmful content and misleading information. However, I strongly believe that stand-alone AI models, fine-tuned to meet specific business requirements, can form the basis of responsible AI solutions.
I would like to thank my colleagues at MbarQ for their valuable contributions to this article, especially Stefan Schoonbrood, Steven Van Goidsenhoven, and Pieter van der Deen.