Create chat bot - JO PARIS 2024

Published on 2024-08-31

In this article, I show how to create a simple chat bot with TensorFlow.

For the data, I use a Kaggle dataset about the Paris 2024 Olympic Games (JO) to get sentences for the training stage.

You can find the finished code on my GitHub: https://github.com/victordalet/Kaggle_analysis/tree/feat/paris_2024_olympics

I - Default chat bot dataset

A TensorFlow chat bot dataset looks like this.
Each intent has a tag, a list of patterns and a list of possible responses.
Our goal is to take the sentences from the Paris JO dataset and add them to a file in this format.

{
  "intents": [
    {
      "tag": "google",
      "patterns": [
        "google",
        "search",
        "internet"
      ],
      "responses": [
        "Redirecting to Google..."
      ]
    },
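
As a quick sanity check, a file in this shape can be inspected with a few lines of Python. The sketch below is only an illustration: the summarize_intents helper and the inline sample are hypothetical and not part of the original repository.

```python
import json

# Inline sample in the same shape as the excerpt above (hypothetical data).
SAMPLE = """
{
  "intents": [
    {
      "tag": "google",
      "patterns": ["google", "search", "internet"],
      "responses": ["Redirecting to Google..."]
    }
  ]
}
"""


def summarize_intents(dataset: dict) -> dict:
    """Map each tag to (number of patterns, number of responses)."""
    summary = {}
    for intent in dataset["intents"]:
        summary[intent["tag"]] = (len(intent["patterns"]),
                                  len(intent["responses"]))
    return summary


if __name__ == "__main__":
    print(summarize_intents(json.loads(SAMPLE)))
```

An intent with zero responses in this summary is worth checking before training, since the bot has nothing to answer for it.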

II - Data processing

I read the default chat bot dataset (JSON) and the JO CSV file, split the CSV content, and add each question with its answer to the JSON.

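import json


class CreateDataset:
    def __init__(self):
        self.json_path = 'data.json'
        self.csv_path = '../paris-2024-faq.csv'

        # Load the default intents file that will be extended.
        with open(self.json_path) as file:
            self.dataset = json.load(file)

        # The FAQ file is read as raw text and split on ';'.
        with open(self.csv_path, 'r') as file:
            dataset_split = file.read().split(";")

        question = False
        for data in dataset_split:
            # The field that follows a question is stored as its answer.
            if question:
                question = False
                self.dataset["intents"][-1]["responses"].append(data)

            # A field containing '?' starts a new intent.
            if "?" in data:
                question = True
                self.dataset["intents"].append({
                    # Unique tag per question: with an empty tag, every FAQ
                    # entry would end up in the same class during training.
                    "tag": f"faq_{len(self.dataset['intents'])}",
                    "patterns": [data],
                    "responses": []
                })

        # Write the extended dataset back to the same JSON file.
        with open(self.json_path, 'w') as file:
            json.dump(self.dataset, file)


if __name__ == "__main__":
    CreateDataset()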
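
To see what this kind of loop produces without touching any files, the same idea can be run on an in-memory string. This is a minimal sketch under assumptions: the faq_to_intents helper, the faq_N tag naming and the sample text are hypothetical, and the real CSV may be laid out differently.

```python
def faq_to_intents(raw_text: str) -> list:
    """Turn ';'-separated text into intents: a field containing '?' is
    treated as a question, and the field right after it as its answer."""
    intents = []
    expecting_answer = False
    for field in raw_text.split(";"):
        field = field.strip()
        if expecting_answer:
            intents[-1]["responses"].append(field)
            expecting_answer = False
        elif "?" in field:
            intents.append({
                "tag": f"faq_{len(intents)}",  # hypothetical tag scheme
                "patterns": [field],
                "responses": [],
            })
            expecting_answer = True
    return intents


# Placeholder sample, not taken from the real dataset.
sample = "Question one?;Answer one;Question two?;Answer two"
for intent in faq_to_intents(sample):
    print(intent)
```

Unlike the class above, this variant uses elif, so an answer that itself contains a question mark is not also registered as a new question.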

III - Training

For training, I adapted a TensorFlow example.
If you run my code, pass the number of epochs you want as the first command-line argument.
Create a saves directory where your model will go, and put in it the classes.pkl and words.pkl files from the GitHub repository linked at the beginning of this article.
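
The setup described above can also be scripted. The following is a minimal sketch, assuming the training script is saved as train.py (a hypothetical file name); the parse_epochs and ensure_save_dir helpers are illustrative and do not exist in the original code.

```python
import argparse
import os


def parse_epochs(argv: list) -> int:
    """Read the number of epochs from the first positional argument."""
    parser = argparse.ArgumentParser(description="Train the chat bot")
    parser.add_argument("epochs", type=int, help="number of training epochs")
    return parser.parse_args(argv).epochs


def ensure_save_dir(path: str = "saves") -> str:
    """Create the output directory if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # Equivalent to running: python train.py 200
    epochs = parse_epochs(["200"])
    save_dir = ensure_save_dir("saves")
    print(f"{epochs} epochs, model directory: {save_dir}")
```

Compared with reading sys.argv[1] directly, argparse reports a readable error when the argument is missing or is not an integer.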

import random
import json
import pickle
import numpy as np
import sys

import nltk
from nltk.stem import WordNetLemmatizer

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD


class Train:
    words: list
    classes: list
    documents: list
    ignore_letters: list
    training: list
    output_empty: list
    train_x: list
    train_y: list
    model: Sequential
    epochs: int

    def __init__(self):
        self.lemmatizer = WordNetLemmatizer()
        with open('data.json') as file:
            self.intents = json.load(file)
        self.words = []
        self.classes = []
        self.documents = []
        self.training = []
        self.ignore_letters = ['?', '!']
        self.epochs = int(sys.argv[1])

    def run(self):
        self.download_nltk_data()
        self.load_training_data()
        self.prepare_training_data()
        self.build_neural_network()
        self.train()

    @staticmethod
    def download_nltk_data():
        nltk.download('punkt')
        # Recent NLTK releases load the tokenizer data from 'punkt_tab'.
        nltk.download('punkt_tab')
        nltk.download('wordnet')

    def load_training_data(self):
        for intent in self.intents['intents']:
            for pattern in intent['patterns']:
                word_list = nltk.word_tokenize(pattern)
                self.words.extend(word_list)
                self.documents.append((word_list, intent['tag']))
                if intent['tag'] not in self.classes:
                    self.classes.append(intent['tag'])

    def prepare_training_data(self):
        # Lowercase before lemmatizing so the vocabulary matches the
        # lowercased patterns used to build the bags below.
        self.words = [self.lemmatizer.lemmatize(word.lower())
                      for word in self.words
                      if word not in self.ignore_letters]

        self.words = sorted(set(self.words))
        self.classes = sorted(set(self.classes))
        with open('saves/words.pkl', 'wb') as file:
            pickle.dump(self.words, file)
        with open('saves/classes.pkl', 'wb') as file:
            pickle.dump(self.classes, file)

        self.output_empty = [0] * len(self.classes)
        for document in self.documents:
            word_patterns = [self.lemmatizer.lemmatize(word.lower())
                             for word in document[0]]
            # Bag of words: 1 if the vocabulary word occurs in the pattern.
            bag = [1 if word in word_patterns else 0 for word in self.words]

            # One-hot row marking the tag of this pattern.
            output_row = list(self.output_empty)
            output_row[self.classes.index(document[1])] = 1
            self.training.append([bag, output_row])

        random.shuffle(self.training)
        # bag and output_row have different lengths, so recent NumPy
        # versions need dtype=object to build this array.
        self.training = np.array(self.training, dtype=object)

        self.train_x = list(self.training[:, 0])
        self.train_y = list(self.training[:, 1])

    def build_neural_network(self):
        self.model = Sequential()
        self.model.add(Dense(128, input_shape=(len(self.train_x[0]),),
                             activation='relu'))
        self.model.add(Dropout(0.5))
        self.model.add(Dense(64, activation='relu'))
        self.model.add(Dropout(0.5))
        self.model.add(Dense(len(self.train_y[0]), activation='softmax'))

        sgd = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
        self.model.compile(loss='categorical_crossentropy',
                           optimizer=sgd,
                           metrics=['accuracy'])

    def train(self):
        self.model.fit(np.array(self.train_x),
                       np.array(self.train_y),
                       epochs=self.epochs,
                       batch_size=5,
                       verbose=1)
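        # Note: Keras 3 only accepts file names ending in .keras or .h5;
        # if you rename the file, use the same name when loading the model.
        self.model.save('saves/chatbot_model.model')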


if __name__ == "__main__":
    Train().run()
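
The core of prepare_training_data is the bag-of-words encoding: every pattern becomes a vector of 0s and 1s over the vocabulary, and every tag becomes a one-hot row. The sketch below shows that encoding on a toy example in plain Python, without NLTK or TensorFlow. It splits on whitespace instead of tokenizing and lemmatizing, and the encode helper and the sample documents are hypothetical.

```python
def encode(documents: list) -> tuple:
    """Return (vocabulary, classes, bags, one-hot rows) for a list of
    (pattern, tag) pairs."""
    tokenized = [(pattern.lower().split(), tag) for pattern, tag in documents]
    words = sorted({word for tokens, _ in tokenized for word in tokens})
    classes = sorted({tag for _, tag in tokenized})

    train_x, train_y = [], []
    for tokens, tag in tokenized:
        # 1 where the vocabulary word occurs in the pattern, else 0.
        train_x.append([1 if word in tokens else 0 for word in words])
        # One-hot row: 1 at the index of the pattern's tag.
        row = [0] * len(classes)
        row[classes.index(tag)] = 1
        train_y.append(row)
    return words, classes, train_x, train_y


docs = [("hello there", "greeting"), ("search the internet", "google")]
words, classes, train_x, train_y = encode(docs)
print(words)    # ['hello', 'internet', 'search', 'the', 'there']
print(classes)  # ['google', 'greeting']
print(train_x)  # [[1, 0, 0, 0, 1], [0, 1, 1, 1, 0]]
print(train_y)  # [[0, 1], [1, 0]]
```

The length of each bag is the size of the input layer, and the length of each one-hot row is the size of the softmax output layer.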

IV - Test

I create a ChatBot class with a test method that reads any message typed by the user.
To add this chatbot to your application, call the predict_class and get_response methods. For example, in one of my projects I call them from a Flask API to put the chatbot on a website.

import random
import json
import pickle
import numpy as np

import nltk
from nltk.stem import WordNetLemmatizer
from tensorflow.keras.models import Model, load_model


class ChatBot:
    lemmatizer: WordNetLemmatizer
    intents: dict
    words: list
    classes: list
    model: Model
    ERROR_THRESHOLD = 0.25

    def __init__(self):
        self.download_nltk_data()
        self.lemmatizer = WordNetLemmatizer()
        with open('data.json') as file:
            self.intents = json.load(file)
        with open('saves/words.pkl', 'rb') as file:
            self.words = pickle.load(file)
        with open('saves/classes.pkl', 'rb') as file:
            self.classes = pickle.load(file)
        self.model = load_model('saves/chatbot_model.model')

    @staticmethod
    def download_nltk_data():
        nltk.download('punkt')
        # Recent NLTK releases load the tokenizer data from 'punkt_tab'.
        nltk.download('punkt_tab')
        nltk.download('wordnet')

    def clean_up_sentence(self, sentence):
        sentence_words = nltk.word_tokenize(sentence)
        # Lowercase as in training so the words match the saved vocabulary.
        sentence_words = [self.lemmatizer.lemmatize(word.lower())
                          for word in sentence_words]
        return sentence_words

    def bag_of_words(self, sentence):
        sentence_words = self.clean_up_sentence(sentence)
        bag = [0] * len(self.words)
        for w in sentence_words:
            for i, word in enumerate(self.words):
                if word == w:
                    bag[i] = 1
        return np.array(bag)

    def predict_class(self, sentence):
        bow = self.bag_of_words(sentence)
        res = self.model.predict(np.array([bow]))[0]
        results = [[i, r]
                   for i, r in enumerate(res)
                   if r > self.ERROR_THRESHOLD]
        results.sort(key=lambda x: x[1], reverse=True)
        return_list = []
        for r in results:
            return_list.append({'intent': self.classes[r[0]],
                                'probability': str(r[1])})
        return return_list

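    def get_response(self, intents_list):
        # Placeholder fallback text, used when no intent passes the
        # threshold or the matched intent has no responses.
        result = "Sorry, I did not understand."
        if not intents_list:
            return result
        tag = intents_list[0]['intent']
        for intent in self.intents['intents']:
            if intent['tag'] == tag:
                if intent['responses']:
                    result = random.choice(intent['responses'])
                break
        return result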

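    def test(self):
        # Simple console loop: read a message, print the bot's answer.
        while True:
            message = input("")
            ints = self.predict_class(message)
            res = self.get_response(ints)
            print(res)


if __name__ == "__main__":
    ChatBot().test()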
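
To call the bot from a web application, the two methods can be wrapped in a small handler. This is a sketch under assumptions, not the author's actual API: the /chat route, the message field and the handle_message and create_app names are hypothetical, and bot is any object that exposes predict_class and get_response like the ChatBot class above.

```python
def handle_message(bot, payload) -> tuple:
    """Return (response body, HTTP status) for one chat request."""
    data = payload if isinstance(payload, dict) else {}
    message = str(data.get("message", "")).strip()
    if not message:
        return {"error": "field 'message' is required"}, 400
    intents = bot.predict_class(message)
    return {"response": bot.get_response(intents)}, 200


def create_app(bot):
    """Wire the handler into a Flask app (requires Flask to be installed)."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/chat", methods=["POST"])
    def chat():
        # silent=True yields None instead of an error on invalid JSON.
        body, status = handle_message(bot, request.get_json(silent=True))
        return jsonify(body), status

    return app
```

With the ChatBot class in scope, create_app(ChatBot()).run() starts Flask's development server. Keeping the request logic in handle_message means it can be checked with a stub bot, without loading the model.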

Release Statement: This article is reproduced from https://dev.to/victordalet/create-chat-bot-jo-paris-2024-4dnf?1
