Hello, I found a dataset on kaggle in the time of use of a website, so I want to find a ratio between the number of pages visited and the total time in the website.
You can find the dataset and the code in my github : https://github.com/victordalet/Kaggle_analysis/tree/feat/website_traffic
To do this, I use sqlalchemy in python to convert my csv into a database and plotly to display my results.
pip install plotly pip install sqlalchemy
I create a Main class, in which I retrieve my csv and put it in a database, using the get_data method.
The result is a list of tuples, so I create the transform_data method to obtain a double list.
Finally, I can display a simple graph between the number of pages viewed and the total time.
import pandas as pd from sqlalchemy import create_engine, text import plotly.express as px class Main: def __init__(self): self.result = None self.connection = None self.engine = create_engine("sqlite:///my_database.db", echo=False) self.df = pd.read_csv("website_wata.csv") self.df.to_sql("website_data", self.engine, index=False, if_exists="append") self.get_data() self.transform_data() self.display_graph() def get_data(self): self.connection = self.engine.connect() query = text("SELECT Page_Views, Time_on_Page FROM website_data") self.result = self.connection.execute(query).fetchall() def transform_data(self): for i in range(len(self.result)): self.result[i] = list(self.result[i]) def display_graph(self): fig = px.scatter( self.result, x=0, y=1, title="" ) fig.show() Main()
The x-axis indicates the number of pages visited by the user, while the y-axis shows the time spent on the website in minutes.
We can see that the users who stay the longest visit between 4 and 6 pages, and that between 11 and 15 pages all users stay at least a few minutes.
Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.
Copyright© 2022 湘ICP备2022001581号-3