In this article, I’ll walk you through analyzing weather patterns using Python. From identifying temperature trends to visualizing rainfall, this step-by-step guide is perfect for anyone interested in using data science techniques for weather analysis. I’ll explore code, data manipulation, and visualizations for practical insights.
In Kenya, weather plays a critical role in many sectors, particularly agriculture, tourism, and outdoor activities. Farmers, businesses, and event planners need accurate weather information to make decisions. However, weather patterns can vary significantly across regions, and current forecasting systems may not always provide localised insights.
The objective of this project is to collect real-time weather data from the OpenWeatherMap API and Weather API for different regions across Kenya. This data will be stored in a database and analysed using Python to uncover insights into:
In this project, I analyze a dataset containing weather information for various cities in Kenya. The dataset includes over 3,000 rows of weather observations, covering temperature, humidity, pressure, wind speed, visibility, and rainfall, among other factors. Using these insights, we aim to provide accurate, region-specific weather forecasts that can aid decision-making in weather-sensitive sectors like agriculture, tourism, and event management.
The dataset was structured using several columns:
This is how the data is structured in the database.
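To make the collect-and-store step more concrete, here is a minimal sketch of how one API response could be flattened into a row and written to a database. It assumes the JSON layout of OpenWeatherMap's current-weather endpoint and uses an in-memory SQLite database as a stand-in; the table name `weather`, the `observed_at` column, and the sample payload values are hypothetical and cover only a subset of the 14 columns, so treat this as an illustration rather than the project's actual pipeline.

```python
import sqlite3

# Subset of columns used later in the analysis; "observed_at" is a hypothetical name.
COLUMNS = ("city", "observed_at", "temperature_celsius", "humidity_pct",
           "pressure_hpa", "wind_speed_ms", "rain", "clouds", "weather_condition")


def parse_observation(payload):
    """Flatten one OpenWeatherMap-style current-weather response into a flat row."""
    return {
        "city": payload["name"],
        "observed_at": payload["dt"],                    # Unix timestamp (UTC)
        "temperature_celsius": payload["main"]["temp"],  # Celsius only if requested with units=metric
        "humidity_pct": payload["main"]["humidity"],
        "pressure_hpa": payload["main"]["pressure"],
        "wind_speed_ms": payload["wind"]["speed"],
        "rain": payload.get("rain", {}).get("1h", 0.0),  # the "rain" key is absent when it is dry
        "clouds": payload["clouds"]["all"],
        "weather_condition": payload["weather"][0]["main"],
    }


def store_observation(conn, row):
    """Create the table if needed and insert one observation."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weather ("
        "city TEXT, observed_at INTEGER, temperature_celsius REAL, humidity_pct REAL, "
        "pressure_hpa REAL, wind_speed_ms REAL, rain REAL, clouds REAL, weather_condition TEXT)"
    )
    placeholders = ", ".join(f":{name}" for name in COLUMNS)
    conn.execute(f"INSERT INTO weather VALUES ({placeholders})", row)
    conn.commit()


# Made-up payload for illustration only; a real one would come from the API.
sample_payload = {
    "name": "Nairobi",
    "dt": 1700000000,
    "main": {"temp": 22.4, "humidity": 64, "pressure": 1016},
    "wind": {"speed": 3.1},
    "clouds": {"all": 40},
    "weather": [{"main": "Clouds"}],
}

row = parse_observation(sample_payload)
conn = sqlite3.connect(":memory:")
store_observation(conn, row)
stored = conn.execute("SELECT city, rain, weather_condition FROM weather").fetchall()
print(stored)
```

Defaulting `rain` to 0.0 when the key is missing keeps the column numeric, which matters later when rainfall is summed and averaged.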
The first step in the analysis involved basic exploration of the data.
- Data dimensions: The dataset contains 3,000 rows and 14 columns.
- Null values: Minimal missing data, ensuring that the dataset was reliable for further analysis.
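The two checks above can be reproduced with a couple of pandas calls. This is a small sketch on a made-up four-row frame (the values are placeholders, not the project's data); on the real `df1` the same calls would report the row and column counts and the missing values per column.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for df1; values are illustrative only.
df1 = pd.DataFrame({
    "city": ["Nairobi", "Mombasa", "Nyeri", "Kisumu"],
    "temperature_celsius": [21.5, 29.0, np.nan, 24.3],
    "humidity_pct": [60, 78, 83, 70],
    "rain": [0.0, 1.2, 0.4, np.nan],
})

# Data dimensions: (rows, columns)
n_rows, n_cols = df1.shape
print(f"{n_rows} rows, {n_cols} columns")

# Null values: count and percentage of missing entries per column
missing_per_column = df1.isnull().sum()
missing_share = df1.isnull().mean().mul(100).round(1)
print(pd.DataFrame({"missing": missing_per_column, "missing_pct": missing_share}))
```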
print(df1[['temperature_celsius', 'humidity_pct', 'pressure_hpa', 'wind_speed_ms', 'rain', 'clouds']].describe())
Using the code above, we computed summary statistics for the numerical columns, which provided insights into the range, mean, and spread of temperature, humidity, pressure, wind speed, rainfall, and cloud cover.
To gain a clearer understanding of the weather features, we plotted various distributions:
Temperature Distribution
sns.displot(df1['temperature_celsius'], bins=50, kde=True)
plt.title('Temperature Distribution')
plt.xlabel('Temperature (Celsius)')
This distribution reveals the general spread of temperatures across the cities. The KDE line gives a smooth estimate of the probability distribution of temperature.
Rainfall Distribution
sns.displot(df1['rain'], bins=50, kde=True)
plt.title('Rainfall Distribution')
plt.xlabel('Rainfall (mm/h)')
This plot shows how rainfall is distributed across Kenyan cities.
Humidity, Pressure and Wind Speed
Similar distribution plots were produced for humidity (%), pressure (hPa), and wind speed (m/s), each providing useful insights into how these parameters vary across the dataset.
Weather conditions (e.g., 'Clouds', 'Rain') were counted and visualized using a pie chart to show their proportional distribution:
condition_counts = df1['weather_condition'].value_counts()
plt.figure(figsize=(8,8))
plt.pie(condition_counts, labels=condition_counts.index, autopct='%1.1f%%', pctdistance=1.1, labeldistance=0.6, startangle=140)
plt.title('Distribution of Weather Conditions')
plt.axis('equal')
plt.show()
One of the key analyses was the total rainfall by city:
rainfall_by_city = df1.groupby('city')['rain'].sum().sort_values()
plt.figure(figsize=(12,12))
rainfall_by_city.plot(kind='barh', color='skyblue')
plt.title('Total Rainfall by City')
plt.xlabel('Total Rainfall (mm)')
plt.ylabel('City')
plt.tight_layout()
plt.show()
This bar plot highlighted which cities received the most rain over the observed period, with a few outliers showing significant rainfall compared to others.
avg_temp_by_month.plot(kind='line')
plt.title('Average Monthly Temperature')
The line chart revealed temperature fluctuations across different months, showing seasonal changes.
monthly_rain.plot(kind='line')
plt.title('Average Monthly Rainfall')
Similarly, rainfall was analyzed to observe how it varied month-to-month.
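The snippets above plot `avg_temp_by_month` and `monthly_rain` without showing how they were built. A plausible way to derive them is to group by calendar month and take the mean. In this sketch the `timestamp` column name and the sample values are assumptions, since the article does not name the date column.

```python
import pandas as pd

# Illustrative stand-in for df1; "timestamp" is a hypothetical column name.
df1 = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-18", "2024-03-09"]
    ),
    "temperature_celsius": [24.0, 26.0, 27.0, 29.0, 25.0],
    "rain": [0.0, 2.0, 1.0, 3.0, 4.0],
})

# Calendar month (1-12) of each observation
month = df1["timestamp"].dt.month.rename("month")

# Mean temperature and mean rainfall per month
avg_temp_by_month = df1.groupby(month)["temperature_celsius"].mean()
monthly_rain = df1.groupby(month)["rain"].mean()

print(avg_temp_by_month)
print(monthly_rain)
```

Grouping by `dt.month` pools the same month across different years; if the data spans more than one year and that is not what you want, grouping by `dt.to_period("M")` keeps each year-month separate.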
We also visualized the data using heatmaps for a more intuitive understanding of monthly temperature and rainfall.
Here are the heatmaps for the average monthly temperature and rainfall.
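One common way to build such a heatmap is to pivot the data into a city-by-month grid and pass it to `sns.heatmap`. The sketch below assumes that layout (cities as rows, months as columns) and a `timestamp` column; neither is confirmed by the article, and the values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative stand-in for df1; "timestamp" is a hypothetical column name.
df1 = pd.DataFrame({
    "city": ["Mombasa", "Mombasa", "Mombasa", "Nyeri", "Nyeri", "Nyeri"],
    "timestamp": pd.to_datetime(
        ["2024-01-10", "2024-01-25", "2024-02-10", "2024-01-12", "2024-02-08", "2024-02-20"]
    ),
    "temperature_celsius": [30.0, 32.0, 31.0, 18.0, 19.0, 21.0],
})
df1["month"] = df1["timestamp"].dt.month

# Rows = cities, columns = months, cells = mean temperature
temp_grid = df1.pivot_table(
    index="city", columns="month", values="temperature_celsius", aggfunc="mean"
)

plt.figure(figsize=(6, 3))
ax = sns.heatmap(temp_grid, annot=True, fmt=".1f", cmap="coolwarm")
ax.set_title("Average Monthly Temperature by City")
plt.tight_layout()
```

Swapping `values="temperature_celsius"` for `values="rain"` would give the rainfall version of the same grid.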
Next, I calculated the correlation matrix between key weather variables:
correlation_matrix = df1[['temperature_celsius', 'humidity_pct', 'pressure_hpa', 'wind_speed_ms', 'rain', 'clouds']].corr()
correlation_matrix
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Between Weather Variables')
This heatmap allowed us to identify relationships between variables. For example, we observed a negative correlation between temperature and humidity, as expected.
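Reading the strongest relationships off a heatmap by eye gets harder as the number of variables grows, so it can help to rank the variable pairs by absolute correlation. The following is a sketch on synthetic data, where humidity is deliberately constructed as a decreasing linear function of temperature so that the pair shows up at the top; it says nothing about the actual correlation values in the Kenyan dataset.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Synthetic data: humidity is built to fall linearly as temperature rises.
rng = np.random.default_rng(42)
temperature = rng.normal(24, 4, 300)
df1 = pd.DataFrame({
    "temperature_celsius": temperature,
    "humidity_pct": 120 - 2 * temperature,
    "wind_speed_ms": rng.gamma(2.0, 1.5, 300),
})

corr = df1.corr()

# One entry per unordered pair of variables (skips the diagonal and duplicates)
pairs = pd.Series({(a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)})

# Order the pairs from strongest to weakest relationship, keeping the sign
ranked = pairs.reindex(pairs.abs().sort_values(ascending=False).index)
print(ranked)
```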
I also focused on individual cities, such as Mombasa and Nyeri, to explore their unique weather patterns:
Mombasa Temperature Trends
plt.plot(monthly_avg_temp_msa)
plt.title('Temperature Trends in Mombasa Over Time')
This city showed significant variation in temperature across the year.
Nyeri Rainfall Trends
plt.plot(monthly_avg_rain_nyr)
plt.title('Rainfall Trends in Nyeri Over Time')
The rainfall data for Nyeri displayed a clear seasonal pattern, with rainfall peaking during certain months.
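The per-city series plotted above (`monthly_avg_temp_msa` and `monthly_avg_rain_nyr`) could be derived by filtering on the city and averaging per year-month. The helper `monthly_mean` and the `timestamp` column below are hypothetical names introduced for this sketch, and the sample values are made up.

```python
import pandas as pd

# Illustrative stand-in for df1; "timestamp" is a hypothetical column name.
df1 = pd.DataFrame({
    "city": ["Mombasa", "Mombasa", "Mombasa", "Nyeri", "Nyeri", "Nyeri"],
    "timestamp": pd.to_datetime(
        ["2024-01-10", "2024-01-25", "2024-02-10", "2024-01-12", "2024-02-08", "2024-02-20"]
    ),
    "temperature_celsius": [30.0, 32.0, 29.0, 18.0, 19.0, 21.0],
    "rain": [0.0, 0.2, 0.1, 0.5, 2.0, 4.0],
})


def monthly_mean(df, city, column):
    """Average one column per year-month for a single city (hypothetical helper)."""
    subset = df[df["city"] == city]
    period = subset["timestamp"].dt.to_period("M")
    # Convert the period index back to timestamps so the result plots on a date axis
    return subset.groupby(period)[column].mean().to_timestamp()


monthly_avg_temp_msa = monthly_mean(df1, "Mombasa", "temperature_celsius")
monthly_avg_rain_nyr = monthly_mean(df1, "Nyeri", "rain")

print(monthly_avg_temp_msa)
print(monthly_avg_rain_nyr)
```

Each result is a Series indexed by the first day of the month, so it can be passed straight to `plt.plot` as in the snippets above.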
This analysis provides a comprehensive overview of the weather conditions in major cities, highlighting the temperature, rainfall, and other key weather variables. By using visualizations like histograms, line charts, pie charts, and heatmaps, we were able to extract meaningful insights into the data. Further analysis could involve comparing these trends with historical weather patterns or exploring predictive modeling to forecast future weather trends.
You can find the Jupyter Notebook with the full code for this analysis in my GitHub repository.