Edit Content

About Us

We must explain to you how all seds this mistakens idea off denouncing pleasures and praising pain was born and I will give you a completed accounts off the system and expound.

Contact Info

 Use of chatGPT and YouTube to become a Data Analyst

Use of chatGPT and YouTube to become a Data Analyst – Χρήση του chatGPT και του YouTube για να κάνει κάποιος ανάλυση δεδομένων

Code file: app.py

Folder: C:PythonProgramsdata_analysis_with_chatGPT

1) Merge multiple excel files – Συγχώνευση πολλών αρχείων excel

data1.xlsx

data2.xlsx

data3.xlsx

data4.xlsx

2) Clean data – Καθαρισμός δεδομένων

3) Create two excel reports (Total Revenue, Expenses and Profit by Category) and (Total Revenue, Expenses and Profit by Country) – Δημιουργία δύο αρχείων excel (category_report.xlsx and country_report.xlsx)

4) Add a chart to category_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel category_report.xlsx

Category report

5) Add a chart to country_report.xlsx – Προσθήκη γραφήματος στο αρχείο excel country_report.xlsx

6) Build two interactive plots (category_chart.html and country_chart.html)

Category report htmlCountry report html

7) Create a streamlit dashboard with 2 graphs – Δημιουργία ενός πίνακα προβολής και διαχείρισης streamlit με 2 γραφήματα

Streamlit presentation

με τη βοήθεια της γλώσσας Python – with Python.

Changed code by Trifonas Papadopoulos:

#Code starts here—————————

import xlsxwriter

import pandas as pd

import os

import plotly.express as px

import streamlit as st

# Define the path to the data folder

data_folder = ‘data’

# Get a list of all the Excel files in the data folder

excel_files = [f for f in os.listdir(data_folder) if f.endswith(‘.xlsx’)]

# Create an empty dataframe to hold the combined data

combined_data = pd.DataFrame()

# Loop through each Excel file and append its data to the combined_data dataframe

for file in excel_files:

# Load the Excel file into a dataframe

df = pd.read_excel(os.path.join(data_folder, file))

# Add a column to the dataframe to indicate the source file

df[‘Source’] = file

# Append the data to the combined_data dataframe

combined_data = pd.concat([combined_data, df], ignore_index=True)

# Drop duplicate rows

combined_data.drop_duplicates(inplace=True)

# Handle missing values

combined_data.dropna(inplace=True)

# Print the combined data

print(combined_data)

# Category Statistics

# Calculate total revenue, expenses, and profit for each category

category_data = combined_data.groupby(‘Category’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

category_data[‘Profit’] = category_data[‘Revenue’] – category_data[‘Expenses’]

# Print the category data

print(category_data)

# Create an interactive chart using Plotly

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Category’)

# Save the chart as an HTML file

chart_name = ‘category_chart.html’

fig.write_html(chart_name)

print(f’Chart saved to {chart_name}.’)

# Create an Excel report

report_name = ‘category_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

# Write the category data to the report

worksheet.write(0, 0, ‘Category’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

row = 1

for index, row_data in category_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Category’})

chart.set_x_axis({‘name’: ‘Category’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

worksheet.insert_chart(‘F1′, chart)

# Close the workbook

workbook.close()

print(f’Report saved to {report_name}.’)

# Country Statistics

# Calculate total revenue, expenses, and profit for each category

country_data = combined_data.groupby(‘Country’).agg({‘Revenue’: ‘sum’, ‘Expenses’: ‘sum’})

country_data[‘Profit’] = country_data[‘Revenue’] – country_data[‘Expenses’]

# Print the category data

print(country_data)

# Create an interactive chart using Plotly

fig = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’, title=’Total Revenue, Expenses and Profit by Country’)

# Save the chart as an HTML file

chart_name = ‘country_chart.html’

fig.write_html(chart_name)

print(f’Chart saved to {chart_name}.’)

# Create an Excel report

report_name = ‘country_report.xlsx’

workbook = xlsxwriter.Workbook(report_name)

worksheet = workbook.add_worksheet()

# Write the category data to the report

worksheet.write(0, 0, ‘Country’)

worksheet.write(0, 1, ‘Revenue’)

worksheet.write(0, 2, ‘Expenses’)

worksheet.write(0, 3, ‘Profit’)

row = 1

for index, row_data in country_data.iterrows():

worksheet.write(row, 0, index)

worksheet.write(row, 1, row_data[‘Revenue’])

worksheet.write(row, 2, row_data[‘Expenses’])

worksheet.write(row, 3, row_data[‘Profit’])

row += 1

# Create a chart representing the data

chart = workbook.add_chart({‘type’: ‘column’})

chart.add_series({

‘name’: ‘Revenue’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 1, row – 1, 1],

})

chart.add_series({

‘name’: ‘Expenses’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 2, row – 1, 2],

})

chart.add_series({

‘name’: ‘Profit’,

‘categories’: [‘Sheet1’, 1, 0, row – 1, 0],

‘values’: [‘Sheet1’, 1, 3, row – 1, 3],

})

chart.set_title({‘name’: ‘Total Revenue, Expenses and Profit by Country’})

chart.set_x_axis({‘name’: ‘Country’})

chart.set_y_axis({‘name’: ‘Amount’})

chart.set_legend({‘position’: ‘bottom’})

worksheet.insert_chart(‘F1′, chart)

# Close the workbook

workbook.close()

print(f’Report saved to {report_name}.’)

# Create a Streamlit dashboard

st.set_page_config(page_title=’Total Revenue, Expenses and Profit’, page_icon=’:chart_with_upwards_trend:’)

st.title(‘Total Revenue, Expenses and Profit’)

st.subheader(‘Grouped Data by Category’)

# Create an interactive chart using Plotly and display it in the dashboard

fig = px.bar(category_data, x=category_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig)

# Add a second graph

st.subheader(‘Grouped Data by Country’)

# Create an interactive chart using Plotly and display it in the dashboard for second graph

fig2 = px.bar(country_data, x=country_data.index, y=[‘Revenue’, ‘Expenses’, ‘Profit’], barmode=’group’)

st.plotly_chart(fig2)

#Code ends here—————————

This code has been changed. But it has been based heavily on code and instructions that you can find in this video on YouTube : I Quit Coding – How I use ChatGPT instead as Data Analyst https://www.youtube.com/watch?v=IAWMMi_VFOI

YouTube Video credit : Coding is Fun https://www.youtube.com/@CodingIsFun

Music track: A Positive Direction by Aylex

Source: https://lnkd.in/d4gBFZEU

Royalty Free Music for Video (Safe)

#mindstormGR #chatGPT #coding #python #data_analysis / www.mindstorm.gr