Goal
Build a personal Expense Tracker Analytics mini-system. Import expenses from CSV (and optionally JSON), clean them, calculate monthly & category summaries, create a bar chart, and export results to CSV + JSON + PNG.
Difficulty: Easy • Perfect for your first Pandas project.
Data entry sources
Start with CSV and JSON. (MySQL/Mongo can be added as an upgrade later.)
📥 Read CSV + JSON
import pandas as pd
df_csv = pd.read_csv("../data/expenses.csv", parse_dates=["date"])
df_json = pd.read_json("../data/expenses_sample.json")
df_json["date"] = pd.to_datetime(df_json["date"], errors="coerce")
df = pd.concat([df_csv, df_json], ignore_index=True)
print(df.head())
print(df.info())
Pandas processing
Clean numeric values, create a month column, and compute totals.
🧠 Clean + groupby
import pandas as pd
df = pd.read_csv("../data/expenses.csv", parse_dates=["date"])
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
df["month"] = df["date"].dt.to_period("M").astype(str)
by_category = df.groupby("category")["amount"].sum().sort_values(ascending=False)
by_month = df.groupby("month")["amount"].sum().reset_index()
print(by_category)
print(by_month.head())
Charts (Matplotlib) → PNG
Create a bar chart and save to PNG using plt.savefig().
📊 Bar chart + save PNG
import pandas as pd
import matplotlib.pyplot as plt
import os
os.makedirs("../output", exist_ok=True)
df = pd.read_csv("../data/expenses.csv", parse_dates=["date"])
df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
by_category = df.groupby("category")["amount"].sum().sort_values(ascending=False)
plt.figure()
plt.bar(by_category.index, by_category.values)
plt.title("Total Spend by Category")
plt.xlabel("Category")
plt.ylabel("Amount")
plt.xticks(rotation=25, ha="right")
plt.tight_layout()
plt.savefig("../output/easy_spend_by_category.png", dpi=200)
plt.show()
Exports
Export summaries to CSV and JSON.
💾 Export CSV + JSON
import pandas as pd
import os
os.makedirs("../output", exist_ok=True)
df = pd.read_csv("../data/expenses.csv", parse_dates=["date"])
df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
df["month"] = df["date"].dt.to_period("M").astype(str)
by_category = df.groupby("category")["amount"].sum().sort_values(ascending=False)
by_month = df.groupby("month")["amount"].sum().reset_index().rename(columns={"amount":"total_amount"})
by_category.to_csv("../output/easy_category_summary.csv")
by_month.to_json("../output/easy_monthly_summary.json", orient="records", indent=2)
Run steps
- Install deps: pip install -r requirements.txt
- Open terminal in scripts/
- Run: python easy_expense_analytics.py
- Check outputs in output/ (CSV/JSON/PNG)