🍊 Programmer’s Picnic • Data Lesson

Pandas + CSV + Matplotlib — Comprehensive Lesson

Learn how to read CSV files using Pandas, clean and analyze data, then visualize insights using Matplotlib. Includes copy-ready code, downloadable scripts, and an embedded Python editor.

Level: Beginner → Confident • Works in VS Code • Offline HTML • Embedded Python Editor

⭐ Embedded Python Editor (Programmer’s Picnic)

Use this editor to test small Python snippets. For Pandas + Matplotlib, your local VS Code setup is best, but you can still practice syntax and logic here.

🧪 Live Python Editor • Tip: Run small snippets here
If you embed this page on Blogger, you may need to allow iframes in your template and keep the iframe's sandbox attribute permissive enough for the editor to run.

0) Setup (install + verify)

📦 Install
pip install pandas matplotlib
✅ Verify
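import matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# Printing both versions confirms that each package imported correctly.
print("pandas:", pd.__version__)
print("matplotlib:", matplotlib.__version__)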
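
If an import fails, it helps to know which package is missing before re-running pip. The following is a minimal sketch that uses only the standard library; `check_packages` is a hypothetical helper name, not part of pandas or Matplotlib.

```python
from importlib import metadata
from importlib.util import find_spec


def check_packages(names):
    """Return {name: version string, "unknown", or None if not importable}."""
    report = {}
    for name in names:
        if find_spec(name) is None:
            report[name] = None  # not importable in this interpreter
        else:
            try:
                report[name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                # Importable, but not installed as a distribution (e.g. stdlib).
                report[name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_packages(["pandas", "matplotlib"]).items():
        status = version if version else f"MISSING (run: pip install {name})"
        print(f"{name}: {status}")
```

Run it with the same interpreter that VS Code uses; a package installed into a different Python environment will still show as missing here.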

2) Read CSV properly (real-world CSV handling)

📄 Basic read
import pandas as pd

df = pd.read_csv("sales.csv")
print(df.head())
print(df.shape)
print(df.columns)
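
To practice the basic read without a file on disk, the same calls can be tried on CSV text held in memory. This sketch assumes the sample rows from this lesson and uses `io.StringIO` as a stand-in for the file; `CSV_TEXT` is an illustrative name.

```python
import io

import pandas as pd

# Same rows as the lesson's sample sales.csv, held in memory for practice.
CSV_TEXT = """date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,800
2026-01-02,Varanasi,600
2026-01-02,Lucknow,500
2026-01-03,Delhi,1200
2026-01-03,Varanasi,900
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

print(df.head(3))        # first three rows
print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names as a plain list
print(df.dtypes)         # date is still text here because parse_dates was not used
```
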
🛠 Robust read (dates + missing)
import pandas as pd

df = pd.read_csv(
    "sales.csv",
    parse_dates=["date"],
    na_values=["NA", "N/A", "-", "null", ""],
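    on_bad_lines="skip",  # needs pandas 1.3+; malformed rows are dropped silently
)

df.info()  # prints the summary itself, so no print() wrapper is needed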
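
To see what the robust read options actually do, here is a small sketch on deliberately messy, made-up input. The rows in `MESSY_CSV` are hypothetical: two placeholder values in the amount column and one line with an extra field.

```python
import io

import pandas as pd

# Hypothetical messy input: placeholders in "amount" and one malformed row.
MESSY_CSV = """date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,N/A
2026-01-02,Lucknow,-
2026-01-02,Kanpur,300,extra
2026-01-03,Delhi,1200
"""

df = pd.read_csv(
    io.StringIO(MESSY_CSV),
    parse_dates=["date"],
    na_values=["NA", "N/A", "-", "null", ""],
    on_bad_lines="skip",  # the Kanpur row has four fields, so it is skipped
)

print(len(df))          # rows that survived the read
print(df.dtypes)        # date is a datetime column, amount is float
print(df.isna().sum())  # missing values per column
```

Skipping bad lines is convenient but silent, so comparing the row count with the line count of the source file is a reasonable sanity check.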

9) Chart examples

sales.csv (sample)
date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,800
2026-01-02,Varanasi,600
2026-01-02,Lucknow,500
2026-01-03,Delhi,1200
2026-01-03,Varanasi,900
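
If you would rather create the sample file from Python than by hand, a sketch like the following writes the rows above to disk. `write_sample` is a hypothetical helper, and it overwrites any existing file at the given path.

```python
from pathlib import Path

SAMPLE_ROWS = """date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,800
2026-01-02,Varanasi,600
2026-01-02,Lucknow,500
2026-01-03,Delhi,1200
2026-01-03,Varanasi,900
"""


def write_sample(path="sales.csv"):
    """Write the sample rows to `path` (overwriting it) and return the Path."""
    target = Path(path)
    target.write_text(SAMPLE_ROWS, encoding="utf-8")
    return target


if __name__ == "__main__":
    created = write_sample()
    print(f"Wrote {created} ({len(SAMPLE_ROWS.splitlines()) - 1} data rows)")
```
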
📊 city_bar.py
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv")
by_city = df.groupby("city")["amount"].sum().sort_values(ascending=False)

plt.figure()
plt.bar(by_city.index, by_city.values)
plt.title("Total Sales by City")
plt.xlabel("City")
plt.ylabel("Total Amount")
plt.tight_layout()
plt.show()
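
Before trusting a chart, it is worth printing the numbers behind it. The sketch below recomputes the city totals from the sample rows and saves the bar chart to a file instead of opening a window. It assumes a setup without a display (hence the `Agg` backend), and `city_bar.png` is a hypothetical output name.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: no display available, so render to file only
import matplotlib.pyplot as plt
import pandas as pd

CSV_TEXT = """date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,800
2026-01-02,Varanasi,600
2026-01-02,Lucknow,500
2026-01-03,Delhi,1200
2026-01-03,Varanasi,900
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
by_city = df.groupby("city")["amount"].sum().sort_values(ascending=False)
print(by_city)  # Varanasi 2700, Delhi 2000, Lucknow 500 for the sample rows

fig, ax = plt.subplots()
ax.bar(by_city.index, by_city.values)
ax.set_title("Total Sales by City")
ax.set_xlabel("City")
ax.set_ylabel("Total Amount")
fig.tight_layout()
fig.savefig("city_bar.png", dpi=150)  # hypothetical output filename
plt.close(fig)
```
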
📈 daily_line.py
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv", parse_dates=["date"])
daily = df.groupby(df["date"].dt.date)["amount"].sum()

plt.figure()
plt.plot(daily.index, daily.values, marker="o")
plt.title("Daily Sales")
plt.xlabel("Date")
plt.ylabel("Amount")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
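
One thing to watch with daily totals: grouping by date only returns days that appear in the data, so a day with no sales vanishes from the line instead of dropping to zero. As a variation (not the approach used in the script above), pandas `resample("D")` fills in every calendar day. The rows below are hypothetical and leave out 2026-01-02 on purpose.

```python
import io

import pandas as pd

# Hypothetical rows with no sales recorded on 2026-01-02.
CSV_TEXT = """date,city,amount
2026-01-01,Varanasi,1200
2026-01-01,Delhi,800
2026-01-03,Varanasi,900
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# Grouping by calendar date only returns days that appear in the data.
grouped = df.groupby(df["date"].dt.date)["amount"].sum()

# Resampling by day ("D") covers every day in the range; empty days sum to 0.
daily = df.set_index("date").resample("D")["amount"].sum()

print(len(grouped))  # 2 days
print(daily)         # 3 days, with 0 for 2026-01-02
```

Either series can be passed to `plt.plot(daily.index, daily.values, marker="o")` in the same way; which one is right depends on whether a missing day means "no sales" or "no data".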

10) Full workflow script (end-to-end)

⭐ pandas_csv_matplotlib_workflow.py
import pandas as pd
import matplotlib.pyplot as plt

CSV_FILE = "sales.csv"

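def main():
    # Read first, then normalize headers so "Date " or "Amount" still match below.
    df = pd.read_csv(CSV_FILE)
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Parse dates after renaming; unparseable values become NaT.
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    # Non-numeric amounts become NaN, then 0, so they do not break the sums.
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
    df = df.drop_duplicates()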

    by_city = df.groupby("city")["amount"].sum().sort_values(ascending=False)
    daily = df.groupby(df["date"].dt.date)["amount"].sum()

    by_city.to_csv("report_city_totals.csv")
    daily.to_csv("report_daily_totals.csv")

    plt.figure()
    plt.bar(by_city.index, by_city.values)
    plt.title("Total Sales by City")
    plt.xlabel("City")
    plt.ylabel("Total Amount")
    plt.tight_layout()
    plt.show()

    plt.figure()
    plt.plot(daily.index, daily.values, marker="o")
    plt.title("Daily Sales Trend")
    plt.xlabel("Date")
    plt.ylabel("Amount")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

if __name__ == "__main__":
    main()
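
The cleaning steps in the workflow (header normalization, numeric coercion, duplicate removal) are easier to trust once you have watched them act on bad input. This sketch pulls them into a hypothetical `clean` helper and runs it on made-up messy rows; it is an illustration of those steps, not a replacement for the script.

```python
import io

import pandas as pd


def clean(df):
    """Apply the workflow's cleaning steps to a copy of df (hypothetical helper)."""
    out = df.copy()
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce").fillna(0)
    return out.drop_duplicates()


# Made-up input: untidy headers, one duplicate row, one non-numeric amount.
MESSY = """Date, City ,Amount
2026-01-01,Varanasi,1200
2026-01-01,Varanasi,1200
2026-01-02,Delhi,oops
"""

raw = pd.read_csv(io.StringIO(MESSY))
tidy = clean(raw)

print(list(tidy.columns))  # ['date', 'city', 'amount']
print(tidy)                # two rows; the "oops" amount has become 0
```

Note that replacing bad amounts with 0 keeps the totals computable but also hides the problem, so counting the coerced values before filling them is a sensible extra check on real data.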