Mini Project ~2–3 Weeks Beginner Friendly

Personal Expense Tracker with AI

Build a Streamlit expense tracking app that categorises your expenses automatically using an LLM and generates monthly insights and charts.

PythonStreamlitOpenAI APIpandasplotlySQLite
Source code + README Milestone breakdown Deployment guide 10 viva Q&A
Start Building → Viva Q&A Learn GenAI First

An intelligent expense tracker where you enter expenses in natural language and the AI automatically categorises them and generates financial insights.

  • Natural language expense entry: "coffee at Starbucks ₹250" → auto-categorised
  • LLM-powered categorisation (Food, Transport, Entertainment, etc.)
  • SQLite database for persistent storage across sessions
  • Monthly summary with interactive plotly charts (pie, bar, trend)
  • AI-generated insights: top spending category, comparison to last month, savings tips
  • CSV export of all transactions

Before You Start

  • Python basics (functions, file I/O)
  • Basic SQL understanding (SELECT, INSERT)
  • Familiarity with OpenAI API or willingness to learn

How It Works

📝 Natural Language
Expense Input
🤖 LLM Categorisation
(GPT-4o-mini)
🗄 SQLite Storage
(persistent)
📊 Plotly Charts
+ AI Insights

Milestone Breakdown

1
Week 1
Database + LLM Categorisation
  • Set up SQLite database (expenses table: date, description, amount, category)
  • Build LLM categorisation function with 10 standard categories
  • Test with 50 diverse expense descriptions
2
Week 2
Streamlit UI + Charts
  • Build expense entry form with date, description, amount
  • Display expense history table with edit/delete
  • Monthly pie chart and weekly trend bar chart with plotly
3
Week 3
AI Insights + Deploy
  • Generate monthly summary with LLM (top categories, unusual spends)
  • Add budget alerts (warn when category exceeds threshold)
  • Deploy to Streamlit Cloud

Core Implementation

categorise.py
import sqlite3
from openai import OpenAI

client = OpenAI()
CATEGORIES = ["Food", "Transport", "Shopping", "Entertainment",
              "Utilities", "Health", "Education", "Other"]

def categorise_expense(description: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role":"user",
                   "content":f"Categorise: '{description}'. Choose from: {CATEGORIES}. Reply with category only."}]
    )
    category = resp.choices[0].message.content.strip()
    return category if category in CATEGORIES else "Other"

Deploy to Streamlit Cloud (Free)

deploy.sh
# requirements.txt: openai streamlit pandas plotly
# Set OPENAI_API_KEY in Streamlit secrets:
# share.streamlit.io → App → Settings → Secrets → OPENAI_API_KEY = "sk-..."

Create a free account at share.streamlit.io, connect your GitHub repo, and deploy in one click. Your app gets a public URL instantly.

10 Viva Questions with Answers

Q1. Why use SQLite for storage instead of a CSV file?
SQLite supports concurrent reads, ACID transactions, SQL queries for aggregation, and handles large datasets efficiently. CSV requires loading everything into memory for every query.
Q2. How do you prevent the LLM from returning an invalid category?
Validate: if response not in CATEGORIES, default to "Other". Add few-shot examples in the prompt showing correct category selection. Use response_format=json_object for structured output.
Q3. What is the cost of categorising 100 expenses with gpt-4o-mini?
Each categorisation: ~50 input tokens + 5 output tokens. 100 expenses: ~5500 tokens total. At $0.15/1M input: approximately $0.001. Very cheap for personal use.
Q4. How would you make this work without an OpenAI API key?
Use a free local model via Ollama (llama3.2:1b). Or use a rule-based fallback: keyword matching (coffee→Food, uber→Transport). Or HuggingFace zero-shot classification.
Q5. How does your app handle multi-user scenarios?
SQLite is single-writer — fine for personal use. For multi-user: PostgreSQL (Supabase), add user_id column, Supabase auth for per-user data isolation.
Q6. What data visualisations did you implement and why?
Pie chart for category breakdown (proportion), bar chart for monthly comparison (trend), line chart for daily spending (pattern). Each tells a different story about spending behavior.
Q7. How would you add budget tracking?
Budget table in SQLite: category, monthly_limit. Compare actual spending per category to limit. Show traffic light indicator (green/amber/red) based on % of budget used.
Q8. What are the security considerations for storing financial data?
Encrypt the SQLite database (sqlcipher). Never log transaction descriptions to console. Use Supabase RLS if multi-user. Don't include API keys in source code (use environment variables/secrets).
Q9. How would you test this application?
Unit tests for categorise_expense (mock OpenAI API), integration tests for SQLite operations, UI tests with Streamlit's testing framework. Test edge cases: duplicate entries, very large amounts, special characters.
Q10. What ML techniques could replace the LLM for categorisation?
Train a text classifier (TF-IDF + LogReg) on a labelled expense dataset. Rule-based with keyword matching + regex. Embedding similarity to category descriptions. LLM is easiest but costliest.
🏆

Mark Project Complete

Record your completion and earn your project certificate.