Step 3: Transform & Model
Clean, transform, and model raw data into analysis-ready business metrics.
Business Logic Design
Goal: Transform raw licensing data into meaningful business health indicators that stakeholders can act on.
Raw Data Challenges
- • 47 different license types
- • Inconsistent business categories
- • Address variations (same business)
- • Status changes over time
Business Questions
- • Which neighborhoods are growing?
- • What business types are trending?
- • Are businesses staying or leaving?
- • How does permit activity correlate?
Data Modeling Strategy
Implemented a dimensional model optimized for time-series business analysis:
-- Core fact table design
CREATE TABLE fact_business_activity (
date_key DATE,
ward_key INTEGER,
business_type_key INTEGER,
license_status_key INTEGER,
-- Metrics
new_licenses INTEGER DEFAULT 0,
renewed_licenses INTEGER DEFAULT 0,
closed_licenses INTEGER DEFAULT 0,
total_active INTEGER DEFAULT 0,
-- Calculated fields
net_change INTEGER GENERATED ALWAYS AS
(new_licenses - closed_licenses) STORED,
growth_rate DECIMAL(5,2),
PRIMARY KEY (date_key, ward_key, business_type_key)
);Framework Progress
✓
Scope & Strategy✓
Data Ingestion✓
Transform & Model4
Load & Validate5
Visualize & Report6
Automate & ScaleTechnical Stack
PythonPandasPostgreSQLGreat ExpectationsDimensional Modeling