{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreigaz2buoaqqo3hjf37v2tyrvckct27e7dh2ptjlyakm7bby25zwca",
"uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mpix4vggm5q2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreidmnpgeznw25iiklnggu4u3j3jgnnguupxdaptnupwvoavzpfr73q"
},
"mimeType": "image/webp",
"size": 74078
},
"path": "/magichu_njoroge_9123627a6/understanding-data-modeling-schemas-relationships-and-joins-1d63",
"publishedAt": "2026-06-30T11:22:24.000Z",
"site": "https://dev.to",
"tags": [
"beginners",
"database",
"microsoft",
"tutorial"
],
"textContent": "## Data Modeling in Power BI: Schemas, Relationships, and Joins Explained\n\nLearning Power BI has been an exciting part of this journey, but my first time in Model view almost changed that. I genuinely had no idea what I was looking at. But with good tutoring, practice, and patience, I finally got the hang of it. Almost... Whenever I hit a wall, I write about it for the next person and to deepen my own understanding. This is data modeling as I understand it.\n\n## What Is Data Modeling?\n\nData modeling is the process of creating visual representations of the connections between data structures, with information about the individual attributes contained within those data structures.\n\nIn simple terms, a data model answers three questions:\n\n * What tables do I have?\n * How do they relate to each other?\n * Which direction do filters travel between them?\n\n\n\nGet this right and your visuals update quickly, your numbers are accurate, and adding new data is easy. Get it wrong and you end up with reports that show inflated totals, filters that don't work, and dashboards that are a nightmare to maintain.\n\nA well-designed data model helps you:\n\n * Understand data requirements.\n * Ensure proper structure for reporting.\n * Align with organizational goals.\n * Maintain data integrity.\n\n\n\n## The Two Types of Tables: Fact and Dimension\n\nMost Power BI data models are built from two kinds of tables. Understanding the difference between them is the single most important foundation in data modeling.\n\n### Fact Tables\n\nThe fact table is at the center of the _star schema_ and stores the core transactional data you want to analyze, such as sales records, orders, or financial transactions. 
Each row in the fact table is unique and contains keys that link it to related _dimension tables_.\n\nFact tables are usually:\n\n * Very long (thousands or millions of rows)\n * Full of numbers (quantities, amounts, counts)\n * Connected to multiple other tables via ID columns\n\n\n\n### Dimension Tables\n\nA _dimension table_ is connected to the fact table and stores the context around the data: the who, what, when, and where. Think customers, products, dates, and stores.\n\nDimension tables are usually:\n\n * Short but wide (fewer rows, more descriptive columns)\n * Full of text and categories\n * The tables you filter and slice by in your reports\n\n\n\n\n ┌─────────────────────────────────────────────────────────────┐\n │ FACT vs DIMENSION │\n ├──────────────────────────┬──────────────────────────────────┤\n │ FACT TABLE │ DIMENSION TABLE │\n │ (Fact_Sales) │ (Dim_Product) │\n ├──────────────────────────┼──────────────────────────────────┤\n │ SaleID (PK) │ │\n │ CustomerID (FK) │ │\n │ ProductID (FK) ──────┼──▶ ProductID (PK) │\n │ DateID (FK) │ ProductName │\n │ StoreID (FK) │ Category │\n │ Quantity │ Price │\n │ SalesAmount │ Supplier │\n ├──────────────────────────┼──────────────────────────────────┤\n │ Rows: Millions │ Rows: Hundreds or Thousands │\n │ Updates: Daily │ Updates: Rarely │\n │ Contains: Numbers │ Contains: Descriptions │\n └──────────────────────────┴──────────────────────────────────┘\n\n\nA good way to remember it: the fact table is _what happened_. The dimension table is _everything around what happened_.\n\n## Schemas: The Blueprint of Your Model\n\nThis is where most beginners, myself included, get lost first. A schema is the structure and organization of data within a data model. It defines how the tables are connected and related. You will mostly meet two types of schema in Power BI.\n\n### The Star Schema\n\nThe _star schema_ is the recommended approach for Power BI. 
One fact table sits at the center. All dimension tables connect directly to it. When you step back and look at it, it resembles a star: the fact table is the center, and the dimensions are the points radiating outward.\n\n\n\n ┌─────────────────┐\n │ Dim_Date │\n │─────────────────│\n │ DateID (PK) │\n │ Day │\n │ Month │\n │ Quarter │\n │ Year │\n └────────┬────────┘\n │ 1\n │\n ┌─────────────────────┼─────────────────────┐\n │ 1 │ N │ 1\n ┌──────┴──────────┐ ┌───────▼──────────┐ ┌──────┴──────────┐\n │ Dim_Customer │ │ Fact_Sales │ │ Dim_Product │\n │─────────────────│ │──────────────────│ │─────────────────│\n │ CustomerID (PK) ├─▶│ SaleID (PK) │◀─┤ ProductID (PK) │\n │ FullName │ │ CustomerID (FK) │ │ ProductName │\n │ Email │ │ ProductID (FK) │ │ Category │\n │ County │ │ DateID (FK) │ │ UnitPrice │\n │ Phone │ │ StoreID (FK) │ │ Supplier │\n └─────────────────┘ │ Quantity │ └─────────────────┘\n │ SalesAmount │\n ┌─────────────│ Discount │\n │ 1 └──────────────────┘\n ┌──────┴──────────┐\n │ Dim_Store │\n │─────────────────│\n │ StoreID (PK) │\n │ StoreName │\n │ Town │\n │ Region │\n └─────────────────┘\n\n\nWhy is the star schema the gold standard?\n\n * Simplifies queries by clearly defining relationships between facts and dimensions.\n * Reduces data redundancy compared with one flat table, because descriptive attributes live in the dimension tables.\n * Improves performance for large datasets and complex analytics.\n * Makes it easy for anyone to look at the model and understand it.\n * Makes adding a new dimension table later straightforward.\n\n\n\n### The Snowflake Schema\n\nThe _snowflake schema_ is an extension of the star schema in which a dimension table is split into multiple related sub-tables. 
A product dimension, for example, might split into a separate category table and a separate supplier table.\n\n\n\n ┌──────────────────┐ ┌──────────────────┐\n │ Dim_Category │ │ Dim_Supplier │\n │──────────────────│ │──────────────────│\n │ CategoryID (PK) │ │ SupplierID (PK) │\n │ CategoryName │ │ SupplierName │\n └────────┬─────────┘ └────────┬─────────┘\n │ 1 │ 1\n │ │\n ▼ N ▼ N\n ┌──────────────────────────────────────────────┐\n │ Dim_Product │\n │──────────────────────────────────────────────│\n │ ProductID (PK) │\n │ ProductName │\n │ CategoryID (FK) ──▶ Dim_Category │\n │ SupplierID (FK) ──▶ Dim_Supplier │\n │ UnitPrice │\n └───────────────────────┬──────────────────────┘\n │ 1\n │\n ▼ N\n ┌────────────────────┐\n │ Fact_Sales │\n │────────────────────│\n │ SaleID (PK) │\n │ ProductID (FK) │\n │ CustomerID (FK) │\n │ SalesAmount │\n └────────────────────┘\n\n\n\n ┌──────────────────────┬────────────┬──────────────┐\n │ │ STAR │ SNOWFLAKE │\n ├──────────────────────┼────────────┼──────────────┤\n │ Query Speed │ Faster │ Slower │\n │ Ease of Use │ Simpler │ More complex │\n │ Power BI Performance │ Ideal │ Not ideal │\n │ Storage │ More space │ Less space │\n └──────────────────────┴────────────┴──────────────┘\n\n\n> Stick with the star schema in Power BI. It is what Microsoft recommends and what most professional models use.\n\n## Relationships: Connecting Your Tables\n\nA relationship defines how two tables are linked, which lets you analyze and visualize data across multiple tables. There are several types of relationships, covered later in this article. In Model view they appear as lines with a 1 or * marker on each end.\n\n### Primary Keys and Foreign Keys\n\nEvery relationship is built on two column types:\n\n * **Primary Key (PK)** — uniquely identifies every row in a table. No duplicates. Example: `ProductID` in the Products table.\n\n * **Foreign Key (FK)** — a column in another table that references that primary key. 
Example: `ProductID` in the Sales table, pointing back to which product was sold.\n\n\n\n\n\n\n Dim_Product Fact_Sales\n ┌──────────────────────┐ ┌──────────────────────┐\n │ ProductID ← PK │─────1─────│ ProductID ← FK │\n │ ProductName │ │ SaleID │\n │ Category │ N │ CustomerID │\n │ UnitPrice │◀──────────│ Quantity │\n └──────────────────────┘ │ SalesAmount │\n └──────────────────────┘\n One product can appear in many sales rows — this is 1:N\n\n\n### Types of Relationships\n\n**One-to-Many (1:N) — the most common**\n\nOne row in the dimension table connects to many rows in the fact table. This is the backbone of every star schema.\n\n\n\n Dim_Customer Fact_Sales\n ┌─────────────────┐ ┌─────────────────┐\n │ CustomerID: C01 │──────────────│ SaleID: 1001 │\n │ Name: Wanjiru │ 1 : N │ CustomerID: C01 │\n └─────────────────┘ │ ├─────────────────┤\n │ │ SaleID: 1002 │\n └─────▶│ CustomerID: C01 │\n ├─────────────────┤\n │ SaleID: 1003 │\n │ CustomerID: C01 │\n └─────────────────┘\n Wanjiru appears once as a customer but has made three purchases.\n\n\n**One-to-One (1:1) — rare**\n\nEach row in one table matches exactly one row in another. Used mostly when splitting a very wide table for performance reasons.\n\n\n\n Dim_Employee Dim_EmployeePrivate\n ┌──────────────────┐ ┌────────────────────────┐\n │ EmployeeID: E01 │────────│ EmployeeID: E01 │\n │ Name: Kamau │ 1:1 │ NationalID: 12345678 │\n │ Department: IT │ │ EmergencyContact: ... │\n └──────────────────┘ └────────────────────────┘\n\n\n**Many-to-Many (N:N) — handle carefully**\n\nMany rows in Table A match many rows in Table B. Power BI can handle this, but it often leads to ambiguous results. 
The clean solution is a bridge table.\n\n\n\n Dim_Student Dim_Course\n ┌───────────────┐ ┌──────────────────┐\n │ StudentID │ │ CourseID │\n │ StudentName │ │ CourseName │\n └───────┬───────┘ └────────┬─────────┘\n │ 1 │ 1\n │ │\n ▼ N Bridge_Enrollment ▼ N\n └─────▶┌─────────────────┐◀┘\n │ StudentID (FK) │\n │ CourseID (FK) │\n │ EnrolledDate │\n └─────────────────┘\n The bridge table resolves the many-to-many into two 1:N links.\n\n\n## Joins: What Happens Behind the Scenes\n\nWhen Power BI evaluates a visual that pulls data from multiple tables, it effectively performs a join: it combines rows from two tables based on a shared column. You rarely write the join yourself in the model, but knowing what type of join is happening helps you understand why some rows appear and others don't.\n\n### Inner Join — Only Matching Rows\n\nReturns only the rows that have a match in both tables. If a row in one table has no match in the other, it does not appear in the results.\n\n\n\n Fact_Sales Dim_Product\n ┌───────────────────┐ ┌───────────────────┐\n │ SaleID │ProductID │ │ ProductID │ Name │\n │ S001 │ P10 │ │ P10 │ Rice │\n │ S002 │ P20 │ │ P20 │ Sugar │\n │ S003 │ P99 │ └───────────────────┘\n └───────────────────┘\n\n INNER JOIN on ProductID\n ▼\n ┌──────────────────────────────────┐\n │ SaleID │ ProductID │ Name │\n │ S001 │ P10 │ Rice │\n │ S002 │ P20 │ Sugar │\n └──────────────────────────────────┘\n Sale S003 disappears — ProductID P99 has no match in Dim_Product\n\n\n### Left Join — Keep Everything from the Left\n\nReturns all rows from the left table, with matching data from the right table. 
Non-matching rows get a blank value rather than being dropped.\n\n\n\n LEFT JOIN on ProductID\n ▼\n ┌──────────────────────────────────┐\n │ SaleID │ ProductID │ Name │\n │ S001 │ P10 │ Rice │\n │ S002 │ P20 │ Sugar │\n │ S003 │ P99 │ (blank) │ ← kept, but no product name\n └──────────────────────────────────┘\n\n\n> In Power BI, a relationship between two tables behaves much like a Left Join with the fact table on the left: fact rows without a matching dimension value show up under a blank rather than disappearing entirely.\n\n## Cross-Filter Direction: Which Way Do Filters Travel?\n\nWhen you click a value in one visual, say \"Nairobi\" on a map, Power BI filters the other visuals on the page. The direction that filter travels between tables is controlled by the cross-filter direction setting on each relationship.\n\n**Single Direction (Default)**\n\nFilters flow from the dimension table toward the fact table only. This is the safe, recommended default.\n\n\n\n Dim_Product Fact_Sales\n ┌─────────────────────┐ ┌──────────────────────┐\n │ Category = \"Flour\" │────filter──▶│ Shows only Flour │\n │ │ one way │ sales rows │\n └─────────────────────┘ └──────────────────────┘\n Filter does NOT travel back\n\n\n**Bidirectional**\n\nFilters flow both ways. This sounds useful but can create ambiguous or circular filter paths and slow your report down significantly.\n\n\n\n Dim_Product ◀────────────▶ Fact_Sales\n ┌─────────────────────┐ both ways ┌──────────────────────┐\n │ ProductName │◀───────────▶│ SalesAmount │\n └─────────────────────┘ └──────────────────────┘\n Use only when you have a specific, tested reason to do so\n\n\n> One thing that caught me off guard while learning this was assuming that because Power BI drew the relationship lines automatically, the model was correct. Sometimes it connected the wrong columns. 
Always double-check the relationships before going any further; it saves time and rework later.\n\n## Quick Reference Cheat Sheet\n\n\n ┌─────────────────────────────────────────────────────────────┐\n │ POWER BI DATA MODELING CHEAT SHEET │\n ├─────────────────────┬───────────────────────────────────────┤\n │ CONCEPT │ WHAT IT MEANS │\n ├─────────────────────┼───────────────────────────────────────┤\n │ Fact Table │ Stores transactions — the numbers │\n │ Dimension Table │ Stores context — the descriptions │\n │ Star Schema │ Fact at center, dimensions around it │\n │ Snowflake Schema │ Dimensions split into sub-tables │\n │ Primary Key (PK) │ Uniquely identifies every row │\n │ Foreign Key (FK) │ References a PK in another table │\n │ 1:N Relationship │ One dimension row → many fact rows │\n │ N:N Relationship │ Needs a bridge table to resolve │\n │ Inner Join │ Only rows that match in both tables │\n │ Left Join │ All rows from left + matches on right │\n │ Single Filter │ Filters flow dimension → fact │\n │ Bidirectional │ Filters flow both ways (use carefully)│\n └─────────────────────┴───────────────────────────────────────┘\n\n\nAs someone still early in this journey, I'd appreciate comments, suggestions, and corrections. They are where I learn the most.",
"title": "UNDERSTANDING DATA MODELING, SCHEMAS, RELATIONSHIPS, AND JOINS."
}