{
"$type": "site.standard.document",
"content": "---\ntitle: \"One git heatmap to rule them all\"\ndescription: \"A Python script that merges contribution data from GitHub and multiple GitLab\n instances into a single interactive SVG heatmap.\"\ntags: [\"tools\", \"python\", \"visualisation\"]\n---\n\nGitHub's contribution heatmap gives me mixed feelings. A year of work, reduced\nto a grid of green squares. But if your work is spread across multiple git\nforges---as mine is, between GitHub and two self-hosted GitLabs at ANU (one for\nteaching, one for research, though the boundary is a bit blurry)---then no\nsingle profile page tells the whole story. My GitHub heatmap has gaps that\naren't actually gaps; they're just weeks where the commits landed somewhere\nelse.\n\nMostly I was just curious what my developer history since joining GitHub in\n2010 (towards the end of my PhD) looked like. So (Claude and) I wrote a script\nto find out. It pulls contribution data from all three forges and renders a\nsingle self-contained SVG covering my entire git history. Here's what it looks\nlike:\n\n<iframe src=\"/assets/contributions.svg\" style=\"display:block;width:100%;height:auto;aspect-ratio:1082/388;border:none;border-radius:6px;background:#0d1117\"></iframe>\n\n:::info[Accurate as of 2026-04-09]\n\nThis isn't a live visualisation---the SVG is a static snapshot generated by\nrunning the script below. It includes data up to the date it was last run.\n\n:::\n\nThe whole thing is a single Python file---about 900 lines, with\n[httpx](https://www.python-httpx.org/) as the only dependency. It uses\n[uv's inline script metadata](https://docs.astral.sh/uv/guides/scripts/#declaring-script-dependencies)\nso you can run it with `uv run contribution_heatmap.py` without setting up a\nvirtual environment. GitHub data comes via the\n[GraphQL contributions API](https://docs.github.com/en/graphql/reference/objects#contributionscollection),\nwhich gives you a nice per-day breakdown by year. 
GitLab is more work---the\nevents API only retains a year or two of history on most self-hosted instances,\nso the script scans all your member projects and queries their commit history\ndirectly via the\n[repository commits endpoint](https://docs.gitlab.com/api/commits/).\n\nThe tiles are week-aggregated rather than daily, because 16 years of daily tiles\nwould produce an unreadably wide image. Each year gets its own row of up to 53\ncolumns (one per Monday-start week), which keeps things compact while still\nletting you spot seasonal patterns.\n\nThe colour scaling uses global quantile normalisation---non-zero weeks are split\ninto quartiles, so a few massive weeks don't wash out everything else. The\npalette matches GitHub's dark theme, because this site is dark-mode only and I\ndidn't fancy debugging a light-mode variant.\n\nThe SVG is fully self-contained: inline CSS, inline JavaScript for the hover\npopovers, no external dependencies at view time. Hover over any tile and you get\na breakdown by source, a day-of-week mini bar chart for that week, and the top\nevent types. The popover is embedded in a `foreignObject` element, which is one\nof those SVG features that feels slightly transgressive but works well in\npractice.[^foreignobject]\n\n[^foreignobject]:\n `foreignObject` lets you embed arbitrary HTML inside an SVG. It's been\n supported in all major browsers for years, but it still feels like you're\n getting away with something.\n\nCaching is simple but effective. For GitHub, past years get saved as JSON files\nand only the current (incomplete) year gets refetched. For GitLab, commits are\ncached per project and only rescanned when a project's `last_activity_at`\nchanges. The initial run takes a few minutes (scanning 1,000+ projects), but\nlater runs are near-instant.\n\nConfiguration is all via environment variables---`GITHUB_USER`, `GITHUB_TOKEN`,\n`GITLAB1_URL`, `GITLAB1_USER`, `GITLAB1_TOKEN`, and so on for a second GitLab\ninstance. GitLab sources also need `AUTHOR_EMAILS`, a comma-separated list of\nyour git author emails, which the script uses to filter each project's commits. 
There's a `--dry-run` flag that shows what would be fetched without\nactually hitting any APIs, which was helpful while getting the request counts\nright.\n\n## The full script\n\n```python\n#!/usr/bin/env python3\n# /// script\n# requires-python = \">=3.12\"\n# dependencies = [\"httpx\"]\n# ///\n\"\"\"Generate a multi-source contribution heatmap as a self-contained interactive SVG.\"\"\"\n\nfrom __future__ import annotations\n\nimport argparse\nimport json\nimport logging\nimport sys\nimport time\nfrom dataclasses import dataclass, field\nfrom datetime import date, datetime, timedelta\nfrom pathlib import Path\n\nimport httpx # ty: ignore[unresolved-import]\n\nlog = logging.getLogger(__name__)\n\n# ── Constants ──────────────────────────────────────────────────────────────────\n\nTILE = 14\nGAP = 2\nCELL = TILE + GAP\nLEFT_MARGIN = 54\nTOP_MARGIN = 72\nRIGHT_MARGIN = 180\nBOTTOM_MARGIN = 44\nMAX_WEEKS = 53\n\nPALETTE = [\"#161b22\", \"#0e4429\", \"#006d32\", \"#26a641\", \"#39d353\"]\nSOURCE_COLORS = {\"github\": \"#6e5494\", \"gitlab1\": \"#fc6d26\", \"gitlab2\": \"#1f9e8e\"}\nSOURCE_LABELS = {\"github\": \"GitHub\", \"gitlab1\": \"Teaching GitLab\", \"gitlab2\": \"Research GitLab\"}\nPOPOVER_LABELS = {\"github\": \"GitHub\", \"gitlab1\": \"Teaching\", \"gitlab2\": \"Research\"}\nTEXT_COLOR = \"#c9d1d9\"\nMUTED_COLOR = \"#8b949e\"\nBG_COLOR = \"#0d1117\"\nPOPOVER_BG = \"#1b1f23\"\nPOPOVER_BORDER = \"#30363d\"\n\nMONTHS = [\"Jan\", \"Feb\", \"Mar\", \"Apr\", \"May\", \"Jun\",\n \"Jul\", \"Aug\", \"Sep\", \"Oct\", \"Nov\", \"Dec\"]\nDAYS = [\"Mon\", \"Tue\", \"Wed\", \"Thu\", \"Fri\", \"Sat\", \"Sun\"]\n\nMAX_RETRIES = 3\nRETRY_DELAYS = [1, 2, 4]\nGITLAB_REQUEST_DELAY = 0.1\n\n\n# ── Data types ─────────────────────────────────────────────────────────────────\n\n@dataclass\nclass DayData:\n count: int = 0\n event_types: dict[str, int] = field(default_factory=dict)\n\n\n@dataclass\nclass WeekData:\n week_start: date\n total: int = 0\n by_source: dict[str, int] = 
field(default_factory=dict)\n by_day: list[int] = field(default_factory=lambda: [0] * 7)\n top_event_types: list[tuple[str, int]] = field(default_factory=list)\n\n\n@dataclass\nclass Stats:\n total: int = 0\n longest_streak: int = 0\n current_streak: int = 0\n most_active_week: tuple[date, int] = field(default_factory=lambda: (date.today(), 0))\n most_active_dow: str = \"Monday\"\n per_year: dict[int, dict[str, int]] = field(default_factory=dict)\n sources_available: list[str] = field(default_factory=list)\n sources_failed: list[str] = field(default_factory=list)\n\n\n# ── HTTP helpers ───────────────────────────────────────────────────────────────\n\ndef _request(client: httpx.Client, method: str, url: str, **kwargs) -> httpx.Response:\n for attempt in range(MAX_RETRIES):\n try:\n resp = client.request(method, url, **kwargs)\n if resp.status_code == 429:\n delay = RETRY_DELAYS[min(attempt, len(RETRY_DELAYS) - 1)]\n log.warning(\"Rate limited on %s, sleeping %ds\", url, delay)\n time.sleep(delay)\n continue\n resp.raise_for_status()\n return resp\n except httpx.TransportError as e:\n if attempt == MAX_RETRIES - 1:\n raise\n delay = RETRY_DELAYS[min(attempt, len(RETRY_DELAYS) - 1)]\n log.warning(\"Request failed (%s), retrying in %ds\", e, delay)\n time.sleep(delay)\n raise RuntimeError(f\"Failed after {MAX_RETRIES} retries: {method} {url}\")\n\n\ndef _graphql(client: httpx.Client, query: str, variables: dict) -> dict:\n resp = _request(client, \"POST\", \"https://api.github.com/graphql\",\n json={\"query\": query, \"variables\": variables})\n data = resp.json()\n if \"errors\" in data:\n raise RuntimeError(f\"GraphQL errors: {data['errors']}\")\n rate = data.get(\"data\", {}).get(\"rateLimit\", {})\n if rate and rate.get(\"remaining\", 100) < 10:\n reset_at = datetime.fromisoformat(rate[\"resetAt\"].replace(\"Z\", \"+00:00\"))\n wait = max(0, (reset_at - datetime.now(reset_at.tzinfo)).total_seconds())\n if wait > 0:\n log.info(\"GitHub rate limit low (%d 
remaining), sleeping %.0fs\",\n rate[\"remaining\"], wait)\n time.sleep(wait)\n return data\n\n\n# ── GitHub fetcher ─────────────────────────────────────────────────────────────\n\nGITHUB_CREATED_QUERY = \"\"\"\nquery($login: String!) {\n rateLimit { remaining resetAt }\n user(login: $login) { createdAt }\n}\"\"\"\n\nGITHUB_CONTRIB_QUERY = \"\"\"\nquery($login: String!, $from: DateTime!, $to: DateTime!) {\n rateLimit { remaining resetAt }\n user(login: $login) {\n contributionsCollection(from: $from, to: $to) {\n contributionCalendar {\n weeks {\n contributionDays {\n date\n contributionCount\n }\n }\n }\n }\n }\n}\"\"\"\n\n\ndef fetch_github(user: str, token: str, start_year: int | None,\n end_date: date, cache_dir: Path, no_cache: bool) -> tuple[dict[date, DayData], int]:\n client = httpx.Client(\n headers={\"Authorization\": f\"bearer {token}\", \"Content-Type\": \"application/json\"},\n timeout=30,\n )\n try:\n resp = _graphql(client, GITHUB_CREATED_QUERY, {\"login\": user})\n created = datetime.fromisoformat(\n resp[\"data\"][\"user\"][\"createdAt\"].replace(\"Z\", \"+00:00\"))\n join_year = created.year\n if start_year is None:\n start_year = join_year\n\n result: dict[date, DayData] = {}\n\n for year in range(start_year, end_date.year + 1):\n cache_file = cache_dir / f\"github_{year}.json\"\n if not no_cache and year < end_date.year and cache_file.exists():\n log.info(\"GitHub %d: using cache\", year)\n cal = json.loads(cache_file.read_text())\n else:\n log.info(\"GitHub %d: fetching\", year)\n from_dt = f\"{year}-01-01T00:00:00Z\"\n to_year = year + 1 if year < end_date.year else end_date.year\n to_month = 1 if year < end_date.year else end_date.month\n to_day = 1 if year < end_date.year else end_date.day\n to_dt = f\"{to_year}-{to_month:02d}-{to_day:02d}T00:00:00Z\"\n if year == end_date.year:\n to_dt = f\"{(end_date + timedelta(days=1)).isoformat()}T00:00:00Z\"\n\n data = _graphql(client, GITHUB_CONTRIB_QUERY,\n {\"login\": user, \"from\": from_dt, 
\"to\": to_dt})\n cal = data[\"data\"][\"user\"][\"contributionsCollection\"][\"contributionCalendar\"]\n cache_dir.mkdir(parents=True, exist_ok=True)\n cache_file.write_text(json.dumps(cal))\n\n for week in cal[\"weeks\"]:\n for day in week[\"contributionDays\"]:\n d = date.fromisoformat(day[\"date\"])\n if d <= end_date:\n result[d] = DayData(count=day[\"contributionCount\"])\n\n return result, start_year\n finally:\n client.close()\n\n\n# ── GitLab fetcher ─────────────────────────────────────────────────────────────\n\ndef _list_gitlab_projects(client: httpx.Client,\n base_url: str) -> list[dict]:\n projects: list[dict] = []\n page = 1\n while True:\n resp = _request(client, \"GET\", f\"{base_url}/api/v4/projects\",\n params={\"membership\": \"true\", \"per_page\": 100,\n \"page\": page, \"simple\": \"true\"})\n batch = resp.json()\n if not batch:\n break\n projects.extend(batch)\n if len(batch) < 100:\n break\n page += 1\n time.sleep(GITLAB_REQUEST_DELAY)\n return projects\n\n\ndef _fetch_project_user_commits(client: httpx.Client, base_url: str,\n proj_id: int,\n author_emails: list[str]) -> list[str]:\n seen: set[str] = set()\n dates: list[str] = []\n for email in author_emails:\n page = 1\n while True:\n try:\n resp = _request(client, \"GET\",\n f\"{base_url}/api/v4/projects/{proj_id}/repository/commits\",\n params={\"author\": email, \"per_page\": 100, \"page\": page})\n except Exception:\n break\n batch = resp.json()\n if not isinstance(batch, list) or not batch:\n break\n for c in batch:\n sha = c.get(\"id\", \"\")\n if sha and sha not in seen:\n seen.add(sha)\n dates.append(c[\"created_at\"][:10])\n if len(batch) < 100:\n break\n page += 1\n time.sleep(GITLAB_REQUEST_DELAY)\n return dates\n\n\ndef fetch_gitlab(base_url: str, username: str, token: str,\n author_emails: list[str], start_year: int | None,\n end_date: date, cache_dir: Path, no_cache: bool,\n source_name: str) -> tuple[dict[date, DayData], int]:\n base_url = base_url.rstrip(\"/\")\n client = 
httpx.Client(headers={\"PRIVATE-TOKEN\": token}, timeout=30)\n try:\n projects = _list_gitlab_projects(client, base_url)\n log.info(\"%s: %d member projects to scan\", source_name, len(projects))\n\n cache_file = cache_dir / f\"{source_name}_project_commits.json\"\n cached: dict[str, dict] = {}\n if not no_cache and cache_file.exists():\n cached = json.loads(cache_file.read_text())\n\n result: dict[date, DayData] = {}\n scanned = 0\n\n for i, proj in enumerate(projects):\n proj_id = str(proj[\"id\"])\n proj_path = proj.get(\"path_with_namespace\", proj_id)\n last_activity = proj.get(\"last_activity_at\", \"\")\n\n if proj_id in cached and not no_cache:\n if cached[proj_id].get(\"last_activity\") == last_activity:\n for d_str in cached[proj_id].get(\"dates\", []):\n _add_commit_date(result, d_str, start_year, end_date)\n continue\n\n commit_dates = _fetch_project_user_commits(\n client, base_url, proj[\"id\"], author_emails)\n scanned += 1\n\n cached[proj_id] = {\n \"last_activity\": last_activity,\n \"path\": proj_path,\n \"dates\": commit_dates,\n }\n\n for d_str in commit_dates:\n _add_commit_date(result, d_str, start_year, end_date)\n\n if commit_dates:\n log.info(\"%s: %s — %d commits\", source_name, proj_path,\n len(commit_dates))\n if scanned % 100 == 0:\n log.info(\"%s: scanned %d/%d projects…\",\n source_name, i + 1, len(projects))\n\n cache_dir.mkdir(parents=True, exist_ok=True)\n cache_file.write_text(json.dumps(cached))\n log.info(\"%s: done — scanned %d projects (%d from cache)\",\n source_name, scanned, len(projects) - scanned)\n\n if result and start_year is None:\n start_year = min(d.year for d in result)\n if start_year is None:\n start_year = end_date.year\n return result, start_year\n finally:\n client.close()\n\n\ndef _add_commit_date(result: dict[date, DayData], d_str: str,\n start_year: int | None, end_date: date) -> None:\n try:\n d = date.fromisoformat(d_str)\n except ValueError:\n return\n if d > end_date:\n return\n if start_year and d.year 
< start_year:\n return\n if d not in result:\n result[d] = DayData()\n result[d].count += 1\n result[d].event_types[\"commits\"] = result[d].event_types.get(\"commits\", 0) + 1\n\n\n# ── Data processing ────────────────────────────────────────────────────────────\n\ndef _week_start(d: date) -> date:\n return d - timedelta(days=d.weekday())\n\n\ndef _year_week_mondays(year: int) -> list[date]:\n \"\"\"All week-start Mondays whose Thursday falls in the given calendar year.\"\"\"\n jan1 = date(year, 1, 1)\n monday = jan1 - timedelta(days=jan1.weekday())\n if (monday + timedelta(days=3)).year < year:\n monday += timedelta(weeks=1)\n\n weeks = []\n while (monday + timedelta(days=3)).year == year:\n weeks.append(monday)\n monday += timedelta(weeks=1)\n return weeks\n\n\ndef merge_sources(sources: dict[str, dict[date, DayData]]) -> dict[date, dict[str, DayData]]:\n daily: dict[date, dict[str, DayData]] = {}\n for src, days in sources.items():\n for d, dd in days.items():\n if d not in daily:\n daily[d] = {}\n daily[d][src] = dd\n return daily\n\n\ndef aggregate_weeks(daily: dict[date, dict[str, DayData]],\n start_year: int, end_date: date\n ) -> dict[int, list[tuple[date, WeekData]]]:\n flat: dict[date, WeekData] = {}\n for d, sources in daily.items():\n ws = _week_start(d)\n if ws not in flat:\n flat[ws] = WeekData(week_start=ws)\n wd = flat[ws]\n dow = d.weekday()\n for src, dd in sources.items():\n wd.total += dd.count\n wd.by_source[src] = wd.by_source.get(src, 0) + dd.count\n wd.by_day[dow] += dd.count\n\n all_event_types: dict[date, dict[str, int]] = {}\n for d, sources in daily.items():\n ws = _week_start(d)\n if ws not in all_event_types:\n all_event_types[ws] = {}\n for dd in sources.values():\n for et, cnt in dd.event_types.items():\n all_event_types[ws][et] = all_event_types[ws].get(et, 0) + cnt\n\n for ws, ets in all_event_types.items():\n if ws in flat:\n flat[ws].top_event_types = sorted(ets.items(), key=lambda x: -x[1])[:3]\n\n by_year: dict[int, 
list[tuple[date, WeekData]]] = {}\n for year in range(start_year, end_date.year + 1):\n mondays = _year_week_mondays(year)\n by_year[year] = [(m, flat.get(m, WeekData(week_start=m))) for m in mondays]\n return by_year\n\n\n# ── Statistics ─────────────────────────────────────────────────────────────────\n\ndef compute_stats(daily: dict[date, dict[str, DayData]],\n weeks_by_year: dict[int, list[tuple[date, WeekData]]],\n end_date: date,\n sources_ok: list[str],\n sources_fail: list[str]) -> Stats:\n stats = Stats(sources_available=sources_ok, sources_failed=sources_fail)\n\n if not daily:\n return stats\n\n all_dates = sorted(daily.keys())\n start = all_dates[0]\n\n stats.total = sum(dd.count for sources in daily.values() for dd in sources.values())\n\n longest = current = 0\n d = start\n while d <= end_date:\n day_total = sum(dd.count for dd in daily.get(d, {}).values())\n if day_total > 0:\n current += 1\n longest = max(longest, current)\n else:\n current = 0\n d += timedelta(days=1)\n stats.longest_streak = longest\n stats.current_streak = current\n\n dow_totals = [0] * 7\n for d, sources in daily.items():\n day_total = sum(dd.count for dd in sources.values())\n dow_totals[d.weekday()] += day_total\n stats.most_active_dow = DAYS[dow_totals.index(max(dow_totals))]\n\n best_week = date.today()\n best_count = 0\n for year_weeks in weeks_by_year.values():\n for monday, wd in year_weeks:\n if wd.total > best_count:\n best_count = wd.total\n best_week = monday\n stats.most_active_week = (best_week, best_count)\n\n for year, year_weeks in weeks_by_year.items():\n year_by_source: dict[str, int] = {}\n for _, wd in year_weeks:\n for src, cnt in wd.by_source.items():\n year_by_source[src] = year_by_source.get(src, 0) + cnt\n year_by_source[\"_total\"] = sum(year_by_source.values())\n stats.per_year[year] = year_by_source\n\n return stats\n\n\n# ── Color levels ───────────────────────────────────────────────────────────────\n\ndef _compute_thresholds(values: list[int]) -> 
list[int]:\n non_zero = sorted(v for v in values if v > 0)\n if not non_zero:\n return [1, 1, 1]\n n = len(non_zero)\n return [\n non_zero[max(0, n * 1 // 4 - 1)],\n non_zero[max(0, n * 2 // 4 - 1)],\n non_zero[max(0, n * 3 // 4 - 1)],\n ]\n\n\ndef _level(value: int, thresholds: list[int]) -> int:\n if value == 0:\n return 0\n if value <= thresholds[0]:\n return 1\n if value <= thresholds[1]:\n return 2\n if value <= thresholds[2]:\n return 3\n return 4\n\n\n# ── SVG builder ────────────────────────────────────────────────────────────────\n\ndef _fmt_date(d: date) -> str:\n return f\"{d.day} {MONTHS[d.month - 1]} {d.year}\"\n\n\ndef _fmt_date_short(d: date) -> str:\n return f\"{d.day} {MONTHS[d.month - 1]}\"\n\n\ndef build_svg(weeks_by_year: dict[int, list[tuple[date, WeekData]]],\n stats: Stats) -> str:\n years = sorted(weeks_by_year.keys())\n num_years = len(years)\n\n grid_w = MAX_WEEKS * CELL\n grid_h = num_years * CELL\n svg_w = LEFT_MARGIN + grid_w + RIGHT_MARGIN\n svg_h = TOP_MARGIN + grid_h + BOTTOM_MARGIN\n\n global_totals = [wd.total for yws in weeks_by_year.values() for _, wd in yws]\n global_thresh = _compute_thresholds(global_totals)\n\n active_sources = stats.sources_available\n\n parts: list[str] = []\n\n parts.append(f'<svg xmlns=\"http://www.w3.org/2000/svg\" '\n f'viewBox=\"0 0 {svg_w} {svg_h}\" '\n f'width=\"100%\">')\n\n # Background\n parts.append(f'<rect width=\"{svg_w}\" height=\"{svg_h}\" fill=\"{BG_COLOR}\" rx=\"6\"/>')\n\n # CSS\n parts.append('<style><![CDATA[')\n parts.append(f'text {{ fill: {TEXT_COLOR}; '\n f'font-family: -apple-system, BlinkMacSystemFont, \"Segoe UI\", sans-serif; }}')\n parts.append(f'.year-label {{ font-size: 11px; fill: {MUTED_COLOR}; }}')\n parts.append(f'.month-label {{ font-size: 10px; fill: {MUTED_COLOR}; }}')\n parts.append(f'.stat-text {{ font-size: 11px; fill: {MUTED_COLOR}; }}')\n parts.append(f'.title {{ font-size: 14px; font-weight: 600; fill: {TEXT_COLOR}; }}')\n parts.append(f'.year-total {{ font-size: 
10px; fill: {MUTED_COLOR}; text-anchor: end; }}')\n parts.append(f'.tile {{ rx: 2; ry: 2; }}')\n parts.append(']]></style>')\n\n # Title\n source_names = \" + \".join(SOURCE_LABELS[s] for s in active_sources)\n if stats.sources_failed:\n source_names += \" (\" + \", \".join(\n f\"{SOURCE_LABELS.get(s, s)} unavailable\" for s in stats.sources_failed) + \")\"\n parts.append(f'<text x=\"{LEFT_MARGIN}\" y=\"20\" class=\"title\">'\n f'Contributions across {source_names} \\u00b7 {years[0]}\\u2013{years[-1]}</text>')\n\n # Stats strip\n maw_date = _fmt_date_short(stats.most_active_week[0])\n stat_line = (f\"Total: {stats.total:,} \\u00b7 \"\n f\"Longest streak: {stats.longest_streak:,} days \\u00b7 \"\n f\"Most active week: {maw_date} ({stats.most_active_week[1]:,}) \\u00b7 \"\n f\"Most active day: {stats.most_active_dow}\")\n parts.append(f'<text x=\"{LEFT_MARGIN}\" y=\"38\" class=\"stat-text\">{stat_line}</text>')\n\n # Month labels (based on last year)\n last_year = years[-1]\n last_mondays = [m for m, _ in weeks_by_year[last_year]]\n for month_idx in range(12):\n first_of_month = date(last_year, month_idx + 1, 1)\n ws = _week_start(first_of_month)\n if ws in last_mondays:\n col = last_mondays.index(ws)\n else:\n closest = min(last_mondays, key=lambda m: abs((m - ws).days))\n col = last_mondays.index(closest)\n x = LEFT_MARGIN + col * CELL\n parts.append(f'<text x=\"{x}\" y=\"{TOP_MARGIN - 6}\" class=\"month-label\">'\n f'{MONTHS[month_idx]}</text>')\n\n # Tile grid + year labels + sidebar\n js_data: dict[str, dict] = {}\n\n for row, year in enumerate(years):\n y_base = TOP_MARGIN + row * CELL\n\n # Year label\n parts.append(f'<text x=\"{LEFT_MARGIN - 6}\" y=\"{y_base + TILE - 2}\" '\n f'class=\"year-label\" text-anchor=\"end\">{year}</text>')\n\n year_weeks = weeks_by_year[year]\n for col, (monday, wd) in enumerate(year_weeks):\n lv = _level(wd.total, global_thresh)\n x = LEFT_MARGIN + col * CELL\n y = y_base\n\n tile_key = f\"{year}-{col}\"\n\n # Fallback title\n 
source_parts = []\n for s in active_sources:\n v = wd.by_source.get(s, 0)\n short = s.upper().replace(\"GITLAB\", \"GL\").replace(\"GITHUB\", \"GH\")\n source_parts.append(f\"{short}:{v}\")\n title_text = (f\"Week of {_fmt_date(monday)} \\u00b7 \"\n f\"{wd.total} contributions ({' '.join(source_parts)})\")\n\n parts.append(\n f'<rect class=\"tile\" id=\"t-{tile_key}\" data-k=\"{tile_key}\" '\n f'x=\"{x}\" y=\"{y}\" width=\"{TILE}\" height=\"{TILE}\" '\n f'fill=\"{PALETTE[lv]}\">'\n f'<title>{title_text}</title></rect>')\n\n # JS data for popover\n js_entry: dict = {\n \"t\": wd.total,\n \"s\": {s: wd.by_source.get(s, 0) for s in active_sources},\n \"d\": wd.by_day,\n \"l\": f\"Week of {_fmt_date(monday)}\",\n }\n js_data[tile_key] = js_entry\n\n # Year total + source bar (right sidebar)\n year_total = stats.per_year.get(year, {}).get(\"_total\", 0)\n sidebar_x = LEFT_MARGIN + MAX_WEEKS * CELL + 10\n parts.append(f'<text x=\"{sidebar_x + 40}\" y=\"{y_base + TILE - 2}\" '\n f'class=\"year-total\">{year_total:,}</text>')\n\n # Stacked source bar\n bar_x = sidebar_x + 48\n bar_w = 100\n bar_h = 8\n bar_y = y_base + (TILE - bar_h) // 2\n parts.append(f'<rect x=\"{bar_x}\" y=\"{bar_y}\" width=\"{bar_w}\" '\n f'height=\"{bar_h}\" fill=\"#21262d\" rx=\"2\"/>')\n if year_total > 0:\n offset = 0\n for s in active_sources:\n s_count = stats.per_year.get(year, {}).get(s, 0)\n if s_count == 0:\n continue\n w = max(1, round(s_count / year_total * bar_w))\n w = min(w, bar_w - offset)\n parts.append(f'<rect x=\"{bar_x + offset}\" y=\"{bar_y}\" width=\"{w}\" '\n f'height=\"{bar_h}\" fill=\"{SOURCE_COLORS[s]}\" rx=\"2\"/>')\n offset += w\n\n # Legend\n legend_y = TOP_MARGIN + num_years * CELL + 16\n legend_x = LEFT_MARGIN\n parts.append(f'<text x=\"{legend_x}\" y=\"{legend_y + 10}\" '\n f'style=\"font-size:10px;fill:{MUTED_COLOR}\">Less</text>')\n for i, color in enumerate(PALETTE):\n bx = legend_x + 30 + i * (TILE + 2)\n parts.append(f'<rect x=\"{bx}\" y=\"{legend_y}\" 
width=\"{TILE}\" '\n f'height=\"{TILE}\" fill=\"{color}\" rx=\"2\"/>')\n parts.append(f'<text x=\"{legend_x + 30 + 5 * (TILE + 2) + 4}\" y=\"{legend_y + 10}\" '\n f'style=\"font-size:10px;fill:{MUTED_COLOR}\">More</text>')\n\n # Source legend\n src_legend_x = legend_x + 160\n for i, s in enumerate(active_sources):\n sx = src_legend_x + i * 90\n parts.append(f'<rect x=\"{sx}\" y=\"{legend_y + 2}\" width=\"10\" height=\"10\" '\n f'fill=\"{SOURCE_COLORS[s]}\" rx=\"2\"/>')\n parts.append(f'<text x=\"{sx + 14}\" y=\"{legend_y + 10}\" '\n f'style=\"font-size:10px;fill:{MUTED_COLOR}\">{SOURCE_LABELS[s]}</text>')\n\n # Popover foreignObject\n parts.append(\n f'<foreignObject id=\"popover\" x=\"0\" y=\"0\" width=\"280\" height=\"320\" display=\"none\">'\n f'<div xmlns=\"http://www.w3.org/1999/xhtml\" id=\"popover-content\" '\n f'style=\"background:{POPOVER_BG};color:{TEXT_COLOR};padding:10px 12px;'\n f'border-radius:6px;font-family:-apple-system,BlinkMacSystemFont,\\'Segoe UI\\',sans-serif;'\n f'font-size:11px;line-height:1.5;border:1px solid {POPOVER_BORDER};'\n f'box-shadow:0 4px 12px rgba(0,0,0,0.5);\">'\n f'</div>'\n f'</foreignObject>')\n\n # JS\n sources_js = json.dumps([[s, POPOVER_LABELS[s], SOURCE_COLORS[s]]\n for s in active_sources])\n data_js = json.dumps(js_data, separators=(\",\", \":\"))\n\n js = _build_js(data_js, sources_js, svg_w, svg_h, CELL)\n parts.append(f'<script type=\"text/ecmascript\"><![CDATA[\\n{js}\\n]]></script>')\n\n parts.append('</svg>')\n return \"\\n\".join(parts)\n\n\ndef _build_js(data_js: str, sources_js: str,\n svg_w: int, svg_h: int, cell: int) -> str:\n return f\"\"\"\nvar W={data_js};\nvar S={sources_js};\nvar SVG_W={svg_w},SVG_H={svg_h},CELL={cell};\nvar fo=document.getElementById('popover');\nvar pc=document.getElementById('popover-content');\n\ndocument.querySelectorAll('.tile').forEach(function(el){{\n el.addEventListener('mouseenter',show);\n el.addEventListener('mouseleave',hide);\n}});\n\nfunction show(evt){{\n var 
k=evt.target.dataset.k;\n if(!k||!W[k])return;\n var d=W[k];\n var h='<div style=\"font-weight:600;margin-bottom:4px\">'+d.l+'</div>';\n h+='<div style=\"margin-bottom:8px\">'+d.t+' contribution'+(d.t!==1?'s':'')+'</div>';\n\n var maxD=Math.max.apply(null,d.d.concat([1]));\n h+='<div style=\"display:flex;gap:1px;height:28px;align-items:flex-end;margin-bottom:8px\">';\n var days=['M','T','W','T','F','S','S'];\n for(var i=0;i<7;i++){{\n var pct=d.d[i]?Math.max(d.d[i]/maxD*100,8):0;\n var bg=d.d[i]?'#39d353':'#21262d';\n h+='<div style=\"flex:1;display:flex;flex-direction:column;align-items:center;gap:2px\">';\n h+='<div style=\"flex:1;width:100%;display:flex;align-items:flex-end\">';\n h+='<div style=\"width:100%;background:'+bg+';height:'+pct+'%;border-radius:1px\"></div>';\n h+='</div>';\n h+='<div style=\"font-size:8px;color:{MUTED_COLOR}\">'+days[i]+'</div>';\n h+='</div>';\n }}\n h+='</div>';\n\n var maxS=1;\n for(var i=0;i<S.length;i++){{var v=d.s[S[i][0]]||0;if(v>maxS)maxS=v;}}\n for(var i=0;i<S.length;i++){{\n var src=S[i],v=d.s[src[0]]||0;\n var pct=v/maxS*100;\n h+='<div style=\"display:flex;align-items:center;gap:6px;margin:2px 0\">';\n h+='<span style=\"width:56px;color:{MUTED_COLOR};font-size:10px\">'+src[1]+'</span>';\n h+='<div style=\"flex:1;height:6px;background:#21262d;border-radius:2px\">';\n h+='<div style=\"width:'+pct+'%;height:100%;background:'+src[2]+';border-radius:2px;min-width:'+(v?'2px':'0')+'\"></div>';\n h+='</div>';\n h+='<span style=\"width:24px;text-align:right;font-size:10px\">'+v+'</span>';\n h+='</div>';\n }}\n\n pc.innerHTML=h;\n\n var tx=parseFloat(evt.target.getAttribute('x'));\n var ty=parseFloat(evt.target.getAttribute('y'));\n var pw=280,ph=320;\n var px=tx+CELL+4;\n var py=ty-40;\n if(px+pw>SVG_W-20)px=tx-pw-4;\n if(py<10)py=10;\n if(py+ph>SVG_H-10)py=Math.max(10,SVG_H-ph-10);\n fo.setAttribute('x',px);\n fo.setAttribute('y',py);\n fo.setAttribute('display','inline');\n}}\n\nfunction hide(){{\n 
fo.setAttribute('display','none');\n}}\n\"\"\"\n\n\n# ── CLI ────────────────────────────────────────────────────────────────────────\n\ndef parse_args() -> argparse.Namespace:\n p = argparse.ArgumentParser(description=\"Multi-source contribution heatmap generator\")\n p.add_argument(\"--output\", default=None, help=\"Output SVG path\")\n p.add_argument(\"--start-year\", type=int, default=None, help=\"Override earliest year\")\n p.add_argument(\"--end-date\", type=str, default=None, help=\"Override end date (YYYY-MM-DD)\")\n p.add_argument(\"--cache-dir\", type=str, default=\".cache\", help=\"Cache directory\")\n p.add_argument(\"--no-cache\", action=\"store_true\", help=\"Skip cache, always refetch\")\n p.add_argument(\"--dry-run\", action=\"store_true\", help=\"Show what would be fetched\")\n p.add_argument(\"-v\", \"--verbose\", action=\"store_true\", help=\"Verbose output\")\n return p.parse_args()\n\n\ndef main():\n args = parse_args()\n logging.basicConfig(\n level=logging.DEBUG if args.verbose else logging.INFO,\n format=\"%(levelname)s: %(message)s\",\n )\n\n import os\n end_date = date.fromisoformat(args.end_date) if args.end_date else date.today()\n output = args.output or os.environ.get(\"OUTPUT_PATH\", \"contributions.svg\")\n cache_dir = Path(args.cache_dir)\n start_year = args.start_year\n\n github_user = os.environ.get(\"GITHUB_USER\")\n github_token = os.environ.get(\"GITHUB_TOKEN\")\n gl1_url = os.environ.get(\"GITLAB1_URL\")\n gl1_user = os.environ.get(\"GITLAB1_USER\")\n gl1_token = os.environ.get(\"GITLAB1_TOKEN\")\n gl2_url = os.environ.get(\"GITLAB2_URL\")\n gl2_user = os.environ.get(\"GITLAB2_USER\")\n gl2_token = os.environ.get(\"GITLAB2_TOKEN\")\n author_emails = [e.strip() for e in\n os.environ.get(\"AUTHOR_EMAILS\", \"\").split(\",\") if e.strip()]\n\n fetch_plan = []\n if github_user and github_token:\n fetch_plan.append((\"github\", f\"GitHub ({github_user})\"))\n if gl1_url and gl1_user and gl1_token:\n fetch_plan.append((\"gitlab1\", 
f\"GitLab #1 ({gl1_user}@{gl1_url})\"))\n if gl2_url and gl2_user and gl2_token:\n fetch_plan.append((\"gitlab2\", f\"GitLab #2 ({gl2_user}@{gl2_url})\"))\n\n if not fetch_plan:\n print(\"No sources configured. Set GITHUB_USER/GITHUB_TOKEN and/or \"\n \"GITLAB1_URL/GITLAB1_USER/GITLAB1_TOKEN environment variables.\",\n file=sys.stderr)\n sys.exit(1)\n\n has_gitlab = any(n.startswith(\"gitlab\") for n, _ in fetch_plan)\n if has_gitlab and not author_emails:\n print(\"AUTHOR_EMAILS is required for GitLab sources (comma-separated \"\n \"git author emails).\", file=sys.stderr)\n sys.exit(1)\n\n if args.dry_run:\n print(\"Would fetch from:\")\n year_range = f\"{start_year or '(auto-detect)'} to {end_date.year}\"\n for name, desc in fetch_plan:\n est = (end_date.year - (start_year or 2010) + 1)\n if name == \"github\":\n print(f\" {desc}: ~{est} GraphQL queries (years {year_range})\")\n else:\n print(f\" {desc}: ~{est * 10} REST requests (years {year_range}, est.)\")\n print(f\"Output: {output}\")\n return\n\n sources: dict[str, dict[date, DayData]] = {}\n sources_ok: list[str] = []\n sources_fail: list[str] = []\n earliest_year = start_year\n\n # Fetch GitHub first so its join year can serve as the floor for GitLab\n if github_user and github_token:\n log.info(\"Fetching GitHub (%s)\", github_user)\n try:\n data, detected_start = fetch_github(\n github_user, github_token, start_year, end_date, cache_dir, args.no_cache)\n sources[\"github\"] = data\n sources_ok.append(\"github\")\n if earliest_year is None or detected_start < earliest_year:\n earliest_year = detected_start\n log.info(\"GitHub (%s): %d days with activity\", github_user, len(data))\n except Exception as e:\n log.warning(\"Failed to fetch GitHub: %s\", e)\n sources_fail.append(\"github\")\n\n for name, desc in fetch_plan:\n if name == \"github\":\n continue\n log.info(\"Fetching %s\", desc)\n try:\n if name == \"gitlab1\":\n assert gl1_url and gl1_user and gl1_token\n data, detected_start = fetch_gitlab(\n 
gl1_url, gl1_user, gl1_token, author_emails,\n earliest_year, end_date, cache_dir, args.no_cache,\n \"gitlab1\")\n elif name == \"gitlab2\":\n assert gl2_url and gl2_user and gl2_token\n data, detected_start = fetch_gitlab(\n gl2_url, gl2_user, gl2_token, author_emails,\n earliest_year, end_date, cache_dir, args.no_cache,\n \"gitlab2\")\n else:\n continue\n\n sources[name] = data\n sources_ok.append(name)\n if earliest_year is None or detected_start < earliest_year:\n earliest_year = detected_start\n log.info(\"%s: %d days with activity\", desc, len(data))\n except Exception as e:\n log.warning(\"Failed to fetch %s: %s\", desc, e)\n sources_fail.append(name)\n\n if not sources:\n print(\"All sources failed. Cannot generate heatmap.\", file=sys.stderr)\n sys.exit(1)\n\n if earliest_year is None:\n earliest_year = end_date.year\n\n daily = merge_sources(sources)\n weeks_by_year = aggregate_weeks(daily, earliest_year, end_date)\n stats = compute_stats(daily, weeks_by_year, end_date, sources_ok, sources_fail)\n svg = build_svg(weeks_by_year, stats)\n\n Path(output).write_text(svg)\n print(f\"Written {len(svg):,} bytes to {output}\")\n print(f\" {stats.total:,} total contributions across {len(sources_ok)} sources, \"\n f\"{earliest_year}-{end_date.year}\")\n\n\nif __name__ == \"__main__\":\n main()\n```\n",
"createdAt": "2026-05-13T23:14:36.584Z",
"description": "A Python script that merges contribution data from GitHub and multiple GitLab instances into a single interactive SVG heatmap.",
"path": "/blog/2026/04/09/one-git-heatmap-to-rule-them-all",
"publishedAt": "2026-04-09T00:00:00.000Z",
"site": "at://did:plc:tevykrhi4kibtsipzci76d76/site.standard.publication/self",
"tags": [
"tools",
"python",
"visualisation"
],
"textContent": "A Python script that merges contribution data from GitHub and multiple GitLab instances into a single interactive SVG heatmap.",
"title": "One git heatmap to rule them all"
}