Raw Record Source

{
  "$type": "site.standard.document",
  "canonicalUrl": "https://johnnyreilly.com/posts/schemar-github-action-to-validate-structured-data",
  "description": "This post demonstrates how to use Schemar to validate structured data using a GitHub Action.",
  "path": "/posts/schemar-github-action-to-validate-structured-data",
  "publishedAt": "2024-01-02T00:00:00.000Z",
  "site": "at://did:plc:yy3apqjlms24kso7ahn7lbmb/site.standard.publication/3mova7c4nho2b",
  "tags": [
    "seo"
  ],
  "textContent": "Of late, I've found myself getting more and more into structured data. Structured data is a way of adding machine-readable information to web pages. To entertain myself, I liken it to static typing for websites. I've written about structured data before, but in this post I want to focus on how to validate structured data.\n\nSpecifically, how can we validate structured data in the context of a GitHub workflow? I've created a GitHub Action called Schemar that facilitates just that. In this post we'll see how to use it.\n\n\n\nIf you'd like to read more about structured data, you might like to read these posts:\n\n- Structured data, SEO and React\n- How we fixed my SEO\n- Docusaurus blogs: adding breadcrumb structured data\n\nWhat is Schemar?\n\nSchemar is a GitHub Action that validates structured data. It's a wrapper around the Schema Markup Validator tool.\n\nIf you haven't heard of Schema.orgs validator; it originally started at Google as the Structured Data Testing Tool but was repurposed and gifted to the community.\n\nThat tool is a website; Schemar is a wrapper around the tool that makes it easy to validate structured data in the context of a GitHub workflow. Let's imagine it's very important to you that your structured data is both present and valid. You could use Schemar to validate your structured data as part of your CI/CD pipeline.\n\nImagine Schemar to be the structured data equivalent of the lighthouse-ci-action GitHub Action.\n\nUsing Schemar\n\nI'm going to take my blog (that's what you're reading right now BTW) and use Schemar to validate the structured data on it. I already have a GitHub Action that builds and deploys my blog to a staging environment in Azure Static Web Apps and validates it with Lighthouse. So I'm going to add Schemar to that.\n\nBut before we do that, let's look at simple usage of Schemar. 
If you were to add a .github/workflows/schemar.yml file to your repo with the following contents:\n\nThen you'd have a GitHub workflow that would validate the structured data on https://johnnyreilly.com and fail if it wasn't valid.\n\nThe urls input of Schemar is a list of URLs to validate. In this case, we're validating just one. The results look like this:\n\n> Validating https://johnnyreilly.com for structured data...\n>\n> https://johnnyreilly.com has structured data of these types:\n>\n> - Organization / Brand\n> - WebSite\n> - Blog\n>\n> For more details see https://validator.schema.org/#url=https%3A%2F%2Fjohnnyreilly.com\n\nWe can see that the home page of my blog has structured data of the types Organization / Brand, WebSite and Blog. And we can even click into the Schema Markup Validator to see the details.\n\nIf at some point I were to omit or break the structured data on my blog, then Schemar would fail the build. This is a great way to ensure that your structured data is always present and valid.\n\nWe're going to see what usage looks like in a minute, as we dive into a more sophisticated example.\n\nSurfacing Schemar results in your pull requests\n\nNow that we've seen a basic example, let's see what it looks like to use Schemar in a more sophisticated way. We're going to add Schemar to run against my blog's pull request previews, in the same way we're already running Lighthouse against them.\n\nAdding Schemar to the GitHub Action\n\nI won't reiterate the whole GitHub workflow that spins up a preview environment here, but I'll show the key parts. You can see the whole thing in the build-and-deploy-static-web-app.yml of the blog repo. 
You'll note I'm using Azure Static Web Apps to host my blog - but any web platform will do.\n\nHere is the key part of the GitHub workflow:\n\nAlong with the following structuredDataCommentMaker.mjs script:\n\nLet's break this down:\n\n- We're using the nev7n/wait_for_response GitHub Action to wait for the preview to be available. This is because the preview URL is not available immediately after the preview is created.\n- We're running Schemar against four URLs in our pull request preview. These pages should have structured data, and if any fail then it's likely a sign that something has gone wrong with my site's structured data story.\n- We then take the output of the Schemar run and format it into a comment that we can add to the pull request - to do that we use the structuredDataCommentMaker.mjs script.\n- Finally, we add the comment to the pull request using the marocchino/sticky-pull-request-comment GitHub Action.\n\nTesting it out\n\nLet's see what this looks like in action. I've created a pull request that breaks the structured data on my blog. This is what the pull request looks like:\n\nThe question is, what does the pull request look like after the GitHub Action has run? Here's the answer:\n\nIt failed! And it put a comment on the PR that looks like this:\n\nLet's unbreak the structured data and see what happens:\n\nIt succeeded! And it put a comment on the PR that looks like this:\n\nThis is great! It means that I can be confident that my structured data is always present and valid. And if it isn't, then I'll know about it. I can even click through to the Schema Markup Validator to see the details.\n\nConclusion\n\nMy hope is that Schemar can be used to increase the quality of structured data on the web. I'm using it to increase the quality of structured data on my blog. I hope you'll find it useful too.\n\nI've also shared this with the good folk of Schema.org in the hope they'll find it useful too. The source code of Schemar can be found here.",
  "title": "Schemar: a GitHub Action to validate structured data"
}