Raw Record Source

{
  "path": "/posts/2019/2019-05-29-unix",
  "site": "at://did:plc:mracrip6qu3vw46nbewg44sm/site.standard.publication/self",
  "tags": [
    "code",
    "unix"
  ],
  "$type": "site.standard.document",
  "title": "Pipelines and your Unix toolbox",
  "updatedAt": "2019-05-29T00:00:00.000Z",
  "publishedAt": "2019-05-29T00:00:00.000Z",
  "textContent": "Unix commands are great for manipulating data and files. They get even better when used in shell pipelines. The following are a few of my go-tos -- I'll list the commands with an example or two. While many of the commands can be used standalone, I'll provide examples that assume the input is piped in because that's how you'd used these commands in a pipeline. Lastly, most of these commands are pretty simple and that is by design -- the Unix philosophy focuses of simple, modular code, which can be composed to perform more complex operations.\n\nNote: if you're using a Mac, the builtin tools shipped with macOS might behave a little differently than the most recent versions. You can get more recently compiled version of these tools by running brew install coreutils. The typically usage is then ghead instead of head, gtail instead of tail, gpaste instead of paste, etc.\n\nHere are the commands and a one sentence description:\n\nPlease forgive the \"useless use of cat\". I'm using cat to show the pipe-able versions of the piped-to commands.\n\nPrint the first 5 lines\n\nPrint the last 5 lines:\n\nPrint all but the first line\n\nSend all lines of output as arguments to echo\n\nSend each line individually as an argument to echo\n\nSend two arguments at a time to echo\n\nPrint only column two of each line (assumes whitespace between columns)\n\nPrint only column two of each line with , separators\n\nor\n\nString replace each line matching 1, 2, or 3 using regex with the letter 'x'\n\nCount the number of times each line appears in the file\n\nWrite out the intermediate result to a file in the middle of a pipeline using tee\n\nThe above are a bunch of \"tools\" to file away for when you need them. 
I like to store them along with a few keywords or a sentence describing what each does for easy searching and recall.\n\nNow, let's consider the following example file myfile.txt:\n\nWe want to create files in the current folder using the names in column two, but only for rows where the uuid in column one starts with a \"2\".\n\nBreak down the transformation into parts:\n\n- remove the first line (the header)\n- filter for lines starting with 2\n- grab the second column using a , separator\n- use the result to create the files with touch\n\nConfirm it worked:\n\nAs you add new tools to your toolbox, you can plug them into your shell one-liners to manipulate data streams.\n\nSome other useful commands for text processing worth looking into:\n\npaste\ntr\nwc\njq",
  "canonicalUrl": "https://www.danielcorin.com/posts/2019/2019-05-29-unix"
}
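
The worked example described in textContent (drop the header, keep rows whose uuid starts with 2, take column two, create the files) could be sketched roughly as the pipeline below. The record does not include the contents of myfile.txt, so the sample rows, the file names in column two, and the scratch directory are all hypothetical. The specific commands chosen for each step are one reasonable reading of the four steps, not necessarily the author's exact one-liner.

```shell
# Work in a scratch directory so the example leaves no stray files behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: the original myfile.txt contents are not in the record.
cat > myfile.txt <<'EOF'
uuid,name
1a2b,alpha.txt
2c3d,beta.txt
2e4f,gamma.txt
3a5b,delta.txt
EOF

# tail -n +2  : skip the header line
# grep '^2'   : keep rows whose first column starts with "2"
# cut -d, -f2 : take the second comma-separated column
# xargs touch : create one empty file per remaining name
cat myfile.txt | tail -n +2 | grep '^2' | cut -d, -f2 | xargs touch

# Confirm it worked: list the directory contents
ls
```

With this sample input, only the names on the rows starting with 2 are passed to touch. The leading cat keeps the same "pipe-able" style the post uses for its other examples.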