Pipelines and your Unix toolbox

Dan Corin May 29, 2019
Unix commands are great for manipulating data and files. They get even better when used in shell pipelines. The following are a few of my go-tos -- I'll list the commands with an example or two. While many of the commands can be used standalone, I'll provide examples that assume the input is piped in because that's how you'd use these commands in a pipeline. Lastly, most of these commands are pretty simple and that is by design -- the Unix philosophy focuses on simple, modular code, which can be composed to perform more complex operations.

Note: if you're using a Mac, the builtin tools shipped with macOS are BSD variants and might behave a little differently than the GNU versions. You can install the GNU versions of these tools by running brew install coreutils. The typical usage is then ghead instead of head, gtail instead of tail, gpaste instead of paste, etc.

Here are the commands and a one sentence description:

Please forgive the "useless use of cat". I'm using cat to show the pipe-able versions of the piped-to commands.

- Print the first 5 lines
- Print the last 5 lines
- Print all but the first line
- Send all lines of output as arguments to echo
- Send each line individually as an argument to echo
- Send two arguments at a time to echo
- Print only column two of each line (assumes whitespace between columns)
- Print only column two of each line with , separators
- On each line, use a regex to replace 1, 2, or 3 with the letter 'x'
- Count the number of times each line appears in the file
- Write out the intermediate result to a file in the middle of a pipeline using tee

The above are a bunch of "tools" to file away for when you need them. I like to store them along with a few keywords or a sentence describing what each does for easy searching and recall.

Now, let's consider the following example file myfile.txt:

We want to create files in the current folder using the names in column two, where the uuids in column one start with a "2".
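To make the toolbox concrete, here is a rough sketch of one plausible command for each task listed above, run against a made-up myfile.txt. The file contents (uuids and names), the exact flags, the scratch directory, and the intermediate.txt file name are all assumptions for illustration and not necessarily what was originally used. In particular, the sed line reads "matching 1, 2, or 3" as replacing every occurrence of those digits.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)"

# Hypothetical sample data: these uuids and names are made up for illustration.
cat > myfile.txt <<'EOF'
uuid,name
1a2b,alpha
2c3d,beta
3e4f,gamma
2g5h,delta
2c3d,beta
EOF

# Print the first 5 lines
cat myfile.txt | head -n 5

# Print the last 5 lines
cat myfile.txt | tail -n 5

# Print all but the first line
cat myfile.txt | tail -n +2

# Send all lines of output as arguments to a single echo call
cat myfile.txt | xargs echo

# Send each line individually as an argument to echo
cat myfile.txt | xargs -L 1 echo

# Send two arguments at a time to echo
cat myfile.txt | xargs -n 2 echo

# Print only column two of each line (awk splits on whitespace by default,
# so tr first turns the commas into spaces)
cat myfile.txt | tr ',' ' ' | awk '{print $2}'

# Print only column two of each line with , separators
cat myfile.txt | cut -d ',' -f 2
# or
cat myfile.txt | awk -F ',' '{print $2}'

# Replace every 1, 2, or 3 with the letter 'x' using a regex
cat myfile.txt | sed 's/[123]/x/g'

# Count the number of times each line appears in the file
cat myfile.txt | sort | uniq -c

# Write the intermediate result to a file in the middle of a pipeline
cat myfile.txt | tail -n +2 | tee intermediate.txt | cut -d ',' -f 2
```

On macOS with coreutils installed, the same sketch should work with ghead, gtail, and so on swapped in where you want the GNU behavior.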
Break down the transformation into parts:
- remove the first line (the header)
- filter for lines starting with 2
- grab the second column using a , separator
- use the result to create the files with touch

Confirm it worked:

As you add new tools to your toolbox, you can plug them into your shell one-liners to manipulate data streams.

Some other useful commands for text processing worth looking into:
- paste
- tr
- wc
- jq
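Putting the four steps of the breakdown together gives something like the pipeline below. This is only a sketch: the myfile.txt contents are invented for illustration, the scratch directory stands in for "the current folder", and using grep for the filter and xargs to hand the names to touch is an assumption about how the steps could be wired up.

```shell
# Work in a scratch directory so the files created by touch stay contained.
cd "$(mktemp -d)"

# Hypothetical sample data: these uuids and names are made up for illustration.
cat > myfile.txt <<'EOF'
uuid,name
1a2b,alpha
2c3d,beta
3e4f,gamma
2g5h,delta
EOF

cat myfile.txt |
  tail -n +2 |        # remove the first line (the header)
  grep '^2' |         # filter for lines starting with 2
  cut -d ',' -f 2 |   # grab the second column using a , separator
  xargs touch         # use the result to create the files with touch

# Confirm it worked: beta and delta should now exist next to myfile.txt
ls
```

Each stage can be run on its own first (for example, stopping after the grep) to check the intermediate output before adding the next command.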
