In addition to command aliases (see an earlier post), you can add your own functions to the bash shell. Here is a simple but useful command line sequence:
function freq() {
    # sort the input, count runs of identical lines, then order by count (highest first)
    sort $* | uniq -c | sort -rn
}
Put it in ~/.bashrc and you will have a freq command for creating frequency lists: freq <FILES> will sort and count all identical lines of the input file(s) and present them in descending order of frequency. This is useful in many situations, not least for checking that files that are supposed to contain only unique lines actually do so.
(I'm not too sure about bash function syntax, but the function above seems to do its work.)
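One caveat on the syntax, offered as a sketch rather than a definitive fix: an unquoted $* is split on whitespace, so a file name containing a space would reach sort as several separate names. Assuming that matters for your files, a variant that uses "$@" keeps each argument intact. The file name in the usage comment is hypothetical.

```shell
# Variant of freq that passes each file name to sort as a single argument.
freq() {
    # "$@" expands to one word per argument, so names with spaces survive.
    # With no arguments, sort reads standard input.
    sort "$@" | uniq -c | sort -rn
}

# Usage (hypothetical file name): list only the lines that occur more than once.
# freq "word list.txt" | awk '$1 > 1'
```

If the last command prints nothing, every line in the file is unique.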
If you're not familiar with the different commands of the pipeline above, there is plenty to read (e.g., egrep for linguists).
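As a quick illustration of why the pipeline starts with sort, here is a small sketch using made-up input (the variable name lines is only for this example). uniq merges identical lines only when they are adjacent, so the counts are wrong unless the input is sorted first.

```shell
# Three sample lines; the two "a" lines are not next to each other.
lines='a
b
a'

# Without sorting, uniq sees three separate runs: a, b, a (each counted once).
printf '%s\n' "$lines" | uniq -c

# Sorting first makes identical lines adjacent: a is counted twice, b once.
printf '%s\n' "$lines" | sort | uniq -c
```

The final sort -rn in freq then only reorders these counted lines so that the most frequent come first.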
3 comments:
Thanks for that, very handy snippet.
Cool stuff! What's the "sort $*" for? Thanks!
Gabe,
"uniq" only merges identical lines that are adjacent, so it wants its input sorted. "sort $*" sorts the lines of all input files. "$*" holds the arguments passed to the function (the input files in this case). Hope this helps.