Difference between revisions of "BASH Scripting"

From PHYSpedia

Revision as of 12:44, 17 September 2011

Scripting refers to the practice of writing "scripts" to perform tasks on a machine. A shell script is a file that contains a set of commands that will be executed as if they were typed in at the command line. Shell scripts make use of the wide range of command line tools that already exist for processing text and data. Scripts have been used by system administrators for years to automate repetitive tasks, but scripting also has many applications in computational physics and high performance computing.
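
As a rough illustration of what such a file can look like, here is a minimal sketch of a script. The loop values and the echo line are made up for this example; in practice the echo would be replaced by whatever command you want to repeat.

```shell
#!/bin/bash
# A hypothetical script: the commands below run in order, exactly as if
# they had been typed at the prompt one after another.

count=0
for run in 1 2 3
do
  echo "starting run $run"   # stand-in for a real command, e.g. a simulation
  count=$((count + 1))       # keep track of how many runs were started
done
echo "finished $count runs"
```

If this were saved in a file called runs.sh (a name chosen here only for illustration), it could be run with bash runs.sh.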

Pipes (the case for stdin)

One of the most powerful features of Unix, and therefore Linux, is pipes. Pipes direct the standard output of one program to the standard input of another program. So, for example, the cat command prints the contents of a file to standard output:

$ cat file.txt
  1 4
  2 3
  3 2

In this example, the file named file.txt contains three lines. The cat command prints these lines to standard output. Now, the grep command reads lines from standard input, and prints lines that match a pattern to standard output. So, if we wanted to see only the lines in file.txt that contain the number 3, we can pipe the standard output of the cat command to grep

$ cat file.txt | grep "3"
  2 3
  3 2

Here, we have combined two separate commands to build a single output. There is no limit to the number of pipes we can chain together (or, if there is, this limit is much larger than you will need). We could take the output of the last command and pipe it through sort to show the lines in ascending order of column two:

$ cat file.txt | grep "3" | sort -g -k 2
  3 2
  2 3

The -k 2 option tells sort to use column 2 as the sort key, and -g tells it to compare by numerical value rather than by text (ASCII) value.
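
The sketch below shows two variations on this pipeline. It recreates the three-line file.txt in a scratch directory (the tmpdir variable is introduced here only for the example), lets grep read the file directly instead of going through cat, and adds -r to reverse the order. It assumes a sort that supports -g, as GNU sort does.

```shell
# Recreate the three-line example file in a scratch directory
tmpdir=$(mktemp -d)
printf '  1 4\n  2 3\n  3 2\n' > "$tmpdir/file.txt"

# grep can open the file itself, so the leading cat is optional
ascending=$(grep "3" "$tmpdir/file.txt" | sort -g -k 2)

# adding -r reverses the comparison, giving descending order on column 2
descending=$(grep "3" "$tmpdir/file.txt" | sort -g -k 2 -r)

echo "$ascending"
echo "$descending"

# clean up the scratch directory
rm -r "$tmpdir"
```

The first echo prints the same two lines as the pipeline above; the second prints them in the opposite order.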

Pipes simply connect the standard output of one command to the standard input of another command. In fact, pipes are completely separate from the commands themselves, so you can use pipes with programs you write: any program that reads from stdin and writes to stdout will work. For example, consider a simple C++ program that reads numbers from standard input and computes their average:

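/* avg.cpp - compute the average of a list of numbers read from standard input */
#include <iostream>

int main()
{
  double x;
  double sum = 0;  // running total of the values read so far
  int N = 0;       // number of values read so far

  // read until end of input (Ctrl-D at the keyboard) or a non-numeric token
  while( std::cin >> x )
  {
    sum += x;
    N++;
  }

  // avoid dividing by zero when no numbers were given
  if( N == 0 )
    return 1;

  std::cout << sum / N << '\n';
  return 0;
}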

If you compile this program (for example with g++ -o avg avg.cpp) and just run it, it will wait for input from the keyboard (since this is where standard input comes from by default):

$ ./avg

You can enter as many numbers as you like. Pressing Ctrl-D signals the end of input, which ends the read loop and causes the program to output the average of the numbers you entered. Having to type in every number you want to average would make this a fairly tedious program to use. However, you can send input to your program through a pipe.

$ echo -e "1\n3\n5\n7\n5\n3"
1
3
5
7
5
3

$ echo -e "1\n3\n5\n7\n5\n3" | ./avg
4

BAM! The pipe sends the standard output of the echo command to the standard input of your program. We have used echo here as a simple example, but you could use ./avg to average the output of any program that prints a list of numbers, one per line.
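
To try this without compiling anything, the sketch below defines a shell function named avg as a stand-in for ./avg. The function is an assumption of this example, not part of the original program, and it relies on awk being installed. It is fed by seq, which prints a sequence of integers, one per line.

```shell
# avg: hypothetical shell function standing in for the compiled ./avg program.
# It reads numbers from stdin, one per line, and prints their average.
avg() {
  awk '{ sum += $1; n++ } END { if (n > 0) print sum / n }'
}

# average of the integers 1 through 10, produced by seq
mean=$(seq 1 10 | avg)
echo "$mean"
```

Anything that prints numbers one per line can take the place of seq on the left side of the pipe.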

Text Manipulation Commands

Linux has a rich command line history. Many (most) of the commands used at the command line are just re-implementations of commands that were written 30 or 40 years ago. There are many useful text manipulation commands available that can be combined using pipes to create "new" commands on the fly. Here is a list of some commonly used text manipulation commands.

sort
  sorts text read from stdin and writes the result to stdout
  useful options
    -r    : reverse sort (sort in descending order)
    -k i  : sort input on the i'th column (by default the whole line is used as the sort key)
    -n    : interpret input as numbers and sort based on numeric value, not ASCII value of text
    -g    : interpret input as general numbers and sort based on numeric value (this will sort numbers in scientific notation)
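
The following sketch compares these options on a few made-up inputs. The variable names are chosen only for this example, LC_ALL=C is set so that the plain text sort does not depend on the local language settings, and -g is assumed to be available, as it is in GNU sort.

```shell
# Plain sort compares text, so "100" comes before "9"
text_order=$(printf '10\n9\n100\n' | LC_ALL=C sort)

# -n compares numeric values: 9, 10, 100
numeric_order=$(printf '10\n9\n100\n' | sort -n)

# -n -r gives the same comparison in descending order: 100, 10, 9
reverse_order=$(printf '10\n9\n100\n' | sort -n -r)

# -n only reads the leading digits, so 1e3 is treated as 1 and 2e1 as 2
n_on_sci=$(printf '1e3\n2e1\n5\n' | sort -n)

# -g understands scientific notation: 5, 2e1 (20), 1e3 (1000)
g_on_sci=$(printf '1e3\n2e1\n5\n' | sort -g)

# -k 2 sorts on the second column instead of the whole line
by_column=$(printf 'a 3\nb 1\nc 2\n' | sort -n -k 2)

echo "$text_order"
echo "$numeric_order"
echo "$reverse_order"
echo "$n_on_sci"
echo "$g_on_sci"
echo "$by_column"
```

The contrast between the -n and -g results is the reason to prefer -g when the data may contain numbers written in scientific notation.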