31 August 2006

Shell programming: Extreme shell programming

Extreme shell programming


A year ago I wrote an article for the Russian magazine System Administrator about using /bin/sh in tiny installations -- emulating some shell commands, doing text matching and so on. Shell fans should like it.

Table of Contents
Advanced Shell Programming
Pattern matching in shells (emulating regular expressions)
cut_atomic
cut
Another way to extract positional parameters
Pattern checking
Arrays in shell
Breaking up input and line-by-line processing


Advanced Shell Programming



This article describes advanced and sometimes not-so-obvious tricks which can be used in shell programming. I suggest you use all the power that the standard FreeBSD /bin/sh shell gives you, and every bit of functionality this binary has.



Pattern matching in shells (emulating regular expressions)


Sometimes it's necessary to perform pattern matching in a shell script -- for example, to extract some variable from a string -- but you have no opportunity to use standard unix utilities like sed, awk or something heavier (like perl :). In this case the shell's pattern removal can help.



Please refer to sh(1), Parameter Expansion section.
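As a quick refresher, the four pattern-removal forms behave like this (the path is just an example value):

```shell
FILE="/usr/local/etc/rc.conf"

echo "${FILE#*/}"     # remove shortest matching prefix -> usr/local/etc/rc.conf
echo "${FILE##*/}"    # remove longest matching prefix  -> rc.conf
echo "${FILE%/*}"     # remove shortest matching suffix -> /usr/local/etc
echo "${FILE%%/*}"    # remove longest matching suffix  -> (empty string)
```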



Here are some examples of simulating some unix utilities by means of shell functions. All functions given return their result in the `result` variable, and return 0 on success and a non-zero value on error. If there is no result to return, they reset the `result` variable to an empty string.



cut_atomic


The cut_atomic function cuts the string from the left up to the first occurrence of the delimiter. The first argument is the delimiter itself -- it can be one symbol or some string -- and all other parameters form the string to search in. Shell patterns are always greedy, so please keep this in mind when passing wildcards as the delimiter argument.




cut_atomic () {
    local DELIM STRING

    DELIM="$1"    # delimiter may be any string, not only one symbol
    shift
    STRING=$*     # remaining parameters form the string

    result=${STRING%%${DELIM}*}
    return 0
}


Here the ${variable%%pattern} construct removes the largest possible suffix from the string -- i.e. only the symbols from the beginning of the string up to the first occurrence of the delimiter remain.
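For example, extracting the first field of a passwd-style record with the same expansion (the record is a sample value):

```shell
LINE="root:*:0:0:Charlie &:/root:/bin/csh"

result=${LINE%%:*}    # everything before the first ':'
echo "$result"        # root
```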



cut



# a first, simplified version: it only strips the leading POS1 fields
# and prints the rest
cut () {
    local DELIM POS1 POS2 STRING I

    DELIM="$1"
    POS1=$2
    POS2=$3
    shift 3
    STRING=$*    # remaining parameters form the string

    if [ $POS2 -lt $POS1 ]; then
        return 1 # error
    fi

    I=0
    while [ $I -lt $POS1 ]; do
        STRING=${STRING#*${DELIM}}
        I=$(($I+1))
    done
    echo "$STRING"
}


The cut function emulates the behavior of the cut(1) utility called with the -d option, like /usr/bin/cut -d$DELIM -f$POS1-$POS2. You have to pass the delimiter as the first argument of the function (if you use wildcards in the delimiter the results may be very unobvious), then the first and last field numbers, and the string itself. If you want only one field, you must pass the same number as both the first and last field number.




cut () {
    local DELIM POS1 POS2 STRING STR1 I

    DELIM="$1"
    POS1=$2
    POS2=$3
    shift 3
    STRING=$*    # remaining parameters form the string

    if [ $POS2 -lt $POS1 ]; then
        result=""
        return 1 # error
    fi

    I=1
    while [ $I -lt $POS1 ]; do      # strip first elements
        STRING=${STRING#*${DELIM}}  # from string
        I=$(($I+1))
    done

    STR1="$STRING"                  # save the result

    while [ $I -le $POS2 ]; do      # strip the fields
        STRING=${STRING#*${DELIM}}  # which we need, and
        I=$(($I+1))                 # get the suffix
    done

    # we have an unneeded suffix in the STRING variable;
    # strip it from the saved result

    result=${STR1%${DELIM}${STRING}}
    return 0
}
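To see how the two stripping loops cooperate, here is the same prefix/suffix dance done by hand for fields 2-3 of a sample string:

```shell
STRING="a:b:c:d"

STRING=${STRING#*:}          # drop field 1        -> b:c:d
STR1="$STRING"               # save the candidate  -> b:c:d

STRING=${STRING#*:}          # strip field 2       -> c:d
STRING=${STRING#*:}          # strip field 3       -> d (the unneeded suffix)

result=${STR1%:${STRING}}    # cut the suffix off  -> b:c
echo "$result"
```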


Another way to extract positional parameters


You can use the 'set' builtin to replace the positional arguments of the current shell block, and then extract positional parameters via the $NNN variables. The only thing you cannot work around is that wildcard expansion will take place, so you will get unpredictable results if you have wildcards in your string.



Here is a small example of how to do the trick. We are going to extract the first 3 bytes from the MAC address of an ethernet card.




extract_manufacturer () {
    local STR IFS   # define IFS as local to keep
                    # changes inside this function

    STR=$*

    IFS=':'         # break the string using the : delimiter
    set -- $STR     # replace positional args with STR variable content
    result="$1:$2:$3"
    return 0
}

S=`ifconfig fxp0`   # get the output of ifconfig
S=${S##*ether}      # strip all data before the ether keyword

extract_manufacturer "$S"
echo "manufacturer code: $result"
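The same trick works on any delimited string; here is a self-contained variant that does not depend on an fxp0 interface being present (the MAC value is made up):

```shell
MAC="00:07:e9:12:34:56"

OLDIFS="$IFS"
IFS=':'
set -- $MAC          # $1=00 $2=07 $3=e9 ...
code="$1:$2:$3"
IFS="$OLDIFS"        # restore the global IFS

echo "$code"         # 00:07:e9
```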


Pattern checking


Sometimes it's necessary to check a string against some pattern; input validation is a good example. The only construct in the shell which allows you to know whether a string really matches a pattern is the case statement, so we construct a simple wrapper around it to do the checking.



The first argument of the match_pattern functions is the pattern to match, and all the others form a single string which is matched against the pattern. match_pattern_strict requires that the pattern match the whole string, while match_pattern is looser -- it requires only a match on part of the string.



Be warned -- you should enclose the pattern in single or double quotes most of the time, to prevent expansion before it is passed to the function.




match_pattern_strict () {
    local PATTERN STRING

    PATTERN="$1"
    shift
    STRING=$*

    result=""

    case "$STRING" in
    $PATTERN)
        return 0
    esac

    return 1
}

match_pattern () {
    local PATTERN STRING

    PATTERN="$1"
    shift
    STRING=$*

    result=""

    case "$STRING" in
    *${PATTERN}*)
        return 0
    esac

    return 1
}


Here are some examples of input validation.



match_pattern_strict '[0-9]*.[0-9]*.[0-9]*.*[0-9]' 192.168.0.1
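Since match_pattern_strict is only a thin wrapper around case, the same check can also be written inline when you don't want a helper function:

```shell
IP="192.168.0.1"

case "$IP" in
[0-9]*.[0-9]*.[0-9]*.*[0-9])
    valid=yes
    ;;
*)
    valid=no
    ;;
esac

echo "$valid"    # yes
```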



Arrays in shell


Another thing which makes shell programming ugly and nasty is the absence of arrays of any kind -- associative or indexed.



In the PicoBSD boot scripts I've found an interesting way to organize data in such a way that it's simple to emulate an array, so I would like to present it.



Let's assume that the array name is foo and we want each array element to have fields named A and B, so we will have elements foo[0]['A'], foo[0]['B'], foo[10]['A'] and so on. Our convention also assumes that A can never take an empty value.



So for each array element we will create a corresponding variable in the script -- foo_0_A, foo_0_B, foo_10_A, foo_10_B and so on.



Here are several functions to work with this kind of array.




arr_count () {
    local ARRNAME KEYNAME VAL I
    ARRNAME=$1
    KEYNAME=$2

    I=0
    result=""
    eval VAL=\${${ARRNAME}_${I}_${KEYNAME}}
    if [ "$VAL" = "" ]; then
        return 1 # if the first key element is empty
                 # we assume that the array does not exist
    fi
    while [ "$VAL" != "" ]; do
        I=$(($I+1))
        eval VAL=\${${ARRNAME}_${I}_${KEYNAME}}
    done

    result=$I
    return 0
}


arr_count returns the number of elements in the array. The first argument of arr_count is the array prefix -- i.e. the array name -- and the second argument is the key name.




# array_name key_field_name value_field_name key_value
arr_lookup_by_key () {
    local i array kfield vfield kvalue key
    array=$1
    kfield=$2
    vfield=$3
    kvalue=$4

    i=0
    result=""

    key="x"                 # force the first loop iteration
    while [ "$key" != "" ]; do
        eval key=\${${array}_${i}_${kfield}}
        if [ "$key" = "$kvalue" ]; then
            eval result=\${${array}_${i}_${vfield}}
            return 0
        fi
        i=$(($i+1))
    done
    return 1
}


The arr_lookup_by_key function provides a convenient way to search for an element value, knowing the name (corresponding key) of the element. It takes four arguments -- the array name, the name of the key field, the name of the value field, and the value of the key field which is looked up.
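A minimal end-to-end sketch of the convention (arr_lookup_by_key is repeated here so the snippet runs on its own; the host table data is made up for illustration):

```shell
# hypothetical host table: hosts_N_name / hosts_N_addr
hosts_0_name="www";  hosts_0_addr="192.168.0.10"
hosts_1_name="mail"; hosts_1_addr="192.168.0.11"

arr_lookup_by_key () {
    local i array kfield vfield kvalue key
    array=$1
    kfield=$2
    vfield=$3
    kvalue=$4

    i=0
    result=""

    key="x"                 # force the first loop iteration
    while [ "$key" != "" ]; do
        eval key=\${${array}_${i}_${kfield}}
        if [ "$key" = "$kvalue" ]; then
            eval result=\${${array}_${i}_${vfield}}
            return 0
        fi
        i=$(($i+1))
    done
    return 1
}

arr_lookup_by_key hosts name addr mail
echo "$result"    # 192.168.0.11
```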




# array_name key_field_name value_field_name index
arr_lookup_by_index () {
    local array index vfield kfield key
    array=$1
    kfield=$2
    vfield=$3
    index=$4

    eval key=\${${array}_${index}_${kfield}}

    if [ "$key" = "" ]; then
        return 1 # report an error - because we have no such key
    fi

    eval result=\${${array}_${index}_${vfield}}
    return 0
}


The arr_lookup_by_index function provides a convenient way to get an element value, knowing the index of the element. It takes four arguments -- the array name, the name of the key field, the name of the value field, and the index whose value we want to get.



Breaking up input and line-by-line processing


Usually we run awk or perl if we need to scan a file line by line, split it by some delimiter and process the fields. But awk or perl are not always at hand, so let's try to emulate this behavior with the shell alone. We will use the ability of the read command to split its stdin and assign the pieces to variables. read also returns false on EOF, so we will use it as the condition of a while loop.




IFS=:
i=0
while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done < /etc/passwd
echo "total lines: $i"


Be very careful when redirecting a file as input to a while loop. If you do it in the while loop condition itself -- something like this --




while read name pass uid gid gcos homedir shell junk < /etc/passwd; do
    ...
done


it will loop forever, because read will begin parsing its stdin anew on each iteration. So we need to redirect the data to the while statement's stdin, and we must do it after the end of the while construct -- i.e. after the done statement.



Another thing we need is to parse the output of another command in a pipe. So we will try:




i=0
cat /etc/passwd | while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done
echo "total lines: $i"


We will get correct results for each line of the passwd file -- it counts them correctly inside the loop, but the last statement will show 0 lines. That is strange at first glance, but we need to remember that each command in a pipe is executed in a separate process, so the whole while; do ...; done loop is executed in a subshell and we have no way to pass variables back to the parent shell. The workaround for this is the use of here documents. So let's rewrite the example as follows:




IFS=:
i=0
while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done << EOF
$(cat /etc/passwd)
EOF
echo "total lines: $i"


In this case we have moved everything that used to precede the 'while' in the pipeline into a separate shell (or shells), and its output is fed to the stdin of the while statement, just like in the previous case. So it works well and returns correct results -- many, many lines of the passwd file. In a similar way you can join the output of several commands and pass it as input to the while loop.
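Joining the output of several commands is then just a matter of putting them inside the command substitution (the two echo lines stand in for real commands):

```shell
i=0
while read line; do
    i=$(($i+1))
done << EOF
$(echo "first line"; echo "second line")
EOF
echo "total lines: $i"    # total lines: 2
```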


Erlang programming: Starting visual debugger

Starting visual debugger

in erlang is quite easy -- just type in the erlang shell

debugger:start()

and it will pop up a window with the visual debugger, where you can select which modules you wish to debug and set breakpoints. You should compile modules with the debug_info option so that symbolic information is available to the debugger (WARNING: in this case your source code can be extracted from the .beam file! Without comments, of course :). From the command line use:

erlc +debug_info _erl_file_name.erl


or, from the shell, use the following command:


2> c(module_name, [debug_info]).




Gaspar Chilingarov

Keywords: erlang, erlang programming, debugging erlang program, erlang debugging

25 August 2006


Erlang programming: Trace function calls

To trace function calls

the documentation advises using the dbg module. But it takes some time to dig up how to really trace function calls in the system. In practice you should just do

dbg:tracer(),
dbg:p(new, c),
dbg:tpl(module_name, [{'_',[],[{return_trace}]}])