Extreme shell programming
A year ago I wrote an article for the Russian magazine System Administrator about using /bin/sh in tiny installations: emulating some shell commands, doing text matching and so on. Shell fans should like it.
Table of Contents
Advanced Shell Programming
Pattern matching in shells (emulating regular expressions)
cut_atomic
cut
Another way to extract positional parameters
Pattern checking
Arrays in shell
Breaking up input and line-by-line processing
Advanced Shell Programming
This article describes advanced and sometimes not-so-obvious tricks
that can be used in shell programming. I suggest using all the power that the standard FreeBSD /bin/sh offers, and every bit of functionality this binary has.
Pattern matching in shells (emulating regular expressions)
Sometimes it's necessary to perform some pattern matching in a shell script,
for example to extract a value from a string, but you have no opportunity to use standard unix utilities like sed, awk or something heavier (like perl :). In this case the shell's pattern removal can help.
Please refer to sh(1), Parameter Expansion section.
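As a quick reminder of what that section describes, here is a minimal sketch of the four pattern-removal forms. The variable names and the sample path are made up for the illustration and are not taken from any real script.

```shell
#!/bin/sh
# Hypothetical sample value, chosen only for this illustration.
path="/usr/local/etc/rc.d/nginx.sh"

short_prefix=${path#*/}    # remove the shortest prefix matching "*/"
long_prefix=${path##*/}    # remove the longest prefix matching "*/"
short_suffix=${path%.*}    # remove the shortest suffix matching ".*"
long_suffix=${path%%.*}    # remove the longest suffix matching ".*"

echo "$short_prefix"       # usr/local/etc/rc.d/nginx.sh
echo "$long_prefix"        # nginx.sh
echo "$short_suffix"       # /usr/local/etc/rc.d/nginx
echo "$long_suffix"        # /usr/local/etc/rc
```

Note how the longest-suffix form stops at the first dot, which here is the one inside rc.d.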
Here are some examples of simulating unix utilities by means of shell
functions. All the functions given put their result in the `result` variable, and return 0 on success and a non-zero value on error. If there is no result to return, they reset the `result` variable to an empty string.
cut_atomic
The cut_atomic function cuts a string from the left up to the first occurrence
of the delimiter. The first argument is the delimiter itself, which can be a
single character or a longer string, and all other parameters form the string
to search. The %% operator used here removes the longest matching suffix, so
keep this in mind when passing wildcards as the delimiter argument.
cut_atomic () {
    local DELIM STRING
    DELIM="$1" # delimiter may be any string, not only one symbol
    shift
    STRING=$* # remaining string
    result=${STRING%%${DELIM}*}
    return 0
}
In this case the ${variable%%pattern} construct removes the largest possible
suffix from the string, i.e. only the characters from the beginning of the
string up to the first occurrence of the delimiter remain.
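The same idea also gives the part after the delimiter, since `#` removes the shortest matching prefix. A small sketch, with a made-up input line and hypothetical variable names:

```shell
#!/bin/sh
# Hypothetical input: only the first "=" should separate key and value.
line="name=John=Smith"

key=${line%%=*}     # drop the longest suffix starting at "=": keeps "name"
value=${line#*=}    # drop the shortest prefix ending at "=": keeps the rest

echo "key:   $key"
echo "value: $value"
```

Using `%` instead of `%%` for the key would cut at the last "=" rather than the first.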
cut
cut () {
    local DELIM POS1 POS2 STRING I
    DELIM="$1"
    POS1=$2
    POS2=$3
    shift 3
    STRING=$* # remaining string
    if [ $POS2 -lt $POS1 ]; then
        return 1 # error
    fi
    I=1
    while [ $I -lt $POS1 ]; do
        STRING=${STRING#*${DELIM}} # drop one leading field
        I=$(($I+1)) # advance to the next field
    done
    echo "$STRING" # fields from POS1 to the end of the string
}
The cut function emulates the behavior of the cut(1) utility called with the
-d option, like /usr/bin/cut -d$DELIM -f$POS1-$POS2. You pass the delimiter as
the first argument (wildcards in the delimiter can give very non-obvious
results), then the first and last field numbers, and finally the string itself.
If you want only one field, pass the same number as both the first and the last field number.
cut () {
    local DELIM POS1 POS2 STRING STR1 I
    DELIM="$1"
    POS1=$2
    POS2=$3
    shift 3
    STRING=$* # remaining parameters are string
    if [ $POS2 -lt $POS1 ]; then
        result=""
        return 1 # error
    fi
    I=1
    while [ $I -lt $POS1 ]; do # strip first elements
        STRING=${STRING#*${DELIM}} # from string
        I=$(($I+1))
    done
    STR1="$STRING" # save the result
    result="$STR1" # used as is if the range reaches the last field
    while [ $I -le $POS2 ]; do # strip the fields we need
        case "$STRING" in
        *${DELIM}*) ;;
        *) return 0 ;; # no delimiter left, nothing to cut off
        esac
        STRING=${STRING#*${DELIM}} # to get the suffix
        I=$(($I+1))
    done
    # we have the unneeded suffix in the STRING variable;
    # strip it (quoted, so it is taken literally) from the saved result
    result=${STR1%${DELIM}"${STRING}"}
    return 0
}
Another way to extract positional parameters
You can use the 'set' builtin to replace the arguments of the current shell
block and then extract positional parameters with the $NNN variables. The one
thing to watch out for is that pathname expansion still takes place, so you
will get unpredictable results if the string contains wildcards (unless you
disable expansion with set -f).
Here is a small example of how to do the trick. We are going to extract the
first 3 bytes from the MAC address of an ethernet card.
extract_manufacturer () {
    local STR IFS # define IFS as local to keep
                  # changes inside this func
    STR=$*
    IFS=':' # break string using : delimiter
    set -- $STR # replace positional args with STR variable content
    result="$1:$2:$3"
    return 0
}
S=`ifconfig fxp0` # get output of ifconfig
S=${S##*ether } # strip all data up to and including the ether keyword
extract_manufacturer "$S"
echo "manufacturer code: $result"
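If the string may contain wildcards, one possible way around the expansion problem is to switch pathname expansion off with set -f while splitting. The sketch below assumes a helper called split_fields and a global variable count; both are hypothetical names invented for this example.

```shell
#!/bin/sh
# split_fields is a hypothetical helper, not one of the article's functions.
# It splits its second argument on the delimiter given as the first one,
# leaving the number of fields in $count and the first field in $result.
split_fields () {
    local DELIM STR IFS
    DELIM="$1"
    STR="$2"
    set -f          # disable pathname expansion so "*" stays literal
    IFS="$DELIM"
    set -- $STR     # field splitting only, no globbing
    set +f          # re-enable it (assumes it was enabled before the call)
    count=$#
    result="$1"
    return 0
}

split_fields ':' '*:/tmp/*:x'
echo "fields: $count, first: $result"
```

Without set -f the first field would be replaced by the names of the files in the current directory.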
Pattern checking
Sometimes it's necessary to check a string against some pattern. Input
validation is a good example. The only construct in the shell that tells you
whether a string really matches a pattern or not is the case statement.
So we build a simple wrapper around it to do the checking.
The first argument of the match_pattern function is the pattern to match, and
all the others form a single string which is matched against the pattern.
match_pattern_strict requires that the pattern match the whole string,
and match_pattern is looser: it only requires a match of part of the string.
Be warned: you should enclose the pattern in single or double quotes most of
the time, to prevent expansion before it is passed to the function.
match_pattern_strict () {
    local PATTERN STRING
    PATTERN="$1"
    shift
    STRING=$*
    result=""
    case "$STRING" in
    $PATTERN)
        return 0
        ;;
    esac
    return 1
}
match_pattern () {
    local PATTERN STRING
    PATTERN="$1"
    shift
    STRING=$*
    result=""
    case "$STRING" in
    *${PATTERN}*)
        return 0
        ;;
    esac
    return 1
}
Here is an example of input validation:
match_pattern_strict '[0-9]*.[0-9]*.[0-9]*.*[0-9]' 192.168.0.1
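Keep in mind that `*` in a shell pattern matches any string rather than repeating the previous item, so the pattern above also accepts input such as 1x.2.3.4. A stricter check is usually written the other way round, by rejecting anything that contains an unwanted character. A minimal sketch, where is_uint is a hypothetical helper name:

```shell
#!/bin/sh
# is_uint is a hypothetical helper: it succeeds only for a non-empty
# string that consists of decimal digits.
is_uint () {
    case "$1" in
    ''|*[!0-9]*)
        return 1    # empty, or contains a non-digit character
        ;;
    esac
    return 0
}

for v in 42 4a2 ''; do
    if is_uint "$v"; then
        echo "'$v' is a number"
    else
        echo "'$v' is not a number"
    fi
done
```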
Arrays in shell
Another thing that makes shell programming ugly and nasty is the absence
of arrays of any kind, associative or indexed.
In the PicoBSD boot scripts I found an interesting way to organize data so
that it is simple to emulate an array, and I would like to present it here.
Let's assume that the array name is foo and that we want its elements to have
fields named A and B, so we will have elements foo[0]['A'], foo[0]['B'],
foo[10]['A'] and so on. Our convention also assumes that A can never be empty.
For each array element we create a corresponding variable in the script:
foo_0_A, foo_0_B, foo_10_A, foo_10_B and so on.
Here are several functions to work with this kind of array:
arr_count () {
    local ARRNAME KEYNAME VAL I
    ARRNAME=$1
    KEYNAME=$2
    I=0
    result=""
    eval VAL=\${${ARRNAME}_${I}_${KEYNAME}}
    if [ "$VAL" = "" ]; then
        return 1 # if first key element is empty
                 # we assume that array does not exist
    fi
    while [ "$VAL" != "" ]; do
        I=$(($I+1))
        eval VAL=\${${ARRNAME}_${I}_${KEYNAME}}
    done
    result=$I
    return 0
}
arr_count returns the number of elements in the array. Its first argument is
the array prefix, i.e. the array name; the second argument is the key name.
It scans indices upward from 0 and stops at the first empty key, so the indices must be contiguous.
# array_name key_field_name value_field_name key_value
arr_lookup_by_key () {
    local i array kfield vfield kvalue key value
    array=$1
    kfield=$2
    vfield=$3
    kvalue=$4
    i=0
    result=""
    key="x" # force it to enter the loop
    while [ "$key" != "" ]; do
        eval key=\${${array}_${i}_${kfield}}
        if [ "$key" = "$kvalue" ]; then
            eval result=\${${array}_${i}_${vfield}}
            return 0
        fi
        i=$(($i+1))
    done
    return 1
}
The arr_lookup_by_key function provides a convenient way to find an element
value when you know the name (the corresponding key) of the element. It takes
four arguments: the array name, the name of the key field, the name of the
value field, and the key value to look up.
# array_name key_field_name value_field_name index
arr_lookup_by_index () {
    local array index vfield kfield key
    array=$1
    kfield=$2
    vfield=$3
    index=$4
    result=""
    eval key=\${${array}_${index}_${kfield}}
    if [ "$key" = "" ]; then
        return 1 # report error - because we have no key
    fi
    eval result=\${${array}_${index}_${vfield}}
    return 0
}
The arr_lookup_by_index function provides a convenient way to get an element
value when you know the index of the element. It takes four arguments: the
array name, the name of the key field, the name of the value field, and the
index of the element whose value we want.
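To show the naming convention in action, here is a small self-contained sketch that fills such an array and walks over it. The array name iface, its fields name and addr, the sample values, and the arr_set helper are all hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical data: an "array" called iface with fields name and addr.
iface_0_name="fxp0"; iface_0_addr="192.168.0.1"
iface_1_name="fxp1"; iface_1_addr="10.0.0.1"

# arr_set is a hypothetical helper: arr_set array index field value
arr_set () {
    eval "${1}_${2}_${3}=\$4"
}
arr_set iface 2 name "lo0"
arr_set iface 2 addr "127.0.0.1"

# Walk the indices from 0 until the key field (name) is empty.
i=0
list=""
while :; do
    eval name=\${iface_${i}_name}
    [ -n "$name" ] || break
    eval addr=\${iface_${i}_addr}
    list="$list$name=$addr "
    i=$(($i+1))
done
echo "$i entries: $list"
```

The escaped `\$4` is expanded only when eval runs, so values containing spaces are assigned intact.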
Breaking up input and line-by-line processing
Usually we run awk or perl if we need to scan a file line by line, split each
line by some delimiter and process the fields. But we do not always have awk or
perl at hand, so let's try to emulate this behavior with the shell only. We
will use the ability of the read command to split its stdin and assign the
pieces to variables. read also returns false at EOF, so we will use it as the
condition of a while loop:
IFS=:
i=0
while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done < /etc/passwd
echo "total lines: $i"
Be very careful when redirecting a file as input to a while loop. If you do
it in the loop condition itself, something like this:
while read name pass uid gid gcos homedir shell junk < /etc/passwd; do
    ...
done
it will loop forever, because the file is reopened on each iteration and read
always gets its first line. So we need to redirect the data to the stdin of the
whole while construct, and we must do it at its end, i.e. after the done keyword.
Another thing we need is to parse the output of another command in a pipe. So
we will try:
IFS=:
i=0
cat /etc/passwd | while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done
echo "total lines: $i"
We will get correct output for each line of the passwd file, and the loop
counts them correctly, but the last statement will show 0 lines. That looks
strange at first glance, but remember that each command in a pipe is executed
in a separate process, so the whole while; do ...; done loop runs in a subshell
and has no way to pass variables back to the parent shell. The workaround is
to use here-documents. So let's rewrite the example as follows:
IFS=:
i=0
while read name pass uid gid gcos homedir shell junk; do
    echo "$i|$name|$uid|$gid|$gcos|$homedir|$shell|$junk|"
    i=$(($i+1))
done <<EOF
`cat /etc/passwd`
EOF
echo "total lines: $i"
In this case we have moved everything that preceded 'while' in the pipeline
into separate shell(s), and its output is fed to the stdin of the while
statement, just like in the previous case. So it works well and returns correct
results: many, many lines of the passwd file. In a similar way you can join the
output of several commands and pass it as input to a while loop.
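Joining the output of several commands could look like the sketch below. The echo commands are placeholders standing in for whatever real commands produce the data.

```shell
#!/bin/sh
# Each command substitution in the here-document runs in its own subshell;
# the while loop itself stays in the current shell, so $i survives.
i=0
while read line; do
    echo "$i: $line"
    i=$(($i+1))
done <<EOF
`echo first; echo second`
`echo third; echo fourth`
EOF
echo "total lines: $i"
```

Command substitution strips trailing newlines, so each substitution is put on a line of its own inside the here-document to keep the outputs from running together.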