Bash Shell: Arrays, Pro and Con
See the References below
1 Introduction
This article on bash arrays provides an opportunity to compare use of bash arrays with one alternative, factoring the interface into functions.
While many will be encouraged to add the array to their practice, my recommendation is to take the time to appreciate, if not adopt, the functional alternative.
2 Arrays
Array syntax is used:
- to assign values to the arrays
- to count the size of an array
- to fetch members of the parallel arrays
And the programming is quite straight-forward.
# List of logs and who should be notified of issues
logPaths=("api.log" "auth.log" "jenkins.log" "data.log")
logEmails=("jay@email" "emma@email" "jon@email" "sophia@email")

# Look for signs of trouble in each log
for i in ${!logPaths[@]}; do
    log=${logPaths[$i]}
    stakeholder=${logEmails[$i]}
    numErrors=$( tail -n 100 "$log" | grep "ERROR" | wc -l )

    # Warn stakeholders if recently saw > 5 errors
    if [[ "$numErrors" -gt 5 ]]; then
        emailRecipient="$stakeholder"
        emailSubject="WARNING: ${log} showing unusual levels of errors"
        emailBody="${numErrors} errors found in log ${log}"
        echo "$emailBody" | mailx -s "$emailSubject" "$emailRecipient"
    fi
done
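The three uses of array syntax listed above can be seen side by side in a minimal sketch. The array name `logs` and its values are illustrative assumptions, not part of the script above.

```bash
# Illustrative values only; not taken from a real system.
logs=("api.log" "auth.log" "data.log")      # assign values to the array

count=${#logs[@]}                           # count the size of the array
echo "count: $count"

for i in "${!logs[@]}"; do                  # iterate over the indices
    echo "member $i: ${logs[$i]}"           # fetch one member by index
done
```

The index loop is only needed when a second, parallel array must be read with the same `$i`.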
3 Function Alternative
The functional approach buries the syntactic noise. The email recipient is in a variable name, keyed on the distinct part of the logfile name, e.g.:
$ auth_email=emma@email
and retrieved by replacing the .log suffix with the remainder of the label, _email. The eval echo idiom in emaRecepient is necessary to defer the leading (escaped) dollar sign, so that it fetches the value in the name.
With a bit more work the coupling of the log file to an email address is therefore made explicit.
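A minimal sketch of the lookup described above, showing why the dollar sign is escaped. The address is the sample one from this article; `emaRecepientIndirect` is a hypothetical variant, added here only for comparison, that uses bash indirect expansion instead of eval.

```bash
# Sample address, stored in a variable named after the log.
auth_email=emma@email

# eval form: the escaped \$ survives the first pass as a literal "$",
# so eval receives "echo $auth_email" and expands it on the second pass.
emaRecepient () { eval echo "\$${1%.log}_email"; }

# Assumption: an eval-free variant using bash indirect expansion.
emaRecepientIndirect () { local name="${1%.log}_email"; echo "${!name}"; }

emaRecepient auth.log
emaRecepientIndirect auth.log
```

Both calls print the value of `auth_email`; the indirect form may be preferable when the log name comes from untrusted input, since eval would execute whatever the name contains.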
This approach also takes advantage of bash brace expansion, which generates alternate names from a general pattern, in this case one with the common .log suffix. The collections may be nested and appear anywhere in the pattern. For example:
echo {a,b}.{x,y} # produces a.x a.y b.x b.y
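A few more brace expansions, as a sketch of the nesting and placement just mentioned. The patterns other than the log names are made up for illustration; none of the files needs to exist, since brace expansion is purely textual.

```bash
# The log list used in this article: one collection, common suffix.
echo {api,auth,jenkins,data}.log    # api.log auth.log jenkins.log data.log

# A nested collection: the inner braces expand inside the outer ones.
echo {a,b{1,2}}.log                 # a.log b1.log b2.log

# Collections may sit anywhere in the pattern, and more than once.
echo log.{old,new}.{1,2}            # log.old.1 log.old.2 log.new.1 log.new.2
```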
emaRecepient () { eval echo \$${1%.log}_email; }
pair_log-ema () { eval ${1}_email=$2; }
list_of      () { eval "$1 () { \${*:-echo} ${*:2}; }"; }

list_of logPaths {api,auth,jenkins,data}.log

pair_log-ema api     jay@email       # make these explicit
pair_log-ema auth    emma@email
pair_log-ema jenkins jon@email
pair_log-ema data    sophia@email

errorThreshold () { echo 5; }
numErrors      () { tail -n 100 $1 | grep ERROR | wc -l; }

stakeHolderWarning () {
    : args: Error Threshold, a logPath member
    :
    local erTh=$1; shift
    local nErr=$(numErrors $1)
    :
    [[ $nErr -gt $erTh ]] && {
        :
        : compose and send the error email to
        : . . . . . . the appropriate mailbox
        :
        echo "$nErr errors found in log: $1" |
            mailx -s "WARNING: Unusual Error Level, $1" $(emaRecepient $1)
    }
}
foreachi stakeHolderWarning $(errorThreshold) $(logPaths)
4 Pros and Cons
I prefer the Functional approach over the Array. While the array approach favors conventional programming wisdom, I defy convention by claiming less syntax is better.
4.1 Array
While the array approach is quite straight-forward, here are some liabilities:
- using parallel arrays is a dangerous technique, especially when lists get long.
- while it's nice to have the array size available in the syntax, if its only use is to sequence through the array, the shell provides a ready alternative.
- the functional approach, which should require an economizing of arguments, focuses on the primary iterator, in this case logPaths, and the email addresses are recognized as a function of the log name.
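The first two liabilities can be made concrete with a small sketch. The shortened `logEmails` list and the `sameLength` guard are hypothetical, introduced here only to show how parallel arrays drift apart and how the shell iterates without an index.

```bash
# Illustrative data: the second list is one entry short.
logPaths=("api.log" "auth.log" "jenkins.log" "data.log")
logEmails=("jay@email" "emma@email" "jon@email")

# Hypothetical guard: succeeds only when both lists have the same length.
sameLength () { [[ ${#logPaths[@]} -eq ${#logEmails[@]} ]]; }

if ! sameLength; then
    echo "mismatch: ${#logPaths[@]} logs, ${#logEmails[@]} emails" >&2
fi

# When only the members matter, iterate over them directly;
# neither an index nor the array size is needed.
for log in "${logPaths[@]}"; do
    echo "checking $log"
done
```

Without such a guard, the index loop would silently pair data.log with an empty address.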
4.2 Functional
In the functional approach there are a few instances of what I call "more syntax", i.e. syntax beyond conventional wisdom:
- the flavors of eval
- the bash "repeated name" convention, which could (should) be used in the array script.
- the "foreachi" function belongs to a "foreach" family:
- foreach – takes a function and a list of arguments,
- foreachi – the same, taking a function, a repeating arg, and an arg list,
- foreachij – a function, two repeating args, and an arg list.
Each of these bits of enhanced syntax is meant to make the code cleaner, reducing syntactic noise.
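One way the foreach family might look, as a sketch under assumptions: only foreachi appears in this article, so the other two bodies are guesses from the descriptions above, and the argument-count check is omitted. The `show` function is a hypothetical stand-in for the function being applied.

```bash
# Assumption: each member applies a function to every item of a list,
# differing only in how many leading arguments are repeated on each call.
foreach   () { local f=$1 a;           shift;   for a in "$@"; do "$f" "$a"; done; }
foreachi  () { local f=$1 i=$2 a;      shift 2; for a in "$@"; do "$f" "$i" "$a"; done; }
foreachij () { local f=$1 i=$2 j=$3 a; shift 3; for a in "$@"; do "$f" "$i" "$j" "$a"; done; }

show () { echo "$*"; }      # hypothetical function to apply

foreach   show a b          # runs: show a;     show b
foreachi  show X a b        # runs: show X a;   show X b
foreachij show X Y a b      # runs: show X Y a; show X Y b
```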
Also, the list_of function is so general, and powerful, it's what causes me to wonder if I'll ever need to use a bash array.
- list_of
The first use of list_of is to return its names. Here's a demonstration of its power:
$ declare -f list_of
list_of ()
{
    eval "$1 () { \${*:-echo} ${*:2}; }"
}
$ list_of logPaths {api,auth,jenkins,data}.log
$ logPaths
api.log auth.log jenkins.log data.log
$ logPaths ls -l
ls: api.log: No such file or directory
-rw-r--r--@ 1 applemcg staff 96 Jun 10 14:37 auth.log
-rw-r--r--@ 1 applemcg staff 140 Jun 10 14:38 data.log
-rw-r--r--@ 1 applemcg staff 0 Jun 10 14:47 jenkins.log
The file jenkins.log contains the text of the commands and the resulting standard output, and shows up as empty in the last command. The session shows, in order:
- first, the body of the list_of function,
- next, creating logPaths,
- then, the default, routine use: echo the names,
- and as an alternative, with arguments, e.g. ls -l: instead of echoing the names, they are used as the arguments to the long-list request.
This latter feature is what causes me to doubt the need for arrays.
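The two expansions inside list_of happen at different times, which a short sketch can make visible. The `printf` call is an illustrative choice of command, not one used in the session above.

```bash
# ${*:2} is expanded when list_of runs, freezing the names into the
# body of the new function; \${*:-echo} is escaped, so it is expanded
# later, each time the new function is called.
list_of () { eval "$1 () { \${*:-echo} ${*:2}; }"; }

list_of logPaths {api,auth,jenkins,data}.log

logPaths                    # no arguments: defaults to echo
logPaths printf '%s\n'      # with arguments: they become the command

# The generated function carries the list in its own body.
declare -f logPaths
```

The names live in a function body rather than in a variable, which is what lets the same list serve as data or as the argument list of any command.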
- foreachi
foreachi () {
    : date: 2017-05-11;
    report_notargcount 3 $# && return 1;
    for a in ${*:3}; do $1 $2 $a; done
}
Notice the shell parameter substitution ${*:3}, which says, in effect, return the remainder of the arguments, from the third through the end. The report_notargcount function is left as an exercise.
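A sketch of the substitution, together with one possible report_notargcount. The latter is an assumption inferred only from its call site in foreachi (`report_notargcount 3 $# && return 1`), not the author's implementation; `rest` is a hypothetical helper for the demonstration.

```bash
# ${*:3} expands to the positional parameters from the third onward.
rest () { echo ${*:3}; }
rest f i a b c              # prints: a b c

# Hypothetical report_notargcount NEED HAVE: succeeds (and complains on
# stderr) when fewer than NEED arguments were given, so that the caller's
# "report_notargcount 3 $# && return 1" bails out early.
report_notargcount () {
    local need=$1 have=$2
    [[ $have -ge $need ]] && return 1
    echo "${FUNCNAME[1]:-caller}: need $need arguments, got $have" >&2
    return 0
}
```

The inverted sense, success meaning "not enough arguments", is what makes the `&& return 1` idiom read naturally at the call site.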
- Maintenance
Notice that, with appropriate functions:
toStake () {
    foreachi stakeHolderWarning $(errorThreshold) ${*:-$(logPaths)}
}
setget () {
    : ~ name value -- defines NAME function returning VALUE;
    : ~ name -- defines NAME function with no value, but now settable;
    # UC and setenv are helper functions not shown in this article
    set $1 $(UC $1) $2;
    eval "$1 () { [[ \$# -ge 1 ]] && { setenv $2 \"\$1\"; }; echo \$$2; }";
    [[ $# -gt 2 ]] && { $1 $3; }
}
setget errorThreshold 5
it now becomes possible to consider separate error thresholds for different logs:
$ toStake data            # uses the default 5, while
$ ...
$ setget errorThreshold 12
$ toStake auth jenkins    # uses another
Always build so the "constants" are easily converted to variables.
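A simplified sketch of turning a "constant" function into a settable one. `setget_sketch` is a hypothetical stand-in for the setget function: it assumes UC upper-cases its argument and setenv exports a variable, and replaces them with `tr` and `export` so the example is self-contained.

```bash
# Hypothetical, simplified setget: UC and setenv are replaced by tr and
# export, on the assumption that this is what those helpers do.
setget_sketch () {
    local name=$1 var
    var=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
    # NAME with an argument stores it in $var; NAME always echoes $var.
    eval "$name () { [[ \$# -ge 1 ]] && export $var=\"\$1\"; echo \"\$$var\"; }"
    if [[ $# -ge 2 ]]; then "$name" "$2" > /dev/null; fi
}

setget_sketch errorThreshold 5
errorThreshold          # prints 5
errorThreshold 12       # sets the value, prints 12
errorThreshold          # prints 12
```

Callers keep writing `$(errorThreshold)` either way; only the definition changes when the constant becomes a variable.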
4.3 Conclusion
Functions lift "scripting" to a discipline of programming, and make the application malleable by design.
5 References
- http://mcgowans.org/pubs/marty3/commonplace/software/arraysProCon.html
- from my online commonplace book
- its Red Chapter