Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Road to POSIX #26

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Road to POSIX #26

wants to merge 1 commit into from

Conversation

rgcv
Copy link

@rgcv rgcv commented Jan 14, 2022

Howdy 👋

Been a hot minute. I've returned with a scratch at the surface towards converting the notify-send.sh script into a POSIX shell-compliant one. This first batch of changes covers conditional constructs.

BASH is quite comfortable and offers a series of things POSIX shell does not, such as convenient extended regular expression matching. POSIX shell is fairly poorer in this regard, but functionally equivalent alternatives exist. Here is my first proposal. It is a relatively substantial one, but I thought it would be best to break things into what I consider feature-set disparities between BASH and POSIX shell.

I'm creating a draft PR so this can be further discussed before it can be promoted to a proper PR and be more thoroughly tested as well.

@@ -78,9 +78,9 @@ make_action() {

make_hint() {
type=$(convert_type "$1")
[[ ! $? = 0 ]] && return 1
[ $? -ne 0 ] && return 1
Copy link
Author

@rgcv rgcv Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good place to start: conditionals in POSIX shell don't use double brackets, and the subset of things one is allowed to do is more limited as well. Instead, single brackets are used. Each pair of brackets is equivalent to the test(1p) command.

Here, specifically, we replace [[ ! ... = ... ]] with a numeric comparison [ ... -ne ... ] since we're testing the value of $?.

Comment on lines -217 to +222
if [[ "$1" = -* ]] && ! [[ "$positional" = yes ]] ; then
echo "Unknown option $1"
exit 1
case "$1" in
-*) if [ "$positional" != yes ] ; then
echo "Unknown option $1"
exit 1
fi ;;
esac
Copy link
Author

@rgcv rgcv Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we can't really do regular expression matching with the regular test or [ ], we have to resort to other means to replace similar constructs. Fortunately, case provides some fairly basic matching capabilities that we can benefit from.

We first check if $1 is prefixed with - much in the same way as in the old if statement. If that is the case, we check if $positional is not yes, using a familiar != operator, meaning much the same as it does in other languages.

The difference between using = or != and using -eq, -ne, -gt, and co, is that the former apply only to strings while the latter are for numeric comparison.

Comment on lines -230 to +232
while (( $# > 0 )) ; do
while [ $# -gt 0 ] ; do
Copy link
Author

@rgcv rgcv Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of bracket expansion does not exist in POSIX shell. The replacement is test's numeric comparison operators. In this case, -gt, for greater-than. An quasi-symmetric -lt exists, as well as x-than-or-equal variants too.

Comment on lines -240 to 250
-u|--urgency|--urgency=*)
[[ "$1" = --urgency=* ]] && urgency="${1#*=}" || { shift; urgency="$1"; }
process_urgency "$urgency"

-u|--urgency)
shift
process_urgency "$1"
;;
--urgency=*)
process_urgency "${1#*=}"
;;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are a bit trickier to handle and, for some situations, the solution is clean. However, in cases where the handling is a bit more convoluted than just checking the --option=* variant, one would probably need at least one extra repeated case statement and it starts polluting the option parsing section.

My recommendation would be replacing that kind of handling and validation with functions, much like this process_urgency, and sweep away all of that mess away from here.

Copy link
Author

@rgcv rgcv Jan 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a more convoluted approach, but it works for any kind of --option=*-like option

while [ $# -gt 0 ]; do
  case "$1" in
    --[[:lower:]]*=*)
      __opt="${1%=*}"
      __val="${1#*=}"
      shift
      set -- ignored "$__opt" "$__val" "$@"
      unset __opt __val
      ;;
  # other options
  esac
shift
done

Let's break it up:

  1. The pattern --[[:lower:]]*=* is equivalent to the --[a-z].*=.* extended regular expression, though some character classes take LC_CTYPE into consideration. [:lower:] could be replaced with a-z:
    • Not the strongest of checks, but a decent one to start with. Further validation could incur after an initial match.
  2. We break the option and the value into temporary variables (__opt and __val resp.) for later usage;
  3. shift the now parsed option
  4. Use set -- to set positional argument values:
    • We set the first one ($1) to some mock value;
    • $2 and $3 become the option and the value, respectively;
    • The rest is set as whatever still needs to be processed;
  5. We clean up (unset) the temporary variables

What happens next is we break out of the case, the mock value we set gets shifted, and we are going to hit the case (if one exists) that covers option we parsed. The rest is business as usual.

Say the following arguments are passed in

notify-send.sh --option=value summary body

Here, $1 is --option=value, and $@ would be summary body. Once we parse $1, we get __opt=--option and __val=value. We shift, resulting in the following situation:

notify-send.sh summary body

However, we then reset the positional arguments, putting them in the following state:

notify-send.sh ignored --option value summary body

Then, after we clean up and break out, the ignored mock argument is shifted and we end up with:

notify-send.sh --option value summary body

🎉

notify-send.sh Outdated
Comment on lines 258 to 263
case "${EXPIRE_TIME#-*}" in
*[![:digit:]]*|"")
echo "Invalid expire time: ${EXPIRE_TIME}"
exit 1
;;
esac
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once more, since we don't have extended (or even basic, it's just a subset) regular expression matching in POSIX shell, we must make due with what we have.

We first strip a potential leading - minus sign from the candidate numeric value. Then, we test if there it is either empty or if there is a character present that isn't ! in the [:digit:] (could be replaced with [0-9], honestly) character class (not as convenient as \d, but gets the job done). In case we find an intruder, we report the error. So instead of checking against a numeric pattern inclusively, we invert the check. Works beautifuly 👍

Comment on lines +253 to +256
case "$1" in
--expire-time=*) EXPIRE_TIME="${1#*=}" ;;
*) shift; EXPIRE_TIME="$1" ;;
esac
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a prime example of what I just mentioned, where we need the extra case to check the different variant so we don't inadvertently shift arguments or repeat option parsing/validation logic in a different branch just for the --option=* variant.

@rgcv
Copy link
Author

rgcv commented Jan 15, 2022

Oh, as an added note, instead of using cases for some of the pattern matching, one could resort to expr(1p).

edit: in fact, just a bit of exploration lead me to an interesting and clearer alternative for matching numeric values using expr.

expr "$variable" : '-\{,1\}[[:digit:]]\{1,\}$'

One can already notice the anchor $ at the end, but a missing ^ at the beginning. As per expr(1p):

EXTENDED DESCRIPTION
    ...
    Matching Expression
    ...... <initial excerpt of paragraph omitted for brevity's sake> ......  Regular expression syn‐
    tax shall be that defined in the Base Definitions volume of  POSIX.1‐2017,  Section  9.3,  Basic
    Regular  Expressions, except that all patterns are anchored to the beginning of the string (that
    is, only sequences starting at the first character of a string are matched by  the  regular  ex‐
    pression)  and, therefore, it is unspecified whether '^' is a special character in that context.
    ...

The pattern, broken down:

-           # the '-' literal
\{,1\}      # at most once (= <obj>?)
[[:digit:]] # decimal digit character class
\{1,\}      # at least once (= <obj>+ or <obj><obj>*)
$           # end anchor

which is equivalent to the expression already used in the script (/^-?[0-9]+$/).

Here are a quick couple of (paranoia satiation) tests;

$ pat='-\{,1\}[[:digit:]]\{1,\}$'
$ expr '1234' : "$pat"
4 # number of characters matched
$ expr '-1234' : "$pat"
5
$ expr ' 1234' : "$pat"
0
$ expr 'aaa1234' : "$pat"
0
$ expr '1234aaa' : "$pat"
0
$ expr '1234 ' : "$pat"
0
$ expr '-1234 ' : "$pat"
0
$ expr '-12aa34' : "$pat"
0

@rgcv
Copy link
Author

rgcv commented Jan 15, 2022

Here's an example pitting case vs. expr:

case ${variable#-} in
  *[![:digit:]]*) echo "not ok"; exit 1 ;;
esac
# versus
if expr "$variable" : '-\{,1\}[[:digit:]]\{1,\}' = 0; then
  echo "not ok"
  exit 1
fi

It's a matter of wanting to do it using pure shell or resorting to expr.

@rgcv rgcv force-pushed the posix-compliance-1 branch from bd4721e to bef907e Compare January 15, 2022 01:40
@rgcv rgcv force-pushed the posix-compliance-1 branch from bef907e to 183bd4d Compare January 15, 2022 02:05
@rgcv rgcv changed the title Road to POSIX: Conditions Road to POSIX Jan 19, 2022
@eylles
Copy link

eylles commented Jan 29, 2023

hmmm, at a glance it looks good, tho for some of the still hard regex stuff what do you think about using some awk magic?

iirc the one true awk that should be standar for POSIX supports a rather nice subset of regex so it can be abused on functions to pass it some input and then pour awk's output into other functions and vars.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants