Hacker Public Radio

Your ideas, projects, opinions - podcasted.

hpr2669 :: Additional ancillary Bash tips - 12

Making decisions in Bash (part 4)

Hosted by Dave Morriss on 2018-10-25. The show is flagged as Explicit and is released under a CC-BY-SA license.
Tags: Bash, test, regular expression. Comments: 9.
The show is available on the Internet Archive at: https://archive.org/details/hpr2669

Duration: 00:28:22

Bash Scripting.

This is an open series in which Hacker Public Radio Listeners can share their Bash scripting knowledge and experience with the community. General programming topics and Bash commands are explored along with some tutorials for the complete novice.

Making decisions in Bash

This is the twelfth episode in the Bash Tips sub-series. It is the fourth of a group of shows about making decisions in Bash.

In the last three episodes we saw the types of test Bash provides and looked briefly at some of the commands that use them. We covered conditional expressions and the operators Bash provides for building them, concentrating particularly on string comparisons that use glob and extended glob patterns.

Now we want to look at the other form of string comparison, using regular expressions.
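
As a minimal sketch of the kind of test this episode covers, the example below uses the '=~' operator inside '[[ ... ]]' and reads a capture group from the BASH_REMATCH array. The variable names 'show', 're' and 'number' and the sample string are assumptions made for illustration, not taken from the show notes.

```shell
#!/bin/bash
# Illustrative only: match an episode identifier and extract its number.
show="hpr2669"
re='^hpr([0-9]+)$'      # regex kept in a variable and used unquoted below

if [[ $show =~ $re ]]; then
    # BASH_REMATCH[0] holds the whole match, [1] the first capture group
    number=${BASH_REMATCH[1]}
    echo "Episode number: $number"
fi
```

Keeping the regular expression in a variable and expanding it unquoted is a common way to avoid shell quoting surprises on the right-hand side of '=~'.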

Long notes

I have provided detailed notes as usual for this episode, and these can be viewed here.


Comments

Comment #1 posted on 2018-10-26 10:17:21 by Mad Sweeney

Quoted Literals in Regex

Hi,

It seems the rule of quoted literals doesn't apply if the RHS is a variable. So a variable with a quoted "." would try to match a quote followed by . followed by another quote. If you wanted to match a quote in a literal RE you would have to write "." The following Bash snippet illustrates:

#!/bin/bash
v=0
for r in '^a.b$' '^a"."b$' "^a'.'b$"; do
    ((v++))
    # matches var 1 only
    [[ a.b =~ $r ]] && echo match var $v
    # matches var 2 only
    [[ 'a"."b' =~ $r ]] && echo match double quote $v
    # matches var 3 only
    [[ "a'.'b" =~ $r ]] && echo match single quote $v
    # all 3 match
    eval "[[ a.b =~ $r ]] && echo match eval $v"
done

I find the numerous ways of testing in Bash confusing. I have to look up the manual every time I come back to Bash scripting. I hope posting about it will help keep it in the brain.

--Mad

Comment #2 posted on 2018-10-26 14:12:00 by Mad Sweeney

Re: Quoted Literals in Regex

It also seems like HPR comments eats backslashes! Here's my comment showing where backslashes should be. Would be good if there was a preview comment option:

It seems the rule of quoted literals doesn't apply if the RHS is a variable. So a variable with a quoted "." would try to match a quote followed by . followed by another quote. If you wanted to match a quote in a literal RE you would have to write {backslash}"{backslash}.{backslash}" A literal RE "." would be like unquoted {backslash}. The following Bash snippet illustrates:

#!/bin/bash

v=0
for r in '^a{backslash}.b$' '^a"."b$' "^a'.'b$"; do
    ((v++))
    # matches var 1 only
    [[ a.b =~ $r ]] && echo match var $v
    # matches var 2 only
    [[ 'a"."b' =~ $r ]] && echo match double quote $v
    # matches var 3 only
    [[ "a'.'b" =~ $r ]] && echo match single quote $v
    # all 3 match
    eval "[[ a.b =~ $r ]] && echo match eval $v"
done
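
For anyone wanting to run the snippet, here is a sketch with a real backslash in place of the {backslash} placeholder. The logic is the commenter's; collecting the outcomes in an array named 'results' (an assumption added here) just makes the result easy to inspect in one place.

```shell
#!/bin/bash
# Runnable variant of the loop above, with a real backslash in the first pattern.
results=()
v=0
for r in '^a\.b$' '^a"."b$' "^a'.'b$"; do
    v=$((v + 1))
    # Unquoted $r is treated as a regex; quote characters inside it are
    # ordinary characters that must appear in the string being tested.
    [[ a.b =~ $r ]]     && results+=("var $v")
    [[ 'a"."b' =~ $r ]] && results+=("double quote $v")
    [[ "a'.'b" =~ $r ]] && results+=("single quote $v")
    # eval re-parses the text, so quotes held in $r act as shell quoting again
    eval "[[ a.b =~ $r ]]" && results+=("eval $v")
done
printf 'match %s\n' "${results[@]}"
```

Each pattern matches only its "own" string in the three plain tests, while the eval test succeeds for all three, as the comments in the original snippet state.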

Comment #3 posted on 2018-10-26 15:23:27 by Stuart Little

quoting portions of regex

Re: the previous comment by Mad Sweeney:

You can quote portions of variables on the RHS just fine, but for the match to work the overall pattern you're trying to match must not be enclosed in *outer* quotes. So for instance, the following modification of your script works fine (matches):

---
server="hackerpublicradio.org"

for re in publicradio"."org
do
    echo "Using regular expression: $re"
    if [[ $server =~ $re ]]; then
        echo "This is HPR"
    else
        echo "No match"
    fi
done
---

Note that there are no outside quotes on publicradio"."org.

The issue was visible from the echoes given out by bash. When you received the message

Using regular expression: ^(hacker|hobby)publicradio"."org$
No match

you can see bash was searching for actual quotes around the period, which of course are not there in the string $server.
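
One point that may be worth checking here: in the loop above the shell removes the quotes from publicradio"."org while building the word list, so the variable ends up holding a plain dot, which is still a metacharacter when the variable is used as a regex. The sketch below illustrates this; the test string 'hackerpublicradioXorg' and the variable names 'stored', 'first', 'second' and 'third' are assumptions added for the demonstration.

```shell
#!/bin/bash
server="hackerpublicradio.org"

for re in publicradio"."org; do
    stored=$re      # quote removal has already happened: no quotes remain
    # The real server name matches ...
    [[ $server =~ $re ]] && first="match" || first="no match"
    # ... but so does a name with 'X' where the dot should be,
    # because the dot in $re is an ordinary regex metacharacter.
    [[ hackerpublicradioXorg =~ $re ]] && second="match" || second="no match"
done

# Written directly on the right-hand side, the quoted dot is literal.
[[ hackerpublicradioXorg =~ publicradio"."org ]] && third="match" || third="no match"

echo "re holds: $stored"
echo "server: $first, X variant: $second, X variant with literal RHS: $third"
```

So the loop version matches, but not because the dot has been made literal.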

Comment #4 posted on 2018-10-26 22:23:16 by Mad Sweeney

Re: Quoted Literals in Regex

The quirk Dave refers to is that you can remove the meta-status of a character in a literal RHS by quoting it so abc'.'def only matches abc.def but not abcxdef, and that it seems there is no way to do that using a regex in a variable: in a variable you only have the traditional backslash escape which you can also use in a literal regex.
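
To illustrate the point about regexes held in variables, here is a small sketch. The helper function 'check' and the variable names are hypothetical, added for illustration. Besides the backslash escape it also tries a bracket expression, '[.]', which is standard in POSIX extended regular expressions, and shows that quoting the whole expansion makes Bash compare it as a plain string.

```shell
#!/bin/bash
# check STRING REGEX: report whether STRING matches REGEX (used unquoted)
check() {
    if [[ $1 =~ $2 ]]; then echo "Match"; else echo "No match"; fi
}

re_backslash='abc\.def'   # backslash escape stored in the variable
re_bracket='abc[.]def'    # bracket expression containing only a dot
re_plain='abc.def'        # unescaped dot: matches any character

r1=$(check abc.def "$re_backslash")   # Match
r2=$(check abcxdef "$re_backslash")   # No match
r3=$(check abc.def "$re_bracket")     # Match
r4=$(check abcxdef "$re_bracket")     # No match
r5=$(check abcxdef "$re_plain")       # Match

# Quoting the expansion turns the whole right-hand side into a literal string
if [[ abcxdef =~ "$re_plain" ]]; then r6="Match"; else r6="No match"; fi

printf '%s\n' "$r1" "$r2" "$r3" "$r4" "$r5" "$r6"
```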

--Mad

Comment #5 posted on 2018-10-27 10:09:51 by Dave Morriss

Thanks for the combined wisdom being directed at my question

Thanks to Mad Sweeney and Stuart Little for commenting on this issue.

In the light of your comments my simple tests were these:

$ [[ 'axb' =~ a.b ]] && echo "Match"
Match
- The RE on the right uses '.' as a metacharacter

$ [[ 'axb' =~ a'.'b ]] && echo "Match"
- The "meta-ness" of the '.' is removed by quoting, so no match

$ [[ 'a.b' =~ a'.'b ]] && echo "Match"
Match
- Proving that a literal match works

$ re="a'.'b"
$ [[ 'a.b' =~ $re ]] && echo "Match"
- Now the match fails if the RE is in a variable

$ eval "[[ 'a.b' =~ $re ]] && echo Match"
Match
- Following Mad Sweeney's lead, the 'eval' substitutes in the contents of '$re' so it looks to the extended test like the literal string we used earlier, and thus it works.

My working hypothesis is that the Bash logic processing this can deal with quoted metacharacters in a "bare string" but isn't used when the RE is in a variable - or maybe in any case where expansion is needed to provide the RHS argument.

You'd have to think this was a bug I guess.
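
A possible alternative reading, offered tentatively: the Bash manual describes quote removal as applying only to quote characters that did not result from an expansion, so quotes stored in a variable are ordinary data everywhere, not only on the right-hand side of '=~'. The sketch below shows the same effect outside the extended test; the variable names 'plain', 'evald', 'with_quotes' and 'without' are assumptions added for illustration.

```shell
#!/bin/bash
re="a'.'b"

# Ordinary expansion: the single quotes survive as characters
plain=$(printf '%s' $re)

# eval re-parses the text, so this time the quotes are removed
evald=$(eval "printf '%s' $re")

# As a regex, $re therefore needs the quote characters in the string itself
[[ "a'.'b" =~ $re ]] && with_quotes="Match" || with_quotes="No match"
[[ 'a.b' =~ $re ]]   && without="Match"     || without="No match"

echo "plain: $plain"
echo "evald: $evald"
echo "string with quotes: $with_quotes, string without: $without"
```

On this reading the behaviour seen in the tests is the documented one, and a backslash in the variable is the way to ask for a literal dot.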

Comment #6 posted on 2018-10-27 10:31:10 by Dave Morriss

Backslashes in comments

Yes, there's a bug in the comment code (or what I call a bug).

I think that, in the spirit of avoiding the "Little Bobby Tables" error the comment text is being sanitised, but the sanitisation includes backslash removal.

You can include a backslash at the moment, but you need to double it: backslash '\'

We'll have a look at this issue.

Dave

Comment #7 posted on 2018-10-27 21:37:10 by Mad Sweeney

Not just backslashes

It's eating ampersands too! Grrrrrrrrrrrrr!

Comment #8 posted on 2018-10-27 22:00:20 by Dave Morriss

Comments eating ampersands?

I don't see evidence of ampersand eating. Could you point to an example?

My earlier comment #5 had ampersands galore and they are all visible, unless I'm missing something. They are being turned into HTML entities of course, but that's what you'd expect.

Comment #9 posted on 2018-10-27 23:59:10 by Mad Sweeney

Re: Comments eating ampersands?

Apologies Dave, it's a bug in the screen reader: reading one ampersand where there are two. [I must dump all this proprietary as soon as possible.]
