Examples of using basic Linux regular expressions
Linux regular expressions are very useful for extracting text from files, filtering the output of commands, scripting tasks and so on. I believe regular expressions are a bit difficult to master, as they can be used in a myriad of ways. Some syntax is deprecated, while some is still kept for backward compatibility.
We will go through some of the most used and most useful ways of working with regular expressions here. Let me make one thing very clear: regular expressions, in Linux and in any programming language, are a very large topic in themselves, and it is beyond the scope of this blog post to cover them completely.
Comments and suggestions are always welcome.
1. Use of anchor characters ^ and $
^ (caret symbol) anchors the match to the beginning of the line.
$ (dollar symbol) anchors the match to the end of the line.
Let's have a look at an example.
I have a file named "testing" containing a handful of short lowercase lines and three blank lines; all of its lines appear in the outputs below.
So to grep out lines beginning with the character "a", I will do the following.
[root@myvm1 ~]# grep "^a" testing
a
abc
abcd
abcdef
abcdefg
Now, to find lines with no characters, that is blank lines, we will do the following (the -c option prints the number of matching lines).
[root@myvm1 ~]# grep -c "^$" testing
3
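If you want to try the anchors from this section without touching a real file, here is a small sketch. The sample lines are my own assumption (they are not the contents of "testing"), and the variable names are only for illustration.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a\nabc\n\nsd\n^top\nbanana\n' > "$sample"

starts_with_a=$(grep -c '^a' "$sample")      # lines beginning with a
ends_with_a=$(grep -c 'a$' "$sample")        # lines ending with a
blank=$(grep -c '^$' "$sample")              # lines with nothing between start and end
starts_with_caret=$(grep -c '^^' "$sample")  # first ^ is the anchor, second ^ is literal

echo "$starts_with_a $ends_with_a $blank $starts_with_caret"   # prints: 2 2 1 1
rm -f "$sample"
```

Single quotes are used around the patterns so that the shell leaves $ and ^ alone and hands them to grep unchanged.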
Running grep "^$" without the -c option prints the blank lines themselves on the screen, which is useless.
"^a" - will match lines beginning with a
"a$" - will match lines ending with a
"^^" - will match lines beginning with a caret symbol
2. Find lines with only one character
[root@myvm1 ~]# grep "^.$" testing
a
d
In the example above, "^.$", the "." matches exactly one character of any kind, so the whole expression matches lines that consist of a single character.
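The same idea extends to other fixed lengths by repeating the dot. A minimal sketch, assuming made-up sample lines rather than the real "testing" file:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a\nabc\n\nsd\nd\n' > "$sample"

one_char=$(grep '^.$' "$sample")    # lines that are exactly one character long
two_chars=$(grep '^..$' "$sample")  # lines that are exactly two characters long

echo "$one_char"    # prints a and d on separate lines
echo "$two_chars"   # prints sd
rm -f "$sample"
```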
3. Match a set or range of characters
Matching one character out of a set or range can be done with the bracket expression [...].
If you want to match any line containing only a single digit, you can do that with the following expression.
[root@myvm1 ~]# grep "^[0123456789]$" testing
[root@myvm1 ~]#
In our case it did not show any output, as the file "testing" does not contain any line that consists of exactly one digit.
Now if you want to match lines containing at least one character that is either a letter or a digit, you can do the following.
[root@myvm1 ~]# grep "[0-9a-zA-Z]" testing
a
abc
abcd
abcdef
abcdefg
sd
sd
d
Note that the output does not show the empty lines in the file, because an empty line contains no character to match.
There are some points to understand before going further.
[0-9] - matches any single digit
[^0-9] - matches any single character that is not a digit (the meaning of ^ changes when it is the first character inside [] brackets; outside square brackets it means the beginning of the line)
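Since "testing" has no digits in it, the bracket expressions are easier to see on a file that does. This is only a sketch with sample lines I made up for the purpose:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a\n7\nabc\n42\n\nx9\n' > "$sample"

only_one_digit=$(grep -c '^[0-9]$' "$sample")   # the whole line is a single digit: 7
has_digit=$(grep -c '[0-9]' "$sample")          # at least one digit: 7, 42, x9
has_non_digit=$(grep -c '[^0-9]' "$sample")     # at least one non-digit: a, abc, x9
has_alnum=$(grep -c '[0-9a-zA-Z]' "$sample")    # every line except the blank one

echo "$only_one_digit $has_digit $has_non_digit $has_alnum"   # prints: 1 3 3 5
rm -f "$sample"
```

Note that "[^0-9]" does not mean "a line without digits". It means "a line with at least one character that is not a digit", which is why x9 matches both the second and the third pattern.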
4. Using the asterisk in regular expressions
The character "*" matches zero or more occurrences of the item before it. So the regular expression [0-9]* will match zero or more digits.
Let's see an example.
[root@myvm1 ~]# grep "^a*" testing
a
abc
abcd
abcdef
abcdefg
sd
sd
d
As "a*" matches zero or more occurrences of a, it also matches at the start of lines that contain no a at all, so every line matches (blank lines included) and the result is like "cat testing".
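To make the "zero or more" behaviour visible, it helps to compare "a*" with a pattern that demands at least one a. A minimal sketch, again on assumed sample lines:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a\nabc\nsd\n\nd\n' > "$sample"

total=$(grep -c '' "$sample")            # the empty pattern matches every line
zero_or_more=$(grep -c '^a*' "$sample")  # also matches every line, even the blank one
one_or_more=$(grep -c 'aa*' "$sample")   # one a followed by zero or more a: a, abc

echo "$total $zero_or_more $one_or_more"   # prints: 5 5 2
rm -f "$sample"
```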
5. Specify the minimum and maximum number of occurrences in a regex
* matches a pattern zero or more times, but it gives us no way to specify a lower or upper limit.
To specify a lower and upper limit while searching for a pattern, we can use "\{" and "\}".
Now let's understand why backslashes are used. A backslash changes the meaning of the character after it. For example, in order to match a literal period we need to write
"\." because we first need to turn off the special meaning of a plain "." (any single character).
Likewise a literal asterisk is matched by \*, otherwise grep will think that you are looking for zero or more occurrences (the original meaning of *). In basic regular expressions it works the other way round for braces: plain { and } are literal characters, and the backslash turns them into the repetition operator.
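The effect of the backslash on "." and "*" can be checked with a few lines. This is a sketch on made-up sample data; the -F option used at the end tells grep to treat the whole pattern as a fixed string.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a.c\nabc\na*c\naac\n' > "$sample"

any_char=$(grep -c 'a.c' "$sample")         # . is "any character": all four lines
literal_dot=$(grep -c 'a\.c' "$sample")     # \. is a real period: only a.c
literal_star=$(grep -c 'a\*c' "$sample")    # \* is a real asterisk: only a*c
fixed_string=$(grep -cF 'a.c' "$sample")    # -F disables regex syntax: only a.c

echo "$any_char $literal_dot $literal_star $fixed_string"   # prints: 4 1 1 1
rm -f "$sample"
```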
[root@myvm1 ~]# grep "s\{1,2\}" testing
sd
sd
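The example above matches any line that contains "s" at least once. \{1,2\} sets the lower and upper limit on how many consecutive "s" characters one match may use, but since the pattern is not anchored, a line with three or more "s" in a row would still match. To really cap the count, put something around the repetition, for example "^s\{1,2\}d".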
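The difference between an unanchored and an anchored interval is easy to miss, so here is a small sketch. The sample lines are an assumption of mine, chosen so that the counts differ.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'sd\nssd\nsssd\nd\n' > "$sample"

unanchored=$(grep -c 's\{1,2\}' "$sample")     # any line with an s: sd, ssd, sssd
anchored=$(grep -c '^s\{1,2\}d$' "$sample")    # exactly one or two s, then d: sd, ssd
exactly_three=$(grep -c '^s\{3\}d$' "$sample") # exactly three s, then d: sssd

echo "$unanchored $anchored $exactly_three"   # prints: 3 2 1
rm -f "$sample"
```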
6. Matching words
Matching whole words can be done with "\<" (beginning of a word) and "\>" (end of a word).
[root@myvm1 ~]# grep "\<abc\>" testing
abc
Grepping for a word can also be done with the plain regular expression "abc", but that will match the abc in abcdef as well. Here is the output of using "abc" instead of "\<abc\>".
[root@myvm1 ~]# grep "abc" testing
abc
abcd
abcdef
abcdefg
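Word matching also works when the word sits in the middle of a longer line. The sketch below uses assumed sample lines and also shows grep's -w option, which gives the same result here as wrapping the pattern in "\<" and "\>".

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'abc\nabcd\nxabc\nfoo abc bar\n' > "$sample"

substring=$(grep -c 'abc' "$sample")        # abc anywhere: all four lines
whole_word=$(grep -c '\<abc\>' "$sample")   # abc as a word: "abc" and "foo abc bar"
with_w=$(grep -cw 'abc' "$sample")          # -w asks for whole-word matches

echo "$substring $whole_word $with_w"   # prints: 4 2 2
rm -f "$sample"
```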
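7. Using the + operator with regular expressions
While the * character matches zero or more occurrences, the + character matches one or more occurrences of the preceding character. In basic regular expressions, GNU grep spells it "\+"; with grep -E it is a plain "+".
For example: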
[root@myvm1 ~]# grep "[a]\+" testing
a
abc
abcd
abcdef
abcdefg
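Here is a short sketch comparing "zero or more" with "one or more", on sample lines I made up. It uses grep -E for the + operator and the interval \{1,\} as the basic-regex equivalent, since "\+" is a GNU extension.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'a\naaa\nbcd\n\n' > "$sample"

star=$(grep -c 'a*' "$sample")            # zero or more a: every line matches
plus_ere=$(grep -cE 'a+' "$sample")       # one or more a (extended syntax): a, aaa
plus_bre=$(grep -c 'a\{1,\}' "$sample")   # one or more a (basic syntax): a, aaa

echo "$star $plus_ere $plus_bre"   # prints: 4 2 2
rm -f "$sample"
```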
8. Search in files whose names end with different patterns
Suppose you want to search for something in several files named messages, messages.1, messages.2 etc. We can do that with the following technique.
[root@myvm1 log]# grep usb messa*.[0-9]
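With the above method we search for the word only in files whose names start with messa and end with a dot followed by a single digit. Note that messa*.[0-9] is a shell filename pattern (glob) that is expanded by the shell before grep runs, not a regular expression, and that the plain file "messages" is not matched because it has no numeric suffix.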
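To see which files the shell actually hands to grep, you can echo the pattern first. The sketch below builds a throwaway directory; the file names and log lines in it are assumptions for illustration, not real log data.

```shell
# Hypothetical log files in a temporary directory.
logdir=$(mktemp -d)
echo 'usb 1-1: new device' > "$logdir/messages"
echo 'usb 2-1: new device' > "$logdir/messages.1"
echo 'eth0: link up'       > "$logdir/messages.2"
echo 'usb 3-1: old entry'  > "$logdir/messages.old"

# The shell expands the pattern before grep ever sees it.
expanded=$(cd "$logdir" && echo messa*.[0-9])

# -l prints only the names of the files that contain a match.
rotated_hits=$(cd "$logdir" && grep -l usb messa*.[0-9])
all_count=$(cd "$logdir" && grep -l usb messages* | wc -l)

echo "$expanded"       # prints: messages.1 messages.2
echo "$rotated_hits"   # prints: messages.1
rm -rf "$logdir"
```

If the current file "messages" should be searched as well, the wider pattern messages* picks it up, together with anything else that starts with that name.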
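9. Using the "|" (or) operator in search patterns
To find lines that begin with either b or s, we can use the | operator available in extended regular expressions (grep -E). Note that | must not be put inside square brackets: "[b|s]" is a character set that matches b, s or a literal | character. For single characters, "^[bs]" does the same job.
[root@myvm1 ~]# grep -E "^(b|s)" testing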
sd
sd
In the above example, only two lines in the file "testing" begin with s, and no line begins with b.
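The difference between a bracket expression and real alternation shows up as soon as a line starts with a pipe character or when the alternatives are longer than one character. A minimal sketch on assumed sample lines:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf 'sd\nbat\n|pipe\ncat\n' > "$sample"

bracket_with_pipe=$(grep -c '^[b|s]' "$sample")  # set of b, | and s: sd, bat, |pipe
bracket=$(grep -c '^[bs]' "$sample")             # set of b and s: sd, bat
alternation=$(grep -cE '^(b|s)' "$sample")       # b or s (extended syntax): sd, bat
words=$(grep -cE '^(bat|cat)' "$sample")         # whole alternatives: bat, cat

echo "$bracket_with_pipe $bracket $alternation $words"   # prints: 3 2 2 2
rm -f "$sample"
```

For single characters a bracket expression is the simpler choice. Alternation with grep -E becomes necessary once the alternatives are longer strings such as bat and cat.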
This post will be continued.....