ARRAYS, LISTS AND HASHES By SANA MATEEN
ARRAYS
• An array is a collection of scalar data items that has an assigned storage space in memory and can therefore be accessed using a variable name.
• The difference between arrays and hashes is that the constituent elements of an array are identified by a numerical index, which starts at zero for the first element.
• An array variable name always starts with @, e.g. @days_of_week.
• An array stores a collection, and a list is a collection, so it is natural to assign a list to an array, e.g.
    @rainfall = (1.2, 0.4, 0.3, 0.1, 0, 0, 0);
  This creates an array of seven elements, which can be accessed as $rainfall[0], $rainfall[1], ... $rainfall[6].
• A list can also occur as an element of another list, in which case it is flattened into the enclosing list:
    @foo = (1, 2, 3, "string");
    @foobar = (4, 5, @foo, 6);
  This gives @foobar the value (4, 5, 1, 2, 3, "string", 6).
MANIPULATING ARRAYS
• Elements of an array are selected using C-like square bracket syntax, e.g. $bar = $foo[2];
• The $ and [ ] make it clear that this instance of foo is an element of the array @foo, not the scalar variable $foo.
• A group of elements (often contiguous) selected by a list of indices is called a slice, and is accessed using a simple syntax: @foo[1..3] is the same as the list ($foo[1], $foo[2], $foo[3]).
• A slice can be used as the destination of an assignment, e.g.
    @foo[1..3] = ("hop", "skip", "jump");
• Array variables and lists can be used interchangeably in almost any sensible situation:
    $front = ("bob", "carol", "ted", "alice")[0];
    @rest = ("bob", "carol", "ted", "alice")[1..3];
  or even
    @rest = (qw/bob carol ted alice/)[1..3];
• Elements of an array can be selected by using another array as the selector:
    @foo = (7, "fred", 9);
    @bar = (2, 1, 0);
    @foo = @foo[@bar];    # @foo is now (9, "fred", 7)
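The slice operations described above can be tried out with a short script. This is a minimal sketch; the array contents and the variable names @middle, @order and @reversed are made up for illustration.

```perl
use strict;
use warnings;

my @foo = (7, "fred", 9, "x", "y");

# Read a slice: elements 1, 2 and 3.
my @middle = @foo[1..3];

# Assign to a slice: overwrite elements 1, 2 and 3 in place.
@foo[1..3] = ("hop", "skip", "jump");

# Select elements using another array as the index list.
my @order    = (2, 1, 0);
my @reversed = @foo[@order];

print "middle:   @middle\n";
print "foo:      @foo\n";
print "reversed: @reversed\n";
```

Note the @ sigil on a slice: the result is a list, even though a single element is written with $.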
LISTS
• A list is a collection of variables, constants or expressions that is to be treated as a whole. It is written as a comma-separated sequence of values, e.g. "red", "green", "blue"
• A list often appears in a script enclosed in round brackets: ("red", "green", "blue")
• Shorthand used in lists (the range operator): (1..8) and ("A".."H", "O".."Z").
• To save tedious typing, qw(the quick brown fox) is shorthand for ("the", "quick", "brown", "fox").
• qw, the quote words operator, is an obvious extension of the q and qq operators. Other delimiters can be used: qw/the quick brown fox/ or qw|the quick brown fox|
• A list containing variables can appear as the target of an assignment and/or as the value to be assigned:
    ($a, $b, $c) = (1, 2, 3);
MANIPULATING LISTS
• Perl provides several built-in functions for list manipulation. Three useful ones are:
  a) shift ARRAY: returns the first item of the array and moves the remaining items down, reducing the size of the array by 1.
  b) unshift ARRAY, LIST: the opposite of shift; puts the items in LIST at the beginning of ARRAY, moving the original contents up by the required amount.
  c) push ARRAY, LIST: similar to unshift, but adds the values in LIST to the end of ARRAY.
ITERATING OVER LISTS
• Perl provides a number of mechanisms to achieve this: I. foreach II. map III. grep
• foreach loop: performs a simple iteration over all the elements of a list.
    foreach $item (list) { ... }
  The block is executed once for each value in the list, with $item set to that value. If the loop variable is omitted, $_ is used:
    foreach (@array) {
        ....    # process $_
    }
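As an illustration of shift, unshift, push and foreach working together, here is a minimal sketch; the array @queue and its values are invented for the example.

```perl
use strict;
use warnings;

my @queue = (2, 3, 4);

my $first = shift @queue;   # removes and returns 2; @queue is now (3, 4)
unshift @queue, 0, 1;       # adds at the front;     @queue is now (0, 1, 3, 4)
push @queue, 5, 6;          # adds at the end;       @queue is now (0, 1, 3, 4, 5, 6)

# Iterate over the result with an explicit loop variable.
my $total = 0;
foreach my $item (@queue) {
    $total += $item;
}

print "first=$first queue=@queue total=$total\n";
```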
• map: Perl provides a built-in function map that evaluates an expression for each element of a list and returns the list of results. For example, to create plural forms of words:
    @p1 = map $_ . 's', @s;
• The general forms of map are: map EXPRESSION, LIST; and map BLOCK LIST;
• We can also use a foreach loop to achieve the same:
    @s = qw/cat dog rabbit hamster rat/;
    @p1 = ();
    foreach (@s) { push @p1, $_ . 's' }
• grep: In UNIX, grep is used to print all lines of a file that contain an instance of a pattern:
    grep pattern file
  The Perl grep function takes a pattern and a list and returns a new list containing all the elements of the original list that match the pattern. E.g.
    @things = qw(car bus cardigan jumper carrot);
    grep /car/, @things
  returns the list (car, cardigan, carrot).
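The block forms of map and grep can be sketched as follows. The word lists come from the slide; the variable names @plurals, @lengths, @car_words and @short are assumptions made for this example.

```perl
use strict;
use warnings;

my @s = qw/cat dog rabbit hamster rat/;

# map BLOCK LIST: build a new list by transforming each element ($_).
my @plurals = map { $_ . 's' } @s;
my @lengths = map { length $_ } @s;

# grep BLOCK LIST: keep only the elements for which the block is true.
my @things    = qw(car bus cardigan jumper carrot);
my @car_words = grep { /car/ } @things;
my @short     = grep { length($_) <= 3 } @s;

print "@plurals\n";
print "@lengths\n";
print "@car_words\n";
print "@short\n";
```

The block given to grep need not be a pattern match; any expression that yields true or false will do.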
HASHES
• A hash is a set of key/value pairs. Hash variables are preceded by a percent (%) sign. To refer to a single element of a hash, use the hash variable name preceded by a "$" sign and followed by the key associated with the value in curly brackets.
• Here is a simple example of using hash variables:
    #!/usr/bin/perl
    %data = ('John Paul', 45, 'Lisa', 30, 'Kumar', 40);
    print "\$data{'John Paul'} = $data{'John Paul'}\n";
    print "\$data{'Lisa'} = $data{'Lisa'}\n";
    print "\$data{'Kumar'} = $data{'Kumar'}\n";
• This will produce the following result:
    $data{'John Paul'} = 45
    $data{'Lisa'} = 30
    $data{'Kumar'} = 40
• The hash holds these key/value pairs:
    Key          Value (age)
    John Paul    45
    Lisa         30
    Kumar        40
CREATING HASHES
• We can assign a list to an array, so it is not surprising that we can assign a list of key-value pairs to a hash, for example:
    %foo = (key1, value1, key2, value2, .....);
• An alternative syntax is provided using the => operator to associate key-value pairs:
    %foo = (banana => 'yellow', apple => 'green', ...)
MANIPULATING HASHES
• Perl provides a number of built-in functions to facilitate manipulation of hashes. If we have a hash called %magic:
    keys %magic      returns a list of the keys of the elements in the hash.
    values %magic    returns a list of the values of the elements in the hash.
• These functions provide a way to iterate over the elements of a hash using foreach:
    foreach $key (keys %magic) { do something with $magic{$key} }
• If the explicit loop variable is omitted, the anonymous variable $_ is assumed:
    foreach (keys %magic) { process $magic{$_}; }
• An alternative is to use the each operator, which delivers successive key-value pairs from a hash:
    while (($key, $value) = each %magic) { ... }
• Other useful operators for manipulating hashes are delete and exists.
    delete $magic{$key}    removes the element whose key matches $key from the hash %magic, and
    exists $magic{$key}    returns true if the hash %magic contains an element whose key matches $key.
• A common idiom is
    exists($h{'key'}) && do { statements };
  to avoid using an if statement.
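The hash functions above can be exercised together in a small script. This is only a sketch: the contents of %magic and the names @sorted_keys, $sum and $note are invented for illustration.

```perl
use strict;
use warnings;

my %magic = (abracadabra => 1, hocus => 2, pocus => 3);

# keys returns the keys in no particular order, so sort them for stable output.
my @sorted_keys = sort keys %magic;

# values returns the stored values; add them up.
my $sum = 0;
$sum += $_ foreach values %magic;

# delete removes one key/value pair; exists tests for a key.
delete $magic{hocus};
my $has_hocus = exists $magic{hocus} ? 1 : 0;
my $has_pocus = exists $magic{pocus} ? 1 : 0;

# The "exists && do" idiom in place of an if statement.
my $note = '';
exists($magic{pocus}) && do { $note = 'pocus is present' };

foreach my $key (sort keys %magic) {
    print "$key => $magic{$key}\n";
}
print "sum=$sum has_hocus=$has_hocus has_pocus=$has_pocus note=$note\n";
```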
STRINGS, PATTERNS AND REGULAR EXPRESSIONS BY SANA MATEEN
INTRODUCTION TO REGULAR EXPRESSIONS
• A regular expression is a way of defining patterns: a notation for describing a set of strings.
• Among the earliest applications of regular expressions in computer systems were the text editor ed and the stream editor sed in the UNIX system.
• Perl provides very powerful and dynamic string manipulation based on the usage of regular expressions.
• Pattern match: searching for a specified pattern within a string. For example:
    - finding a sequence motif,
    - finding the accession number of a sequence,
    - parsing HTML,
    - validating user input.
• Regular expression (regex): how to make a pattern match.
HOW REGEX WORK
[Figure: diagram with the elements regex code, Perl compiler, regex engine, input data (e.g. sequence file) and output.]
SIMPLE PATTERNS
• Place the regex between a pair of forward slashes ( / / ).
• Try:
    #!/usr/bin/perl
    while (<STDIN>) {
        if (/abc/) {
            print ">> found 'abc' in $_\n";
        }
    }
• Save then run the program. Type something on the terminal then press return. Press Ctrl+C to exit the script.
• If you type anything containing 'abc', the print statement is executed.
STAGES
1. The characters | ( ) [ { ^ $ * + ? . and \ are metacharacters with special meanings in regular expressions. To use a metacharacter in a regular expression without its special meaning, it must be escaped with a backslash. ] and } are also metacharacters in some circumstances.
2. Apart from the metacharacters, any single character in a regular expression matches itself, so /cat/ matches the string cat.
3. The metacharacters ^ and $ act as anchors:
    ^   matches the start of the line
    $   matches the end of the line
   So the regex /^cat/ matches the string cat only if it appears at the start of the line, and /cat$/ matches only at the end of the line. /^cat$/ matches a line that contains only the string cat, and /^$/ matches an empty line.
4. The metacharacter dot (.) matches any single character except newline, so /c.t/ matches cat, cot, cut, etc.
STAGES
5. A character class is a set of characters enclosed in square brackets. It matches any single character from those listed. So /[aeiou]/ matches any vowel, and /[0123456789]/ or /[0-9]/ matches any digit.
6. A character class of the form /[^....]/ matches any character except those listed, so /[^0-9]/ matches any non-digit.
7. Inside a character class the minus sign denotes a range. To remove this special meaning, escape it with a backslash, e.g. to match the arithmetic operators: /[+\-*\/]/ (the / is escaped as well because it is the pattern delimiter).
8. Repetition of characters in a regular expression can be specified by the quantifiers:
    *   zero or more occurrences
    +   one or more occurrences
    ?   zero or one occurrence
9. Thus /[0-9]+/ matches an unsigned decimal number, and /a.*b/ matches a substring starting with 'a' and ending with 'b', with an indefinite number of other characters in between.
FACILITIES
1. Alternation: |
   If RE1, RE2, RE3 are regular expressions, RE1|RE2|RE3 will match any one of the components.
2. Grouping: ( )
   Round brackets can be used to group items: /pitt the (elder|younger)/
3. Repetition counts
   Explicit repetition counts can be added to a component of a regular expression: /(wet[ ]){2}wet/ matches 'wet wet wet'. The full list of possible count modifiers is:
    {n}     must occur exactly n times
    {n,}    must occur at least n times
    {n,m}   must occur at least n times but no more than m times
4. Regular expression example
   A simple regex to check for an IP address:
    ^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$
   This checks only the form (four groups of one to three digits separated by dots); it does not check that each group lies in the range 0 to 255.
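The patterns from these two slides can be checked against a few sample strings. This is a minimal sketch; the test strings and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Form-only IP check from the slide; the dots are escaped so they match literally.
my $ip_re = qr/^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$/;
my @candidates    = ('192.168.0.1', '10.0.0', '999.1.1.1', 'a.b.c.d');
my @looks_like_ip = grep { $_ =~ $ip_re } @candidates;

# Grouping with alternation; the bracketed part is captured in $1.
my $pitt = 'pitt the younger' =~ /pitt the (elder|younger)/ ? $1 : '';

# Explicit repetition count.
my $wet = 'wet wet wet' =~ /(wet[ ]){2}wet/ ? 1 : 0;

# Character class with an escaped minus sign and an escaped delimiter.
my @ops = grep { /^[+\-*\/]$/ } ('+', '-', '*', '/', '%');

print "IP-like: @looks_like_ip\n";
print "pitt: $pitt, wet: $wet, operators: @ops\n";
```

Note that 999.1.1.1 passes the check, which shows that the pattern validates the shape of the string rather than the numeric range.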
FACILITIES
5. Non-greedy matching
   A pattern including .* matches the longest string it can find. The pattern .*? can be used when the shortest match is required: a ? placed after a quantifier requests the shortest match.
6. Shorthand
   This notation is provided for frequently occurring character classes:
    \d   matches a digit
    \w   matches a word character
    \s   matches whitespace
    \D   matches any non-digit character
   Capitalization of the notation reverses the sense.
7. Anchors
    \b   word boundary
    \B   not a word boundary
   /\bJohn/ matches both the target strings John and Johnathan.
8. Back references
   Round brackets define a series of partial matches that are remembered for use in subsequent processing (as $1, $2, ...) or in the regex itself (as \1, \2, ...).
9. The match operator
   The match operator, m//, is used to match a string or statement to a regular expression. For example, to match the character sequence "foo" against the scalar $bar, you might use a statement like this:
    if ($bar =~ /foo/)
   Note that the entire match expression, that is, the expression on the left of =~ or !~ and the match operator, returns true (in a scalar context) if the expression matches.
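Greedy versus non-greedy matching, the shorthand classes, word boundaries and back references can be compared side by side. The strings below are invented for this sketch.

```perl
use strict;
use warnings;

my $text = 'a1b2b';

# Greedy: .* takes as much as it can. Non-greedy: .*? takes as little as it can.
my ($greedy) = $text =~ /(a.*b)/;
my ($lazy)   = $text =~ /(a.*?b)/;

# Shorthand classes with captures: \d is a digit.
my ($year, $month) = '2024-05' =~ /(\d{4})-(\d{2})/;

# Word boundary: John must start a word.
my @johns = grep { /\bJohn/ } ('John', 'Johnathan', 'StJohn');

# Back reference inside the regex: \1 repeats whatever group 1 matched.
my ($doubled) = 'this is is a test' =~ /\b(\w+) \1\b/;

print "greedy=$greedy lazy=$lazy\n";
print "year=$year month=$month\n";
print "johns=@johns doubled=$doubled\n";
```

In list context a match returns the captured groups, which is why each result is assigned inside parentheses.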
BINDING OPERATOR
• The previous example matched against $_.
• Want to match against a scalar variable?
• The binding operator "=~" matches the pattern on the right against the string on the left.
• Usually add the m operator for clarity of code:
    $string =~ m/pattern/
MATCHING ONLY ONCE
• There is also a simpler version of the match operator: the ?PATTERN? operator. In current Perl it must be written m?PATTERN?; the bare ?PATTERN? form was removed in Perl 5.22.
• This is basically identical to the m// operator except that it only matches once within the string you are searching between each call to reset.
• To remember which portion of the string matched we use $1, $2, $3, etc.
• For example, you can use this to get the first and last matching elements within a list:
    #!/usr/bin/perl
    @list = qw/food foosball subeo footnote terfoot canic footbrdige/;
    foreach (@list) {
        $first = $1 if m?(foo.*)?;
        $last = $1 if /(foo.*)/;
    }
    print "First: $first, Last: $last\n";
• This will produce the following result:
    First: food, Last: footbrdige
THE SUBSTITUTION OPERATOR
• The substitution operator, s///, is really just an extension of the match operator that allows you to replace the text matched with some new text. The basic form of the operator is:
    s/PATTERN/REPLACEMENT/;
• The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is a specification for the text or expression that we want to use to replace the found text with.
• For example, we can replace the first occurrence of "dog" with "cat" using
    $string =~ s/dog/cat/;
  Add the g modifier (s/dog/cat/g) to replace all occurrences.
• Another example:
    #!/usr/bin/perl
    $string = 'The cat sat on the mat';
    $string =~ s/cat/dog/;
    print "Final Result is $string\n";
• This will produce the following result:
    Final Result is The dog sat on the mat
PATTERN MATCHING MODIFIERS
• m//i : ignore case when pattern matching.
• m//g : match globally, i.e. find every occurrence. This helps to count all occurrences of a substring:
    $count = 0;
    while ($target =~ m/$substring/g) { $count++ }
• m//m : treat a target string containing newline characters as multiple lines, so that ^ and $ match at the start and end of each line.
• m//s : treat a target string containing newline characters as a single string, i.e. dot matches any character including newline.
• m//x : ignore whitespace characters in the regular expression unless they are escaped or occur in a character class.
• m//o : compile the regular expression once only.
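The effect of the i, g, m and s modifiers can be seen in a short script. This is a sketch with made-up sample strings; only the counting loop is taken from the slide.

```perl
use strict;
use warnings;

my $target    = "Cat, cat, CAT and concatenate";
my $substring = 'cat';

# /g in a while loop: each iteration continues from where the last match ended.
my $count = 0;
$count++ while $target =~ m/$substring/g;

# /g combined with /i: the same count, ignoring case.
my $count_i = 0;
$count_i++ while $target =~ m/$substring/gi;

# /m: ^ and $ match at the start and end of every line in the string.
my $lines       = "first\nsecond\n";
my @line_starts = $lines =~ /^(\w+)$/mg;

# /s: dot also matches a newline.
my $dot_plain = "a\nb" =~ /a.b/  ? 1 : 0;
my $dot_s     = "a\nb" =~ /a.b/s ? 1 : 0;

print "count=$count count_i=$count_i\n";
print "lines=@line_starts dot_plain=$dot_plain dot_s=$dot_s\n";
```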
THE TRANSLATION OPERATOR
• Translation is similar, but not identical, to substitution: unlike substitution, translation (or transliteration) does not use regular expressions for its search or replacement values. The translation operators are:
    tr/SEARCHLIST/REPLACEMENTLIST/cds
    y/SEARCHLIST/REPLACEMENTLIST/cds
• The translation replaces all occurrences of the characters in SEARCHLIST with the corresponding characters in REPLACEMENTLIST.
• For example, using the "The cat sat on the mat" string:
    #!/usr/bin/perl
    $string = 'The cat sat on the mat';
    $string =~ tr/a/o/;
    print "$string\n";
• When the above program is executed, it produces the following result:
    The cot sot on the mot
TRANSLATION OPERATOR MODIFIERS
• Standard Perl ranges can also be used, allowing you to specify ranges of characters either by letter or numerical value.
• To change the case of the string, you might use the following syntax in place of the uc function:
    $string =~ tr/a-z/A-Z/;
• Following is the list of modifiers related to translation:
    Modifier    Description
    c           Complements SEARCHLIST
    d           Deletes found but unreplaced characters
    s           Squashes duplicate replaced characters
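Each of the three modifiers can be tried on a small string. This is a minimal sketch; the sample strings and variable names are invented, and each result is built on a copy so the original string is left untouched.

```perl
use strict;
use warnings;

my $string = 'The cat sat on the mat';

# Range translation on a copy: lower case to upper case.
(my $upper = $string) =~ tr/a-z/A-Z/;

# With an empty replacement list and no modifier, tr changes nothing
# and simply returns how many characters matched.
my $vowels = ($string =~ tr/aeiou//);

# d: delete the characters that were found but have no replacement.
(my $no_vowels = $string) =~ tr/aeiou//d;

# s: squash runs of the same translated character into one.
(my $squeezed = 'aabbccdd') =~ tr/a-z//s;

# c with d: complement the search list, i.e. delete everything that is NOT a digit.
(my $digits = 'tel: 555-1234') =~ tr/0-9//cd;

print "$upper\n$vowels\n$no_vowels\n$squeezed\n$digits\n";
```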
SPLIT
• Syntax of split:
    - split REGEX, STRING will split the STRING at every match of the REGEX.
    - split REGEX, STRING, LIMIT where LIMIT is a positive number. This will split the STRING at every match of the REGEX, but will stop after it has found LIMIT-1 matches. So the number of elements it returns will be LIMIT or less.
    - split REGEX: if STRING is not given, split the content of $_, the default variable of Perl, at every match of the REGEX.
    - split without any parameter will split the content of $_ on whitespace. This behaves like /\s+/, except that leading whitespace is skipped.
• Simple cases: split returns a list of strings:
    use Data::Dumper qw(Dumper);   # used to dump out the contents of any variable during the running of a program
    my $str = "ab cd ef gh ij";
    my @words = split / /, $str;
    print Dumper \@words;
• The output is:
    $VAR1 = [
              'ab',
              'cd',
              'ef',
              'gh',
              'ij'
            ];
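The LIMIT argument and the parameterless form are easiest to understand from a concrete case. This is a sketch only; the colon-separated record and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $record = 'root:x:0:0:root:/root:/bin/bash';

# No limit: split at every colon.
my @all = split /:/, $record;

# LIMIT of 3: at most three elements, the last one holds the unsplit remainder.
my @first3 = split /:/, $record, 3;

# No arguments: split $_ on whitespace, skipping leading whitespace.
$_ = "  ab   cd ef ";
my @words = split;

print scalar(@all), " fields\n";
print join(' | ', @first3), "\n";
print join(',', @words), "\n";
```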
SUBROUTINES IN PERL BY SANA MATEEN
WHAT IS A SUBROUTINE?
• A Perl subroutine or function is a group of statements that together perform a task. You can divide up your code into separate subroutines. How you divide up your code among different subroutines is up to you, but logically the division is usually such that each function performs a specific task.
• Perl uses the terms subroutine, method and function interchangeably.
• The simplest way of reusing code is building subroutines.
• They allow the same code to be executed in several places in your application, and they allow it to be executed with different parameters.
Define and Call a Subroutine
• The general form of a subroutine definition in the Perl programming language is as follows:
    sub subroutine_name {
        body of the subroutine
    }
• The typical way of calling that Perl subroutine is as follows:
    subroutine_name( list of arguments );
  or, in the earlier style,
    &subroutine_name( list of arguments );
Because Perl compiles your program before executing it, it doesn't matter where you declare your subroutine.
PROGRAM (CODE) DESIGN USING SUBROUTINES - PSEUDO CODE
    #
    # Main Code
    # pseudo-code
    ..set variables
    .
    call sub1
    .
    call sub2
    .
    call sub3
    .
    exit program

    sub 1
    # code for sub 1
    exit subroutine

    sub 2
    # code for sub 2
    exit subroutine

    sub 3
    # code for sub 3
    call sub 4
    exit subroutine

    sub 4
    # code sub4
    exit
PASSING ARGUMENTS TO A SUBROUTINE
• You can pass various arguments to a subroutine as you do in any other programming language, and they can be accessed inside the function using the special array @_. Thus the first argument to the function is in $_[0], the second is in $_[1], and so on.
• You can pass arrays and hashes as arguments like any scalar, but passing more than one array or hash normally causes them to lose their separate identities, because all arguments are flattened into the single list @_. So we use references to pass more than one array or hash.
• Let's try the following example, which takes a list of numbers and then prints their average:
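The example itself is not preserved in this transcript. A minimal sketch of such a subroutine might look like the following; the name print_average and the sample numbers are assumptions, not the slide's original code.

```perl
use strict;
use warnings;

# print_average is a hypothetical name chosen for this sketch.
sub print_average {
    my $n = scalar(@_);          # number of arguments passed
    if ($n == 0) {
        print "No numbers given\n";
        return;
    }

    my $sum = 0;
    foreach my $item (@_) {      # @_ holds every argument
        $sum += $item;
    }
    my $average = $sum / $n;

    print "First argument: $_[0]\n";
    print "Average for the given numbers: $average\n";
    return;
}

print_average(10, 20, 30);
```

Checking for an empty argument list first avoids a division by zero when the subroutine is called without numbers.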
PASSING LISTS TO SUBROUTINES
• Because the @_ variable is an array, it can be used to supply lists to a subroutine. However, because Perl flattens all arguments into a single list, it can be difficult to extract the individual elements from @_. If you have to pass a list along with other scalar arguments, then make the list the last argument, as shown below:
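The example is missing from this transcript. One way to sketch the "list as last argument" pattern is shown here; the subroutine name describe_list and the data are invented for illustration.

```perl
use strict;
use warnings;

# describe_list is a hypothetical helper: one scalar first, then a list.
sub describe_list {
    # The scalar takes the first argument; the array soaks up all the rest.
    my ($label, @items) = @_;
    return "$label: " . join(', ', @items);
}

my @colours = qw(red green blue);
my $line    = describe_list('colours', @colours);

print "$line\n";
```

If the array came first in the assignment, it would absorb every argument and the scalar after it would be left undefined, which is why the list goes last.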
PASSING HASHES TO SUBROUTINES
• When you supply a hash to a subroutine or operator that accepts a list, the hash is automatically translated into a list of key/value pairs. For example:
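The example is not preserved in this transcript. A minimal sketch, with the hypothetical subroutine name format_hash and made-up data, could be:

```perl
use strict;
use warnings;

# format_hash is a hypothetical helper name for this sketch.
sub format_hash {
    # The flattened key/value list in @_ is turned back into a hash.
    my (%h) = @_;
    # Hash order is not fixed, so the keys are sorted for stable output.
    return join('; ', map { "$_ : $h{$_}" } sort keys %h);
}

my %data = (name => 'Tom', age => 19);
my $text = format_hash(%data);

print "$text\n";
```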
RETURNING VALUE FROM A SUBROUTINE
• You can return a value from a subroutine as you do in any other programming language. If you do not return a value explicitly, the value of the last expression evaluated in the subroutine is returned automatically.
• You can return arrays and hashes from a subroutine like any scalar, but returning more than one array or hash normally causes them to lose their separate identities. So we use references to return more than one array or hash from a function.
• Let's try the following example, which takes a list of numbers and then returns their average:
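The example is missing from this transcript. The sketch below shows one possible version, together with a second subroutine that returns two arrays by reference; the names average and split_even_odd are assumptions made for illustration.

```perl
use strict;
use warnings;

# Returns the average of its arguments (0 for an empty list).
sub average {
    my $n = scalar(@_);
    return 0 unless $n;
    my $sum = 0;
    $sum += $_ foreach @_;
    return $sum / $n;
}

# Returns two arrays as references so that they keep their separate identities.
sub split_even_odd {
    my (@even, @odd);
    foreach my $n (@_) {
        if ($n % 2) { push @odd, $n } else { push @even, $n }
    }
    return (\@even, \@odd);
}

my $num = average(10, 20, 30);
my ($even_ref, $odd_ref) = split_even_odd(1 .. 6);

print "Average for the given numbers: $num\n";
print "Even: @$even_ref, Odd: @$odd_ref\n";
```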
PRIVATE VARIABLES IN A SUBROUTINE
• By default, all variables in Perl are global variables, which means they can be accessed from anywhere in the program. But you can create private variables, called lexical variables, at any time with the my operator.
• The my operator confines a variable to a particular region of code in which it can be used and accessed. Outside that region, this variable cannot be used or accessed. This region is called its scope. A lexical scope is usually a block of code with a set of braces around it, such as those defining the body of the subroutine or those marking the code blocks of if, while, for, foreach, and eval statements.
• Following is an example showing how to define single or multiple private variables using the my operator:
    sub somefunc {
        my $variable;                        # $variable is invisible outside somefunc()
        my ($another, @an_array, %a_hash);   # declaring many variables at once
    }
The following example distinguishes between global variable and private variable.
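The example is not preserved in this transcript. A minimal sketch of the distinction might look like this; the variable $string and the subroutine name print_hello are assumptions made for illustration.

```perl
use strict;
use warnings;

# Global (package) variable, declared with our so it is legal under strict.
our $string = "Hello, World!";

sub print_hello {
    # Private variable: it hides the global $string inside this block only.
    my $string = "Hello, Perl!";
    return "Inside the function: $string";
}

my $inside  = print_hello();
my $outside = "Outside the function: $string";   # still the global value

print "$inside\n";
print "$outside\n";
```

The global value is unchanged after the call, because the assignment inside the subroutine went to the lexical variable.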
ADVANTAGES OF SUBROUTINES
• Saves typing → fewer lines of code → less likely to make a mistake
• Re-usable:
    - if a subroutine needs to be modified, it can be changed in only one place
    - other programs can use the same subroutine
• Can be tested separately
• Makes the overall structure of the program clearer

Unit 1-array,lists and hashes

  • 1.
    ARRAYS, LISTS ANDHASHES By SANA MATEEN
  • 2.
    ARRAYS  It iscollections of scalar data items which have an assigned storage space in memory, and can therefore be accessed using a variable name.  The difference between arrays and hashes is that the constituent elements of an array are identified by a numerical index, which starts at zero for the first element.  array always starts with @, eg: @days_of_week.  An array stores a collection, and list is a collection, so it is natural to assign a list to an array. eg. @rainfall=(1.2, 0.4, 0.3, 0.1, 0, 0 , 0); This creates an array of seven elements. These can be accessed like $rainfall[0], $rainfall[1], .... $rainfall[6]. A list can also occur as elements of other list. @foo=(1,2,3, “string”); @foobar= (4, 5, @foo, 6); This gives foobar the value (4,5,1,2,3, “string”,6).
  • 3.
    MANIPULATING ARRAYS  Elementsof an array are selected using C like square bracket syntax, eg: $bar=$foo[2].  The $ and [ ] make it clear that this instance foo is an element of the array foo, not the scalar variable foo.  A group of contiguous elements is called a slice, and is accessed using simple syntax.  @foo[1..3]  Is the same as the list ($foo[1],$foo[2],$foo[3])  The slice can be used as the destination of the assignment eg:@foo[1..3]= (“hop”, “skip”, “jump”);  Array variables and lists can be used interchangeably in almost any sensible situation: $front=(“bob”, “carol”, “ted”, “alice”)[0]; @rest=(“bob”, “carol”, “ted”, “alice”) [1..3]; or even @rest=qw/bob carol ted alice/[1..3]; Elements of an array can be selected by using another array selector.  @foo =(7, “fred”, 9);  @bar=(2,1,0); then @foo=@foo[@bar];
  • 4.
    LISTS  List isa collection of variables , constants or expressions which is to be treated as a whole . It is written as comma separated sequence of values. Eg: “red”, “green”, “blue”  A list often appears in a script enclosed in round brackets. (“red”, “green”, “blue”)  Short hand used in lists: (1..8) and (“A”.. “H”, “O”.. “Z”).  To save the tedious typing, qw(the quick brown fox)  Is short hand for : (“the”, “quick”, “brown”, “fox”).  qw- quote words operator, is an obvious extension of the q and qq operator.  qw/the quick brown fox/ (or) qw|the quick brown fox|  The list containing variables can appear as the target of an assignment and/or as the value to be assigned. ($a , $b , $c)= (1,2,3);
  • 5.
    MANIPULATING LISTS  Perlprovides several built-in functions for list manipulation. Three useful ones are a)shift LIST b)unshift LIST c)push LIST  a) returns the first item of the list and moves remaining items down reducing the size of the list by 1.  b) the opposite of shift: puts the items in LIST at the beginning of ARRAY, moving the original contents up by the required amount.  c) push LIST: It is similar to unshift but adds the values in LIST to the end of ARRAY  ITERATING OVER LISTS Perl provides a number of mechanisms to achieve this. I. foreach II. map III. grep  foreach loop: It performs a simple iteration over all the elements of a list. foreach $item (list){ } This blocks takes each value from the list and repeats execution. foreach (@array){ .... #process $_
  • 6.
     map: perlprovides an inbuilt function map to create plural forms of words.  @p1=map $_. ‘s’ , @s;  general form of map is: map expression, list;  and map BLOCK list;  we can also use foreach loop to achieve the same. @s=qw/cat, dog, rabbit, hamster, rat/; @p1=(); foreach (@s){ push @p1, $_. ‘s’ }  grep : In unix grep is used to print all lines of the file which contains an instance of pattern. grep pattern file The perl grep function takes a pattern and a list and returns new list containing all the elements of the original list that match the pattern. Eg: @things = (car, bus, cardigan, jumper, carrot); grep /car/ @things returns the list (car,cardigan,carrot)
  • 7.
    HASHES  A hashis a set of key/value pairs. Hash variables are preceded by a percent (%) sign. To refer to a single element of a hash, you will use the hash variable name preceded by a "$" sign and followed by the "key" associated with the value in curly brackets.  Here is a simple example of using the hash variables −  #!/usr/bin/perl  %data = ('John Paul', 45, 'Lisa', 30, 'Kumar', 40);  print "$data{'John Paul'} = $data{'John Paul'}n";  print "$data{'Lisa'} = $data{'Lisa'}n";  print "$data{'Kumar'} = $data{'Kumar'}n";  This will produce the following result −  $data{'John Paul'} = 45  $data{'Lisa'} = 30  $data{'Kumar'} = 40  CREATING HASHES we can assign a list to an array, so it is not surprising that we can assign a list of key-value pairs to a hash. for example: %foo= (key1, value1, key2, value2,.....); alternative syntax is provided using the => operator to associate key-value pairs %foo =(banana => ‘yellow’ , apple=>’green’ , ...) Key Value(age) John Paul 45 Lisa 30 Kumar 40
  • 8.
    MANIPULATING HASHES  Perlprovides a number of built-in functions to facilitate manipulation of hashes. If we have a hash called magic. keys %magic Returns a list of the keys of the elements in the hash. values %magic These functions provide way to iterate over the elements of hash using foreach: foreach $key(keys %magic) { do something with $magic($key) } Explicit loop variable is omitted, in which case the anonymous variable $_ will be assumed. foreach(keys %magic) { process $magic($_); } An alternative is to use “each” operator which delivers successive key-value pairs from a hash. while(($key,$value)=each %magic){ ... }
  • 9.
     Other usefuloperators for manipulating hashes are delete and exists.  delete $magic($key)  Removes the elements whose key matches $key from the hash %magic, and  exists $magic($key)  Returns true if the hash %magic contains an element whose key matches $key.common idiomis exists($h{‘key’})&&do(statements) To avoid using an if statement.
  • 10.
  • 11.
    INTRODUCTION TO REGULAREXPRESSIONS  It is a way of defining patterns.  A notation for describing the strings produced by regular expression.  The first application of regular expressions in computer system was in the text editors ed and sed in the UNIX system.  Perl provides very powerful and dynamic string manipulation based on the usage of regular expressions.  Pattern Match – searching for a specified pattern within string.  For example:  A sequence motif,  Accession number of a sequence,  Parse HTML,  Validating user input.  Regular Expression (regex) – how to make a pattern match.
  • 12.
    HOW REGEX WORK Regex code Perl compiler Inputdata (e.g. sequence file) outputregex engine
  • 13.
    SIMPLE PATTERNS  Placethe regex between a pair of forward slashes ( / / ).  try:  #!/usr/bin/perl  while (<STDIN>) {  if (/abc/) {  print “>> found ‘abc’ in $_n”;  }  }  Save then run the program. Type something on the terminal then press return. Ctrl+C to exit script.  If you type anything containing ‘abc’ the print statement is returned.
  • 14.
    STAGES 1. The characters | ( ) [ { ^ $ * + ? . are meta characters with special meanings in regular expression. To use metacharacters in regular expression without a special meaning being attached, it must be escaped with a backslash. ] and } are also metacharacters in some circumstances. 2. Apart from meta characters any single character in a regular expression /cat/ matches the string cat. 3. The meta characters ^ and $ act as anchors: ^ -- matches the start of the line $ -- matches the end of the line. so regex /^cat/ matches the string cat only if it appears at the start of the line. /cat$/ matches only at the end of the line. /^cat$/ matches the line which contains the string cat and /^$/ matches an empty line. 4. The meta character dot (.) matches any single character except newline, so/c.t/ matches cat,cot,cut, etc.
  • 15.
    STAGES 5. A characterclass is set of characters enclosed in square brackets. Matches any single character from those listed. So /[aeiou]/- matches any vowel /[0123456789]/-matches any digit Or /[0-9]/ 6. A character class of the form /[^....]/ matches any characters except those listed, so /[^0-9]/ matches any non digit. 7. To remove the special meaning of minus to specify regular expression to match arithmetic operators. /[+-*/]/ 8. Repetition of characters in regular expression can be specified by the quantifiers * -- zero or more occurrences + -- one or more occurrences ? – zero or more occurrences 9. Thus /[0-9]+/ matches an unsigned decimal number and /a.*b/ matches a substring starting with ‘a’ and ending with ‘b’, with an indefinite number of other characters in between.
  • 16.
    FACILITIES 1. Alternations | IfRE1,RE2,RE3 are regular expressions, RE1|RE2|RE3 will match any one of the components. 2. Grouping- ( ) Round Brackets can be used to group items. /pitt the (elder|younger)/ 3. Repetition counts Explicit repetition counts can be added to a component of regular expression /(wet[]){2}wet/ matches ‘ wet wet wet’ Full list of possible count modifiers are {n} – must occur exactly n times {n,} –must occur at least n times {n,m}- must occur at least n times but no more than m times. 4. Regular expression  Simple regex to check for an IP address:  ^(?:[0-9]{1,3}.){3}[0-9]{1,3}$
  • 17.
    FACILITIES 5. Non-greedy matching Apattern including .* matches the longest string it can find. The pattern .*? Can be used when the shortest match is required. ? – shortest match 6.Short hand This notation is given for frequent occurring character classes. d – matches- digit w – matches – word s- matches- whitespace D- matches any non digit character Capitalization of notation reverses the sense 7. Anchors b – word boundary B – not a word boundary /bJohn/ -matches both the target string John and Johnathan. 8. Back References Round brackets define a series of partial matches that are remembered for use in subsequent processing or in the RegEx itself. 9. The Match Operator The match operator, m//, is used to match a string or statement to a regular expression. For example, to match the character sequence "foo" against the scalar $bar, you might use a statement like this: if ($bar =~ /foo/) Note that the entire match expression.that is the expression on the left of =~ or !~ and the match operator,
  • 18.
    BINDING OPERATOR  Previousexample matched against $_  Want to match against a scalar variable?  Binding Operator “=~” matches pattern on right against string on left.  Usually add the m operator – clarity of code.  $string =~ m/pattern/
  • 19.
    MATCHING ONLY ONCE There is also a simpler version of the match operator - the ?PATTERN? operator.  This is basically identical to the m// operator except that it only matches once within the string you are searching between each call to reset.  For example, you can use this to get the first and last elements within a list:  To remember which portion of string matched we use $1,$2,$3 etc  #!/usr/bin/perl  @list = qw/food foosball subeo footnote terfoot canic footbrdige/;  foreach (@list) {  $first = $1 if ?(foo.*)?; $last = $1 if /(foo.*)/;  }  print "First: $first, Last: $lastn";  This will produce following result First: food, Last: footbrdige
  • 20.
    s/PATTERN/REPLACEMENT/; $string =~ s/dog/cat/; #/user/bin/perl $string= 'The cat sat on the mat'; $string =~ s/cat/dog/; print "Final Result is $stringn"; This will produce following result The dog sat on the mat THE SUBSTITUTION OPERATOR The substitution operator, s///, is really just an extension of the match operator that allows you to replace the text matched with some new text. The basic form of the operator is: The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is a specification for the text or regular expression that we want to use to replace the found text with. For example, we can replace all occurrences of .dog. with .cat. Using Another example:
  • 21.
    PATTERN MATCHING MODIFIERS m//i – Ignore case when pattern matching.  m//g – Helps to count all occurrence of substring. $count=0; while($target =~ m/$substring/g) { $count++ }  m//m – treat a target string containing newline characters as multiple lines.  m//s –Treat a target string containing new line characters as single string, i.e dot matches any character including newline.  m//x – Ignore whitespace characters in the regular expression unless they occur in character class.  m//o – Compile regular expressions once only
  • 22.
    THE TRANSLATION OPERATOR Translation is similar, but not identical, to the principles of substitution, but unlike substitution, translation (or transliteration) does not use regular expressions for its search on replacement values. The translation operators are −  tr/SEARCHLIST/REPLACEMENTLIST/cds y/SEARCHLIST/REPLACEMENTLIST/cds  The translation replaces all occurrences of the characters in SEARCHLIST with the corresponding characters in REPLACEMENTLIST.  For example, using the "The cat sat on the mat." string  #/user/bin/perl  $string = 'The cat sat on the mat';  $string =~ tr/a/o/;  print "$stringn";  When above program is executed, it produces the following result −  The cot sot on the mot.
  • 23.
    TRANSLATION OPERATOR MODIFIERS
    - Standard Perl ranges can also be used, allowing you to specify ranges of characters either by letter or by numerical value.
    - To change the case of a string, you might use the following syntax in place of the uc function:
        $string =~ tr/a-z/A-Z/;
    - Following is the list of modifiers related to translation:
        Modifier  Description
        c         Complements SEARCHLIST
        d         Deletes found but unreplaced characters
        s         Squashes duplicate replaced characters
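The effect of each modifier is easier to see on a concrete string. The following is a small sketch; the variable names are chosen only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'The cat sat on the mat';

# tr returns the number of characters it matched. With an empty
# REPLACEMENTLIST and no modifiers the string is left unchanged,
# so this simply counts the vowels.
my $vowels = ($string =~ tr/aeiou//);

# d: delete the characters found in SEARCHLIST that have no replacement.
(my $no_vowels = $string) =~ tr/aeiou//d;

# c (with d): complement SEARCHLIST, so everything that is NOT a letter is deleted.
(my $letters = $string) =~ tr/a-zA-Z//cd;

# s: runs of characters translated to the same character are squashed to one.
(my $squashed = 'aabbccdd') =~ tr/a-c/x/s;

print "vowels:    $vowels\n";      # 6
print "no vowels: $no_vowels\n";   # Th ct st n th mt
print "letters:   $letters\n";     # Thecatsatonthemat
print "squashed:  $squashed\n";    # xdd
```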
  • 24.
    SPLIT
    - Syntax of split
    - split REGEX, STRING will split the STRING at every match of the REGEX.
    - split REGEX, STRING, LIMIT where LIMIT is a positive number. This will split the STRING at every match of the REGEX, but will stop after it has found LIMIT-1 matches, so the number of elements it returns will be LIMIT or less.
    - split REGEX : if STRING is not given, split the content of $_, the default variable of Perl, at every match of the REGEX.
    - split without any parameter will split the content of $_ using /\s+/ as REGEX.
    - Simple cases
    - split returns a list of strings:
        use Data::Dumper qw(Dumper);  # used to dump out the contents of any variable during the running of a program
        my $str = "ab cd ef gh ij";
        my @words = split / /, $str;
        print Dumper \@words;         # pass a reference so the array is dumped as one structure
    - The output is:
        $VAR1 = [ 'ab', 'cd', 'ef', 'gh', 'ij' ];
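The LIMIT form and the no-argument form can be sketched as follows. The colon-separated $record and the other names are hypothetical sample data, not part of the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical colon-separated record.
my $record = "root:x:0:0:root:/root:/bin/bash";

# No LIMIT: split at every colon, giving 7 fields.
my @all = split /:/, $record;

# LIMIT of 3: stop after 2 matches; the rest of the string stays in the last element.
my @three = split /:/, $record, 3;

# No arguments: split the content of $_ on whitespace.
my @words;
for ("  the quick  brown fox ") {
    @words = split;
}

print scalar(@all), " fields\n";        # 7 fields
print "last of three: $three[2]\n";     # 0:0:root:/root:/bin/bash
print join("|", @words), "\n";          # the|quick|brown|fox
```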
  • 26.
  • 27.
    WHAT IS A SUBROUTINE?
    A Perl subroutine or function is a group of statements that together perform a task. You can divide up your code into separate subroutines. How you divide up your code among different subroutines is up to you, but logically the division is usually such that each function performs a specific task.
    - Perl uses the terms subroutine, method and function interchangeably.
    - The simplest way of reusing code is building subroutines.
    - They allow executing the same code in several places in your application, and they allow it to be executed with different parameters.
    - Define and Call a Subroutine
    - The general form of a subroutine definition in the Perl programming language is as follows:
        sub subroutine_name
        {
            body of the subroutine
        }
    - The typical way of calling that Perl subroutine is as follows:
        subroutine_name( list of arguments );
    - Or, in the earlier style:
        &subroutine_name( list of arguments );
  • 28.
    Because Perl compiles your program before executing it, it doesn't matter where you declare your subroutine.
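As a small illustration of this point, the call below appears before the definition and still works. The subroutine name greet is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The call comes first; the definition appears further down the file.
my $message = greet("world");
print "$message\n";    # Hello, world!

# Perl has already compiled this definition by the time the call above runs.
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}
```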
  • 29.
    PROGRAM (CODE) DESIGN USING SUBROUTINES - PSEUDO CODE
        #
        # Main Code
        # pseudo-code
        ..set variables
        .
        call sub1
        .
        call sub2
        .
        call sub3
        .
        exit program

        sub 1
            # code for sub 1
            exit subroutine

        sub 2
            # code for sub 2
            exit subroutine

        sub 3
            # code for sub 3
            call sub 4
            exit subroutine

        sub 4
            # code sub4
            exit
  • 30.
    PASSING ARGUMENTS TO A SUBROUTINE
    - You can pass various arguments to a subroutine like you do in any other programming language, and they can be accessed inside the function using the special array @_. Thus the first argument to the function is in $_[0], the second is in $_[1], and so on.
    - You can pass arrays and hashes as arguments like any scalar, but passing more than one array or hash normally causes them to lose their separate identities. So we will use references to pass any array or hash.
    - Let's try the following example, which takes a list of numbers and then prints their average:
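A minimal sketch of such a subroutine, assuming the name Average and the sample numbers 10, 20 and 30 (both chosen here for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub Average {
    return unless @_;            # nothing to average, avoid dividing by zero
    my $count = scalar @_;       # number of arguments passed in
    my $sum   = 0;
    foreach my $item (@_) {      # @_ holds every argument
        $sum += $item;
    }
    my $average = $sum / $count;
    print "First argument: $_[0]\n";
    print "Average for the given numbers: $average\n";
}

Average(10, 20, 30);
# prints:
# First argument: 10
# Average for the given numbers: 20
```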
  • 31.
    PASSING LISTS TO SUBROUTINES
    - Because the @_ variable is an array, it can be used to supply lists to a subroutine. However, because of the way in which Perl accepts and parses lists and arrays, it can be difficult to extract the individual elements from @_. If you have to pass a list along with other scalar arguments, then make the list the last argument as shown below:
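One way this could look, as a sketch with a hypothetical subroutine PrintList that takes a scalar label followed by a list:

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub PrintList {
    # The first argument goes into the scalar; the array takes everything
    # that is left, which is why the list has to come last.
    my ($label, @list) = @_;
    print "$label: @list\n";
    return scalar @list;         # number of list elements received
}

my @numbers = (1, 2, 3, 4);
PrintList("Given list", @numbers);
# prints: Given list: 1 2 3 4
```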
  • 32.
    PASSING HASHES TO SUBROUTINES
    - When you supply a hash to a subroutine or operator that accepts a list, the hash is automatically translated into a list of key/value pairs. For example:
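A short sketch of this behaviour, using a hypothetical subroutine PrintHash and made-up data. The keys are sorted only so that the output order is predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub PrintHash {
    # The flattened key/value list in @_ is copied back into a hash.
    my (%hash) = @_;
    foreach my $key (sort keys %hash) {
        print "$key : $hash{$key}\n";
    }
    return scalar keys %hash;    # number of key/value pairs received
}

my %data = (name => 'Tom', age => 19);
PrintHash(%data);
# prints:
# age : 19
# name : Tom
```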
  • 33.
    RETURNING VALUE FROM A SUBROUTINE
    - You can return a value from a subroutine like you do in any other programming language. If you do not return a value explicitly, the value of the last expression evaluated in the subroutine is returned automatically.
    - You can return arrays and hashes from a subroutine like any scalar, but returning more than one array or hash normally causes them to lose their separate identities. So we will use references to return any array or hash from a function.
    - Let's try the following example, which takes a list of numbers and then returns their average:
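A minimal sketch, assuming the names Average, Square and SplitEvenOdd (all hypothetical). The first subroutine returns the average, the second relies on the implicit return value, and the third returns two arrays as references so that they keep their separate identities.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub Average {
    my @numbers = @_;
    return 0 unless @numbers;        # avoid dividing by zero
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;          # @numbers in numeric context is its length
}

# No explicit return: the value of the last expression is returned.
sub Square { $_[0] ** 2 }

# Two arrays are returned as references so they are not merged into one list.
sub SplitEvenOdd {
    my @even = grep { $_ % 2 == 0 } @_;
    my @odd  = grep { $_ % 2 != 0 } @_;
    return (\@even, \@odd);
}

my $average = Average(10, 20, 30);
my $square  = Square(4);
my ($even, $odd) = SplitEvenOdd(1 .. 6);

print "Average: $average\n";             # Average: 20
print "Square: $square\n";               # Square: 16
print "Even: @$even, Odd: @$odd\n";      # Even: 2 4 6, Odd: 1 3 5
```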
  • 34.
    PRIVATE VARIABLES IN A SUBROUTINE
    - By default, all variables in Perl are global variables, which means they can be accessed from anywhere in the program. But you can create private variables, called lexical variables, at any time with the my operator.
    - The my operator confines a variable to a particular region of code in which it can be used and accessed. Outside that region, this variable cannot be used or accessed. This region is called its scope. A lexical scope is usually a block of code with a set of braces around it, such as those defining the body of the subroutine or those marking the code blocks of if, while, for, foreach, and eval statements.
    - Following is an example showing how to define a single or multiple private variables using the my operator:
        sub somefunc {
            my $variable;                         # $variable is invisible outside somefunc()
            my ($another, @an_array, %a_hash);    # declaring many variables at once
        }
  • 35.
    The following example distinguishes between a global variable and a private variable.
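A minimal sketch of such an example, assuming a global variable $string and a hypothetical subroutine PrintHello that declares its own private $string:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Global (package) variable, visible throughout the program.
our $string = "Hello, World!";

sub PrintHello {
    # Private variable: it hides the global $string only inside this block.
    my $string = "Hello, Perl!";
    return "Inside the function: $string";
}

my $inside = PrintHello();
print "$inside\n";                           # Inside the function: Hello, Perl!
print "Outside the function: $string\n";     # Outside the function: Hello, World!
```

The global keeps its value because the assignment inside PrintHello affects only the lexical copy.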
  • 36.
    ADVANTAGES OF SUBROUTINES
    - Saves typing → fewer lines of code → less likely to make a mistake
    - Re-usable
        - if a subroutine needs to be modified, it can be changed in only one place
        - other programs can use the same subroutine
    - Can be tested separately
    - Makes the overall structure of the program clearer