
Regular expressions are a weakness of mine.

I am looking for a regex or other technique that will let me read an arbitrary string and determine whether it is a valid Java function declaration.

Good:

public void foo()
void foo()
static protected List foo()
static List foo()

Bad:

public List myList = new List() 

Code:

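for (String line : lines) {
    // My attempt, recalled from memory; it does not do what I want yet.
    if (line.matches("(public|protected|private)*(\\w)*\\(")) {
        // treat the line as a function declaration
    }
}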

Is there a regex that will return true if the line is a valid Java function?

  • Could you look for a ; at the end of the line? You may have to look at the rows around it as well. Commented Sep 12, 2012 at 16:20
  • I'm on a phone right now so I don't have the source in front of me. I'll try to recall it from memory in an update. Commented Sep 12, 2012 at 16:20
  • Something like the above, as I recall. Commented Sep 12, 2012 at 16:25
  • The general case of "any valid Java function" most probably cannot be done with a regexp; it requires a proper parser. Do you have any limits on what needs to be recognized? Commented Sep 12, 2012 at 16:29
  • Do you need to be able to detect methods with generic parameters or return types? If so, a regex will not work: regexes cannot describe a context-free grammar, which is what generics need (the generic parameter of a type may itself be a generic type, and so on). Commented Sep 12, 2012 at 16:38

2 Answers

/^\s*((public|private|protected)\s+)?(static\s+)?\w+\s+\w+\s*\(.*?\)\s*$/m 

Matches:

  • Start of line <^>
  • Arbitrary whitespace <\s*>
  • Optional scope followed by at least one space <((public|private|protected)\s+)?>
  • Optional keyword static followed by at least one space <(static\s+)?>
  • A Java identifier (the return type, which you should hope is a class name or primitive) <\w+>
  • At least one space <\s+>
  • A Java identifier (the function name) <\w+>
  • Arbitrary whitespace <\s*>
  • Open paren <\(>
  • Arbitrary arguments (no checking done here, because of the massive mess) <.*?>
    • The ? makes the matching lazy
  • Close paren <\)>
  • Arbitrary whitespace <\s*>
  • End of line <$>

This is FAR from complete, but ought to suit your needs.
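
As a rough illustration, here is a minimal Java sketch of applying a pattern of this shape line by line. The class name MethodLineCheck and the sample lines are made up for the example, and the pattern groups each optional keyword with its trailing whitespace so that a line without modifiers can still match. Treat it as a sketch under those assumptions, not a complete check.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the name is not from the answer.
final class MethodLineCheck {
    // Optional access modifier, optional "static", return type, name, "(...)".
    // Each optional keyword is grouped with the whitespace that follows it.
    private static final Pattern METHOD_LINE = Pattern.compile(
        "^\\s*((public|private|protected)\\s+)?(static\\s+)?\\w+\\s+\\w+\\s*\\(.*?\\)\\s*$");

    // matches() requires the whole line to fit the pattern.
    static boolean looksLikeMethod(String line) {
        return METHOD_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] lines = {
            "public void foo()",
            "void foo()",
            "static List foo()",
            "static protected List foo()",
            "public List myList = new List()"
        };
        for (String line : lines) {
            System.out.println(looksLikeMethod(line) + "  " + line);
        }
    }
}
```

With this pattern the first three sample lines are accepted and the field declaration is rejected. "static protected List foo()" is rejected as well, because the pattern expects the access modifier before static.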


1 Comment

The Java code conventions call for putting the access modifier (public, private or protected) before the static keyword. Of course, you may code them the other way around.

Depends how rigorous you need it to be, because it can get fairly complex as a regex.

The grammar for method declarations in Java is something like the following:

Java method declaration BNF:

method_declaration ::= { modifier } type identifier "(" [ parameter_list ] ")" { "[" "]" } ( statement_block | ";" ) 

You also have to check things like allowing multiple modifiers but not the same modifier repeated, nor more than one scope modifier, plus other things like the identifier not being one of the Java keywords (and the type not being one either, apart from primitives and void). Starts getting hairy... I doubt you'd want to write your own Java parser.
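
To give an idea of what those extra checks look like once the line has been split into tokens, here is a minimal Java sketch. The class ModifierCheck and its method names are hypothetical, the modifier set is my assumption of what is allowed on a method, and it covers only the two checks mentioned above (no repeated modifier, at most one scope modifier) plus a keyword test on the method name. It relies on javax.lang.model.SourceVersion from the JDK for the identifier and keyword tests.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.lang.model.SourceVersion;

// Hypothetical helper for illustration only.
final class ModifierCheck {
    private static final Set<String> ACCESS = Set.of("public", "protected", "private");

    // Assumed set of modifiers that may appear on a method declaration.
    private static final Set<String> METHOD_MODIFIERS = Set.of(
        "public", "protected", "private", "static", "final",
        "abstract", "native", "synchronized", "strictfp");

    // True if every token is a known modifier, none is repeated,
    // and at most one access (scope) modifier is present.
    static boolean modifiersAreValid(List<String> modifiers) {
        Set<String> seen = new HashSet<>();
        int accessCount = 0;
        for (String m : modifiers) {
            if (!METHOD_MODIFIERS.contains(m)) return false;
            if (!seen.add(m)) return false;      // same modifier twice
            if (ACCESS.contains(m)) accessCount++;
        }
        return accessCount <= 1;
    }

    // True if the name is a syntactically valid identifier and not a keyword.
    static boolean isValidMethodName(String name) {
        return SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name);
    }

    public static void main(String[] args) {
        System.out.println(modifiersAreValid(List.of("public", "static")));
        System.out.println(modifiersAreValid(List.of("public", "private")));
        System.out.println(isValidMethodName("class"));
    }
}
```

Combinations that are illegal for other reasons (abstract together with final, for instance) are not covered here, which is part of why this gets hairy quickly.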

1 Comment

The BNF would lead to something like this: /^\s*?(((public|private|protected|static|final|native|synchronized|abstract|threadsafe|transient)\s+?)*)\s*?(\w+?)\s+?(\w+?)\s*?\(([^)]*)\)[\w\s,]*?(\{)?\s*?$/gm Note that it only checks the syntax, not the semantics.
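
For what it's worth, a minimal sketch of how that expression might be used from Java: the /m flag becomes Pattern.MULTILINE and the /g flag becomes a find() loop. The class MethodSignatureScan and the method methodNames are made up for the example, and the regex is copied as written, including its modifier list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class MethodSignatureScan {
    // Group 1: all modifiers, group 4: return type, group 5: method name,
    // group 6: raw parameter text, group 7: optional opening brace.
    private static final Pattern SIGNATURE = Pattern.compile(
        "^\\s*?(((public|private|protected|static|final|native|synchronized"
        + "|abstract|threadsafe|transient)\\s+?)*)"
        + "\\s*?(\\w+?)\\s+?(\\w+?)\\s*?\\(([^)]*)\\)[\\w\\s,]*?(\\{)?\\s*?$",
        Pattern.MULTILINE);

    // Collects the name (group 5) of every line that has the shape of a method header.
    static List<String> methodNames(String source) {
        List<String> names = new ArrayList<>();
        Matcher m = SIGNATURE.matcher(source);
        while (m.find()) {
            names.add(m.group(5));
        }
        return names;
    }

    public static void main(String[] args) {
        String source =
            "public class A {\n"
            + "    public static void main(String[] args) {\n"
            + "    }\n"
            + "    private int count = 0;\n"
            + "    int size() {\n"
            + "        return 0;\n"
            + "    }\n"
            + "}\n";
        System.out.println(methodNames(source));
    }
}
```

Because only the syntax is checked, a control-flow line such as "else if (x) {" has the same shape and would be reported with the name "if", so some keyword filtering on the captured groups is probably needed on top.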
