Personally, I'd use a perl-compatible modern regexp instead of the horrid POSIX-esque regexps:
regress=> SELECT (regexp_matches('<pika> [SOME_TEXT_WITH|ACTION] And other stuff...', '\[(.*?)\]'))[1]; regexp_matches ----------------------- SOME_TEXT_WITH|ACTION (1 row)
If you want to use the POSIX syntax you have to use the same escape consistently, not \ in some places and # in others. eg:
regress=> SELECT substring( '<pika> [SOME_TEXT_WITH|ACTION] And other stuff...' FROM '%#"#[%#]#"%' FOR '#' ); substring ------------------------- [SOME_TEXT_WITH|ACTION] (1 row)
The docs don't make it very clear that the capture operator is actually <ESCAPECHAR>", not #" specifically. This is equally valid, using a regular backslash escape:
regress=> SELECT substring( '<pika> [SOME_TEXT_WITH|ACTION] And other stuff...' FROM '%\"\[%\]\"%' FOR '\' ); substring ------------------------- [SOME_TEXT_WITH|ACTION] (1 row)
The cause of the odd error is that PostgreSQL translates the POSIX SIMILAR TO style expression into a real regular expression under the covers. Your mixed-escapes regex:
'%> \[#"[A-Z_\|]+#"\] %' FOR '#'
is being turned into something like:
'.*> \\[([A-Z_\\|]+)\\] .*'
resulting in:
regress=> SELECT (regexp_matches('<pika> [SOME_TEXT_WITH|ACTION] And other stuff...', '.*> \\[([A-Z_\\|]+)\\] .*'))[1]; ERROR: invalid regular expression: parentheses () not balanced