40

Today I learned that I can use perl -c filename to find unmatched curly brackets {} in arbitrary files, not necessarily Perl scripts. The problem is that it doesn't work with the other types of brackets: (), [] and maybe <>. I also experimented with several Vim plugins that claim to help find unmatched brackets, but so far without much success.

I have a text file with quite a few brackets and one of them is missing! Is there any program / script / vim plugin / whatever that can help me identify the unmatched bracket?

3 Answers

27

In Vim you can use [ and ] to jump quickly to the nearest unmatched bracket of the type entered in the next keystroke.

So [{ will take you back up to the nearest unmatched "{"; ]) would take you ahead to the nearest unmatched ")", and so on.

2
  • 8
    I will also add that in vim you can use % (Shift 5, in the USA) to immediately find the matching bracket for the one you're on. Commented Mar 29, 2011 at 14:31
  • 5
    Unfortunately, this does not work for square brackets. [[ and ]] actually go to the next open/closed brace in the first column respectively. Commented Aug 29, 2019 at 12:09
8

Update 2:
The following script now prints out the line number and column of a mismatched bracket. It processes one bracket type per scan (i.e. '[]' '<>' '{}' '()' ...).
The script identifies the first unmatched right bracket, or the first of any unpaired left brackets. On detecting an error, it exits with the line and column numbers.

Here is some sample output...


    File = /tmp/fred/test/test.in
    Pair = ()

    *INFO: Group 1 contains 1 matching pairs

    ERROR: *END-OF-FILE* encountered after Bracket 7.
           A Left "(" is un-paired in Group 2.
           Group 2 has 1 un-paired Left "(".
           Group 2 begins at Bracket 3.
      see: Line, Column (8, 10)
           ----+----1----+----2----+----3----+----4----+----5----+----6----+----7
    000008 ( ) ( ( ( ) )

Here is the script...


    #!/bin/bash

    # Identify the script
    bname="$(basename "$0")"
    # Make a work dir
    wdir="/tmp/$USER/$bname"
    [[ ! -d "$wdir" ]] && mkdir -p "$wdir"

    # Arg1: The bracket pair 'string'
    pair="$1"
    # pair='[]' # test
    # pair='<>' # test
    # pair='{}' # test
    # pair='()' # test

    # Arg2: The input file to test
    ifile="$2"
    # Build a test source file
    # (note: this replaces the Arg2 file; remove this block to scan your own file)
    ifile="$wdir/$bname.in"
    cp /dev/null "$ifile"
    while IFS= read -r line ;do
      echo "$line" >> "$ifile"
    done <<EOF
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    [ ] [ [ [
    < > < < > < > > >
    ----+----1----+----2----+----3----+----4----+----5----+----6
    { } { } } } }
    ( ) ( ( ( ) )
    ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
    EOF
    echo "File = $ifile"

    # Count how many: Left, Right, and Both
    left=${pair:0:1}
    rght=${pair:1:1}
    echo "Pair = $left$rght"

    # Make a stripped-down 'skeleton' of the source file - brackets only
    skel="/tmp/$USER/$bname.skel"
    cp /dev/null "$skel"
    # Make a String Of Brackets file ... (It is tricky manipulating bash strings with []..
    sed 's/[^'${rght}${left}']//g' "$ifile" > "$skel"
    < "$skel" tr -d '\n' > "$skel.str"
    Left=($(<"$skel.str" tr -d "$left" |wc -m -l)); LeftCt=$((${Left[1]}-${Left[0]}))
    Rght=($(<"$skel.str" tr -d "$rght" |wc -m -l)); RghtCt=$((${Rght[1]}-${Rght[0]}))
    yBkts=($(sed -e "s/\(.\)/ \1 /g" "$skel.str"))
    BothCt=$((LeftCt+RghtCt))
    eleCtB=${#yBkts[@]}
    echo

    if (( eleCtB != BothCt )) ; then
      echo "ERROR: array Item Count ($eleCtB)"
      echo "       should equal BothCt ($BothCt)"
      exit 1
    else
      grpIx=0            # Keep track of Groups of nested pairs
      eleIxFir[$grpIx]=0 # Ix of First Bracket in a specific Group
      eleCtL=0           # Count of Left brackets in current Group
      eleCtR=0           # Count of Right brackets in current Group
      errIx=-1           # Ix of an element in error.
      for (( eleIx=0; eleIx < eleCtB; eleIx++ )) ; do
        if [[ "${yBkts[eleIx]}" == "$left" ]] ; then
          # Left brackets are 'okay' until proven otherwise
          ((eleCtL++)) # increment Left bracket count
        else
          ((eleCtR++)) # increment Right bracket count
          # Right brackets are 'okay' until their count exceeds that of Left brackets
          if (( eleCtR > eleCtL )) ; then
            echo
            echo "ERROR: MIS-matching Right \"$rght\" in Group $((grpIx+1)) (at Bracket $((eleIx+1)) overall)"
            errType=$rght
            errIx=$eleIx
            break
          elif (( eleCtL == eleCtR )) ; then
            echo "*INFO: Group $((grpIx+1)) contains $eleCtL matching pairs"
            # Reset the element counts, and note the first element Ix for the next group
            eleCtL=0
            eleCtR=0
            ((grpIx++))
            eleIxFir[$grpIx]=$((eleIx+1))
          fi
        fi
      done
      #
      if (( eleCtL > eleCtR )) ; then
        # Left brackets are always potentially valid (until EOF)...
        # so, this 'error' is the last element in array
        echo
        echo "ERROR: *END-OF-FILE* encountered after Bracket $eleCtB."
        echo "       A Left \"$left\" is un-paired in Group $((grpIx+1))."
        errType=$left
        unpairedCt=$((eleCtL-eleCtR))
        errIx=$((${eleIxFir[grpIx]}+unpairedCt-1))
        echo "       Group $((grpIx+1)) has $unpairedCt un-paired Left \"$left\"."
        echo "       Group $((grpIx+1)) begins at Bracket $((eleIxFir[grpIx]+1))."
      fi

      # On error, get Line and Column numbers
      if (( errIx >= 0 )) ; then
        errLNum=0               # Source Line number (current).
        eleCtSoFar=0            # Count of bracket-elements in lines processed so far.
        errItemNum=$((errIx+1)) # error Ix + 1 (ie. "1 based")
        # Read the skeleton file to find the error line-number
        while IFS= read -r skline ; do
          ((errLNum++))
          brackets="${skline//[^"${rght}${left}"]/}" # remove whitespace
          ((eleCtSoFar+=${#brackets}))
          if (( eleCtSoFar >= errItemNum )) ; then
            # We now have the error line-number
            # ..now get the relevant Source Line
            excerpt=$(< "$ifile" tail -n +$errLNum |head -n 1)
            # Homogenize the brackets (to be all "Left"), for easy counting
            mogX="${excerpt//$rght/$left}"; mogXCt=${#mogX} # How many 'Both' brackets on the error line?
            if [[ "$errType" == "$left" ]] ; then
              # R-Trunc from the error element [inclusive]
              ((eleTruncCt=eleCtSoFar-errItemNum+1))
              for (( ele=0; ele<eleTruncCt; ele++ )) ; do
                mogX="${mogX%"$left"*}" # R-Trunc (Lazy)
              done
              errCNum=$((${#mogX}+1))
            else # errType=$rght
              mogX="${mogX%"$left"*}" # R-Trunc (Lazy)
              errCNum=$((${#mogX}+1))
            fi
            echo "  see: Line, Column ($errLNum, $errCNum)"
            echo "       ----+----1----+----2----+----3----+----4----+----5----+----6----+----7"
            printf "%06d $excerpt\n\n" $errLNum
            break
          fi
        done < "$skel"
      else
        echo "*INFO: OK. All brackets are paired."
      fi
    fi
    exit
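The core idea of the script (scan one bracket pair at a time, flag the first right bracket that has no partner, otherwise a left bracket still open at end of file) can probably be sketched more compactly with awk. This is only an illustration, not a replacement for the script above: the function name first_unmatched and the message wording are made up here, and for an unclosed left bracket it reports the innermost one that is still open, which may differ from what the script above picks.

```shell
# first_unmatched PAIR [FILE...]
# Hypothetical helper: checks ONE bracket pair per scan, e.g. '()' or '[]'.
# Reads the named files, or standard input when no file is given.
first_unmatched() {
    pair=$1
    shift
    awk -v pair="$pair" '
    BEGIN {
        left  = substr(pair, 1, 1)
        right = substr(pair, 2, 1)
    }
    {
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == left) {
                depth++
                open_at[depth] = NR ", " i   # where this nesting level opened
            } else if (c == right) {
                if (depth == 0) {
                    # A right bracket with nothing open: first hard error.
                    printf "unmatched right \"%s\" at (%d, %d)\n", right, NR, i
                    found = 1
                    exit 1
                }
                depth--
            }
        }
    }
    END {
        if (found) exit 1
        if (depth > 0) {
            # Left brackets are only known to be unpaired at end of input.
            printf "unmatched left \"%s\" at (%s)\n", left, open_at[depth]
            exit 1
        }
        print "OK. All brackets are paired."
    }' "$@"
}
```

It would be called as, for example, first_unmatched '()' somefile, once per bracket type; the position is printed as (line, column), and the exit status is non-zero when something is unmatched.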
1
  • 2
    This is awesome, but it seems to always print Line, Column (8, 10) no matter which file I try it on. Also mogXCt=${#mogX} is set but not used anywhere. Commented Nov 8, 2017 at 3:39
5

The best option is vim/gvim as identified by Shadur, but if you want a script, you can check my answer to a similar question on Stack Overflow. I repeat my whole answer here:

If what you are trying to do applies to a general purpose language, then this is a non-trivial problem.

To start with you will have to worry about comments and strings. If you want to check this on a programming language that uses regular expressions, this will make your quest harder again.

So before I can come in and give you any advice on your question I need to know the limits of your problem area. If you can guarantee that there are no strings, no comments and no regular expressions to worry about - or more generically nowhere in the code that brackets can possibly be used other than for the uses for which you are checking that they are balanced - this will make life a lot simpler.

Knowing the language that you want to check would be helpful.


If I take the hypothesis that there is no noise, i.e. that all brackets are useful brackets, my strategy would be iterative:

I would simply look for and remove all inner bracket pairs: those that contain no brackets inside. This is best done by collapsing all lines to a single long line (and finding a mechanism to add line references, should you need to get that information out). In this case the search and replace is pretty simple:

It requires an array:

    B["("]=")"; B["["]="]"; B["{"]="}"

And a loop through those elements:

    for (b in B) {gsub("[" b "][^][(){}]*[" B[b] "]", "", $0)}

My test file is as follows:

    #!/bin/awk

    ($1 == "PID") {
      fo (i=1; i<NF; i++) {
        F[$i] = i
      }
    }

    ($1 + 0) > 0 {
      count("VIRT")
      count("RES")
      count("SHR")
      count("%MEM")
    }

    END {
      pintf "VIRT=\t%12d\nRES=\t%12d\nSHR=\t%12d\n%%MEM=\t%5.1f%%\n", C["VIRT"], C["RES"], C["SHR"], C["%MEM"]
    }

    function count(c[) {
      f=F[c];

      if ($f ~ /m$/) {
        $f = ($f+0) * 1024
      }

      C[c]+=($f+0)
    }

My full script (without line referencing) is as follows:

    cat test-file-for-brackets.txt | \
      tr -d '\r\n' | \
      awk \
      '
        BEGIN {
          B["("]=")";
          B["["]="]";
          B["{"]="}"
        }
        {
          m=1;
          while(m>0) {
            m=0;
            for (b in B) {
              m+=gsub("[" b "][^][(){}]*[" B[b] "]", "", $0)
            }
          };
          print
        }
      '

The output of that script stops on the innermost illegal uses of brackets. But beware: 1/ this script will not work with brackets in comments, regular expressions or strings, 2/ it does not report where in the original file the problem is located, 3/ although it removes all balanced pairs, it stops at the innermost error conditions and keeps all the enclosing brackets.

Point 3/ is probably an exploitable result, though I'm not sure of the reporting mechanism you had in mind.

Point 2/ is relatively easy to implement but takes more than a few minutes' work to produce, so I'll leave it up to you to figure out.
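As a rough illustration of what line referencing for point 2/ could look like, here is a sketch that swaps the iterative gsub for a stack-based scan, since a stack keeps the position of every open bracket for free. The function name check_brackets and the message format are invented for this example, and it still assumes the "no noise" hypothesis: no brackets in comments, strings or regular expressions.

```shell
# check_brackets [FILE...]
# Hypothetical helper: reports unmatched (), [] and {} as line:column.
# Assumes every bracket in the input is significant (no strings/comments).
check_brackets() {
    awk '
    BEGIN {
        close_of["("] = ")"; close_of["["] = "]"; close_of["{"] = "}"
        open_of[")"]  = "("; open_of["]"]  = "["; open_of["}"]  = "{"
    }
    {
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c in close_of) {
                # Opening bracket: push it together with its position.
                top++
                stack[top] = c
                pos[top] = NR ":" i
            } else if (c in open_of) {
                if (top > 0 && stack[top] == open_of[c]) {
                    top--        # closes the innermost open bracket
                } else {
                    printf "%d:%d: unexpected \"%s\"\n", NR, i, c
                    bad = 1
                }
            }
        }
    }
    END {
        # Whatever is left on the stack was never closed.
        for (k = 1; k <= top; k++) {
            printf "%s: unclosed \"%s\"\n", pos[k], stack[k]
            bad = 1
        }
        exit bad + 0
    }' "$@"
}
```

Run as check_brackets test-file-for-brackets.txt; it prints nothing and returns 0 when everything balances. Note that one typo can produce two messages (the unexpected closer and the opener it left dangling), so the first message is usually the one to look at.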

Point 1/ is the tricky one because you enter a whole new realm of competing, sometimes nested, beginnings and endings, or special quoting rules for special characters...
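To give an idea of the simplest possible case of point 1/, the sketch below blanks out noise before any balance check is run. It is deliberately naive and rests on assumptions that will not hold for most real languages: strings are written "..." with backslash escapes and never span lines, comments start with # and run to the end of the line, and there are no regular expression literals or other quoting forms. The name blank_noise is made up for this example.

```shell
# blank_noise [FILE...]
# Hypothetical pre-filter: replaces "..." strings with spaces and drops
# # comments, so that brackets inside them are not counted.
# Columns of the remaining text are preserved.
blank_noise() {
    awk '
    {
        out = ""
        in_str = 0               # strings are assumed not to span lines
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (in_str) {
                if (c == "\\" && i < n) {
                    out = out "  "   # blank the backslash and the escaped char
                    i++
                    continue
                }
                if (c == "\"") in_str = 0
                out = out " "
            } else if (c == "\"") {
                in_str = 1
                out = out " "
            } else if (c == "#") {
                break                # the rest of the line is a comment
            } else {
                out = out c
            }
        }
        print out
    }' "$@"
}
```

The output keeps line numbers and columns intact, so it can be piped into whatever balance check you use and the reported positions still refer to the original file.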
