Linux ip-172-26-7-228 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64
Your IP : 18.217.122.223
#!/usr/bin/env ruby
# encoding: utf-8
MYNAME = File.basename($0)
Version = '1.43'
<<'DOC'
= match_parens - find mismatches of various brackets and quotes
= Synopsis
match_parens [filename]
Options:
-p,--pairs=S set matching pairs to S (default: |{}[]()""“”''‘’|)
-n,--number=N set number of mismatching characters shown to N (default: 10)
-l,--latex convert |``...''| to |“...”| before testing
-V,--version print version and exit
-h,--help print short help information and exit
--test do an internal test and exit
= Description
Mismatches of parentheses, braces, (angle) brackets, especially in TeX
sources which may be rich in those, may be difficult to trace. This little
script helps you by writing your text to standard output, after adding a
left margin to your text, which will normally be almost empty, but will
clearly show up to 10 mismatches. (Just try me on myself to see that the
parenthesis starting this sentence will not appear to be matched at the end
of the file. If you look at me in the vim editor, then select this
paragraph and try the command: |:!%|.
By default, the following pairs are tested:
() round brackets or parentheses
{} curly brackets or braces
[] square brackets
<> angle brackets (within html text only)
"" ASCII double quotes
“” Unicode double quotation marks
'' ASCII single quotes
‘’ Unicode single quotation marks
The exit value of the script is 0 when there are no mismatches, 1 otherwise.
Angle brackets are only looked for inside HTML text, where HTML is
supposed to start with |<html>| or |=begin rdoc| and to end with
|</html>| or |=end|.
= Options
-p,--pairs=S
Set matching pairs to S (default: |{}[]()""“”''‘’|). For example, if you
want to look for mismatching ASCII single quotes /only/, use |--pairs="''"|.
Or, if you want to match braces and guillemets only, use |-p «»|.
Note that if html is detected in your text, |<>| is automatically added
to the pairs list. So by default, |<...>| is tested only in html, but
you can test that in other text by specifying the |<>| pair in the
|--pairs| option.
-n,--number=N
Set number of mismatching characters shown to N. By default, only the
first 10 are shown.
-l,--latex
convert |``...''| to |“...”|` before testing.
-V,--version
print this script's version and exit.
-h,--help
print short help information and exit.
--test
do an internal test and exit. Note that if, with the |--pairs| option,
you specify an other pairs list than the default, the test will
probably fail, but you can still see the effects of your pairs list on
the test data.
= Examples
Suppose we have two files, |good| and |bad|, containing these texts:
good: bad:
This is a (simple) test This is a (simple test
without mismatches containing mismatches
then here are some usage examples. First a simple test on these files:
$ match_parens good
1 | | This is a (simple) test
2 | | without mismatches
$ echo $?
0
$ match_parens bad
1 | ( | This is a (simple test
2 | ( | containing mismatches
$ echo $?
1
Just report if there are mismatches:
$ match_parens good >/dev/null && echo fine || echo problems
fine
$ match_parens bad >/dev/null && echo fine || echo problems
problems
Report all tex files with mismatches in the current directory:
$ for i in *.tex; do match_parens $i >/dev/null || echo $i; done
Matches must be in correct order:
$ echo -e "This is a ([simple)] test\n" | match_parens
1 ([)] This is a ([simple)] test
2 ([)]
= Changes
Changes with respect to version 1.41:
- test on more quote characters (single, double, ASCII, Unicode)
- option for changing the character pairs to be tested
- conversion of latex' ``...'' is now an option
- built-in test option |--test|
= Author and copyright
Author Wybo Dekker
Email U{Wybo@dekkerdocumenten.nl}{wybo@dekkerdocumenten.nl}
License Released under the U{www.gnu.org/copyleft/gpl.html}{GNU General Public License}
DOC
require 'optparse'
number, input, latex, lineno, s, html, test, parenstext =
10, STDIN, false, 0, '', false, false, %q{{}[]()""“”''‘’}
ARGV.options do |opt|
opt.banner = "#{MYNAME} - find mismatches of parentheses, braces, (angle) brackets, in texts\n"
opt.on('-p','--pairs=S',String,
"set matching pairs to S (default: #{parenstext})"
) do |v|
parenstext = v
end
opt.on('-n','--number=N',Integer,
"set number of mismatching characters shown to N (default: #{number})"
) do |v|
number = v
end
opt.on('-l','--latex',
'convert ``...'' to “...” before testing') do
latex=true
end
opt.on('-V','--version',
'print version and exit') do
puts Version
exit
end
opt.on('-h','--help',
'print this help and exit') do
puts opt.to_s.sub(/^ *-I\n/,'')
exit
end
opt.on('--test',
'do an internal test and exit') do
input = DATA
test = true
end
opt.on('-I') do system("instscript --zip --pdf --markdown #{MYNAME}"); exit; end
opt.parse!
end
parenshtml = parenstext =~ /<>/ ? parenstext : parenstext + '<>'
parens = parenstext
arg = ARGV[0] || ''
unless arg.empty?
test(?e,arg) or raise("file #{arg} does not exist")
test(?r,arg) or raise("file #{arg} is not readable")
input = File.new(arg)
end
while x = input.gets()
# convert LaTeX's ``...'' to “...”
if latex
x = x.gsub(/``/,'“').gsub(/''/,'”')
end
# only inside html text we check <>, too:
if html && (x=~/^([# ]*=end)/ || x=~/<\/html>/i)
html = false
parens = parenstext
elsif x=~/^([# ]*=begin rdoc)/ || x=~/<html>/i
html = true
parens = parenshtml
end
# match any pair like (), {}, [], “”, <> in parens
re = Regexp.new(parens.scan(/../).join('|').gsub(/[{}()\[\]]/,'\\\\\&'))
# move parens' characters into s
s << x.tr('^'+parens,'')
# remove matches from inside
while s.gsub!(re,'') do end
puts "%4d | %-*s | %s" % [lineno+=1,number,s.slice(0..number-1),x]
end
if test
if s != '“{”}'
raise("test failed! (final mismatch should be “{”})")
else
puts "test succeeded"
end
else
exit s.empty? ? 0 : 1
end
__END__
This first line “{()}” has no mismatch
but here ( we have one.
It is even extended “{() with two more.
Here comes an untested angle bracket <, but here }” all) mismatches are erased.
=begin rdoc
Inside html text now. ‘{There is an unmatched} single quote here.
A brace { is added to it here,
and an angle bracket < is added.
But > all mismatches <are>}’ cancelled here.
=end
Four mismatches start ({"'start {here} but
are cancelled '" here}).
We end here with a mismatch caused by “{(improper)”} nesting!
But the test succeeds because we expected that.
|