How do you configure tidy to parse XML instead of HTML?
Explanation:
A while ago, a co-worker showed me a trick to use tidy to clean up XML.
Apparently, you create a tidyrc file like so:
    input-xml: yes
    quiet: yes
    indent: yes
    indent-attributes: yes
    indent-spaces: 4
    char-encoding: utf8
    wrap: 0
    wrap-asp: no
    wrap-jste: no
    wrap-php: no
    wrap-sections: no

Even after adding this to ~/.tidyrc, tidy is still attempting to parse the input as HTML (its default), and not as XML:

    $ cat -v foo.out | tidy > foo.xml
    line 3 column 1 - Error: <data> is not recognized!
    line 3 column 1 - Warning: missing <!DOCTYPE> declaration
    line 3 column 1 - Warning: discarding unexpected <data>

I've tried various permissions:

    [root@mongo-test3 tmp]# ls -ial ~
    51562 -rw------- 1 root root 11550 Jul 16 02:17 .bash_history
    50973 -rw-r--r-- 1 root root    18 May  1 00:40 .bash_logout
    51538 -rw-r--r-- 1 root root   176 May  1 00:40 .bash_profile
    51537 -rw-r--r-- 1 root root   124 May  1 00:40 .bashrc
    51561 -rwxr-xr-x 1 root root   164 Jul 16 22:16 .tidyrc

I've tried naming the file both .tidyrc and just tidyrc.
Versions:
I've tried this on both Mac OS X and CentOS 6.4.
Mac OS X 10.8.4
Darwin spuders-macbook-pro 12.4.0 Darwin Kernel Version 12.4.0: Wed May 1 17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64
CentOS 6.4
Linux mongo-test3 2.6.32-279.22.1.el6.x86_64 #1 SMP Wed Feb 6 03:10:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Research:
Normally I would ask the person who taught me this trick, but they are currently unreachable.
Workaround:
As a workaround, I can use the -xml flag, but I would prefer to get the tidyrc working:
    $ cat -v foo.out | tidy -xml > foo.xml
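
Another thing that may be worth trying, as a rough sketch rather than a confirmed fix: the tidy man page describes a -config option and an HTML_TIDY environment variable for naming a configuration file explicitly, and whether ~/.tidyrc is picked up automatically may depend on how tidy was built. The scratch file path and the sample XML below are made up for illustration, and the config is a shortened copy of the one above.

```shell
# Hypothetical scratch location for a copy of the config (not from the original setup)
cfg="${TMPDIR:-/tmp}/tidyrc.example"

# Write a shortened version of the tidyrc shown earlier
cat > "$cfg" <<'EOF'
input-xml: yes
quiet: yes
indent: yes
indent-spaces: 4
char-encoding: utf8
wrap: 0
EOF

# Only call tidy if it is actually installed on this machine
if command -v tidy >/dev/null 2>&1; then
    # Option 1: name the config file explicitly on the command line
    printf '<data><item>1</item></data>\n' | tidy -config "$cfg" \
        || echo "tidy -config exited with status $?"

    # Option 2: point tidy at the config file through the environment
    printf '<data><item>1</item></data>\n' | HTML_TIDY="$cfg" tidy \
        || echo "tidy with HTML_TIDY exited with status $?"
fi
```

If either form parses the sample as XML while the plain `tidy` call does not, that would suggest the file contents are fine and the problem is only that ~/.tidyrc is not being read.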