3

I want to remove unnecessary whitespace from my CSS file. I am using grep as follows:

$ grep -rn "[[:space:]]$" 

Surprisingly, it returns a hit on every line in the file. I searched for instances of \t\n, \r\n and \n but could not find anything. How do I go about identifying the invisible whitespace and removing it?

3
  • Q: Can you not identify the lines that have 1 or more spaces using this command: grep -rn "[[:space:]]\+$" <css file>? Commented Dec 14, 2013 at 5:43
  • @slm, yes, I can identify the lines, but I can't identify the mysterious invisible character at the end of each line. I created a new text file and copied the whole content over. Amazingly, those invisible characters are not copied to the new file. Original size is 43kB, new size is 41kB. 2kB worth of rubbish there which can't be selected and I don't have any idea about. Commented Dec 14, 2013 at 6:22
  • You can use the tool hexdump to identify the strange characters. hexdump -C <file>. If you know the first couple of lines have them you can do this: head -10 <file> | hexdump -C. Commented Dec 14, 2013 at 6:39

3 Answers

3

Rather than attempting to do this with grep, you might want to use a proper minification tool. There are many. One such tool is cssmin, a port of Yahoo's YUI Compressor. It's in most major distros' repositories.

Fedora

$ sudo yum install python-cssmin 

Example run

$ cssmin < doc.css > doc_compressed.css
$ ls -l | grep css
-rw-rw-r--. 1 saml saml 2723 Dec 13 23:35 doc_compressed.css
-rw-r--r--. 1 saml saml 4626 Dec 13 23:34 doc.css

The contents of the file look like this:

$ head doc_compressed.css
a:link{text-decoration:none}a:visited{color:#7F7FFF;text-decoration:none}a:hover{text-decoration:underline}a:active{color:white;background-color:blue;text-decoration:underline}body{background-color:white;color:black;font-size:100.01%}img{display:block;border-width:0}h1{background-color:#900;font-size:x-large;font-weight:bold;color:#ebebeb;padding:.3em 5px .5em 5px;m....

Compressors

There are many other choices if this one doesn't suit your needs. Take a look at this AskUbuntu post, titled: Minify tool that can be executed through terminal.

Also searching for "CSS minify" or "CSS JS minify" will turn up many choices.

Identifying strange characters

There are several tools you could use to do this. Octal dump (od) or hexdump for starters. I'd go with hexdump.

Example

$ head -10 doc.css | hexdump -C
00000000  0a 2f 2a 20 47 6c 6f 62  61 6c 20 73 74 79 6c 65  |./* Global style|
00000010  73 2e 20 2a 2f 0a 0a 61  3a 6c 69 6e 6b 20 7b 0a  |s. */..a:link {.|
00000020  20 20 74 65 78 74 2d 64  65 63 6f 72 61 74 69 6f  |  text-decoratio|
00000030  6e 3a 20 6e 6f 6e 65 3b  20 20 20 20 20 20 0a 7d  |n: none;      .}|
00000040  0a 0a 61 3a 76 69 73 69  74 65 64 20 7b 0a 20 20  |..a:visited {.  |
00000050  63 6f 6c 6f 72 3a 20 23  37 46 37 46 46 46 3b 0a  |color: #7F7FFF;.|
00000060  20 20 74 65 78 74 2d 64  65 63 6f 72 61 74 69 6f  |  text-decoratio|
00000070  6e 3a 20 6e 6f 6e 65 3b  20 20 20 20 0a           |n: none;    .|
0000007d

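In the above output, the runs of 20 bytes just before each 0a are trailing spaces at the end of these lines (the dots in the right-hand column only mark non-printable bytes such as the newline):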

$ head -10 doc.css

/* Global styles. */

a:link {
  text-decoration: none;
}

a:visited {
  color: #7F7FFF;
  text-decoration: none;

For example:

00000030  6e 3a 20 6e 6f 6e 65 3b  20 20 20 20 20 20 0a 7d  |n: none;      .}|

The spaces are the hex characters "0x20". The "0x0a" is the new line character.
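Once the trailing bytes are identified, stripping them can be sketched with sed. This is only an illustration under assumptions: the sample file and its name are made up, and the `[[:space:]]` class is relied on to cover spaces, tabs and carriage returns alike.

```shell
# Build a small throwaway sample with trailing spaces and one CRLF ending
# (file names here are hypothetical, not from the original post).
tmpdir=$(mktemp -d)
printf 'a:link {\n  text-decoration: none;      \n}\r\n' > "$tmpdir/sample.css"

# Delete any run of whitespace (spaces, tabs, carriage returns) at the end of each line.
sed 's/[[:space:]]*$//' "$tmpdir/sample.css" > "$tmpdir/sample_clean.css"

# Count lines that still end in whitespace. grep -c exits non-zero when the
# count is 0, so "|| true" keeps the script going.
remaining=$(grep -c '[[:space:]]$' "$tmpdir/sample_clean.css" || true)
echo "lines still ending in whitespace: $remaining"
```

Writing to a new file rather than editing in place keeps the original around for comparison with hexdump.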

2
  • Thanks for the hexdump method! The mystery is actually caused by 0d 0a which is \r\n, but mistakenly identified as a whitespace by grep. I just don't know how it got into the file. Every newline is \r\n instead of just \n. Commented Dec 15, 2013 at 3:12
  • @QuestionOverflow - glad you got the issue resolved. Yes, they can leak in in strange ways. I usually get them from other developers that touch the code base. We set up our Subversion repository to gate these from leaking into our development for this exact reason. Commented Dec 15, 2013 at 3:34
1

Don't be surprised that your regexps with a \n don't match: The \n is the line separator, it's not in the line. Every line in your file ends with \n-- by definition.* You'll never find a \n inside a line.

One possibility is that you're looking at a Windows file on Unix, and your mystery character is \r (NB not \r\n), which your grep is not recognizing as part of the EOL.
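To check for that possibility directly, one rough approach is to search for a literal carriage return at the end of each line instead of the broader `[[:space:]]` class. The sketch below uses two made-up sample files and a helper name (`count_cr_lines`) invented for this example.

```shell
# Two throwaway files: one with Unix (LF) endings, one with Windows (CRLF) endings.
tmpdir=$(mktemp -d)
printf 'body {\n}\n' > "$tmpdir/unix.css"
printf 'body {\r\n}\r\n' > "$tmpdir/dos.css"

# Hold a literal carriage return in a variable; printf keeps this portable
# across shells that lack the $'\r' syntax.
cr=$(printf '\r')

# Print how many lines of the given file end in a carriage return.
# grep -c exits non-zero when the count is 0, hence "|| true".
count_cr_lines() {
    grep -c "${cr}\$" "$1" || true
}

count_cr_lines "$tmpdir/unix.css"   # prints 0
count_cr_lines "$tmpdir/dos.css"    # prints 2
```

If the count equals the number of lines in the file, every line ending is \r\n, which would explain a hit on every line from `[[:space:]]$`.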

To find out what your lines actually look like, use od -c.
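As a small illustration of what od -c shows, here is a one-line sample (invented for this example) with two trailing spaces and a Windows-style ending. The `-An` option just suppresses the offset column.

```shell
# Dump a made-up line character by character; non-printable bytes appear
# as escapes, so a carriage return shows up as \r and a newline as \n.
dump=$(printf 'a { }  \r\n' | od -An -c)
echo "$dump"
```

A file with plain Unix endings would show only \n at the end of each line, with no \r before it.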

*Footnote for the nitpickers: Except possibly for the final line, and on very old Mac OS systems, etc., etc.

1
  • Yes, you hit the bulls-eye. Indeed, grep has mistaken \r\n as a whitespace. I just don't understand how \r\n got into my css file, causing gedit to use that for all the newlines I have created just for this file. Commented Dec 15, 2013 at 3:17
1

You can use the tr command for this. For example, cat file | tr -d "\t" > newfile will remove the tabs from your file.
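The same approach should work for other single characters. As a sketch, assuming the unwanted character turns out to be a carriage return (the file names below are made up), tr can delete it as well:

```shell
# A throwaway file with Windows (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'p {\r\n  margin: 0;\r\n}\r\n' > "$tmpdir/dos.css"

# Delete every carriage return byte. The input is redirected instead of
# being piped through cat, which saves a process.
tr -d '\r' < "$tmpdir/dos.css" > "$tmpdir/unix.css"

# Compare sizes: the cleaned file is smaller by one byte per line.
wc -c < "$tmpdir/dos.css"
wc -c < "$tmpdir/unix.css"
```

Note that tr -d removes the character everywhere in the file, not only at line ends, so it is worth checking the result before replacing the original.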

Follow this link for more information on the tr tool; man tr was not much use to me.

Some interesting part: [image not preserved]

2
  • I am not talking about tab \t which is trivial to remove. It is other invisible whitespace characters that I have problem identifying and removing. Commented Dec 14, 2013 at 5:11
  • I give up. I had taken whitespace to mean only spaces and tabs. Had you followed that link, maybe it would have helped. Commented Dec 14, 2013 at 5:17
