8

Say I have a multi-line string whose lines are short; if I try to hexdump it, I get something like this:

    echo "something
    is
    being
    written
    here" | hexdump -C

    #00000000  73 6f 6d 65 74 68 69 6e  67 0a 69 73 0a 62 65 69  |something.is.bei|
    #00000010  6e 67 0a 77 72 69 74 74  65 6e 0a 68 65 72 65 0a  |ng.written.here.|
    #00000020

Most hex dump programs, including hexdump, simply lay the bytes out as a 2D matrix (you can define how many bytes, i.e. columns, each line holds), so in this case the entire output is compacted into two lines of dump.

Is there a program I can use that would keep going as usual, except that when it encounters a newline (0x0a, but possibly any other character or sequence thereof) it would also start a new output line? In this case, I'd imagine an output like:

    00000000  73 6f 6d 65 74 68 69 6e 67 0a  |something.|
    0000000a  69 73 0a  |is.|
    0000000d  62 65 69 6e 67 0a  |being.|
    00000013  77 72 69 74 74 65 6e 0a  |written.|
    0000001b  68 65 72 65 0a  |here.|
    00000020
1
  • Does no answer fit your needs? - What's missing for you? Commented Mar 22, 2015 at 1:30

3 Answers

2

Here is one possibility, a compact solution that makes use of read's ability to limit the number of characters read:

    c=0
    while IFS= read -n16 -r line
    do
        len=${#line}
        ((len<16)) && { ((len++)) ; line+=$'\n' ;}
        printf "%08x " $c
        for ((i=0; i<len; i++))
        do
            printf " %02x" "'${line:i:1}"
        done
        printf " %*s %s\n" $((50-3*len)) "" "'${line//[^[:print:]]/.}'"
        ((c+=len))
    done
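The loop above depends on how bash's `read -n` delimits its input: it returns after the given number of characters or at a newline, whichever comes first, and the newline itself is not stored in the variable. A minimal sketch of just that behaviour, assuming bash; the function name `chunks` and the chunk size of 4 are chosen only for illustration:

```bash
#!/usr/bin/env bash
# chunks N: read stdin in pieces of at most N characters, also stopping
# at each newline, and print every piece with its length.
# (Hypothetical helper, not part of the answer's script.)
chunks() {
    local n=$1 chunk
    while IFS= read -r -n"$n" chunk; do
        printf '[%s] len=%d\n' "$chunk" "${#chunk}"
    done
}
```

Running `printf 'abcdefg\nhi\n' | chunks 4` prints `[abcd] len=4`, `[efg] len=3` and `[hi] len=2`. A piece shorter than the limit therefore signals that a newline was consumed, which is why the answer appends `$'\n'` whenever `len` is below 16.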
0

Well, there is printf...

    hex_split()( unset c dump slice rad pend
        _get(){ dd bs=1024 count=1; echo .; } 2>/dev/null
        _buf() case $((${#dump}>0)):$((${#slice}>0)) in
            (0:*) dump=$(_get); dump=${dump%.}
                  [ -n "$dump" ] || [ -n "$slice" ];;
            (*:0) [ "${#dump}" -lt 16 ] &&
                      slice=${dump:-$slice} dump= && return
                  slice=${dump%"${dump#$q}"} dump=${dump#$q};;esac
        _out(){ printf "%08x%02.0s" "$rad" "$((rad+=$#/2))"
                printf "%02x %.0s" "$@"
                printf "%-$(((16-($#/2))*3))s"
                printf "%.0s%.1s" '' ' ' '' \| "$@" '' \| '' "$nl"
        };  q=$(printf %016s|tr \  \?) ; IFS=\  nl='
    '   rad=0 c=0 split=${split:-$nl} slice="$*"; set --
        while [ -n "$slice" ] || _buf || ! ${1:+"_out"} "$@" &&
              c=${slice%"${slice#?}"} slice=${slice#?}
        do    set "$@" "'$c" "${c#[![:print:]]}."
              case $#$c in (32*|*$split) _out "$@"; set --;;esac
        done
    )

You can hand it stdin or arguments or both. So...

    echo "something
    is
    being
    written
    here" | hex_split something else besides

...the above prints...

    00000000  73 6f 6d 65 74 68 69 6e 67 20 65 6c 73 65 20 62  |something else b|
    00000010  65 73 69 64 65 73 00 73 6f 6d 65 74 68 69 6e 67  |esides.something|
    00000020  0a  |.|
    00000021  69 73 0a  |is.|
    00000024  62 65 69 6e 67 0a  |being.|
    0000002a  77 72 69 74 74 65 6e 0a  |written.|
    00000032  68 65 72 65 0a  |here.|

Change the default split char like...

split=${somechar} hex_split 
1
  • I would have loved to upvote your answer because it's awesome. But 3 years on it doesn't quite work. Sample file content per hexdump: 01 0c 02 98 00 01 97 be 0a 16 00 00 Output of your function: 00 01 0c 02 7ffe 01 7ffe 7ffe 0a Commented Apr 29, 2018 at 21:39
0

I needed this in order to compare two files with a difftool, but still be able to see what kind of non-printable characters differ.

This function adds a -n option to hexdump. If -n is specified, the output is split at line breaks; otherwise the normal hexdump is called. In contrast to @Janis's answer, this is not a complete rewrite of hexdump; instead, hexdump itself is called with any other parameters that were given. hexdump is fed the input line by line, using head and the -s skip option to preserve offsets. The function works both when input is piped in and when a file is specified. Unlike hexdump, however, it does not work with multiple files.

I wanted to make this an easier / shorter alternative answer, but guarding against all these edge cases for inputs actually made it longer.

    hexdump() {
        # introduces artificial line breaks in hexdump output at newline characters
        # might be useful for comparing files linewise, but still be able to
        # see the differences in non-printable characters utilizing hexdump
        # first argument must be -n else normal hexdump will be used
        local isTmpFile=0
        if [ "$1" != '-n' ]; then command hexdump "$@"; else
            if [ -p /dev/stdin ]; then
                local file="$( mktemp )" args=( "${@:2}" )
                isTmpFile=1
                cat > "$file" # save pipe to temporary file
            else
                local file="${@: -1}" args=( "${@:2:$#-2}" )
            fi
            # sed doesn't seem to work on file descriptors for some very weird reason,
            # the linelength will always be zero, so check for that, too ...
            local readfile="$( readlink -- "$file" )"
            if [ -n "$readfile" ]; then
                # e.g. readlink might return pipe:[123456]
                if [ "${readfile::1}" != '/' ]; then
                    readfile="$( mktemp )"
                    isTmpFile=1
                    cat "$file" > "$readfile"
                    file="$readfile"
                else
                    file="$readfile"
                fi
            fi
            # we can't use read here else \x00 in the file gets ignored.
            # Plus read will ignore the last line if it does not have a \n!
            # Unfortunately using sed '<linenumber>p' prints an additional \n
            # on the last line, if it wasn't there, but I guess still better than
            # ignoring it ...
            local linelength offset nBytes="$( cat "$file" | wc -c )" line=1
            for (( offset = 0; offset < nBytes; )); do
                linelength=$( sed -n "$line{p;q}" -- "$file" | wc -c )
                (( ++line ))
                head -c $(( offset + $linelength )) -- "$file" |
                    command hexdump -s $offset "${args[@]}" | sed '$d'
                (( offset += $linelength ))
            done
            # Hexdump displays a last empty line by default showing the
            # file size, but we delete this line in the loop using sed
            # Now insert this last empty line by letting hexdump skip all input
            head -c $offset -- "$file" | command hexdump -s $offset "${args[@]}"
            if [ "$isTmpFile" -eq 1 ]; then rm "$file"; fi
        fi
    }

You can try it out with echo -e "test\nbbb\nomg\n" | hexdump -n -C which prints:

    00000000  74 65 73 74 0a  |test.|
    00000005  62 62 62 0a  |bbb.|
    00000009  6f 6d 67 0a  |omg.|
    0000000d  0a  |.|
    0000000e
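The offset-preserving trick in the function is to truncate the input with head at the end of the wanted range and let the dump tool skip everything before its start, so the printed addresses still count from the beginning of the file. A minimal sketch of that step in isolation follows. It swaps in `od -j` for `hexdump -s`, which is a substitution rather than what the answer uses, because a comment on this answer reports `hexdump: stdin: Illegal seek` when skipping on piped input. The name `dump_range` is hypothetical.

```bash
#!/usr/bin/env bash
# dump_range FILE START LEN: hex-dump LEN bytes of FILE beginning at byte
# START, labelled with their real offsets in FILE.
# (Hypothetical helper; uses od instead of the answer's hexdump.)
dump_range() {
    local file=$1 start=$2 len=$3
    # head cuts the input off at the end of the range; od -j then skips
    # the bytes before START but still counts them in the address column.
    head -c "$(( start + len ))" -- "$file" | od -A x -t x1 -v -j "$start"
}
```

For a file containing `test\nbbb\n`, calling `dump_range file 5 4` should show the bytes `62 62 62 0a` at hexadecimal offset 5 rather than 0.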

As a bonus here is my hexdiff function:

    hexdiff() {
        # compares two files linewise in their hexadecimal representation
        # create temporary files, because else the two 'hexdump -n' calls
        # get executed multiple times alternatingly when using named pipes:
        # colordiff <( hexdump -n -C "${@: -2:1}" ) <( hexdump -n -C "${@: -1:1}" )
        local a="$( mktemp )" b="$( mktemp )"
        hexdump -n -C "${@: -2:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$a"
        hexdump -n -C "${@: -1:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$b"
        colordiff "$a" "$b"
        rm "$a" "$b"
    }

E.g. test with hexdiff <( printf "test\nbbb\x00 \nomg\nbar" ) <( printf "test\nbbb\nomg\nfoo" ), which will print:

    2c2
    < 62 62 62 11 20 0a  |bbb. .|
    ---
    > 62 62 62 0a  |bbb.|
    4,5c4,5
    < 62 61 72  |bar|
    < 00000012
    ---
    > 0c 6f 6f  |.oo|
    > 00000010
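The sed call in hexdiff removes the leading offset column before the two dumps are compared. The answer does not say why, so the following is an assumption: once one file contains an extra byte, every later offset differs, and diff would flag all following lines even where the bytes are identical. A minimal sketch of that step on its own, with `strip_offsets` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# strip_offsets: remove the leading hexadecimal offset column, and the
# whitespace after it, from each line of a hex dump read on stdin.
# (Hypothetical helper mirroring the sed expression used in hexdiff.)
strip_offsets() {
    sed -E 's/^[0-9a-f]+[[:space:]]*//'
}
```

With the offsets removed, two dump lines such as `00000005  62 62 62 0a  |bbb.|` and `00000007  62 62 62 0a  |bbb.|` become identical, so only lines whose bytes differ show up in the diff.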

Edit: OK, this function is not suited to larger files (e.g. 8 MB), and tools like comparehex or dhex are not good enough either, because they ignore newlines and therefore cannot match up the differences very well. Using a combination of od and sed is much faster:

    hexlinedump() {
        local nChars=$1 file=$2
        paste -d$'\n' -- <(
            od -w$( cat -- "$file" | wc -c ) -tx1 -v -An -- "$file" |
                sed 's| 0a| 0a\n|g' |
                sed -r 's|(.{'"$(( 3*nChars ))"'})|\1\n|g' |
                sed '/^ *$/d'
        ) <(
            # need to delete empty lines, because 0a might be at the end of a char
            # boundary, so that not only 0a, but also the character limit introduces
            # a line break
            sed -r 's|(.{'"$nChars"'})|\1\n|g' -- "$file" |
                sed -r 's|(.)| \1 |g'
        )
    }
    hexdiff() {
        colordiff <( hexlinedump 16 "${@: -2:1}" ) <( hexlinedump 16 "${@: -1:1}" )
    }
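The core idea of this faster variant is to have od emit the bytes as a plain stream of hex pairs and then start a new output line after every `0a`. A stripped-down sketch of only that idea is given below. It uses awk in place of the answer's sed pipeline, which is a substitution made here for portability, and it omits both the per-line character limit and the pasted text column. The name `hexlines` is hypothetical.

```bash
#!/usr/bin/env bash
# hexlines: print stdin as space-separated hex bytes, starting a new
# output line after every 0a (newline) byte.
# (Hypothetical helper; awk stands in for the answer's sed calls.)
hexlines() {
    # -An drops the offset column, -v disables od's "*" run compression,
    # -tx1 prints one two-digit hex value per byte.
    od -An -v -tx1 |
    awk '{
        for (i = 1; i <= NF; i++) {
            printf "%s%s", sep, $i
            sep = " "
            if ($i == "0a") { print ""; sep = "" }
        }
    }
    END { if (sep != "") print "" }'   # finish a last line lacking 0a
}
```

For example, `printf 'test\nbbb\n' | hexlines` prints `74 65 73 74 0a` on the first line and `62 62 62 0a` on the second.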
1
  • Trying the first command only prints the first line, then gives the error hexdump: stdin: Illegal seek, like hexdump is trying to seek to the offset instead of discarding to the offset Commented Sep 30 at 6:55
