I need to read the first byte of a file I specify, then the second byte, the third, and so on. How can I do this in Bash? P.S. I need the hex value of each byte.
- Do you need hex for each byte? – Grijesh Chauhan, Dec 15, 2012 at 6:07
- Both binary and hex for each byte of the file. – michaeluskov, Dec 15, 2012 at 7:15
- Superset, random access: stackoverflow.com/questions/1423346/… – Ciro Santilli OurBigBook.com, May 10, 2017 at 8:37
7 Answers
Full rewrite: September 2019!
A lot shorter and simpler than previous versions! (Somewhat faster, but not by much.)
New post, August 2025! In another thread, I show a way to hold binary data in a bash variable, using mapfile. See "set variable from file content", mostly the second part, where I store zstd's output in a variable.
Yes, bash can read and write binary:
Syntax:
    LANG=C IFS= read -r -d '' -n 1 foo

will populate $foo with 1 binary byte. Unfortunately, as bash strings cannot hold null bytes ($\0), reading one byte at a time is required.
If the read command succeeds and $foo is empty, then the byte read is NULL. Otherwise, $foo holds the byte read, in binary form.
Then, to get the value of the byte read (I had missed this in man bash; have a look at the 2016 post at the bottom of this answer ;b):

    printf -v value %d \'$byte

From man bash:

    printf [-v var] format [arguments]
    ...
    Arguments to non-string format specifiers are treated as C constants,
    except that ..., and if the leading character is a single or double
    quote, the value is the ASCII value of the following character.
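As a quick check of that rule, here is a minimal sketch (the variable names byte, value and hex are only illustrative) that converts one character to its decimal and hexadecimal codes without forking:

```bash
#!/bin/bash
# A leading quote in a numeric printf argument yields the character's code.
byte=A
printf -v value '%d' "'$byte"    # decimal code of "A"
printf -v hex '%02x' "'$byte"    # same code, as two hex digits
echo "$value $hex"               # prints: 65 41
```

Quoting the whole argument ("'$byte") keeps a space or tab from being dropped by word splitting.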
So:
    read8() {
        local _r8_var=${1:-OUTBIN} _r8_car
        LANG=C IFS= read -r -d '' -n 1 _r8_car
        # quoted, so a space, tab or newline byte is not lost to word splitting
        printf -v "$_r8_var" %d "'$_r8_car"
    }

This populates the named variable (default: $OUTBIN) with the decimal ASCII value of the first byte from STDIN.
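Applied to the original question, the same read/printf pair can be looped over a whole file. This is only a sketch: the function name hexbytes is hypothetical, and it sets LC_ALL=C locally (an assumption, instead of the LANG=C prefix) so that a multibyte locale cannot group several bytes into one character.

```bash
#!/bin/bash
# hexbytes: print every byte of stdin as two hex digits, one per line.
hexbytes() {
    local LC_ALL=C char          # C locale: one character is one byte
    while IFS= read -r -d '' -n 1 char; do
        # An empty $char after a successful read means a NUL byte;
        # printf then sees a lone quote and reports 0.
        printf '%02x\n' "'$char"
    done
}

printf 'Hi\n' | hexbytes
```

To dump a file, redirect it into the function: hexbytes < /path/to/file.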
    read16() {
        local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
        read8 _r16_lb &&
            read8 _r16_hb
        printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb ))
    }

This populates the named variable (default: $OUTBIN) with the decimal value of the first 16-bit word from STDIN...
Of course, to switch endianness, you have to swap the two reads:

        read8 _r16_hb &&
            read8 _r16_lb

And so on:
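To make the effect of that swap concrete, here is a small self-contained sketch. The names read16le and read16be are hypothetical, and read8 is repeated (with a local LC_ALL=C, an assumption) only so the block runs on its own:

```bash
#!/bin/bash
# read8: store the decimal value of the next byte of stdin in the named variable.
read8() {
    local _r8_var=${1:-OUTBIN} _r8_car LC_ALL=C
    IFS= read -r -d '' -n 1 _r8_car
    printf -v "$_r8_var" %d "'$_r8_car"
}
# Little endian: the first byte read is the low byte.
read16le() {
    local _var=${1:-OUTBIN} _lb _hb
    read8 _lb && read8 _hb
    printf -v "$_var" %d $(( _hb<<8 | _lb ))
}
# Big endian: the first byte read is the high byte.
read16be() {
    local _var=${1:-OUTBIN} _lb _hb
    read8 _hb && read8 _lb
    printf -v "$_var" %d $(( _hb<<8 | _lb ))
}

read16le le < <(printf 'AB')   # bytes 0x41 0x42
read16be be < <(printf 'AB')
echo "$le $be"                 # prints: 16961 16706
```

The same two bytes give 0x4241 when read as little endian and 0x4142 when read as big endian.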
    # Usage:
    #       read[8|16|32|64] [varname] < binaryStdInput

    read8() {  local _r8_var=${1:-OUTBIN} _r8_car
        LANG=C IFS= read -r -d '' -n 1 _r8_car
        # quoted, so a space, tab or newline byte is not lost to word splitting
        printf -v "$_r8_var" %d "'$_r8_car" ;}
    read16() { local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
        read8  _r16_lb && read8  _r16_hb
        printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb )) ;}
    read32() { local _r32_var=${1:-OUTBIN} _r32_lw _r32_hw
        read16 _r32_lw && read16 _r32_hw
        printf -v $_r32_var %d $(( _r32_hw<<16| _r32_lw )) ;}
    read64() { local _r64_var=${1:-OUTBIN} _r64_ll _r64_hl
        read32 _r64_ll && read32 _r64_hl
        printf -v $_r64_var %d $(( _r64_hl<<32| _r64_ll )) ;}

A sample, playing with GPT partition tables.
So you could source this; then, if your /dev/sda is GPT-partitioned:

    read totsize < <(blockdev --getsz /dev/sda)
    read64 gptbackup < <(dd if=/dev/sda bs=8 skip=68 count=1 2>/dev/null)
    echo $((totsize-gptbackup))
    1

The answer should be 1. (The 1st GPT header is at sector 1, and one sector is 512 bytes. The GPT backup location is at byte 32 of that header. With bs=8, 512 bytes -> 64 blocks and 32 bytes -> 4 blocks, so 544 bytes -> 68 blocks to skip. The GPT backup is located at the end of the disk: disk size - 1 block. See GUID Partition Table at Wikipedia.)
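The skip=68 value can be recomputed rather than memorised. A minimal sketch of that arithmetic, where the variable names are illustrative and the numbers are the ones given above:

```bash
#!/bin/bash
sector_size=512   # bytes per sector, as stated above
header_lba=1      # the primary GPT header sits in sector 1
field_offset=32   # byte offset of the backup-location field in the header
bs=8              # dd block size: one 64-bit value per block

skip=$(( (header_lba * sector_size + field_offset) / bs ))
echo "$skip"      # prints: 68
```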
Then

    read64 gptbackup2 < <(
        dd if=/dev/sda bs=8 skip=$((4+gptbackup*64)) count=1 2>/dev/null)
    echo $gptbackup2
    1

The answer should be 1. (The 2nd GPT header, located at the end of the disk, holds the location of the 1st GPT header, which is at sector 1.)
Quick small write function...
    write () {
        local i=$((${2:-64}/8)) o= v r
        r=$((i-1))
        for ((;i--;)) {
            printf -vv '\%03o' $(( ($1>>8*(0${3+-1}?i:r-i))&255 ))
            o+=$v
        }
        printf "$o"
    }

This function defaults to 64 bits, little endian.

    Usage: write <integer> [bits:64|32|16|8] [switch to big endian]

- With two parameters, the second parameter must be one of 8, 16, 32 or 64: the bit length of the generated output.
- With any dummy 3rd parameter (even an empty string), the function switches to big endian.

    read64 foo < <(write -12345);echo $foo
    -12345

...
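The compact loop in write can be hard to follow, so here is an expanded, little-endian-only sketch of the same idea. The name write_le is hypothetical and this is not the function above, just an illustration: each byte is turned into an octal escape, and a final printf emits the raw bytes.

```bash
#!/bin/bash
# write_le: emit <value> as <bits>/8 raw bytes, low byte first.
write_le() {
    local value=$1 bits=${2:-64} i out= esc
    for (( i = 0; i < bits / 8; i++ )); do
        # build an escape such as \101 for the i-th byte
        printf -v esc '\\%03o' $(( (value >> (8 * i)) & 255 ))
        out+=$esc
    done
    printf "$out"    # the format string now holds only octal escapes
}

write_le 16961 16; echo             # prints: AB  (0x4241, low byte 0x41 first)
write_le 258 16 | od -An -t x1      # hex bytes: 02 01
```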
First post 2015...
Upgrade: adding a bash-specific version (with bashisms)
With newer versions of the printf built-in, you can do a lot without having to fork ($(...)), which makes your script a lot faster.
First, let's see (by using seq and sed) how to parse hd output:
    echo ;sed <(seq -f %02g 0 $(( COLUMNS-1 )) ) -ne '
        /0$/{s/^\(.*\)0$/\o0337\o033[A\1\o03380/;H;};
        /[1-9]$/{s/^.*\(.\)/\1/;H};
        ${x;s/\n//g;p}';hd < <(echo Hello good world!)
    0         1         2         3         4         5         6         7
    012345678901234567890123456789012345678901234567890123456789012345678901234567
    00000000  48 65 6c 6c 6f 20 67 6f  6f 64 20 77 6f 72 6c 64  |Hello good world|
    00000010  21 0a                                             |!.|
    00000012

Here the hexadecimal part begins at col 10 and ends at col 56, with values spaced 3 chars apart and one extra space at col 34.
So parsing this could be done by:

    while read line ;do
        for x in ${line:10:48};do
            printf -v x \\%o 0x$x
            printf $x
        done
    done < <( ls -l --color | hd )

Old original post
Edit 2: for hexadecimal, you could use hd

    echo Hello world | hd
    00000000  48 65 6c 6c 6f 20 77 6f  72 6c 64 0a              |Hello world.|

or od

    echo Hello world | od -t x1 -t c
    0000000  48  65  6c  6c  6f  20  77  6f  72  6c  64  0a
              H   e   l   l   o       w   o   r   l   d  \n

In short:
    while IFS= read -r -n1 car;do [ "$car" ] && echo -n "$car" || echo ; done

Try it:

    while IFS= read -rn1 c;do [ "$c" ]&&echo -n "$c"||echo;done < <(ls -l --color)

Explanation:

    while IFS= read -rn1 car  # unset InputFieldSeparator so read every chars
    do [ "$car" ] &&          # Test if there is ``something''?
          echo -n "$car" ||   # then echo them
          echo                # Else, there is an end-of-line, so print one
    done

Edit: the question was edited: need hex values!?
    od -An -v -t x1 | while read line;do for char in $line;do echo $char;done ;done

(The -v option keeps od from collapsing repeated lines into a single *.)

Demo:
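One detail worth checking with this od approach: by default od replaces runs of identical output lines with a single *, so files with repeated data (long runs of zeros, for example) would lose bytes, and an unquoted * would even be glob-expanded by the for loop. A minimal sketch that guards against both, assuming the hypothetical function name hex_per_byte:

```bash
#!/bin/bash
# hex_per_byte: print each byte of stdin as two hex digits, one per line.
hex_per_byte() {
    local -a pairs
    local hex
    # -v stops od from replacing repeated lines with "*"
    od -An -v -t x1 | while read -r -a pairs; do
        for hex in "${pairs[@]}"; do   # array elements are not glob-expanded
            echo "$hex"
        done
    done
}

head -c 32 /dev/zero | hex_per_byte | wc -l   # 32 lines, one per byte
```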
    od -An -v -t x1 < <(ls -l --color ) |    # Translate binary to 1 byte hex (-v: no "*" for repeated lines)
        while read line;do                   # Read line of HEX pairs
            for char in $line;do             # For each pair
                printf "\x$char"             # Print translate HEX to binary
            done
        done

Demo 2: We have both hex and binary
    od -An -v -t x1 < <(ls -l --color ) |        # Translate binary to 1 byte hex (-v: no "*" for repeated lines)
        while read line;do                       # Read line of HEX pairs
            for char in $line;do                 # For each pair
                bin="$(printf "\x$char")"        # translate HEX to binary
                dec=$(printf "%d" 0x$char)       # translate to decimal
                [ $dec -lt 32 ] ||               # if caracter not printable
                ( [ $dec -gt 128 ] &&            # change bin to a single dot.
                  [ $dec -lt 160 ] ) && bin="."
                str="$str$bin"
                echo -n $char \                  # Print HEX value and a space
                ((i++))                          # count printed values
                if [ $i -gt 15 ] ;then
                    i=0
                    echo " - $str"
                    str=""
                fi
            done
        done

New post, September 2016:
This could be useful in very specific cases (I've used it to manually copy GPT partitions between two disks, at low level, without having /usr mounted...).
Yes, bash can read binary!
... but only one byte at a time... (because char(0) can't be read correctly, the only way to read it correctly is to watch for end-of-file: if no character is read and end of file is not reached, then the character read is a char(0)).
This is more a proof of concept than a really useful tool: a pure bash version of hd (hexdump).
This uses recent bashisms, and needs bash v4.3 or higher.
    #!/bin/bash

    printf -v ascii \\%o {32..126}
    printf -v ascii "$ascii"
    printf -v cntrl %-20sE abtnvfr

    values=()
    todisplay=
    address=0
    printf -v fmt8 %8s
    fmt8=${fmt8// / %02x}

    while LANG=C IFS= read -r -d '' -n 1 char ;do
        if [ "$char" ] ;then
            printf -v char "%q" "$char"
            ((${#char}==1)) && todisplay+=$char || todisplay+=.
            case ${#char} in
             1|2 ) char=${ascii%$char*};values+=($((${#char}+32)));;
               7 ) char=${char#*\'\\};values+=($((8#${char%\'})));;
               5 ) char=${char#*\'\\};char=${cntrl%${char%\'}*};
                    values+=($((${#char}+7)));;
               * ) echo >&2 ERROR: $char;;
            esac
        else
            values+=(0)
        fi
        if [ ${#values[@]} -gt 15 ] ;then
            printf "%08x $fmt8 $fmt8 |%s|\n" $address ${values[@]} "$todisplay"
            ((address+=16))
            values=() todisplay=
        fi
    done

    if [ "$values" ] ;then
        ((${#values[@]}>8))&&fmt="$fmt8 ${fmt8:0:(${#values[@]}%8)*5}"||
            fmt="${fmt8:0:${#values[@]}*5}"
        printf "%08x $fmt%$(( 50-${#values[@]}*3-(${#values[@]}>8?1:0) ))s |%s|\n" $address ${values[@]} ''""'' "$todisplay"
    fi
    printf "%08x (%d chars read.)\n" $((address+${#values[@]})){,}

You could try/use this, but don't try to compare performance!
    time hd < <(seq 1 10000|gzip)|wc
       1415   25480  111711

    real    0m0.020s
    user    0m0.008s
    sys     0m0.000s

    time ./hex.sh < <(seq 1 10000|gzip)|wc
       1415   25452  111669

    real    0m2.636s
    user    0m2.496s
    sys     0m0.048s

Same job: 20ms for hd vs. about 2600ms for my bash script.
... but if you want to read 4 bytes in a file header, or even a sector address on a hard drive, this could do the job...
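For that "4 bytes in a file header" case, a minimal sketch could look like the following. The function name magic_hex is hypothetical, and setting LC_ALL=C locally is an assumption made so that one character always means one byte:

```bash
#!/bin/bash
# magic_hex: print the first four bytes of stdin as eight hex digits.
magic_hex() {
    local LC_ALL=C char out= i
    for (( i = 0; i < 4; i++ )); do
        IFS= read -r -d '' -n 1 char || break   # stop early on a short input
        printf -v char '%02x' "'$char"          # empty $char (NUL) gives 00
        out+=$char
    done
    echo "$out"
}

printf 'RIFF....WAVE' | magic_hex    # prints: 52494646
```

For a real file, redirect it into the function: magic_hex < /path/to/file.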
3 Comments
- printf %d \"$char!
- $[blah] in Bash? When was it introduced? Is it the same as $((expr))?
- The old format $[expression] is deprecated and will be removed in upcoming versions of bash. Answer edited!

Did you try xxd? It gives a hex dump directly, as you want.
For your case, the command would be:
    xxd -c 1 /path/to/input_file | while read offset hex char; do
        # Do something with $hex
    done

Note: derive the character from $hex rather than relying on the char field read by the loop. This is required because read will not capture white space properly.
1 Comment
Using read, a single char can be read at a time as follows:

    read -n 1 c
    echo $c

[ANSWER]
Try this:
    #!/bin/bash
    # data file
    INPUT=/path/to/input.txt

    # while loop
    while IFS= read -r -n1 char
    do
        # display one character at a time
        echo "$char"
    done < "$INPUT"

From this link
Second method: using awk, loop through char by char

    awk '{for(i=1;i<=length;i++) print substr($0, i, 1)}' /home/cscape/Desktop/table2.sql

Third way:

    $ fold -1 /home/cscape/Desktop/table.sql | awk '{print $0}'

EDIT: To print each char as a hex number:
Suppose I have a file named file:

    $ cat file
    123A3445F

I have written an awk script (named x.awk) that reads the file char by char and prints each char in hex:

    $ cat x.awk
    #!/bin/awk -f
    BEGIN    { _ord_init() }

    function _ord_init(    low, high, i, t)
    {
        low = sprintf("%c", 7) # BEL is ascii 7
        if (low == "\a") {    # regular ascii
            low = 0
            high = 127
        } else if (sprintf("%c", 128 + 7) == "\a") {
            # ascii, mark parity
            low = 128
            high = 255
        } else {        # ebcdic(!)
            low = 0
            high = 255
        }

        for (i = low; i <= high; i++) {
            t = sprintf("%c", i)
            _ord_[t] = i
        }
    }

    function ord(str,    c)
    {
        # only first character is of interest
        c = substr(str, 1, 1)
        return _ord_[c]
    }

    function chr(c)
    {
        # force c to be numeric by adding 0
        return sprintf("%c", c + 0)
    }

    { x=$0; printf("%s , %x\n",$0, ord(x) )}

To write this script I used the awk documentation.
Now you can use this awk script for your work as follows:

    $ fold -1 /home/cscape/Desktop/file | awk -f x.awk
    1 , 31
    2 , 32
    3 , 33
    A , 41
    3 , 33
    4 , 34
    4 , 34
    5 , 35
    F , 46

NOTE: the value of A is 41 in hexadecimal. To print in decimal, change %x to %d in the last line of the script x.awk.
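For comparison, the same "char , hex" listing can be sketched in bash alone, since printf already knows how to turn a quoted character into its code. The function name char_hex is hypothetical, and like the awk version this is meant for text input, one character per folded line:

```bash
#!/bin/bash
# char_hex: for each character of stdin (newlines excluded), print "char , hex".
char_hex() {
    local c
    fold -w 1 | while IFS= read -r c; do
        printf '%s , %x\n' "$c" "'$c"
    done
}

printf '123A3445F\n' | char_hex
```

With the sample file above this prints the same nine lines, from "1 , 31" to "F , 46".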
Give it a Try!!
2 Comments
- { x=$0; printf("%s , %d\n",$0, ord(x) )}
- echo "$char" | hexdump in my first example

Yet another solution, using head, tail and printf:
    for a in $( seq $( cat file.txt | wc -c ) ) ; do cat file.txt | head -c$a | tail -c1 | xargs -0 -I{} printf '%s %0X\n' {} "'{}" ; done

More readable:

    #!/bin/bash

    function usage() {
        echo "Need file with size > 0"
        exit 1
    }

    test -s "$1" || usage
    for a in $( seq $( cat $1 | wc -c ) )
    do
        cat $1 | head -c$a | tail -c1 | \
        xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
    done

Comments
I have a suggestion, but would like feedback from everybody, and mainly some personal advice from user syntaxerror.
I don't know much about bash, but I thought it might be better to store the output of "cat $1" in a variable. The problem is that the echo command also adds a small overhead, right?

    test -s "$1" || (echo "Need a file with size greater than 0!"; exit 1)
    a=0
    rfile=$(cat $1)
    max=$(echo $rfile | wc -c)
    while [[ $((++a)) -lt $max ]]; do
        echo $rfile | head -c$a | tail -c1 | \
        xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
    done

In my opinion it would perform better, but I haven't perf-tested it.
Comments
Although I would rather have expanded Perleone's own post (as it was his basic concept!), my edit was rejected after all, and I was kindly advised that this should be posted as a separate answer. Fair enough, so I will do that.
Considerations, in short, for the improvements on Perleone's original script:
- seq would be totally overkill here. A simple while loop with a used as a (likewise simple) counter variable will do the job just fine (and much quicker too).
- The max value, $(cat $1 | wc -c), must be assigned to a variable, otherwise it will be recalculated every time and make this alternate script run even slower than the one it was derived from.
- There's no need to waste a function on a simple usage info line. However, it is necessary to know about the (mandatory) curly braces around two commands, for without the { }, the exit 1 command will be executed in either case, and the script interpreter will never make it to the loop. (Last note: ( ) will work too, but not in the same way! Parentheses will spawn a subshell, whilst curly braces will execute commands inside them in the current shell.)
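The difference between the two groupings is easy to observe. In this small sketch (the messages and variable names are illustrative), each variant runs in its own bash -c so that exit cannot end the calling shell:

```bash
#!/bin/bash
# Parentheses: exit 1 only ends the subshell, so the script carries on.
out_paren=$(bash -c 'false || ( echo "bad input" >&2; exit 1 ); echo "still running"' 2>/dev/null)
# Braces: exit 1 ends the script itself, so the echo is never reached.
out_brace=$(bash -c 'false || { echo "bad input" >&2; exit 1; }; echo "still running"' 2>/dev/null) || true

echo "parentheses: '$out_paren'"   # parentheses: 'still running'
echo "braces:      '$out_brace'"   # braces:      ''
```

So with ( ), the error message is printed but the script would still reach the loop; only { } stops it.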
    #!/bin/bash

    test -s "$1" || { echo "Need a file with size greater than 0!"; exit 1; }

    a=0
    max=$(cat "$1" | wc -c)
    # -le rather than -lt, so that the last byte is printed too
    while [[ $((++a)) -le $max ]]; do
        cat "$1" | head -c$a | tail -c1 | \
        xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
    done
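All of the head/tail variants re-read the file once per byte, so their cost grows with the square of the file size. As a possible alternative, here is a single-pass sketch that produces the same "char 0xNN" lines from od output; the function name dump_chars is hypothetical:

```bash
#!/bin/bash
# dump_chars: print "<char> 0x<hex>" for every byte of the file given as $1.
dump_chars() {
    local -a pairs
    local hex
    # od reads the file once; -v keeps repeated lines from becoming "*"
    od -An -v -t x1 < "$1" | while read -r -a pairs; do
        for hex in "${pairs[@]}"; do
            # %b expands the \xHH escape back into the raw character
            printf '%b 0x%s\n' "\\x$hex" "$hex"
        done
    done
}

tmp=$(mktemp)
printf 'Hi' > "$tmp"
dump_chars "$tmp"      # prints: H 0x48, then i 0x69
rm -f "$tmp"
```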