Read a file by bytes in BASH

Question

I need to read the first byte of a file I specify, then the second byte, the third, and so on. How can I do this in Bash? P.S. I need the hex value of these bytes.

superset, random access: stackoverflow.com/questions/1423346/… – Ciro Santilli OurBigBook.com, Commented May 10, 2017 at 8:37

F. Hauri - Give Up GitHub · Accepted Answer · 2025-08-19 11:29:57Z

Full rewrite: September 2019!

A lot shorter and simpler than previous versions! (Somewhat faster, but not by much.)

New post, August 2025! In another thread, I show a way to hold binary data in a bash variable, using mapfile. See "set variable from file content", mostly the second part, where I store zstd's output in a variable.

Yes, bash can read and write binary:

Syntax:

LANG=C IFS= read -r -d '' -n 1 foo

will populate $foo with 1 binary byte. Unfortunately, as bash strings cannot hold null bytes ($\0), reading one byte at a time is required.

If the read command succeeds and $foo is empty, then the byte read is NULL. Otherwise, $foo holds THE byte read, in binary form.
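The rule above (a successful read with an empty variable means a NUL byte) can be sketched as follows. This is only an illustration: count_bytes is a hypothetical helper name, and LC_ALL=C is used in place of LANG=C on the assumption that a locale override which also beats any exported LC_* variable is wanted.

```bash
#!/usr/bin/env bash
# count_bytes: hypothetical helper, counts all bytes on stdin and how many are NUL.
count_bytes() {
    local byte total=0 nul=0
    # -d '' makes NUL the delimiter, -n 1 stops after one character,
    # IFS= keeps whitespace bytes (space, tab, newline) from being stripped
    while LC_ALL=C IFS= read -r -d '' -n 1 byte; do
        (( total += 1 ))
        # read succeeded but stored nothing: the byte was NUL
        [ -n "$byte" ] || (( nul += 1 ))
    done
    echo "$total bytes, $nul NUL"
}

printf 'a\0b\0\n' | count_bytes   # 5 bytes, 2 NUL
```

The loop ends when read fails at end of input, so an empty variable inside the loop body can only come from a NUL byte.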

Then, to get the value of the byte read (I had missed this in man bash; have a look at the 2016 post at the bottom of this answer ;b):

printf -v value %d \'$byte

 printf [-v var] format [arguments] ... Arguments to non-string format specifiers are treated as C constants, except that ..., and if the leading character is a single or double quote, the value is the ASCII value of the following character.
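Since the question asks for hex, the same leading-quote rule can be combined with a %02x format. The following is a minimal sketch, not the answer's own code: hexbytes is a hypothetical helper name, and LC_ALL=C is an assumption made so that bytes above 127 are treated as single bytes.

```bash
#!/usr/bin/env bash
# hexbytes: hypothetical helper, prints every byte of stdin as two hex digits.
hexbytes() {
    local byte
    while LC_ALL=C IFS= read -r -d '' -n 1 byte; do
        # "'$byte" makes printf use the numeric value of the character;
        # when $byte is empty (a NUL was read) only the quote is left, giving 0
        LC_ALL=C printf '%02x\n' "'$byte"
    done
}

printf 'Hi\0\n' | hexbytes   # prints 48, 69, 00 and 0a on separate lines
```

To read from a file instead of a pipe, redirect it: hexbytes < /path/to/file.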

So:

read8() {
    local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
    read -r -d '' -n 1 _r8_car
    printf -v $_r8_var %d \'$_r8_car
}

This populates the variable whose name is passed as argument (default: $OUTBIN) with the decimal ASCII value of the first byte from STDIN.

read16() {
    local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
    read8 _r16_lb &&
        read8 _r16_hb
    printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb ))
}

This populates the variable whose name is passed as argument (default: $OUTBIN) with the decimal value of the first 16-bit word from STDIN...

Of course, to switch endianness, you have to swap the order of the two reads:

 read8 _r16_hb && read8 _r16_lb
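To make the effect of that swap concrete, here is a small sketch with both byte orders side by side. The names read16le and read16be are hypothetical, and this read8 is a variant of the one above: it uses LC_ALL=C as command prefixes and returns early at end of input, both of which are assumptions of this sketch rather than part of the answer.

```bash
#!/usr/bin/env bash
# read8 variant: stores the decimal value of one byte from stdin in the named variable.
read8() {
    local _r8_var=${1:-OUTBIN} _r8_car
    LC_ALL=C IFS= read -r -d '' -n 1 _r8_car || return 1   # stop at end of input
    LC_ALL=C printf -v "$_r8_var" %d "'$_r8_car"
}
# Little endian: the low byte comes first in the stream.
read16le() {
    local _r16_lo _r16_hi
    read8 _r16_lo && read8 _r16_hi &&
        printf -v "${1:-OUTBIN}" %d $(( _r16_hi << 8 | _r16_lo ))
}
# Big endian: the high byte comes first in the stream.
read16be() {
    local _r16_lo _r16_hi
    read8 _r16_hi && read8 _r16_lo &&
        printf -v "${1:-OUTBIN}" %d $(( _r16_hi << 8 | _r16_lo ))
}

read16le le < <(printf '\001\002')
read16be be < <(printf '\001\002')
echo "$le $be"   # 513 258
```

The same two bytes (01 02) give 0x0201 = 513 when read as little endian and 0x0102 = 258 when read as big endian.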

And so on:

# Usage:
#       read[8|16|32|64] [varname] < binaryStdInput
read8() { local _r8_var=${1:-OUTBIN} _r8_car LANG=C IFS=
    read -r -d '' -n 1 _r8_car
    printf -v $_r8_var %d "'"$_r8_car ;}
read16() { local _r16_var=${1:-OUTBIN} _r16_lb _r16_hb
    read8 _r16_lb && read8 _r16_hb
    printf -v $_r16_var %d $(( _r16_hb<<8 | _r16_lb )) ;}
read32() { local _r32_var=${1:-OUTBIN} _r32_lw _r32_hw
    read16 _r32_lw && read16 _r32_hw
    printf -v $_r32_var %d $(( _r32_hw<<16| _r32_lw )) ;}
read64() { local _r64_var=${1:-OUTBIN} _r64_ll _r64_hl
    read32 _r64_ll && read32 _r64_hl
    printf -v $_r64_var %d $(( _r64_hl<<32| _r64_ll )) ;}

Sample: playing with GPT partition tables.

So you could source this; then, if your /dev/sda is GPT partitioned,

read totsize < <(blockdev --getsz /dev/sda)
read64 gptbackup < <(dd if=/dev/sda bs=8 skip=68 count=1 2>/dev/null)
echo $((totsize-gptbackup))
1

The answer should be 1. The 1st GPT is at sector 1, and one sector is 512 bytes. The GPT backup location is stored at byte 32 of the header. With bs=8, 512 bytes -> 64 blocks and 32 bytes -> 4 blocks, so 544 bytes -> 68 blocks to skip. The GPT backup is located at the end of the disk (disk size - 1 block). See GUID Partition Table at Wikipedia.
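The skip arithmetic can be checked without touching a real disk. The sketch below uses a scratch file from mktemp as a stand-in; the variable names and the sample byte pattern are made up for the illustration.

```bash
#!/usr/bin/env bash
sector=512      # bytes per sector, as stated above
header_lba=1    # the first GPT header sits in sector 1
field=32        # byte offset of the backup-location field inside the header
bs=8            # dd block size: one 64-bit value
skip=$(( (header_lba * sector + field) / bs ))
echo "$skip"    # 68

# Write 8 sample bytes at that position of a scratch file, then read them back.
img=$(mktemp)
printf '\010\007\006\005\004\003\002\001' |
    dd of="$img" bs="$bs" seek="$skip" conv=notrunc 2>/dev/null
found=$(dd if="$img" bs="$bs" skip="$skip" count=1 2>/dev/null |
    od -An -v -t x1 | tr -d ' \n')
rm -f "$img"
echo "$found"   # 0807060504030201
```

Reading the bytes back with the same skip value confirms that block 68 with bs=8 starts at byte 544.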

Then

read64 gptbackup2 < <(
    dd if=/dev/sda bs=8 skip=$((4+gptbackup*64)) count=1 2>/dev/null)
echo $gptbackup2
1

The answer should be 1 (the 2nd GPT table, located at the end of the disk, holds the location of the 1st GPT table, which is located at sector 1).

Quick small write function...

write () {
    local i=$((${2:-64}/8)) o= v r
    r=$((i-1))
    for ((;i--;)) {
        printf -vv '\%03o' $(( ($1>>8*(0${3+-1}?i:r-i))&255 ))
        o+=$v
    }
    printf "$o"
}

This function defaults to 64 bits, little endian.

Usage: write <integer> [bits:64|32|16|8] [switchto big endian]

With two parameters, the second parameter must be one of 8, 16, 32 or 64: the bit length of the generated output.
With any dummy 3rd parameter (even an empty string), the function switches to big endian.
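The byte order produced in each mode is easiest to see through od. The sketch below is a more verbose restatement of the same idea, not the answer's function: put_int is a hypothetical name, and it takes the literal word be as third argument instead of "any dummy parameter".

```bash
#!/usr/bin/env bash
# put_int VALUE [BITS] [be]: hypothetical helper, writes VALUE as BITS/8 raw bytes.
put_int() {
    local value=$1 bytes=$(( ${2:-64} / 8 )) order=${3:-le} i pos oct out=
    for (( i = 0; i < bytes; i++ )); do
        if [ "$order" = be ]; then
            pos=$(( 8 * (bytes - 1 - i) ))   # most significant byte first
        else
            pos=$(( 8 * i ))                 # least significant byte first
        fi
        # build an octal escape such as \002 for each byte
        printf -v oct '\\%03o' $(( (value >> pos) & 255 ))
        out+=$oct
    done
    printf "$out"   # the escapes are expanded into raw bytes here
}

put_int 258 16    | od -An -v -t x1   # bytes 02 01
put_int 258 16 be | od -An -v -t x1   # bytes 01 02
```

Building the whole escape string first and emitting it with a single printf keeps NUL bytes intact, since they never have to be stored in a variable.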

.

read64 foo < <(write -12345);echo $foo
-12345

...

First post 2015...

Upgrade for adding specific bash version (with bashisms)

With the new version of the printf built-in, you can do a lot without having to fork ($(...)), making your script a lot faster.

First let's see (by using seq and sed) how to parse hd output:

echo ;sed <(seq -f %02g 0 $(( COLUMNS-1 )) ) -ne '
    /0$/{s/^\(.*\)0$/\o0337\o033[A\1\o03380/;H;};
    /[1-9]$/{s/^.*\(.\)/\1/;H};
    ${x;s/\n//g;p}';hd < <(echo Hello good world!)
0         1         2         3         4         5         6         7
012345678901234567890123456789012345678901234567890123456789012345678901234567
00000000  48 65 6c 6c 6f 20 67 6f  6f 64 20 77 6f 72 6c 64  |Hello good world|
00000010  21 0a                                             |!.|
00000012

Here the hexadecimal part begins at col 10 and ends at col 56, with values spaced by 3 chars and one extra space at col 34.

So parsing this could be done by:
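As a quick check of those column positions, the sketch below slices one sample hd line with the same ${line:10:48} expansion and lists what falls out. The sample line is copied from the output above; the variable names are made up for the illustration.

```bash
#!/usr/bin/env bash
line='00000000  48 65 6c 6c 6f 20 67 6f  6f 64 20 77 6f 72 6c 64  |Hello good world|'
hexpart=${line:10:48}          # 48 characters starting at column 10
count=0
for x in $hexpart; do          # unquoted on purpose: split on whitespace
    printf '%s=%d ' "$x" "0x$x"
    count=$(( count + 1 ))
done
echo
echo "$count values"           # 16 values
```

The extra space at col 34 is harmless here because word splitting treats any run of spaces as a single separator.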

while read line ;do
    for x in ${line:10:48};do
        printf -v x \\%o 0x$x
        printf $x
    done
done < <( ls -l --color | hd )

Old original post

Edit 2 for Hexadecimal, you could use hd

echo Hello world | hd
00000000  48 65 6c 6c 6f 20 77 6f  72 6c 64 0a              |Hello world.|

or od

echo Hello world | od -t x1 -t c
0000000  48  65  6c  6c  6f  20  77  6f  72  6c  64  0a
          H   e   l   l   o       w   o   r   l   d  \n

shortly

while IFS= read -r -n1 car;do [ "$car" ] && echo -n "$car" || echo ; done

try them:

while IFS= read -rn1 c;do [ "$c" ]&&echo -n "$c"||echo;done < <(ls -l --color)

Explain:

while IFS= read -rn1 car   # unset InputFieldSeparator so read every chars
do [ "$car" ] &&           # Test if there is ``something''?
       echo -n "$car" ||   # then echo them
       echo                # Else, there is an end-of-line, so print one
done

Edit; Question was edited: need hex values!?

od -An -t x1 | while read line;do for char in $line;do echo $char;done ;done
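One thing to watch with this pipeline: by default od replaces runs of identical output lines with a single *, so long runs of repeated bytes would be lost. The sketch below adds od's -v option to keep every line; hex_lines is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# hex_lines: hypothetical helper, prints one hex pair per line for every byte on stdin.
hex_lines() {
    local line h
    # -An: no address column, -v: do not collapse repeated lines, -t x1: 1-byte hex
    od -An -v -t x1 | while read -r line; do
        for h in $line; do
            printf '%s\n' "$h"
        done
    done
}

printf '%048d' 0 | hex_lines | wc -l   # 48 input bytes give 48 lines
```

Without -v, the 48 identical bytes in this example would come out as fewer than 48 values plus a stray * line.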

Demo:

od -An -t x1 < <(ls -l --color ) |    # Translate binary to 1 byte hex
while read line;do                    # Read line of HEX pairs
    for char in $line;do              # For each pair
        printf "\x$char"              # Print translate HEX to binary
    done
done

Demo 2: We have both hex and binary

od -An -t x1 < <(ls -l --color ) |    # Translate binary to 1 byte hex
while read line;do                    # Read line of HEX pairs
    for char in $line;do              # For each pair
        bin="$(printf "\x$char")"     # translate HEX to binary
        dec=$(printf "%d" 0x$char)    # translate to decimal
        [ $dec -lt 32 ] ||            # if character not printable
        ( [ $dec -gt 128 ] &&         # change bin to a single dot.
          [ $dec -lt 160 ] ) && bin="."
        str="$str$bin"
        echo -n "$char "              # Print HEX value and a space
        ((i++))                       # count printed values
        if [ $i -gt 15 ] ;then
            i=0
            echo " - $str"
            str=""
        fi
    done
done

New post, September 2016:

This could be useful in very specific cases (I've used it to manually copy GPT partitions between two disks, at a low level, without having /usr mounted...)

Yes, bash could read binary!

... but only one byte at a time... (because `char(0)' can't be read correctly, the only way to read it is to rely on end-of-file: if no character is read and end of file has not been reached, then the character read is a char(0)).

This is more a proof of concept than a really useful tool: here is a pure bash version of hd (hexdump).

This uses recent bashisms, under bash v4.3 or higher.

#!/bin/bash
printf -v ascii \\%o {32..126}
printf -v ascii "$ascii"
printf -v cntrl %-20sE abtnvfr
values=()
todisplay=
address=0
printf -v fmt8 %8s
fmt8=${fmt8// / %02x}
while LANG=C IFS= read -r -d '' -n 1 char ;do
    if [ "$char" ] ;then
        printf -v char "%q" "$char"
        ((${#char}==1)) && todisplay+=$char || todisplay+=.
        case ${#char} in
            1|2 ) char=${ascii%$char*};values+=($((${#char}+32)));;
            7 ) char=${char#*\'\\};values+=($((8#${char%\'})));;
            5 ) char=${char#*\'\\};char=${cntrl%${char%\'}*};
                values+=($((${#char}+7)));;
            * ) echo >&2 ERROR: $char;;
        esac
    else
        values+=(0)
    fi
    if [ ${#values[@]} -gt 15 ] ;then
        printf "%08x $fmt8 $fmt8 |%s|\n" $address ${values[@]} "$todisplay"
        ((address+=16))
        values=()
        todisplay=
    fi
done
if [ "$values" ] ;then
    ((${#values[@]}>8))&&fmt="$fmt8 ${fmt8:0:(${#values[@]}%8)*5}"||
        fmt="${fmt8:0:${#values[@]}*5}"
    printf "%08x $fmt%$(( 50-${#values[@]}*3-(${#values[@]}>8?1:0) ))s |%s|\n" $address ${values[@]} ''""'' "$todisplay"
fi
printf "%08x (%d chars read.)\n" $((address+${#values[@]})){,}

You could try/use this, but don't try to compare performances!

time hd < <(seq 1 10000|gzip)|wc
   1415   25480  111711
real    0m0.020s
user    0m0.008s
sys     0m0.000s

time ./hex.sh < <(seq 1 10000|gzip)|wc
   1415   25452  111669
real    0m2.636s
user    0m2.496s
sys     0m0.048s

Same job: 20 ms for hd vs. about 2600 ms for my bash script.

... but if you want to read 4 bytes in a file header, or even a sector address on a hard drive, this could do the job...

What does $[blah] do in Bash? When was it introduced? Is it the same as $((expr))?
@kyb Yes, my bad; from bash's man page: The old format $[expression] is deprecated and will be removed in upcoming versions of bash. Answer edited!

anishsane · Accepted Answer · 2012-12-15 08:23:27Z

Did you try xxd? It gives hex dump directly, as you want..

For your case, the command would be:

xxd -c 1 /path/to/input_file | while read offset hex char; do
    #Do something with $hex
done

Note: derive the character from $hex rather than taking it from the line read by while read. This is required because read will not capture white space properly.
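A small sketch of why only $hex is dependable: for a space byte the third field comes back empty, while the hex field still carries the value. hex_to_dec is a hypothetical helper name, and the xxd line layout (offset, hex pair, character) is assumed from the command above. The demo is skipped if xxd is not installed.

```bash
#!/usr/bin/env bash
# hex_to_dec: hypothetical helper, reads `xxd -c 1` style lines
# and prints each byte as hex followed by its decimal value.
hex_to_dec() {
    local offset hex char
    while read -r offset hex char; do
        # $char is empty when the byte is a space, so rely on $hex only
        printf '%s %d\n' "$hex" "0x$hex"
    done
}

if command -v xxd >/dev/null 2>&1; then
    printf 'a b' | xxd -c 1 | hex_to_dec
fi
```

Converting with printf %d and a 0x prefix avoids any dependence on how the character column is rendered.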

Grijesh Chauhan · Accepted Answer · 2012-12-15 10:19:45Z

Using read, a single char can be read at a time, as follows:

read -n 1 c
echo $c

[ANSWER]

Try this:

#!/bin/bash
# data file
INPUT=/path/to/input.txt
# while loop
while IFS= read -r -n1 char
do
    # display one character at a time
    echo "$char"
done < "$INPUT"

From this link

Second method, Using awk, loop through char by char

awk '{for(i=1;i<=length;i++) print substr($0, i, 1)}' /home/cscape/Desktop/table2.sql

third way,

$ fold -1 /home/cscape/Desktop/table.sql | awk '{print $0}'

EDIT: To print each char as HEX number:

Suppose I have a file name file :

$ cat file
123A3445F

I have written an awk script (named x.awk) that reads char by char from the file and prints each one in HEX:

$ cat x.awk
#!/bin/awk -f
BEGIN { _ord_init() }
function _ord_init( low, high, i, t)
{
    low = sprintf("%c", 7) # BEL is ascii 7
    if (low == "\a") { # regular ascii
        low = 0
        high = 127
    } else if (sprintf("%c", 128 + 7) == "\a") {
        # ascii, mark parity
        low = 128
        high = 255
    } else { # ebcdic(!)
        low = 0
        high = 255
    }
    for (i = low; i <= high; i++) {
        t = sprintf("%c", i)
        _ord_[t] = i
    }
}
function ord(str, c)
{
    # only first character is of interest
    c = substr(str, 1, 1)
    return _ord_[c]
}
function chr(c)
{
    # force c to be numeric by adding 0
    return sprintf("%c", c + 0)
}
{ x=$0; printf("%s , %x\n",$0, ord(x) )}

To write this script I used the awk documentation.
Now you can use this awk script for your work as follows:

$ fold -1 /home/cscape/Desktop/file | awk -f x.awk
1 , 31
2 , 32
3 , 33
A , 41
3 , 33
4 , 34
4 , 34
5 , 35
F , 46

NOTE: The value of A is 41 in hexadecimal. To print in decimal, change %x to %d in the last line of the script x.awk.

Give it a Try!!

To print in decimal use last line { x=$0; printf("%s , %d\n",$0, ord(x) )}
you can also use echo "$char" | hexdump in my first example

Perleone · Accepted Answer · 2012-12-16 11:16:37Z

Yet another solution, using head, tail and printf:

for a in $( seq $( cat file.txt | wc -c ) ) ; do cat file.txt | head -c$a | tail -c1 | xargs -0 -I{} printf '%s %0X\n' {} "'{}" ; done


Willian Mainieri · Accepted Answer · 2015-01-01 16:39:23Z

I have a suggestion to offer, but I would like feedback from everybody, and mainly personal advice from the user syntaxerror.

I don't know much about bash, but I thought maybe it would be better to have "cat $1" stored in a variable... but the problem is that the echo command will also bring a small overhead, right?

test -s "$1" || (echo "Need a file with size greater than 0!"; exit 1)
a=0
rfile=$(cat $1)
max=$(echo $rfile | wc -c)
while [[ $((++a)) -lt $max ]]; do
    echo $rfile | head -c$a | tail -c1 | \
    xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
done

In my opinion it would have better performance, but I haven't perf-tested it.

syntaxerror · Accepted Answer · 2015-06-27 15:27:31Z

Although I would rather have expanded Perleone's own post (as it was his basic concept!), my edit was rejected after all, and I was kindly advised that this should be posted as a separate answer. Fair enough, so I will do that.

Considerations in short for the improvements on Perleone's original script:

seq would be total overkill here. A simple while loop with $a used as a (likewise simple) counter variable will do the job just fine (and much quicker too).
The max value, $(cat $1 | wc -c), must be assigned to a variable; otherwise it will be recalculated every time and make this alternate script run even slower than the one it was derived from.
There's no need to waste a function on a simple usage info line. However, it is necessary to know about the (mandatory) curly braces around the two commands, for without the { }, the exit 1 command will be executed in either case, and the script interpreter will never make it to the loop. (Last note: ( ) will work too, but not in the same way! Parentheses will spawn a subshell, whilst curly braces will execute the commands inside them in the current shell.)
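The difference between the two groupings can be seen in isolation. In this sketch, demo_parens and demo_braces are hypothetical names, and each function body is itself wrapped in a subshell so that it can stand in for "the script" without ending the calling shell.

```bash
#!/usr/bin/env bash
# Each function body runs in its own subshell, standing in for a whole script.
demo_parens() (
    false || ( exit 1 )            # exit only leaves the inner subshell
    echo "parens: script continues"
)
demo_braces() (
    false || { exit 1; }           # exit leaves the "script" itself
    echo "braces: never printed"
)

demo_parens
demo_braces || echo "braces: exited with status $?"
```

With parentheses the exit status of 1 is simply discarded and execution carries on, which is why the guard needs curly braces to actually stop the script.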

#!/bin/bash
test -s "$1" || { echo "Need a file with size greater than 0!"; exit 1; }
a=0
max=$(cat $1 | wc -c)
while [[ $((++a)) -lt $max ]]; do
    cat $1 | head -c$a | tail -c1 | \
    xargs -0 -I{} printf '%c %#02x\n' {} "'{}"
done
