Can I use `sed` in place of `read` at the start of a while loop that takes input from a pipe?

Question

I would like to process data from a pipe inside a while loop. However, the data is separated by 3 forward slash characters rather than a newline, because I would like the script to be able to handle input that has newlines in it.

    # The normal `read`-driven while loop
    while read -r; do
        echo "$REPLY"
    done

Instead of `read`, I would like to use `sed` because it allows for more flexibility when reading input. The problem I'm running into is that `sed` eats all the data from the pipe the first time through the while loop, so the loop only runs once.

Here is how I'm testing this

(note 1: uses fd, a file finding utility)

    # Make some files
    mkdir test-{1,2}
    touch 'test-1/file' $'test-2/file\nwith a newline'

    # Find the files we just created
    # note 2: there are some wide comments here ->>>
    fd -t f . test-1/ test-2/ -Hx printf '%s///' '{}' | # List all files in test-1 and test-2, putting "///" at the end of each one
       sed 's/./\n&/g' | # prepend each character with a newline so it can be read by `sed` one character at a time
       while v="$(sed -n ':loop; N; s|\n\(.\)$|\1|; /\/\/\/$/!{b loop}; s|///$||; p;Q;')"; do
          #       ^^^^^^ sed script that prints everything up to the "///"
          #              that notates the end of the current path, then exits
          #              without processing more than the first one (so the
          #              rest can be processed in future loop iterations)
          if [ -n "$v" ]; then
             # print the file name
             echo "$v"
          else
             # if v is empty
             break
          fi
       done

The output of this is

    test-1/file

... which indicates sed is only running once, because the output should be this:

    test-1/file
    test-2/file
    with a newline

Is there a way to get sed to behave like read so it can be used in a while loop? Is there a magical property that allows read to do this because it's a builtin?

Why not `sed 's|///|\n|g' | while read -r line`? "Is there a way to get sed to behave like read so it can be used in a while loop?" You can use `while read line && v=$(sed ... <<<"$line")` if you want to run it for each line. Any command inherits stdin, so `sed` reads from the stdin that you pass; there is only one stdin. I do not understand what the `:loop; N; s|\n\(.\)$|\1|; /\/\/\/$/!{b loop}; s|///$||; p;Q;` script should do. What should it do? Why the loop?
– KamilCuk, Commented Nov 26, 2022 at 13:24
`read` also reads its input stream only once. The only thing "magical" about it is that it can assign directly into variables, but a big difference between `read` and `sed` is that `read` only reads up to the first newline, and it returns non-zero if it does not read any data. A common technique is `sed ... | while read ...; do ...; done`, but this brings up other potential issues.
– William Pursell, Commented Nov 26, 2022 at 13:25
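To illustrate the point in the comment above, here is a minimal sketch (the sample lines are made up for illustration) showing that each `read` call takes exactly one line from a shared pipe and leaves the rest for the next command, and that `read` fails once there is nothing left:

```shell
#!/usr/bin/env bash
# Two consecutive reads share one stdin; each consumes exactly one line,
# so the second read still finds "two" waiting in the pipe.
result=$(printf 'one\ntwo\nthree\n' | {
    IFS= read -r first
    IFS= read -r second
    printf '%s,%s\n' "$first" "$second"
})
echo "$result"

# At end of input, read returns non-zero; this is what ends a while loop.
if ! IFS= read -r nothing < /dev/null; then
    eof_status="read failed at EOF"
fi
echo "$eof_status"
```

A tool that reads its input in larger blocks cannot be relied on to stop this cleanly on a pipe, which is why the usual pattern keeps `read` as the loop condition.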
"However, the data is separated by 3 forward slash characters rather than a newline, because I would like the script to be able to handle input that has newlines in it." Is it your choice that you are using 3 forward slashes? Or are you forced to use it? Why not use a zero-separated stream, the standard approach when you want to handle newlines in the stream? Are you asking an XY question? Do you want to ask "how to handle newlines in filenames" instead of how to write a sed for each line?
– KamilCuk, Commented Nov 26, 2022 at 13:27
"Is it your choice that you are using 3 forward slashes?" Using three forward slashes is my choice. I'm doing it because I can guarantee that fd won't output three of them in a row because of filesystem naming limitations (slashes are not allowed in file names, and you can't have a zero-width file name).
– PowerUser64, Commented Nov 26, 2022 at 22:46
"Using three forward slashes is my choice." Then the standard way is to use a zero byte to separate the data, as in `xargs -0`, `find -print0`, `cut -z`, `sed -z`, `grep -z`, `perl -0`, `sort -z`, etc.
– KamilCuk, Commented Nov 27, 2022 at 18:13

KamilCuk · Accepted Answer · 2022-11-26 13:35:05Z

I will guess that you are asking an XY question.

I would like the script to be able to handle input that has newlines in it.

Use a zero separated stream.

    # I would do standard: find . -type f -print0 |
    fd -H0t f . | while IFS= read -r -d '' file; do
        echo "$file"
    done
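As a self-contained sketch of the loop above, assuming bash (whose `read` builtin accepts `-d ''` to split on NUL bytes): `printf '%s\0'` stands in for `fd -0` or `find -print0`, the two names are the ones from the question, and the `files` array is only there for illustration.

```shell
#!/usr/bin/env bash
# printf emits each name followed by a NUL byte, imitating `fd -0` / `find -print0`.
files=()
while IFS= read -r -d '' file; do
    files+=("$file")   # collect the names so they can be inspected after the loop
done < <(printf '%s\0' 'test-1/file' $'test-2/file\nwith a newline')

echo "read ${#files[@]} names"
printf '[%s]\n' "${files[@]}"
```

The loop is fed through process substitution rather than a pipe so that it runs in the current shell and `files` is still set afterwards; with a plain pipe the loop body would normally run in a subshell.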

To handle a stream separated by 3 slashes, you would replace each run of 3 slashes with a zero byte, and then read the zero-separated stream. This would be a problem if your stream contained an actual zero byte, but with filenames that is impossible.

    ... | sed 's|///|\x00|g' | while IFS= read -r -d '' file; do
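An end-to-end sketch of that conversion, assuming bash and GNU sed (the `\x00` escape in the replacement is a GNU extension). `printf '%s///'` imitates the `fd ... -x printf '%s///'` producer from the question, and the `paths` array is an illustrative name:

```shell
#!/usr/bin/env bash
# Assumes GNU sed, which understands \x00 in the replacement text.
paths=()
while IFS= read -r -d '' file; do
    paths+=("$file")
done < <(printf '%s///' 'test-1/file' $'test-2/file\nwith a newline' |
         sed 's|///|\x00|g')

# Print each recovered path in brackets to make embedded newlines visible.
printf '<%s>\n' "${paths[@]}"
```

Because `///` never contains a newline, it does not matter that `sed` processes the stream line by line; every separator sits entirely within one line.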

The OP doesn't indicate which OS they are on; `sed` does not portably support `\x00`, though you can pretty much rely on Perl being installed instead, which is de facto reasonably portable: `perl -pe 's|///|\x00|g'`
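If neither GNU sed nor Perl can be assumed, the same conversion can be sketched with bash builtins alone. This is only an illustration, not something from the thread: `split_on_triple_slash` is a hypothetical helper name, it slurps all of stdin into memory, and it drops any trailing data that is not terminated by `///`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a "///"-terminated stream into a NUL-terminated one.
split_on_triple_slash() {
    local data
    # With -d '' read consumes everything up to a NUL or EOF; it returns
    # non-zero at EOF, which is expected here, hence the `|| true`.
    IFS= read -r -d '' data || true
    while [[ $data == *///* ]]; do
        printf '%s\0' "${data%%///*}"   # text before the first "///"
        data=${data#*///}               # drop that text and its separator
    done
}

names=()
while IFS= read -r -d '' file; do
    names+=("$file")
done < <(printf '%s///' 'test-1/file' $'test-2/file\nwith a newline' |
         split_on_triple_slash)

printf '[%s]\n' "${names[@]}"
```

For large inputs a streaming tool is the better fit; this version trades memory for having no external dependencies.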
