25 events
when what by license comment
May 6 at 11:46 history edited Ciro Santilli OurBigBook.com CC BY-SA 4.0
added 6 characters in body
Sep 27, 2024 at 10:03 comment added Thomas Guyot-Sionnest @jrw if a writer seeks past the end of the file before writing, or if the file is truncated while a writer is writing to it, the gap between the old end of the file (or the end of the sections written by the truncating process) and the newly written data will be filled with null bytes (all zero bits). Full blocks will usually not even be allocated on disk unless written to; that's called a sparse file, and its apparent size can be much bigger than the disk space it actually uses. On most systems it's also possible to explicitly "poke holes" in the middle of files, i.e. deallocate blocks, turning them into nulls.
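The seek-past-the-end case described in the comment above can be sketched in a few lines of Python. This is only an illustration under assumptions: the file name `holey.bin`, the 4096-byte gap, and the helper names are all made up for the example. Whether the gap is really left unallocated on disk depends on the filesystem, so the sketch only checks what a reader sees, not disk usage.

```python
import os
import tempfile


def make_file_with_hole(path, gap=4096):
    """Write b"A", skip `gap` bytes without writing, then write b"B"."""
    with open(path, "wb") as f:
        f.write(b"A")
        f.seek(gap, os.SEEK_CUR)  # move past the end of the file; nothing is written here
        f.write(b"B")


def read_all(path):
    """Return the whole file content as bytes."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "holey.bin")  # hypothetical file name
    make_file_with_hole(path)
    data = read_all(path)
    size = os.path.getsize(path)
    # The skipped region reads back as NUL bytes, which is what a tool
    # scanning the file for binary content would encounter.
    hole_is_nul = data[1:-1] == b"\0" * 4096
    print(size, hole_is_nul)
```

On filesystems that support sparse files, comparing `os.stat(path).st_blocks` with the apparent size may show that the gap takes little or no space, but that value is filesystem-dependent and is not checked here.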
Dec 3, 2023 at 20:59 history edited Cristian Ciupitu CC BY-SA 4.0
new style formatting; real headers; links to source code
Mar 20, 2021 at 8:43 history edited Ciro Santilli OurBigBook.com CC BY-SA 4.0
added 27 characters in body
Mar 20, 2021 at 8:27 history edited Ciro Santilli OurBigBook.com CC BY-SA 4.0
deleted 89 characters in body
Mar 4, 2021 at 6:33 comment added Ciro Santilli OurBigBook.com @Quasímodo cirosantilli.com/…
Mar 3, 2021 at 20:20 comment added Quasímodo Dammit, the only answer that thoroughly and precisely addresses the questions sits down here with 10% of the votes of the most voted one.
S Apr 15, 2018 at 7:34 history suggested user273376 CC BY-SA 3.0
Corrected links.
Apr 15, 2018 at 3:20 review Suggested edits
S Apr 15, 2018 at 7:34
Nov 20, 2017 at 10:03 history edited Ciro Santilli OurBigBook.com CC BY-SA 3.0
added 31 characters in body
Apr 13, 2017 at 12:36 history edited CommunityBot
replaced http://unix.stackexchange.com/ with https://unix.stackexchange.com/
Jun 12, 2016 at 14:12 comment added jrw32982 @CiroSantilli巴拿馬文件六四事件法轮功 sparse file
Jun 12, 2016 at 6:59 comment added Ciro Santilli OurBigBook.com @jrw32982 thanks for input! What does a "hole" in the file mean?
Jun 12, 2016 at 4:02 comment added jrw32982 @CiroSantilli巴拿馬文件六四事件法轮功 The grep 2.16 source looks substantially different than the grep 2.24 source. There is no encoding_error_output. The checks are for a NUL in the first buffer or if there are "holes" in the file indicating a NUL character somewhere. If -z is specified, then it checks instead for \x80 (\200).
Jun 9, 2016 at 18:13 comment added Ciro Santilli OurBigBook.com @jrw32982 2.24, same I opened source for. Ubuntu 16.04.
Jun 8, 2016 at 23:33 comment added jrw32982 @CiroSantilli巴拿馬文件六四事件法轮功 what version of GNU grep did you test against?
Jun 8, 2016 at 19:20 comment added Ciro Santilli OurBigBook.com @jrw32982 interesting. Maybe open up 2.16 and see if the encoding_error_output is there. Maybe it was added since.
Jun 8, 2016 at 18:15 comment added jrw32982 @StéphaneChazelas I was not able to reproduce the UTF locale part of this with GNU grep 2.16. printf 'a\x80' | LC_ALL=en_US.UTF-8 grep a did not warn, whereas changing 80 to 00 did warn.
Apr 13, 2016 at 13:09 comment added Stéphane Chazelas I didn't look into great detail either, but did very recently
Apr 13, 2016 at 13:05 comment added Ciro Santilli OurBigBook.com @StéphaneChazelas "Note that the check for valid UTF-8 only happens in UTF-8 locales": do you mean about the export LC_CTYPE='en_US.UTF-8' as in my example, or something else? Buf read: amazing example, added to answer. You have obviously read the source more than me, reminds me of those hacker koans "The student was enlightened" :-)
Apr 13, 2016 at 13:00 history edited Ciro Santilli OurBigBook.com CC BY-SA 3.0
added 934 characters in body
Apr 13, 2016 at 12:18 comment added Stéphane Chazelas Note that the check for valid UTF-8 only happens in UTF-8 locales. Also note that the check is only done on the first buffer read from the file which for a regular file seems to be 32768 bytes on my system, but for a pipe or socket can be as small as one byte. Compare (printf '\n\0y') | grep y with (printf '\n'; sleep 1; printf '\0y') | grep y for instance.
Apr 13, 2016 at 12:10 history edited Ciro Santilli OurBigBook.com CC BY-SA 3.0
added 40 characters in body
Apr 13, 2016 at 2:02 comment added user394 Impressive explication!
Apr 12, 2016 at 20:50 history answered Ciro Santilli OurBigBook.com CC BY-SA 3.0