I'm completely new to Perl, but I thought it would be the best language for my simple task. I need to convert a binary file into something readable by finding byte sequences such as \x00\x39 and replacing them with \x09 (tab) or something like that.

From bash, I started with the following and it works great:

perl -pi -e 's/abc/123/g' test.txt 

However, when I try to enter ASCII codes, I'm lost:

perl -pi -e 's/0x49/*/g' test.txt
perl -pi -e 's/{char(49)}/*/g' test.txt

What would this command look like as a line in a Perl script? I have a couple hundred of these find/replace operations and a 500MB text file. Are there any caveats I need to know about?

Thanks so much for any help!

Gary

3 Answers

Use the \x## notation:

perl -pi~ -e 's/\x00/*/g' test.txt 
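
In a script, the same substitution is an ordinary statement, and a couple hundred of them can be kept in a list of pattern/replacement pairs that is applied in order. A minimal sketch, assuming the input is already in a string; the sub name apply_rules and the two rules are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical rule table: each entry is [ compiled pattern, replacement string ].
# \x## in a regex or a double-quoted string means "the byte with this hex code".
my @rules = (
    [ qr/\x00\x39/, "\x09" ],   # NUL followed by "9" becomes a tab
    [ qr/\x00/,     "*"    ],   # any remaining NUL becomes an asterisk
);

# Apply every rule, in order, to a copy of the input and return the result.
sub apply_rules {
    my ($data) = @_;
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        $data =~ s/$pattern/$replacement/g;
    }
    return $data;
}

print apply_rules("a\x00\x39b\x00c"), "\n";   # prints a, a tab, then b*c
```

The order of the rules matters: the two-byte rule has to run before the single-byte rule, or the NUL it depends on would already be gone.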

To replace each "special" character with its code in brackets, use the /e option:

perl -pi~ -e 's/([\x00-\x09\x0b-\x1f])/"[" . ord($1) . "]"/eg' test.txt
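
The same idea written as a sub in a script may make the /e option easier to follow. This is only a sketch: the name show_control_codes is invented here, and the character class is my assumption (every control character except line feed, \x0a).

```perl
use strict;
use warnings;

# Replace each control character except line feed (\x0a) with its decimal
# code in square brackets. With /e the replacement part is evaluated as
# Perl code instead of being used as a plain string.
sub show_control_codes {
    my ($text) = @_;
    $text =~ s/([\x00-\x09\x0b-\x1f])/"[" . ord($1) . "]"/eg;
    return $text;
}

print show_control_codes("a\x00b\tc\n");   # prints: a[0]b[9]c
```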

1 Comment

Yes, that's what I needed; I was missing the backslash :-)

Wow, thank you very much. I learned that it wasn't as easy as I assumed. Wow, Perl is truly very complex ;-)

Here is what I came up with. I hope this will help someone.

BTW: If anyone knows whether this will also work with Perl on Windows, please let me know.

Thanks again,

Gary

#!/usr/bin/perl
use strict;
use warnings;

my $infile  = '/Users/gc/Desktop/a.bin';
my $outfile = '/Users/gc/Desktop/b.txt';
# in and out can be the same file; the output file is overwritten if it already exists

my $data = read_file($infile);

# 1st batch
$data =~ s/0\x01J[\x00-\x19]/\x09AnythingYouWant\x09/g;
$data =~ s/0\x00[\x00-\x19]/\x09AnythingYouWant\x09/g;

# 2nd batch
$data =~ s/\r\n/\x06/g;   # CR LF into \x06 (must run before the single-character rules)
$data =~ s/\r/\x06/g;     # CR into \x06
$data =~ s/\n/\x06/g;     # LF into \x06
# …

write_file($outfile, $data);
exit;

sub read_file {
    my ($infile) = @_;
    open my $in, '<', $infile or die "Could not open '$infile' for reading $!";
    local $/ = undef;   # slurp mode: read the whole file in one go
    my $all = <$in>;
    close $in;
    return $all;
}

sub write_file {
    my ($outfile, $content) = @_;
    open my $out, '>', $outfile or die "Could not open '$outfile' for writing $!";
    print $out $content;
    close $out;
    return;
}
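
On the Windows question: Perl on Windows opens files in text mode by default, which translates CR LF to LF on reading and back again on writing, so binary data is safer to read and write through the :raw layer. A sketch of the two helpers with that change; slurp_raw and spew_raw are names chosen here, and the temporary file exists only for the demonstration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file as raw bytes (no newline translation on any platform).
sub slurp_raw {
    my ($path) = @_;
    open my $in, '<:raw', $path or die "Could not open '$path' for reading: $!";
    local $/;                      # undef record separator: read everything at once
    my $all = <$in>;
    close $in;
    return $all;
}

# Write a string as raw bytes, replacing any existing file.
sub spew_raw {
    my ($path, $content) = @_;
    open my $out, '>:raw', $path or die "Could not open '$path' for writing: $!";
    print {$out} $content;
    close $out or die "Could not close '$path': $!";
    return;
}

# Demonstration on a temporary file (hypothetical sample data).
my ($fh, $tmp) = tempfile(UNLINK => 1);
close $fh;
spew_raw($tmp, "one\r\ntwo\rthree\n");
my $data = slurp_raw($tmp);
$data =~ s/\r\n|\r|\n/\x06/g;      # CR LF, lone CR and lone LF in a single pass
spew_raw($tmp, $data);
```

Listing \r\n first in the alternation lets a CR LF pair collapse into a single \x06. Slurping still holds the whole file in memory, so a 500MB input needs at least that much RAM.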

Although it's a bit weird to do string replaces on a binary file, here's how to do it with your txt file:

use strict;
use warnings;
use Tie::File;

my @file;
tie @file, 'Tie::File', 'test.txt' or die $!;

foreach (@file) {
    # your regexes go here
    s/abc/123/g;
    s/\x49/*/g;
}

untie @file;

The Tie::File module (from the Perl core) lets you access the lines of the file through an array. Changes are saved to the file immediately. The foreach loop processes the file line by line. Since no loop variable is named, each line is placed in the implicit variable $_. The substitution operators also act on $_ by default, so there is no need to write it out.
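
The implicit $_ behaves the same way with an ordinary array, which may be easier to experiment with than a tied file. A small sketch with made-up data:

```perl
use strict;
use warnings;

my @lines = ("abc def", "xyz abc");

# foreach aliases $_ to each element in turn, so s/// without an explicit
# target edits the array element itself, not a copy.
foreach (@lines) {
    s/abc/123/g;
}

print "$_\n" for @lines;   # prints "123 def" and then "xyz 123"
```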


However, I believe you are going about this the wrong way. In most cases you cannot simply read a binary file line by line, because the data has no real line structure. Refer to perlfaq as a starting point. Dealing with binary data is somewhat trickier than plain text processing, I'm afraid.
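
One possible alternative to line-based reading is to process the file in fixed-size blocks. The sketch below is not a complete solution. It assumes that only single-byte replacements are needed, and the sub name translate_file, the block size and the byte mapping are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy $src to $dst in fixed-size blocks, translating single bytes on the way.
# Assumption: no multi-byte patterns. A sequence that straddles two blocks
# would be missed by this simple loop.
sub translate_file {
    my ($src, $dst, $block_size) = @_;
    open my $in,  '<:raw', $src or die "Could not open '$src': $!";
    open my $out, '>:raw', $dst or die "Could not open '$dst': $!";
    local $/ = \$block_size;              # reference to a number: fixed-size records
    while (my $block = <$in>) {
        $block =~ tr/\x00\r\n/*\x06\x06/; # NUL to "*", CR and LF to \x06
        print {$out} $block;
    }
    close $in;
    close $out or die "Could not close '$dst': $!";
    return;
}

# Demonstration with temporary files and a deliberately tiny block size.
my ($src_fh, $src) = tempfile(UNLINK => 1);
binmode $src_fh;
print {$src_fh} "a\x00b\r\nc";
close $src_fh;
my ($dst_fh, $dst) = tempfile(UNLINK => 1);
close $dst_fh;
translate_file($src, $dst, 4);
```

Memory use stays at roughly one block regardless of file size, which is the main attraction for a 500MB input.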
