How do I print matching field data from a doc with a qualifier?

I have a CSV file with three columns. Column 1 is an MD5 checksum, column 2 is the path to the file, and column 3 is either blank or holds a unique identifier.

Example

0000801f8b7a5c3b483809ef069d4d82,/Volumes/Somepath2/Somefile1,Uniquecode
0000801f8b7a5c3b483809ef069d4d82,/Volumes/Somepath2/Somefile2,Uniquecode
0044f99638140c2eec15aa78eeb41d5e,/Volumes/Somepath3/Somefile2,
0044f99638140c2eec15aa78eeb41d5e,/Volumes/Somepath4/Somefile3,Uniquecode
005040886c659d73c8596b40a70ff231,/Volumes/Somepath5/Somefile4,
005040886c659d73c8596b40a70ff231,/Volumes/Somepath6/Somefile4,

What I'm trying to do is print only the lines whose checksum matches that of another line and whose unique code field is filled in, but not when another line with the same checksum also has the unique code filled in. So for the example above, I'd only get the line below.

0044f99638140c2eec15aa78eeb41d5e,/Volumes/Somepath4/Somefile3,Uniquecode 

The first two files have matching checksums, but both have the unique code, so I don't want them printed. The last two have matching checksums, but neither has the third field filled in. The middle two match and only one of them has the unique code filled in, so that is the line I want. There are instances in the list of more than 2 files matching the same checksum.

I tried to do this with awk, but I'm not well versed in it and couldn't figure out how to express all of these rules.

Any help would be greatly appreciated.
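
For reference, the rules described above can be sketched in awk as a two-pass read of the same file. This is only a rough sketch under some assumptions that are not stated in the question: no path contains a comma, the third field has no stray whitespace or carriage returns, and when more than two lines share a checksum a line is printed only if it is the single line in that group with the unique code. The function name `filter_unique_codes` and the temporary sample file are made up for illustration.

```shell
#!/bin/sh
# filter_unique_codes FILE  (hypothetical helper name)
# Reads FILE twice: once to count, once to print.
filter_unique_codes() {
    awk -F',' '
        # Pass 1: count lines per checksum, and how many have field 3 filled in.
        NR == FNR {
            total[$1]++
            if ($3 != "") coded[$1]++
            next
        }
        # Pass 2: print a coded line only if its checksum is shared
        # and it is the single coded line in that group.
        total[$1] > 1 && coded[$1] == 1 && $3 != ""
    ' "$1" "$1"
}

# Sample data copied from the example in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0000801f8b7a5c3b483809ef069d4d82,/Volumes/Somepath2/Somefile1,Uniquecode
0000801f8b7a5c3b483809ef069d4d82,/Volumes/Somepath2/Somefile2,Uniquecode
0044f99638140c2eec15aa78eeb41d5e,/Volumes/Somepath3/Somefile2,
0044f99638140c2eec15aa78eeb41d5e,/Volumes/Somepath4/Somefile3,Uniquecode
005040886c659d73c8596b40a70ff231,/Volumes/Somepath5/Somefile4,
005040886c659d73c8596b40a70ff231,/Volumes/Somepath6/Somefile4,
EOF

filter_unique_codes "$sample"
rm -f "$sample"
```

Reading the file twice keeps the output in the same order as the input and avoids holding every line in memory. If the real file has Windows line endings, the third field would never compare as empty, so those would need stripping first.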