
Thanks in advance for any ideas you present.

My current project has me trying to loop over a file containing thousands of IP addresses, run each one through geoiplookup, and pipe the output to sed to delete all lines that do not match my criteria.

The list is just a list of ip addresses:

    1.2.3.4
    5.6.7.8
    9.10.11.12
    .
    .
    .

I then run geoiplookup ip.ad.dre.ss as root, and get:

    [root@system ipset]# geoiplookup 4.2.2.2
    GeoIP Country Edition: US, United States

if the IP is within the US.

My goal is to delete all lines from the file, if their IP is not in the US.

What I tried is not working:

    #!/bin/bash
    for ip in $(cat /tmp/iplist.txt); do
        geoiplookup $ip | sed '/GeoIP Country Edition: US, United States/!d'
    done

Any suggestions? Recommendations of another approach would be greatly appreciated.

@terdon Sorry I haven't updated sooner, I have been busy.

Adding either to this question or comment messes up formatting of text. Sorry newbie to Stack...let me figure this out.

OK, update.

After switching to @terdon's first example:

    #!/bin/bash
    while read ip; do
        if ! geoiplookup "$ip" | grep -q ': US, United States$'; then
            sed -i "/^$ip$/d" /tmp/iplist.txt
        fi
    done < /tmp/iplist.txt

It runs without error, but non-US IP addresses are still in the list.

I had been using the last example with the declare -a badLines

    scripts#geoiplookup 172.168.155.63
    GeoIP Country Edition: GB, United Kingdom
    scripts#grep 172.168.155.63 /tmp/iplist.txt
    172.168.155.63

Once I figure out how to get copy and paste to work right into the question here I will update.

@Ed Morton Both examples produce same error at the "done" line, line 7 in one and line 9 in the other.

    ./filter_ips.sh: line 7: syntax error near unexpected token `done'
    ./filter_ips.sh: line 7: ` done < "${@:--}"'

@Miri B They run through but there are still non-US ip addresses in the list.

  • What should happen if the IP cannot be resolved to a country? For example, try with this site's IP: geoiplookup 172.64.144.30 or geoiplookup unix.stackexchange.com. That gives GeoIP Country Edition: IP Address not found, should such cases be kept or removed? Commented Feb 10 at 13:10
  • Thanks for chiming in. The list does not contain any FQDNs, only IP addresses. I have run both the for loop and a while loop with the same basic syntax and sent the output to another file; the result is a list of the geoiplookup results. Parsing that list does remove all non-US countries, and there are no errors about invalid results. This tells me the list will work as expected if I can figure out the syntax. Commented Feb 10 at 13:36
  • Copy/paste your code into shellcheck.net and fix the issues it tells you about. Commented Feb 10 at 15:11

3 Answers


You want to loop over the lines, check if they contain an IP that maps to the US and, if they don't, delete that line from the file. Your approach failed because you are using sed on the output of the geoiplookup command, not on the file.

What you need to do is check each IP, and then go back to the file. So you could do something like this (note how I am not using for i in $(cat file) here; see Bash pitfall #1 for why that is a bad idea):

    ## make a backup of your file
    cp ip.ad.dre.ss ip.ad.dre.ss.bak
    while read ip; do
        if ! geoiplookup "$ip" | grep -q ': US, United States$'; then
            sed -i "/^$ip$/d" ip.ad.dre.ss
        fi
    done < ip.ad.dre.ss

Note that this will also remove entries that simply cannot be mapped to their location. For example, something like this very site:

    $ geoiplookup unix.stackexchange.com
    GeoIP Country Edition: IP Address not found

If you want to keep those, you could use this:

    while read ip; do
        if ! geoiplookup "$ip" | grep -Eq ': (US, United States|IP Address not found)$'; then
            sed -i "/^$ip$/d" ip.ad.dre.ss
        fi
    done < ip.ad.dre.ss
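Before letting anything edit the file, it can help to see which entries would be dropped and why. Here is a rough dry-run sketch that uses the same grep test as above. `list_removals` is just a name I made up for illustration, and it assumes geoiplookup is on your PATH:

```shell
#!/usr/bin/env bash
# Dry run: read IPs on stdin and print each one that would be removed,
# followed by a tab and whatever geoiplookup reported for it.
# "list_removals" is a hypothetical helper name, not part of geoiplookup.
list_removals() {
    local ip result
    while IFS= read -r ip; do
        [ -n "$ip" ] || continue            # skip blank lines
        result=$(geoiplookup "$ip")
        # Keep US entries and unresolved ones; report everything else.
        if ! printf '%s\n' "$result" |
            grep -Eq ': (US, United States|IP Address not found)$'; then
            printf '%s\t%s\n' "$ip" "$result"
        fi
    done
}
```

You would run it as `list_removals < ip.ad.dre.ss`. Nothing is modified, so it is safe to repeat while you debug.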

However, both of these are a bit inefficient, since sed has to process the entire file again for every line that gets removed. An alternative approach would be to read the file once to generate the list of lines you want to delete, and then delete them all in a single second pass. Something like this:

    declare -a badLines
    while read ip; do
        ((c++))
        if ! geoiplookup "$ip" | grep -Eq ': (US, United States|IP Address not found)$'; then
            badLines+=($c)
        fi
    done < ip.ad.dre.ss
    sedCommand=$(sed -E 's/ |$/d;/g' <<<"${badLines[@]}")
    sed -i "$sedCommand" ip.ad.dre.ss

If you pass sed the command Nd where N is a number, it will delete that line number from its input. So, if I run sed 2d file, that prints out the contents of the file except the 2nd line. Remember, always with sed, if you want the change to actually be saved in the file, you need -i. So sed -i 2d file will actually delete the second line of file.
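A quick way to convince yourself of this without touching a real file is to feed sed some throwaway input. In this small sketch, `demo_delete_lines` is a made-up name and the four words are placeholder data:

```shell
#!/usr/bin/env bash
# Feed four throwaway lines to sed and delete lines 2 and 4 by number.
# No -i here, so nothing on disk changes; the result goes to stdout.
demo_delete_lines() {
    printf '%s\n' one two three four | sed '2d;4d'
}
```

Calling `demo_delete_lines` prints only `one` and `three`, which is the same mechanism the script relies on.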

Here, I am reading the file once and collecting the list of line numbers I want to delete. Then I convert that space-separated list to Nd; format with a small sed command, and finally I pass the resulting sed command (it would look like 2d;3d;130d; etc.) to sed to actually modify the file.
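One edge case worth guarding against, as far as I can tell: if every IP passes the check, the list of line numbers is empty, the conversion step produces a bare `d;`, and sed then deletes every line. Below is a minimal sketch of the second pass with that guard. Both function names are hypothetical, and it writes through a temporary file instead of using `sed -i`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn line numbers into a sed script like "2d;3d;130d;".
# Prints nothing when called without arguments.
build_delete_script() {
    [ "$#" -gt 0 ] || return 0
    printf '%sd;' "$@"
    printf '\n'
}

# Hypothetical helper: delete the given line numbers from a file in one pass.
# Usage: delete_lines FILE [LINE_NUMBER...]
delete_lines() {
    local file script tmp status
    file=$1
    shift
    script=$(build_delete_script "$@")
    [ -n "$script" ] || return 0        # nothing to delete: leave the file alone
    tmp=$(mktemp) || return
    # Write the result to a temp file, then copy it back over the original.
    sed "$script" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

With the array from the script above, the call would be `delete_lines ip.ad.dre.ss "${badLines[@]}"`.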

  • Thank you for your response. Getting to work on this with your suggestions. Commented Feb 10 at 13:37
  • After giving this a try it errored after running for a second with- sed: -e expression #1, char 10: unterminated `s' command - After googling the error I found this: stackoverflow.com/questions/10308496/… -and tried double quoting "s/ |$/d;/g" however same error was returned when ran. I'll have a little time later to investigate and see if I can pick it up from here. Thanks so much for your input. Commented Feb 10 at 16:06
  • @MattS. eeek! No, do not double quote the 's/ |$/d;/g' that will break. Instead, add echo "sed command: $sedCommand" to the script to see what it is actually trying to run. Commented Feb 10 at 16:20
  • Also, @MattS. I am assuming you are using a modern shell like bash and you're on a Linux machine. This will break on a mac and just not work on Windows. Commented Feb 10 at 16:22
  • Correct this is on a Debian 12 and using bash. Also verified that the ip list is straight ip addresses and no network addresses with a metric such as "/24" Commented Feb 10 at 16:55

If geoiplookup (which I don't have) can only process 1 IP address at a time then consider something like the following (untested):

    #!/usr/bin/env bash

    getUsaIps() {
        local ip re='United States$'
        while IFS= read -r ip; do
            if [[ "$(geoiplookup "$ip")" =~ $re ]]; then
                printf '%s\n' "$ip"
            fi
        done < "${@:--}"
    }

    tmp="$(mktemp)" || exit
    trap 'rm -f "$tmp"; exit' EXIT
    infile='/tmp/iplist.txt'

    getUsaIps "$infile" > "$tmp" &&
    mv -- "$tmp" "$infile"

or:

    #!/usr/bin/env bash

    getIpMap() {
        local ip
        while IFS= read -r ip; do
            printf '%s\t%s\n' "$ip" "$(geoiplookup "$ip")"
        done < "${@:--}"
    }

    tmp="$(mktemp)" || exit
    trap 'rm -f "$tmp"; exit' EXIT
    infile='/tmp/iplist.txt'

    getIpMap "$infile" |
    awk '/United States$/{ print $1 }' > "$tmp" &&
    mv -- "$tmp" "$infile"
  • Thanks for your input, crazy day today but will have time later to investigate. Commented Feb 10 at 16:59
  • Yes - geoiplookup can only process a single ip at a time so I will take a close look at your suggestion. Commented Feb 10 at 17:06
  • Whatever the issue was @terdon, it was on me. Created new shell script recopied third example replaced path to file and it worked. Sorry for the confusion. Commented Feb 11 at 22:00

Give these two awk commands a go:

    awk '{
        cmd = ("geoiplookup " $0)
        cmd | getline s
        close(cmd)
        if (s ~ /Country Edition: US, United States/) print $0
    }' file_name

Or another one, probably faster:

    awk '{ printf "%s ", $0; system("geoiplookup " $0) }' file_name |
        awk '/GeoIP Country Edition: US, United States/ { print $1 }'
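Both awk commands hand each input line to a shell as part of a command string, so it is worth checking first that every line really is a bare IPv4 address. Here is a minimal sketch of such a check; `is_ipv4` is a hypothetical helper name, and it deliberately rejects anything with a suffix such as "/24":

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if $1 is a dotted-quad IPv4 address
# (four decimal octets, each 0-255, nothing else on the line).
is_ipv4() {
    # Reject empty input, stray characters, and misplaced dots up front.
    case $1 in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    local IFS=. octet
    set -- $1                           # split on dots into $1..$n
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
}
```

A loop such as `while IFS= read -r ip; do is_ipv4 "$ip" || echo "bad line: $ip"; done < file_name` would flag anything unexpected before awk ever sees it.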
