First of all, I would like to thank “Fishmonger” for helping me out with the following script.
The script works fine, but it lists duplicate IPs and duplicate devices all together. I would like to see the duplicate IPs only, and then in the next paragraph the duplicate hosts only. I could not figure out how to do it that way, and per “Fishmonger” I am reposting it. Thank you again for your help.
http://www.computing.net/unix/wwwbo...
#!/opt/sa/bin/perl
use strict;
use warnings;
use MIME::Lite;
my (%ip, %host, $duplicates);
my $host_file = '/etc/hosts';
open my $file, '<', $host_file or die "can't open $host_file $!";
while (<$file>) {
if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
if ( defined $ip{$ip} or defined $ip{$host} ) {
$duplicates .= $_;
}
else {
$ip{$ip}++;
$ip{$host}++;
}
}
}
close $file;
my $email_msg = <<EMAIL_MSG;
The following entries in the host file are duplicates
either by IP address or by hostname.
$duplicates
EMAIL_MSG
my $email = MIME::Lite->new(
From => 'ai3478@att.com',
To => 'ai3478@att.com',
#Cc => 'pd1633@att.com,ws7342@att.com',
Subject => 'Host file duplicates',
Data => $email_msg
);
$email->send;

Without the e-mail part I am getting good output, but once I add the email portion, I get an e-mail without any values. Please assist:
#!/opt/sa/bin/perl
use strict;
use warnings;
use MIME::Lite;
my (%ip, %host, $duplicates);
my $host_file = '/etc/hosts';
open my $file, '<', $host_file or die "can't open $host_file $!";
while (<$file>) {
if( /^#?([\d.]+)\s+(\S+)/ ) {
my ($ip, $host) = ($1, $2);
push @{$ip{$ip}}, $_;
push @{$host{$host}}, $_;
}
}
print "Duplicate IP's which have duplicate hostnames\n";
foreach my $ip ( keys %ip ) {
if ( @{$ip{$ip}} > 1 ) {
print @{$ip{$ip}};
}
}
print "\nDuplicate hostnames which have duplicate IP's\n";
foreach my $host ( keys %host ) {
if ( @{$host{$host}} > 1 ) {
print @{$host{$host}};
}
}
close $file;
my $email_msg = <<EMAIL_MSG;
The following entries in the host file are duplicates
either by IP address or by hostname.
$duplicates
EMAIL_MSG
my $email = MIME::Lite->new(
From => '7777@batx.com',
To => '7777@batx.com',
#Cc => 'some@other.com, some@more.com',
Subject => 'Host file duplicates',
Data => $email_msg
);
$email->send;

Please surround your code with the html pre tags so that the indentation will be retained in your post. Also, for readability of the code, it's best to add a little additional vertical whitespace (i.e., add extra blank lines).
I commented out the print statements in the foreach loops but added one just before the email portion that prints out the email message. You could either keep it in or remove it when you put the script into production.
#!/opt/sa/bin/perl

use strict;
use warnings;
use MIME::Lite;

my (%ip, %host, $duplicate_ip, $duplicate_host);
my $host_file = '/etc/hosts';

open my $file, '<', $host_file or die "can't open $host_file $!";
while (<$file>) {
    if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
        push @{$ip{$ip}}, $_;
        push @{$host{$host}}, $_;
    }
}
close $file;

#print "Duplicate IP's which have duplicate hostnames\n";
foreach my $ip ( keys %ip ) {
    if ( @{$ip{$ip}} > 1 ) {
        #print @{$ip{$ip}};
        $duplicate_ip .= join '', @{$ip{$ip}};
    }
}

#print "\nDuplicate hostnames which have duplicate IP's\n";
foreach my $host ( keys %host ) {
    if ( @{$host{$host}} > 1 ) {
        #print @{$host{$host}};
        $duplicate_host .= join '', @{$host{$host}};
    }
}

my $email_msg = <<EMAIL_MSG;
The following entries in the host file are duplicates
either by IP address or by hostname.

Duplicate IP addresses:
$duplicate_ip

Duplicate Hostnames:
$duplicate_host

EMAIL_MSG

print $email_msg;

my $email = MIME::Lite->new(
    From => '7777@batx.com',
    To => '7777@batx.com',
    Subject => 'Host file duplicates',
    Data => $email_msg
);
$email->send;

If you look closely at each of the examples I gave in your other question, you'll see that with a slight modification, you can drop the 2 foreach loops and build the $duplicate_ip and $duplicate_host in the while loop.

Drop the foreach loops and change the while loop to this:
while (<$file>) {
if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
$duplicate_ip .= $_ if defined $ip{$ip};
$duplicate_host .= $_ if defined $host{$host};
$ip{$ip}++;
$host{$host}++;
}
}

Thanks. If I delete the foreach loops and change the while loop to this, I see significantly fewer records: 264 instead of 492. Lines that are duplicated multiple times are no longer all printed; each is printed only once.

If a line is duplicated, do you want to see each and every one of those duplicates, or just list it once?
If you want to list each of the duplicates, you'd probably also want to know their line numbers, otherwise I don't see the point of listing a line multiple times.
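
For reference, the drop in record count can be reproduced on a small made-up sample. This is only a sketch: the sample lines and the sub names `repeats_only` and `all_occurrences` are invented for illustration and are not part of the script above. The single-pass version records a line only when its IP has already been seen, so the first occurrence is never listed; the grouping version keeps every occurrence.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for /etc/hosts.
my @lines = (
    "10.0.0.1 alpha\n",
    "10.0.0.1 beta\n",
    "10.0.0.2 gamma\n",
    "10.0.0.1 delta\n",
);

# Single pass: a line is reported only if its IP was seen before,
# so the first occurrence never makes it into the report.
sub repeats_only {
    my (%seen, @out);
    for my $line (@_) {
        my ($ip) = $line =~ /^#?([\d.]+)\s+\S+/ or next;
        push @out, $line if $seen{$ip}++;
    }
    return @out;
}

# Grouping: collect every line per IP, then keep the groups
# that hold more than one line.
sub all_occurrences {
    my (%by_ip, @order, @out);
    for my $line (@_) {
        my ($ip) = $line =~ /^#?([\d.]+)\s+\S+/ or next;
        push @order, $ip unless $by_ip{$ip};
        push @{ $by_ip{$ip} }, $line;
    }
    push @out, @{ $by_ip{$_} } for grep { @{ $by_ip{$_} } > 1 } @order;
    return @out;
}

my @repeats = repeats_only(@lines);
my @all     = all_occurrences(@lines);
print scalar(@repeats), " vs ", scalar(@all), "\n";    # prints "2 vs 3"
```

So if every occurrence should appear in the report, the version that pushes lines into per-key arrays and filters afterwards is the one to keep.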

Thanks, yes, I want to see each and every one of those duplicates, so I like the foreach loops. Also, how can I put a blank line between each group of duplicates so that it can be read easily? Somewhere I need to put \n.

while (<$file>) {
    if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
        push @{$ip{$ip}}, "Line# $.: $_";
        push @{$host{$host}}, "Line# $.: $_";
    }
}
close $file;

foreach my $ip ( keys %ip ) {
    if ( @{$ip{$ip}} > 1 ) {
        $duplicate_ip .= join('', @{$ip{$ip}}) . "\n\n";
    }
}

foreach my $host ( keys %host ) {
    if ( @{$host{$host}} > 1 ) {
        $duplicate_host .= join('', @{$host{$host}}) . "\n\n";
    }
}

Thank you, I appreciate your help. I really don't need the line numbers; without the line numbers the output looks perfect.
Another issue I have discovered is that not all duplicate IPs are going under "Duplicate IP". For example:
#10.61.38.45 and 10.61.38.45 should have gone to Duplicate IP, but they went under duplicate hosts.
Duplicate IP:
#10.16.15.0 abcdefg0000 abcdefg0000
10.16.15.0 abcdefg0000 abcdefg0000
Duplicate Hostnames:
#10.61.38.45 prnabcd0002 prnabcd0002
10.61.38.45 prnabcd0002 prnabcd0002
So, all duplicate IPs should go under "duplicate IP" and all duplicate hosts should go under "duplicate hosts".
Also, when the output is printed, I would like it printed in tab-separated format:
<IP>TAB<Hostname>TAB<rest of the info>
Thank you very much for this.

Another issue I have found:
The following two sets of dups go in both sections; apparently they are dups by both IP and hostname.
Can we put them in just one section?
#172.11.111.222 abc014i00def
#172.11.111.222 abc014i00def

I noticed that Fishmonger has been doing the stuff for you. Have you tried to do it yourself? If you have, post your code, and maybe he can guide you better.

The code helped me tremendously and it is working fine. I also removed the line numbers between duplicates, but there are some data validity issues:
1. Not all duplicate IPs are going under the "Duplicate IP" section. For example:
10.16.15.0 abcdefg0000 abcdefg0000
10.16.15.0 abcdefg0000 abcdefg0000
should have gone to Duplicate IP, but it went under duplicate hosts.
2. If there are duplicates of both device and IP, as follows:
#172.11.111.222 abc014i00def
#172.11.111.222 abc014i00def
the output went into both sections, "Duplicate IP" and "Duplicate hosts", instead of going into just one of them.
3. The output is not coming out with a TAB between the fields: IP<TAB>DeviceName<TAB>Restoftheinfo
I believe I should sort them first:
### Sorting the input file on the hostname; This is required so the input file
### can be checked for duplicate lines:
print STDOUT "Sorting $FileName ...\n";
system("/bin/sort -k 2,2 $FileName -o $FileName");
open (INPUT, $FileName) || die "Can't open input file $!\n";
### Pushing each line into an array:
while ($LINE = <INPUT>)
{
    chomp $LINE ;
    if ($LINE eq '')
    {
        print STDOUT "Remove BLANK LINE from input file before proceeding !\n";
        exit;
    }
    else
    {
        push @InputList,$LINE ;
    }
}
close (INPUT);
I am not sure how to do this, but I believe the logic needs to check for duplicate IPs and duplicate names in two different loops. The first loop would check for one kind of duplicate, for example IP address, and then put the IP address and host name into a duplicate-IP array. The second loop could then check for duplicate names from the host file, but when it finds a match it should add the duplicate hostname to the duplicate-hostname array only if it is not already in the duplicate-IP array. Then I can print both arrays into an email, with a tab between fields and a blank line between each pair or group of duplicates.

I've been holding off my response because I was waiting to see if you'd pick up the ball and at least try to learn how my example worked and how to adjust it to your exact requirements.
There is no need to fork a new process (i.e., the system call) to presort the file. If you need to sort the output, perl can sort it in any one of several sorting orders (ASCII, numerical, lexicographical).
The code you posted doesn't do any extraction of duplicates; it only reads the file into an array in an inefficient manner.
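
On the sorting point, here is a minimal sketch of sorting inside Perl rather than shelling out to /bin/sort. The sample addresses are made up, and packing the octets is just one common way to get numeric IP order; it assumes plain dotted-quad IPv4 strings.

```perl
use strict;
use warnings;

# Made-up sample keys; in the script these would be keys %ip.
my @ips = qw(10.61.38.45 172.11.111.222 10.16.15.0 10.9.200.1);

# The default sort compares strings, so "10.16..." lands before "10.9...".
my @ascii = sort @ips;

# Packing the four octets into four bytes makes a plain string
# comparison agree with numeric order, octet by octet.
my @numeric = sort {
    pack('C4', split /\./, $a) cmp pack('C4', split /\./, $b)
} @ips;

print "ascii:   @ascii\n";
print "numeric: @numeric\n";
```

The same comparison block can be dropped into `foreach my $ip ( sort { ... } keys %ip )` if the report should come out in address order.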
Another issue I have discovered is that not all duplicate IPs are going under "Duplicate IP". For example:
#10.61.38.45 and 10.61.38.45 should have gone to Duplicate IP, but they went under duplicate hosts.
I'm unable to duplicate that issue with the example lines you posted. How do those lines differ in format from the rest of the file? Are there any leading spaces before the IP address?
The following two sets of dups go in both sections; apparently they are dups by both IP and hostname.
Can we put them in just one section?
#172.11.111.222 abc014i00def
#172.11.111.222 abc014i00def
They're listed in both sections because that's what you previously stated/implied you wanted. See response numbers 6 & 7.
In which section do you not want them listed? In that foreach loop you need to add another test (exactly like the one in the other foreach loop) and exclude the ones you don't want listed.
Here's an example:
foreach my $host ( sort keys %host ) {
if ( @{$host{$host}} > 1 ) {
my ($ip) = $host{$host}[0] =~ /^#?([\d.]+)/;
unless ( @{$ip{$ip}} > 1 ) {
$duplicate_host .= join('', @{$host{$host}}) . "\n";
}
}
}
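
The loop above looks only at the first line of each hostname group. As a hedged variation (not what was posted), the sketch below skips a hostname group only when every one of its lines is already covered by the duplicate-IP section. The sample data is invented, and the names `@group` and `$already_listed` are illustrative.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real script reads /etc/hosts.
my @lines = (
    "#172.11.111.222 abc014i00def\n",
    "#172.11.111.222 abc014i00def\n",
    "10.0.0.5 printer01\n",
    "10.0.0.6 printer01\n",
    "10.0.0.7 unique01\n",
);

my (%ip, %host);
for (@lines) {
    if ( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
        push @{ $ip{$ip} },     $_;
        push @{ $host{$host} }, $_;
    }
}

my ($duplicate_ip, $duplicate_host) = ('', '');

for my $ip ( sort keys %ip ) {
    $duplicate_ip .= join('', @{ $ip{$ip} }) . "\n" if @{ $ip{$ip} } > 1;
}

for my $host ( sort keys %host ) {
    my @group = @{ $host{$host} };
    next unless @group > 1;

    # Count the lines in this group whose IP is itself duplicated.
    my $already_listed = grep {
        my ($ip) = /^#?([\d.]+)/;
        @{ $ip{$ip} } > 1;
    } @group;

    # Skip the group only if the IP section already shows all of it.
    next if $already_listed == @group;
    $duplicate_host .= join('', @group) . "\n";
}

print "Duplicate IP addresses:\n$duplicate_ip";
print "Duplicate Hostnames:\n$duplicate_host";
```

With this sample, the 172.11.111.222 pair appears only under the IP heading, while the two printer01 lines (same name, different IPs) appear only under the hostname heading.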

Thank you. I have removed the second loop ($host) and replaced it with the example you advised, and it seems to be working fine. When I copy the output to an Excel sheet, there is no TAB between the fields:
<IP>TAB<DeviceName>TAB<Description>
There are no leading spaces before the IP, but there are spaces between the IP and the hostname.
Thanks.

To make it tab separated, you need to add the following line.
s/\s+\b/\t/g;
Now, can you make an attempt to see where it needs to go?

I put it in join functions:
$duplicate_ip .= join ('', s/\s+\b/\t/g, @{$ip{$ip}}) . "\n\n";
$duplicate_host .= join('', s/\s+\b/\t/g, @{$host{$host}}) . "\n\n";
The output contains TABs, but I am getting this error:
"Use of uninitialized value in substitution"

Besides the warning that it generated, it didn't convert the lines to tab delimited. Does that look like the proper way to use the regex? Do you know how to use a regex?
That regex only needs to be added in 1 place. Try again and post back with the results. Also post the entire script; with all of the variations that I showed you, I don't know exactly which version you're using.

I took a different approach, which solves the TAB issue.
Basically, I added these:
@linevals = split;
$outline = $linevals[0] . "\t" . $linevals[1] . "\n";
////////////////////////////////
use strict;
use warnings;
use MIME::Lite;
my (%ip, %host, $duplicate_ip, $duplicate_host, @linevals, $outline);
my $host_file = '/etc/hosts';
open my $file, '<', $host_file or die "can't open $host_file $!";
while (<$file>) {
chop;
if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
@linevals = split;
$outline = $linevals[0] . "\t" . $linevals[1] . "\n";
push @{$ip{$ip}}, $outline;
push @{$host{$host}}, $outline;
}
}
close $file;
#print "Duplicate IP's with hostnames\n";
foreach my $ip ( keys %ip ) {
if ( @{$ip{$ip}} > 1 ) {
$duplicate_ip .= join ('', @{$ip{$ip}}) . "\n\n";}
}
#print "\nDuplicate hostnames with IP's\n";
foreach my $host ( sort keys %host ) {
if ( @{$host{$host}} > 1 ) {
my ($ip) = $host{$host}[0] =~ /^#?([\d.]+)/;
unless ( @{$ip{$ip}} > 1 ) {
$duplicate_host .= join('', @{$host{$host}}) . "\n\n";
}
}
}
my $email_msg = <<EMAIL_MSG;
The following entries in the host file are duplicates
either by IP address or by hostname.
Duplicate IP addresses:
$duplicate_ip
Duplicate Hostnames:
$duplicate_host
EMAIL_MSG
print $email_msg;
my $email = MIME::Lite->new(
From => 'xxx@xxx.com',
To => 'xxxx@xxx.com',
#Cc => 'xxx@xx.com,xxxx@att.com',
Subject => 'Host file duplicates',
Data => $email_msg
);
$email->send

That approach does work, but it's less efficient and not as clean.
You should use chomp instead of chop. chomp removes the line terminator (which can be multiple chars) and chop only removes the last char. In this case, using either of them doesn't make sense because you're adding the newline back in when you build the tab-separated line.
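
A small illustration of the chomp/chop difference, using made-up strings: chop always removes one character, so it damages a line that has no trailing newline (for example the last line of a file).

```perl
use strict;
use warnings;

my $with_newline    = "10.0.0.1 alpha\n";
my $without_newline = "10.0.0.1 alpha";

# chomp removes a trailing line terminator only if one is present.
chomp( my $chomped_a = $with_newline );
chomp( my $chomped_b = $without_newline );

# chop removes the last character unconditionally.
chop( my $chopped_a = $with_newline );
chop( my $chopped_b = $without_newline );

print "chomp: [$chomped_a] [$chomped_b]\n";    # both end in "alpha"
print "chop:  [$chopped_a] [$chopped_b]\n";    # second one ends in "alph"
```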
Here's a better approach.
while (<$file>) {
if( my ($ip, $host) = /^#?([\d.]+)\s+(\S+)/ ) {
s/\s+\b/\t/g;
push @{$ip{$ip}}, $_;
push @{$host{$host}}, $_;
}
}
BTW, we can increase the efficiency by replacing the regex with this:
tr/ /\t/s;
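
To see what the substitution does to a single line, here is a sketch on a made-up sample. It also shows a tr-based variant; note that tr works on character lists rather than patterns, so it is the /s (squeeze) modifier that collapses a run of spaces into one tab. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Made-up sample line with runs of spaces between the fields.
my $line = "10.16.15.0   abcdefg0000  abcdefg0000\n";

# Regex version: each whitespace run that is followed by a word
# character becomes one tab. The trailing newline is left alone
# because no word boundary follows it.
( my $with_regex = $line ) =~ s/\s+\b/\t/g;

# tr version: every space maps to a tab, and /s squeezes each run
# of resulting tabs down to a single one.
( my $with_tr = $line ) =~ tr/ /\t/s;

print $with_regex eq $with_tr ? "same result\n" : "different result\n";
```

The two are not interchangeable on every input: the regex leaves whitespace in front of a non-word character such as `#` untouched, and the tr version only handles spaces, so lines that already mix tabs and spaces would come out differently.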

Thanks, everything is working fine. One more item: the script will run from a cron job, and I would like the email to be sent only if there are duplicates; otherwise no email should go to the group.
Thank you very much for your assistance.

You need to put the $email->send; command in a conditional block that checks if $duplicate_ip or $duplicate_host is defined, and if so send the email.
See if you can work out the syntax for that test block then post the code you're testing and I'll give you more pointers. If you can figure it out on your own, then you will have a better understanding of Perl than you would have if I simply post the solution.

I have been thinking about an if-then-else clause, but I am not sure how to integrate it here. I am a beginner in Perl, and this assignment is a huge help. I am also reading the Perl tutorial written by Chan Bernard Hong. Thanks.

I haven't thoroughly read that book, but I did skim over the online pdf version and I would not recommend it. In general, most of it is “ok” but terse and delves into areas that don't belong in a beginner's book. Some sections are pretty bad/wrong, such as the section on prototypes.
Instead I'd recommend Learning Perl, Fourth Edition by Randal L. Schwartz, Tom Phoenix, brian d foy. They are 3 of the Perl community's top experts and authors.
http://www.oreilly.com/catalog/lear...
The second book I'd recommend as being a required reference book for every Perl programmer is Programming Perl, Third Edition by Larry Wall, Tom Christiansen, Jon Orwant.
http://www.oreilly.com/catalog/pper...
You say that this is not a homework assignment, and I'll take you at your word. However, that leads to the conclusion that this is a project assigned by your employer. So who's going to get paid for this solution? Normally I'm very happy to help people solve their problems with Perl scripts, but the key word here is “help” not “do it”.
You don't need an if/else clause, but you do need an if or an unless clause that checks to see if either of the duplicate vars are defined. There are 3 places/approaches to choose from.
1) Wrap the $email->send statement in an if clause.
2) Wrap the entire email portion in an if clause.
3) Exit the script prior to the email section unless the var(s) are defined.
Option 1 has the disadvantage of needlessly building the email.
Option 2 has the disadvantage of unneeded syntax.
Option 3 is my personal preference, because it's clean and the most efficient.
Now, please make an attempt and post your code and if needed, I'll help make the correction.
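
For readers following along, the test behind all three options can be sketched as below. The helper name `have_duplicates` is hypothetical, and the sketch assumes the report variables are left undefined when nothing was appended to them, as in the scripts above that declare them without an initial value.

```perl
use strict;
use warnings;

# Hypothetical helper: true when at least one report section was built.
sub have_duplicates {
    my ($duplicate_ip, $duplicate_host) = @_;
    return ( defined($duplicate_ip) || defined($duplicate_host) ) ? 1 : 0;
}

# Option 3 in the script would be a single line placed just before
# the email section:
#     exit unless have_duplicates($duplicate_ip, $duplicate_host);

# Two made-up cases: nothing found, and one IP section found.
my @cases = (
    [ undef, undef ],
    [ "10.0.0.1 alpha\n10.0.0.1 beta\n", undef ],
);
my @decisions = map { have_duplicates(@$_) ? 'send' : 'skip' } @cases;
print "@decisions\n";    # prints "skip send"
```

If the variables were instead initialized to empty strings, `defined` would always be true and the test would need to check for non-empty values.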

Thanks for the book suggestions; I will get one of the books you recommended.
Here is what I have added:
if ($duplicate_ip || $duplicate_host) {
$email->send }
I did not encounter any errors. Since I still have lots of dups, I can't tell whether this piece of code is working until I fix all of the dups.
Thanks.

Can you please suggest how I can run the program with an option, like /file1 -o, so that an e-mail is not sent out?
