Solved: match a text file with an Excel file with Perl

March 6, 2013 at 17:21:02
Specs: Windows 7
file1(text file)

Jack CC sn1

Jack TT sn3

Jack TT sn2

Jack CC sn4

Mark CC sn4

Mark AA sn1

Mark TT sn3

Mark AA sn2



File2(excel file)

Jack Mark
sn1

sn2

sn3

sn4


DESIRED OUTPUT

Jack Mark

sn1 CC AA

sn2 TT AA

sn3 TT TT

sn4 CC CC

Based on columns 1 and 3 of file1 and on column 1 and row 1 of file2, I want to extract the value of column 2 of file1 and insert it into file2. I am a beginner at programming. Any help in Perl would be greatly appreciated. Thanks.



#1
March 7, 2013 at 09:06:09
Accessing or creating Excel files with perl requires a cpan extension like Spreadsheet::WriteExcel. If your perl environment doesn't have it, it's available here:

http://search.cpan.org/dist/Spreads...

Google will provide you tons of examples on how to use it. Here is one:

http://cpansearch.perl.org/src/JMCN...

I won't do the work for you, but if you post code, I'll try to answer specific questions.



#2
March 10, 2013 at 19:06:21
Hey nails! Thanks for the links. I don't want you to do any work for me. I've almost written the code (it still needs some changes that I'm trying to figure out) and shall post the final version soon.


#3
March 12, 2013 at 04:26:02
eg1.txt
sn1 Jack CC
sn3 Jack TT
sn2 Jack TT
sn4 Jack CC
sn4 Mac CC
sn1 Mac AA
sn3 Mac TT
sn2 Mac AA

eg2.txt
Jack Mac
sn1
sn2
sn3
sn4

These are the example files, and below is my code. I'm certainly doing a lot wrong here. Please help.

use warnings;
use strict;
use IO::Handle;
use FileHandle;
my %account;
$file1 = "C:\strawberry\perl\bin\eg1.txt";
$file2= "C:\strawberry\perl\bin\eg2.txt";
$outfile = "C:\strawberry\perl\bin\adi3.txt";
open( my $fh1,"<", "eg1.txt" ) or die "$!";
while( my $line = <$fh1> ) {
my @values = split ' ', $line;
$account{$values[0]} = $values[1];

print"$values[2]\n";
}
close $fh1;
open( my $out_fh, ">", "adi3.txt" ) or die "$!";
open( my $fh2, "<", "eg2.txt" ) or die "$!";
while( my $line = <$fh2> ) {
my @values = split ' ', $line;
print $out_fh join $values[0], $account{$values[0]}, $values[1];
#print"$out_fh\n";
}
close $out_fh;
close $fh2;



#4
March 13, 2013 at 11:20:45
First, I don't think your open is correct. Don't you want to open the eg1.txt file that's defined by variable $file1?

#!/usr/local/bin/perl

use warnings;
use strict;

my %account;
# UNTESTED WITH YOUR DATA
my $file1 = 'C:\strawberry\perl\bin\eg1.txt'; # single quotes keep the backslashes literal
open(my $FH, "<", $file1) or die "$!";
while( my $line = <$FH>)
   {
   chomp($line); # remove the newline
   my @values = split ' ', $line;
   print "$values[2]\n";
   }
close $FH;
# end script

Second, please explain what you are trying to do, and I'll give you my input.



#5
March 13, 2013 at 18:09:10
Nails, thanks for the help. I corrected my code:
use IO::Handle;
my %account;
$file1 = "C:\\begperl\\eg1.txt";
$file2= "C:\\begperl\\eg2.txt";
$outfile = "C:\\begperl\\adi3.txt";

open( my $fh1, '<', $file1 ) or die "$!";
while( my $line = <$fh1> ) {
my @values = split ' ', $line;
$account{$values[0]} = $values[2];
print"$values[1]\n";
}
close $fh1;
open( my $out_fh, '>', $outfile ) or die "$!";
open( my $fh2, '<', $file2 ) or die "$!";
while( my $line = <$fh2> ) {
my @values = split ' ', $line;
say $out_fh join $account{$values[0]} , $values[0] , $values[1];
}
close $out_fh;
close $fh2;

Below is the output I am getting, but it is not exactly what I want; I want it like the one I posted above. With my present code I'm getting data for Mac but not for Jack. :( What am I missing?
-----output---
JackMac
sn1AA
sn2AA
sn3TT
sn4CC



#6
March 14, 2013 at 13:15:13
First, I think you don't know exactly how perl hashes work. This is the link I learned from:

http://www.tizag.com/perlT/perlhash...

Second, I think your logic is flawed. I created a hash using this data:

sn1 Jack CC

I combined the first and second fields into a unique key sn1Jack with the value being the third field CC

suggestions:
1) When reading a file, each line read has a trailing newline. It's best to remove the newline using the chomp function. There can be trouble if you don't.
2) Consider if you need trailing or leading white space. In this example, I had to remove the trailing spaces.
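To illustrate both suggestions, here is a minimal sketch of how a leftover newline changes a combined hash key. The helper name make_key and the inline sample line are made up for this example and are not part of the script that follows:

```perl
use strict;
use warnings;

# Hypothetical helper: build the combined key from columns 1 and 2 of a line.
sub make_key {
    my ($line) = @_;
    chomp $line;              # drop the trailing newline
    $line =~ s/\s+$//;        # drop trailing spaces, tabs and carriage returns
    my ($col1, $col2) = split ' ', $line;
    return $col1 . $col2;
}

my %account;
$account{ make_key("sn1 Jack CC\n") } = 'CC';

# A row label read from the second file without chomp still carries "\n",
# so the combined key does not match the one stored above.
my $raw = "sn1\n";
print exists $account{ $raw . 'Jack' } ? "raw: found\n" : "raw: not found\n";

# After chomp the key is "sn1Jack" and the lookup succeeds.
chomp( my $clean = $raw );
print exists $account{ $clean . 'Jack' } ? "clean: found\n" : "clean: not found\n";
```

The first lookup fails only because of the invisible newline inside the key, which is the kind of trouble suggestion 1 warns about.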

Let me know if you have any questions:

# NOTE: change the file names back to what you are using:

use warnings;
use strict;
use IO::Handle;
use FileHandle;
my %account;
#$file1 = "C:\strawberry\perl\bin\eg1.txt";
my $file1 = "./eg1.txt";
#$file2= "C:\strawberry\perl\bin\eg2.txt";
my $file2= "./eg2.txt";
#my $outfile = "C:\strawberry\perl\bin\adi3.txt";

# build a hash with key = column 1 and 2
open( my $fh1,"<", "eg1.txt" ) or die "$!";
my $line;
while( $line = <$fh1> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   my @values = split ' ', $line;
   $account{$values[0] . $values[1]} = $values[2];
   }
close $fh1;

# list out the hash
#while ((my $key, my $value) = each(%account)){
#     print "key is: |$key|   value is: |$value|\n";
#}

open( my $out_fh, ">", "adi3.txt" ) or die "$!";
open( my $fh2, "<", "eg2.txt" ) or die "$!";
my $linecnt=0;
my $keypart1;
my $keypart2;
while( my $line = <$fh2> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   if ($linecnt == 0)
      {
      # I know the first line contains names that are part of the key
      ($keypart1, $keypart2) = split ' ', $line;
      print $out_fh "$line\n";
      $linecnt++;
      next if $linecnt == 1;
      }
      # build the keys
      my $key1 = $line . $keypart1;
      my $key2 = $line . $keypart2;
      print $out_fh "$line $account{$key1} $account{$key2}\n";
   }
close $out_fh;
close $fh2;



#7
March 14, 2013 at 17:06:04
@nails: yeah, you are right. I don't know much about hashes as I'm a beginner in programming. Thanks a lot for your code, it's working fine. But the problem is that I have large files: eg1 with thousands of rows and eg2 with 1100 columns. The example I posted was just a small part of the file. How do I go about it now?


#8
March 15, 2013 at 12:33:34
First, if the two files are that large you could have a problem. Since a hash resides in memory, it's possible to run out of memory. Even if you have enough memory, it might take a long time to run the program.

Second, if you have more than two users i.e. Jack and Mac, this line could be modified to accept an array or a hash:

($keypart1, $keypart2) = split ' ', $line;

and the perl program modified to use that array or hash.

Third, the entire premise of the perl program I posted in #6 is based on the fact that columns 1 and 2 of the eg1 file are a unique key - no matter how many columns are in the eg2 file. If that isn't true or if you can't define a unique key, there is not anything else I can do for you.

That is, unless you can identify some algorithm/pattern within the file that I haven't seen.
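
Whether columns 1 and 2 really form a unique key can be checked before relying on it. This is only a rough sketch, assuming whitespace-separated columns as in the sample data; find_duplicate_keys is a made-up name, and like the script in #6 it keeps every key in memory, so it is only practical on a sample of a very large file:

```perl
use strict;
use warnings;

# Hypothetical helper: return the col1+col2 combinations that occur more than once.
sub find_duplicate_keys {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        my ($col1, $col2) = split ' ', $line;
        next unless defined $col2;       # skip blank or short lines
        $seen{"$col1 $col2"}++;
    }
    my @dups = grep { $seen{$_} > 1 } keys %seen;
    @dups = sort @dups;
    return @dups;
}

# Demo on in-memory data; the last row repeats "sn1 Jack" on purpose.
my $sample = "sn1 Jack CC\nsn2 Jack TT\nsn1 Mac AA\nsn1 Jack GG\n";
open( my $fh, '<', \$sample ) or die "cannot open in-memory sample: $!";
my @dups = find_duplicate_keys($fh);
close $fh;

print @dups ? "duplicate keys: @dups\n" : "all keys are unique\n";
```

If any key shows up here, a later row would silently overwrite an earlier one in the hash.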



#9
March 17, 2013 at 04:59:31
Yeah nails, that's right. Columns 1 and 2 are unique.


#10
March 17, 2013 at 15:39:08
✔ Best Answer
I made 2 major changes:

First, since col1 and col2 of the eg1.txt file are the keys, I changed the split command to break the line into 3 pieces. The 3rd piece includes all the data after col2 and that becomes the value of the given hash entry.

Second, the script now supports a variable number of users i.e. it assumes more users than just "Jack" and "Mac".

With the amount of data you are suggesting, I cannot guarantee this program will work:

use warnings;
use strict;
use IO::Handle;
use FileHandle;
my %account;
#$file1 = "C:\strawberry\perl\bin\eg1.txt";
my $file1 = "./eg1.txt";
#$file2= "C:\strawberry\perl\bin\eg2.txt";
my $file2= "./eg2.txt";
#my $outfile = "C:\strawberry\perl\bin\adi3.txt";

# build a hash with key = column 1 and 2
open( my $fh1,"<", "eg1.txt" ) or die "$!";
my $line;
while( $line = <$fh1> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   (my $value1, my $value2, my $value3) = split(' ', $line, 3);
   $account{$value1 . $value2} = $value3;
   }
close $fh1;

# list out the hash
#while ((my $key, my $value) = each(%account)){
#     print "key is: |$key|   value is: |$value|\n";
#}

open( my $out_fh, ">", "adi3.txt" ) or die "$!";
open( my $fh2, "<", "eg2.txt" ) or die "$!";
my $linecnt=0;
my @mykeys;
while( $line = <$fh2> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   if ($linecnt == 0)
      {
      # I know the first line contains names that are part of the key
      @mykeys = split ' ', $line;
      print $out_fh "$line\n";
      $linecnt++;
      next if $linecnt == 1;
      }
      # build the keys
     my $lc=0;
     foreach my $thekey (@mykeys) {
         my $key1 = $line . $thekey;
         if($lc == 0)
            { # print the line only on the first user
            print $out_fh "$line $account{$key1} ";
            $lc++;
            next if $lc == 1;
            }
         print $out_fh "$line $account{$key1} ";
         }
      print $out_fh "\n"; # need a CR after each line
   }
close $out_fh;
close $fh2;



#11
March 17, 2013 at 18:10:41
I added a 3rd user. This is what the output is :-(

Jack Mac Pi
sn1 CC sn1 AA sn1 AT
sn2 TT sn2 AA sn2
sn3 TT sn3 TT sn3
sn4 CC sn4 CC sn4 tt



#12
March 17, 2013 at 22:25:17
Sorry, but I posted the wrong version. Try this one:

use warnings;
use strict;
use IO::Handle;
use FileHandle;
my %account;
#$file1 = "C:\strawberry\perl\bin\eg1.txt";
my $file1 = "./eg1.txt";
#$file2= "C:\strawberry\perl\bin\eg2.txt";
my $file2= "./eg2.txt";
#my $outfile = "C:\strawberry\perl\bin\adi3.txt";

# build a hash with key = column 1 and 2
open( my $fh1,"<", "eg1.txt" ) or die "$!";
my $line;
while( $line = <$fh1> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   (my $value1, my $value2, my $value3) = split(' ', $line, 3);
   $account{$value1 . $value2} = $value3;
   }
close $fh1;

# list out the hash
#while ((my $key, my $value) = each(%account)){
#     print "key is: |$key|   value is: |$value|\n";
#}

open( my $out_fh, ">", "adi3.txt" ) or die "$!";
open( my $fh2, "<", "eg2.txt" ) or die "$!";
my $linecnt=0;
my @mykeys;
#my $line;
while( $line = <$fh2> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces
   if ($linecnt == 0)
      {
      # I know the first line contains names that are part of the key
      @mykeys = split ' ', $line;
      print $out_fh "$line\n";
      $linecnt++;
      next if $linecnt == 1;
      }
      # build the keys: print the row label once, then one value per user
      print $out_fh "$line ";
      foreach my $thekey (@mykeys) {
         my $key1 = $line . $thekey;
         if(defined $account{$key1} )
            {
            print $out_fh "$account{$key1} ";
            }
         else
            { # no data for this user, print a placeholder
            print $out_fh "XX ";
            }
         }
      print $out_fh "\n"; # need a newline after each row
   }
close $out_fh;
close $fh2;



#13
March 31, 2013 at 23:45:51
Thanks nails. The script does what is required, but when I run it with my actual files it shows an error: out of memory. How do I go about it? I tried running it on Windows as well as Ubuntu; the same error shows up.


#14
April 1, 2013 at 10:29:39
I warned you in reply #8 that you might run out of memory if your files were large. A hash resides in memory so let us throw away this algorithm.

Another way to do it is to create a separate file for each user: i.e. Jack.txt, Mac.txt, and process each user's file (instead of using a hash).

This perl program creates the user's files. It only works on Linux as it calls the Linux command grep to build the files. I warn you with the amount of data you have, it will take some time to build the files. If this is agreeable and you can build the files, I will show you how to process them in the next post:


#!/usr/local/bin/perl


# this perl script builds a txt file for each user: i.e. Jack.txt, Mac.txt
# this program only runs on Linux as it uses grep.
#$file1 = "C:\strawberry\perl\bin\eg1.txt";
my $file1 = "./eg1.txt";

my $file2= "./eg2.txt";
#my $outfile = "C:\strawberry\perl\bin\adi3.txt";

open( my $fh2, "<", "eg2.txt" ) or die "$!";
my @mykeys;
while( $line = <$fh2> ) {
   chomp($line);

   # I know the first line contains names
   @mykeys = split ' ', $line;
   # build a file for each user
   foreach my $thekey (@mykeys) {
      my $msg = `grep $thekey $file1 > $thekey.txt`;
      }

   exit; # stop processing after the first line
   }
close $fh2;



#15
April 1, 2013 at 14:32:09
Yes, we could go that way as well.


#16
April 2, 2013 at 10:33:10
This perl script assumes the script in post #14 has run so there is a text file for each user. It reads the first line of eg2.txt to build the list of users. Then, for every following line, it greps for that line in each user's file and prints the result. I don't check for the error of a user not having a file. Also, this program is designed to be run in the same directory where the users' text files exist.

Let me know if you have any questions:

#!/usr/local/bin/perl


# this perl script reads the per-user txt files built by the script in #14
# (i.e. Jack.txt, Mac.txt) and writes the combined output to adi3.txt.
# this program only runs on Linux as it uses grep.
#$file1 = "C:\strawberry\perl\bin\eg1.txt";
my $file1 = "./eg1.txt";

my $file2= "./eg2.txt";
#my $outfile = "C:\strawberry\perl\bin\adi3.txt";

open( my $out_fh, ">", "adi3.txt" ) or die "$!";
open( my $fh2, "<", "eg2.txt" ) or die "$!";
my @mykeys;
my $line;
my $linecnt=0;
while( $line = <$fh2> ) {
   chomp($line);
   $line =~ s/\s+$//; # remove trailing spaces

   # I know the first line contains names
   if ($linecnt == 0)
      {
      # I know the first line contains names that are part of the key
      @mykeys = split ' ', $line;
      print $out_fh "$line\n";
      $linecnt++;
      next if $linecnt == 1;
      }

   print $out_fh "$line ";
   # for each user's file grep for the line in question
   foreach my $thekey (@mykeys) {
      my $msg = `grep $line $thekey.txt`;
      chomp($msg);
      # get rid of the first two columns
      (my $value1, my $value2, my $value3) = split(' ', $msg, 3);
      print $out_fh "$value3 ";
      }
   print $out_fh "\n"; # CR after every line
   }
close $out_fh;
close $fh2;



#17
April 3, 2013 at 06:52:28
I tried using the program in #16. It's still running, but the output is not what we want: it doesn't form a matrix as we need. I still have to check the final output to be sure.


#18
April 3, 2013 at 08:23:05
I'm joining this thread a little late, but if you don't mind, I'd like to throw in my 2 cents.

You have a discrepancy between your opening post and post #3 regarding the format of the first file, which throws in some confusion as to which fields you need to extract.

How big are these files? I'm referring to their file size, not the number of rows in each file.

nails has done a good job of providing a couple parsing options. Personally, I prefer the hash approach, especially if we can work around the out of memory issue.

Can you post a reasonable sample of the actual data files and the expected output after they are merged? That would go a long way in allowing us to see other possible avenues of parsing. You won't be able to post file attachments here, but you could use one of the file sharing sites and post the link.


Code Comments:
I see that you're using lexical filehandles and the 3-arg form of open, which is great, but the die statement should include the filename so that you know which file failed to open.

There is no need for the $linecnt var since perl already has the built-in $. var to track the current line number.

The if block where the $linecnt is being used can and should be removed from the while loop and its functionality be handled prior to the loop.

This

(my $value1, my $value2, my $value3) = split(' ', $msg, 3);

is better written as
my ($value1, $value2, $value3) = split(' ', $msg, 3);

Since $value1 and $value2 are not being used, it would be better to use undef as placeholders or even better would be to use an array slice.
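
A small sketch of those three points side by side ($., undef placeholders and a list slice). The sample string and the in-memory filehandle are assumptions made purely for illustration:

```perl
use strict;
use warnings;

my $msg = "sn1 Jack CC extra data";

# 1) undef as a placeholder for the fields that are not needed
my (undef, undef, $rest) = split ' ', $msg, 3;

# 2) a list slice picks the third field directly
my $same = ( split ' ', $msg, 3 )[2];

print "$rest\n";
print "$same\n";

# 3) $. holds the current line number of the last filehandle read,
#    so no separate counter is needed to spot the header line
my $data = "Jack Mac\nsn1\nsn2\n";
open( my $fh, '<', \$data ) or die "cannot open in-memory sample: $!";
my @labels;
while ( my $line = <$fh> ) {
    chomp $line;
    if ( $. == 1 ) {
        print "header: $line\n";
        next;
    }
    push @labels, $line;
}
close $fh;
print "rows: @labels\n";
```

Both forms of the split leave the third field, including any further columns, in a single variable.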

Here's an updated version of the code in post #16.

#!/usr/bin/perl

use strict;
use warnings;

my $file2 = './eg2.txt';
my $file3 = './adi3.txt';

open my $in_fh2, '<', $file2 or die "failed to open '$file2' $!";
open my $out_fh, '>', $file3 or die "failed to open '$file3' $!";

my @names = split /\s+/, scalar <$in_fh2>;
print $out_fh "@names\n";

while (my $line = <$in_fh2>) {

    # strips trailing whitespace which includes the \n line terminator
    $line =~ s/\s+$//;

    print $out_fh "$line ";

    foreach my $name (@names) {
        chomp(my $msg = `grep $line $name.txt`);

        # use an array slice to strip out the first 2 fields
        # and output the rest directly to the file without a var assignment
        print $out_fh (split(' ', $msg, 3))[2], ' ';
    }
    print $out_fh "\n";
}

close $out_fh;
close $in_fh2;



#19
April 3, 2013 at 12:01:05
I'm not familiar with perl, but like Fish, I'll toss in my tuppence fwiw, which is that if you could rearrange the textfile elements, then you could sort the textfile, then process the sorted textfile sequentially, starting a new output-row whenever the sn field changes (but the names in the second file would also have to be in sorted order).
This might circumvent the out-of-memory issue.
(the textfile would be arranged like: sn name data data, and sorted on just sn+name)
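
For anyone who wants to try this in perl, a rough sketch of the sequential pass might look like the following. It assumes the input is already arranged as "sn name data" and sorted by sn (for instance by an external sort tool); stream_rows and the in-memory sample are made-up names and data, and nothing here has been tested on the real files:

```perl
use strict;
use warnings;

# Hypothetical helper: read lines of "sn name data" already sorted by sn,
# and print one matrix row per sn, with 'XX' for names that have no data.
sub stream_rows {
    my ($in_fh, $out_fh, @names) = @_;
    my ($current, %row);

    my $flush = sub {
        return unless defined $current;
        my @cells = map { defined $row{$_} ? $row{$_} : 'XX' } @names;
        print {$out_fh} join( ' ', $current, @cells ), "\n";
        %row = ();
    };

    while ( my $line = <$in_fh> ) {
        $line =~ s/\s+$//;
        my ($sn, $name, $data) = split ' ', $line, 3;
        next unless defined $data;
        if ( !defined($current) || $sn ne $current ) {
            $flush->();          # the sn changed: write out the finished row
            $current = $sn;
        }
        $row{$name} = $data;
    }
    $flush->();                  # write the last row
}

# Demo with in-memory filehandles standing in for the real files.
my $sorted = "sn1 Jack CC\nsn1 Mac AA\nsn2 Jack TT\nsn2 Mac AA\nsn3 Mac TT\n";
open( my $in, '<', \$sorted ) or die "cannot open in-memory input: $!";
my $result = '';
open( my $out, '>', \$result ) or die "cannot open in-memory output: $!";
print {$out} "Jack Mac\n";
stream_rows( $in, $out, qw(Jack Mac) );
close $out;
close $in;
print $result;
```

Only one output row is held in memory at a time. Because each row is collected in a small hash before it is printed, the names in the second file would not need to be sorted in this variant, only the sn field.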


#20
April 4, 2013 at 02:12:20
@FishMonger: Thanks for the code, I'll give it a try. file1 is 1.6 GB and file2 is 45 MB. The program in #12 works fine for small data but runs out of memory for the large files.
@nbrane: it's a little difficult to sort these files.


#21
April 4, 2013 at 06:31:31
Can you please post a sample of the actual data files. I have an idea on how to parse them more efficiently, but I'd like to run some realistic tests with realistic data.


#22
April 4, 2013 at 08:00:24
I've worked up a possible solution but it needs proper testing against the actual data.

I'm building a hash of eg1.txt but instead of assigning the entire "3rd field" as the value, I'm assigning the byte offset where it begins. That offset is then used while parsing eg2.txt to seek to the given offset when outputting the data.

#!/usr/bin/perl

use strict;
use warnings;

my $file1  = './eg1.txt';
my $file2  = './eg2.txt';
my $file3  = './adi3.txt';
my $offset = 0;
my %offset;

open my $in_fh1, '<', $file1 or die "failed to open '$file1' $!";
open my $in_fh2, '<', $file2 or die "failed to open '$file2' $!";
open my $out_fh, '>', $file3 or die "failed to open '$file3' $!";

while ( my $line = <$in_fh1> ) {
    $line =~ /^(\w+ \w+)/g;
    $offset{$1} = $offset + pos($line);
    $offset = tell $in_fh1;
}


my @names = split /\s+/, scalar <$in_fh2>;
print  $out_fh "@names\n";

while ( my $line = <$in_fh2> ) {
    $line =~ s/\s+$//;
    print $out_fh "$line ";

    foreach my $name ( @names) {
        if (defined $offset{"$line $name"} ) {
            seek $in_fh1, $offset{"$line $name"}, 0;
            my $record = <$in_fh1>;
            chomp $record;
            print $out_fh $record, '';
        }
    }
    print $out_fh "\n";
}
close $in_fh1;
close $in_fh2;
close $out_fh;



#23
April 4, 2013 at 13:38:52
@FishMonger (sorry)

First, thanks for a critique of my code. I picked up a couple of things I didn't know.

Second, I tested the code you posted successfully with the small dataset I'm using. With the large data files that adi27 is using, it'll take a while, but it should be significantly faster than my separate file method.

I'm going to have to read up on perl's seek/offset stuff. Thanks!



#24
April 4, 2013 at 14:26:19
Hi nails,

s/Ghostdog/FishMonger/ :-)

My first idea was to flip things around and build the hash using eg2.txt, but soon decided that would not work correctly so instead I went with using seek/tell. I did have a little trouble with the pos() function. It seems to have a little quirk which required the needless use of the g modifier on the regex.
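
For what it's worth, pos() is only set by a successful match that uses the g modifier, which would explain why it was needed. A small sketch of an alternative, assuming the same single-space layout as the sample data: the built-in @+ array gives the end offset of the last successful match without g.

```perl
use strict;
use warnings;

my $line = "sn1 Jack CC\n";

# With /g in scalar context, pos() reports where the match ended.
my $end_from_pos;
if ( $line =~ /^(\w+ \w+)/g ) {
    $end_from_pos = pos($line);
}

# Without /g, $+[0] holds the end offset of the last successful match.
my $copy = $line;
my $end_from_array;
if ( $copy =~ /^(\w+ \w+)/ ) {
    $end_from_array = $+[0];
}

print "pos() gives $end_from_pos, the \@+ array gives $end_from_array\n";
```

Both values point just past the name, at the space in front of the third field.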

Another approach I considered, but didn't test is to load the main data file into a database instead of the hash. Then I could use sql calls while parsing the smaller file. That approach should be faster than using seek to reset the file pointer.
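
As a lighter stand-in for that idea (not the SQL database mentioned above), a hash can be tied to an on-disk file so that the data no longer has to fit in memory. This is only a sketch using the SDBM_File module that ships with perl; the scratch directory and the inline sample data are assumptions, and SDBM limits each key/value pair to roughly 1 KB:

```perl
use strict;
use warnings;
use Fcntl;                  # O_RDWR, O_CREAT
use SDBM_File;              # simple on-disk hash shipped with perl
use File::Temp qw(tempdir);

# Assumption: a scratch directory stands in for wherever the index would live.
my $dir = tempdir( CLEANUP => 1 );

# Tie the hash to files on disk instead of keeping it all in memory.
my %account;
tie( %account, 'SDBM_File', "$dir/account", O_RDWR | O_CREAT, 0666 )
    or die "cannot tie SDBM file: $!";

# Load the three-column data the same way the in-memory version does.
my $data = "sn1 Jack CC\nsn1 Mac AA\nsn2 Jack TT\n";
open( my $fh, '<', \$data ) or die "cannot open in-memory sample: $!";
while ( my $line = <$fh> ) {
    $line =~ s/\s+$//;
    my ($sn, $name, $value) = split ' ', $line, 3;
    $account{"$sn $name"} = $value;
}
close $fh;

# Lookups work like a normal hash; missing keys get a placeholder.
my @lookups;
for my $key ( 'sn1 Jack', 'sn1 Mac', 'sn2 Mac' ) {
    push @lookups, defined $account{$key} ? $account{$key} : 'XX';
}
print "@lookups\n";

untie %account;
```

Disk-backed lookups are slower than an in-memory hash, so this trades speed for memory; a real database with an index on the two key columns would be the more robust version of the same idea.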



#25
April 7, 2013 at 23:46:53
Below is a small part of my actual data files. The code in #22 didn't work with the actual data files.

File1

BTA-29644-no-rs ABO_1444 CC
BTA-29515-no-rs ABO_1444 GG
BTA-29334-no-rs ABO_1444 AT
BTA-28763-no-rs ABO_1444 AA
BTA-32647-no-rs ABO_1444 AA
BTA-103663-no-rs ABO_1444 GG
Hapmap49509-BTA-17127 ABO_1444 GG
BTA-95262-no-rs ABO_1444 AG
Hapmap42400-BTA-102731 ABO_1444 GG
BTA-40478-no-rs ABO_1444 AA
Hapmap24476-BTA-130730 ABO_1444 GG
BTA-36593-no-rs ABO_1444 AC
BTA-94921-no-rs ABO_1444 AA
Hapmap47551-BTA-25902 ABO_1444 AG
Hapmap52416-rs29016842 ABO_1444 AC
BTA-47238-no-rs ABO_1444 AG
BTA-103555-no-rs ABO_1444 AA
Hapmap41451-BTA-119981 ABO_1444 AA
Hapmap43437-BTA-101873 ABO_1444 AG
Hapmap57391-rs29026905 ABO_1444 AA
Hapmap58800-rs29016980 ABO_1444 AA
ARS-BFGL-NGS-16466 ABO_1444 AG
Hapmap50221-BTA-112986 ABO_1444 GG
Hapmap34944-BES1_Contig627_1906 ABO_1444 CC

File2
ABO_1444 ABO_1445 ABO_1446 ABO_1447 ABO_1448 ABO_1449 ABO_1450
BTA-29644-no-rs
BTA-29515-no-rs
BTA-29334-no-rs
BTA-28763-no-rs
BTA-32647-no-rs
Hapmap49509-BTA-17127



#26
April 10, 2013 at 04:05:07
@FishMonger: Thanks for the code, but it is simply printing file2 to the output file.
Also, I didn't mention that file1 may have some missing data.
To account for missing data I had put in an else, like below. But I'm still not getting what I want. Any further suggestions?
else
{
print $out_fh "XX ";
}


#27
April 11, 2013 at 22:41:53
foreach my $name ( @names) {
    if (defined $offset{"$line $name"} ) {
        seek $in_fh1, $offset{"$line $name"}, 0;
        my $record = <$in_fh1>;
        chomp $record;
        print $out_fh $record, '';
        print "if working fine\n";
    }
    else
    {
        print $out_fh " XX ";
        print "else working fine\n";
    }
}
print $out_fh "\n";

This is the change I made. It's doing exactly what I want, but again it runs out of memory. :-(



#28
April 12, 2013 at 07:18:01
You must be running with only 256k of memory.

Buy more memory.



#29
April 26, 2013 at 06:32:03
I tried the script with a small part of the data on Windows and it works just fine, but it doesn't give the desired output on Ubuntu. I am using Ubuntu 12.10.


#30
April 26, 2013 at 12:02:28
What output did it give and how does that differ from what you expected?

How was the file transferred to the ubuntu system?



#31
April 26, 2013 at 15:28:49
Oh, I transferred the file via samba sharing, and the script prints the output file in one column rather than in matrix form. Also, the order is not maintained. On Windows it's working perfectly, but because my Windows machine runs out of memory I have to use another machine, on which I recently installed 64-bit Ubuntu.


#32
April 26, 2013 at 20:02:16
How did you transfer the files to the samba share? If you used the Windows copy command you could have a problem with an extra carriage return at the end of each line. That is because Windows/DOS terminates each line with a carriage return/line feed combination whereas Linux terminates each line with just a line feed.

You can tell by vi'ing the file on Linux. You'll see ^M at the end of each line if the carriage returns weren't removed. The Linux dos2unix command will convert the files for you.
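
If converting the files is not convenient, the same cleanup can be done inside the perl script while reading. A minimal sketch, where strip_eol is a made-up helper name and the sample lines are invented:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the line ending, whether it is LF or CR+LF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $dos_line  = "sn1 Jack CC\r\n";   # as written on Windows
my $unix_line = "sn1 Jack CC\n";

# chomp alone removes only the line feed and leaves the carriage return behind
chomp( my $chomped = $dos_line );
print "after chomp the line ",
    ( $chomped =~ /\r\z/ ? "still ends" : "does not end" ),
    " with a carriage return\n";

print "lines match after strip_eol\n"
    if strip_eol($dos_line) eq strip_eol($unix_line);
```

A leftover carriage return becomes part of the last field on the line, so keys built from it no longer match and the printed rows can look scrambled in a terminal.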



#33
April 26, 2013 at 20:23:58
@nails: thanks, I'll try dos2unix.
