Merging two data files

Name: cocoacat
Date: January 28, 2006 at 23:33:37 Pacific
OS: unix
CPU/Ram: -
Comment:

I have a problem merging two files. This is my Perl code, which is not complete yet.

$filename = "delete.txt";
unless ( -e $filename ) {
    print "The file $filename does not seem to exist\n";
    exit;
}
print "\nThe file $filename exists and will be loaded.\n";
unless ( open(FILEA, $filename) ) {
    print "Cannot open $filename\n";
    exit;
}
my @readdata = <FILEA>;
close FILEA;

$filenameB = "delete2.txt";
unless ( -e $filenameB ) {
    print "The file $filenameB does not seem to exist\n";
    exit;
}
print "\nThe file $filenameB exists and will be loaded.\n";
unless ( open(FILEB, $filenameB) ) {
    print "Cannot open $filenameB\n";
    exit;
}
my @readdataB = <FILEB>;
close FILEB;

#--- extract the term in the third column of each line of the first file
my $linenum = @readdata;
my $currline = 0;
for ($currline = 0; $currline < $linenum; $currline++)
{
    my @splitdataA = split(" ", $readdata[$currline]);
    my $geneA = $splitdataA[2];
    push(@exgenesA, $geneA);
}

#--- extract the term in the second column of each line of the second file
my $linenumB = @readdataB;
my $currlineB = 0;
for ($currlineB = 0; $currlineB < $linenumB; $currlineB++)
{
    my @splitdataB = split(" ", $readdataB[$currlineB]);
    my $geneB = $splitdataB[1];
    push(@exgenesB, $geneB);
}

#--- build the union and the intersection of the two term lists
@union = @isect = ();
%union = %isect = ();
foreach $e (@exgenesB)
{
    $union{$e} = 1;
}
foreach $e (@exgenesA)
{
    if ($union{$e})
    {
        $isect{$e} = 1;
    }
    $union{$e} = 1;
}
@union = keys %union;
@isect = keys %isect;
print join(",", sort @isect);
exit;
--------------------
input file (delete.txt)
cd A1 B1 0.1
cd A2 B2 0.3
cd A3 B4 0.2
cd A5 B3 0.2
cd A6 B3 0.2
---------------------------
input file (delete2.txt)
ab B1 A1 0.2
ab B2 A2 0.3
ab B3 A4 0.2
ab B5 A3 0.2
---------------------------
I am trying to find pairs of matching records.
The result should be:
Output :
A1 B1 0.1 B1 A1 0.2
A2 B2 0.3 B2 A2 0.3
A3 B3 0.2 B3 A4 0.2
---------------------------
But with my code I can only get the intersection between them, which is B1, B2 and B3.
Could you please suggest how to produce output like the one shown above?
Thank you so much.
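
One way to get whole paired lines rather than only the shared keys is to store each line of the second file in a hash keyed on the B identifier, then look up every line of the first file in that hash. The following is only a minimal sketch. It assumes the key is the B identifier (third column of delete.txt, second column of delete2.txt), as in the code above. The subroutine name pair_lines and the inlined sample arrays are made up for illustration and stand in for the file contents.

```perl
use strict;
use warnings;

# Hypothetical helper: pairs lines from two lists on the B identifier.
# Assumption: the key is column 3 in the first list and column 2 in the second.
sub pair_lines {
    my ($lines_a, $lines_b) = @_;

    # Index the second list by its key, keeping every line that shares a key.
    my %by_key;
    for my $line (@$lines_b) {
        my @f = split ' ', $line;
        next unless @f >= 2;                 # skip blank or short lines
        push @{ $by_key{ $f[1] } }, join(' ', @f[1 .. $#f]);
    }

    # Look up each line of the first list and emit one joined line per match.
    my @out;
    for my $line (@$lines_a) {
        my @f = split ' ', $line;
        next unless @f >= 3;
        my $matches = $by_key{ $f[2] } or next;
        my $left = join(' ', @f[1 .. $#f]);  # drop the leading tag column
        push @out, "$left $_" for @$matches;
    }
    return @out;
}

# Sample rows copied from the post.
my @file_a = ("cd A1 B1 0.1", "cd A2 B2 0.3", "cd A3 B4 0.2",
              "cd A5 B3 0.2", "cd A6 B3 0.2");
my @file_b = ("ab B1 A1 0.2", "ab B2 A2 0.3", "ab B3 A4 0.2", "ab B5 A3 0.2");

print "$_\n" for pair_lines(\@file_a, \@file_b);
```

Because every first-file line whose key exists in the hash is printed, both B3 lines of delete.txt pair with the single B3 line of delete2.txt in this sketch. If only one pair per key is wanted, the lookup would need an extra rule for choosing it.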




Response Number 1
Name: cocoacat
Date: January 30, 2006 at 18:50:31 Pacific
Reply:

I wrote the following code to search for matching pairs in the two files:

open(IF1, "data1.txt") or die "Error opening data file: $!\n";
my @a;
while (<IF1>)
{
    chomp;
    my @fields = split;
    # append every field of this row to one flat array
    for (my $j = 0; $j < @fields; $j++) {
        push(@a, $fields[$j]);
    }
}
close(IF1);
open(IF2, "data2.txt") or die "Error opening data file: $!\n";
my @b;
while (<IF2>)
{
    chomp;
    my @fields = split;
    for (my $j = 0; $j < @fields; $j++) {
        push(@b, $fields[$j]);
    }
}
close(IF2);

my @c;
my $count = 0;
my $rowsA = @a / 8;   # 8 is the total number of columns per row
my $rowsB = @b / 8;
for (my $i = 0; $i < $rowsA; $i++)
{
    for (my $t = 0; $t < $rowsB; $t++)
    {
        # $i and $t are row indexes; 1 and 2 are the positions of the compared columns
        if (($a[8*$i+1] eq $b[8*$t+2]) and ($a[8*$i+2] eq $b[8*$t+1]))
        {
            $count++;
            for (my $k = 1; $k < 8; $k++)
            {
                push(@c, $a[(8*$i)+$k]);
            }
            for (my $k = 1; $k < 8; $k++)
            {
                push(@c, $b[(8*$t)+$k]);
            }
        }
    }
}
print @c;
---------
data1.txt
cc A1 B1 7e-14 149 33 74.3 181
cc A2 B3 5e-13 72 45 71.6 174
cc A3 B5 1e-11 152 30 64.7 156
cc A4 B6 1e-10 175 26 63.5 153
cc A5 B7 5e-10 95 33 62.8 151
---------
data2.txt
aa B1 A1 4e-13 207 23 56.6 135
aa B2 A1 5e-13 207 23 56.6 135
aa B3 A2 6e-13 72 45 71.6 174
aa B4 A3 7e-12 163 31 69.3 168
aa B5 A3 8e-11 152 30 64.7 156
aa B6 A3 9e-10 175 26 63.5 153
--------
The result looks like this:
A1
B1
7e-14
149
33.5570469798658
74.3
181
B1
A1
4e-13
207
23.6714975845411
56.6
135
A1
B1
6e-14
149
33.5570469798658
74.3
181
B1
A1
4e-13
207
23.6714975845411
56.6
135
A2
B3
5e-13
72
45.8333333333333
71.6
174
B3
A2
6e-13
72
45.8333333333333
71.6
174
A3
B5
1e-11
152
30.9210526315789
64.7
156
B5
A3
8e-11
152
30.9210526315789
64.7
156
-----------
I have a problem with the line breaks. I want each matching pair printed together on one line, like this:

A1 B1 7e-14 149 33 74.3 181 B1 A1 4e-13 207 23 56.6 135

A2 B3 5e-13 72 45 71.6 174 B3 A2 6e-13 72 45 71.6 174

A3 B5 1e-11 152 30 64.7 156 B5 A3 8e-11 152 30 64.7 156

You can see the matching pairs above. Please suggest code that produces output like that. Thank you so much. Also, my code now feels very slow when run on data that has a hundred thousand lines.
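
Both the one-line output and the speed concern can probably be handled by replacing the nested loops with a hash lookup. The following is a sketch under assumptions, not a tested solution for the real data. It assumes a match means that column 2 of the first file equals column 3 of the second and column 3 of the first equals column 2 of the second, as in the nested loops above. The subroutine name reciprocal_pairs and the in-memory filehandles are made up for illustration; real code would open data1.txt and data2.txt instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one joined line per reciprocal match.
# Assumption: columns 2 and 3 of a row in file 1 must equal columns 3 and 2
# of a row in file 2.
sub reciprocal_pairs {
    my ($fh1, $fh2) = @_;

    # Single pass over the second file: index rows by "col3 col2".
    my %index;
    while (my $line = <$fh2>) {
        my @f = split ' ', $line;
        next unless @f >= 3;                 # skip blank or short lines
        push @{ $index{"$f[2] $f[1]"} }, join(' ', @f[1 .. $#f]);
    }

    # Single pass over the first file: one hash lookup per row.
    my @out;
    while (my $line = <$fh1>) {
        my @f = split ' ', $line;
        next unless @f >= 3;
        my $hits = $index{"$f[1] $f[2]"} or next;
        my $left = join(' ', @f[1 .. $#f]);  # drop the leading tag column
        push @out, "$left $_" for @$hits;
    }
    return @out;
}

# Sample rows copied from the post, read through in-memory filehandles.
my $data1 = "cc A1 B1 7e-14 149 33 74.3 181\n"
          . "cc A2 B3 5e-13 72 45 71.6 174\n";
my $data2 = "aa B1 A1 4e-13 207 23 56.6 135\n"
          . "aa B2 A1 5e-13 207 23 56.6 135\n"
          . "aa B3 A2 6e-13 72 45 71.6 174\n";
open my $in1, '<', \$data1 or die "Cannot open in-memory data: $!";
open my $in2, '<', \$data2 or die "Cannot open in-memory data: $!";

# Each matching pair is printed on a single line.
print "$_\n" for reciprocal_pairs($in1, $in2);
```

The nested loops compare every row of one file with every row of the other, while this version reads each file once and does one hash lookup per row, which should scale much better to files with many lines.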


