I want to split data from one file to many

August 12, 2011 at 10:14:52
Specs: Solaris
I have data in a file and want to split it into several files based on the next-to-last section of each name.
TEST2_GT365_0043_0001_0019
TEST2_GT365_0044_0001_0020
TEST2_GT365_0045_0001_0021
TEST2_GT365_0046_0001_0022
TEST2_GT365_0047_0001_0023
TEST2_GT365_0048_0001_0024
TEST2_GT365_0002_0002_0001
TEST2_GT365_0003_0002_0002
TEST2_GT365_0011_0002_0003
TEST2_GT365_0012_0002_0004
TEST2_GT365_0013_0002_0005
TEST2_GT365_0020_0002_0006
TEST2_GT365_0020_0002_0007
TEST2_GT365_0021_0002_0008
TEST2_GT365_0022_0002_0009
TEST2_GT365_0024_0002_0010
TEST2_GT365_0025_0002_0011
TEST2_GT365_0026_0002_0012
TEST2_GT365_0027_0002_0013
TEST2_GT365_0028_0002_0014
TEST2_GT365_0029_0002_0015
TEST2_GT365_0002_0003_0001
TEST2_GT365_0003_0003_0002
TEST2_GT365_0004_0003_0003
TEST2_GT365_0005_0003_0004
TEST2_GT365_0006_0003_0005
TEST2_GT365_0007_0003_0006
TEST2_GT365_0008_0003_0007
TEST2_GT365_0009_0003_0008
TEST2_GT365_0010_0003_0009
TEST2_GT365_0011_0003_0010
TEST2_GT365_0012_0003_0011
TEST2_GT365_0013_0003_0012
TEST2_GT365_0014_0003_0013


I would like to have 3 files: one with all the "0001" lines, one with the "0002" lines, and one with the "0003" lines.

Thanks for the help.



#1
August 12, 2011 at 14:08:19
# Using the underscore as the field separator, the 4th field is the key. Print the 4th field and sort unique to get the keys: 0001, 0002, 0003.
#!/bin/ksh

# get the unique keys 0001, 0002, 0003 in this case
rm -f tmp.file
awk ' BEGIN { FS="_" }
{
  print $4
} ' data1.txt|sort -u > tmp.file

# What makes this problem tricky is that the 3rd field can have the same value as the 4th field (for example TEST2_GT365_0002_0002_0001).

# The only way I could see was to have sed match $key, then an underscore, then exactly 4 other characters (4 periods) anchored at the end of the line:

# sed -n '/'"$key"'_....$/p' 
 
cnt=0
# look at each key
while read key
do
   ((cnt=cnt+1))
   myfile="myfile.$cnt"
   # print only the lines having $key, plus 1 underscore, plus 4 characters at the end
   sed -n '/'"$key"'_....$/p' data1.txt > "$myfile"
done < tmp.file
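
A possible one-pass alternative, as a minimal sketch: since the key is always the 4th underscore-separated field, awk can write each line straight to a file named after that field, with no temp file and no loop. The file names sample.txt and split_<key>.txt are made up for illustration, and the sample data is a small subset of the lines posted above. On Solaris, nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
#!/bin/sh
# Hypothetical sample input; the real file in this thread is data1.txt.
cat > sample.txt <<'EOF'
TEST2_GT365_0043_0001_0019
TEST2_GT365_0002_0002_0001
TEST2_GT365_0002_0003_0001
TEST2_GT365_0003_0003_0002
EOF

# Split each line on "_" and write it to a file named after field 4.
# Comparing field 4 directly avoids the problem of field 3 having the same value.
awk -F_ '{ out = "split_" $4 ".txt"; print > out }' sample.txt
```

This names the output files by key (split_0001.txt and so on) instead of by counter, so it is not a drop-in replacement for the myfile.N naming used in the script above.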



#2
August 15, 2011 at 14:17:12
Thanks so much for the help. This works perfectly.
