joining lines with sed

Original Message
Name: kas61
Date: January 17, 2005 at 08:11:37 Pacific
Subject: joining lines with sed
OS: HP-UX 10.20
CPU/Ram: 256
Comment:

I have a SQL dump in which records can contain a <LF>. I want to join each of these multi-line records into one line.

The file contains data like:
B1CO38EHHRA0Z3|me_draw_tekam | 917356137|......|...|
B1CO3999999993|me_draw_tekam
special| 917356137|......|...|

Each line must begin with 14 characters followed by a "|". If the 15th character isn't a "|", the line must be joined with the previous one (I know this isn't always correct).

I've tried:
sed 's/\n\(..............[^|].*\)/<CRLF>\1/'

and

:join
/.*\n..............[^|].*/{N
s/\n\(..............[^|].*\)/<CRLF>\1/
b join
}

But neither method works.

Any suggestions?


Response Number 1
Name: Jim Boothe
Date: January 17, 2005 at 08:36:04 Pacific
Subject: joining lines with sed
Reply:

My first suggestion is to try to avoid the wrapped lines to begin with. In Oracle sqlplus, increase the width of an output line so that one entire logical line will fit on one physical line:

set linesize 250


Response Number 2
Name: kas61
Date: January 17, 2005 at 08:42:04 Pacific
Subject: joining lines with sed
Reply:

It's not Oracle, but Allbase (an old database from HP). I've set the width of each record, but some fields can contain a <LF>. These fields were used by users to enter a multi-line note.
I don't know how to suppress these <LF> characters within Allbase, so I'm trying to do it afterwards.


Response Number 3
Name: Jim Boothe
Date: January 17, 2005 at 11:32:55 Pacific
Subject: joining lines with sed
Reply:

Here is an awk solution. The match expression requires 14 non-bar characters followed by a bar character followed by anything. It would recognize the first line below as a "starting" line, but not the second:

aaaaaaaaaaaaaa|cccc|
aaaaaa|bbbbbbb|cccc|

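awk '{
# A "starting" line has 14 non-bar characters, then a bar.
# A regex literal is used so the bar needs no string escaping.
if (match($0, /^[^|]{14}[|]/))
  {if ( out!="" ) print out   # flush the previous joined record
   out=$0}
else
   out=out $0                 # continuation line: append to the record
}
END {if ( out!="" ) print out}' file.in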
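
Some older awk implementations do not accept interval expressions such as {14}. As a rough sketch for that case, the same "starting line" test can be written without a regular expression by checking that the first bar sits in column 15. The function name join_records is made up for illustration, and the sample input is the data from the original message.

```shell
# join_records is a hypothetical helper name, not something from the thread.
join_records() {
  awk '
    # A starter line has its first "|" in column 15,
    # i.e. exactly 14 non-bar characters come before it.
    index($0, "|") == 15 {
      if (out != "") print out   # flush the previous joined record
      out = $0
      next
    }
    { out = out $0 }             # continuation line: append, no separator
    END { if (out != "") print out }
  ' "$@"
}

# Sample data from the original message, fed on standard input.
printf '%s\n' \
  'B1CO38EHHRA0Z3|me_draw_tekam | 917356137|......|...|' \
  'B1CO3999999993|me_draw_tekam' \
  'special| 917356137|......|...|' | join_records
```

With that input the second and third lines come out as a single record, with "special|" appended directly after "me_draw_tekam". If a separator is wanted at the join, change the continuation rule to something like out = out " " $0.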

A sed solution will be a bit more challenging (for me at least).  But I will give it a try.
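
Part of the challenge is that sed reads one line at a time into its pattern space with the trailing newline removed, so a \n in a pattern can only match after a command such as N has pulled in a second line. A small illustration, using a made-up two-line input:

```shell
# Without N the pattern space never contains a newline,
# so the substitution finds nothing and both lines pass through unchanged.
printf 'a\nb\n' | sed 's/\n/+/'

# With N the next input line is appended first (separated by a newline),
# and that embedded newline can then be replaced.
printf 'a\nb\n' | sed 'N;s/\n/+/'
```

The first command prints the two lines as they were; the second prints them joined with a "+".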


Response Number 4
Name: kas61
Date: January 17, 2005 at 13:41:23 Pacific
Subject: joining lines with sed
Reply:

Don't we love awk!

I started with awk, but gave it up. I see now that I took the wrong approach.

Thanks for your input, so no need for sed anymore.


Response Number 5
Name: Jim Boothe
Date: January 20, 2005 at 07:56:02 Pacific
Subject: joining lines with sed
Reply:

For practice, I coded a sed solution for this.  The rules that I followed are: Join all lines together, but a line beginning with NEW will start a new line.  The first and last lines of the file may or may not be a NEW line. The file may have only one line.

sed -e '1{$p;x;d;}'                     \
    -e '/^NEW/!{H;$!d;x;s/\n//g;b;}' \
    -e 'x;s/\n//g;${p;x;}'              \
filein

While the above does work for me on a non-HPUX box, my HPUX box does not allow me to have a b within braces (what a pain).  In this particular case however, I am able to make a simple change that will work on HPUX:

sed -e '1{$p;x;d;}'                       \
    -e '/^NEW/!{H;$!d;x;s/\n//g;p;d;}' \
    -e 'x;s/\n//g;${p;x;}'                \
filein

Or, here is a slight variation that should work both on HPUX and elsewhere. The branch to the :s label avoids the need for that particular set of braces.

sed -e '1{$p;x;d;}'        \
    -e '/^NEW/bs'       \
    -e 'H;$!d;x;s/\n//g;b' \
    -e :s                  \
    -e 'x;s/\n//g;${p;x;}' \
filein
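
As a quick way to try the rules out, the last variation can be wrapped in a shell function and fed a small made-up input on standard input. The function name join_new and the sample lines are only for illustration.

```shell
# join_new is a hypothetical wrapper name; the sed script itself is the
# label-based variation shown above, unchanged.
join_new() {
  sed -e '1{$p;x;d;}' \
      -e '/^NEW/bs' \
      -e 'H;$!d;x;s/\n//g;b' \
      -e :s \
      -e 'x;s/\n//g;${p;x;}' \
      "$@"
}

# Made-up sample: two logical records spread over five physical lines.
printf '%s\n' 'NEW a' 'b' 'c' 'NEW d' 'e' | join_new
```

This prints "NEW abc" and "NEW de". The hold space carries the record being built; each NEW line swaps the finished record into the pattern space to be printed and parks itself in the hold space as the start of the next one.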

To adapt this code to the original problem, the starter lines, rather than being identified with /^NEW/, are identified as starting with 14 non-bar characters followed by a bar character: /^[^|]\{14,14\}|/
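
Put together, the adapted script might look like the sketch below, run here against the sample data from the original message. The function name join_dump is made up, and the script has only been reasoned through for this small input, not for a full dump.

```shell
# join_dump is a hypothetical wrapper name. The only change from the
# /^NEW/ version is the address that recognizes a starter line.
join_dump() {
  sed -e '1{$p;x;d;}' \
      -e '/^[^|]\{14,14\}|/bs' \
      -e 'H;$!d;x;s/\n//g;b' \
      -e :s \
      -e 'x;s/\n//g;${p;x;}' \
      "$@"
}

# Sample data from the original message.
printf '%s\n' \
  'B1CO38EHHRA0Z3|me_draw_tekam | 917356137|......|...|' \
  'B1CO3999999993|me_draw_tekam' \
  'special| 917356137|......|...|' | join_dump
```

The first record is printed unchanged and the "special|" line is appended to the record before it. In a basic regular expression the bar is an ordinary character, so it needs no backslash here.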


Response Number 6
Name: kas61
Date: January 25, 2005 at 05:54:23 Pacific
Subject: joining lines with sed
Reply:

I'll have to 'decode' your sed script to learn more about sed.

Jim, thanks for all your input.


Response Number 7
Name: Jim Boothe
Date: January 25, 2005 at 07:13:04 Pacific
Subject: joining lines with sed
Reply:

You're quite welcome.

That sed script was a good exercise for my practice, and would be a good intermediate script to study, but I would not suggest trying to start with it.

You will find "HANDY ONE-LINERS FOR SED" at:
http://www.student.northpark.edu/pemente/sed/sed1line.txt

Those are great for self study (along with the man page, of course).

