sed & awk question

Name: raymond
Date: November 1, 2002 at 08:46:51 Pacific
OS: Unix
CPU/Ram: UltraSparcIII, 2G
Comment:

I am trying to find the simplest sed & awk script to
produce the output file below from the input file.

Input file:
A1
A2
A3 Pattern1
A4
A5
...
B1
B2
B3 Pattern1
B4
B5
...

Output file:
A2
A3 Pattern1
A4
Blank line
B2
B3 Pattern1
B4
Blank line
...




Response Number 1
Name: LANkrypt0
Date: November 1, 2002 at 10:39:21 Pacific
Reply:

So remove A1, B1, A5 and B5, then just sort it?
Or am I not interpreting this correctly?


Response Number 2
Name: James Boothe
Date: November 1, 2002 at 12:26:12 Pacific
Reply:

I should let raymond answer, but my guess would be: for each line that contains Pattern1, output it along with the preceding line, the following line, and a blank line for separation.

If so, what should happen when two adjacent lines both contain Pattern1?
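
The interpretation above can be sketched in awk roughly as follows. This is only a sketch: context_extract is a made-up function name, it reads standard input, and it assumes that matching lines are never adjacent (with adjacent matches some lines would be printed twice).

```shell
# context_extract PATTERN
# Hypothetical helper: for every input line matching PATTERN, print the
# line before it, the line itself, the line after it, and a blank line.
context_extract() {
    awk -v pat="$1" '
        after {                     # this line follows a match
            print
            print ""                # blank separator closes the group
            after = 0
        }
        $0 ~ pat {
            if (NR > 1) print prev  # line preceding the match
            print                   # the matching line itself
            after = 1
        }
        { prev = $0 }               # remember the line for the next round
        END { if (after) print "" } # match was on the very last line
    '
}
```

With the sample input from the question, `context_extract Pattern1 < inputfile` should give the A2/A3/A4 and B2/B3/B4 groups, each followed by a blank line.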


Response Number 3
Name: Raymond
Date: November 4, 2002 at 13:57:45 Pacific
Reply:

James's interpretation is correct.

"Pattern1" is never repeated within the same block.

What I am trying to do is:

pkginfo -l > a_file

Then run sed & awk to extract some info about the packages that were installed on a particular date.

Any suggestions?


Response Number 4
Name: James Boothe
Date: November 6, 2002 at 08:17:39 Pacific
Reply:

On my HP-UX, I don't have pkginfo. If you post a little of the pkginfo output and a sample of the output you desire, that would be a big help.


Response Number 5
Name: raymond
Date: November 6, 2002 at 15:03:55 Pacific
Reply:

James:

A partial input file:

40 executables
77 blocks used (approx)

PKGINST: SUNWeuezt
NAME: English UTF-8 L10N For Desktop Power Pack Applications
CATEGORY: system
ARCH: sparc
VERSION: 1.5,REV=1999.12.03.14.40
BASEDIR: /usr
VENDOR: Sun Microsystems, Inc.
DESC: American English/UTF-8 L10N For Desktop Power Pack Applications
INSTDATE: Feb 04 2000 09:23
HOTLINE: Please contact your local service provider
STATUS: completely installed
FILES: 28 installed pathnames
16 shared pathnames
13 directories
7 executables
197 blocks used (approx)

PKGINST: SUNWeugrf
NAME: X11 sun_eu_greek fonts
CATEGORY: system
ARCH: sparc
VERSION: 3.6,REV=1999.10.13.14.44
BASEDIR: /usr
VENDOR: Sun Microsystems, Inc.
DESC: X11 fonts for sun_eu_greek character set
INSTDATE: Feb 04 2000 09:23
HOTLINE: Please contact your local service provider
STATUS: completely installed
FILES: 32 installed pathnames
10 shared pathnames
12 directories
88 blocks used (approx)

Desired output (after filtering on "Feb 04 2000"):

PKGINST: SUNWeuezt
NAME: English UTF-8 L10N For Desktop Power Pack Applications
CATEGORY: system
VERSION: 1.5,REV=1999.12.03.14.40
BASEDIR: /usr
DESC: American English/UTF-8 L10N For Desktop Power Pack Applications
INSTDATE: Feb 04 2000 09:23


PKGINST: SUNWeugrf
NAME: X11 sun_eu_greek fonts
CATEGORY: system
VERSION: 3.6,REV=1999.10.13.14.44
BASEDIR: /usr
DESC: X11 fonts for sun_eu_greek character set
INSTDATE: Feb 04 2000 09:23

...





Response Number 6
Name: James Boothe
Date: November 7, 2002 at 08:52:53 Pacific
Reply:

In both of the scripts below, I use leading underscores in place of leading spaces so that the indentation is not lost; change them back to spaces before running the scripts.

This first script does not do date filtering. It prints the lines of interest for all packages:

#!/bin/sh

awk '\
/^PKGI/ {\
__print ""
__print}
/^NAME/||/^CAT/||/^VER/||/^BAS/||/^DES/||/^INST/ {\
__print}' a_file

exit 0

This second script expects 3 parameters such as:

pkgsearch Feb 04 2000

Alternately, it could be coded to receive the date as a single parameter in double-quotes, such as:

pkgsearch "Feb 04 2000"

or to avoid the quotes, it could be coded to receive a single parameter such as:

pkgsearch Feb-04-2000

and it could change the dashes to spaces.
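
A minimal sketch of how all three calling styles could be folded into one date string. normalize_date is a hypothetical helper name, and it assumes the default IFS, so that "$*" joins the arguments with single spaces.

```shell
# normalize_date ARGS...
# Hypothetical helper: accepts "Feb 04 2000" as three words, as one quoted
# string, or as Feb-04-2000, and prints it in the form "Feb 04 2000".
normalize_date() {
    # "$*" joins all arguments with a space; sed then turns dashes into spaces.
    printf '%s\n' "$*" | sed 's/-/ /g'
}
```

A script could then set `reqdate=$(normalize_date "$@")` and test that the result is non-empty instead of insisting on exactly three parameters.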

#!/bin/sh

if [ $# -ne 3 ] ; then
___echo '\nRun script with desired date of package installation.'
___echo '\nExample: pkgsearch Feb 04 2000\n'
___exit 1
fi

reqdate=$*

awk -v reqdate="$reqdate" '{\
if (substr($0,1,8) == "PKGINST:")
__{h1=$0
___h2=""
___h3=""
___h4=""
___h5=""
___h6=""}
if (substr($0,1,5) == "NAME:") h2=$0
if (substr($0,1,9) == "CATEGORY:") h3=$0
if (substr($0,1,8) == "VERSION:") h4=$0
if (substr($0,1,8) == "BASEDIR:") h5=$0
if (substr($0,1,5) == "DESC:") h6=$0
if (substr($0,1,9) == "INSTDATE:")
___if ($2 " " $3 " " $4 == reqdate)
_____{print ""
______print h1
______print h2
______print h3
______print h4
______print h5
______print h6
______print}
}' a_file

exit 0

The script stores the lines of interest and prints them all if it gets a match on INSTDATE. When it hits a PKGINST line, it clears all the hold lines. Without this, if some package listings lack some of those lines, the script could print hold lines left over from a previous package. If every package listing always includes all of the lines of interest, clearing the hold lines is not required.
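
The same hold-and-reset idea can be sketched more compactly by keeping the held lines in a single string that is overwritten at every PKGINST line. This is a variation, not the script above: pkg_by_date is a made-up name, it reads standard input, and it compares the first field ($1) rather than using substr, on the assumption that the labels are separated from their values by whitespace.

```shell
# pkg_by_date "Mon DD YYYY"
# Hypothetical helper: print the lines of interest for every package
# whose INSTDATE matches the requested date.
pkg_by_date() {
    awk -v reqdate="$1" '
        # A new package starts: overwriting held discards stale lines.
        $1 == "PKGINST:" { held = $0; next }

        # Append the other lines of interest to the held text.
        $1 ~ /^(NAME|CATEGORY|VERSION|BASEDIR|DESC):$/ {
            held = held "\n" $0
            next
        }

        # On a date match, print a blank line, the held lines, and INSTDATE.
        $1 == "INSTDATE:" && ($2 " " $3 " " $4) == reqdate {
            print ""
            print held
            print
        }
    '
}
```

Because the comparison is on $1, leading spaces in front of the labels do not matter, which the fixed substr offsets are sensitive to.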

There are several ways to pass a shell parameter to awk. I think the -v option is pretty common. If that does not work for you, we can use one of the other approaches to feed reqdate to awk.
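
For reference, here is a sketch of three common ways to get a shell value into awk. The function names are made up for illustration and each one reads standard input; whether a given awk supports -v or ENVIRON depends on the implementation, so treat this as something to try rather than a guarantee.

```shell
# Each hypothetical function prints the INSTDATE lines whose date equals $1.

# 1. -v: the variable is assigned before the program starts.
by_v() {
    awk -v d="$1" '($2 " " $3 " " $4) == d'
}

# 2. Assignment operand after the program text: the variable is set when
#    awk reaches that operand, so it is not yet visible in a BEGIN block.
#    The "-" operand tells awk to read standard input afterwards.
by_operand() {
    awk '($2 " " $3 " " $4) == d' d="$1" -
}

# 3. Environment: the value is passed in the environment and read
#    through the ENVIRON array.
by_environ() {
    REQDATE="$1" awk '($2 " " $3 " " $4) == ENVIRON["REQDATE"]'
}
```

For example, `echo "INSTDATE: Feb 04 2000 09:23" | by_operand "Feb 04 2000"` should print the line back.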


Response Number 7
Name: raymond
Date: November 7, 2002 at 11:25:25 Pacific
Reply:

James,

I have updated the script to make it work. See below.

However, I am getting a bit greedy here.

I would like to run it as a single command line, something like

#pkginfo -l | ./pkgsearch "Feb 14 2000"

Can you help with that as well?

Thanks.

#!/bin/sh

if [ $# -ne 3 ] ; then
echo '\nAssume pkginfo_list already exists'
echo '\nExample: pkgsearch Feb 14 2000\n'
exit 1
fi


reqdate=$*

/usr/xpg4/bin/awk -v reqdate="$reqdate" '{\

if (substr($0,4,8) == "PKGINST:")
{h1=$0
h2=""
h3=""
h4=""
h5=""
h6=""

}

if (substr($0,7,5) == "NAME:") h2=$0
if (substr($0,3,9) == "CATEGORY:") h3=$0
if (substr($0,4,8) == "VERSION:") h4=$0
if (substr($0,4,8) == "BASEDIR:") h5=$0
if (substr($0,7,5) == "DESC:") h6=$0
if (substr($0,3,9) == "INSTDATE:")
{
if ($2 " " $3 " " $4 == reqdate)
{print ""
print h1
print h2
print h3
print h4
print h5
print h6
print}
}
}' pkginfo_list

exit 0


Response Number 8
Name: James Boothe
Date: November 7, 2002 at 14:28:08 Pacific
Reply:

Yep, that's easy, but out of time today. If no one answers before tomorrow morning, I will at that time.


Response Number 9
Name: James Boothe
Date: November 8, 2002 at 13:09:16 Pacific
Reply:

Instead of naming pkginfo_list as the input file, remove that filename and pipe the pkginfo output into awk. So, instead of:

awk -v reqdate="$reqdate" '{\
etc etc etc
}' pkginfo_list

it would be:

pkginfo -l | awk -v reqdate="$reqdate" '{\
etc etc etc
}'

I wrote plain awk here just to keep the lines shorter. Of course you would still need to call the xpg4 awk (/usr/xpg4/bin/awk).
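
Put together as a small filter, the piped form might look like the sketch below. It is deliberately cut down: pkgsearch is written here as a shell function, it takes the date as one quoted parameter, and it prints only the package names rather than the full set of lines. It calls plain awk, so on a system where only the xpg4 awk accepts -v, the command name would need changing.

```shell
# pkgsearch "Mon DD YYYY"
# Hypothetical cut-down filter: reads pkginfo -l style text on standard
# input and prints the name of each package installed on the given date.
pkgsearch() {
    if [ $# -ne 1 ]; then
        echo 'Usage: pkginfo -l | pkgsearch "Feb 14 2000"' >&2
        return 1
    fi
    # No file operand follows the program, so awk reads standard input.
    awk -v reqdate="$1" '
        $1 == "PKGINST:" { pkg = $2 }
        $1 == "INSTDATE:" && ($2 " " $3 " " $4) == reqdate { print pkg }
    '
}
```

The date is quoted on the command line so that it arrives as a single parameter, which is why the check is for one argument instead of three.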


Response Number 10
Name: Raymond
Date: November 12, 2002 at 08:55:32 Pacific
Reply:

Thanks, James

