
Grep expression help


Name: Trent
Date: May 2, 2003 at 15:26:03 Pacific
OS: Mac OS X 10.2.5
CPU/Ram: 500Mhz G4/768 MB
Comment:

Hi all, here's hoping someone can give me a pointer or two with this problem...
I've got a huge text file that I'm cleaning up into tab-delimited data for import into a database. I've got it all cleaned up except for some of the tabbing. Here's the problem:

Using grep, I need to come up with an expression that finds lines that (1) contain fewer than 5 tabs or (2) contain more than 5 tabs.

I know enough about grep to be dangerous and have been reading over the docs on it, but I haven't been able to construct anything useful yet. I'm using BBEdit on Mac OS X to put together this text file, so all I need is the expression that actually does the search. I don't need to worry about returning or formatting the matched data; BBEdit handles all that. Thanks.




Response Number 1
Name: David Perry
Date: May 2, 2003 at 20:24:30 Pacific
Reply:

Try these options to see if BBEdit supports the same syntax.

? - matches zero or one of the preceding character.

{n} - matches n copies of the preceding character.

{n,m} - matches at least n but not more than m copies of the preceding character.

{n,} - matches at least n copies of the preceding character.
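
To see how these quantifiers behave, here is a minimal sketch using `grep -E` at the terminal. The sample lines and variable names are made up for illustration; `-x` makes the pattern match the whole line and `-c` prints the number of matching lines. Whether BBEdit's search dialog treats them identically is something to verify there.

```shell
# Hypothetical sample: lines made of 1 to 4 copies of the letter "a".
sample=$(printf 'a\naa\naaa\naaaa\n')

# Count the lines matched in full by each quantifier
# (-x anchors the pattern to the whole line, -c counts matching lines).
exactly_two=$(printf '%s\n' "$sample" | grep -E -x -c 'a{2}')
two_to_three=$(printf '%s\n' "$sample" | grep -E -x -c 'a{2,3}')
three_or_more=$(printf '%s\n' "$sample" | grep -E -x -c 'a{3,}')
zero_or_one=$(printf '%s\n' "$sample" | grep -E -x -c 'a?')

echo "{2}: $exactly_two  {2,3}: $two_to_three  {3,}: $three_or_more  ?: $zero_or_one"
```

Note that each quantifier applies to consecutive copies of the item directly before it.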



Response Number 2
Name: Trent
Date: May 5, 2003 at 09:19:36 Pacific
Reply:

It does indeed support all of those options and all other options that grep supports. However, I'm having trouble constructing a string that tells me, line by line, which lines have either fewer than 5 tabs or more than 5 tabs. Those are the lines I want to know about and fix. Each line will be a record in a database and there are over 8000 lines. Total pain, but this is the final cleanup. Thanks.



Response Number 3
Name: James Boothe
Date: May 5, 2003 at 13:55:13 Pacific
Reply:

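These expressions use the letter X as a placeholder, so you need to substitute a tab character representation for each X. Each pattern matches a line with exactly five tabs, and the -v option inverts the match, so grep prints the lines that have fewer or more than five.

The first expression uses basic REs: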

grep -v '^[^X]*X[^X]*X[^X]*X[^X]*X[^X]*X[^X]*$' myfile

And this expression uses EREs, which allow us to represent that repeating multi-character pattern with the {5} construct:

egrep -v '^([^X]*X){5}[^X]*$' myfile
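
As a rough sketch of the substitution, the ERE version can be run with a literal tab in place of each X. The variable names (`tab`, `sample_file`, `bad_lines`) and the sample data are assumptions for illustration, and `grep -E` is used as the equivalent of `egrep`. The `-n` option is added so the output gives line numbers.

```shell
# A literal tab character; the name "tab" is our own choice.
tab=$(printf '\t')

# Hypothetical sample file: lines with 4, 5, and 6 tabs respectively.
sample_file=$(mktemp)
printf 'a\tb\tc\td\te\n'       >  "$sample_file"
printf 'a\tb\tc\td\te\tf\n'    >> "$sample_file"
printf 'a\tb\tc\td\te\tf\tg\n' >> "$sample_file"

# -n prefixes each reported line with its line number;
# -v reports the lines that do NOT have exactly five tabs.
bad_lines=$(grep -E -n -v "^([^${tab}]*${tab}){5}[^${tab}]*\$" "$sample_file" \
  | cut -d: -f1 | paste -s -d ' ' -)

echo "lines to fix: $bad_lines"
rm -f "$sample_file"
```

Here the first and third lines are reported, since only the second has exactly five tabs.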



Response Number 4
Name: gcl
Date: May 15, 2003 at 09:37:21 Pacific
Reply:

GREP SEARCHING FOR TABS USING THE TERMINAL

Broken down, there are 4 possibilities for how tabs might
occur on a line:
1. find tabs only
^([[:cntrl:]]{n,m})$

2. find non-control chars, then n tabs
^([^[:cntrl:]])[[:cntrl:]]{n,m}$

3. find n tabs, then non-control chars
^[[:cntrl:]]{n,m}[^[:cntrl:]]$

4. find non-control chars, n tabs, non-control chars
^[^[:cntrl:]][[:cntrl:]]{n,m}[^[:cntrl:]]$


Now, pipe it all together!

find more than 5 tabs (ugly, ain't it!):
grep -E '^([[:cntrl:]]{6,})$|^([^[:cntrl:]])[[:cntrl:]]{6,}$|^[[:cntrl:]]{6,}[^[:cntrl:]]$|^[^[:cntrl:]][[:cntrl:]]{6,}[^[:cntrl:]]$' filename

find fewer than 5 tabs: just substitute `0,4' for `6,' above
(there are 4 occurrences)
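
One thing worth checking with patterns of this shape: the `{n,m}` quantifier counts consecutive repetitions, so `[[:cntrl:]]{6,}` looks for six or more control characters in a row (and `[:cntrl:]` covers other control characters besides tab). The sketch below, on made-up sample data with our own variable names, compares possibility 4 against a per-line count in the style of Response 3.

```shell
tab=$(printf '\t')

# Hypothetical samples (our own, not from the thread):
#   line 1: six tabs in a row between two single characters
#   line 2: six tabs spread out between fields
sample=$(printf 'a\t\t\t\t\t\tb\na\tb\tc\td\te\tf\tg\n')

# Same shape as possibility 4, with n,m written as "6," (six or more in a row).
run_matches=$(printf '%s\n' "$sample" \
  | grep -E -n '^[^[:cntrl:]][[:cntrl:]]{6,}[^[:cntrl:]]$' \
  | cut -d: -f1 | paste -s -d ' ' -)

# Counting style from Response 3: six or more tabs anywhere on the line.
count_matches=$(printf '%s\n' "$sample" \
  | grep -E -n "^([^${tab}]*${tab}){6,}" \
  | cut -d: -f1 | paste -s -d ' ' -)

echo "consecutive-run pattern matches lines: $run_matches"
echo "per-line count pattern matches lines:  $count_matches"
```

On this sample the run-based pattern reports only the first line, while the counting pattern reports both, so for tab-delimited records the counting form is likely the one to use.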


