Solved A shell script to remove certain lines from a text file

February 7, 2012 at 04:55:38
Specs: CentOS
I have a text file containing data that I have to add to a database line by line. Once I have added the lines to the database, I want them removed so that they are not processed again, which would lead to redundancy errors.
My data in the text file looks something like this:
ID no Name Dept Score1 Score2 Score3
x111 XXX admin 55 66 45
x112 AAA research 34 76 48
x113 BBB admin 22 87 54


My question is this: I will be processing the details based on the ID no, so I want to delete the line with that ID no after processing it. I will retrieve the ID no and store it in a variable called empID. Is it possible to delete the line using the value of empID ($empID)?
Please help.




#1
February 7, 2012 at 09:19:05
There are any number of ways of doing this: shell script, awk or perl script, etc. The basic algorithm is to read through the file and parse the first field of each line, then print the line, or skip it if the first field equals the empID. I chose to use the shell, but what is best depends on what you know and how you are processing the data for your other requirements:

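#!/bin/bash

rm -f newtextfile.txt
empID="x112" # skip this ID
flag=0
# -r keeps backslashes in the data from being treated as escapes
while read -r ID therest
do
   # quoting "$empID" forces a literal comparison, not a glob match
   if [[ $ID == "$empID" ]]
   then # skip if equal
      flag=1
      continue
   fi
   echo "$ID $therest"
done < textfile.txt > newtextfile.txt
# rename the file if there was a change
if [[ $flag -eq 1 ]]
then
   mv newtextfile.txt textfile.txt
else
   rm -f newtextfile.txt # nothing was removed; discard the copy
fi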
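
Since awk was mentioned as an option, here is a rough sketch of the same first-field comparison in awk. The temporary directory and the sample rows are only there so the example runs on its own; with real data you would point datafile at your own file.

```shell
#!/bin/bash
# Sketch only: the temp directory and sample rows are made up for illustration.
workdir=$(mktemp -d)
datafile="$workdir/textfile.txt"
printf '%s\n' \
  'x111 XXX admin 55 66 45' \
  'x112 AAA research 34 76 48' \
  'x113 BBB admin 22 87 54' > "$datafile"

empID="x112" # ID of the record to drop

# Keep every line whose first whitespace-separated field is not exactly $empID.
# Appending "" to both sides forces a string comparison, never a numeric one.
awk -v id="$empID" '($1 "") != (id "")' "$datafile" > "$datafile.new" &&
   mv "$datafile.new" "$datafile"

remaining=$(cat "$datafile")
echo "$remaining"

rm -rf "$workdir"
```

Because the comparison is on the whole first field, an ID that merely appears elsewhere in a line is left alone.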

Let me know if you have any questions.



#2
February 8, 2012 at 22:41:42
Thanks a lot for your reply, but I found a way to do it myself using just this single line of code.

sed -i "/value of empid/d" filename


The code I'm using is something like this:

#!/bin/sh
echo "Enter the employee id to be deleted : "
read empid
sed -i "/$empid/d" newtextfile1.txt

Hope my code helps someone else out here.



#3
February 9, 2012 at 09:15:31
If this works for you, fine, but be aware that you are not matching against the Empid field only: the pattern is matched against the entire line. If by chance the Empid text appears anywhere else in another record, that line will also be deleted. For example, with this data:

x111 XXX adminx112 55 66 45
x112 AAA research 34 76 48

deleting x112 deletes both records.
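
One way to check this before deleting anything is a dry run that only counts matches. The sketch below uses grep -c, which counts the lines the same pattern matches, and an awk count of exact first-field matches for comparison. The temporary file is just the two sample rows above.

```shell
#!/bin/bash
# Illustration only: the temporary file holds the two sample rows.
sample=$(mktemp)
printf '%s\n' \
  'x111 XXX adminx112 55 66 45' \
  'x112 AAA research 34 76 48' > "$sample"

empid="x112"

# Lines containing $empid anywhere, i.e. the lines sed "/$empid/d" would delete.
would_delete=$(grep -c "$empid" "$sample")

# Lines whose first field is exactly $empid (string comparison).
first_field_hits=$(awk -v id="$empid" '($1 "") == (id "") { n++ } END { print n + 0 }' "$sample")

echo "matched anywhere: $would_delete, matched as first field: $first_field_hits"

rm -f "$sample"
```

If the two counts differ, the unanchored pattern is catching records it should not.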

Also, keep in mind that the "edit in place" option -i is a GNU sed extension and is not part of POSIX sed. On a legacy, non-Linux platform it may be missing or behave differently, so it is not portable.



#4
February 9, 2012 at 22:14:12
Nails,

Thanks a lot for pointing that out. I wasn't fully aware of how -i works. Now that I'm clear, I think I'll use your code, as I do have some records like the ones you mentioned in my DB.

And since I'll be working only on a Linux platform, I don't think portability is an issue.

Thanks again! I'll let you know if I have any more doubts.



#5
February 10, 2012 at 06:54:30
✔ Best Answer
How stupid of me! Since the empid starts each row, simply use a regular expression that matches the empid only at the beginning of the line, and delete the lines that match:

sed -i "/^$empid/d" textfile.txt

The ^ anchors the pattern to the start of the line, so ^$empid matches only lines that begin with the value of $empid.

This is a good link describing regular expressions:
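
One caveat: an anchored pattern still matches any ID that merely starts with the same characters, so x112 would also hit x1120. A possible refinement, assuming the ID is always followed by whitespace as in the sample data, is to require a whitespace character right after it. The sketch below prints to standard output instead of using -i so the file is left untouched. The temporary file and the x1120 row are invented for the demonstration.

```shell
#!/bin/bash
# Illustration only: sample rows, including a made-up ID x1120.
sample=$(mktemp)
printf '%s\n' \
  'x112 AAA research 34 76 48' \
  'x1120 CCC admin 10 20 30' > "$sample"

empid="x112"

# Anchored only at the start: x1120 also begins with x112, so both lines go.
prefix_result=$(sed "/^${empid}/d" "$sample")

# Anchored at the start and followed by whitespace: only the x112 record goes.
field_result=$(sed "/^${empid}[[:space:]]/d" "$sample")

echo "start-anchored pattern leaves: '$prefix_result'"
echo "with trailing whitespace leaves: '$field_result'"

rm -f "$sample"
```

Note that the value of $empid is still interpreted as a regular expression, so an ID containing characters such as . or * would need escaping.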

http://linuxreviews.org/beginner/ta...



#6
February 13, 2012 at 23:32:16
Nails,

This was the expression I was looking for. I guess I left out the ^ symbol in my code and used it as

sed -i "/$empid/d" textfile.txt

instead of

sed -i "/^$empid/d" textfile.txt

But thanks for your code anyway, as it might prove helpful if I'm working with records where each line doesn't start with the empID value.

