I work under SunOS 5.10.
I am looking for a shell script (using awk, sed, ...?) that removes duplicate lines within the same block of text in the input file. Here is a short example of the input file (which is a script for PlantUML):
@startuml graph1.png
object1 --> link1
link1 --> object2
object2 --> link2
link2 --> object3
object3 --> link3
link3 --> object1
object1 --> link1
@enduml
@startuml graph2.png
object4 --> link4
link4 --> object1
object1 --> link1
link1 --> object5
link1 --> object5
@enduml

Here is the result I would like to get:
@startuml graph1.png
object1 --> link1
link1 --> object2
object2 --> link2
link2 --> object3
object3 --> link3
link3 --> object1
@enduml
@startuml graph2.png
object4 --> link4
link4 --> object1
object1 --> link1
link1 --> object5
@enduml

Explanations:
A block starts with @startuml and ends with @enduml.
In the first block, the duplicate line to be removed is: object1 --> link1
In the second block, the duplicate line to be removed is: link1 --> object5
I'm going to reply to myself! :-)
After some searching and testing, I think I found a solution to my problem with the following command:

nawk '/startuml/ { startuml = $2 } !x[startuml,$0]++' myfile.txt
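
For anyone reading this later, here is how I understand the one-liner: on every line containing startuml, the second field (the graph file name) is saved in the variable startuml. The pattern !x[startuml,$0]++ is true only the first time a given pair (block name, line) is seen, and a pattern with no action prints the line. One caveat: two blocks with the same file name would be treated as a single block. Below is a small variation that keys on a block counter instead of the name. It is only a sketch: the function name dedup_blocks and the AWK variable are my own, and I assume nawk (or /usr/xpg4/bin/awk) is what you would set AWK to on Solaris.

```shell
#!/bin/sh
# Sketch: drop repeated lines inside each @startuml ... @enduml block.
# AWK is an assumption: set AWK=nawk (or /usr/xpg4/bin/awk) on Solaris,
# since the old /usr/bin/awk there may not accept this syntax.
AWK=${AWK:-awk}

# dedup_blocks is a hypothetical helper name, not an existing tool.
dedup_blocks() {
    "$AWK" '
        # A new block starts: bump the block counter.
        /^@startuml/ { block++ }
        # Print a line only the first time it appears in the current block.
        !seen[block, $0]++
    ' "$@"
}
```

It can be used as dedup_blocks myfile.txt > myfile.dedup.txt, or at the end of a pipe, since it reads standard input when no file is given.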
