Processing IRC chat logs

December 5, 2009 at 07:40:27
Specs: Windows XP
I'm looking to batch process some text chat logs, essentially performing some repetitive functions and saving me some time and headache.

The text I want to KEEP will be between "quotation marks", and the text I want to DELETE will be in between (parentheses).

I have Word 2003, and can find and replace text without a problem, but I can't seem to find a program capable of performing the functions above.

Can you help?

Thanks in advance,

Mike M

December 5, 2009 at 09:19:10
Please give us some examples of a chat log that you would like to process.

Having some actual (example) layout/content will help us be more specific with our suggestions.

December 5, 2009 at 11:10:02
God bless you guys for the help, I appreciate it.

These IRC chat logs are for D&D roleplaying sessions. Chat in parentheses, either double (( )) or single ( ), indicates out-of-character chat. An emote, where a character refers to themselves, is indicated with -- * nickname -- without the dashes.
And in-character chat is surrounded by quotation marks.

Here's an example that includes everything I want to process.

[18:05] <@Lorelai> Is that something you made up or is it an actual clue leading us to do something?
[18:06] <@Widersinn> Lat time, the group was divided into two parts. One team continued shopping from the previous session. The other team stayed at he tavern for lunch. At the end of the session the teams rejoined at the tavern.
[18:07] <Grennik> (( I asked Widersinn if anything unusual has happened in the time he's been working there. That was the interesting thing ))
[18:07] <Grennik> (( not metagaming for you to know that either, I brought it up at least three times last session ))
[18:08] <Des`> (( yes he did ))
[18:08] * Sundirra finishs his lunch while listening to the group talk, "So whats the plan for today? I have a meeting i wish to make tomorrow but the rest of today is fine."
[18:08] <Grennik> (( I was juuust mentioning we should take horses around the city, and Marco said "tomorrow" ))
[18:08] <@Lorelai> "A meeting?"
[18:10] <Sundirra> "With a Master armorer/weaponcrafter. Was planing on turning these items into dangerous things."
[18:10] <Des`> "What things would that be?"
[18:10] <@Lorelai> "Oh, the dragon bits?"
[18:10] * Sundirra nods to Lorelai

I want to strip the time stamps, so delete the brackets [ ] and everything in between them.

I want to remove all out of character chat, which is in between parentheses.

And retain all chat that is an emote (following an asterisk) or in character (in quotation marks).

The only program I've run into that gets close to batch editing in that much detail is File Renamer, and it only edits file names.

Thanks for the advice.

Mike M

December 5, 2009 at 11:12:07

Also, in a perfect world, I'd like to be able to retain everything a certain user chats. In this case, Widersinn is the DM, and everything he says needs to be retained.

December 5, 2009 at 15:27:52

I had a look around for an IRC AI bot or an IRC chat log parser.

Didn't have much luck.

December 5, 2009 at 19:55:34
What about this line?

[18:08] <Grennik> (( I was juuust mentioning we should take horses around the city, and Marco said "tomorrow" ))

Per your instructions - remove all out of character chat, which is in between parentheses... retain all chat that is in character (in quotation marks) - we should retain "tomorrow", right?

What about this line, which has no distinguishing punctuation? Retain or delete?

[18:05] <@Lorelai> Is that something you made up or is it an actual clue leading us to do something?

December 6, 2009 at 07:39:39

No, anything in parentheses goes, no matter how it's formatted in between.

And if it's just chat without any punctuation around it, well, it'd be easiest for me to have the option to either keep all of those or delete all of those.

What sort of solution are you thinking of, DerbyDad?
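
As an aside for readers working outside Excel, the rules as clarified so far could be sketched in Python. This is only an illustration under assumptions, not the solution offered in this thread: the names `filter_log`, `dm`, and `keep_plain` are made up for the example, and the timestamp pattern is taken from the sample log above. The sketch checks the DM first, so a DM line is kept even if it contains parentheses.

```python
import re

# Leading "[18:05] " style timestamp (format assumed from the sample log).
TIMESTAMP = re.compile(r"^\[\d{1,2}:\d{2}\]\s*")
# "<@Nick>" speaker field; the prefix characters are an assumption.
SPEAKER = re.compile(r"^<[@+%~&]?([^>]+)>")

def filter_log(lines, dm, keep_plain=False):
    """Return the lines to keep, with timestamps stripped.

    dm         -- nickname whose lines are always kept (hypothetical parameter)
    keep_plain -- keep (True) or drop (False) lines with no distinguishing punctuation
    """
    kept = []
    for raw in lines:
        line = TIMESTAMP.sub("", raw.rstrip("\n"))
        if not line:
            continue                     # skip blank lines
        speaker = SPEAKER.match(line)
        if speaker and speaker.group(1) == dm:
            kept.append(line)            # the DM's lines are always retained
        elif "(" in line:
            continue                     # anything with parentheses is out of character
        elif line.startswith("*") or '"' in line:
            kept.append(line)            # emotes and in-character speech
        elif keep_plain:
            kept.append(line)            # unpunctuated chat, kept only on request
    return kept
```

Calling `filter_log(open("session.log"), "Widersinn")` would then drop the unpunctuated lines, and passing `keep_plain=True` would keep them (the file name is a placeholder).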

December 6, 2009 at 09:59:08
Paste your chat log into an Excel worksheet, starting in A1.

Click the Sheet tab and choose View Code.

Paste the code below into the window that opens.

When run, it will ask the user for the DM's name. If it doesn't find that name in the log, it will ask the user again. The user can try again or cancel out of the program.

It will then loop through the log, deleting or retaining lines based on your criteria. When it finds a line without any of the Criteria specified, it will ask the user if they want to Retain that line. The user can say Yes, No or Cancel out of the program.

It's not perfect...

When entering the DM's name, if the user enters any string that the code can find, it will assume that that is the DM's name, even if the user made a typing error. With some more code, it's possible to fix that.

The Search routine isn't bulletproof. These lines would be retained, even though, based on your criteria, the user should be asked what to do with them:

This line contains the DM's name, so it would be retained:

<@Lorelai> Is that something Widersinn made up or is it an actual clue leading us to do something?

This line contains quotation marks, so it would be retained.

<@Lorelai> Is that something you made up or is it an actual "clue" leading us to do something?

Both of these issues can probably be dealt with by adding more code, but I didn't want to do any more work if this solution does not fit your needs.
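
To illustrate what "more code" might look like, here is a minimal Python sketch (not VBA, and not part of the macro below) of stricter checks: the DM is matched against the speaker field only, and in-character speech must open the message with a quotation mark. The names `classify` and `CHAT`, and the assumption that the timestamp has already been removed, are choices made for this example.

```python
import re

# "<@Nick> message" once the timestamp is gone (layout assumed from the sample log).
CHAT = re.compile(r"^<[@+%~&]?([^>]+)>\s*(.*)$")

def classify(line, dm):
    """Return 'keep', 'delete', or 'ask' for one timestamp-free log line."""
    m = CHAT.match(line)
    nick, message = m.groups() if m else (None, line)
    if nick == dm:
        return "keep"        # the DM is the speaker, not merely mentioned
    if "(" in line:
        return "delete"      # out-of-character chat
    if line.startswith("*") or message.startswith('"'):
        return "keep"        # emote, or speech that opens with a quote
    return "ask"             # no distinguishing punctuation: let the user decide
```

With these checks, both problem lines above come back as "ask" instead of being kept silently.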

Sub DD_Chat()
Dim nxtRow, firstSpace, lastChat_Line, Chat_Line
Dim DM, no_DM, NotSure

'Get DM's name from user
get_DM:
  DM = Application.InputBox("Enter DM's name")
   If DM = False Then Exit Sub

'Make sure DM's name is found in log
   With Cells
    If .Find(DM) Is Nothing Then
     no_DM = MsgBox("DM's Name Not Found", vbOKCancel)
      If no_DM = vbCancel Then Exit Sub
      GoTo get_DM
    End If
   End With

'Find row of last entry
  lastChat_Line = Cells(Rows.Count, "A").End(xlUp).Row

'Delete Time Stamps (everything up to and including the first space)
'InStr returns 0 when there is no space, which leaves the cell unchanged
  For nxtRow = 1 To lastChat_Line
   firstSpace = InStr(Cells(nxtRow, 1), " ")
   Cells(nxtRow, 1) = Right(Cells(nxtRow, 1), Len(Cells(nxtRow, 1)) - firstSpace)
  Next nxtRow

'Loop through lines starting at bottom and working up
'You must start at the bottom when deleting lines
  For Chat_Line = lastChat_Line To 1 Step -1
   With Cells(Chat_Line, "A")
'If line contains a (, delete the line
    If Not .Find("(") Is Nothing Then
      .EntireRow.Delete
      GoTo Chk_Nxt
    End If
'If line contains a * or " or the DM name, do not delete it
    If Not .Find(DM) Is Nothing Or _
       Not .Find("""") Is Nothing Or _
       Left(Cells(Chat_Line, "A"), 1) = "*" Then GoTo Chk_Nxt
'If line doesn't fit any of the above criteria, query the user
     NotSure = MsgBox(Cells(Chat_Line, "A") & vbCrLf & vbCrLf & "Retain?", _
               vbYesNoCancel, "What about this line?")
      If NotSure = vbCancel Then Exit Sub
      If NotSure = vbNo Then .EntireRow.Delete
Chk_Nxt:
   End With
  Next Chat_Line
End Sub

December 6, 2009 at 10:18:54
I didn't know that Excel could accept code like that. Is that VBscript? I'm no programmer but I'd still like to know.

I will test this out and let you know how it goes. But if it works like you say it does, it'll save me a mess of time. Thanks very much for your work. Thanks DerbyDad03

December 6, 2009 at 10:35:10
This is my first time trying to run a script from within Excel (I'm using Excel 2003) so I'm going to tell you exactly what I did, so let me know if I'm using it incorrectly.

Startup Excel 2003, widen and heighten cell A1 on a new worksheet.

Copy and paste a portion of chat log in A1

Right click Worksheet1 -> View Code -> copy/paste your code into the blank window that comes up next.

From the drop down menu, select Run -> Run Sub/Userform

Excel brings cell A1 back to the foreground, and a pop-up box asks for the DM's name. I enter "Widersinn" (no quotes) and hit Enter.

All text clears completely out of cell A1; it is then blank.

I tried this twice, same result.

December 6, 2009 at 10:38:34
Also, strangely enough, after the cell is blanked out, it keeps the width I widened it to, but the cell height goes back to default.

December 6, 2009 at 11:35:36
I'm not sure how you are doing your paste, but each line should be in its own cell.

When I copied and pasted your example chat session from Response #2, I opened Excel, selected A1 and did a paste directly into the cell.

Each line was pasted into its own cell.

A1: [18:05] <@Lorelai> Is that something you...
A2: [18:06] <@Widersinn> Lat time, the group was...
A3: [18:07] <Grennik> (( I asked Widersinn...

If you are pasting the data into the formula bar, then it will probably end up all in A1. In that case, the code would indeed delete everything since it is only checking one cell. As soon as it finds any reason to delete the line, it will delete whatever is in that single cell.

December 6, 2009 at 12:06:34
Thanks so very much, that worked just like you said it would.

If you don't mind ... a few additions to the code, and I'm in no hurry for it mind you also.

all lines from GameServ should be retained

all lines from MightyMarvel should be retained.

all lines beginning with (`roll) and with (`mm) should be retained, without the parentheses.

If those can be added, this little program will be ideal.

December 6, 2009 at 12:13:58

To reiterate, GameServ is an IRC server bot we use for dice rolls.

MightyMarvel is another bot we use for random determinations in Marvel Super Heroes RPG.

We use those with the `roll and `mm commands, respectively.

When someone uses one of those commands, the bot responds. I want to keep the command, and then the bot's response. Here's an example of them both for you again:

[20:00] <@GMDog> `roll 1d100
[20:00] <@GameServ> GMDog rolled 1d100: 42 <Total: 42>

[20:09] <+Nitro9> `mm Ex
[20:09] <+MightyMarvel> Nitro9 rolls [ Excellent: 20%, White ]
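
As a rough illustration of this request, a Python sketch (again outside Excel, and only an assumption about how the rule could be written) might treat a line as bot traffic when the speaker is one of the two bots or when the message starts with one of the two commands. The names `BOTS`, `COMMANDS`, and `is_bot_traffic` are invented for the example.

```python
import re

BOTS = {"GameServ", "MightyMarvel"}   # bot nicknames named in this thread
COMMANDS = ("`roll", "`mm")           # commands that trigger the bots

# Optional "[20:00] " timestamp, then "<@Nick> message" (layout assumed from the examples).
CHAT = re.compile(r"^(?:\[\d{1,2}:\d{2}\]\s*)?<[@+%~&]?([^>]+)>\s*(.*)$")

def is_bot_traffic(line):
    """True if the line is a bot command or a bot's reply."""
    m = CHAT.match(line)
    if not m:
        return False
    nick, message = m.groups()
    # str.startswith accepts a tuple, so either command prefix matches.
    return nick in BOTS or message.startswith(COMMANDS)
```

Checking the start of the message, rather than searching the whole line, avoids keeping a line that only mentions a command in passing.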

December 7, 2009 at 05:27:16
Take a look at the code I offered.

You'll see where I use .FIND to find the required strings.

Substitute or add the newly required strings for the ones in the code and things should work fine.

December 7, 2009 at 06:16:51
Alright just to clarify:

'If line contains a * or " or the DM name or a bot command, do not delete it
If Not .Find(DM) Is Nothing Or _
   Not .Find("`roll") Is Nothing Or _
   Not .Find("`mm") Is Nothing Or _
   Not .Find("""") Is Nothing Or _
   Left(Cells(Chat_Line, "A"), 1) = "*" Then GoTo Chk_Nxt

The additions I've made above to your code, would handle it?

I have to ask because I have no idea what I'm doing really.

December 7, 2009 at 07:21:34
It works I tell you, it works. Thanks DerbyDad03

December 7, 2009 at 09:56:38
Not bad for a rookie! ;-)

