Script to translate multi-line text to XLS

Microsoft Windows xp professional w/serv...
March 26, 2010 at 11:49:28
Specs: Windows XP, Dell/2gb
I have a text file report that I need to translate to XLS. There are 2 lines of data for each record that I need to concatenate into one line and then extract the data to map to an XLS excel spreadsheet. See sample below. I want to automate this process as each text file contains 124 pages of data that I need to remap to Excel.

Token Report Date: 02/12/2010 08:20:19
All Tokens Page: 1 of 124


Serial No. User Name Token Status Last Login
-->Token Type/Auth with/Algorithm Start Date Shutdown Date

000030470765 Bill Ding Enabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470766 Joe South Enabled 01/21/2010 19:25:17
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00

Thanks in advance.


See More: Script to translate multi-line text to XLS

Report •


#1
March 26, 2010 at 16:17:34
@echo off & setlocal enabledelayedexpansion
:: put outer for-loop here in order to do multiple files:
for %%y in (*.log) do (
set /a cc=0
for /f "tokens=*" %%z in (%%y) do (
set /a cc+=1
if cc equ 1 (
set xx=%%z
set xx=!xx:,=_!
set xx=!xx: =,!
) else (
for /f "tokens=1-14 delimiters=/:" %%a in ("%%z") do (
set xx=!xx,%%a,%%b,%%c,%%d/%%e/%%f,%%g:%%h:%%i,%%j/%%k/%%l,%%m:%%n
)
>> %%~ny.csv echo !xx!
set /a cc=0
)
)
)
::----- end ------
I had a whole slew of questions but i decided to write a trial-balloon first.
I assumed you wanted csv file for import (as opposed to TABs, which are kind of a pain)
I made other assumptions that are also obvious, so i'll wait for feedback on how many
of them were wrong, then we or you can go from there. it's not tested...
(one especially glaring: firstname lastname are put into two different fields. to put them
into one will require a tad more effort but not much, and i replaced comma with underscore
in that status field)

Report •

#2
March 26, 2010 at 18:54:16
get gawk for windows(see my sig), then use this one liner to join your lines

c:\test> type file
Serial No. User Name Token Status Last Login
-->Token Type/Auth with/Algorithm Start Date Shutdown Date
000030470765 Bill Ding Enabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470766 Joe South Enabled 01/21/2010 19:25:17
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00

c:\test> gawk "{printf $0;getline;print}" file
Serial No. User Name Token Status Last Login-->Token Type/Auth with/Algorithm Start Date Shutdown Date
000030470765 Bill Ding Enabled,NewPIN 01/01/1986 00:00:00-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470766 Joe South Enabled 01/21/2010 19:25:17-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00

GNU win32 packages | Gawk


Report •

#3
March 29, 2010 at 11:41:14
Neither script worked. The original file has headers and page numbers that have to be removed and then the remaining two lines/each record combined to one line, then the file converted to CSV file or XLS file format. see below:


Token Report Date: 02/12/2010 08:20:19
All Tokens Page: 1 of 124


Serial No. User Name Token Status Last Login
-->Token Type/Auth with/Algorithm Start Date Shutdown Date

000030470765 Bill Ding Enabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470766 Joe South Enabled 01/21/2010 19:25:17
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470767 Jack Frost Enabled 02/12/2010 13:18:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470768 Pamela Bowler Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470769 Sandolph Jones Enabled,NewPIN 01/20/2010 20:25:44
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470770 Jessica Rabbit Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470771 Emily Carebear Disabled,NewPIN 01/19/2010 20:47:16
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470772 Mickey Mouse Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00

Token Report Date: 02/12/2010 08:20:19
All Tokens Page: 2 of 124


Serial No. User Name Token Status Last Login
-->Token Type/Auth with/Algorithm Start Date Shutdown Date

000030470773 John Jabes Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470774 Sloopy Sales Enabled 02/11/2010 19:27:13
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470775 Rubber Bandman Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470776 Geri Atrics Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470777 Philip Mycar Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470778 Liberty Valence Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470779 John Wayne Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00
000030470780 Jeb Stuart Disabled,NewPIN 01/01/1986 00:00:00
-->Key Fob/Passcode/AES 03/16/2006 00:00:00 08/31/2010 00:00

Token Report Date: 02/12/2010 08:20:19
All Tokens Page: 3 of 124


Report •

Related Solutions

#4
March 29, 2010 at 14:09:29
NBRANE:

Here is the output from running your script:

delimiters=/:" was unexpected at this time.
about 124 times - the same number of pages in the output.
C:\Program Files\GnuWin32\bin>


Report •

#5
March 29, 2010 at 16:41:38
I've encountered that before, but not consistantly, where delimiters balks at the colon up against the quote. Some funky bug that MS never squashed. try swapping the delimiters (which i should have done in first place):
(also adding skip=, set that to ever how many header-lines need to be skipped)
for /f "skip=6 tokens=1-14 delimiters=:/" %%a in ("%%z") do (

i will try it on this end to test, after some coffee


Report •

#6
March 29, 2010 at 17:11:26
then this should do it

c:\test> gawk "!/Token Report/&&!/All Tokens/&&NF{printf $0;getline;print}" file

read the gawk documentation if you want to learn how to use it.

GNU win32 packages | Gawk


Report •

#7
March 30, 2010 at 09:11:23
NBRANE:

Same error results. no difference.

ghostdog:

results still leave headers of each page but concatenate the two lines per user into a single line. I still need to remove the header lines from each page, and then export to CSV file for reading into Excel.


Report •

#8
March 30, 2010 at 10:39:12
You know what might be easier? To just write a VBA macro to import that data for you, instead of making and formatting a CSV file for Excel to import. At the very least, it simplifies intermediate steps. (Also, I am a horrible VBA programmer.)

Sub importFile()
  Const filePath As String = "C:\SomeFile.txt"
  Dim cell As Range
  Set cell = Workbooks.Add.Worksheets(1).Cells(1)
  
  With CreateObject("Scripting.FileSystemObject").OpenTextFile(filePath)
    Dim line As String
    Dim cnt As Integer
    Do Until .AtEndOfStream
      'Skip headers (Next 7 lines)
      cnt = 0
      Do Until .AtEndOfStream Or cnt = 7
        .ReadLine
        cnt = cnt + 1
      Loop
      
      'Break on Ctrl-L (Form feed char)
      Do Until .AtEndOfStream
        line = Trim(.ReadLine)
        If line = Chr(12) Then _
          Exit Do
        cell.Value = line & " " & Trim(.ReadLine)
        Set cell = cell.Offset(1, 0)
      Loop
    Loop
  End With

  'Now split data across columns
  With cell.EntireColumn
    .TextToColumns .Cells(1), xlDelimited, , True, , , , True
  End With
End Sub

EDIT: I win by 2 seconds. :P
VVVVVVVVVVVVVVVVVVV


Report •

#9
March 30, 2010 at 10:39:14
sorry, lots of errors when finally tested. hopefully fixedem:
@echo off>test.csv & setlocal enabledelayedexpansion
for %%y in (*.txt) do (
findstr /i /v "token" %%y>temp.fil
set /a cc=0
for /f "tokens=*" %%z in (temp.fil) do (
set /a cc+=1
if !cc! equ 1 (
set xx=%%z
set xx=!xx:,=_!
set xx=!xx: =,!
) else (
set test=%%z
set test=!test: =,!
set test=!test:--^>=!
set xx=!xx!,!test!
>> test.csv echo !xx!
set /a cc=0
)
)
)
::------ end -----
i had to go on the assumption that the word "token" does not appear anywhere in any of the data-blocks, and that lines don't end with a space (the comma inserted between xx and test right before output). the temp file eliminates a lot of headache trying to count lines.

Report •

#10
March 30, 2010 at 18:35:30
Lol! yeah, i was gonna annotate "crossed in the mail" but i guess that was rather obvious! photo-finish your horse won( with points for style, too!).

Report •

#11
March 31, 2010 at 06:33:58
Nbrane:

Your script worked on the test file, but on the actual file that contains 124 pages of data, the output was scrambled all over the excel spreadsheet. Sample below. Not sure if there are embedded spaces or special characters involved.

02470765 Rich Meeks Enabled_NewPIN
02470766 Andrew Buzzard Enabled
02470767 Kirk Job
02470768 Pamela Bounce Disabled_NewPIN 1/1/1986
02470769 Matthew Napkin Enabled_NewPIN


Report •

#12
March 31, 2010 at 06:37:17
Razor:

This script worked. But how do I add Header Rows into this file? Otherwise, that was good.

Ghostdog:

Same question. This was ok, but need to remove all the page numbers, blank lines and keep the headers for the first line. How do I do that with GAWK/SED? your script is close.


Report •

#13
March 31, 2010 at 07:06:08
magickron:

I looked at your sample input but the formatting is all so messed up for me, i don't understand the output that you want. You can try using pastebin to upload a sample of your file, and show the output you want.

GNU win32 packages | Gawk


Report •

#14
March 31, 2010 at 07:30:12
magickron: how do I add Header Rows into this file?
PROTIP: Give the programmers/scriptwriters the FULL list of requirements up front. You'll get what you want, and we'll be happy.

Here's the version that adds headers. As an added bonus, I made it faster.

Sub importFile()
  On Error GoTo importFile_End
  Application.ScreenUpdating = False
  
  Const filePath As String = "H:\SomeFile.txt"
  Dim cell As Range
  Dim headerLine As String
  Set cell = Workbooks.Add.Worksheets(1).Cells(6, 1)
  
  With CreateObject("Scripting.FileSystemObject").OpenTextFile(filePath)
    Dim line As String
    Dim cnt As Integer
    
    'QnD hack to keep first set of headers
    headerLine = .ReadLine
    cnt = 0
    Do Until .AtEndOfStream Or cnt = 6
      .ReadLine
      cnt = cnt + 1
    Loop
    
    Do Until .AtEndOfStream
      'Break on Ctrl-L (Form feed char)
      Do Until .AtEndOfStream
        line = Trim(.ReadLine)
        If line = Chr(12) Then _
          Exit Do
        cell.Value = line & " " & Trim(.ReadLine)
        Set cell = cell.Offset(1, 0)
      Loop
     
      'Skip headers (Next 7 lines)
      cnt = 0
      Do Until .AtEndOfStream Or cnt = 7
        .ReadLine
        cnt = cnt + 1
      Loop

    Loop
  End With

  Dim s As Variant
  'Split data across columns & Add Top Header Line
  With cell.EntireColumn
    .TextToColumns .Cells(6), xlDelimited, , True, , , , True
    .Cells(1).Value = headerLine
    Set cell = .Cells(4)
  End With
  
  'Add Headers
  For Each s In Array("Serial No.", "User", "Name", "Token  Status", _
                      "Last", "Login", "-->Token Type/Auth", _
                      "with/Algorithm", "Start", "Date", "Shutdown", "Date")
    cell = s
    Set cell = cell.Offset(0, 1)
  Next 's
  
importFile_End:
  Application.ScreenUpdating = True
  If Err Then _
    MsgBox "0x" & Hex(Err) & " " & Err.Description
End Sub


Report •

#15
March 31, 2010 at 08:33:53
Razor:

Here is the output from the last VBA script. Output did not align uniformly in the Excel spreadsheet. Output also did not remove all header and blank lines in the spreadsheet.

I wanted one header line in the report, followed by all the data without any breaks or blank lines.

Serial No. User Name Token Status Last Login

02470765 Rich Mann Enabled,NewPIN 1/1/1986 0:00:00
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470766 Alex Trebec Enabled 1/21/2010 19:25:17
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470767 Kahn Job Enabled 2/12/2010 13:18:00
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470768 Pamela Finger Disabled,NewPIN 1/1/1986 0:00:00
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470769 Matthew Naptol Enabled,NewPIN 1/20/2010 20:25:44
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470770 Jennifer Hackness Disabled,NewPIN 1/1/1986 0:00:00
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470771 Anne Marie Danville Disabled,NewPIN 1/19/2010 20:47:16
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00 02470772 Louis Creamer Disabled,NewPIN 1/1/1986 0:00:00
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010 0:00
Token Report Date: 2/12/2010 8:20:19 All Tokens Page: 2 of 124


Report •

#16
March 31, 2010 at 08:51:33
That's odd. Sure you copy/pasted the code right? This is what it produces for me w/ the test data:
Token Report Date: 02/12/2010 08:20:19											


Serial No.	User	Name	Token  Status	Last	Login	-->Token Type/Auth	with/Algorithm	Start	Date	Shutdown	Date

30470765	Bill	Ding	Enabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470766	Joe	South	Enabled	1/21/2010	19:25:17	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470767	Jack	Frost	Enabled	2/12/2010	13:18:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470768	Pamela	Bowler	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470769	Sandolph	Jones	Enabled,NewPIN	1/20/2010	20:25:44	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470770	Jessica	Rabbit	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470771	Emily	Carebear	Disabled,NewPIN	1/19/2010	20:47:16	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470772	Mickey	Mouse	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470773	John	Jabes	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470774	Sloopy	Sales	Enabled	2/11/2010	19:27:13	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470775	Rubber	Bandman	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470776	Geri	Atrics	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470777	Philip	Mycar	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470778	Liberty	Valence	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470779	John	Wayne	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00
30470780	Jeb	Stuart	Disabled,NewPIN	1/1/1986	0:00:00	-->Key	Fob/Passcode/AES	3/16/2006	0:00:00	8/31/2010	0:00


Report •

#17
March 31, 2010 at 09:20:02
Razor:

I copied your code and loaded it as a macro to run in the excel spreadsheet, and when i run it, it provides the following data. I even used the test file and it came out the same way. What am I doing wrong?:

Serial No. User Name Token Status Last

32470765 Rich Mason Enabled,NewPIN
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010
-->Key Fob/Passcode/AES 3/16/2006 0:00:00 8/31/2010


Report •

#18
March 31, 2010 at 09:30:23
I don't know? It kinda looks like the line numbers are off. Where, exactly, does that test data start? Without evidence to the contrary, I assume it starts on the line "Token Report Date:", without any blank lines before it. If that's not the case, the script will be off for the first page.

Report •

#19
March 31, 2010 at 09:39:07
Razor:

I copied my original file token.rpt into a txt file with notepad, and then used your VBA code in Excel. The results came out better, but the token number and date pages still show up. Otherwise, this would work.

Token Report Date: 2/12/2010 8:20:19

Serial No. User Name Token


Report •

#20
March 31, 2010 at 10:45:30
It appears we've hit something of an impasse because my tests are turning out fine. If you trust some random guy on the Internet, PM me and I'll give you an email you can use to send the token.rpt file, and I'll try to find the discrepancy.

Report •


Ask Question