I've done some testing with splitting files using the split command, but it does not seem to work properly. I'll post the exact commands, but it comes down to this:
- splitting a TAR.GZ file into files of just under 2G, using split
- recreating it again using cat, keeping the alphabetic order of the filenames in mind
Both actions do their work, but then when un-gzipping, it states that the file is corrupt.
Should work, no ?
Using this to reconstruct:
1. deleting env001.tar.gz
2. (there are 10 files matching this, from .000 to .009)
for f in `ls -1 env001.tar.gz.* | sort`
do
echo $f
cat $f >> env001.tar.gz
done
3. gzip -d env001.tar.gz
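For reference, the three steps above can be tried on throwaway data first. This is only a sketch: the file names, the 64 KiB piece size and the random payload are placeholders and not the original env001 data. It assumes GNU coreutils, because the numeric-suffix option `-d` of split is a GNU extension.

```shell
#!/bin/bash
# Round-trip sketch: compress, split into numbered pieces, rejoin, verify.
# All names and sizes are made up for the demo.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a small stand-in archive from random bytes.
head -c 200000 /dev/urandom > payload.bin
tar -czf demo.tar.gz payload.bin
sum_before=$(cksum < demo.tar.gz)

# 1. Split into 64 KiB pieces named demo.tar.gz.000, demo.tar.gz.001, ...
#    (-d: numeric suffixes, GNU only; -a 3: suffix width of three characters)
split -b 65536 -d -a 3 demo.tar.gz demo.tar.gz.
rm demo.tar.gz

# 2. Rejoin. The shell expands the glob in sorted order, so ls/sort is not needed.
cat demo.tar.gz.[0-9][0-9][0-9] > demo.tar.gz
sum_after=$(cksum < demo.tar.gz)

# 3. Check the gzip stream before decompressing anything.
if gzip -t demo.tar.gz; then gzip_ok=yes; else gzip_ok=no; fi

echo "before:  $sum_before"
echo "after:   $sum_after"
echo "gzip -t: $gzip_ok"
```

If the two checksums differ on your system, the problem is in the split or join step rather than in gzip.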
tar and zip files are binary files - not ASCII. IMO, I don't think you will have much luck splitting them with the unix/linux split command.
Of course they are binary, but I don't see the issue ...splitting a file involves taking it apart, then recreating it. No matter what the content is, no ? I'll check the man pages on SPLIT, but IMO there is no such restriction.
And if there is, how the f___ do you have to split a file on linux/unix ? By that I mean any file, not only a text file (a tool which can only split text files is useless in itself; I would consider that rather silly)
Ow wait ... it's the CAT command that does not know how to handle binary files !? Problem is not with the SPLIT command ...
Check the resulting files; I assume a few \n are sneaking in there somehow.
tvc: As Razor touched on - and as you are finding out - it's not a case of splitting a binary file - it's gluing it back together.
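One way to check whether extra bytes sneak in during the gluing is to compare the summed size of the pieces with the size of the rejoined file, and then compare the bytes themselves. A minimal sketch on random data (file names are placeholders):

```shell
#!/bin/bash
# Sketch: does cat add or drop bytes when gluing binary pieces together?
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Random bytes, so NUL, \n and \r all occur somewhere in the data.
head -c 100000 /dev/urandom > original.bin
split -b 30000 original.bin piece.

# Sum the on-disk sizes of the pieces without going through cat.
pieces_bytes=0
for p in piece.*; do
    size=$(wc -c < "$p")
    pieces_bytes=$((pieces_bytes + size))
done

# Glue the pieces together and measure the result.
cat piece.* > rejoined.bin
rejoined_bytes=$(wc -c < rejoined.bin)
rejoined_bytes=$((rejoined_bytes + 0))   # strips padding some wc versions add

if cmp -s original.bin rejoined.bin; then identical=yes; else identical=no; fi

echo "pieces:   $pieces_bytes bytes"
echo "rejoined: $rejoined_bytes bytes"
echo "identical to original: $identical"
```

A size difference would point at added line feeds; equal sizes with differing content would point at pieces joined in the wrong order.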
I don't have experience with this command, but since you are trying to split tar files, you might want to check out split-tar:
http://www.informatik-vollmer.de/so...
I may have a test, but I like to stick with original commands ...
does cpio not work?
Forgot about that one (cause I don't like that tool) ... does it have spanning ? (if not, there's no point using it)
I thought it did when i first viewed the manual, but on review, it appears not. the -C option looked like it would chunk the file, but it doesn't seem to work that way. sorry...
Doesn't tar have a chunking option? If so, can't you just tar your gzip'd tar ball? Apparently GNU tar does have such an option; it's -L, --tape-length N, where N is in kilobytes. Not sure if it sticks a number in there somewhere, or if you need an actual removable drive.
I tried something that looked like that, but it resulted in the action being paused, and asking me to change the tape, then to continue to write the next chunk to the same named file (since he thinks it's a tape, but it is not)
Could you run tar in the background, and manually rename the file? Sounds like it'd be a PITA, though.
Yes ... something like that should be possible. I'll have a test that way, the only other (completely different) idea I have is to convert to HEX, then use CAT ... but I haven't found time yet.
edit: the split option in TAR (using --tape-length) does work, but :
- you have to manually come in between, or find a way to script it (something is telling me that would mean re-inventing the wheel)
- you cannot use the --gzip option as he states you cannot combine --gzip and --tape-length (this would be necessary since the splitting may not be done BEFORE the compression ... if you split whilst using TAR, it MUST be compressed already, otherwise ... aaargh)
And, when recreating, it seems to work with the ... well, forgot the option name ... option.
;)
Manually as well, of course.
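A possible way around the manual "change the tape" step: GNU tar accepts several -f options together with -M (--multi-volume) and moves on to the next named file when a volume is full. The sketch below assumes GNU tar and follows the earlier suggestion of tarring an already gzip'd file. The volume names, the 100 KiB volume size and the demo payload are made up.

```shell
#!/bin/bash
# Sketch: GNU tar multi-volume archive written to regular files, no prompts.
# Assumes GNU tar; names and sizes are placeholders.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in for the already compressed tarball.
head -c 300000 /dev/urandom | gzip > env-demo.tar.gz

# -M: multi-volume; -L 100: each volume holds 100 x 1024 bytes.
# Every -f names the next volume, so tar switches files by itself.
# stdin comes from /dev/null so tar fails instead of waiting for a key press
# if it runs out of volume names.
tar -c -M -L 100 -f vol1.tar -f vol2.tar -f vol3.tar -f vol4.tar \
    env-demo.tar.gz < /dev/null

# Restore into a separate directory, naming the volumes in the same order.
mkdir restored
tar -x -M -f vol1.tar -f vol2.tar -f vol3.tar -f vol4.tar \
    -C restored < /dev/null

if cmp -s env-demo.tar.gz restored/env-demo.tar.gz; then
    restored_ok=yes
else
    restored_ok=no
fi
echo "restored copy identical: $restored_ok"
```

The number of -f options has to be large enough for the data; a volume that is never needed is simply not created.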
It's really bizarre that *nix doesn't have a "native" command to just paste files together. even "dos" can do that!
(copy a + b + c dd.out)
I fished around and did find "paste" (imagine that!) and it might work with the -s -d \0 options, but it still wants to put a single linefeed at the end of the very last line. Tail might remove that.
(edit: "tail" won't but "head -c -1" will i think.)
(edit again: i tried it out and it bombed, but i tried it with CAT and it seemed to work ok.
i split up a .zip file into 4 10000 byte chunks, then "catted" them back together, downloaded and did pkunzip -v and it did not complain. something else might be going on with your process...)
It may depend on the content of those files ... this is what I get:
[root@localhost ~]# /root/pakuiteen
env001.tar.gz.000 - Tue Mar 16 04:12:27 CET 2010
env001.tar.gz.001 - Tue Mar 16 04:14:45 CET 2010
env001.tar.gz.002 - Tue Mar 16 04:17:10 CET 2010
env001.tar.gz.003 - Tue Mar 16 04:19:36 CET 2010
env001.tar.gz.004 - Tue Mar 16 04:22:03 CET 2010
env001.tar.gz.005 - Tue Mar 16 04:24:32 CET 2010
env001.tar.gz.006 - Tue Mar 16 04:27:06 CET 2010
env001.tar.gz.007 - Tue Mar 16 04:29:49 CET 2010
env001.tar.gz.008 - Tue Mar 16 04:32:20 CET 2010
env001.tar.gz.009 - Tue Mar 16 04:34:40 CET 2010
env001.tar.gz.010 - Tue Mar 16 04:37:06 CET 2010
Tue Mar 16 04:37:36 CET 2010
gzip: env001.tar.gz: invalid compressed data--format violated
Tue Mar 16 05:25:03 CET 2010
You have new mail in /var/spool/mail/root
[root@localhost ~]#
To create the GZ file from splitted files:
#!/bin/bash
tarfile=env001.tar

if [ ! -f ${tarfile}.gz.* ]
then
echo
echo "ERROR - missing sourcefiles"
echo
exit 1
fi

if [ -f ${tarfile}.gz ]
then
echo
echo "INFO - The target file ($tarfile) already exists"
echo
exit 1
fi

for f in `ls -1 ${tarfile}.gz.* | sort`
do
echo $f - `date`
cat $f >> ${tarfile}.gz
done
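A side note on the join script above: `[ ! -f ${tarfile}.gz.* ]` expands to several file names once more than one piece exists, which the test command does not accept, and the `ls | sort` pipeline is not needed because the shell already sorts glob matches. A hedged sketch of the same step with those two points changed (join_pieces is a made-up helper name, and the demo data is random):

```shell
#!/bin/bash
# Sketch of a join step that avoids `ls` and `[ -f <glob> ]`.
# join_pieces is a hypothetical helper, not part of the original script.
join_pieces() {
    local target=$1
    local pieces=( "$target".[0-9][0-9][0-9] )   # glob expands in sorted order

    # An unmatched glob stays literal, so testing the first entry is enough.
    if [ ! -f "${pieces[0]}" ]; then
        echo "ERROR - missing source files for $target" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "INFO - the target file ($target) already exists" >&2
        return 1
    fi

    # One cat, one redirect: the pieces are written in glob order.
    cat "${pieces[@]}" > "$target"
}

# Demo on throwaway data.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

head -c 50000 /dev/urandom | gzip > demo.gz
split -b 20000 -d -a 3 demo.gz demo.gz.
mv demo.gz expected.gz

join_pieces demo.gz; first_status=$?
join_pieces demo.gz; second_status=$?    # refused: the target exists now

if cmp -s expected.gz demo.gz; then rejoined=identical; else rejoined=different; fi
echo "first run: $first_status, second run: $second_status, result: $rejoined"
```

Writing the target with a single redirect also avoids appending to a half-built file left over from an earlier run.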
To extract the GZ file:
#!/bin/bash
tarfile=env001.tar

if [ ! -f ${tarfile}.gz ]
then
echo
echo "ERROR - Missing file ${tarfile}.gz"
echo
exit 1
fi

if [ -f $tarfile ]
then
echo
echo "INFO - Extracted file (${tarfile}) already exists."
echo
exit 1
fi

date
gzip -d ${tarfile}.gz
date
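Since `gzip -d` only reports the corruption after a long run, it may help to test the archive first and to decompress without deleting it. A sketch along those lines, where extract_gz is a made-up helper name and the demo files are random data:

```shell
#!/bin/bash
# Sketch: test the archive first, then decompress while keeping the .gz.
# extract_gz is a hypothetical helper, not part of the original script.
extract_gz() {
    local gzfile=$1
    local outfile=${gzfile%.gz}

    if [ ! -f "$gzfile" ]; then
        echo "ERROR - Missing file $gzfile" >&2
        return 1
    fi
    if [ -e "$outfile" ]; then
        echo "INFO - Extracted file ($outfile) already exists." >&2
        return 1
    fi
    # gzip -t reads the whole stream and checks it without writing any output.
    if ! gzip -t "$gzfile"; then
        echo "ERROR - $gzfile failed the integrity test" >&2
        return 2
    fi
    # -c writes to stdout, so the .gz file stays in place.
    gzip -dc "$gzfile" > "$outfile"
}

# Demo: one intact archive and one truncated copy.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

head -c 50000 /dev/urandom > payload
gzip -c payload > good.gz
head -c 1000 good.gz > broken.gz     # truncation simulates a damaged archive

extract_gz good.gz;   good_status=$?
extract_gz broken.gz; broken_status=$?
echo "good: $good_status, broken: $broken_status"
```

Keeping the .gz means a failed extraction can be retried after the pieces are joined again.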
Found what the problems were with the above:
- CAT incorrectly shows some characters, so if your file contains these characters, it will cause the resulting file to not be correct
- when splitting (so, this is NOT a problem of CAT, this is a problem of split) you have to make sure that the split does not occur in the middle of the line, or that CAT command (used in the way as shown above) will introduce extra line returns, which corrupt the content as well. Maybe (but not tested) you can use CAT to overcome this issue, but I've opted for the solution to split the files EXACTLY at the end of a line, not in the middle (which means : a random place)
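The mid-line theory can be tested on a small text file by splitting it once by byte count (which cuts lines in half) and once by line count, then comparing both rejoined copies with the original. A sketch with made-up file names:

```shell
#!/bin/bash
# Sketch: compare a mid-line split (-b) with a line-boundary split (-l).
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Ten lines of 10 bytes each ("line 0001" plus a newline): 100 bytes in total.
printf 'line %04d\n' 1 2 3 4 5 6 7 8 9 10 > sample.txt

split -b 25 sample.txt bytes.    # every cut lands in the middle of a line
split -l 3 sample.txt lines.     # every cut lands at the end of a line

cat bytes.* > from_bytes.txt
cat lines.* > from_lines.txt

if cmp -s sample.txt from_bytes.txt; then bytes_ok=yes; else bytes_ok=no; fi
if cmp -s sample.txt from_lines.txt; then lines_ok=yes; else lines_ok=no; fi

echo "rejoined after -b split identical: $bytes_ok"
echo "rejoined after -l split identical: $lines_ok"
```

If both comparisons come back identical, cat redirected to a file is not adding line returns, and the cause is more likely in how the pieces were produced or ordered.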