PICList Thread
'[EE] HELP NEEDED Tar, XP, and dir by dir backup'
2008\06\11@142756 by Dr Skip

I'd like to backup some drives that are quite large (>50GB, some >200GB and #
of files >65k) on XP. I would like to back up to individual zip files for each
directory, AND split the zip if it's going to be greater than some size, ALSO
including any files in the root as a separate zip. Tools that understand and
operate on the Archive bit would be nice too. ;)

I've been trying tar and gzip from gnuwin32. Winzip and the like can't deal
with this - too many files. Using FORFILES from the win2k resource kit, tar
worked OK, and gzip afterward (but not tar with the -z option). However, FORFILES
had an error somewhere around the recycle bin, so it may not handle
hidden/system files well. Only half of a disk copied.

Using FOR in a bat file, I've tried this:

FOR /D %%G IN (*.*) DO "D:\bin-local\UNIX\usr\local\wbin\gzip.exe -c
"d:\%%G\*.*" > "k:\gz\%%G.gz""

Which is pretty much the same as what worked with the cmd line to execute in
FORFILES. The "" are for names with spaces. I get this:

D:\>"D:\bin-local\UNIX\usr\local\wbin\gzip.exe -c "d:\Temp.install\*.*" >
k:\gz\Temp.install.gz""
The system cannot find the path specified.

Same thing with tar and FOR. The paths do exist... I've tried all different
ways of adding "" and some show a mangled command line, but this one seems OK
and still fails. Same for tar -cvf.

There may be a case where some dir has more than 65k files (breaks zip), but
recursing every dir and zipping each would get out of hand. That's why I've
thought to go tar first.

Does anyone know of a tool or have a bat or cmd file that handles this? Seeing
as how this FOR command isn't working as it should is frustrating. I'm also
unsure why -z is listed in gnuwin tar, but fails.
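
For readers who would rather sidestep cmd.exe quoting altogether, the layout asked for above (one archive per top-level directory, plus one for the loose files in the root) can be sketched in Python with the standard tarfile module. This is only an illustration, not something anyone in the thread ran: the function name backup_by_directory and the archive name _root_files are made up here, splitting and the Archive bit are not handled, and the destination is assumed to lie outside the source tree.

```python
import tarfile
from pathlib import Path


def backup_by_directory(source_root, dest_dir, root_archive_name="_root_files"):
    """Write one .tar.gz per top-level directory, plus one for loose root files.

    Hypothetical helper for illustration; dest_dir must not be inside
    source_root, or the archives would try to include themselves.
    """
    source_root = Path(source_root)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    created = []

    # One archive per top-level directory; tarfile recurses into it,
    # and names containing spaces need no special quoting.
    for entry in sorted(source_root.iterdir()):
        if entry.is_dir():
            archive = dest_dir / (entry.name + ".tar.gz")
            with tarfile.open(archive, "w:gz") as tar:
                tar.add(entry, arcname=entry.name)
            created.append(archive)

    # Files sitting directly in the root go into a separate archive.
    root_files = [p for p in sorted(source_root.iterdir()) if p.is_file()]
    if root_files:
        archive = dest_dir / (root_archive_name + ".tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            for path in root_files:
                tar.add(path, arcname=path.name)
        created.append(archive)

    return created
```

Calling backup_by_directory("d:/", "k:/gz") would be the rough equivalent of the FOR /D loop above. Locked or unreadable files would still raise an error, so a real run would need error handling around tar.add.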

TIA, Skip

2008\06\11@161815 by Tamas Rudnai

Hi,

I think the quotation is wrong, so it should be like this:

FOR /D %%G IN (*.*) DO "D:\bin-local\UNIX\usr\local\wbin\gzip.exe" -c
"d:\%%G\*.*" > "k:\gz\%%G.gz"

Tamas



2008\06\11@165324 by Dr Skip

Thanks. I made the change but still no luck. I'm not sure why it isn't seeing
what is there...

D:\>"D:\bin-local\UNIX\usr\local\wbin\gzip.exe" -c "d:\Temp.install\*.*"
1>"k:\gz\Temp.install.gz"

d:\Temp.install\*.*: No such file or directory


I'm also trying to figure out how to take a list of dirs and get tar/gzip to work
from that perhaps, but I don't seem to be getting my mind around it... :( I'd
like not to have to dir to a file, edit the file, then run it individually...



2008\06\11@173001 by William \Chops\ Westfield


On Jun 11, 2008, at 1:52 PM, Dr Skip wrote:

> D:\>"D:\bin-local\UNIX\usr\local\wbin\gzip.exe" -c "d:\Temp.install
> \*.*"
> 1>"k:\gz\Temp.install.gz"
>
> d:\Temp.install\*.*: No such file or directory

I haven't used gzip or the shell on windows very much, but...

Most (many?) unix utilities depend on the shell for wildcard  
expansion.  The quotes around "d:\tmp.install\*.*", or calling from  
the windows CMD processor rather than a unix shell, may suppress this  
expansion, leaving gzip thinking that it needs to compress a file  
literally called "*.*"
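
The point about who expands the wildcard can be illustrated with a short Python sketch. On Unix the shell replaces a pattern with the matching names before the program starts; a program launched from cmd.exe gets the pattern verbatim and has to expand it itself, if it bothers to. The helper name expand_wildcards is made up for this example.

```python
import glob


def expand_wildcards(args):
    """Expand wildcard arguments roughly the way a Unix shell would.

    Note: Python's glob treats "*.*" as "names that contain a dot",
    whereas cmd.exe treats it as "everything", so "*" is the safer pattern.
    """
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # Like most Unix shells, leave the argument untouched if nothing matches.
        expanded.extend(matches if matches else [arg])
    return expanded
```

A tool that skips this step and is handed "d:\Temp.install\*.*" will look for a file literally named *.*, which is consistent with the "No such file or directory" message quoted above.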

Perhaps you could use the recursion option (if it's present in the  
windows gzip):
> D:\bin-local\UNIX\usr\local\wbin\gzip.exe -r -c "d:\Temp.install"  
> 1>"k:\gz\Temp.install.gz"

BillW

2008\06\11@173950 by Ariel Rocholl
IMHO this is one of the best examples of where to use WSH (Windows Script
Host) with either VBScript or JScript.

Assuming VBS, just create a MyBackup.vbs file, then iterate through
all the folders you want, calling your compressor on each folder or
doing whatever you want. Simple batch files are really outdated for
this kind of thing.

Introductory website http://support.microsoft.com/kb/188135/en-us ,
there is also a script clinic from MS website that may have already
something very close to what you need, just a minor change and voila!

HTH


2008\06\11@181434 by Dr Skip

Funny thing is, it worked as such in the FORFILES command, which just passes
the directory and runs it for each. Without putting quotes around each, all
those stupid windows folders (and files) with spaces in them get lost. It only
takes the part before the first space char.

For instance, these both worked, except for the bombing out at what I think was
the "recycle bin":

d:\bin-test\FORFILES -pd:\ -s -m*.* -c"CMD /C if @ISDIR==TRUE
D:\bin-local\UNIX\usr\local\wbin\tar.exe -cvf "k:\@FILE.tar" "@FILE\*.*""

d:\bin-test\FORFILES -pk:\ -m*.tar -c"CMD /C
D:\bin-local\UNIX\usr\local\wbin\gzip.exe -fS .gz "k:\@FILE" "

<each double line is one line before word wrap>

WSH seems like overkill, not to mention a learning curve. It just seems like
this should work more easily...



2008\06\11@182406 by Tamas Rudnai

With tar + gzip you can do the following under linux:

$ targ cvzf file.tgz dir/*

On Win, however, for some reason that does not work, so you have to have a
workaround, like this:

C:\> tar cv dir\*.* | gzip > file.tgz

...therefore, the following works for me fine:

C:\>FOR /D %G IN (*.*) DO tar cv "%G\*.*" | gzip > "BACKUP_%G.gz"

If that does not work check what tar + gzip version you have, mine is:

C:\>gzip --version
gzip 1.2.4 (18 Aug 93)
Compilation options:
DIRENT SYS_UTIME STDC_HEADERS HAVE_UNISTD_H NO_CHOWN PROTO ASMV

C:\>tar --version
tar (GNU tar) 1.12

Copyright (C) 1988, 92, 93, 94, 95, 96, 97 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Written by John Gilmore and Jay Fenlason.


Regards,
Tamas





2008\06\11@183319 by Tamas Rudnai

Oh, just realized I misspelled two things: in the linux example "targ"
should be "tar" of course, and the extension of the gzipped tar archive
should be either "tgz" or "tar.gz" instead of "gz" for proper operation.

Tamas



2008\06\11@184451 by Gerhard Fiedler

Dr Skip wrote:

> For instance, these both worked, except for the bombing out at what I think was
> the "recycle bin":
>
> d:\bin-test\FORFILES -pd:\ -s -m*.* -c"CMD /C if @ISDIR==TRUE
> D:\bin-local\UNIX\usr\local\wbin\tar.exe -cvf "k:\@FILE.tar" "@FILE\*.*""
>
> d:\bin-test\FORFILES -pk:\ -m*.tar -c"CMD /C
> D:\bin-local\UNIX\usr\local\wbin\gzip.exe -fS .gz "k:\@FILE" "

What helps often with this is creating little batch files, in this case for
the command that is run by the for or forfiles loop, and/or around tar,
gzip and the like. This gives you more freedom to massage the action and
the input to the executables. For example, you can skip the call to gzip if
the directory is the recycle bin and possibly avoid the "bombing".
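
The same filtering idea, written as a Python sketch rather than a wrapper batch file. The skip list is an assumption made for this example (folder names that Windows XP commonly uses for the recycle bin and restore data), not something taken from the thread, and should be adjusted for the machine at hand.

```python
# Hypothetical skip list; compare case-insensitively because Windows
# file names are not case sensitive.
SKIP_NAMES = {"recycler", "recycled", "system volume information"}


def should_backup(dir_name, skip_names=SKIP_NAMES):
    """Return False for directories the backup loop should leave alone."""
    return dir_name.strip().lower() not in skip_names


def directories_to_backup(dir_names):
    """Filter a list of top-level directory names before archiving them."""
    return [name for name in dir_names if should_backup(name)]
```

The loop that calls the compressor would then iterate over directories_to_backup(...) instead of over every directory it finds.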

> WSH seems like overkill, not to mention a learning curve. It just seems
> like this should work more easily...

It's a bit ugly to access the file system, but I didn't find it difficult
when starting from an example.

Another possibility is to use an evaluation copy of the 4nt shell; I found
that many things that don't work well in cmd.exe do work there.

And still another question is why you need the files zipped... maybe another
option gives you what you need; like for example using NTFS compression on
the backup drive, since you're working on WinXP.

Gerhard

2008\06\11@231314 by Dr Skip

I've done this, and it may become intractable... It will work, one line at a
time, with no quotes and no spaces in the dir name. When I quote the whole
filespec, tar says it can't find it, but the error message looks like it parsed
it right. Quoting just the variable, as in "%1" in the bat file, passes the
quotes to tar, and no quotes fails on dirs with spaces such as Program Files.
Blasted Microsoft!!

Quoting the name with spaces and passing it to a batch file with %1 as
variable, with no quotes in the line, picks up the first " and the first word
of the name only, and no " on the bat command line only passes the first word...

Using tar on just a command line, no quotes around Program Files looks for just
d:\program and with quotes it fails, but the message looks correctly interpreted...

The backup media are FAT32 (easier to go cross-system), which is the way they
come. One loses a lot of space in the MFT if you have just a few large files
with NTFS, and then there's the older systems... FAT32 is just the best common
denominator. That means, no native compression, and some of these tar files
drop to half size when gzipped (default level too). There is also the 4GB file
size limit, so splitting may be necessary. I would anyway, at 4GB, so I could
do longer term copies to DVD as time permits, so there really isn't any loss in
making that a requirement.
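
Splitting a finished archive into pieces is simple enough to sketch in Python. The function name split_file and the .001/.002 suffix scheme are choices made for this example only. FAT32 cannot hold a file of 4 GiB or more, so chunk_size would have to stay below 2**32 bytes on that media.

```python
from pathlib import Path


def split_file(path, chunk_size, buffer_size=1024 * 1024):
    """Split `path` into numbered pieces of at most `chunk_size` bytes each.

    Pieces are written next to the original as name.001, name.002, ...
    and the list of piece paths is returned. The original is left in place.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    path = Path(path)
    parts = []
    index = 1
    with path.open("rb") as src:
        while True:
            # Read in small buffers so a multi-GB piece never sits in memory.
            data = src.read(min(buffer_size, chunk_size))
            if not data:
                break
            part = path.with_name(f"{path.name}.{index:03d}")
            written = 0
            with part.open("wb") as dst:
                while data:
                    dst.write(data)
                    written += len(data)
                    remaining = chunk_size - written
                    if remaining == 0:
                        break
                    data = src.read(min(buffer_size, remaining))
            parts.append(part)
            index += 1
    return parts
```

Concatenating the pieces in order restores the original file, so nothing beyond a plain copy is needed to rejoin them.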

It sure would be nice to tar or zip each folder on the disk, then combine them
into a zip of the top folder (so each top level dir is zipped into its own
file), all automatically for each machine it's run on, and splitting any
particular zip file over x MB. I can't be the first guy to want to do this... :-O




2008\06\11@232534 by Xiaofan Chen

On 6/12/08, Gerhard Fiedler <RemoveMElistsTakeThisOuTspamconnectionbrazil.com> wrote:
> > WSH seems like overkill, not to mention a learning curve. It just seems
> > like this should work more easily...
>
> It's a bit ugly to access the file system, but I didn't find it difficult
> when starting from an example.
>
> Another possibility is to use an evaluation copy of the 4nt shell; I found
> that many things that don't work well in cmd.exe do work there.
>

Why not use Cygwin/Bash?

I am not that familiar with WSH but I think it can be of good use here.
Or the new Windows PowerShell (code name Monad). It is available
for XP and Vista.

Xiaofan

2008\06\12@032231 by Tamas Rudnai

Hi,

> d:\Temp.install\*.*: No such file or directory

Just woke up in the morning, and realized what could be the problem. For
some reason the GNU tar+gzip are parsing the parameters, so you can use the
C string like backslash encodings. Therefore the '\t' is a tab character...
Therefore the "d:\Temp.install\*.*" will be seen as "d:\   emp.install\*.*" by
tar, as the '\T' from "\Temp" will be replaced by a tab... Weird, and also
sad that when Mr Bill Gates took over the directory handling from Unix and
put it into the CP/M they reversed the slash (as slash was the parameter
separator on CP/M)...

Anyway, thank god if you start using normal slashes it uses slashes for the
entire path, plus thank god XP and Vista can handle slashes too. So this
should work:

C:\>FOR /D %G IN (*.*) DO tar cv "C:/%G/*.*" | gzip > "BACKUP_%G.tgz"
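
Whether or not escape processing is the real cause here, converting a path to forward slashes before handing it to a ported tool sidesteps the question. A small Python sketch of that conversion follows; the helper name to_forward_slashes is made up for the example, and PureWindowsPath is used so that it behaves the same on any platform.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Rewrite a Windows path with forward slashes, leaving the rest as-is."""
    return PureWindowsPath(windows_path).as_posix()


print(to_forward_slashes(r"d:\Temp.install"))  # prints d:/Temp.install
```

This only changes the separators; spaces in names such as Program Files are still there and still need the argument to be passed as a single item.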

BTW: Did you try WinRar? I am using it for huge archives for backing up the
malware and threat samples and so far had no problem with that, plus it can
split up the archive into given size.

Tamas




2008\06\12@040342 by Nicola Perotto

If I understand correctly, the drives are in one or more Linux machines (if
not, it doesn't change anything).
Use rsync: http://en.wikipedia.org/wiki/Rsync (sorry but my english is
terrible: read wikipedia...)
And use, for the target, an NTFS partition with compression enabled. Less
efficient than zip, but faster. And you retain the entire disk structure.

- Install an rsync server on the XP machine:
www.itefix.no/phpws/index.php?module=pagemaster&PAGE_user_op=view_page&PAGE_id=6&MMN_position=150:150
- Install rsync on the Linux machine:
http://rsync.samba.org/

If the source hdd is in a Windows machine you can also use DeltaCopy, a
GUI frontend for rsync:
http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp

Have fun
     Nicola



2008\06\12@072741 by Gerhard Fiedler

Dr Skip wrote:

> Blasted Microsoft!!

Well, it's you who wants to use partly ported Linux programs on Windows :)

("Partly ported" in that they don't seem to be adapted to handle
Windows-style quoting.)

I think if you want to use Linux ports of compression tools on Windows, it
probably makes sense to use a port of a Linux shell. This may give you less
trouble, as the quoting probably is more consistent.

Or you use a proper Windows zipper in a Windows shell. Did you try 7-zip
(is said to work also on the command line)? Or the WinZip command line
tool? I think using a zipper that understands Windows-style quoting goes a
long way to make this work.

> I can't be the first guy to want to do this... :-O

Maybe you are... I haven't yet zipped a whole disk, not in parts and not
otherwise.

Gerhard

2008\06\12@093653 by olin piclist

Dr Skip wrote:
> Quoting just the variable, as in "%1" in the bat file passes
> the quotes to tar, and no quotes fails on dirs with spaces such as
> Program Files. Blasted Microsoft!!

This is not the shell's fault, but rather the fault of the program that
can't handle command line arguments quoted.  The shell passes quotes from
the command line to the target program just like it should.  A shell that
thinks it's clever about what you meant on the command line would only cause
a lot more trouble in other areas.  I'd rather have a dumb shell than one
that's too clever by a half.

Of course this particular case would go unnoticed if spaces weren't allowed
in file system pathnames, but that's a different discussion.


********************************************************************
Embed Inc, Littleton Massachusetts, http://www.embedinc.com/products
(978) 742-9014.  Gold level PIC consultants since 2000.

2008\06\12@131843 by Dr Skip

Thanks to all who responded, it's helped me quite a bit. I did get one solution
working last night - enough that it might be the one (one line, wrapped):

7za.exe a -v4g -scsWIN -t7z "K:\gz\Backup of Profile Untitled made on Thursday,
12-Jun-2008.7z" -mx=5  -w"c:\TEMP\" @"K:\gz\Backup of Profile Untitled made on
Thursday, 12-Jun-2008~.txt"

That's 7-zip, telling it to split at 4GB, use the 7z format, work in the temp
dir, and take the filelist from the .txt file. I used a program called Abakt
that's essentially a front end that lets you pick the files and dirs you want
done and makes up the command line, and makes the file list to use from that.
Its command line params don't include the splitting on 7z (they do on the zip
side) so I added that. 7Zip also found it needed to be told which character set
my dirs were using... I picked Win from Win, DOS, Unicode and it seemed to have
about a dozen cases where it said it couldn't find the file listed, but it IS
there on the disk. That will need some looking into.
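
If the 7-Zip call is ever driven from a script instead of typed by hand, building it as an argument list avoids most of the quoting trouble discussed earlier in the thread. The Python sketch below only assembles the list; it uses exactly the switches from the command line above and nothing else. The function name build_7za_command and its parameters are made up for the example.

```python
def build_7za_command(archive, list_file, work_dir, volume_size="4g"):
    """Assemble the 7za call from the post as a list of separate arguments.

    Each list item reaches the program as one argument, so names with
    spaces need no hand-written quotes.
    """
    return [
        "7za.exe", "a",        # add files to an archive
        f"-v{volume_size}",    # split into volumes of this size
        "-scsWIN",             # character set of the list file
        "-t7z",                # 7z archive format
        archive,
        "-mx=5",               # compression level used in the post
        f"-w{work_dir}",       # working (temp) directory
        f"@{list_file}",       # read the file list from this text file
    ]
```

The resulting list could be handed to subprocess.run(cmd, check=True), which on Windows does the quoting itself when it builds the command line.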

It isn't perfect yet, and as I write I'm seeing if I can get anything out of
these 4GB files.. ;) It takes a while - maybe 4GB isn't the right size for
each... I also don't have a way to decide which archive to look into, so
selective restore will be a pain.

The \t=tab idea has merit. I did see the dir name split one time, but I don't
remember which time. I do remember using \\ to escape the backslashes on one
try, but maybe I need to try all variations and \\.

I'll have to look at Rsync. I did years ago and the windows support/apps just
weren't reliable. I remember a lot of hangs. Maybe it's better now, or maybe
I'll be better at it. ;)

Trying to use partly ported UNIX apps could be the problem, but I tried it for
2 reasons. I used to use Unix a lot, even though I forget most of it now, and
it seems better suited for the task here - if I can get it to work. The other
is that I deal with a lot of executive level folks who challenge me to defend
this-or-that about Linux vs Windows (I must be the Linux whipping boy). My
approach to that is multi-tiered, and sometimes using a Unix tool in Windows
carries a lot of weight - especially with someone not willing to take a close
look at Linux itself. So, an elegant tar-by-directory solution, without making
it look complex, carries some 'opinion weight' potential, or is at least the
'thin end of the wedge'. Winzip/zip is limited to 65k files, so it fails early.
Last night's stats appear to be 61GB and 250k files+ compressed to 21GB and 6
files. Decompressing may be a problem though.

And lastly, I agree that it's a shared problem of quotes and spaces, but it's
really the spaces being allowed that started it all. I could maybe accept the
ability to use spaces in file and dir names, and since I refrain from using
them, I would be OK, but there are so many Microsoft-defined dirs that have them
that it breaks any sort of elegant solution to have to rework all names before
backup. I believe they did this to offer convenience to novices AND frustrate
any Unix cross-pollination. I'm not sure which was the steak and which was the
gravy though! ;)

So far, no luck on decompressing with 7zip portable... it may be too ugly after
all.

BTW, I should mention a few other methods I've tried: Since a handful of files
will always be locked, a full backup will take booting with an alternative.
I've tried Knoppix to just copy the files and folders, but a byte compare
afterward always shows massive differences, even though the copy seemed to work
fine. I don't know if I could trust a tar from it until I know more about why.

I've used Bart PE, and it's the best so far in conjunction with XXcopy, however
there is no compression, and as you can see, I can get 3:1, which is a BIG
help, not to mention the 'size on disk' issue with having small vs large files.
AND, I can do things like make the archive read only and not affect the
internal files, where doing that to the file by file copy would mess up the
real bits.

I did get Cygwin last night too, but haven't installed it. I remember on
previous machines, where I use apps that bundle the cygwin.dll, that there
would be conflicts after installing the latest cygwin. I'll have to contain
that and see what happens.

So far, it looks like there isn't a nice way to list and select a file to get
from these big 7z files - just dump them all out. Not a good thing, so I'm
still looking. From a process point of view, the top level folders compressed
into a single file each is still the best idea, and preferably something winzip
or such can read and index. If anyone has any more ideas, I'm still not done.

Maybe using FOR with a command line winzip... I may need therapy after all
this... ;)

Thanks for all the ideas so far.
-Skip




2008\06\12@142658 by Nicola Perotto

If you use NTFS there isn't a problem with 4 GB or more, but if you need to
burn a DVD you may encounter some problems with files bigger than 2 GB,
because you can do it only with the UDF file system and it is not so
well supported.
If 7-zip appears slow you can try to lose some compression and it will
be much faster.
   Nic


2008\06\12@144205 by Dr Skip

The older w98 systems can't work with NTFS, and one still has the wasted space
in the MFT. FAT32 is much more efficient for large files in this case. It seems
I have to uncompress everything from a 7z spanned archive. It isn't so much
compression level, which was medium, but 50GB is 50GB, and over USB... In
retrospect, 4GB is probably too big for any method.


Nicola Perotto wrote:
> If you use NTFS there isn't problem with 4 GB or more but if you need to
> burn a dvd you may encounter some problem with file bigger than 2 GB
> because you can do it only with the UDF file system and it is not so
> well supported.
> If 7-zip appears slow you can try to loose some compression and it will
> be much faster.
>     Nic
>
> Dr Skip wrote:
>

2008\06\13@091039 by Gerhard Fiedler

Olin Lathrop wrote:

> Dr Skip wrote:
>> Quoting just the variable, as in "%1" in the bat file passes
>> the quotes to tar, and no quotes fails on dirs with spaces such as
>> Program Files. Blasted Microsoft!!
>
> This is not the shell's fault, but rather the fault of the program that
> can't handle command line arguments quoted.  The shell passes quotes
> from the command line to the target program just like it should.  A
> shell that thinks it's clever about what you meant on the command line
> would only cause a lot more trouble in other areas.  I'd rather have a
> dumb shell than one that's too clever by a half.

Exactly. FWIW, I wouldn't call that "too clever", not even by a half :)
Quoted command line arguments are simply standard in Windows. Any CLI
program meant to run on Windows should understand these, and strip the
(double) quotes appropriately.

That's what I meant with "partly ported": such a program seems to be
"ported" so that it runs on Windows, but using *ix CLI conventions. That
may be useful when you have a shell that also uses *ix CLI conventions, but
it doesn't do what it should in a Windows shell. So the solution for using
these programs without too many surprises is probably using a (also partly
ported :) *ix shell on Windows, like bash.

Skip, you said you installed cygwin. That's a biggie, and it seems to come
with its own set of integration problems. A smaller solution is MinGW
(MSYS); it seems to work well in the sense that it has most of the common
*ix CLI tools, but with less integration hassles.

Gerhard

2008\06\13@105340 by Dave Tweed

Gerhard Fiedler wrote:
> Olin Lathrop wrote:
> > Dr Skip wrote:
> > > Quoting just the variable, as in "%1" in the bat file passes
> > > the quotes to tar, and no quotes fails on dirs with spaces such as
> > > Program Files. Blasted Microsoft!!
> >
> > This is not the shell's fault, but rather the fault of the program that
> > can't handle command line arguments quoted.  The shell passes quotes
> > from the command line to the target program just like it should.  A
> > shell that thinks it's clever about what you meant on the command line
> > would only cause a lot more trouble in other areas.  I'd rather have a
> > dumb shell than one that's too clever by a half.
>
> Exactly. FWIW, I wouldn't call that "too clever", not even by a half :)
> Quoted command line argument are simply standard in Windows. Any CLI
> program meant to run on Windows should understand these, and strip the
> (double) quotes appropriately.

Yes, on an archaic system like the Windows command-line processor (which
has distant roots in the original CP/M CCP), the application programs
must do some of the work of handling the "raw" command line. In particular,
having one ".BAT" file invoke and pass arguments correctly to another
".BAT" file is incredibly arcane. This is unfortunate, but a large number
of programmers have grown up under this system and don't seem to expect
anything better.

On a modern system (since the mid-1970s -- note the irony there), such as
any *ix shell, the shell itself correctly handles ALL of the command-line
issues, and passes arrays of fully parsed arguments and environment
variables directly to the application program. The application can be a
binary executable or an interpreted script (another instance of the shell,
or Perl, or ...) and it all Just Works.

> That's what I meant with "partly ported": such a program seems to be
> "ported" so that it runs on Windows, but using *ix CLI conventions. That
> may be useful when you have a shell that also uses *ix CLI conventions,
> but it doesn't do what it should in a Windows shell. So the solution for
> using these programs without too many surprises is probably using a (also
> partly ported :) *ix shell on Windows, like bash.

I'm not sure what stones you are trying to cast here. In what way would a
*ix shell be only "partly ported"? I've been running Cygwin bash here for
several years now, and have never run into any "porting" problems. Indeed,
I often get confused as to whether I'm using my local shell, or I'm ssh'd
to my web hosting provider (which is running FreeBSD), because, for all
practical purposes, the environments feel identical.

> Skip, you said you installed cygwin. That's a biggie, and it seems to
> come with its own set of integration problems. A smaller solution is
> MinGW (MSYS); it seems to work well in the sense that it has most of
> the common *ix CLI tools, but with less integration hassles.

Yes, Cygwin does have some integration problems, but not related to
command-line handling. I've seen coLinux promoted as an alternative way
to run *ix tools on a Windows host, but I haven't tried it myself yet.
See

  http://www.colinux.org/

... and this Circuit Cellar article provides a nice overview:

  http://www.dtweed.com/circuitcellar/caj00213.htm#3618

MinGW is a more restricted environment than Cygwin; it is oriented pretty
much around developing software only -- specifically, using GNU compiler
tools to produce Windows applications. Cygwin is a much more diverse
environment, supporting all of the GNU tools and a lot of other GPL
software as well.

-- Dave Tweed

2008\06\13@110051 by Dr Skip

I haven't finished the installation, just pulled it down, and just the shell
and a few other things. I'm still hesitating. I'll take a look at the MinGW
system. I'm not familiar with it yet. Thanks.

Gerhard Fiedler wrote:
>
> Skip, you said you installed cygwin. That's a biggie, and it seems to come
> with its own set of integration problems. A smaller solution is MinGW
> (MSYS); it seems to work well in the sense that it has most of the common
> *ix CLI tools, but with less integration hassles.
>
> Gerhard
>

2008\06\13@182716 by Gerhard Fiedler

Dave Tweed wrote:

> Yes, on an archaic system like the Windows command-line processor (which
> has distant roots in the original CP/M CCP), the application programs
> must do some of the work of handling the "raw" command line.

Are you sure about this? When was the last time you wrote a CLI program on
Windows where you had to handle the raw command line and didn't have access
to an array of parsed command line arguments? What language did you use?

(FWIW, there's nothing like "the" Windows command line processor. There are
several, just like for *ix. I only use cmd.exe when I need to provide
support for others... :)

> I'm not sure what stones you are trying to cast here.

I wasn't trying to cast any stones at all -- maybe that's why you're not
sure. I, OTOH, am not sure why you think I'm trying to cast any stones at
all.

Gerhard

2008\06\13@214256 by Dave Tweed

Gerhard Fiedler wrote:
> Dave Tweed wrote:
> > Yes, on an archaic system like the Windows command-line processor (which
> > has distant roots in the original CP/M CCP), the application programs
> > must do some of the work of handling the "raw" command line.
>
> Are you sure about this? When was the last time you wrote a CLI program on
> Windows where you had to handle the raw command line and didn't have access
> to an array of parsed command line arguments? What language did you use?

I never have. YOU'RE the one who said that applications built to work with
Windows shells need to handle quoted arguments.

I assume that the command-line handling is hidden in run-time libraries
that come with the software development environments for Windows.

Pretty much all of the command-line software I write for Windows is done
either in Perl or gcc, which are the same thing, really, when it comes to
handling the command line.

But I gave up on the MS command line long ago, first with DJGPP bash, and
more recently with Cygwin bash. The only time I need to work with cmd.exe
(and nested .BAT files) is in Olin's build environment for PIC software.

-- Dave Tweed

2008\06\14@001217 by Gerhard Fiedler

Dave Tweed wrote:

> Gerhard Fiedler wrote:
>> Dave Tweed wrote:
>>> Yes, on an archaic system like the Windows command-line processor (which
>>> has distant roots in the original CP/M CCP), the application programs
>>> must do some of the work of handling the "raw" command line.
>>
>> Are you sure about this? When was the last time you wrote a CLI program on
>> Windows where you had to handle the raw command line and didn't have access
>> to an array of parsed command line arguments? What language did you use?
>
> I never have. YOU'RE the one who said that applications built to work with
> Windows shells need to handle quoted arguments.

May I quote you: "Yes, on an archaic system like the Windows command-line
processor (which has distant roots in the original CP/M CCP), the
application programs must do some of the work of handling the "raw" command
line." This made me assume you knew what you were talking about -- you sure
didn't just copy me :)

> I assume that the command-line handling is hidden in run-time libraries
> that come with the software development environments for Windows.

That's probably correct, and is probably the same as it is in *ix
environments.

>> On a modern system (since the mid-1970s -- note the irony there), such
>> as any *ix shell, the shell itself correctly handles ALL of the
>> command-line issues, and passes arrays of fully parsed arguments and
>> environment variables directly to the application program.

You said that in *ix environments, that's handled by the "shell". If you
mean the command processor (e.g. the bash shell), that would mean that in
*ix environments, you couldn't run a CLI application without going through
a command processor. At least in Windows, you can run a CLI application
directly, without a command processor (like cmd.exe), so the command line
argument handling has to happen elsewhere (for example in the C runtime).
I'm not really sure you're correct with your statement that in *ix this all
happens in the shell. I'm rather certain that it is possible to run an
executable on *ix without spawning a shell, so the command line handling
probably happens elsewhere.

> Pretty much all of the command-line software I write for Windows is done
> either in Perl or gcc, which are the same thing, really, when it comes to
> handling the command line.

So... I assume you do get your arguments as an array in both, right? How
does this relate to "the application programs must do some of the work of
handling the "raw" command line"?

Gerhard

2008\06\14@012927 by Christopher Head

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gerhard Fiedler wrote:
[snip]
| May I quote you: "Yes, on an archaic system like the Windows command-line
| processor (which has distant roots in the original CP/M CCP), the
| application programs must do some of the work of handling the "raw" command
| line." This made me assume you knew what you were talking about -- you sure
| didn't just copy me :)
|
|> I assume that the command-line handling is hidden in run-time libraries
|> that come with the software development environments for Windows.
|
| That's probably correct, and is probably the same as it is in *ix
| environments.

This is absolutely true in Windows. At the Win32 API level, a function
called GetCommandLine() passes you a single string. It's up to you to
break the string up into pieces. The fact that you see the conventional
argc/argv in main() is a function of the C runtime library: the library
calls GetCommandLine() and then splits up the result.
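
The reverse direction is easy to look at from Python: when a program on Windows starts another one with a list of arguments, the list has to be flattened into that single string first, with quotes added where an argument contains spaces. The standard library carries a helper for this, subprocess.list2cmdline, which follows the Microsoft C runtime quoting rules. It is an internal helper rather than an official part of the API, so treat this as a sketch; the wrapper name to_windows_command_line is made up for the example.

```python
import subprocess


def to_windows_command_line(argv):
    """Flatten an argument list into one Windows-style command-line string.

    Arguments containing spaces or tabs are wrapped in double quotes so the
    receiving C runtime can split the string back into the same pieces.
    """
    return subprocess.list2cmdline(argv)


print(to_windows_command_line(["tar.exe", "-cvf", "k:/Program Files.tar", "Program Files"]))
# prints: tar.exe -cvf "k:/Program Files.tar" "Program Files"
```

A program that splits the string by its own rules instead of the runtime's may see those quotes as part of the file name, which is one possible reading of the symptoms described earlier in the thread.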

[snip]
| You said that in *ix environments, that's handled by the "shell". If you
| mean the command processor (e.g. the bash shell), that would mean that in
| *ix environments, you couldn't run a CLI application without going through
| a command processor. At least in Windows, you can run a CLI application
| directly, without a command processor (like cmd.exe), so the command line
| argument handling has to happen elsewhere (for example in the C runtime).
| I'm not really sure you're correct with your statement that in *ix this
| all happens in the shell. I'm rather certain that it is possible to run an
| executable on *ix without spawning a shell, so the command line handling
| probably happens elsewhere.
[snip]

In Unix, the splitting of command-line arguments is, indeed, done by a
shell. The system call that allows a process to invoke a program file on
disk is execve(). You pass execve() not a command-line string, but an
ARRAY. That array becomes the argc/argv directly, with no translation at
all. You also must pass execve() the name of the program to execute,
separately. There is no searching of $PATH; you must pass a full
pathname. There is no command-line substitution: if one of the elements
of the array is "*.txt", then the invoked binary sees "*.txt" in argv.
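
A minimal sketch of that behaviour, written in Python rather than C to keep it short. The names CHILD and child_sees are made up for this illustration. It assumes a Python 3 interpreter is available as sys.executable and starts it directly, without a shell, so the argument list should reach the child unchanged:

```python
import json
import subprocess
import sys

# Child program: report the argument vector it received, as JSON.
CHILD = "import json, sys; print(json.dumps(sys.argv[1:]))"

def child_sees(args):
    """Start a child process with no shell involved; return the args it saw."""
    # Passing a list (and leaving shell=False) hands the list to the child
    # as its argument vector. sys.executable is an absolute path, so no
    # $PATH search is needed either.
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

# Wildcards, spaces and variable references all arrive untouched.
print(child_sees(["*.txt", "two words", "$HOME"]))
```

Nothing in the list is expanded or split because no shell ever looks at it.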

There is a library function called system(), specified by ISO C, which
accepts a command line as a string instead of as an array. This function
is not a system call into the kernel; it's implemented in the C runtime
library by executing the shell and having the shell parse the command
line. If you put "*.txt" somewhere in the string, then the shell will scan
the directory for files matching the pattern and expand the pattern. The
last thing the shell does after doing all its expanding and parsing is
call execve() in order to run the actual target program. Again, this is
done by calling the shell! The Linux man page for the system() function
contains this text:

"system() executes a command specified in command by calling /bin/sh -c
command"

so a call to system(foo) will end up with a call to execve({"/bin/sh",
"-c", foo}), which will end up with the shell starting up and splitting foo.
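
The difference can be seen side by side in a small Python sketch (Python instead of C for brevity). It assumes a POSIX system with /bin/sh; the function name glob_via_shell_and_direct and the file names a.txt and b.txt are invented for the example. Python's shell=True option runs the string through /bin/sh -c on POSIX, much like system():

```python
import os
import subprocess
import sys
import tempfile

# Child program: print the argument list it was started with.
SHOW_ARGS = "import sys; print(sys.argv[1:])"

def glob_via_shell_and_direct():
    """Return (shell_output, direct_output) for the pattern '*.txt'."""
    with tempfile.TemporaryDirectory() as workdir:
        # Create two files for the pattern to match.
        for name in ("a.txt", "b.txt"):
            open(os.path.join(workdir, name), "w").close()
        # With shell=True the string is parsed by the shell, which expands
        # the wildcard before echo runs.
        via_shell = subprocess.run(
            "echo *.txt", shell=True, cwd=workdir,
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        # With a list and no shell, the pattern reaches the child as is.
        direct = subprocess.run(
            [sys.executable, "-c", SHOW_ARGS, "*.txt"], cwd=workdir,
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    return via_shell, direct

print(glob_via_shell_and_direct())
```

The first result lists the matching files; the second still contains the literal pattern.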

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: GnuPT 2.7.2
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkhTVzIACgkQiD2svb/jCb6ujwCdGa+cAX05RKUrKcs6/WqSqRr1
qEUAoIMtko8w/1tYrvw4zNtdp3g4cUZ7
=7cx3
-----END PGP SIGNATURE-----

2008\06\14@085555 by olin piclist

Dave Tweed wrote:
>> Are you sure about this? When was the last time you wrote a CLI
>> program on Windows where you had to handle the raw command line and
>> didn't have access to an array of parsed command line arguments?
>> What language did you use?
>
> I never have. YOU'RE the one who said that applications built to work
> with Windows shells need to handle quoted arguments.
>
> I assume that the command-line handling is hidden in run-time
> libraries
> that come with the software development environments for Windows.

To clarify, Windows makes the whole command line available to the
application as one string.  You can do with it as you please.  The shell
does do some substitutions if you use special characters, but quotes are not
significant.

If you step back and think about it instead of just what you're used to,
this isn't such a bad idea.  I would rather the operating system not think
it knows how my program wants to interpret the command line.  Passing the
whole thing as a raw string is the most policy-free thing to do.  It's
trivial enough to have some application-space layer parse it into tokens if
that's how you want to view the command line.

Any competent application that is intended to run across platforms wouldn't
make direct OS calls anyway.  You use some kind of abstraction layer that
presents the model your app wants to see.  That layer is rewritten per OS to
convert from the native interface to your portable interface.  My software
does this too.  The routine STRING_CMLINE_TOKEN gets the next command line
token.  The Windows implementation parses the next token from the raw
command line string, which includes stripping off enclosing quotes ("") or
apostrophes ('').  That's why I had totally forgotten about this
distinction for 15 years until it was brought up here a few days ago.  The
"" and '' is my convention, and I appreciate Windows not trying to enforce
something else.

Unix has a rather different concept of the command line, and this same
concept is also encoded in C by the definition of the call arguments to
MAIN.  Both of these view the command line as an array of text string
arguments.  While that's generally how command lines are used, it is more
restrictive, and I think a bad choice to enforce at the operating system and
language level, even if eventually you want to view the command line that
way at higher levels.

I'd be curious to know how the conversion is done on Windows in the C
runtime library to parse the command line into separate arguments that are
then passed to MAIN.  Is an outer set of quotes ("") stripped off?  What
about apostrophes ('')?  What, if anything, does the C standard say about
this?  If this is left to the implementation, then you really can't blame
Windows just because they chose a different but still compliant approach.
If the standard really does allow both interpretations, then it's clearly
the app that is wrong if it assumes only one possible implementation.

Just to verify, I wrote a dumb program to show all the command line
arguments as presented by STRING_CMLINE_TOKEN.  It worked as expected.  Here
is the result when the command was entered using the Windows CMD shell:

C:\olin\string>test_args abc "def" 'ghi' "'jkl'" '"mno"' 'pq''r' "st'u"
Arg 1: abc
Arg 2: def
Arg 3: ghi
Arg 4: 'jkl'
Arg 5: "mno"
Arg 6: pq'r
Arg 7: st'u
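
For illustration, here is a rough Python sketch of a tokenizer that reproduces the output above.  It is not the STRING_CMLINE_TOKEN source.  The function name split_command_line is made up, and the rule that a doubled quote character inside a quoted token stands for one literal quote is an assumption inferred from Arg 6:

```python
def split_command_line(raw):
    """Split a raw command-line string into tokens.

    Assumed rules, inferred from the sample output:
      - tokens are separated by spaces or tabs;
      - a token starting with " or ' runs to the matching closing quote,
        and the enclosing quotes are stripped;
      - inside a quoted token, the quote character doubled stands for
        one literal quote character.
    """
    tokens = []
    i, n = 0, len(raw)
    while i < n:
        if raw[i] in " \t":          # skip separators between tokens
            i += 1
            continue
        if raw[i] in "\"'":          # quoted token
            quote = raw[i]
            i += 1
            chars = []
            while i < n:
                if raw[i] == quote:
                    if i + 1 < n and raw[i + 1] == quote:
                        chars.append(quote)   # doubled quote -> literal
                        i += 2
                        continue
                    i += 1                    # closing quote
                    break
                chars.append(raw[i])
                i += 1
            tokens.append("".join(chars))
        else:                        # bare token: runs to the next separator
            start = i
            while i < n and raw[i] not in " \t":
                i += 1
            tokens.append(raw[start:i])
    return tokens

raw = "abc \"def\" 'ghi' \"'jkl'\" '\"mno\"' 'pq''r' \"st'u\""
for number, token in enumerate(split_command_line(raw), start=1):
    print(f"Arg {number}: {token}")
```

Text that follows a closing quote without a separator starts a new token in this sketch; a real implementation may well treat that case differently.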


********************************************************************
Embed Inc, Littleton Massachusetts, http://www.embedinc.com/products
(978) 742-9014.  Gold level PIC consultants since 2000.

2008\06\14@095908 by Gerhard Fiedler

Christopher Head wrote:

>|> I assume that the command-line handling is hidden in run-time libraries
>|> that come with the software development environments for Windows.
>|
>| That's probably correct, and is probably the same as it is in *ix
>| environments.
>
> This is absolutely true in Windows. At the Win32 API level, a function
> called GetCommandLine() passes you a single string. It's up to you to
> break the string up into pieces.

> In Unix, the splitting of command-line arguments is, indeed, done by a
> shell. The system call that allows a process to invoke a program file on
> disk is execve(). You pass execve() not a command-line string, but an
> ARRAY. That array becomes the argc/argv directly, with no translation at
> all. You also must pass execve() the name of the program to execute,
> separately. There is no searching of $PATH, you must pass a full
> pathname. There is no command-line substitution, if one of the elements
> of the array is "*.txt", then the invoked binary sees "*.txt" in argv.

Thanks for clearing this up. I think that in Windows, there is no
substitution in the arguments either when using CreateProcess(). It may
prepend a path to the executable name, though.

It may be here where there is a chance of not handling quoting of paths
correctly when porting an application to Windows.

> There is a library function called system() which is specified by ISO C,
> which accepts a command-line as a string instead of as an array. This
> function is not a system call into the kernel, it's implemented in the C
> runtime library by executing the shell and having the shell parse the
> command line.

FWIW, this is similar on Windows (not surprisingly, as this is how the
system() function is defined in the C language): a call to system() goes
through the shell defined with COMSPEC.

Gerhard

2008\06\14@103517 by olin piclist

Gerhard Fiedler wrote:
> Thanks for clearing this up. I think that in Windows, there is no
> substitution in the arguments either when using CreateProcess(). It
> may prepend a path to the executable name, though.
>
> It may be here where there is a chance of not handling quoting of
> paths correctly when porting an application to Windows.

No, since the application doesn't call CreateProcess on Windows or some form
of exec on Unix.  The application receives what the system delivers on the
other end.  Neither OS does any interpretation or substitution on the
arguments passed to the CreateProcess/exec call.  This is a service the
shell provides for processes originating from a text command line.

The fundamental difference is that on Unix a process is passed an array of
string "arguments".  On Windows a process is passed a single string.  To
make it appear like you are passing a list of command line parameters on an
OS that uses the single string model, both the program that launches the
process and the launched program have to agree on a syntax to pass multiple
tokens in a single string.  The standard Windows command line shell
(CMD.EXE) does some parameter substitutions but otherwise passes the command
line to the process as is.  It is the process's job to interpret that as
tokens if that is the model it wants to present to the user.
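
As a sketch of the launching side of such an agreement: Python's subprocess module carries a helper, list2cmdline, that joins an argument list into one string following the quoting rules of the Microsoft C runtime (arguments containing spaces are wrapped in double quotes).  The wrapper name build_command_line and the sample file names are made up for this example, and other programs are free to use a different convention:

```python
import subprocess

def build_command_line(args):
    """Join an argument list into a single command-line string.

    Uses the quoting convention of the Microsoft C runtime, so a program
    whose runtime splits the string by the same rules gets the list back.
    """
    return subprocess.list2cmdline(args)

# Only the arguments that contain a space need quoting.
print(build_command_line(["backup tool.exe", "my file.txt", "plain"]))
```

A launched program that tokenizes by different rules, such as treating apostrophes as quotes, would need the launcher to quote accordingly.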



2008\06\14@130618 by Dr Skip

You guys know a lot more about command line parsing than I do! I've come
across another oddity perhaps someone could explain. Using the following:

FOR /F %%G IN ('dir /b /a:-d O:\PICS\*.*') DO echo %%G

I should get each filename echoed on the screen (a prelude to bigger and
better functions). The single quotes tell it to take the output of the
dir cmd. No matter where I put dbl quotes, and I've tried it everywhere,
any filenames with spaces drop everything after the first space.

Even
FOR /F %%G IN ('"dir /b /a:-d "O:\PICS\*.*""') DO echo "%%G"
causes it merely to quote the incomplete name.

7zip has the unfortunate habit of taking c:\my_name and interpreting it
as meaning ANY folder in the entire tree with that name. So, these
filespecs that are given to it broken become foldernames it looks for
everywhere. It's made for some interesting results, but the real targets
don't get what's due them.

Thanks again,
-Skip

2008\06\14@134543 by Dave Tweed

Gerhard Fiedler wrote:
> Dave Tweed wrote:
> > On a modern system (since the mid-1970s -- note the irony there), such
> > as any *ix shell, the shell itself correctly handles ALL of the
> > command-line issues, and passes arrays of fully parsed arguments and
> > environment variables directly to the application program.
>
> You said that in *ix environments, that's handled by the "shell". If you
> mean the command processor (e.g. the bash shell), that would mean that
> in *ix environments, you couldn't run a CLI application without going
> through a command processor.

I see we have some problems in terminology here. I use the generic
definition of "shell", which is any program that interacts with a user
and allows him to run other programs. There are graphical shells and
text-based, or "command-line" shells.

Windows supplies command-line shells called "command.com" and "cmd.exe" by
default (plus the GUI shell/file browser called "explorer.exe"), and you
can run other shells from third parties, such as 4DOS/4NT and Cygwin bash.

> At least in Windows, you can run a CLI application directly, without a
> command processor (like cmd.exe), so the command line argument handling
> has to happen elsewhere (for example in the C runtime).

You can in fact run ANY application directly from another application, just
like you can run ANY application from the command line (text shell) or from
the Explorer (graphical shell), since they (the shells), for the most part
are just applications themselves. This is true on both Windows and *ix.
Pretty much any application has a command line, even the ones that you
think of as "GUI applications", but the details are mostly hidden from end
users by the GUI shells.

> I'm not really sure you're correct with your statement that in *ix this
> all happens in the shell. I'm rather certain that it is possible to run
> an executable on *ix without spawning a shell, so the command line
> handling probably happens elsewhere.

When you run an application directly, there is no "command line". The whole
concept of a command line is entirely defined by whatever shell you happen
to be using. If you really want command-line processing from within an
application, one way or another, you need to invoke the corresponding shell
to do it.

-- Dave Tweed

2008\06\14@143334 by Dave Tweed

Olin Lathrop wrote:
> To clarify, Windows makes the whole command line available to the
> application as one string. You can do with it as you please. The shell
> does do some substitutions if you use special characters, but quotes are
> not significant.
>
> If you step back and think about it instead of just what you're used to,
> this isn't such a bad idea. I would rather the operating system not think
> it knows how my program wants to interpret the command line. Passing the
> whole thing as a raw string is the most policy-free thing to do. It's
> trivial enough to have some application-space layer parse it into tokens
> if that's how you want to view the command line.

No operating system, including Windows, imposes anything at all on the
interpretation of the command line; that's entirely up to the shell,
whether that shell is cmd.exe or bash.

It just so happens that the text-based shells that come with Windows take
the view that they are pretty much just there to invoke programs, and any
functionality beyond that is primitive at best. Programming in .BAT files
is a lot like programming using an old-style BASIC interpreter. As a
result, applications tend to be large and persistent, interacting directly
with the end user and not relying on any "features" of the shell.

On the other hand, the shells that come with UNIX-like operating systems
include a lot of useful functionality beyond invoking programs, and
programming them is more like programming in a procedural language like
C. The application programs tend to be small, organized as building blocks
with relatively little direct interaction with the user.

The point is, you're free to pick the shell you like, pretty much on any
operating system, that implements the philosophy you prefer. Or create your
own.

> Any competent application that is intended to run across platforms
> wouldn't make direct OS calls anyway. You use some kind of abstraction
> layer that presents the model your app wants to see. That layer is
> rewritten per OS to convert from the native interface to your portable
> interface.

Well, any modern OS, including Windows, already offers a choice of portable
API layers. If you stick with POSIX, you can run on any *ix or Windows
platform (and a lot of others). If you like the "native" Windows API, you
can run on *ix, for example, by using an adaptation layer like Wine.

I know you like the level of control you get by inventing and implementing
your own API layer, but that isn't really the best choice for most
programmers.

-- Dave Tweed

2008\06\14@162047 by olin piclist

Dr Skip wrote:
> You guys know a lot more about command line parsing than I do! I've
> come across another oddity perhaps someone could explain. Using the
> following:
>
> FOR /F %%G IN ('dir /b /a:-d O:\PICS\*.*') DO echo %%G

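Use the "delims=" option, so that FOR /F does not split each line at spaces:

FOR /F "delims=" %%G IN ('dir /b /a:-d O:\PICS\*.*') DO echo %%G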


2008\06\14@162949 by olin piclist

Dave Tweed wrote:
> The point is, you're free to pick the shell you like, pretty much on
> any operating system, that implements the philosophy you prefer. Or
> create your own.

It's not quite that simple.  There is a fundamental difference between
Windows and Unix in how information is passed from a launching process to
the launched process.  Different shells can dress this up differently on the
launching side.  The launched apps still have to deal with the inherent OS
differences on their end if they want to appear to work the same across
different OSs and shells on those OSs.



2008\06\15@080241 by Gerhard Fiedler

Olin Lathrop wrote:

> Gerhard Fiedler wrote:
>> Thanks for clearing this up. I think that in Windows, there is no
>> substitution in the arguments either when using CreateProcess(). It may
>> prepend a path to the executable name, though.
>>
>> It may be here where there is a chance of not handling quoting of paths
>> correctly when porting an application to Windows.
>
> No, since the application doesn't call CreateProcess on Windows or some
> form of exec on Unix.  The application receives what the system delivers
> on the other end.  Neither OS does any interpretation or substitution on
> the arguments passed to the CreateProcess/exec call.  

I'm not sure I made myself understood here. An application may call
CreateProcess; that's what it is for. And CreateProcess requires that the
first token, the program's path (which may be absolute or relative), is
properly quoted in the string passed to CreateProcess if it contains
spaces. Otherwise the OS tries each successively longer space-delimited
prefix of the string as the program name until it finds a matching
executable or runs out of tokens.
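
The search order can be modelled in a few lines of Python. This sketch does not call CreateProcess; it only lists the names that would be tried, in order, for an unquoted string. The function name candidate_program_names and the sample path are made up for the illustration, and details of the real API (such as appending ".exe" and stopping at the first file that exists) are left out:

```python
def candidate_program_names(command_line):
    """List the program names tried, in order, for an unquoted command line.

    Each candidate is one space-delimited word longer than the previous one.
    """
    words = command_line.split(" ")
    return [" ".join(words[:count]) for count in range(1, len(words) + 1)]

for name in candidate_program_names(r"c:\program files\sub dir\program name"):
    print(name)
```

This is why an unquoted path with spaces can end up launching a different executable than intended, and why quoting the program path is the safe choice.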

Gerhard

2008\06\15@081332 by Gerhard Fiedler

Dr Skip wrote:

> You guys know a lot more about command line parsing than I do! I've come
> across another oddity perhaps someone could explain. Using the
> following:
>
> FOR /F %%G IN ('dir /b /a:-d O:\PICS\*.*') DO echo %%G
>
> I should get each filename echoed on the screen (a prelude to bigger and
> better functions). The single quotes tell it to take the output of the
> dir cmd. No matter where I put dbl quotes, and I've tried it everywhere,
> any filenames with spaces drop everything after the first space.

Adding quotes to this command line doesn't help, because the splitting
happens when for parses each line of dir's output: by default it breaks the
line at spaces and tabs and hands only the first token to %%G. Setting the
for option "delims=" that Olin mentioned (an empty delimiter set) turns that
splitting off, so %%G receives the whole line, spaces included.
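
A small Python model of that behaviour, for illustration only. It mimics just the default "first token" rule of for /F and ignores its other options (tokens=, eol=, skip=); the function name for_f_first_token and the sample file name are made up:

```python
def for_f_first_token(line, delims=" \t"):
    """Return what %%G would receive for one line of command output.

    With the default delimiters (space and tab) only the first token
    survives; with an empty delimiter set the whole line is returned.
    """
    if not delims:
        return line
    stripped = line.lstrip(delims)       # leading delimiters are skipped
    for index, char in enumerate(stripped):
        if char in delims:
            return stripped[:index]      # cut at the first delimiter
    return stripped

print(for_f_first_token("My Holiday Photo.jpg"))             # default delims
print(for_f_first_token("My Holiday Photo.jpg", delims=""))  # "delims="
```

The first call returns only the leading word of the name; the second returns the name intact.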

Gerhard
