The API could even be a more modern pointer+length interface rather than null termination, to sidestep that class of mistakes/exploits (CWE-170).
https://www.daviddeley.com/autohotkey/parameters/parameters.... is a great read on how fragmented this all seems to be.
> Spaces embedded in strings may cause unexpected behavior; for example, passing _exec the string "hi there" will result in the new process getting two arguments, "hi" and "there".
https://learn.microsoft.com/en-us/cpp/c-runtime-library/exec...
Oh yeah, pass those arguments as a list, then we'll completely ignore that and fuck your shit up. Err, I mean you need to quote them! Even though they are passed as separate arguments.
I fail to see how this would help. If I understand correctly, the issue is how cmd.exe interprets the args, not how the args get to it.
- Create `CreateProcessArgv`, a version of `CreateProcess` that takes `argv` rather than `lpCommandLine` (like `execv*`)
- Create `GetCommandLineArgv`, an alternative to `GetCommandLine` that returns an `argv`
- Create `ProcessCreatedWithArgv` so a program can prefer either `GetCommandLine` or `GetCommandLineArgv` (for compatibility with those that have their own quoting, such as cmd)
Then child processes can use `GetCommandLineArgv` with no overhead if the parent invoked with `CreateProcessArgv`, otherwise `CreateProcess` and `GetCommandLine` will continue to work with no overhead. There would be a compatibility layer in the kernel to either split `lpCommandLine` or quote `argv` for `CreateProcess`+`GetCommandLineArgv` or `CreateProcessArgv`+`GetCommandLine` combinations. Probably need a way to opt out of taking `lpCmdLine` in `WinMain`.
Seems not-impossible, but also a bit of a pipe dream...
This "usual escaping mechanism" is a bit of a weasel word. Windows passes a single null-terminated character string to a process. Every application run-time must parse that into arguments itself.
I think what "usual escaping mechanism" refers to is the algorithm implemented in the Microsoft Visual C Run Time which takes the command line string and produces a char *argv[] for the main function.
There is no telling what uses that exact algorithm and what doesn't. Programs built with Microsoft languages probably do; obviously VC and VC++.
https://learn.microsoft.com/en-us/windows/win32/api/shellapi...
https://learn.microsoft.com/en-us/cpp/cpp/main-function-comm...
Also, here is another implementation of this algorithm in C#, used in .NET:
https://github.com/dotnet/runtime/blob/main/src/libraries/Sy...
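To make the "usual" algorithm concrete, here is a rough Python sketch of the backslash and double-quote rules the Microsoft C runtime documents for building argv. It is a simplified model, not the CRT's actual code: the function name `split_crt_style` is made up for illustration, it ignores the special handling of the program name (argv[0]), and it does not implement the `""`-inside-quotes rule.

```python
def split_crt_style(cmdline):
    """Split a command-line string into arguments, CRT style (simplified).

    Assumptions: the program name has already been removed, and the
    doubled-quote ("") rule inside quoted strings is not modelled.
    """
    args, cur = [], []
    in_quotes = False
    have_arg = False  # True once the current argument has started
    i, n = 0, len(cmdline)
    while i < n:
        c = cmdline[i]
        if c == "\\":
            # Count the whole run of backslashes.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # 2n backslashes + quote   -> n backslashes, quote is a delimiter
                # 2n+1 backslashes + quote -> n backslashes, literal quote
                cur.append("\\" * (count // 2))
                if count % 2:
                    cur.append('"')
                    j += 1  # consume the escaped quote
            else:
                # Backslashes not followed by a quote are literal.
                cur.append("\\" * count)
            have_arg = True
            i = j
        elif c == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif c in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(cur))
                cur, have_arg = [], False
            i += 1
        else:
            cur.append(c)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(cur))
    return args


print(split_crt_style(r'"a b c" d e'))  # ['a b c', 'd', 'e']
```

A program that parses its command line by any other rules will disagree with this on some inputs, which is the point of the comment above.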
On Linux the parsing is done on the caller's side of the interface, so the caller knows what quoting rules to apply – could be Python's rules if they're using Python to construct the argument array, bash's rules if they're using bash, etc.
On Windows, the parsing is done on the receiver's side of the interface, so the caller can't know how it's supposed to be quoted unless they have special knowledge of a specific receiver and its parsing rules.
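For illustration, this is roughly what the caller has to do on Windows when it assumes the receiver follows the C runtime's rules. It is a sketch under that assumption only; `quote_arg` and `build_command_line` are hypothetical names, and the result is not safe for receivers with other parsers (cmd.exe being the obvious one).

```python
def quote_arg(arg):
    """Quote one argument for a receiver that parses like the MS C runtime."""
    # Nothing to do if there is no whitespace or quote (and it is non-empty).
    if arg and not any(ch in arg for ch in ' \t"'):
        return arg
    out = ['"']
    backslashes = 0  # length of the current run of backslashes
    for ch in arg:
        if ch == "\\":
            backslashes += 1
        elif ch == '"':
            # Double the pending backslashes, then escape the quote itself.
            out.append("\\" * (2 * backslashes + 1) + '"')
            backslashes = 0
        else:
            # Backslashes before an ordinary character stay literal.
            out.append("\\" * backslashes + ch)
            backslashes = 0
    # Double trailing backslashes so they do not escape the closing quote.
    out.append("\\" * (2 * backslashes) + '"')
    return "".join(out)


def build_command_line(args):
    """Join already-separate arguments into a single command-line string."""
    return " ".join(quote_arg(a) for a in args)


print(build_command_line(["hi there", "plain"]))  # "hi there" plain
```

On Linux none of this exists in the interface: the array goes to `execve` as-is, and quoting is purely a matter of whatever syntax the caller used to write the array down.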
I could see how allowing the user to whitelist individual scripts would make sense, but as far as I can tell that's not how it works? A blanket policy of "all scripts are forbidden unless wrapped with fragile and shady-looking hacks" doesn't seem particularly useful.
But I guess here we have some of the underlying problem.
If something just executes whatever you throw at it, people complain.
If something doesn't just execute whatever you throw at it, people complain as well. ;)
Sadly some software I use is so old that the only way to call PowerShell scripts is via a batch script...
> bad, but not the worst
For example imagine that I have a shell script to write an entry to a guestbook. Maybe I call it from my webapp like this:
    # webapp.py
    import subprocess
    subprocess.run(['guestbook', untrusted_msg])
On Linux this is perfectly fine. I can then write my guestbook script like:
    #!/bin/bash
    echo "$1" >> guestbook.txt
As far as I am aware there are no security issues here. The user can pass whatever they want as the message and other than some mess in the `guestbook.txt` file they can't cause any harm.
However this doesn't work well on Windows because in order to escape the arguments you need to know how the `guestbook` program parses its arguments. Right now basically all languages assume that the program being called parses its command line the way `CommandLineToArgvW` does. However if `guestbook` is a batch file a different parsing mechanism is used and remote code execution can occur before the batch script even starts executing.
Basically in order to properly escape the arguments the caller needs to know what is being called. The current APIs don't have a way to know this so they can't do it right in all cases.
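As a sketch of what "knowing what is being called" might look like on the caller's side, here is a conservative guard that refuses to hand risky arguments to something that looks like a batch file. Everything here is an assumption for illustration: the helper names are hypothetical, the extension check is only a heuristic (a bare name like `guestbook` may still resolve to `guestbook.bat`, so a real caller would have to resolve the path first), and the metacharacter set is my own conservative guess rather than an official list.

```python
import os

# Assumption: characters cmd.exe may treat specially; deliberately conservative.
CMD_METACHARS = set('&|<>^%!"()\r\n')
BATCH_EXTENSIONS = {".bat", ".cmd"}


def looks_like_batch_file(program):
    """Heuristic only: judge the target by its file extension."""
    return os.path.splitext(program)[1].lower() in BATCH_EXTENSIONS


def check_args_for_target(program, args):
    """Raise ValueError if an argument looks unsafe for a batch-file target."""
    if not looks_like_batch_file(program):
        return  # CRT-style quoting is assumed to be adequate here
    for arg in args:
        if CMD_METACHARS.intersection(arg):
            raise ValueError(
                f"refusing to pass {arg!r} to batch file {program!r}"
            )


check_args_for_target("guestbook.exe", ["a & b"])       # accepted
check_args_for_target("guestbook.bat", ["hello world"])  # accepted
```

Rejecting input is cruder than escaping it, but it avoids having to get cmd.exe's parsing rules exactly right.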
I was just trying to write a simple batch script that accepted filenames as arguments and was surprised to find that there is no safe way to do so, as they're always passed through shell expansion, so if you have a filename like "foo %PATH% bar.txt" (which is allowed) the script will receive it with the PATH variable expanded and cannot get at the actual filename.
Also, passing arguments to programs is unsafe on Windows even if you don't go through the shell, because the quoting rules are entirely up to the program being invoked. The CreateProcess function[0] accepts a string, not an array, so you have to quote the arguments – but you can't do this quoting correctly unless you know exactly what program you're invoking and what grammar it has chosen for parsing its lpCommandLine string.
The article mentions that "many programming languages guarantee that the command arguments are escaped properly", but there is no universal "escaped properly" on Windows. There is escaped properly for the C runtime's parser[1], or escaped properly for CommandLineToArgv[2] which parses "in a way that is similar to" the C runtime, or escaped properly for .NET which has its own set of rules[3] – but there is no guarantee that any particular program is using any of these ways; any program can use whatever rules it likes!
Raymond Chen has written[4] about this as well.
PowerShell has an interesting workaround[5] of sorts: If you specify "-EncodedCommand" and "-EncodedArguments" it lets you pass base64-encoded strings when you "require complex quotation marks or curly braces".
[0] https://learn.microsoft.com/en-us/windows/win32/api/processt...
[1] https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-c...
[2] https://learn.microsoft.com/en-us/windows/win32/api/shellapi...
[3] https://learn.microsoft.com/en-us/dotnet/api/system.environm...
[4] https://devblogs.microsoft.com/oldnewthing/20100917-00/?p=12...
[5] https://learn.microsoft.com/en-us/powershell/module/microsof...
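For the PowerShell workaround, the value given to `-EncodedCommand` is the script text encoded as UTF-16LE and then base64. A minimal sketch of building it from Python (the function names are mine, not part of any API):

```python
import base64


def encode_powershell_command(script):
    """Return the base64 text that -EncodedCommand expects (UTF-16LE input)."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


def decode_powershell_command(encoded):
    """Inverse of encode_powershell_command, handy for checking the result."""
    return base64.b64decode(encoded).decode("utf-16-le")


encoded = encode_powershell_command("dir")
print(encoded)  # ZABpAHIA
```

The result contains only base64 characters, so it survives whatever command-line parsing happens on the way to `powershell.exe -EncodedCommand <encoded>`.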
Here's my try in ruby from a prior life: https://github.com/chef/mixlib-shellout/blob/main/lib/mixlib...
In practice, with the exception of `cmd.exe`, which is an old beast that cannot be redeemed due to backwards compatibility, there is a consistent way to round-trip argv to more-or-less all programs one encounters in the wild. It's not a guarantee and I'm sure you could find a program which does something weird, but you could find the same in the POSIX world. In both cases, we can probably agree that it's the mistake of the program that it's parsing arguments in a non-standard way.