Solved How to strip invalid characters from SET /P user input

October 1, 2020 at 11:38:55
Specs: Windows 10, 6GB
Greetings. Before I get into the nitty-gritty, I want to say up front that I am nowhere near an expert batch file writer. I am a high-level amateur at best, so a lot of the methods I use to accomplish certain tasks in my batch files may not be optimal. For my personal use, though, I prioritize functional over optimal. Please forgive the crudity of any code I might post in future replies: even if it's not "the right way" to do it, I've learned to stick with it because "eh, it still does what I want it to".

I'm currently in the process of building a rudimentary inventory management system for myself using a .CMD batch file running under Windows 10 (release 20H1 as of the time of this post). The primary purpose of this batch file is to allow me to scan items into the computer using a barcode scanner, enter relevant information (description, price, etc.), and then save it to a "database" (really just a folder where the data for each individual item is saved). Then, whenever needed, it'll allow me to export the data for every single item into a single CSV file, which can then be imported into a separate Point-Of-Sale system.

The core functionality stated above is complete and working as intended. Now, I'm at the phase where I'm adding optional features of convenience, and fixes for potential problems. One of those latter fixes I want to add is user input sanitization. Two main reasons I want to add this:

-To prevent arbitrary command execution
-To filter out invalid filename/path characters when asking where to save a file

Basically, the ONLY characters I want the user to be able to enter are letters, numbers, decimal points (periods), dashes, spaces, and backslashes when asking for a filename/path. No other characters. Even though I'm likely the only person on the entire planet who will be using this script, I still wish to add input sanitization as a "just in case" feature.

My first thought was to pass every user input through a subroutine that simply runs multiple commands along the lines of "SET UINPUT=%UINPUT:x=%" (with "x" being the different illegal characters, one per iteration of the SET command). But, this obviously raises some issues with characters that require escaping, not to mention * which can't even be filtered using this method.

My next thought was to do something using the FINDSTR command. Perhaps if I can't filter out illegal characters, maybe I can at least use FINDSTR to trigger an alert and allow the script to jump back and ask for the input to be re-entered. The problem I've discovered with this method though is that FINDSTR only works on files, and not directly on variables (or if it does, I don't know how). Also, if I ECHO the variable out to a file for FINDSTR to run on, it still poses the risk of arbitrary command execution.

I know I can call certain PowerShell commands from a batch file, so I've wondered if perhaps there's something I can do with PowerShell to achieve this...feed the variable with the user input into a PowerShell script that'll then crunch through it, strip out the illegal characters (or at the very least trigger an alert that allows it to ask for the input again), and then pass it back to the batch file before the input is processed any further. I've not yet explored this in depth though, because my knowledge of PowerShell is almost nonexistent (I acknowledge its superiority over the ancient Command Prompt, but its perceived complexity is why I've stuck with regular batch files).

So, I was wondering if anyone else knew of a good way to remove (or at the very least detect and prevent execution of) unwanted characters in user input in a batch file?

Thank you.

message edited by lagrange.rsf




#1
October 1, 2020 at 21:21:09
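This requires VBScript; treat it as a test prototype.
:: begin batch-script "filter.bat"
@echo off & setlocal
:1
:: print the prompt without waiting for input; the VBScript reads the line itself
set /p=enter: <nul
:: clear the old value so an empty result is not mistaken for the previous one
set "xx="
:: run under cscript (the console host) so WScript.StdIn is available
for /f "tokens=*" %%a in ('cscript //nologo filt.vbs') do set "xx=%%a"
echo final result: %xx%
goto :1
::---------- end test-snippet batch-script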

'begin vbscript "filt.vbs"
k=wscript.stdin.readline
bad="~`!@#$%^&*()_+=?<>|/,{}[]:;"
for i=1 to len(bad)
k=replace(k,mid(bad,i,1),"")
next
wscript.echo k
'----------- end vbscript "filt.vbs"

Batch just can't do it on its own, and VBScript required some acrobatics. Batch couldn't even pass the input to the VBS because of the possibility of double quotes, which messed up the delivery. So the VBS has to handle collecting the input, while batch does the prompting. This still doesn't really address the entry of "garbage" filenames. Lots of things might qualify that you might not want in your directories, for example: ..--\..\ \\. And remember that . represents the current directory and two dots the parent directory, so that could send things where they're not meant to be (in the above example, it would go to the parent of the current directory and have a real mucked-up filename, if any). That sends us back to regexp to do pattern checking.
Without knowing more, I can't help much more, but IF the subdir is supposed to already exist, the best bet would be to test that it exists, THEN take the filename and restrict it to [a-z0-9-]. First, separate the entered data into path, filename, and extension using the FOR-variable modifiers %%~p, %%~n, and %%~x, then analyze each: first, the path must exist; second, the name and extension are tamed of all garbage. I'm "old school", so my extensions are always 3 bytes and my filenames have no spaces, but of course that is all a matter of preference. The above scripts appear to do what you specified, but you can still get some real garbage.
PS, you CAN submit a batch variable to findstr directly:
echo %x% | findstr "something"...
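
The "separate path, name, and extension, then check each" idea from this reply could also be sketched in PowerShell, which the original post mentions as an option. This is only a rough illustration under assumptions: the function name Test-UserPath is made up, the directory part is limited to the characters the original post lists as allowed, and the 0-to-3 character extension follows the "old school" preference above.

```powershell
# Hypothetical validator for the "split, then check each part" idea.
function Test-UserPath {
    param([string]$Text)

    # Split at the last backslash: everything before it is the directory.
    # With no backslash at all, assume the current directory.
    $slash = $Text.LastIndexOf('\')
    $dir   = if ($slash -ge 0) { $Text.Substring(0, $slash) } else { '.' }
    $leaf  = $Text.Substring($slash + 1)

    # Split the leaf at the last period into name and extension.
    $dot  = $leaf.LastIndexOf('.')
    $name = if ($dot -ge 0) { $leaf.Substring(0, $dot) } else { $leaf }
    $ext  = if ($dot -ge 0) { $leaf.Substring($dot + 1) } else { '' }

    # Directory part: only letters, digits, space, period, backslash, hyphen,
    # and no ".." segment that would climb to a parent directory.
    if ($dir -notmatch '^[A-Za-z0-9 .\\-]+$') { return $false }
    if (($dir -split '\\') -contains '..') { return $false }

    # The directory must already exist.
    if (-not (Test-Path -LiteralPath $dir -PathType Container)) { return $false }

    # Name: one or more of [a-z0-9-]; extension: zero to three of [a-z0-9].
    # -match is case-insensitive by default, so uppercase letters pass too.
    return ($name -match '^[a-z0-9-]+$') -and ($ext -match '^[a-z0-9]{0,3}$')
}

Test-UserPath 'stock-list.csv'      # -> True  (current directory, clean name)
Test-UserPath '..\stock-list.csv'   # -> False (parent-directory segment)
Test-UserPath 'stock&list.csv'      # -> False (disallowed character in the name)
```

A colon is deliberately not accepted in the directory part here, so a drive letter such as C: would be rejected; that would need to be added to the class if full paths are wanted.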

message edited by nbrane



#2
October 2, 2020 at 03:02:31

line: 2
char: 1
error: not enough storage is available to process this

=====================

M2



#3
October 2, 2020 at 07:37:28
✔ Best Answer
You can try this batch file, which removes special characters from a string using a regular expression (regex) in PowerShell:

@echo off
SetLocal EnableDelayedExpansion
Title Remove special characters from a string using Regular Expression (Regex)
echo( Type any characters here :
set /P OLD_String=
echo( OLD String = !OLD_String!
@for /f "delims=" %%a in ('Powershell -C '"!OLD_String!"' -replace '[\W]',""') do ( Set "New_String=%%a" )
Echo( New String = !New_String!
Pause & Exit

message edited by Hackoo



#4
October 2, 2020 at 13:13:03
That regex script does succeed in stripping out invalid characters (in fact it works a little too well, as it also removes characters I want to allow, such as spaces, hyphens, and periods). HOWEVER, while experimenting, I discovered it still allows arbitrary command execution under certain circumstances. It appears to be hard to trigger, though, since processing a string with just an & in it doesn't cause it. But I was able to get it to run the DIR command and output the file list to a text file by feeding it the following string:

this is a test. comma, hyphen- & DIR /W>test998.txt

This might be the solution I go with in the meantime though, because at the moment, stripping out invalid characters as a means of preventing the script from breaking itself is more important than stripping out invalid characters to prevent arbitrary command execution (the former seems far more likely than the latter, considering most of my coworkers barely know how to even use a computer, let alone mess one up with console commands).

Now I just gotta figure out how to stop it from stripping out the characters I still want to allow. Thanks for pointing me in the right direction!
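
One possible way to keep the wanted characters, offered as a rough sketch rather than a tested drop-in: instead of matching \W, negate a class that lists only what is allowed. The function name Remove-DisallowedChars and the variable name UINPUT are made up for illustration. Reading the text from an environment variable, rather than pasting it into the PowerShell command line, is an assumption about how the batch side would hand the input over.

```powershell
# Hypothetical helper: strip every character that is NOT on the allow-list.
# Allowed here: ASCII letters, digits, space, period, backslash, hyphen.
function Remove-DisallowedChars {
    param([string]$Text)
    # [^...] negates the class; \\ is a literal backslash, and the hyphen
    # is placed last so it is read as a literal rather than a range.
    return $Text -replace '[^A-Za-z0-9 .\\-]', ''
}

# Assumed hand-off: the batch file has already run SET /P UINPUT=...
# and PowerShell inherits that variable from the calling process.
Remove-DisallowedChars $env:UINPUT

# The test string from this post:
Remove-DisallowedChars 'this is a test. comma, hyphen- & DIR /W>test998.txt'
# -> this is a test. comma hyphen-  DIR Wtest998.txt
```

Because the colon is not on the list, a drive letter such as C: loses its colon; it would have to be added to the class if full paths are expected.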



#5
October 2, 2020 at 23:32:39
Refer to this: PowerShell - Remove special characters from a string using Regular Expression (Regex).
You can add or remove characters from the bad-character pattern as needed, like this:
$BadStringPattern = '[~`!@#$%^&*()-+=?<>\\\/|/,;:.{}\[\]:;€$¥£¢]'

So in PowerShell, you can write something like this:

# Single quotes keep the backtick and the $ signs literal
$BadStringPattern = '[~`!@#$%^&*()-+=?<>\\\/|/,;:.{}\[\]:;€$¥£¢]'
$String = 'Ha!?!#@$%^&*()+\|}{<>??/ck€$¥£¢oo - _ -\^$.|?*+()[{ 0123456789'
# Remove every character that matches the class
$NewString = $String -replace $BadStringPattern, ''
$NewString
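
One detail worth knowing when editing a class like the one above: an unescaped hyphen between two characters is read as a range. In the pattern, ")-+" therefore covers ), * and +, and a literal hyphen is not matched. A small sketch with shortened, made-up classes to isolate that behaviour:

```powershell
# Shortened, hypothetical classes to isolate the hyphen behaviour.

# ")-+" is a range covering ) * + so the hyphen in the input survives.
'a-b*c' -replace '[()-+]', ''
# -> a-bc

# "\-" is a literal hyphen, so the hyphen is removed and * is kept.
'a-b*c' -replace '[()\-+]', ''
# -> ab*c
```

So to strip hyphens, escape the hyphen or move it to the end of the class; to keep them, the range form already leaves them alone.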

message edited by Hackoo



#6
October 7, 2020 at 06:53:11
Did anybody get the VBS to work?

=====================

M2

