This is a lab exercise on developing secure software. For more information, see the introduction to the labs.
Practice validating specialized text input in a program using a regular expression.
In this exercise, we'll add some simple input validation to a server-side program written in JavaScript using the Express framework (version 4) and the express-validator library.
This time, though, we're going to validate a specific data value using a regular expression. Many programs have specialized data values that are easily tested using regular expressions.
The code below sets up handlers for a GET request on path /part. This code could be triggered, for example, by requesting http://localhost:3000/part?id=AX-794-7 (if the server were running at localhost and listening on port 3000). If there are no validation errors, the code is supposed to show the part id. If there is a validation error, it responds with HTTP status code 422 ("Unprocessable Content"), which indicates that the request was invalid for some reason, along with an error message.
In this case, we want to implement proper input validation for the id query parameter. We want to ensure that its value is no longer than a certain length and that it matches a specific pattern. As in lab input1, this program as written has a vulnerability we haven't discussed yet called Cross-site Scripting (XSS). This particular vulnerability would be entirely prevented if we did better input validation.
In this application, the part id format is always two uppercase Latin letters (each A through Z), then a dash (-), a sequence of one or more digits, another dash (-), and another sequence of one or more digits.
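As a rough illustration of this format, here is a minimal sketch in plain JavaScript, independent of Express and express-validator. The helper name `isValidPartId` and the constant names are hypothetical and are not part of the lab's code; the 80-character limit is the one stated in the exercise task. It is one possible way to express the rule, not necessarily the form the lab expects.

```javascript
// Hypothetical helper (not from the lab's code): checks a part id in plain JavaScript.
const MAX_PART_ID_LENGTH = 80; // limit taken from the exercise statement

// ^ and $ anchor the pattern so the whole string must match, not just a piece of it.
// Two uppercase letters, a dash, one or more digits, a dash, one or more digits.
const PART_ID_PATTERN = /^[A-Z]{2}-[0-9]+-[0-9]+$/;

function isValidPartId(id) {
  return typeof id === 'string' &&
    id.length <= MAX_PART_ID_LENGTH &&
    PART_ID_PATTERN.test(id);
}

console.log(isValidPartId('AX-794-7'));          // true
console.log(isValidPartId('ax-794-7'));          // false: lowercase letters
console.log(isValidPartId('AX-794-'));           // false: missing final digits
console.log(isValidPartId('AX-794-7<script>'));  // false: trailing text is rejected
```

Without the `^` and `$` anchors, the last input would be accepted, because the pattern would match the valid-looking prefix and ignore everything after it.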
To do that:
We also need to verify that the input matches our pattern, which we can check with a regular expression. In this situation, we can do this by:
Use the “hint” and “give up” buttons if necessary.
Please change the code below so the query parameter id is only accepted if it's no longer than 80 characters and meets this application's part id format requirement. The format is two uppercase Latin letters (each A through Z), then a dash (-), a sequence of one or more digits, another dash (-), and another sequence of one or more digits.