CGI / Perl: Before you begin...
Total freedom – and enough room for problems
In its default configuration Perl grants you a lot of freedom. As in the good
old days of BASIC, you may use variables and assign values to them before Perl
even knows that they exist. One might assume that this is no problem, since
only variables that have been introduced somewhere are used anyway, but what if
there is a typo somewhere?
The result is hard-to-track bugs that can easily drive you nuts. A branch that
is supposed to run only under certain circumstances may never be taken or, the
other way round, another branch always runs although its prerequisite wasn't
met.
If your script displays these symptoms, you have most likely mistyped a
variable name somewhere, be it where the variable is first assigned or later,
where it is tested.
The following piece of Perl code clarifies the problem:
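A minimal sketch of such a script, assuming the variable names $test and $tset and a placeholder greeting as the content:

```perl
#!/usr/bin/perl

# Assign a value to a variable without declaring it first.
$test = "Hello, world!\n";

# Typo: $tset instead of $test. Perl silently creates a second,
# undefined variable here and prints nothing.
print $tset;
```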
Here a variable is assigned a value that is supposed to be output afterwards. However, the output line contains a typo: the variable named there was never assigned, so it holds the implicit value undef, and that is what gets printed. In its default configuration Perl complains about neither of these, so a mistake that the compiler or interpreter of many other programming languages would reject produces completely unexpected results instead.
Another problem is variables whose content is undefined: using them causes
trouble because the required data is simply not available, and you are left
wondering what kind of result the script has actually returned.
A third source of trouble is always user input. If crucial checks are omitted
for the sake of convenience, user input is accepted as it comes, and a
malicious user can pass input to the script that makes it do something it was
never meant to do.
Insufficient checking is vulnerability number one as far as user input is
concerned. Input that is used to query a MySQL database could be crafted in
such a fashion that additional commands are passed along and executed as well.
The operator of the affected server would merely wonder why the database has
been corrupted or, even worse, why confidential data has been stolen and
strangers have gained access to areas of the server that were never opened to
them.
However, Perl provides enough means to avoid a great many of these problems
elegantly.
Discovering potential problems
When you start writing a Perl script you usually begin it with
#!/usr/bin/perl (if the path differs on your system, locate the Perl
interpreter with which perl and enter the path it reports instead). This line
tells the system which program has to run the script, so that you don't have to
call Perl explicitly. However, in its default configuration Perl is quite
taciturn: apart from obvious mistakes during compilation or execution you
receive few messages, and none at all when something is amiss but not severe
enough for Perl to abort. To get more information, you need to enable Perl's
warnings. You do so by extending the line that invokes the Perl interpreter so
that it looks like this: #!/usr/bin/perl -w.
Now when Perl trips over a potentially problematic spot, you will receive a
warning even though the script isn't aborted. You get a hint about what is
going wrong, and seemingly inexplicable problems turn out to be not that
inexplicable after all.
When you modify the above example accordingly, you are going to obtain this
piece of code:
Now, when you execute this code, Perl outputs this:
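A minimal sketch of the modified script, again assuming the names $test and $tset and a placeholder string. Only the first line differs from the version without warnings:

```perl
#!/usr/bin/perl -w

# Same assignment as before; the variable is still not declared.
$test = "Hello, world!\n";

# Same typo as before, but -w now makes Perl report it.
print $tset;
```

Run this way, Perl typically reports on standard error that main::test and main::tset are each used only once (with the hint "possible typo") and that an uninitialized value was used in print. The exact wording and order of the messages depend on the Perl version.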
Aha! Here Perl becomes a lot more forthcoming. We are told that Perl has found
two variables ($test and $tset), each of which is used only once, and that an
undefined value was output through $tset.
This gives you a clear indication that something isn't right and where to look
for the error. The only drawback: the script is executed nevertheless and still
produces the undesired results.
Preventing misspelled variable names
This is the other huge cause of problems in a Perl script. In a short script
typos may be found quite fast, but once the script grows longer and more
complex, things look very different: often only the conventional method helps,
that is, printing out the script and searching it for errors by hand, and even
then you risk missing some of them.
Another method is therefore needed, and here Perl comes to our aid once
again: you only have to include the following line at the head of your script:
use strict;
Now the code looks like this:
When you execute this code now, this happens:
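A sketch of this step under the same assumptions ($test, $tset, placeholder string). Placed at the top level of a script, the two faulty lines would make Perl abort during compilation. Purely for demonstration they are kept in a string here and compiled with eval, so that the surrounding script keeps running and can print the error text:

```perl
#!/usr/bin/perl -w
use strict;

# The faulty lines from the running example. Code compiled by a string
# eval inherits "use strict" from the surrounding script.
my $snippet = q{
    $test = "Hello, world!\n";
    print $tset;
    1;
};

# eval returns undef if the snippet does not compile; the reason is in $@.
my $compiled = eval $snippet;
print "Perl refused to compile the snippet:\n$@" unless $compiled;
```

Both $test and $tset are reported as global symbols that require an explicit package name, because neither of them has been declared.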
This directive forces Perl to accept only variables that have previously been declared. Because $test has not been declared so far, Perl detects an error and complains immediately. So we declare the variable $test explicitly. For this purpose Perl provides the keyword my. The code thus modified looks like this:
Executing this code yields this:
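A sketch of this intermediate state, with the same assumed names and placeholder string. The lines are again compiled through a string eval only so that the example can print the error instead of aborting; in a real script they would stand at the top level:

```perl
#!/usr/bin/perl -w
use strict;

# $test is now declared with "my", but the output line still has the typo.
my $snippet = q{
    my $test = "Hello, world!\n";
    print $tset;
    1;
};

# Only the undeclared $tset is left for strict to object to.
my $compiled = eval $snippet;
print "Perl refused to compile the snippet:\n$@" unless $compiled;
```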
This way we have gotten rid of the error message concerning the declaration of $test, but Perl still complains about the line containing the output. So this error has been found as well and can be corrected. Once it is corrected and the script is invoked again, it outputs this:
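A sketch of the fully corrected script, assuming the same placeholder string as content:

```perl
#!/usr/bin/perl -w
use strict;

# Declare the variable with "my" before using it.
my $test = "Hello, world!\n";

# The name is spelled correctly now, so the content is printed.
print $test;
```

With the placeholder string assumed here, the script prints Hello, world! and produces no warnings.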
The script now works and can be used without further problems. What looks
trivial in this example can turn into quite a scavenger hunt, especially when
the error sits in a convoluted part of the code, or when you have stared at the
code for so long that you only see what you want to see and no longer notice
typos.
If you have told Perl to watch out for these kinds of problems, it notifies you
of them and, if necessary, aborts compiling the script altogether. On top of
that you receive helpful messages that name the variable and the line in
question, so that seemingly hard-to-find bugs become obvious, can be fixed
quickly and are much less scary.
Taint mode
Where the two aforementioned methods help you to discover potential errors
in the script and fix them, this mode covers a part of Perl that becomes
relevant only at runtime but is just as capable of causing trouble: user input.
Because there will always be someone who tries to abuse a script that accepts
user input for malicious purposes, it is necessary to sanitize this input as
thoroughly as possible so that no mischief can be caused with it.
This is where Perl's taint mode comes into play. It is enabled with the -T
switch (for example #!/usr/bin/perl -wT) and marks all user input as
potentially problematic. This is no problem as long as the data is used
exclusively inside the script, for calculations and the like, where it cannot
do any harm. Only when the data is about to be used for a potentially
harmful operation (e.g. opening a file in order to write to it) does Perl
abort with an error. This way a whole class of misuse of a Perl script can be
elegantly prevented.
To make this data usable, it first has to be “cleaned”, that is, any
problematic parts are removed and only the sanitized data is used afterwards.
This way you can greatly reduce the risk that something goes awry, even when
interaction with users is required.
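As a hedged sketch of what such cleaning can look like: under taint mode, data counts as clean once it has been extracted through a regular-expression capture. The helper name clean_filename and the rule for what counts as an acceptable file name are assumptions made for this illustration, not something Perl prescribes:

```perl
#!/usr/bin/perl -w
# To run in taint mode, add T to the switches (-wT) and also pass -T
# when calling perl on the command line.
use strict;

# Hypothetical helper: accept only short, simple file names without
# path separators or shell metacharacters.
sub clean_filename {
    my ($input) = @_;
    return unless defined $input;

    # The captured group $1 is what taint mode regards as clean.
    if ($input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]{0,63})\z/) {
        return $1;
    }
    return;    # reject everything that does not match the pattern
}

# Try one harmless and two hostile inputs.
for my $candidate ("report.txt", "../../etc/passwd", "notes; rm -rf /") {
    my $name = clean_filename($candidate);
    print defined $name ? "accepted: $name\n" : "rejected: $candidate\n";
}
```

The pattern lists what is allowed rather than what is forbidden, so input nobody thought of in advance is rejected by default.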