What to Expect From PHP 5.5

The first PHP 5.5 alpha has been publicly released. After having some time to test and experiment with it, we can now bring you our in-depth overview of what to look forward to!


Installation

If you'd like to follow along with this article, you'll need to install PHP 5.5 for yourself. You can find the link to the source bundle here. Additionally, if you need the Windows binary file, you can grab it from the same page.

Once you have the source code downloaded, extract everything into a folder and navigate to it with your favorite Terminal program. You can install PHP to wherever you like, but, for convenience, I'm going to create a directory in the root of my drive, called PHP5.5. To create a new folder and then make your user the owner of said folder, type the following into your terminal:

sudo mkdir /PHP5.5
sudo chown username /PHP5.5

Next, you have to decide which extensions and features you want installed with your copy of PHP. Since this is an Alpha version, meant for testing only, I'm not going to worry about making it fully featured. For this tutorial, I am only going to install cURL, but there might be other things that you'd want to add, such as MySQL or zip support. To view a full list of configuration options, run:

./configure -h

Besides the option to install cURL, there are two other options that we need to set: prefix and with-config-file-path. These two options set the location of the PHP installation and the .ini file, respectively. So in the terminal, type:

./configure --prefix=/PHP5.5 --with-config-file-path=/PHP5.5 --with-curl=ext/curl

make

make install

The first line configures PHP with cURL, and sets the location to the folder we made earlier. The next two lines build PHP and move the files to the specified directory. The next step is to copy the sample php.ini file to the PHP folder. To do this, run:

cp php.ini-development /PHP5.5/php.ini

You should now have everything installed correctly. The easiest way to test this new version out is to run its built-in web server. Navigate to the bin folder inside the PHP5.5 directory (cd /PHP5.5/bin), and type ./php -t /Path/To/Directory -S 0.0.0.0:4444.

  • The -t option sets the server's root directory (i.e. the location where you will place your PHP files)
  • The -S option sets the IP address and port number that the server should bind to. Using 0.0.0.0 for the IP address tells the server to listen on all IP addresses (e.g. localhost, 127.0.0.1, 192.168.0.100, etc.).

If all goes well, you should be greeted with a message telling you that the server is listening on the IP/port specified, and it will tell you the document root where it's serving from. We can now start toying with the new features!
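Before moving on, a quick sanity check can confirm that the server is really running the new build. The file below is only a sketch: the file name index.php and the version string "5.5.0alpha1" used for the comparison are my own assumptions, not something the installer creates for you.

```php
<?php
// Hypothetical index.php, saved in the document root passed to the -t option.
// version_compare() understands pre-release suffixes such as "alpha1".
$isNewEnough = version_compare(PHP_VERSION, "5.5.0alpha1", ">=");

echo "Running PHP ", PHP_VERSION,
     $isNewEnough ? " (5.5 features available)" : " (older than 5.5)", "\n";
```

Loading the page in your browser on the port you picked should show the version of the build you just compiled.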


Generators

The biggest addition to PHP 5.5 has got to be generators. Generators allow you to create custom functions, which retain state between runs. It works off a new keyword, called yield, which can be used in a function for both inputting and outputting data.

Essentially, when the function gets to a line that contains the yield command, PHP will freeze the function's execution and go back to running the rest of your program. When you call for the function to continue - either by telling it to move on or by sending it data - PHP will go back to the function and pick up where it left off, preserving the values of any local variables that were defined up to there.

This may sound somewhat cool at first, but if you give it some thought, this opens doors to a lot of interesting design options. Firstly, it simulates the effects of other programming languages that have "Lazy Evaluation," like Haskell. This alone allows you to define infinite data sets and model math functions after their actual definition. Besides that, you don't have to create as many global variables; if a variable is only meant for a specific function, you can include it in a generator, and things like counters happen automatically by the generator itself in the form of the returned object's key.

Well that's enough theory for now; let's take a look at a generator in action. To start off, navigate to the document root you defined when running the PHP server, and create a file, called "index.php". Now, open up the file and type in the following:

function fibonacci(){
    $a = 0;
    $b = 1;

    while(true)
    {
        $a = $a + $b;
        $b = $a - $b;
        yield $a;
    }
}

This is the "Hello World" function of infinite datasets. It's a function that will output all fibonacci numbers. To use the generator, all you have to do is type:

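$fib = fibonacci();
echo $fib->current(), "\n"; // prints 1, the first yielded value
$fib->next();               // resume the function until the next yield
echo $fib->current(), "\n"; // prints 1, the second yielded value
//...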

What's happening here is we're making $fib a generator object, which gives you access to its underlying methods, like current() and next(). The current() function returns the current value of the generator; this is the value of whatever you yielded in the function - in our case, $a. You can call this function multiple times and you will always get the same value, because the current() function doesn't tell the generator to continue evaluating its code. That's where the next() function comes in; next() is used to unfreeze the generator and continue on with the function. Since our function is inside an infinite while loop, it will just freeze again at the next yield command, and we can get the next value with another call to current().

If you needed to accomplish something like this in the past, you would have to write some kind of for loop which pre-calculates the values into an array, and halts after a certain number of iterations (e.g. 100), so as not to overload PHP. The benefit to generators is that the local variables are persistent, and you can just write what the function is supposed to do, as opposed to how it should do it. What I mean by this is that you simply write the task and don't worry about global variables, or how many iterations should be performed.
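To make the "counter for free" idea concrete, here is a minimal sketch that loops over the generator with foreach and lets the caller decide when to stop. The helper name firstFibonacci() is my own, purely for illustration; the generator is repeated so the snippet runs on its own.

```php
<?php
// Same generator as above, repeated so this sketch is self-contained.
function fibonacci() {
    $a = 0;
    $b = 1;

    while (true) {
        $a = $a + $b;
        $b = $a - $b;
        yield $a;
    }
}

// Hypothetical helper: collects the first $count values.
// The generator supplies the keys 0, 1, 2, ... automatically,
// so we don't need a counter variable of our own.
function firstFibonacci($count) {
    $result = array();
    foreach (fibonacci() as $index => $value) {
        if ($index >= $count) {
            break; // the caller, not the generator, decides when to stop
        }
        $result[] = $value;
    }
    return $result;
}

echo implode(", ", firstFibonacci(10)), "\n"; // 1, 1, 2, 3, 5, 8, 13, 21, 34, 55
```

The generator itself stays infinite; only as many values are computed as the loop actually asks for.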

The other way yield can be used is to receive data. It works in the same way as before: when the function gets to a line with the yield keyword, it will stop, but, instead of reading data with current(), we will give it data with the send() function. Here is an example of this in action:

function Logger(){
    $log_num = 1;

    while(true){
        $f = yield;
        echo "Log #" . $log_num++ . ": " . $f;
    }
}

$logger = Logger();

for($i = 0; $i<10; $i++){
    $logger->send($i*2);
}

This is a simple generator for displaying a log message. All generators start off in a paused state; a call to send (or even current) will start the generator and continue until it hits a yield command. The send command will then enter the sent data and continue to process the function until it comes to the next yield. Every subsequent call to send will process one loop; it will enter the sent data into $f, and then continue until it loops back to the next yield.
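A single yield can also do both jobs at once: hand a value out and take one in. The following is a small sketch of that, using a made-up generator called runningTotal(); it isn't part of the logger above.

```php
<?php
// Hypothetical generator: keeps a running total between calls.
function runningTotal() {
    $total = 0;

    while (true) {
        // yield hands $total to the caller and pauses here;
        // whatever is passed to send() becomes the value of $amount.
        $amount = yield $total;
        $total += $amount;
    }
}

$totals = runningTotal();
$totals->current();            // runs the generator to its first yield (total is 0)
echo $totals->send(5), "\n";   // 5
echo $totals->send(10), "\n";  // 15
```

Note that send() returns whatever the generator yields next, so there's no need for a separate call to current() after each send().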

So why not just put this into a regular function? Well, you could, but, then, you would either need a separate global variable to keep track of the log number, or you would need to create a custom class.

Don't think of generators as a way to do something that was never possible, but rather as a tool to do things faster and more efficiently.

Even infinite sets were possible, but it would need to reprocess the list from the beginning each time (i.e. go through the math until it gets to the current index), or store all of its data within global variables. With generators, your code is much cleaner and more precise.


Lists in foreach Statements

The next update that I found to be quite helpful is the ability to break a nested array into a local variable in a foreach statement. The list construct has been around for a while (since PHP4); it maps a list of variables to an array. So, instead of writing something like:

$data = array("John", "Smith", "032-234-4923");

$fName = $data[0];
$lName = $data[1];
$cell = $data[2];

You could just write:

$data = array("John", "Smith", "032-234-4923");

list($fName, $lName, $cell) = $data;

The only problem was that, if you had an array of arrays (a nested array) of info you wanted to map, you weren't able to cycle through them with a foreach loop. Instead, you would have to assign the foreach result to a local variable, and then map it with a list construct only inside the loop.

As of version 5.5, you can cut out the middle man and clean up your code. Here's an example of the old way, versus the new:

//--Old Method--//
foreach($parentArr as $childArr){
	list($one, $two) = $childArr;
	//Continue with loop
}

//--New Method--//
foreach($parentArr as list($one, $two)){
	//Continue with loop
}

The old way might not seem like too much trouble, but it's messy and makes the code less readable.
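Here is the new syntax again as a small runnable sketch with some actual data. The $people array and the variable names are made up for the example.

```php
<?php
// Made-up sample data: each child array holds a first and last name.
$people = array(
    array("John", "Smith"),
    array("Jane", "Doe"),
);

$names = array();

// Each child array is unpacked straight into $fName and $lName.
foreach ($people as list($fName, $lName)) {
    $names[] = $lName . ", " . $fName;
}

echo implode("; ", $names), "\n"; // Smith, John; Doe, Jane
```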


Easy Password API

Now this one requires a little knowledge of hashes and encryption to fully appreciate.

The easiest way to hash a password in PHP has been to use something like MD5 or a SHA algorithm. The problem with hash functions like these is that they are incredibly easy to compute. They aren't useful anymore for password security. Nowadays, they should only be used for verifying a file's integrity. I installed a GPU hasher on my computer to test out this claim. On my Mac's built-in graphics card, I was able to go through over 200 million hashes a second! If you were dedicated, and invested in a top of the line multi-graphics card setup, you could potentially go through billions of hashes a second.

The technology for these methods was not meant to last.

So how do you fix this problem? The answer is you impose an adjustable burden on the algorithm. What I mean by this is that you make it hard to process. Not that it should take a couple of seconds per hash, as that would ruin the user's experience. But, imagine that you made it take half a second to generate. Then, a user likely wouldn't even realize the delay, but someone attempting to bruteforce it would have to run through millions of tries - if not more - and all the half seconds would add up to decades and centuries. What about the problem of computers getting faster over time? That is where the "adjustable" part comes in: every so often, you would raise the complexity to generate a hash. This way, you can ensure that it always takes the same amount of time. This is what the developers of PHP are trying to encourage people to do.

The new PHP library is a hashing "workflow," where people are able to easily hash, verify and upgrade passwords and their respective complexities over time. It currently only ships with the bcrypt algorithm, but the PHP team have added an option, named default, which you can use. With it, new hashes will be created with the most secure algorithm available as the team adds new ones, and the rehash check will flag older hashes so that you can upgrade them.

The way bcrypt works is it runs your password through blowfish encryption x amount of times, but instead of using blowfish with a key so that you could reverse it later, it passes the previous run as the key to the next iteration. According to Wikipedia, the number of runs is 2 to the power of x; that exponent is the part that you can adjust. Say, right now, you want to use a cost level of 5: bcrypt will run your hash 2 to the 5, or 32 times. This may seem like a low number, but, since the cost parameter adjusts the function exponentially, if you changed it to 15, then the function would run it through 32768 times. The default cost level is 10, but this is configurable in the source code.
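If you're wondering how to pick a cost for your own server, one approach is simply to time it. The sketch below is an assumption on my part rather than anything the API provides: chooseCost() is a hypothetical helper, and the target time and cost bounds are arbitrary values you would tune yourself.

```php
<?php
// Hypothetical helper: returns the first cost whose hash takes at least
// $targetSeconds on this machine. If no tested cost is slow enough,
// $maxCost is returned as a cap (without being timed itself).
function chooseCost($targetSeconds, $minCost = 4, $maxCost = 12) {
    for ($cost = $minCost; $cost < $maxCost; $cost++) {
        $start = microtime(true);
        password_hash("benchmark-password", PASSWORD_BCRYPT, array('cost' => $cost));

        if (microtime(true) - $start >= $targetSeconds) {
            break; // this cost is slow enough
        }
    }
    return $cost;
}

echo "Suggested cost: ", chooseCost(0.05), "\n";
```

Because each step up in cost doubles the work, the loop only has to try a handful of values before it crosses the target.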

With the theory out of the way, let's take a look at a complete example.

 $pass = "Secret_Password";
 $hash = password_hash($pass, PASSWORD_BCRYPT, array('cost' => 12, 'salt' => "twenty.two.letter.salt"));

 if(password_verify($pass, $hash)){
 	if(password_needs_rehash($hash, PASSWORD_DEFAULT, array('cost' => 15))){
 		$hash = password_hash($pass, PASSWORD_DEFAULT, array('cost' => 15));
 	}
 	//Do something with hash here.
 }

The password_hash function accepts three parameters: the word to hash, a constant for the hashing algorithm, and an optional list of settings, which include the salt and cost. The next function, password_verify, checks that a string matches the hash, and, finally, the password_needs_rehash function checks whether a hash follows the parameters given. For example, in our case, we set the hash cost to twelve, but, here, we are specifying fifteen, so the function will return true, meaning that it needs to be rehashed.

You may have noticed that, in the password_verify and password_needs_rehash functions, you don't have to specify the hashing method used, the salt, or the cost. This is because those details are prepended to the hash string.
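You can see this for yourself by inspecting a hash. The sketch below uses password_get_info(), which belongs to the same API but isn't covered in the example above; the cost of 10 is just a value picked for the demonstration.

```php
<?php
$hash = password_hash("Secret_Password", PASSWORD_BCRYPT, array('cost' => 10));

// password_get_info() reads the algorithm and options back out of the hash string.
$info = password_get_info($hash);

echo $info['algoName'], "\n";         // bcrypt
echo $info['options']['cost'], "\n";  // 10

// The same details are visible at the front of the string itself:
// "$2y$" identifies bcrypt and "10$" is the cost; the salt follows.
echo substr($hash, 0, 7), "\n";       // $2y$10$
echo strlen($hash), "\n";             // 60
```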

The reason why it's okay to bundle the cost and salt along with the hash and not keep it secret, is because of how the system puts its strengths together. The cost doesn't need to be a secret, because it is meant to be a number which provides a big enough burden on the server. What I mean by this is that, even if someone gets your hash and determines that your cost level requires 1/2 a second to compute, he will know what level to bruteforce at, but it will take him too long to crack (e.g. decades).

Salts are used to prevent hashes from being precomputed into a rainbow table.

A rainbow table is basically a key-value store with a "dictionary" of words with their corresponding hashes as their keys.

All someone has to do is precompute enough common words - or, worse, all string possibilities - and then they can look up the word for a given hash instantly. Here's an example of how salts can be of help: let's say that your password is the word, "password." Normally, this is a fairly common word; it would likely be in their rainbow table. What a salt does is it adds a random string to your password; so, instead of hashing "password," it's really hashing "passwordRandomSalt524%#$&." This is considerably less likely to be pre-computed.

So, why are salts usually considered to be private information? Traditionally, with things like MD5 and the likes, once someone knows your salt, they can return to performing their bruteforce techniques, except they will add your salt to the end. This means, instead of bruteforcing your random password string, they are bruteforcing a much shorter standard password and just adding the salt. But, luckily, since we have the cost factor setup, it would take too long to compute every hash over with the new salt.

To recap: salts ensure that the precomputing of one hash cannot be used again on another hash, and the cost parameter makes sure that it isn't feasible to compute every hash from scratch. Both are needed, but neither of them has to be secret.

That is why the function attaches them to the hash.

Remember, it doesn't matter what your salt is, as long as it's unique.

Now, if you understood what the salt is doing, then you should know that it's better to let the function randomly generate one, than to enter your own word. Although this feature is provided, you don't want all of your hashes to have the same salt. If they do, then, if someone managed to break into your database, all they'd have to do is compute the table once. Since they will have the salt and cost level, it might take a while, but, with enough computers, once they process it, they will have unlocked all of your hashes. As such, it's much better to not assign one and instead let PHP randomly generate one.
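A quick way to convince yourself that PHP is generating a fresh salt each time is to hash the same password twice. This is only a small sketch; the variable names are arbitrary.

```php
<?php
// No salt option is passed, so PHP generates a random one for each call.
$first  = password_hash("password", PASSWORD_DEFAULT);
$second = password_hash("password", PASSWORD_DEFAULT);

// Same password, different salts, so the two hash strings differ...
var_dump($first === $second);                     // bool(false)

// ...yet both still verify, because each hash carries its own salt.
var_dump(password_verify("password", $first));    // bool(true)
var_dump(password_verify("password", $second));   // bool(true)
```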


cURL Additions

cURL is another area, where the PHP team have added some exciting new additions and updates. As of version 5.5, we now have support for the FTP directives, directives to set cookies, directives for SSL and accounts, and directives for the SMTP and RTSP protocols, among others. It would take too long to discuss all of them, but, to view the full list, you can refer to the NEWS page.

I do, however, want to talk about one set in particular that interested me most: the SMTP set of directives. Up until now, there was no easy way to send mail through SMTP. You would either have to modify your server's sendmail program to send messages via an SMTP server, or you would have to download a third party library - neither of which is the best option. With the new cURL directives, you are able to talk directly with an SMTP server, such as Gmail, in just a few short lines.

To better understand how the code works, it's worth learning a bit about the SMTP protocol. What happens is, your script connects to the mail server, the mail server acknowledges your connection and returns to you its information (e.g. domain, software). You then have to reply to the server with your address. Being a chatty (but polite) protocol, SMTP will greet you so that you know it was received.

At this point, you are ready to send commands. The commands needed are the MAIL FROM and the RCPT TO; these map directly to the cURL directives, CURLOPT_MAIL_FROM and CURLOPT_MAIL_RCPT, respectively. You only have one from address, but you are able to specify multiple to addresses. Once this is done, you can simply call the command, DATA, and start sending the actual message and message headers. To end the transmission, you have to send a blank line, followed by a period, followed by another blank line. These last two parts (i.e. DATA command and the ending sequence) are taken care of by cURL, so we don't have to worry about it.

So, in PHP, all we have to do is specify the mail from and to directives, and then send the actual message - all within cURL. To make things really simple, I'm going to create a class, called Gmail, which will accept a username/password and the message details, and it will send emails through your Gmail account.

I'll paste the entire class below, and then we'll go through it line by line, as most of it is boilerplate.

class Gmail{
        private $mail;
        private $email;
        private $pass;
        
        public function __construct($email, $pass){
        	$this->email = $email;
        	$this->pass = $pass; 
        }
        private function mailGen(){
            $from = yield;
            $to = yield;
            $subject = yield;
            $body = yield;
            yield "FROM: <" . $from . ">\n";
            yield "To: <" . $to . ">\n";
            yield "Date: " . date("r") . "\n";
            yield "Subject: " . $subject . "\n";
            yield "\n";
            yield $body;
            yield "";
        }
        public function getLine(){
            $resp = $this->mail->current();
            $this->mail->next();
            return $resp;
        }
        public function send($to, $subject, $body){
            $this->mail = $this->mailGen();
            $this->mail->send($this->email);
            $this->mail->send($to);
            $this->mail->send($subject);
            $this->mail->send($body);
            $ch = curl_init("smtps://smtp.gmail.com:465");

            curl_setopt($ch, CURLOPT_MAIL_FROM, "<" . $this->email . ">");
            curl_setopt($ch, CURLOPT_MAIL_RCPT, array("<" . $to . ">"));
            curl_setopt($ch, CURLOPT_USERNAME, $this->email);
            curl_setopt($ch, CURLOPT_PASSWORD, $this->pass);
            curl_setopt($ch, CURLOPT_USE_SSL, CURLUSESSL_ALL);
            //curl_setopt($ch, CURLOPT_VERBOSE, true); optional if you want to see the transaction
            curl_setopt($ch, CURLOPT_READFUNCTION, array($this, "getLine")); 

            return curl_exec($ch);
        }
    }

Hopefully, your reaction to this code was something along the lines of, "Wow, that's short for a complete SMTP implementation!" Just in case it seems complicated, we'll go over it. We begin by defining three private variables: one for the message generator, one to store the user's email, and one to store his password. Next, we have the constructor, which stores the email and password for later; this is so we can send multiple emails without re-entering this each time. The mailGen function is a PHP 5.5 generator, which is used to output the message according to the email protocol straight into cURL. The reason why this is needed is because the command used in cURL to enter the data is meant for reading from a file, line by line. So, instead of having an extra variable to remember which line we were up to, I used a generator, which saves its position.

The next function is used for cycling through the generator. We can't enter the generator directly into cURL; we need an intermediary. cURL will continue calling this function, until it comes to an empty line. This is why the last line in the generator yields a blank string.
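If the hand-off between the generator and cURL feels abstract, it can be tried out without a mail server at all. In the sketch below, messageLines() and readAll() are hypothetical stand-ins of my own: the first plays the part of the message generator, and the second imitates a caller that keeps asking for the next chunk until it receives an empty string. No cURL calls are involved.

```php
<?php
// Hypothetical stand-in for the message generator: one chunk per yield.
function messageLines($from, $to, $subject, $body) {
    yield "From: <" . $from . ">\r\n";
    yield "To: <" . $to . ">\r\n";
    yield "Subject: " . $subject . "\r\n";
    yield "\r\n";   // blank line separates the headers from the body
    yield $body;
    yield "";       // an empty string signals that there is no more data
}

// Hypothetical stand-in for the reader: pulls chunks until it gets "".
function readAll(Generator $lines) {
    $message = "";
    while ($lines->valid()) {
        $chunk = $lines->current();
        $lines->next();
        if ($chunk === "") {
            break;
        }
        $message .= $chunk;
    }
    return $message;
}

echo readAll(messageLines("me@example.com", "you@example.com", "Hi", "Hello there."));
```

The generator remembers which line comes next, so the reader never has to track a position itself.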

The last function in the class is the one that ties it all together. We first initialize the generator to the private variable defined earlier. Next, we send the generator all the information required, and create a new cURL handle. We've already discussed CURLOPT_MAIL_FROM and CURLOPT_MAIL_RCPT; they map to the equivalent SMTP commands. CURLOPT_MAIL_RCPT is an array so you can enter multiple addresses. Next, we need to add the credentials to log into Gmail. I left the verbose option there; uncomment it, if you want to see the entire SMTP transaction. The final two lines just set the function that cURL should get the message data from, and then we return the results.

It's a good couple of lines, but nothing overly complicated. To use this class, create a new instance of it, and call the send function. Here's an example of sending two emails with this:

    $gmail = new Gmail("gmanricks@gmail.com", "password");
    $gmail->send("first_guy@email.com", "Subject of email", "Hello Guy,\n What's going on.");
    $gmail->send("second_guy@email.com", "Different Subject", "Important message.");

Bits and Bobs

To finish up this article, I'll go over some of the smaller updates to PHP 5.5.

One quite cool thing is the added support for constant array/string dereferencing. What this means is that you can access individual characters in a static string, as if the string was an array of characters, and you can index straight into an array literal. A quick example of this is the following:

echo "Hello World"[1]; //this line will echo out 'e'
echo ["one", "two", "three"][2]; //this echoes "three"

Next, we have the finally keyword. This is appended to the end of a try/catch block; it tells PHP that, whether the try block completed or a catch block ran, you want the finally section to be processed. This is good for situations where you want to handle the outcome of a try/catch statement. Instead of repeating code in both, you can just put the "risky" part in the try/catch block, and all the processing in the finally block.

Another usage that was suggested by the creator as a best practice is to put all cleanup code in the finally block. This will ensure that you don't, for instance, try to close the same stream multiple times (e.g. your code crashed and went into the catch block after closing already, and you try closing it again).
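Here is a minimal sketch of finally in action. The divide() function and its $log parameter are made up for the demonstration; the log simply records which blocks ran.

```php
<?php
// Hypothetical function: records in $log which blocks were executed.
function divide($a, $b, array &$log) {
    try {
        if ($b === 0) {
            throw new InvalidArgumentException("Division by zero");
        }
        $log[] = "try";
        return $a / $b;
    } catch (InvalidArgumentException $e) {
        $log[] = "catch";
        return null;
    } finally {
        // Runs on both paths, even though try and catch return early.
        $log[] = "finally";
    }
}

$log = array();
divide(10, 2, $log); // normal path
divide(1, 0, $log);  // exception path

echo implode(", ", $log), "\n"; // try, finally, catch, finally
```

Notice that the finally block runs exactly once per call, which is what makes it a good home for cleanup code.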

The last thing worth mentioning is how the MySQL extension will be deprecated in this new release. You should convert your code to either the mysqli or PDO extensions instead. Though it's long since been considered an anti-pattern, it's nice for the PHP team to officially deprecate it.

While there are certainly more updates to dig into, the items in this article represent what I feel are the most important and exciting.


Conclusion

Thanks for reading; I hope you've learned a bit! As always, if you have any comments or questions, jump into the conversation below, and let's talk!
