You can use the built-in PHP function htmlspecialchars( ) to convert certain HTML characters into their respective character codes. (See the previous lesson for why you want to do this.) For example, take the following HTML tag:

<B>Bold text</B>

On a web page, that just gives you Bold text. If you enter it into a text box, and your script echoes it back without converting it, then the browser renders it as HTML. In other words, it gives you bold text. The same is true of this:

<A HREF ="nastysite">A Nasty Site</A>

This unconverted HTML will turn into a hyperlink. That's because characters like the left and right pointy brackets are treated as HTML. The browser sees the code above, and turns it into a hyperlink. It DOESN'T display the left and right pointy brackets. If you actually wanted a left pointy bracket on your page, you'd use the HTML special character for this symbol:

&lt;

And this, essentially, is what the htmlspecialchars( ) function does – turns the HTML into the special character codes.
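
To see the conversion for yourself, here is a minimal sketch that runs the two tags from above through htmlspecialchars( ). The variable names are made up for this illustration, and the comments show what PHP produces with its default settings:

```php
<?php
// The two HTML snippets from the examples above.
$bold = '<B>Bold text</B>';
$link = '<A HREF ="nastysite">A Nasty Site</A>';

// htmlspecialchars() swaps the pointy brackets and double quotes
// for their special character codes.
$bold_converted = htmlspecialchars( $bold );
$link_converted = htmlspecialchars( $link );

echo $bold_converted . "\n";
// &lt;B&gt;Bold text&lt;/B&gt;

echo $link_converted . "\n";
// &lt;A HREF =&quot;nastysite&quot;&gt;A Nasty Site&lt;/A&gt;
```

If you view the page source in your browser, these codes are what you'll see. On the page itself, the browser displays them as ordinary brackets and quote marks instead of acting on them.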

As an example, change your PHP script from the previous lesson from this:

$first_name = $_POST['first_name'];
echo $first_name;

to this:

$first_name = $_POST['first_name'];
$first_name = htmlspecialchars( $first_name );
echo $first_name;

Run your code again, and see what happens. You should see this display in the browser:

[Image: Effect of using htmlspecialchars]

Now the browser is not treating the hyperlink as HTML. The pointy brackets have been converted to character codes, so the whole tag displays as plain text.

The new line in the script is this:

$first_name = htmlspecialchars( $first_name );

So in between the round brackets of htmlspecialchars( ) you type the name of the variable you want to convert to special characters. PHP takes care of the rest.
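
The function also accepts optional extra arguments after the variable name. As a sketch only: the constant ENT_QUOTES asks for single quotes to be converted as well as double quotes, and 'UTF-8' names the character encoding of the text. The sample name below is invented for this illustration, and whether you need these arguments depends on your PHP version and the encoding your page uses.

```php
<?php
// A made-up value standing in for $_POST['first_name'].
$first_name = "O'Neil <B>";

// ENT_QUOTES: convert single quotes as well as double quotes.
// 'UTF-8': the character encoding of the text.
$first_name = htmlspecialchars( $first_name, ENT_QUOTES, 'UTF-8' );

echo $first_name;
// O&#039;Neil &lt;B&gt;
```

Converting both kinds of quote matters most when you echo a value inside an HTML attribute, such as the VALUE of a text box.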


htmlentities( )

A function similar to htmlspecialchars( ) is htmlentities( ). Instead of the above, you can do this:

$first_name = $_POST['first_name'];
$first_name = htmlentities( $first_name );
echo $first_name;

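The difference between the two is that htmlentities( ) converts every character that has an HTML character code, not just the handful that htmlspecialchars( ) deals with (the ampersand, the pointy brackets and quote marks). That includes non-English characters, such as French accents, the German umlaut, etc. Both functions convert the pointy brackets, so either one will stop HTML tags from being rendered. Use htmlentities( ) if you also want accented characters turned into character codes.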
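
A short side-by-side comparison may help here. This is a sketch under a few assumptions: the input string is made up, the accented letter is written as its UTF-8 bytes so the example doesn't depend on how your file is saved, and both calls pass the optional ENT_QUOTES and 'UTF-8' arguments so the encoding is stated explicitly.

```php
<?php
// A made-up input: an accented letter followed by an HTML tag.
// "\xC3\xA9" is the UTF-8 byte sequence for the letter e-acute.
$text = "caf\xC3\xA9 <B>";

$special  = htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );
$entities = htmlentities( $text, ENT_QUOTES, 'UTF-8' );

// htmlspecialchars() leaves the accented letter alone.
echo $special . "\n";
// café &lt;B&gt;

// htmlentities() converts the accented letter as well.
echo $entities . "\n";
// caf&eacute; &lt;B&gt;
```

In both cases the pointy brackets come out as character codes, so the tag is displayed rather than rendered.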

In the next part, we'll see how to strip HTML tags altogether.