The Beginner's Guide to Type Coercion: What is Coercion?

In this series, we're taking a beginner's look at dynamically typed (or weakly typed) languages and how their lack of strong typing can both positively and negatively impact our programming.

As mentioned in the first post, this series is specifically geared towards beginners or those who don't have a lot of experience with weakly typed languages. That is to say, if you've been programming in both strongly typed and weakly typed languages and are familiar with type coercion and the pitfalls that can occur when performing certain operations, then this series may not be of much interest to you.

On the other hand, if you're someone who's just getting started in writing code or you're someone who's coming into a dynamically typed language from another language, then this series is geared specifically towards you. Ultimately, the goal is to define type coercion, show how it works, and then examine the pitfalls of it.

Coercion Defined

According to Wikipedia, coercion is defined as follows:

In computer science, type conversion, typecasting, and coercion are different ways of, implicitly or explicitly, changing an entity of one data type into another.

Or, perhaps in a simpler manner, you may define this as how you take one data type and convert it into another. The thing is, there's a fine line between conversion and coercion.

As a general rule of thumb, I tend to think of coercion as an implicit change in type that the interpreter or compiler makes on our behalf (for example, when it has to work out how to compare two values of different types), whereas conversion is an explicit change in type that we, as programmers, write in our code.

Let's look at this in more detail.

Type Conversion

Let's say, for example, that you have a string named example and its value is '5'. In statically typed languages, you may grab the value of the string and convert it to an int through a number of different methods.

Assume that we have an Integer class with a parseInt method, as Java does. The method accepts a string and returns its value as an integer. The code for doing so may look something like this:

String example = "5";

// parseInt is a static method, so no Integer instance is needed
int intExample = Integer.parseInt( example );

/* intExample now has the value of 5 (not "5")
 * and example still refers to the string "5"
 */

Of course, the syntax will vary from language to language, and there are other ways to go about casting a value, but this gives you an idea as to how to explicitly convert one type into another.

Another way to go about doing this is to use a type casting operator. Though the implementation of the operation varies from language to language, most programmers who have worked with C-style languages will likely recognize it as something similar to this:

int myInt = (int)example;

Generally speaking, type casting is done by placing the type to which you want to convert the variable in parentheses before the variable itself. In the example above, assuming the language allows a string to be cast to an integer this way, myInt will now contain 5 rather than '5', and example will still hold '5'.

As stated, this is something that's normally done within the context of statically typed languages, which are often compiled.
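
Explicit conversion isn't limited to those languages, though. As a minimal sketch in JavaScript (the language used in the rest of this article), the built-in Number, parseInt, and String functions do the same kind of job. The variable names here are made up purely for illustration:

```javascript
var example = '5';

// Explicit conversion: we ask for a number by name.
var viaNumber = Number(example);          // 5
var viaParseInt = parseInt(example, 10);  // 5 (the 10 is the radix, i.e. base 10)

// Going the other way, from a number back to a string.
var backToString = String(viaNumber);     // '5'

console.log(typeof example);      // string
console.log(typeof viaNumber);    // number
console.log(typeof backToString); // string
```

In every case the change of type is something we wrote ourselves, and example still refers to the string '5' afterwards.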

Type Coercion

This still leaves the question of how type coercion differs from type conversion. Though coercion can happen within compiled languages, it's more likely that you'll see it happening within interpreted languages or in dynamically typed languages.

Furthermore, you're more than likely going to see type coercion happening whenever a comparison is being made between objects of different types, or when an operation or evaluation is being performed with variables that have different types.

As a simple example, let's say that in JavaScript we have two variables - sName, iAge - where sName refers to a person's name and iAge refers to a person's age. The variables, for purposes of example, are using Hungarian Notation simply to denote that one is storing a string and one is storing an integer.

Note that this is not an argument for or against Hungarian Notation - that's a topic for another post. It's being used here to make it clear what type of value each variable stores so that it's easier to follow the code.

So we'll go ahead and define our variables and their values:

var sName = 'John Doe';
var iAge = 32;

Now we can look at a few examples of how type coercion works within the context of an interpreted language. Two examples of how type coercion works are as follows:

Comparing a number to a boolean
Concatenating a string and a number

Let's take a look at an example of each:

/**
 * Comparing a number to a boolean coerces
 * the boolean to a number first (true becomes 1),
 * so 32 == 1 results in the boolean value
 * of 'false'.
 */
var result = iAge == true;

/**
 * Concatenating strings and numbers will
 * coerce the number to a string.
 *
 * "John Doe is 32 years old."
 */
var bio = sName + ' is ' + iAge + ' years old.';

These examples are relatively simple. In the first one, the boolean value true is coerced to the number 1 before the comparison is made, and since 32 is not equal to 1, the result is false.

In the second example, notice that we're taking a string, concatenating it with another set of strings, and also using the number in the concatenation operation. In this case, the number is converted to a string and then is concatenated along with the rest of the words.

This is type coercion: when the value of a variable is automatically converted from one type to another while performing an operation or evaluation.
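
To make the difference from conversion concrete, here is a small sketch that builds the same sentence twice: once relying on coercion and once converting the number by hand with String. The names implicitBio and explicitBio are just illustrative:

```javascript
var sName = 'John Doe';
var iAge = 32;

// Implicit: the + operator coerces iAge to a string for us.
var implicitBio = sName + ' is ' + iAge + ' years old.';

// Explicit: we convert iAge ourselves before concatenating.
var explicitBio = sName + ' is ' + String(iAge) + ' years old.';

console.log(implicitBio === explicitBio); // true
console.log(typeof iAge);                 // number
```

Both lines produce the same string, and iAge itself is still a number afterwards. Only the value used inside the expression was changed.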

The thing is, both of these examples are very simplistic. Let's look at a few more to demonstrate how coercion works, at least in JavaScript, when performing concatenation operations:

var one, two, result;

// one and two refer to string values of '1' and '2'
one = '1';
two = '2';

// result will contain the string '12';
result = one + two;

// redefine two to equal the number 2
two = 2;

// concatenating a string and a number results in a string
// result will contain '12';
result = one + two;

// redefine one as a number
one = 1;

// then concatenate (or sum) the two values
// result will be 3
result = one + two;

There are two important things to note:

The + operator is overloaded. That means that when it's working with strings, it concatenates them together, but when it's working with numbers it adds them together.
One type is always coerced into another and there's normally a hierarchy for how it occurs. Though each language is different, note that in the second example when we are concatenating a string and a number, the result is a string. This is because the number is coerced into a string.
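
One way to see both points at once is to check the type of each result with typeof. This is only a sketch, and the variable names are chosen for illustration:

```javascript
// Two strings: + concatenates.
var bothStrings = '1' + '2';

// One string and one number: the number is coerced, then + concatenates.
var mixed = '1' + 2;

// Two numbers: + adds.
var bothNumbers = 1 + 2;

console.log(bothStrings, typeof bothStrings); // 12 string
console.log(mixed, typeof mixed);             // 12 string
console.log(bothNumbers, typeof bothNumbers); // 3 number
```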

To take the example one step further, let's add one more variable, a prioritized set of operations, and then examine the result:

var one, two, three, result;
one = '1';
two = 2;
three = 3;

// result is '123'
result = one + two + three;

// result is '15'
result = one + (two + three);

Notice in the second example, two and three are added together because they are both numbers and then the result is concatenated with one because it's a string.
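
Because + is evaluated from left to right, the position of the string matters just as much as the parentheses do. The following sketch reuses the same three values in different orders; the result names are only for illustration:

```javascript
var one = '1';
var two = 2;
var three = 3;

// The string comes first, so every step is a concatenation.
var stringFirst = one + two + three;    // ('1' + 2) + 3

// The numbers come first, so they are added before the string is seen.
var numbersFirst = two + three + one;   // (2 + 3) + '1'

// Parentheses force the addition to happen first.
var grouped = one + (two + three);      // '1' + 5

console.log(stringFirst);  // 123
console.log(numbersFirst); // 51
console.log(grouped);      // 15
```

All three results are strings, since a string takes part in each expression at some point.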

Earlier, we mentioned that there is one special case for numbers and boolean values, at least in JavaScript. And since that's the language that we've been using to examine type coercion and since that's a language that's frequently used in modern web development, let's take a look.

In the case of JavaScript, note that 1 is considered to be a "truthy" value and 0 is considered to be a "falsey" value. These words are chosen as such because the values can serve as numbers, but will also be evaluated to true or false when performing a comparison.

Let's take a look at some basic examples:

var bTrue, bFalse, iZero, iOne, result;

bTrue = true;
bFalse = false;
iZero = 0;
iOne = 1;

// result holds the boolean value of false
result = bTrue == iZero;

// result holds the boolean value of true
result = bTrue == iOne;

// result holds the boolean value of false
result = bFalse == iOne;

// result holds the boolean value of true
result = bFalse == iZero;

Notice that in the examples above, the boolean values are coerced into numbers by nature of the comparison that's being made: true becomes 1 and false becomes 0, and then the two numbers are compared.
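
As a quick sketch, the built-in Number function shows the numeric value a boolean has, and it also explains why a number other than 1 is not equal to true even though it is "truthy". The variable names are illustrative only:

```javascript
// The numeric values that true and false take on.
var trueAsNumber = Number(true);
var falseAsNumber = Number(false);

console.log(trueAsNumber);  // 1
console.log(falseAsNumber); // 0

// 2 is truthy when converted to a boolean...
var twoIsTruthy = Boolean(2);

// ...but 2 == true compares 2 with 1, which is false.
var twoEqualsTrue = (2 == true);

console.log(twoIsTruthy);   // true
console.log(twoEqualsTrue); // false
```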

But what happens if we're to compare a boolean value of true or false to a string value of one or zero?

var bTrue, bFalse, sTrue, sFalse, result;
bTrue = true;
bFalse = false;
sTrue = '1';
sFalse = '0';

// result holds true
result = bTrue == sTrue;

// result holds false
result = bTrue == sFalse;

// result holds false
result = bFalse == sTrue;

// result holds true
result = bFalse == sFalse;

At this point, things can start to get really confusing because we're comparing the string '1' to the boolean value true, two values of entirely different types, and the result of the comparison is true.
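
It can help to spell out what happens in that comparison. In JavaScript, both the boolean and the string end up as numbers before they are compared. The sketch below mimics those steps by hand with Number; the step1, step2, and sWord names are made up for this example:

```javascript
var bTrue = true;
var sTrue = '1';

// Step 1: the boolean becomes a number.
var step1 = Number(bTrue); // 1

// Step 2: the string becomes a number.
var step2 = Number(sTrue); // 1

console.log(step1 === step2); // true
console.log(bTrue == sTrue);  // true, the same answer

// The word 'true' is not a numeric string, so it becomes NaN
// and the comparison fails.
var sWord = 'true';
var wordAsNumber = Number(sWord);

console.log(wordAsNumber);   // NaN
console.log(bTrue == sWord); // false
```

The last two lines are a common surprise: the string 'true' is not equal to the boolean true.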

Make sense? We'll be taking a look at this in a bit more detail in the next article, but I wanted to go ahead and introduce the basics of it first.

Coming Up Next...

This is when dynamically typed languages can start to cause headaches for developers. Luckily, there are ways in which we can write code that's stricter than what we have above and that yields accurate results.
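
As a small preview, one common way to be stricter in JavaScript is the === operator, which compares values without coercing them first. This is only a sketch of that one operator, with illustrative variable names, and not necessarily the full approach the next article will take:

```javascript
// Loose equality coerces before comparing.
var looseNumberBoolean = (1 == true);   // true
var looseStringNumber = ('1' == 1);     // true

// Strict equality requires the types to match as well.
var strictNumberBoolean = (1 === true); // false
var strictStringNumber = ('1' === 1);   // false
var strictSameType = (1 === 1);         // true

console.log(looseNumberBoolean, strictNumberBoolean); // true false
console.log(looseStringNumber, strictStringNumber);   // true false
console.log(strictSameType);                          // true
```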

Additionally, some dynamically typed languages also contain values for undefined and for null. These also maintain "truthy" and "falsey" values which, in turn, affect how we deal with comparisons.

In the final article in the series, we're going to take a look at how values such as undefined and null compare with other values as well as with one another, and at some strategies that we can implement to make our code more readable and more resilient against incorrect type coercion.

If this is your first foray into dynamically typed languages or type coercion and you have questions, comments, or feedback, then please don't hesitate to leave a comment in the feed below!