10 Essential SQL Tips for Developers

SQL stands for Structured Query Language. It is a language for accessing and manipulating the data in a relational database. SQL is an essential language for developers wishing to create data-driven websites. In this post, I'll show you ten essential tips.

Jump to tips in this section:

Use the Right Language
Secure Your Code
Understand Joins
Know Your Data Types
Write Compliant Code
Normalize Your Data
Fully Qualify Your Database Object Names
Understand Indexing
Properly Use Database Permissions
Know Your DBMS Limitations

1. Use the Right Language

Web developers often have a plethora of languages at their disposal. It is crucial for developers to use the proper language for the job. Once you have chosen the right language, you also need to consider how efficiently your code uses it.

Let's review the following code. In the first example, the developer is selecting all columns and all rows from the customer table. In the second example, the developer is selecting only the first name, last name, and shipping address from the customer table for a single customer with ID of 1001. Not only does the second query limit the columns that are returned, it will also perform better.

SELECT * FROM customer;

SELECT firstName, lastName, shippingAddress FROM customer WHERE customerID = 1001;

When you are writing code, it is important to ensure that it works efficiently.

Too many developers are satisfied with code that performs adequately on 100 rows of data, with little thought ahead to when the database will have 10,000 rows.
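
As a rough illustration of the difference, the sketch below runs both queries against a throwaway in-memory SQLite database from Python. The sample rows and the extra notes column are made up for the example; only the table and column names come from the queries above.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customerID INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT, "
    "shippingAddress TEXT, notes TEXT)"  # notes is a hypothetical wide column
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Ada", "Lovelace", "12 Analytical Way", "x" * 1000),
        (1002, "Alan", "Turing", "7 Enigma Road", "y" * 1000),
    ],
)

# First query: every column of every row, including the wide notes column.
everything = conn.execute("SELECT * FROM customer").fetchall()

# Second query: only the columns and the single row that are needed.
one_customer = conn.execute(
    "SELECT firstName, lastName, shippingAddress "
    "FROM customer WHERE customerID = ?",
    (1001,),
).fetchall()

print(len(everything), "rows with", len(everything[0]), "columns each")
print(one_customer)
```

On two rows the difference is negligible, but the first query's cost grows with every row and column added to the table, while the second stays roughly constant.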

2. Secure Your Code

Databases store valuable information. Because of this, they're often prime targets for attack. Many developers are unaware that their code has critical security vulnerabilities, which is a very scary fact not only for clients, but also for you. Depending on the jurisdiction and the contract, developers may even be held legally accountable if their negligence results in a database security risk that is then exploited.

Let's review another example using pseudo-code.

// Theoretical code
txtUserName.setText("eshafer' OR '1'='1");
query = "SELECT username, password FROM users WHERE username = '" + txtUserName.getText() + "';";
// Final statement
query = "SELECT username, password FROM users WHERE username = 'eshafer' OR '1'='1';";

Hopefully you looked at the code above and noticed the vulnerability. The query will end up selecting all username and password records from the table, because 1 is always equal to 1. Now, this particular example doesn't accomplish much for the would-be hacker. However, there are nearly limitless possibilities for additional malicious code that can be added with catastrophic results.

How Can You Write Secure Code?

The solution is often DBMS-specific; that is, it varies between MySQL, Oracle, and SQL Server. In PHP with MySQL, for example, it was long usual to escape parameters using the function mysql_real_escape_string() before sending the SQL query. This function inserts an escape character before certain special characters in a string, which helps prevent SQL injection attacks that use a single quote (') to append malicious code to an SQL query. The mysql_* functions were removed in PHP 7, however; mysqli_real_escape_string() is the current equivalent. The more robust alternative is to use prepared statements, which send the query text and the parameter values to the database separately. Make it your mission to understand the DBMS with which you are working and the inherent security issues.

SQL injection isn't the only security vulnerability for databases and developers to worry about, but it is one of the most common methods of attack. It is important to test your code and be familiar with the latest security issues for your DBMS to protect against attacks.
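
To make the difference between string concatenation and prepared (parameterized) statements concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users table mirrors the pseudo-code above, but the sample rows and the injected input are invented for the example, and other DBMS drivers use different placeholder styles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("eshafer", "hash-1"), ("admin", "hash-2")],  # hypothetical sample data
)

# Input a malicious user might type into the username field.
user_input = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text itself.
unsafe_query = (
    "SELECT username FROM users WHERE username = '" + user_input + "'"
)
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safer: the ? placeholder keeps the input as a plain value, never as SQL.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # 2: the injected OR clause matches every row
print(len(safe_rows))    # 0: no user is literally named "nobody' OR '1'='1"
```

The parameterized version needs no escaping at all, because the driver never interprets the value as part of the statement.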

3. Understand Joins

Single table SQL select statements are rather easy to write. However, business requirements often dictate that more complex queries must be written.

For example, in a situation where we should "find all orders for each customer, and display the products for each order", it is likely that there is a customer table, an order table, and an order_products table. The customer table has a record of all the customers, the order table has a record of all orders, and the order_products table has a record of each product associated with an orderID. This third table is necessary to resolve the many-to-many relationship between orders and products. (Note that ORDER is a reserved word in SQL, so in a real database this table would need to be quoted or given a different name, such as orders.)

For those who are slightly more familiar with SQL, it is readily apparent that two table joins will be required for this query. Let's look at some sample code.

SELECT customer.customerID, order.orderID, order_products.order_item
FROM customer
INNER JOIN order
ON customer.customerID = order.customerID
INNER JOIN order_products
ON order.orderID = order_products.orderID;


Alright, simple enough. For those who don't know, the code above is an inner join. More specifically, the code above is an equi-join. An equi-join is a type of join that combines tables based on matching values in specified columns. Let's define the various types of joins with visual representations.

Inner Joins

The inner join selects records from both tables that have matching values.

SELECT customer.customerID, customer.name, order.orderID, order.customerID, order.Date
FROM customer
INNER JOIN order
ON customer.customerID = order.customerID;

Outer Joins

The outer joins do not require each record to have a matching record. Types of outer joins include left outer join, right outer join, and full outer join.

Left outer join: A left outer join can also be referred to as a left join. A left outer join of tables A and B will return all records of the left table, table A, along with any matching records from the right table, table B. Where a record in table A has no match, the columns from table B are returned as NULL.

SELECT customer.customerID, customer.name, order.orderID, order.customerID, order.Date
FROM customer
LEFT JOIN order
ON customer.customerID = order.customerID;

Right outer join: A right outer join can also be referred to as a right join. A right outer join of tables A and B will return all records of the right table, table B, along with any matching records from the left table, table A. Where a record in table B has no match, the columns from table A are returned as NULL.

SELECT customer.customerID, customer.name, order.orderID, order.customerID, order.Date
FROM customer
RIGHT JOIN order
ON customer.customerID = order.customerID;

Full outer join: A full outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from both tables. Not every DBMS supports this join type; MySQL, for example, does not.

SELECT customer.customerID, customer.name, order.orderID, order.customerID, order.Date
FROM customer
FULL OUTER JOIN order
ON customer.customerID = order.customerID;

Special thanks to Ronald Erdei for the images.
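
The difference between an inner join and a left outer join is easiest to see on real rows. The following sketch uses Python's sqlite3 module with made-up data; the order table is named orders here, because ORDER is a reserved word, and that rename is an assumption of this example rather than part of the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customerID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (orderID INTEGER PRIMARY KEY, customerID INTEGER);
    -- Ada has two orders; Alan has none.
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Alan');
    INSERT INTO orders VALUES (10, 1), (11, 1);
""")

# Inner join: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT customer.name, orders.orderID
    FROM customer
    INNER JOIN orders ON customer.customerID = orders.customerID
    ORDER BY customer.customerID, orders.orderID
""").fetchall()

# Left join: every customer, with NULL (None in Python) where no order exists.
left_rows = conn.execute("""
    SELECT customer.name, orders.orderID
    FROM customer
    LEFT JOIN orders ON customer.customerID = orders.customerID
    ORDER BY customer.customerID, orders.orderID
""").fetchall()

print(inner_rows)  # [('Ada', 10), ('Ada', 11)]
print(left_rows)   # [('Ada', 10), ('Ada', 11), ('Alan', None)]
```

Alan disappears from the inner join because he has no orders, but the left join keeps him and fills the order column with NULL.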

Self Joins

There is one last type of join that must be considered, which is a self join. A self join is merely a join from a table to itself. The syntax of a self-join is similar to that of joining two separate tables. Because both table names are identical, aliases are employed here.

SELECT order1.customerId, order1.orderId, order1.date
FROM order order1
INNER JOIN order order2
ON order1.date = order2.date
AND order1.orderId <> order2.orderId;

The query above uses a self join to find orders that share a date with another order. Each row must be compared against the other rows of the same table, and a row should not be allowed to match itself.
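
Here is a small runnable sketch of such a self join, again using Python's sqlite3 module. The table is named orders (ORDER is a reserved word), the rows are invented, and the orderId comparison is an assumed way of keeping a row from matching itself and of listing each pair only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        orderId INTEGER PRIMARY KEY,
        customerId INTEGER,
        date TEXT
    );
    INSERT INTO orders VALUES
        (1, 1001, '2024-01-05'),
        (2, 1002, '2024-01-05'),
        (3, 1003, '2024-02-10');
""")

# Join the table to itself under two aliases. The "<" condition stops a row
# from pairing with itself and avoids listing each pair twice.
same_day = conn.execute("""
    SELECT o1.orderId, o2.orderId, o1.date
    FROM orders AS o1
    INNER JOIN orders AS o2
        ON o1.date = o2.date
       AND o1.orderId < o2.orderId
""").fetchall()

print(same_day)  # [(1, 2, '2024-01-05')]
```

Without the second condition, every order would trivially match itself and the result would include all three rows.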

Hopefully this clarifies the basic tenets of joins, as they are one of the primary features that make SQL such a powerful database language. Make sure you use the proper join for your given situation.

4. Know Your Data Types

In SQL, each table column has an associated data type. Typical data types available for developers to choose from include text, integer, varchar, and date.

During development, make sure you choose the proper data type for each column. Dates should use a date type, numbers should use a numeric type, and so on. This becomes especially important when we deal with a later topic, indexing.

I'll demonstrate an example of poor knowledge of data types below:

SELECT employeeID, employeeName
FROM employee
WHERE employeeID = 112457891;

Looks fine based on what we currently know, correct? However, what happens if employeeID is a string? Now we have a problem, because strings and integers are different data types. Depending on the DBMS, the comparison may raise an error, fail to find a match, or force the DBMS to convert every stored value before comparing it.

When values have to be converted like this, the DBMS often cannot use an index on the column. Therefore, if you're using indexing, you'll probably be perplexed as to why your query is taking forever when it should be a simple index scan. This is the reason that developers need to pay special attention to data types and their applications. Non-key attributes which are IDs are often string types, as opposed to integers, because of the increased flexibility that is granted. However, this is also a trouble area for junior developers, who assume that ID fields will be integers.

Properly utilizing data types is essential to proper database programming, as they directly lead to query efficiency. Efficient queries are essential to creating quality, scalable applications.
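
The following sketch shows the type mismatch in its simplest form, using SQLite through Python's sqlite3 module. The behavior shown is specific to SQLite's comparison rules for bare literals; other DBMSs may convert one side implicitly or raise an error instead, so treat it as an illustration rather than a general rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A text literal compared with an integer literal: SQLite does not convert
# either side, so the values are considered different.
text_vs_int = conn.execute(
    "SELECT '112457891' = 112457891"
).fetchone()[0]

# The same value compared as text on both sides.
text_vs_text = conn.execute(
    "SELECT '112457891' = '112457891'"
).fetchone()[0]

# typeof() reports the storage type SQLite assigned to each literal.
types = conn.execute(
    "SELECT typeof('112457891'), typeof(112457891)"
).fetchone()

print(text_vs_int)   # 0 (false)
print(text_vs_text)  # 1 (true)
print(types)         # ('text', 'integer')
```

Whatever your DBMS does in this situation, comparing a column against a value of the column's own type avoids the question entirely.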

5. Write Compliant Code

All programming languages have standards which web developers should be aware of, and SQL isn't any different. SQL was standardized by ANSI and then adopted by ISO, and the standard has been revised periodically since, with SQL:2023 being the latest revision as of this writing. However, the most important revision that developers should be aware of is SQL:1999. The 1999 revision introduced many important features, including recursive queries, triggers, and standardized procedural and control-of-flow statements (the kind of functionality that vendors offer through languages such as PL/SQL and T-SQL). Standard SQL also lets you write joins with explicit JOIN ... ON syntax in the FROM clause, which was introduced in SQL-92 and is generally preferred over listing join conditions in the WHERE clause.

When writing code, it is important to keep in mind why it's useful to write code that complies with standards. There are two primary reasons for using standards. The first is maintainability, and the second is cross-platform standardization. As with desktop applications, it is assumed that websites will have long lifespans and will go through various updates to add new functionality and repair problems. As any systems analyst will tell you, systems spend the majority of their lifespan in the maintenance phase. When a different programmer accesses your code in two, five, or ten years, will they still be able to understand what your code is doing? Standards and comments are designed to promote maintainability.

The other reason is cross-platform functionality. With CSS, browsers such as Firefox, Internet Explorer, and Chrome have long differed in how they interpret the same code. The reason for the SQL standards is to prevent a similar situation between Oracle, Microsoft, and other SQL implementations such as MySQL.

6. Normalize Your Data

Database normalization is a technique for organizing the contents of a database. Without normalization, database systems can be inaccurate, slow, and inefficient. The community of database professionals developed a series of guidelines for the normalization of databases. Each 'level' of normalization is referred to as a normal form, and five of them are commonly cited. The first normal form is the lowest level of normalization, and the fifth normal form is the highest of these.

First Normal Form (1NF): The most basic level of data normalization. Each column holds a single, atomic value, repeating groups of columns are moved into separate tables for related data, and each row is identified by a primary key.
Second Normal Form (2NF): Meets all the requirements of the first normal form, and every non-key column depends on the whole primary key rather than on only part of a composite key.
Third Normal Form (3NF): Meets all the requirements of the first and second normal forms, and removes columns that depend on other non-key columns rather than directly on the primary key. Derived attributes, such as an age that can be calculated from a date of birth, are typically removed as well.
Fourth Normal Form (4NF): The fourth normal form adds one additional requirement, which is the removal of any multi-valued dependencies.
Fifth Normal Form (5NF): The fifth normal form is a rarer form of normalization, in which every join dependency in a table is implied by its candidate keys.

In the reality of database development, getting to 3NF is the most important jump. 4NF and 5NF are a bit more of a luxury (and sometimes a nuisance) in database development, and are rarely seen in practice. If you're struggling with the concepts, or remembering the first three forms, there is a simple relationship, "The key, the whole key, and nothing but the key," which relates to 1NF, 2NF, and 3NF.

The Benefits of Normalization

Now, without venturing too far into database theory, let's simply focus on the benefits of normalization. As the data progresses through the normal forms, it becomes cleaner and better organized, with less redundancy. Updates typically become faster and safer as well, because each fact is stored in only one place.

With a small database that has only 5 tables and 100 rows of data, this won't be readily apparent. As the database grows, however, the effects of normalization will become much more apparent with regard to speed and maintaining data integrity. There are also some situations in which normalization doesn't make sense, such as when normalizing the data would require excessively complex queries to return it.
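
As a small sketch of the data-integrity benefit, the example below stores a customer's shipping address once, in the customer table, and has each order refer to it through a foreign key. The schema and rows are hypothetical and only loosely follow the tables used earlier in this article; the order table is named orders because ORDER is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customerID INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        shippingAddress TEXT NOT NULL
    );
    CREATE TABLE orders (
        orderID INTEGER PRIMARY KEY,
        -- Orders point at the customer instead of copying the address.
        customerID INTEGER NOT NULL REFERENCES customer(customerID)
    );
    INSERT INTO customer VALUES (1001, 'Ada', '12 Analytical Way');
    INSERT INTO orders VALUES (1, 1001), (2, 1001);
""")

# The address changes in exactly one row...
conn.execute(
    "UPDATE customer SET shippingAddress = ? WHERE customerID = ?",
    ("99 New Street", 1001),
)

# ...and every order sees the new value through the join.
addresses = conn.execute("""
    SELECT orders.orderID, customer.shippingAddress
    FROM orders
    INNER JOIN customer ON orders.customerID = customer.customerID
    ORDER BY orders.orderID
""").fetchall()

print(addresses)  # [(1, '99 New Street'), (2, '99 New Street')]
```

Had the address been copied into every order row, the same change would require updating each copy, and a missed row would leave the data inconsistent.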

7. Fully Qualify Your Database Object Names

Now, this is a commonly ignored point; in fact, all the sample code I've demonstrated in this tutorial has essentially violated this tip. In terms of database development, a fully qualified object name looks as follows:

DATABASE.schema.TABLE

Now, let's look at why fully qualified names are important, and in what situations they are necessary. The purpose of a fully qualified object name is to eliminate ambiguity. Beginning developers rarely have access to multiple databases and schemas, so the ambiguity often goes unnoticed until it causes problems later. When a given user has access to multiple databases, multiple schemas, and the tables therein, it becomes crucial to directly specify what the user is attempting to access. If you have an employee table, your boss has an employee table, and the schema that your web application is running on has an employee table, which employee table are you really attempting to access?

Logically, the fully qualified name would look like DATABASE.SCHEMA.OBJECTNAME, but in executable statements, it would simply be SCHEMA.OBJECTNAME. Although various DBMSs do have various syntax differences, the above style is generally applicable.

-- Not "SELECT * FROM table"
SELECT * FROM schema.TABLE;

Fully qualifying your database names is important when working with databases that are larger, are used by multiple users, and contain multiple schemas. However, it is a good habit to develop even with simpler databases.
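
The ambiguity can be reproduced even in SQLite, where each attached database acts as a schema. In this sketch the hr schema name and the sample rows are made up for illustration; the point is only that the unqualified name silently resolves to one of the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # this database is the "main" schema
conn.execute("ATTACH DATABASE ':memory:' AS hr")  # hypothetical second schema

# Two different tables that share the same unqualified name.
conn.execute("CREATE TABLE main.employee (employeeName TEXT)")
conn.execute("CREATE TABLE hr.employee (employeeName TEXT)")
conn.execute("INSERT INTO main.employee VALUES ('web app user')")
conn.execute("INSERT INTO hr.employee VALUES ('hr record')")

# Unqualified: SQLite picks main.employee, which may not be the one you meant.
unqualified = conn.execute("SELECT employeeName FROM employee").fetchall()

# Qualified: there is no doubt about which table is being read.
qualified = conn.execute("SELECT employeeName FROM hr.employee").fetchall()

print(unqualified)  # [('web app user',)]
print(qualified)    # [('hr record',)]
```

Other DBMSs resolve unqualified names by their own rules, such as a default schema or a search path, which is exactly why spelling out the schema removes the guesswork.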

8. Understand Indexing

A database index is a data structure that improves the speed of operations on a database table. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records. Indexing is incredibly important when working with large tables, although occasionally smaller tables should be indexed if they are expected to grow. Small tables that will remain small, however, should not be indexed (for example, if your book is one page, does it make sense to turn to the index?).

Many developers write their code and test it on a table with 10 or 100 rows, and are satisfied when their code performs adequately. However, as the table grows to 10,000 or 1,000,000 rows, the code slows to a snail's pace, and the client might as well go out to lunch waiting for the code to execute.

When a query searches a database for a matching record, there are two ways in which the search can be performed.

The first, and the slowest way, is a table scan. In a table scan, the query searches every record in the table looking for a match.
The second, and the faster way, is an index scan. In an index scan, the query searches the index to find the record.

In non-technical terms, a table scan would be the equivalent of reading every page in a book looking for a word, while an index scan would be the equivalent of flipping to the back of the book, finding the word, flipping to the specified page, and then reading the words on the page to find the word.

It is important to remember that indexes can become fragmented as data is added to the table, and may occasionally need to be rebuilt. Additionally, while indexes increase data access performance, they slow the modification of data, because each index must be updated along with the table. Because of this, many DBMSs have an option to temporarily disable an index to facilitate mass data modification, and then allow it to be re-enabled and rebuilt later.
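
Most DBMSs can show whether a query will use a table scan or an index. As a minimal sketch, the example below asks SQLite for its query plan before and after creating an index, using Python's sqlite3 module. The table contents and the index name idx_employee_id are invented for the example, and the exact wording of the plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employeeID TEXT, employeeName TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(str(i), f"employee {i}") for i in range(10_000)],
)

query = "SELECT employeeName FROM employee WHERE employeeID = ?"


def plan(sql, params):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


plan_before = plan(query, ("4242",))  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_employee_id ON employee (employeeID)")
plan_after = plan(query, ("4242",))   # now a search that uses the index

print(plan_before)
print(plan_after)
```

Checking the plan on a realistically sized table is a quick way to catch a missing index before the table grows to a million rows.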

9. Properly Use Database Permissions

When working with a database that has multiple users, it is important to properly handle various database permissions. Obviously, most databases have an administrator user, but does it always make sense to run your queries as the administrator? Additionally, would you want to provide all your junior developers and users with your administrator credentials in order to write their queries? Most likely not. The various possible permissions for your database depend on your DBMS, but there are common themes between them.

In MySQL, for example, running SHOW TABLES in the mysql system database will reveal a list of tables, among which you will notice a 'user' table. Running DESC user will reveal that there are various fields in the user table. Along with a host, username, and password, there is also a list of privileges that can be set for a user. Additionally, there is a db table that governs more privileges for a specific database.

SQL Server provides the GRANT, DENY, and REVOKE statements to give or take away permissions from a user or role. Additionally, SQL Server provides fixed database roles such as db_datawriter and db_datareader. Often, unknowledgeable developers grant these broad roles (as opposed to creating their own custom roles) to other users, resulting in overall lowered database security, as well as the possibility of a user performing an unwanted operation.

Properly managing your database user permissions is essential not only to security, but also to faster development and to protecting data integrity.

10. Know Your DBMS Limitations

Databases are powerful tools, but they do have limitations. Oracle, SQL Server, and MySQL all have unique limitations on things such as the maximum database size and the maximum number of tables. Many developers unknowingly choose a DBMS solution for their project without planning or considering the requirements of their database.

Refer to your DBMS manual for the various limitations; for example, SQL Server limitations are located on the Microsoft Learn website.

Conclusion

In this article, we reviewed ten essential tips for SQL developers. However, there are many other useful SQL techniques that could be mentioned, so please check out our other SQL tutorials to learn more. Keep developing, and remember: the code you write supports the internet infrastructure, and without you, the internet would not be as successful as it is.