-
IF you use all INNER JOINs, then the conditions can be in the FROM clause.
-
If you use an OUTER JOIN, the conditions need to be in the WHERE clause.
-
All text needs to be bounded by single quotes; double quotes will cause an error.
-
Use Aliases for the following reasons:
-
Table and column names are big, it is easier to read and write the scripts.
-
Table and column names are cryptic and not very readable.
-
More than one table is involved in a query.
-
A table is called more than once and joined to itself.
-
A table is called more than once in subsequent queries.
-
Two or more tables have the same name for a column, e.g. depths and color fields.
-
Functions are used in the query.
-
Two or more columns are combined together
-
You can change the column headers to be more readable by changing the selected field to an alias e.g. l.areaname as Area_Name.
-
You can query the component data many different ways: all data, just major components, the dominant component, or the dominant condition.
-
"All data" retrieves all the data for all the components. This can be very complex and not very useful, depending on the data retrieved.
-
"All major components" does not query any minor components. Again, this can be a very complex data structure.
-
"Dominant component" finds the component with the greatest percentage and retrieves only that data.
-
"Dominant condition" finds the condition that is most prevalent and retrieves that data.
-
An example of a dominant condition is two components with a restrictive layer that have a combined percentage greater than that of the dominant component, which does not have a restrictive layer.
-
Only "tune" SQL after you have confirmed the code as working correctly.
-
Ensure repeated SQL statements are written absolutely identically. This promotes efficient reuse: re-parsing can often be avoided for each subsequent use.
-
Best practices for writing:
-
all SQL verbs are in upper-case i.e., SELECT;
-
all words are separated with a single space;
-
all SQL verbs begin on a new line;
-
SQL verbs are aligned right or left within the initial verb;
-
a table alias standard is to set and maintained;
-
table aliases are used;
-
if a query involves more than one table, all column names are prefixed with their aliases
-
Whatever you do, be consistent.
-
Code the query as simply as possible; i.e., no unnecessary columns are selected, and GROUP BY or ORDER BY are not used unnecessarily.
-
It is fast or faster to SELECT by actual column name(s) than to use an asterisk. The larger the table, the more likely you will save time.
SELECT muname, musym
rather than:
SELECT * FROM mapnit;
-
Perform functions instead of comparisons on the data objects referenced in the WHERE clause.
SELECT musym, muname, muacres FROM mapunit WHERE muacres > 0;
rather than:
SELECT musym, muname, muacres FROM mapunit WHERE muacres!= 0;
-
Do not use a HAVING clause in SELECT statements. It only filters selected rows after all the rows have been returned. Use HAVING only if summary operations that are applied to columns will be restricted by the clause. A WHERE clause may be more efficient.
SELECT musym FROM mapunit WHERE muname!= 'Menfro' AND muname!= 'Goss'; GROUP BY muname;
rather than:
SELECT musym FROM mapunit GROUP BY muname HAVING muname!= 'Menfro' AND muname!= 'Goss';
-
When writing a subquery (a SELECT statement within the WHERE or HAVING clause of another SQL statement):
-
Use a correlated subquery when the return is relatively small and/or other criteria are efficient i.e. if the tables within the subquery have efficient indexes. A correlated subquery refers to at least one value from the outer query.
-
Use a non-correlated subquery when dealing with large tables from which you expect a large return (many rows) and/or if the tables within the subquery do not have efficient indexes. A non-correlated subquery does not refer to the outer query.
-
Ensure that multiple subqueries are in the most efficient order.
-
Remember that rewriting a subquery as a join can sometimes increase efficiency.
-
Minimize the number of table lookups, especially if SELECT is use in a subquery.
-
When using multiple table joins consider the benefits and costs for each EXISTS command, IN command, and table joins. Depending on your data, one or another command may be faster.
-
Note: IN is usually the slowest.
-
Note: When most of the filter criteria are in the subquery, IN may be more efficient; when most of the filter criteria are in the parent-query, EXISTS may be more efficient.
-
Where possible, use EXISTS rather than DISTINCT.
-
Where possible, use a non-column expression (put the column on one side of the operator and all other values on the other). Non-column expressions are commonly processed earlier, thereby speeding up the query.
WHERE muacres < 1000/(1 + n);
rather than:
WHERE muacres + (n * muacres) < 1000;
-
Try not to use OR in a query:
Select compname Where comppct_r=50
UNION ALL
Select compname Where comppct_r=30
rather than:
Select compname
From component Where (comppct_r=50) OR (comppct_r=30)
-
If possible, do not use a wildcard in the beginning of the matching pattern "%pattern".
-
Use SQL Joins instead of using subqueries.
-
Make the table with the least number of rows the driving table. Do this by making it first in the FROM clause.
-
The OVER clause with an aggregate function is similar to, but more efficient than, a subquery.
E.g., the aggregate function
AVG(slope_r) over(partition by compname)
produces the same results as the subquery
(select AVG(slope_r)from component c1 where c1.compname=component.compname)
-
A significant difference between the OVER clause and subqueries is that subqueries use ORDER BY for the partitioning column(s); the over clause does not and is less restrictive.
-
When using GROUP BY, everything in the SELECT clause must be either in the GROUP BY list or an aggregate function. An aggregate function with OVER does not require GROUP BY.
-
Use an INTO #temp table when:
-
Data collected from different areas of the database need to be join together.
-
Data tables that are being queried are in multiple branches of a join statement.
-
Data concatenates due to multiple linkage joins in the data structure.
-
Data is collected with different filters and then joined together for comparison.
-
Data in main query is joined to a secondary query.
-
Data is filtered through several iterations.