Comments

Inline comments like so --dash dash to end of line
Multiline comments like so /* slash star to star slash */

Naming Convention

By default, names of tables/columns/etc can only include underscore (_) and letters.

To use spaces, put [square brackets] around the name.

Recommendations

Specify Schema

In stored procedures, views, etc, always specify the schema of each table. Say "dbo.customer" instead of just "customer".
[Reference]

Why? Users can have different default schemas set. The database will look in the user's default schema first.

1) It will take longer to find the table

2) defaultSchema.tableName might also exist, so the wrong table would be used

3) the query plan will be cached as defaultSchema.tableName, so each user could end up caching a separate query plan - very inefficient

Caching

General considerations:

Query plan caching is whitespace-sensitive and case-sensitive.

Custom queries (dynamic SQL) are cached using the hash of the query text. Using a lot of these can really degrade performance.

Parameterize

To improve query plan caching, always parameterize you queries. That way, the specific values of the parameters don't affect how the query plan is cached.

The speed difference between parameterized and non-parameterized queries can be dramatic.
Terminology

ACID

The ACID properties are rules for how a database management service will function to ensure that the database remains consistent, before and after each transaction.

Atomicity: Either the entire transaction succeeds, or none of it does. If anything in the transaction fails, everything already done will be rolled back. Also called "all or nothing".

Consistency: All data integrity constraints are followed, as of the end of each transaction.

Isolation: Multiple transactions can occur concurrently without messing each other up.

Durability: Once a transaction has been committed, that data will not be lost, even in the event of system failure.

Command Categories

DDL stands for Data Definition Language. These commands deal with the structure of the database.
DDL includes CREATE, DROP, ALTER, TRUNCATE, COMMENT, and RENAME.

DML stands for Data Manipulation Language. These commands deal with the manipulation of data.
DML includes SELECT, INSERT, UPDATE, and DELETE.

DCL stands for Data Control Language. These commands deal with database security.
DCL includes GRANT and REVOKE.

TCL stands for Transaction Control Language. These commands deal with managing transactions.
TCL includes COMMIT, ROLLBACK, SAVEPOINT, and SET TRANSACTION.

Referential Integrity

Maintaining the Foreign Key to Primary Key constraints.
Every Foreign Key must have a matching Primary Key in the referenced table.

Transaction

A transaction is a collection of statements that will be run together. Either all of them will succeed, or all of them will fail together.

T-SQL

T-SQL stands for Transaction SQL. It's a Microsoft language that extends basic SQL.

Misc

"Row version" is a synonym for "timestamp" in SQL Server. So you might see "row version" in an error message when it means the timestamp column.
Misc Commands

NOCOUNT

This prevents the default message about "rows affected by query" from being displayed; also prevents DONE_IN_PROC messages from stored procedures.

SET NOCOUNT ON;
Commonly used in Stored Procedures and Triggers. Turning this I/O operation off can significantly improve performance.

Allow the default message to be displayed.

SET NOCOUNT OFF;

UPDATE STATISTICS

Updates query optimization statistics on a table or indexed view.

This already runs by default, but you can call it more frequently yourself. Don't do it constantly, though, because it causes affected queries to re-compile.


UPDATE STATISTICS tableNameOrViewName indexName;

@@ROWCOUNT

Returns the number of rows affected by the last statement.

SET @count = @@ROWCOUNT;

Non-Transact

Commands which are not Transact-SQL.
They are only recognized by the sqlcmd and osql utilities, and SSMS code editor.

GO

GO signals the end of a batch of Transact-SQL. Execution will not continue until the previous batch is complete.

Note that local variable scope is limited to the current batch, so they cannot be referenced after a GO command.


some sql;
GO
some sql;
Note that GO is not followed by a semicolon.

You can add a count to GO, which will cause the previous batch to run that many times before execution continues.

GO 5
Normalization

Normalization is the act of designing a database to conform to one of the "normal forms".
The aim is to reduce data redundancy and improve data integrity.
Normalized databases should also be extensible without editing the existing design.

This involves organizing tables and columns so that data constraints can be enforced by the database.

Synthesis: creating a new database design.
Decomposition: improving an existing database design.

Forms

UNF: Unnormalized Form
1NF: First Normal Form
2NF: Second Normal Form
3NF: Third Normal Form
EKNF: Elementary Key Normal Form
BCNF: Boyce-Codd Normal Form
4NF: Fourth Normal Form
ETNF: Essential Tuple Normal Form
5NF: Fifth Normal Form
DKNF: Domain-Key Normal Form
6NF: Sixth Normal Form

Comparison of normal forms

Some general aims:
- reduce the number of NULL values being stored
- enable efficient indexing
- each column should hold one data type
- columns should not hold lists/etc of data, just atomic data

Denormalize

A denormalized database, such as one single giant table with many columns, can be used to increase query speed since all the joins are already done.
SSMS

SSMS stands for SQL Server Management Studio.
You can download this for free.
It's a good tool for building, browsing, edit, etc SQL Server databases.

LocalDb

SSMS can connect to LocalDb.

Add a connection for "Database Engine" to "(local)\MSSQLLocalDb".

CSV Import

How to import *.csv table data into a SQL Server database.

1) Create the table in the database.
2) Open SSMS
3) Right-click on the database > Tasks > Import Data
4) Data Source is "Flat File Source" > Browse to the file > enter configuration info
-- If your column data is enclosed in quotation marks (ex: "Smith" or "1"), set Text Qualifier to a ".
5) Select Destination "SQL Server Native Client 11.0" > Enter the server name > select the database
6) On the source to destination mapping:
-- You can click on the "Destination" to select an existing table or "ignore" the column.
-- If you want to swap two "Destinations", you'll need to set one to "ignore" first so it shows up as an option for the other.
-- To preserve identity columns, click "Edit Mappings" > select "Enable Identity Insert".
7) Data conversions will probably work fine with the default settings.
8) Check "Run Immediately" > Finish

I got this error when trying to import special characters into a text field with MaxLength=1. The problem is that some characters that display as one letter are stored as more than one character, so I needed to increase the MaxLength on this column (ex: "A" with a tilde on top).
Error 0xc02020c5: Data Flow Task 1: Data conversion failed while converting column "Column 2" (18) to column "Column 2" (63). The conversion returned status value 2 and status text "The value could not be converted because of a potential loss of data.". (SQL Server Import and Export Wizard)

Backup

Backup database (schema and data):
1) Right-click on database > Tasks > Backup
2) fill in the form and save the backup

Backup schema:
1) Right-click on database > Tasks > Generate Scripts
2) fill in the forms and save the backup
Database

Create


CREATE DATABASE databaseName;

Delete


DROP DATABASE databaseName;


Security

Current Information

Functions to get information about current user, etc.

USER_ID()
USER_NAME()
SYSTEM_USER()
SESSION_USER()
CURRENT_USER()
USER()
SUSER_ID()
HOST_NAME()
Table

Naming Convention

SQL tables are usually named plurally. For instance, "Customers" instead of "Customer".
But that's not required.

There are various naming conventions, like "tblCustomers" or "CustomersTbl". The benefit of using a prefix or suffix is that you avoid naming collisions.
- Common table names like "Users" can collide with system tables or keywords.
- A table that contains a column with the same name can cause problems when converting to an object model. For instance, C# does not allow a class to contain a property of the same name.

Create


CREATE TABLE tableName
(
    columnName dataType,
    columnName dataType
);

CREATE TABLE Customer
(
    CustomerId INT IDENTITY(1,1) PRIMARY KEY,
    Name VARCHAR(50) NOT NULL,
    CreatedDate DATE DEFAULT GetDate(),
);

Alter


ALTER TABLE tableName ADD columnName dataType;

ALTER TABLE tableName ALTER COLUMN columnName newDataType;

ALTER TABLE tableName DROP columnName;

Empty

Removes all rows of data from the table.


TRUNCATE TABLE tableName;

Delete


DROP TABLE tableName;

Index

An ordered lookup for records in a table or view.

Naming Conventions

Indexes are named "IX_TableName_ColumnNameA_ColumnNameB...".

Create


CREATE CLUSTERED INDEX indexName ON tableName(columnA, columnB);

CREATE NONCLUSTERED INDEX indexName ON tableName(columnA, columnB);

CREATE NONCLUSTERED INDEX indexName ON tableName(columnA, columnB) 
WHERE columnA > 0;

Drop


DROP INDEX tableName.indexName;

Search

Table Sys.Indexes contains all clustered and non-clustered indexes.
The object_id column is the object_id of the index's table.

SELECT * FROM Sys.Indexes WHERE Object_ID = Object_ID ('TableName')

Clustered

Limit one Clustered Index per table, this is the order the table will be physically stored in.

Creating a non-clustered primary key

CREATE TABLE tableName
(
    columnName INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED
);
If your table has no clustered index, it will have an index type "HEAP" in the Sys.Indexes table.

How to choose a clustered index if it isn't your primary key?
- Selective: many distinct values
- Narrow: few columns
- Immutable: or at least changes rarely (changes to clustered index value cause table re-ordering and fragmentation)
- Searched: frequently used in queries that scan a range sequentially
- Ordered: ideally, new records always have a higher value than previous records, so they just append to the end

Your clustered index will be referenced by every non-clustered index, so a bad clustered index will affect everything.

An example of a non-primary-key clustered index would be the Sale Date on a Sales table. It does not change, it always increases for new records, and it is frequently queried in ORDER BY and GROUP BY clauses.

Non-Clustered

Unlimited number of Non-Clustered Indexes per table, but too many will significantly slow down Insert/Update/Delete operations.

The slow-down effects don't generally happen until you get above 12-ish indexes.

Create a non-clustered indexes for:
- A column frequently used in WHERE clauses
- A column frequently used in JOINs
- A column frequently used in GROUP BY clauses

Also sometimes for Included columns (i.e. columns in the SELECT clause). If every column mentioned in the entire query is in one Index, then no further I/O will be required to complete the query.

I've read that Indexes will only be used for WHERE clauses if the columns in the Index exactly match the columns in the WHERE clause. But I can't remember if that was applicable to MS SQL Server.

Filtered

Filtered Indexes are a type of Non-Clustered Indexes.

The Index definition contains a WHERE clause, so not all records are Indexed.

For example, you could create a Filtered Index to support a set of reports. If you know that the reports only focus on sales of amounts X to Y, and ignores outliers, then you can make an index with just those values included.

Filtered Indexes cannot use CASE, NOT IN, BETWEEN, or date functions.


Column Constraints

NOT NULL

Column cannot contain a NULL value.

UNIQUE

Every non-null value in the column must be unique.

Can be defined across several columns at once, meaning that the combination of the column values must be unique.

Naming convention: "UQ_TableName_ColumnName".

CHECK

Set a condition, if FALSE then an error will be thrown when the row is inserted or edited.

If the constraint is added on its own, it can reference any value in the current row.
If the constraint is added in-line with its column, it can only reference that column.

Naming convention: "CHK_TableName_ColumnName".

DEFAULT

Set a default value to be used if the value is NULL when the row is added.

Default constraints can be set in-line during CREATE TABLE, or added individually with ALTER TABLE. Defaults cannot be added in the listed constraints at the end of a CREATE TABLE statement.

Naming convention: "DF_TableName_ColumnName".


CREATE TABLE Customer (
    ...
    CreatedDate DATETIME CONSTRAINT DF_Customer_CreatedDate DEFAULT (GETDATE()),
    ...
);

ALTER TABLE Customer ALTER CreatedDate SET DEFAULT (GETDATE());

You cannot set a DEFAULT on an IDENTITY column.

IDENTITY(S,I)

Set a default value that starts at seed S and auto-increments by I for each new row.

Truncating a table does not reset the Identity back to S.
A failed insert will also cause the Identity to increase, thereby skipping a number.

View identity value:

DBCC CHECKIDENT("TableName");

Reset identity:

DBCC CHECKIDENT("TableName", RESEED, 1);
You are telling the table that 1 is the current highest identity value used in the table, so the next insert will have Identity 2.

Returns the most recent Identity value inserted in the current scope (current stored procedure, trigger, function, or batch).

SCOPE_IDENTITY()

Returns the most recent Identity value inserted into the selected table, across all scopes and sessions.

IDENT_CURRENT('tableName')

Returns the most recent Identity value inserted in the current session.

@@IDENTITY

PRIMARY KEY

A unique identifier of an entire data row. Can contain multiple columns, called a Composite Key.
Does not allow NULL values.
Only one allowed per table.
This is usually also the Clustered Index on the table, but it doesn't have to be.

Naming convention: "PK_TableName".

Candidate Key: one possible set of columns that can uniquely identify all rows.
Alternate Key: a Candidate Key that was not selected as the Primary Key.

FOREIGN KEY

Defines a relationship between two tables.
Defines that the value in this column must also exist in a particular column of another table.

Values do not have to be unique.

If the relationship is clear, and there is only one relation between these two tables, the naming convention is "FK_ThisTableName_ForeignTableName".

If the relationship needs more explanation, or there are multiple relations between these two tables, the naming convention is "FK_ThisTableName_ForeignTableName_Descriptor".

The naming convention for the column that has this constraint is "fkColumnName". Frequently, it is named after the column in the foreign table that it is referencing.

SPARSE

Specify this on a column with many NULL values. It reduces the amount of physical space used to store a NULL value and a 0 value.

Non-null values will require greater overhead to read.
Creating a filtered index on a sparse column, for just the non-null values, may maximize storage and performance at the same time.

Cannot be used with data types: text, ntext, image, timestamp, geometry, geography, or user-defined types.

Examples

There are two ways to define constraints:
- in-line in the column definition
- as a separate line (using this method you can name the constraint for future reference)
    

CREATE TABLE Customer
{
    CustomerId INT NOT NULL UNIQUE
}
or

CREATE TABLE Customer
{
    CustomerId INT NOT NULL,
    CONSTRAINT unqCustomerId UNIQUE(CustomerId)
}


CREATE TABLE tableName
{
    columnA dataType,
    columnB dataType,
    columnC dataType,
    CONSTRAINT constraintName1 UNIQUE(columnA, columnB),
    CONSTRAINT constraintName2 PRIMARY KEY(columnA),
    CONSTRAINT constraintName3 FOREIGN KEY (columnB) REFERENCES otherTableName(columnOther),
    CONSTRAINT constraintName4 CHECK(columnC > 0),
    CONSTRAINT constraintName5 CHECK(columnA = 'abc' OR (columnB < 100 AND columnC <> GetDate()))
}


CREATE TABLE tableName
{
    Id BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    OrderDate DATE DEFAULT GetDate(),
    City VARCHAR(100) DEFAULT 'Tuscon',
    fxStateId INT FOREIGN KEY REFERENCES State(StateId),
    AreaCode INT CHECK(AreaCode > 0)
}

Alter Constraint


ALTER TABLE tableName ADD UNIQUE(columnA);
ALTER TABLE tableName ADD CHECK(columnB > 0);

ALTER TABLE tableName ADD CONSTRAINT constraintName CHECK(columnB > 0);

ALTER TABLE tableName ALTER COLUMN columnName SET DEFAULT 'value';

Drop Constraint


ALTER TABLE tableName DROP CONSTRAINT constraintName;

If you did not explicitly name the constraint, you'll have to look up the default name assigned by the database.

Some constraint types are limited to one per column. These can be removed by type.


ALTER TABLE tableName ALTER COLUMN columnName DROP DEFAULT;

Cardinality

1 To 1

A 1 to 1 relationship is usually in a single table.

1 To Many

A 1 to Many relationship is a simple Foreign Key constraint.

Let TableB have a foreign key constraint to TableA. Now 1 record in TableA can be related to Many records in TableB.

Many To Many

A Many to Many relationship between two tables requires an intermediate table.

TableC has one column with a foreign key constraint to TableA, and one column with a foreign key constraint to TableB.

Ex: A SalesPeople table can have a Many to Many relationship with a Customers table. Each SalesPerson may work with multiple Customers, and over time each Customer may work with multiple SalesPeople.
Variables

This section discusses local variables. SQL Server does not allow global variables.

Variables can be used in queries anywhere you could use a literal value.

Naming Convention

Variable names must start with a "@" (at-sign).

DECLARE


DECLARE @variable dataType;
DECLARE @variable AS dataType; --the AS is optional
DECLARE @varA dataType, @varB dataType; --separate variables with a comma

SET


SET @variable = value;

SET @variable = (
    SELECT COUNT(*) FROM tableName
);

SELECT

Set variable values with a SELECT statement.

SELECT
    @varA = columnA,
    @varB = columnB
FROM tableName;
Data Types

CHAR(N) --fixed length string, exactly N characters
VARCHAR(N) --variable length string, up to N characters
BIT --boolean, TRUE or FALSE, 0 or 1
INT --integer
DECIMAL(P,S) --precision P = total digit count, scale S = digits after decimal point
DATE --year, month, and day
TIME --hour, minutes, and seconds
TIMESTAMP --date plus time
MONEY --variable length decimal with 2 digits after decimal point
.
.
.
todo

Cast


CAST(@count AS VARCHAR(MAX))

Strings

Print text to client console

PRINT @text;

Concat

PRINT @text + 'more text';

Trim

PRINT RTRIM(@text);
PRINT LTRIM(@text);
PRINT TRIM(@text);
Literals

String

CHAR, VARCHAR, or TEXT (may not recognize some characters)

'a'
'a string'
'12343'
'And. Punc,tuation!'

NCHAR, NVARCHAR, or NTEXT (Unicode strings)

N'a'
N'a string'
N'12343'
N'And. Punc,tuation!'

Integer


84756
+3
-1

Decimal


453.2535
+30.003
-0.454

DateTime


'May 2, 2015'
'2015/05/02' --YYYY/MM/DD
'2015/05/02 09:18:30' --YYYY/MM/DD hh:mm:ss

Escaping Characters

Escape % in a LIKE comparison:

WHERE column LIKE '%[%]%' --where string has a percent sign anywhere in it
or

WHERE column LIKE '%\%%' ESCAPE '\'
User-Defined Data Types

You can create a reusable named type which is a built-in type with constraints already applied.

Create

Format:

CREATE TYPE typeName
FROM dataType constraints;

Example:

CREATE TYPE socialSecurityNumber
FROM CHAR(11) NOT NULL;

Default

Bind a default value to a user-defined data type:

CREATE DEFAULT default_PhoneNumber
AS 'Unknown Number';

EXEC sp_bindefault('Default_PhoneNumber', dataTypeName);

Rule

Bind a rule to a user-defined data type:

CREATE RULE rule_CountryCode
AS
    @CountryCode NOT LIKE 'A%';
    
EXEC sp_bindrule(rule_CountryCode, "CountryCode");


CREATE RULE rule_PhoneNumber
AS
    (@phone='UnknownNumber')
    OR (
        LEN(@phone)=14
        AND SUBSTRING(@phone,1,1)= '+'
        AND SUBSTRING(@phone,4,1)= '-'
    );
    
EXEC sp_bindrule(rule_PhoneNumber, "PhoneNumber");

To remove a rule:

EXEC sp_unbindrule("dataTypeName");

Insert

Default Order

Insert values in the default order:

INSERT INTO tableName VALUES (3, 'Texas', '2015/09/30');

Custom Order

Insert values in a specific order:

INSERT INTO tableName (columnB, columnD, columnA) VALUES ('Texas', 343.55, 3);

Transfer Data

You can also transfer data between tables with SELECT INTO:

SELECT * INTO tableB FROM tableA WHERE columnA = value;

SELECT * INTO backupTable IN 'filename.mdb' FROM tableA;

INSERT INTO backupTable SELECT * FROM tableA;

INSERT INTO backupTable (columnA, columnB) SELECT columnA, columnB FROM tableA;


Update

Update records in a table:

UPDATE tableName SET columnA='Value', columnC=123 WHERE columnB >= 10;

CASE


UPDATE tableName SET columnA = (CASE WHEN columnA=0 THEN 1 ELSE 0 END);

Delete

Delete records from a table:

DELETE FROM tableName WHERE columnName = value;
Select

Select records from a table:

SELECT * FROM tableName;

SELECT columnA, columnB FROM tableName;

Aliasing


SELECT 
    columnA AS aliasA, 
    columnB 
FROM tableName;

SELECT 
    * 
FROM tableA AS tA
INNER JOIN tableB as tB ON tA.columnA = tB.columnC;

Joins

INNER JOIN
matches rows in both tables

LEFT JOIN (LEFT OUTER JOIN)
returns everything from the left (first) table, with matches from the right table where possible

RIGHT JOIN (RIGHT OUTER JOIN)
returns everything from the right (second) table, with matches from the left table where possible

FULL JOIN (FULL OUTER JOIN)
returns everything from both tables, with matches where possible

A self join means that you join a table to itself. This is not a special command word.

Be careful with joins, you can really slow down a query with an overly-large join.


SELECT 
    * 
FROM tableA 
INNER JOIN tableB ON tableA.columnA = tableB.columnB;

Union


SELECT * FROM tableA WHERE columnA = value
UNION
SELECT * FROM tableB WHERE columnB = value

The result returned will have the DISTINCT rows from the combined SELECT statements.

Use UNION ALL to return duplicate rows as well. This is faster.

The column list in each result set must be in the same order, with the same names and data types.

Top

Select just the first N rows of the result set.


SELECT TOP 5 
    * 
FROM tableName;

SELECT TOP 10 PERCENT 
    * 
FROM tableName;

--surround variables in parentheses
SELECT TOP (@BatchSize) 
    * 
FROM tableName;

Distinct

Returns just the distinct rows in the result set.


SELECT DISTINCT 
    * 
FROM tableName;

Where Operators

Where clauses can be combined using the set joins AND and OR.
Where clauses can be negated with keyword NOT.


SELECT 
    * 
FROM tableName 
WHERE (columnA < 5  AND (columnB = 'D' OR columnC <> 4));

Null Operators: IS NULL, IS NOT NULL
SQL uses 3-value truth statements, so if you want to operate on a NULL value, you must use these operators.

Normal Operators:
= equals
<> not equal
> greater than
< less than
>= greater than or equal
<= less than or equal

BETWEEN Where Operator

Defines a range of valid values.

Note that different database engines will include or exclude one or both of the boundary values. MS SQL Server includes both boundary conditions in the valid values.

I.e., the range is inclusive at both the minimum and the maximum.


SELECT * FROM Customer WHERE OrderCount BETWEEN 3 AND 5;

SELECT * FROM Customer WHERE MiddleInitial BETWEEN 'A' AND 'D';

SELECT * FROM Customer WHERE Birthday BETWEEN '1995/01/01' AND '2000/01/01';

LIKE Where Operator

A simple regular expression matcher.


SELECT * FROM Customer WHERE FirstName LIKE 'Sa%'; --any name starting with "Sa"

% matches any number of any characters
_ matches exactly one character
[abc] matches one character in the brackets
[a-g] matches one character in the range in the brackets
[^abc] matches one character that is NOT in the brackets
[!abc] is equivalent to [^abc]


--select any string containing a '%' percent sign
SELECT * FROM Comment WHERE Text LIKE '%[%]%';

--change the escape character
SELECT * FROM Comment WHERE Text LIKE '%\%%' ESCAPE '\';

IN Where Operator

Checks that the column value is in the provided list.


SELECT * FROM Customer WHERE State IN ('TX', 'IL', 'NM');

SELECT * FROM Customer WHERE State IN (
    SELECT State FROM States WHERE Population < 1,000,000
);

EXISTS Where Operator

Returns true if result set returns any rows at all. This can be very fast, since it only needs to verify that one row would be returned.


WHERE EXISTS (
    --subquery
);

Group By

Define how to group rows so that aggregate operations can be performed on them.

The GROUP BY clause goes after the WHERE clause.


SELECT City, COUNT(CustomerId) AS Cnt FROM Customer GROUP BY City;

SELECT City, Zipcode, COUNT(CustomerId) AS Cnt FROM Customer GROUP BY City, ZipCode;

Every column you are grouping by must be included in the SELECT clause.
All other columns in the SELECT clause must be aggregate functions.

Having

The WHERE clause cannot use aggregate functions. To put conditional statements on aggregate results, use the GROUP BY and the HAVING clauses.

The HAVING clause goes after the GROUP BY clause.

HAVING can use all the same structures and operators as WHERE.


SELECT 
    City, 
    COUNT(CustomerId) AS Cnt 
FROM Customer 
GROUP BY City 
HAVING Cnt > 5;
    
SELECT 
    City, 
    COUNT(CustomerId) AS Cnt 
FROM Customer 
GROUP BY City 
HAVING SUM(otherColumn) < 6;

Order By

ASC sorts results in ascending order (lowest to highest). This is the default.
DESC sorts results in descending order (highest to lowest).

The ORDER BY clause goes last.
    

SELECT * FROM Customer ORDER BY LastName ASC;

SELECT * FROM Customer ORDER BY LastLogin DESC, LastName ASC;

Specifying multiple ORDER BY columns means sort by the first column, then by the second column, and so on.

ROW_NUMBER

ROW_NUMBER will add a column to your result set that numbers the results 1 to N.


SELECT
    ROW_NUMBER() OVER(ORDER BY CustomerId) AS RowNum,
    CustomerId,
    Name
FROM Customer
ORDER BY Name
The result set will be ordered by Name, but the RowNum will be ordered by CustomerId.


SELECT
    ROW_NUMBER() OVER(
        PARTITION BY CustomerId, OrderYear 
        ORDER BY CustomerId, OrderYear
    ) as YearlyOrders,
    CustomerId,
    Name,
    OrderYear
FROM Customer
The result set will be group by unique combinations for CustomerId/OrderYear.
Each grouping will be numbered 1 to N with that group.
So you'll have duplicate YearlyOrders in the final result set.

Aggregate Functions

AVG(col) average the values

COUNT(col) count the number of rows

FIRST(col) return the first value

LAST(col) return the last value

MAX(col) return the highest value

MIN(col) return the lowest value

SUM(col) sum all the values

Other Functions

UPPER(string) convert all characters to uppercase

LOWER(string) convert all characters to lowercase

MID(string, index, length) substring function, with index starting at 1

LEN(string) string length

RIGHT(string, length) returns the substring, counting from the right-end of the string

LEFT(string, length) returns the substring, counting from the left-end of the string

ROUND(numericColumn, decimalDigitCount) round a number
be wary of the default rounding rules in the system, they may not be the rules you are used to
for instance, scientific rounding versus banker's rounding

ISNULL(checkValue, replacementValue) if the checkValue is NULL, the replacementValue will be used instead

COALESCE(A, B, C...) the first non-null value in the list will be returned

STUFF(text, startIndex, length, replacement) deletes (startIndex, length) from text, then inserts replacement at startIndex

FORMAT(column, 'C', 'en-us') format currency: turn 1198.56 into $1,198.56

XQuery

For columns of type XML Object.

Checking if a node exists:

WHERE xmlCol.exist('rootNode/modeA/nodeB') = 0

Comparing values:

WHERE xmlCol.value('(rootNode/nodeA/nodeB)[1]', 'varchar(30)') = 'myString'

Selecting a value:

SELECT xmlCol.value(('(rootNode/nodeA/nodeB/text())[1]', 'varchar(30)') ...

Using a node like a table:

SELECT colA.query('./nodeC').value('.', 'char(11)')
FROM xmlCol.nodes('rootNode/nodeA/nodeB') as ref(colA)

Insert a node:

--xml before
<results>
    <result>
        <value>Order not synced</value>
    </result>
</results>

--sql
UPDATE MyTable SET xmlCol.modify('
    insert <result><value>Manual update</value></result>
    into (/results)[1]
') WHERE Id = 5;

--xml after
<results>
    <result>
        <value>Order not synced</value>
    </result>
    <result>
        <value>Manual update</value>
    </result>
</results>

Delete a node:

UPDATE MyTable SET xmlCol.modify('
    delete (/results/result)[last()]
') WHERE Id = 5;

Index Hint

You can recommend a particular index be used for a SELECT statement:

SELECT
    columnA
FROM tableName (index=indexName);
Logic

CASE


UPDATE tableName SET columnA = (CASE WHEN columnA=0 THEN 1 ELSE 0 END);

TODO: Check algorithm for CASE statement execution. I remember handling an error where the case statement takes the LAST matching WHEN clause, but Microsoft's website says it will take the FIRST matching WHEN clause.

If no WHEN clauses match, the ELSE clause will be used.
Views

A view is a virtual table based on the results of a SQL statement.
The data is updated every time the view is queried.

When are views useful?
- They abstract away complex computations.
- You can give users access to a View without them access to the underlying tables.

Naming Convention

Views are named "vw_ViewName", or something similar.

Create


CREATE VIEW viewName AS
    SELECT 
        columnA, 
        columnB 
    FROM tableName 
    WHERE columnC <> value;

Update


CREATE OR REPLACE viewName AS
    SELECT 
        columnA, 
        columnB 
    FROM tableName 
    WHERE columnC <> value;

Delete


DROP VIEW viewName;
Temp Table

A temp table is a real table that is stored for a short period of time. The temp table only lasts as long as the current session.

Temp tables are intended for storing result sets that are accessed multiple times.

Temp tables are stored in system database "tempdb". Temp tables are automatically given a unique suffix so that multiple sessions will not overwrite each other's data.

Naming Convention

SQL Server knows a table is a temp table when it starts with "#" (hash or pound or sharp).

Create


SELECT
    columnNameA
INTO
    #tempTableName
FROM
    tableName
WHERE
    clause;
or

CREATE TABLE #tempTableName (
    columnName dataType
);

INSERT INTO #tempTableName
SELECT
    columnNameA
FROM
    tableName
WHERE
    clause;

Global

A global temp table is shared across sessions. This table will be dropped once the creating session ends, and all queries accessing the table end.

Use the "##" (double hash).


CREATE TABLE ##globalTempTableName (
    columnName dataType
);

Delete

You can manually drop a temp table:

DROP TABLE ##globalTempTableName;
Flow Control

BEGIN END

BEGIN...END delineates a code block. All statements within the block will be run as one batch.

Statement blocks can be nested. It is common, for instance, to have an IF...BEGIN...END block within a PROCEDURE...BEGIN...END block.

IF ELSE


IF 1 = 0 PRINT 'A'
ELSE PRINT 'B';


IF 1 = 0
BEGIN
    --multiple lines
END
ELSE
BEGIN
    --multiple lines
END;

WHILE


WHILE @count < 10
BEGIN
    --multiple lines
END

Use BREAK to exit a WHILE loop immediately.

Use CONTINUE to skip ahead to the top of the WHILE loop immediately.

TRY CATCH


BEGIN TRY
    --statements
END TRY
BEGIN CATCH
    --statements
END CATCH

Inside the CATCH block, you can access these values:
- ERROR_LINE() returns the line number where the exception occurred
- ERROR_MESSAGE() returns the exception message text
- ERROR_PROCEDURE() returns the name of the stored procedure or trigger where the exception occurred
- ERROR_NUMBER() returns the number (code) of the error
- ERROR_SEVERITY() returns the severity level of the error (0 to 25)
- ERROR_STATE() returns the state number (code) of the error, which is used with the error number to be more specific about what occurred

By comparison, @@ERROR returns the error number of an error that occurred in just the previous statement.

These information functions are aware of which CATCH block they are inside of, and will return data correctly even when used in or after nested CATCH blocks.

You can THROW an error from a CATCH block if you can't actually resolve it there.
The error will rise to the next CATCH block up in execution.

BEGIN CATCH
    --try to resolve issue
    THROW;
END CATCH

RAISEERROR

Raise an error/exception.

RAISEERROR('message', @severity, @state);

Severity:
- 0 to 10 for informational messages
- 11 to 18 for errors
- 19 to 25 for fatal errors

State:
- 0 to 255
If you raise the same user-defined error at multiple locations, use a different state for each to differentiate them.

THROW

Raise an error/exception another way.

THROW @number, 'message, @state;
These errors default to severity level 16.

Stored Procedures

Stored procedures can return zero, one, or multiple result sets.
Any result of a SELECT statement inside a stored proc will be returned to the client.

Stored procedures are compiled.
Stored procedures can be encrypted to hide sensitive information/calculations.

Recompile your stored procedures after adding new indexes that they should use. Or restart SQL Server.

Basic


CREATE PROCEDURE dbName.procedureName 
AS
BEGIN
    SELECT * FROM tableName;
END;
    
EXEC dbName.procedureName;

Naming Convention

The reference I'm looking at says "usp_ProcedureName", but I've never seen this convention in use. Just use some standard prefix.

Parameters


CREATE PROCEDURE dbName.procedureName 
    @paramA AS dataType 
AS
BEGIN
    SELECT 
        * 
    FROM tableName 
    WHERE columnA = @paramA;
END;
    
EXEC dbName.procedureName @paramA = value; --using named parameters
EXEC dbName.procedureName value; --or just listing the parameters in the same order as the stored proc


CREATE PROCEDURE dbName.procedureName 
    @paramA AS dataType, 
    @paramB As dataType 
AS
BEGIN
    SELECT 
        * 
    FROM tableName 
    WHERE columnA = @paramA OR columnB < @paramB;
END;
    
EXEC dbName.procedureName @paramA = value, @paramB = value;

Optional Parameters

You can provide a default value for parameters, in case the user does not pass one in.

CREATE PROCEDURE dbName.procedureName 
    @paramA AS dataType = NULL 
AS
BEGIN
    SELECT 
        * 
    FROM tableName 
    WHERE columnA = @paramA;
END;
    
EXEC dbName.procedureName;

Output Parameters

Output parameters are really Input/Output parameters.


CREATE PROCEDURE dbName.procedureName 
    @paramA AS dataType, 
    @paramB AS INT OUTPUT 
AS
BEGIN
    SELECT 
        @paramB = COUNT(*) 
    FROM tableName 
    WHERE columnA = @paramA;
END;

DECLARE @tableCount INT;
EXEC dbName.procedureName @paramA = value, @paramB = @tableCount OUTPUT;

Recursion

Stored procedures can recursively call themselves, but only to a depth of 32.

Dynamic Queries

If you must generate dynamic queries in a stored proc (and you should avoid this whenever possible) then here is a debug trick. Include an optional last parameter that will output the generated SQL for you to see.


CREATE PROCEDURE dbo.myProc 
    @paramA AS dataType, 
    @paramB AS dataType,
    @debug AS BIT = 0
AS
BEGIN
    --build up variable @dynamicSQL
    IF @debug = 1 PRINT @dynamicSQL
    --run @dynamicSQL
    EXEC sp_executesql @dynamicSQL
END;

Extended

An extended stored procedure is a compiled function in a DLL, often written in C, which you can call like a normal stored procedure.

User-Defined Function

You can use the result of a table-valued function as a normal table in a JOIN clause.

You can use a scalar-valued function in a SELECT clause, WHERE clause, or HAVING clause. Be careful of performance issues, since the UDF will be called on every row in the result set.

Naming Convention

"fn_FunctionName"

Table-Valued Function

A function that returns a table-type value.


CREATE FUNCTION fn_functionName (
    @paramA dataType
) RETURNS TABLE
AS
RETURN
    SELECT
        *
    FROM tableName
    WHERE columnA = @paramA

Scalar-Valued Function

A function that returns a scalar-value, such as an int or varchar.


CREATE FUNCTION fn_functionName (
    @paramA dataType
) RETURNS INT
AS
BEGIN
    DECLARE @count INT;
    SELECT
        @count = COUNT(*)
    FROM tableName
    WHERE columnA = @paramA;
    RETURN(@count);
END;

SP Vs UDF

Stored Procedures vs User-Defined Functions

A stored procedure is a collection of any statements. A user-defined function is much more limited.
Generally, stored procedures are used to complete a specific task, and user-defined functions are used to make generic computations.

SP can return 0 to N values.
UDF must return 1 value.

SP support input and output parameters.
UDF only supports input parameters.

SP can contain any DML statement (CRUD operations).
UDF only supports SELECT statements. UDF can't even use TRY/CATCH. UDF can't use transaction commands.

SP can call a UDF.
UDF cannot call an SP.

SP cannot be called within a SELECT statement, WHERE clause, or HAVING clause.
UDF can be called within a SELECT statement, WHERE clause, and HAVING clause. UDF that returns a table can be treated as a normal table within a query.

SP are stored in compiled form, so they will run quickly.
UDF are interpreted each time they run.
Trigger

Triggers are special kind of stored procedure that runs automatically when certain events occur.
Triggers run AFTER an event occurs.

DML triggers react to INSERT, UPDATE, and DELETE events.
DDL triggers react to CREATE, ALTER, and DROP events.
DCL triggers react to the LOGON event.

In SQL Server 6.5, you could only define 3 triggers per table, one for each DML event.
As of SQL Server 7, you can define as many as you want. But you cannot set what order they will fire in.
As of SQL Server 2000, you can at least set which will fire first and last.

Triggers cannot be invoked on-command.

Create

An INSERT trigger:

CREATE TRIGGER triggerName ON tableName
AFTER INSERT
AS
BEGIN
    --statements
END

An INSERT and UPDATE trigger:

CREATE TRIGGER triggerName ON tableName
AFTER INSERT, UPDATE
AS
BEGIN
    --statements
END

You can specify that a trigger not fire during replication processes:

CREATE TRIGGER triggerName ON tableName
AFTER INSERT
NOT FOR REPLICATION
AS
BEGIN
    --statements
END

Inside these stored procedures, you have access to special temp tables that hold the before/after values of affected records.
The "inserted" temp table holds (INSERT) rows to be inserted or (UPDATE) new rows inserted by update.
The "deleted" temp table holds (UPDATE) existing rows modified by update or (DELETE) rows deleted.

INSTEAD OF

An INSTEAD OF trigger will run instead of the INSERT/UPDATE/DELETE operation. The CRUD operation does not occur at all.

Example usage: you have a table that does not allow delete, it just has an IsDeleted column. So you put an INSTEAD OF DELETE trigger on it to enforce that rule.


CREATE TRIGGER triggerName ON tableName
INSTEAD OF DELETE
AS
BEGIN
    --statements
END

Disable

You can temporarily disable a trigger.

DISABLE TRIGGER triggerName ON tableName;

ENABLE TRIGGER triggerName ON tableName;

Drop

Drop a trigger:

DROP TRIGGER IF EXISTS triggerName;

List

List all triggers on a database.

SELECT  
    name,
    is_instead_of_trigger
FROM sys.triggers
WHERE type = 'TR';

Cursors

Cursors let you process a result set one row at a time.

It is recommended that you avoid cursors whenever possible, since they are slow. You can often use set-based operations instead.
A SELECT query will return its full result set with one trip to the server, while a CURSOR will continually ping the server for more sections of the result set.

If you must a cursor, consider storing the query results in a temp table first so that the cursor does not have to keep hitting the table.

Basic


DECLARE cursorName CURSOR 
FOR 
    SELECT 
        * 
    FROM tableName;
OPEN cursorName;
FETCH NEXT FROM cursorName INTO @A, @B;
WHILE(@@FETCH_STATUS = 0)
BEGIN
    --multiple lines
    FETCH NEXT FROM cursorName INTO @A, @B;
END;
CLOSE cursorName;
DEALLOCATE cursorName;

@@FETCHSTATUS returns the status of the last cursor fetch statement. 0 means successful.

Default Settings

SQL Server defaults to static cursors.

To verify the default settings, you can create a cursor and view its settings.

DECLARE c CURSOR FOR SELECT 1;
SELECT properties FROM sys.dm_exec_cursors(@@spid);

Static

A static cursor displays the result set as it was when the cursor was opened.

DECLARE cursorName CURSOR STATIC
FOR 
    --select statement

Dynamic

A dynamic cursor reflects ongoing changes in the base table.

DECLARE cursorName CURSOR DYNAMIC
FOR 
    --select statement

Forward Only


DECLARE cursorName CURSOR FORWARD_ONLY
FOR 
    --select statement

Fast Forward

FAST_FORWARD is almost the same as READ_ONLY, FORWARD_ONLY.
FAST_FORWARD is usually faster than just FORWARD_ONLY.


DECLARE cursorName CURSOR FAST_FORWARD
FOR 
    --select statement

TODO: there's a lot more to cursors than this
CTE

CTE stands for Common Table Expression. CTEs enable recursive queries, which can operate over hiearchical data.

A CTE is a common result set that you can reference within another CRUD statement. It's like a temporary view.

CTEs can help with:
- legibility, because you've broken a complex query into separate pieces
- instead of a view, if you only need it for one query
- recursion, search hierarchical data

You cannot use ORDER BY in a CTE definition, unless you also use TOP.

Non-Recursive


WITH cteName (columnA, columnB)
AS
(
    SELECT
        columnA,
        columnB
    FROM tableName
)
SELECT
    columnA,
    columnB
FROM cteName;

You can define multiple CTEs within one statement.

WITH cteNameA (columnA, columnB)
AS
(
    SELECT
        columnA,
        columnB
    FROM tableName
),
cteNameB (columnC, columnD)
AS
(
    SELECT
        columnC,
        columnD
    FROM tableName
)
SELECT
    columnA,
    columnB,
    columnC
FROM cteNameA
INNER JOIN cteNameB ON cteNameA.columnA = cteNameB.columnB;

Recursive

CTEs have a maximum recursion depth of 100.

Example: count from 1 to 50

WITH cteName (columnA, columnB)
AS
(
    --anchor member
    SELECT 1 AS n

    UNION ALL

    --recursive member
    SELECT n+1
    FROM cteName --invocation
    WHERE n < 50 --terminator
)
--select

Recursive CTEs cannot use GROUP BY, HAVING, LEFT/RIGHT/OUTER JOIN, SELECT DISTINCT, TOP, subqueries, or scalar aggregation.
Scripts

Find All References

Search the database definitions for a string.


USE myDatabase;
DECLARE @text VARCHAR(100);
SET @text = 'IsCommitted';
SELECT
    sys.objects.object_id,
    sys.schemas.name as [schema],
    sys.objects.name as [objectName],
    sys.objects.type_desc as [type],
    @text as [searchString]
FROM
    sys.sql_modules (NOLOCK)
    INNER JOIN sys.objects (NOLOCK) ON sys.sql_modules.object_id = sys.objects.object_id
    INNER JOIN sys.schemas (NOLOCK) ON sys.objects.schema_id = sys.schemas.schema_id
WHERE
    sys.sql_modules.definition COLLATE SQL_Latin1_General_CP1_CI_AS LIKE '%'+@text+'%'
ORDER BY
    sys.objects.type_desc, sys.schemas.name, sys.objects.name

Performance

Get a list of counters used to determine database performance and load. Requires the View Server State permission.


SELECT * 
FROM sys.dm_os_performance_counters 
WHERE instance_name = 'databaseName';
Trigger

Naming Convention

"TR_TableName_Event"
Ex: TR_Customers_AfterInsert

Possibly with an additional descriptor at the end, if you have many triggers on one event.
Hints

NOLOCK

Tells SQL: don't wait for the table to be unlocked, just read from it.


SELECT * FROM BusyTable (NOLOCK);
Collation

Collation can be set at the database level, can be overridden at the table level (?maybe?), and can be overridden at the column level.

Collation affects how text data is interpreted, how text is sorted, case-sensitivity rules, etc.

You cannot compare text data that is using different collations.

Database

To see what collation a database is using:
1) Open SSMS
2) Right-click on the database > Properties > Options tab > Collation is at the top

Or in T-SQL:

--for the server
SELECT CONVERT (varchar(256), SERVERPROPERTY('collation'));

--for the database
SELECT name, collation_name FROM sys.databases; 

--for the column
SELECT name, collation_name FROM sys.columns WHERE name = N'<column name>';

To see all supported collations:

SELECT name, description FROM sys.fn_helpcollations();


LocalDb

Microsoft SQL Server 2012 LocalDb
Microsoft SQL Server 2014 LocalDb
Microsoft SQL Server 2016 LocalDb
Microsoft SQL Server Express LocalDb

These are light-weight databases for developers. Instead of running as a Service, they are run on-demand (when a connection is attempted) and turn off when the last connection closes.

The actual data is saved in the user's Application Install folder.

Run the SqlLocalDb Utility to verify installation:

//in command prompt
SQLLocalDb info
If installed, it will output "MSSQLLocalDb" and (sometimes) the version number.

Connection

To make a connection from Visual Studio:
1) Open your Visual Studio ASP.Net project
2) View menu > Database Explorer (or Server Explorer, in other versions of Visual Studio)
3) Right-click on Data Connections > Add Connection
4) Select Microsoft SQL Server > Continue
5) Set Server Name to connect to SQL Server Local DB
- in Visual Studio 2012, use "(LocalDb)\v11.0"
- in Visual Studio 2015/2017, use "(LocalDb)\MSSQLLocalDB"
6) Select an existing database from the list, or enter a new name
7) Ok

Now in the Database Explorer, you can see the structure of that database.

You can right-click on a table in the Database Explorer > Show Table Data
- view data
- edit data manually
- insert data manually

You can look at the Properties for the connection to see the connectionstring:
Ex: "Data Source=(LocalDb)\MSSQLLocalDB;Initial Catalog=DemoDatabase;Integrated Security=True"

Delete Database

You can access LocalDb database through Visual Studio:
View menu > SQL Server Object Explorer
- open the LocalDb Databases folder
- you can delete a database from here