SQL Server Views

This post is first of yet to come long series on basic concepts of SQL Server. I’ll begin with basic 101 introduction to SQL Server views. I wouldn’t recommend reading any further if you are an experienced sql server’er. I value your time :-)

What are views?

  • virtual tables
  • named select statements

Where are views used?

  • to simplify underlying data model for users
  • to implement security mechanism
  • anywhere a table is expected

How to create views?

Create View t-sql command is used to create a view. Following is a sample from AdventureWorks database. Views are defined by the underlying query (select statement that creates it). Views contain rows and columns that were returned by the underlying query.

CREATE VIEW dbo.[vEmployee]
AS
SELECT
c.[FirstName]
, c.[LastName]
, [StateProvinceName] = sp.[Name]
, [CountryRegionName] = cr.[Name]
FROM
[HumanResources].[Employee] e
INNER JOIN [Person].[Contact] c
ON c.[ContactID] = e.[ContactID]
INNER JOIN [HumanResources].[EmployeeAddress] ea
ON e.[EmployeeID] = ea.[EmployeeID]
INNER JOIN [Person].[Address] a
ON ea.[AddressID] = a.[AddressID]
INNER JOIN [Person].[StateProvince] sp
ON sp.[StateProvinceID] = a.[StateProvinceID]
INNER JOIN [Person].[CountryRegion] cr
ON cr.[CountryRegionCode] = sp.[CountryRegionCode]

A select query can be used to retrieve data from the view.

SELECT * FROM dbo.vEmployee

If you are following along and executing the scripts so far you will notice that selecting from a view will return a result set (rows and columns).

How are views stored?

Unlike tables, contents of a view aren’t physically stored. Only the underlying query is given a name and saved on the server. Okay Mr.Genius, if the contents of a view are not saved how did selecting a view return data? I am glad you asked.

SQL Server internally replaces view with the underlying query and reads                                                                                                        select * from <view name> as select * from <underlying query>.

Uses of Views:

Most of the OLTP databases are highly normalized; which means data is spread across multiple tables and naturally joins will have to be used to retrieve this data making it difficult for report writers and other users. Views like dbo.vEmployee can be used to hide all the complexity from users. They form a simplified layer between database and users.

Views can also be used to implement security mechanism. Imagine a student table with ID, Name, SSN, and DOB. Not all the users should have access to SSN of students, a view can be created with only ID, Name, and DOB and users be granted access to this view limiting them from viewing SSN.

~Sam

SCRIPT: Find the row count of all tables and views

Today I’m sharing another script that I often find useful – Queries to find the count of rows from tables and views in a SQL Server database.

Count Rows In All Tables:

This query returns the count of rows from all tables in a database. The results can be seen instantaneously without causing any blocking or locking. It is safe to run this in production.

; WITH    A AS ( SELECT   TableName = O.Name ,
                       
SchemaName = SCHEMA_NAME(O.Schema_id) ,
                       
[Rows] = P.Rows
              
FROM     sys.objects O
                       
INNER JOIN sys.partitions P ON O.OBJECT_ID = P.OBJECT_ID
              
WHERE    TYPE = 'U'
            
)
   
SELECT  SchemaName ,
           
TableName ,
           
SUM([Rows])
   
FROM    A
   
GROUP BY SchemaName ,
           
TableName ;

 

Count Rows In All Views:

Unlike for tables, there is no way to query any DMV to get the row count. The only way is to perform COUNT() on all the views. The following query outputs a select for each view. Copy the results and execute in a query window.

SELECT fullname+' UNION ALL'
FROM sys.views v
INNER JOIN
sys.schemas s
ON s.schema_id = v.schema_id
CROSS APPLY (SELECT 'select '+''''+s.name+'.'+v.name+''''+','+ 'count(*) from '+s.name+'.'+v.name AS fullname) AS a

UNION ALL

SELECT 'SELECT '+'''DUMMY'''+', '+'0'

Caution: The query for views can be dangerously slow depending on the view’s definition. Please, please, please test it in your development before running it in production.

@SamuelVanga

SCRIPT: Basic Information About Indexes

I’m having a little fun with documenting basic information about indexes in my current project. I’m posting the scripts here mostly for me to come back when I need them in the future, but hopefully some of you might find them useful.

Find tables without any index:

This script gives you the list of tables that don’t have any index.

; WITH A
AS
(
SELECT SchemaName = OBJECT_SCHEMA_NAME(t.OBJECT_ID) ,
TableName = t.name
FROM   sys.tables t
WHERE  OBJECTPROPERTY(t.OBJECT_ID,'TableHasIndex') = 0
)
SELECT SchemaName ,
TableName ,
TwoPartName = SchemaName+'.'+TableName
FROM A

Find tables without a clustered index:

This script gives the list of tables without a clustered index.

; WITH B
AS
(
SELECT SchemaName = OBJECT_SCHEMA_NAME(OBJECT_ID) ,
TableName = OBJECT_NAME(OBJECT_ID)
FROM sys.indexes
WHERE index_id = 0
AND OBJECTPROPERTY(OBJECT_ID, 'IsUserTable') = 1
)
SELECT SchemaName ,
TableName ,
TwoPartName = SchemaName+'.'+TableName
FROM B
ORDER BY SchemaName

Find indexes on a table along with columns covered:

Use this script to get the index names and a comma separated list of columns included in the index.

; WITH C
AS
(
SELECT TableName = t.name
, IndexName = i.name
, ColumnName = c.name
FROM   sys.tables t
INNER JOIN
sys.indexes i
ON t.OBJECT_ID = i.OBJECT_ID
INNER JOIN
sys.index_columns ic
ON ic.OBJECT_ID = i.OBJECT_ID
AND ic.index_id = i.index_id
INNER JOIN
sys.columns c
ON c.OBJECT_ID = t.OBJECT_ID
AND c.column_id = ic.column_id
)
SELECT DISTINCT TableName, IndexName, ColumnNames
FROM cte A
CROSS APPLY
(
SELECT ColumnName + ', '
FROM cte B
WHERE A.TableName = B.TableName AND A.IndexName = B.IndexName
ORDER BY TableName, IndexName
FOR XML PATH('')
)
D (ColumnNames)

Notice that I generously used CTE’s here. I like to reuse column aliases instead of repeating length expressions everywhere. CTE allows you to do that.

@SamuelVanga

Be careful with two digit years

I recently had a scary conversation with a user, which can be broken down to:

User: Why would the data have customers with a birth date of MM/DD/2020?

Me: What are you talking about? Lemme take a look.

Me again, this time with a changed voice: Umm, Umm, i plead guilty with an explanation your honor!

User: Let’s hear your explanation son.

Me: We receive a raw file with customer data. Date of birth is a field in that file, and it is supposed to contain data in ‘DD-Month-YYYY’ format. But, apparently, one fine day the file contained two digits for year instead of four. My import process decided that the two digit year 20 is 2020 and not 1920.

User: Okay, I’ll let you go this time, but fix it. Won’t you?

<end of story>.

Turns out, SQL Server automatically uses current century for two digit years less than 50, and previous century for years greater than or equal to 50.

SELECT CAST('01-January-49' AS DATETIME) [<50]

Result:

<50
———————–
2049-01-01 00:00:00.000

(1 row(s) affected)

SELECT CAST('01-January-50' AS DATETIME) [>50]

Result:

<50
———————–
1950-01-01 00:00:00.000

(1 row(s) affected)

Here is the books online explanation for the two digit year cutoff. Benjamin Nevarez (t|b) pointed me to this link when i asked for help on twitter using the #sqlhelp hashtag. Thanks Benjamin.

Cheers!

@SamuelVanga.

You will also like If i learn something, sorry, but you have to read it!

LAST_VALUE() Analytic Function–SQL Server 2012

Earlier, i discussed FIRST_VALUE() function. FIRST_VALUE() returns the first value from an ordered set of records. Similarly LAST_VALUE() returns the last value for an ordered result set, but you are in for a bit of a surprise if you aren’t familiar with the enhancements to the SQL windows functions.

I’ll use the same example data i used earlier.

CREATE TABLE Customers
(CustId INT, OrderDate DATE, OrderAmount MONEY) ;

INSERT Customers VALUES
(1, ’2011-12-03′, 12500)
, (
1, ’2011-10-07′, 15000)
, (
1, ’2011-09-12′, 16000)
, (
2, ’2011-07-11′, 900)
, (
2, ’2011-09-13′, 18000)
, (
2, ’2011-11-03′, 7000)
, (
3, ’2011-01-29′, 19000)
, (
3, ’2011-04-19′, 1000)
, (
3, ’2011-07-09′, 12000) ;

We wrote a query using FIRST_VALUE() to calculate the date of first purchase for each customer.

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, FIRST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM
Customers c

Now, let’s say we want to know the recent purchase date for each customer. Similar to FIRST_VALUE(), there is a LAST_VALUE() function. LAST_VALUE() allows you to return the last value from an ordered result set.

I’ll follow my killer instincts and replace FIRST_VALUE() with LAST_VALUE() in the above query. It should work. Right?

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM Customers C

Result Set:

… but, oops, this is not the result i was expecting.

There are two points to note here. Window and Frame.

A window is a set of records that are defined by the OVER() clause. Because of the PARTITION BY Custid, you can safely think records for each Custid form a window in this example.

A Frame is a range of rows within a window; a subset of a window in other words, defined by the ROWS clause. A range has boundaries – a begin position and an end position.

Look back at the query using LAST_VALUE(). You’ll notice there wasn’t a ROWS clause. We haven’t defined a frame. In this case SQL Server uses the default frame which is, between the first row without any boundaries (first row of the window partition) and the current row. Last value from this default frame will always be the current row.

Correct usage of the function should instead be:

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate ROWS BETWEEN  CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM Customers C

Given all this, FIRST_VALUE() is providing expected results even with the default frame because the first row in the default frame is the first row of the window partition.

Here is an interesting article from Jeremiah Peschka (blog) for further reading: Leaving the Windows Open.

~Sam

Follow Sam on Twitter - @SamuelVanga


FIRST_VALUE() Analytic Function–SQL Server 2012

Given a customer table:

CREATE TABLE Customers
(CustId INT, OrderDate DATE, OrderAmount MONEY) ;

INSERT Customers VALUES
(1, ’2011-12-03′, 12500)
, (
1, ’2011-10-07′, 15000)
, (
1, ’2011-09-12′, 16000)
, (
2, ’2011-07-11′, 900)
, (
2, ’2011-09-13′, 18000)
, (
2, ’2011-11-03′, 7000)
, (
3, ’2011-01-29′, 19000)
, (
3, ’2011-04-19′, 1000)
, (
3, ’2011-07-09′, 12000) ;

Let’s say, we want to retrieve all records along with a column showing the customer’s date of first purchase.

We could start by computing the date of first purchase for each customer.

SELECT CustId, MIN(OrderDate) FirstPurchaseDate
FROM   Customers
GROUP BY
CustId

Result set has one row per each customer.

image

Joining this to the original table will return the expected results.

SELECT c.*, firstD.FirstPurchaseDate
FROM   Customers c
INNER JOIN
(
SELECT CustId, MIN(OrderDate) FirstPurchaseDate
FROM   Customers
GROUP BY
CustId
)firstD
ON firstD.CustId = c.CustId

Result set:

image

The query plan uses a nested join because of the join we were using the query above.

image

We can achieve the same results using the FIRST_VALUE() function.

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, FIRST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM
Customers c

[PARTITON BY CustId], creates a partiton for each customer. [ORDER BY OrderDate] sorts each partition by the OrderDate. Now, from this ordered result set, FIRST_VALUE() returns the first value for each partition.

The query is easier to read and a join is not needed. Result: the nested loop is eliminated in the query plan. This could improve performance.

image

I’ll conclude with a question for you to understand LAST_VALUE(). From the same fictitious customer data, let’s say we want to compute the recent purchase date (as opposed to first purchase date from above example) of every customer. What would be the query? I welcome your response in the comments below.

@SamuelVanga

CONCAT() string function–SQL Server 2012

As the name indicates, CONCAT() will concatenate two or more strings. This is one of the two string functions introduced in SQL Server 2012, FORMAT() being the other one.

New? Wait. Concatenating isn’t new. It’s always been there. Yeah. you are right. No, concatenating isn’t new.Yes, CONCAT() function is new.

“+” is used to concatenate strings, prior to 2012. As you can see in the query below, two strings are being concatenated.

SELECT 'HELLO'+SPACE(1)+'WORLD'

It gets tricky when you deal with non varchar data, and null values.

SELECT 1+SPACE(1)+'World' ;

SELECT ‘John’+NULL+‘Doe’ ;

These queries don’t return expected results. To fix that, select query should contain logic to replace null and convert int to string. On the left side of the below image is the corrected query.

image

CONCAT(), as can be seen in the right side, simplifies this process. It concatenates 2 or more strings, and automatically (implicitly) converts null to blank space and integer to string.

One more example to better explain the use of CONCAT().

SELECT CONCAT(1,1) ConcatInt
, CONCAT('John',NULL,'Doe') ConcatNull
, CONCAT(1,SPACE(1),'John') ConcatIntString

Lets wrap up…

Concat() is clearly a simpler way to concatenate strings. It improves code readability and is one of my favorite T-SQL enhancements.

~Sam.

T-SQL Tuesday #025 – Tips and Tricks

And the trick i share for this month’s T-SQL Tuesday is… Distraction free management studio.

If you’re an avid writer, you might be using Writeroom. It’s a full screen distraction free writing application, which gets your computer out-of-the-way so that you can focus on your work. Dark Room is a similar app for Windows.

Distraction free! Focus on your work! Isn’t that amazing?

So how about a similar app for management studio where you spend most of your time, writing queries? Wouldn’t it be nice to focus on queries alone and not worry about new email pop-ups or any other windows on your computer?

I did a quick search, and results, nada. But, better yet, this experience can be achieved by modifying a few options.

  • Add line numbers. From management studio, choose Tools, options. Expand Text Editor and expand Transact-SQL. In the general properties check Line numbers under Display and click ok.
  • Change background color. Again from options expand Environment and choose Fonts and Colors. Click ok.
  • View query window in full screen. Click view and choose full screen when writing queries. This will bring SSMS to the foreground and closes object explorer too. To exit from full screen again click view and deselect full screen.
  • Hide results window. Finally, press Ctrl+R to hide the results window. Press Ctrl+R again to display results window.

image

Give it a try, you’ll know what i am talking about!

@SamuelVanga

New string function in SQL Server 2012 – FORMAT()

Formatting numbers in an SSRS report is a common task. For example, you may want to format a number as currency or percentage.

You can select a format from the number page of the properties window.

image

You can let sql handle the formatting, so data in the result set is pre-formatted.

DECLARE @Sales MONEY = 32182000.85;

SELECT ‘$’
+ CONVERT(VARCHAR(32),@Sales,1);

Results:

image

Finally, you can use the newly introduced FORMAT() function in SQL Server 2012. Format() will, according to books online, return a value formatted with the specified format and optional culture. So, instead of converting and concatenating like we did in the previous example, FORMAT() can be used:

DECLARE @Sales MONEY = 32182000.85;

SELECT FORMAT(@Sales,‘c’,‘en-us’);

Results:

image

FORMAT() accepts the following parameters:

  • Value. Actual value that needs to be formatted.
  • Format. Value will be formatted to the specified format. Currency, percentage, and date are few examples.
  • Optional Culture. Specifies the language. More about cultures on BOL.PARSE()

Consider the following query. Value is formatted to three different languages based on the culture:

Formatting Currency:

DECLARE @Sales MONEY = 32182000.85;

SELECT FORMAT(@Sales,‘c’,‘it-IT’) [Italy]
, FORMAT(@Sales,‘c’,‘fr’) [France]
, FORMAT(@Sales,‘c’,‘ru-RU’) [Russian];

Results:

image

Formatting percentages:

DECLARE @Per DECIMAL(2,2) = 0.72;

SELECT FORMAT(@Per,‘p0′,‘en-us’)
,
FORMAT(@Per,‘p2′,‘en-us’);

Results:

image

Conclusion:

Similar formatting is ideally done in the presentation layer, reporting services for example. But I would want to let reporting services do minimal processing. FORMAT() simplifies string formatting. It provides functionality that most developers have always wanted.

Aaron Bertrand (blog), MVP, wrote a more detail article on this topic. Don’t forget to check that out.

~Sam.

Column alias in order by and where

If you ever wonder why column alias can be used in order by and not in where clause, it’s the logical query processing order. Pinal Dave blogged about the complete order here.

SELECT
Name n, GroupName g
FROM  HumanResources.Department D
WHERE g = 'Executive General and Administration'

In the above query, from clause is executed first followed by where clause, and select goes last. So where clause is trying to use a column g which doesn’t exist until select completes execution and thus resulting in a “invalid column g” error.

SELECT
Name n, GroupName g
FROM  HumanResources.Department D
ORDER BY g

But order by in this above query is executed at the end after select has created a column g.

And order by lived happily ever after >:]

~Sam.