Sam Vanga's

SQL Server, T-SQL, BI and more…

Creating Your First SSAS 2012 Tabular Project

leave a comment »

No matter how simple it is, new things come with some confusion; at least until you get used to it. Analysis Services 2012 Tabular is no different. This post will look at steps to create your first Tabular project.

Let’s start with firing up SQL Server Data Tools located under Microsoft SQL Server 2012 RC0. I’ve selected to create a new project.

SQL Server 2012 SSAS Tabular Project

In the above image, I’ve selected Analysis Services Tabular Project.

You might see a error message similar to the one below. This happens when the server instance specified in the Workspace Server property doesn’t have an analysis services running in tabular mode. For instance, in this example the name of my instance is localhost\SQL110, but I purposefully changed it to an incorrect name to show this error.

SSAS Tabular Workspace database server error

To fix this, I’ve changed Workspace Server to the correct one from the properties of model.bim

image

Importing Data

The Model option located on the top allows you to create model files. You’ll begin with importing data from a source.

image

The next screen lets you select a type of data source. I’ve selected Microsoft SQL Server. In the next screen, you choose a server name and database name as the source.

You’ll provide impersonation information in the next screen. This is important because, analysis services uses this account to connect to the data source. So, make sure that this account has required permissions on both the server instances: data source and analysis services.

Next screen asks how you want to import the data. You can either choose from a list of tables or view, or specify a query to retrieve the data. I’ve opted to select from a list.

From the list of tables and views in the below image, I’ve selected FactInternetSales, DimProduct, DimProductSubCategory, DimProductCategory, DimCustomer, and DimGeography.

image

Clicking finish will complete by the data import and the next screen displays a status of the tables and views that were imported.

image

Toggle to the diagram view to see the tables and relationships among the objects in the model. In the below image, you’ll notice the wizard identifies the relationships in the data source.

image

Toggle to the grid view to see the data that was imported from data source to the model. This interface looks similar or Excel workbooks with one tab for each table in the data source.

image

Create Measures

The first thing you should do after importing data is to create measures. I’ll select Sales Amount and choose Sum from the measure drop down in the following image.

image

I’ve repeated the same steps for Tax Amount, to create another measure.

Deploy To Target Server

Before deploying to the server, take a few seconds to verify the target server properties. Target Analysis Services must be running in tabular mode.

In the below image, I’ve selected to deploy the model.

image

Browse Your First Tabular Model

I selected Analyze in Excel option located on the top. An excel pivot table with a connection to the model you just deployed allows you to analyze data in Excel.

image

Conclusion

Well, Well, Well. I know this has been a lengthy post. So I’ll let you go with no further ado, but please use the comments below should you have any questions or comments.

~Sam

Written by Samuel Vanga

February 21, 2012 at 7:00 am

Be careful with two digit years

leave a comment »

I recently had a scary conversation with a user, which can be broken down to:

User: Why would the data have customers with a birth date of MM/DD/2020?

Me: What are you talking about? Lemme take a look.

Me again, this time with a changed voice: Umm, Umm, i plead guilty with an explanation your honor!

User: Let’s hear your explanation son.

Me: We receive a raw file with customer data. Date of birth is a field in that file, and it is supposed to contain data in ‘DD-Month-YYYY’ format. But, apparently, one fine day the file contained two digits for year instead of four. My import process decided that the two digit year 20 is 2020 and not 1920.

User: Okay, I’ll let you go this time, but fix it. Won’t you?

<end of story>.

Turns out, SQL Server automatically uses current century for two digit years less than 50, and previous century for years greater than or equal to 50.

SELECT CAST('01-January-49' AS DATETIME) [<50]

Result:

<50
———————–
2049-01-01 00:00:00.000

(1 row(s) affected)

SELECT CAST('01-January-50' AS DATETIME) [>50]

Result:

<50
———————–
1950-01-01 00:00:00.000

(1 row(s) affected)

Here is the books online explanation for the two digit year cutoff. Benjamin Nevarez (t|b) pointed me to this link when i asked for help on twitter using the #sqlhelp hashtag. Thanks Benjamin.

Cheers!

@SamuelVanga.

You will also like If i learn something, sorry, but you have to read it!

Written by Samuel Vanga

February 14, 2012 at 8:10 am

LAST_VALUE() Analytic Function–SQL Server 2012

leave a comment »

Earlier, i discussed FIRST_VALUE() function. FIRST_VALUE() returns the first value from an ordered set of records. Similarly LAST_VALUE() returns the last value for an ordered result set, but you are in for a bit of a surprise if you aren’t familiar with the enhancements to the SQL windows functions.

I’ll use the same example data i used earlier.

CREATE TABLE Customers
(CustId INT, OrderDate DATE, OrderAmount MONEY) ;

INSERT Customers VALUES
(1, ’2011-12-03′, 12500)
, (
1, ’2011-10-07′, 15000)
, (
1, ’2011-09-12′, 16000)
, (
2, ’2011-07-11′, 900)
, (
2, ’2011-09-13′, 18000)
, (
2, ’2011-11-03′, 7000)
, (
3, ’2011-01-29′, 19000)
, (
3, ’2011-04-19′, 1000)
, (
3, ’2011-07-09′, 12000) ;

We wrote a query using FIRST_VALUE() to calculate the date of first purchase for each customer.

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, FIRST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM
Customers c

Now, let’s say we want to know the recent purchase date for each customer. Similar to FIRST_VALUE(), there is a LAST_VALUE() function. LAST_VALUE() allows you to return the last value from an ordered result set.

I’ll follow my killer instincts and replace FIRST_VALUE() with LAST_VALUE() in the above query. It should work. Right?

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM Customers C

Result Set:

… but, oops, this is not the result i was expecting.

There are two points to note here. Window and Frame.

A window is a set of records that are defined by the OVER() clause. Because of the PARTITION BY Custid, you can safely think records for each Custid form a window in this example.

A Frame is a range of rows within a window; a subset of a window in other words, defined by the ROWS clause. A range has boundaries – a begin position and an end position.

Look back at the query using LAST_VALUE(). You’ll notice there wasn’t a ROWS clause. We haven’t defined a frame. In this case SQL Server uses the default frame which is, between the first row without any boundaries (first row of the window partition) and the current row. Last value from this default frame will always be the current row.

Correct usage of the function should instead be:

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, LAST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate ROWS BETWEEN  CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM Customers C

Given all this, FIRST_VALUE() is providing expected results even with the default frame because the first row in the default frame is the first row of the window partition.

Here is an interesting article from Jeremiah Peschka (blog) for further reading: Leaving the Windows Open.

~Sam

Follow Sam on Twitter - @SamuelVanga


Written by Samuel Vanga

February 7, 2012 at 7:00 am

Data Labels, Axis Interval, Rotate Label Text: SSRS Nugget

with 2 comments

Charts present summary data in a visually appealing format. Following is a chart for sales over month:


image

Before

I frequently use a few options along with charts to get most out of them.

Show data labels:

It’ll be easier for the users to quickly read, if the chart shows the actual numbers along with bars. Steps: From the chart data pane, click data value, and choose Show Data Labels.

image

Display all categories on horizontal axis:

Notice in the above chart, all the categories (Months in our example) are not displayed. By default, auto fit property determines the interval of horizontal axis labels. Changing the interval to 1 will force the axis to display all categories. Steps are depicted in the below screen prints.

image

image

Rotate Labels:

Sometimes you may notice the labels are rotated automatically to better fit them, other times labels overlap with each other. Labels can be forced to rotate by disabling auto-fit and changing the label rotation angle to 90 degrees.

image


image

After

Summary:

In this post, I’ve looked at simple properties that can extend chart’s functionality: Showing data labels, configuring axis interval, and rotating axis labels.

~Sam.

Written by Samuel Vanga

January 31, 2012 at 7:30 am

Read contents of a file to variable – SSIS

with 2 comments

Sometimes you will find the need to import a file to a variable using SSIS. In this post, I’ll use Script Task to read contents of a flat file into a variable.

I’ve created a file called ‘myfile.txt’ with some random text. I’ll configure script task to read this file into a SSIS variable.

  • First create a variable with string data type. I’ll call mine vText.
  • Drag a script task onto the control flow. As you would imagine double clicking the script task will open the editor.
  • Specify the variable you just created as a ReadWriteVariable.
  • Click edit script and use code similar to what you see below:

'initialize StreamReader class for text file
Dim streamReader As New StreamReader("D:\Documents\myfile.txt")
' Read the StreamReader To End and assign to local variable
Dim StreamText As String = streamReader.ReadToEnd()
' assign SSIS variable with value of StreamText local variable.
Me.Dts.Variables("vText").Value = StreamText

To test, I’ll add another script task. This task will show a message box with the contents of our variable. Following is the script i use for message box:

MsgBox(Dts.Variables("vText").Value.ToString)

As you can see I’m reading the file from a static location on my D drive. Most people, however, will use variables or expressions to read from any locations dynamically.

@SamuelVanga.

Written by Samuel Vanga

January 24, 2012 at 7:00 am

#meme15 – Use. Twitter. Why?

with 3 comments

Should an average Jane or Joe professional use twitter?meme15new

Yes.

Why?

Because, I’m using it.

Not enough reasons? <grumpy face> I knew you wouldn’t buy this.

Okay, allow me to present why you should:

Advertise.

You might be thinking, “I’m neither an author nor do i have any business to advertise”. True, but do you have a blog? Are you working on any interesting project? Are you a character? If the answer to any of these questions is yes, then yes, you’ve something to advertise.

Follow your interests.

Twitter makes it easy to follow your interests: people, events, or anything else, so you can stay up to date of what’s happening around your interests. I’ll use me as an example: I follow a (very) few people. No, I don’t follow celebs like @KimKardashian or @justinbieber – I don’t care what they do – oh, sorry, my interests don’t match theirs. Instead, I follow people from whom I learn new things, people whom I find interesting, people those inspire me. If any of those people write a new article, I want to know about it. With Twitter, i can know the news from all the people at a single place.

Interact.

Example. Mine. Again.

I follow Marco Russo (blog|@Marcorus). Marco tweeted that he was doing a session on Vertipaq vs OLAP.

image

I was excited about it and wanted to attend, but couldn’t. I tweeted this.

image

A tweet appeared magically from Marco saying, hey Sam – you are awesome, this session has been recorded for you. Ha, I’m dreaming here. But you can see his tweet below:

image

This shows how Twitter makes it easy for a novice like me to interact with a master like Marco.

Wrap it…

I’m sure there are other ways to achieve these – but don’t you want to pick the horse that wins the race?

There are more reasons, oh yes, more. Need’em? Sure. Fantastrateic bloggers are giving more reasons here, while participating in this month’s #meme15, created by Jason Strate (t).

When you are ready to start using Twitter, read this free simple twitter e-book by Brent Ozar (@BrentO). You will thank me later.

See ya on the Twitter world…

@SamuelVanga

Written by Samuel Vanga

January 16, 2012 at 7:30 am

Posted in General, meme15

Tagged with , ,

FIRST_VALUE() Analytic Function–SQL Server 2012

with 3 comments

Given a customer table:

CREATE TABLE Customers
(CustId INT, OrderDate DATE, OrderAmount MONEY) ;

INSERT Customers VALUES
(1, ’2011-12-03′, 12500)
, (
1, ’2011-10-07′, 15000)
, (
1, ’2011-09-12′, 16000)
, (
2, ’2011-07-11′, 900)
, (
2, ’2011-09-13′, 18000)
, (
2, ’2011-11-03′, 7000)
, (
3, ’2011-01-29′, 19000)
, (
3, ’2011-04-19′, 1000)
, (
3, ’2011-07-09′, 12000) ;

Let’s say, we want to retrieve all records along with a column showing the customer’s date of first purchase.

We could start by computing the date of first purchase for each customer.

SELECT CustId, MIN(OrderDate) FirstPurchaseDate
FROM   Customers
GROUP BY
CustId

Result set has one row per each customer.

image

Joining this to the original table will return the expected results.

SELECT c.*, firstD.FirstPurchaseDate
FROM   Customers c
INNER JOIN
(
SELECT CustId, MIN(OrderDate) FirstPurchaseDate
FROM   Customers
GROUP BY
CustId
)firstD
ON firstD.CustId = c.CustId

Result set:

image

The query plan uses a nested join because of the join we were using the query above.

image

We can achieve the same results using the FIRST_VALUE() function.

SELECT
c.CustId, c.OrderDate, c.OrderAmount
, FIRST_VALUE(OrderDate) OVER(PARTITION BY CustId ORDER BY OrderDate) CustomerSince
FROM
Customers c

[PARTITON BY CustId], creates a partiton for each customer. [ORDER BY OrderDate] sorts each partition by the OrderDate. Now, from this ordered result set, FIRST_VALUE() returns the first value for each partition.

The query is easier to read and a join is not needed. Result: the nested loop is eliminated in the query plan. This could improve performance.

image

I’ll conclude with a question for you to understand LAST_VALUE(). From the same fictitious customer data, let’s say we want to compute the recent purchase date (as opposed to first purchase date from above example) of every customer. What would be the query? I welcome your response in the comments below.

@SamuelVanga

Written by Samuel Vanga

January 10, 2012 at 9:30 am

Management Studio Split Tabs

with 12 comments

This is by far the simplest and silliest post ever, but i just figured how to create vertical tabs in management studio. When I learn something, you have to read it. I apologize, but you have to read it. As Ellen DeGeneres says, “if I see something… you have to see it” (YouTube Video).

When I want to compare scripts, I would open management studio in two separate windows and compare them. I knew there is an easier way, but didn’t know how to do that until recently.

Right click on a tab and choose – New Vertical Tab Group.

image

Split vertical tabs can be seen in the image below.

image

Note that you won’t see the new tab group options unless you have more than one tab open. Horizontal tabs can be created by clicking on New Horizontal Tab Group.

When done, you can reset the tabs to the original position by clicking on Move to Previous Tab Group.

image

I’m sure you know this – but like I said earlier, if I learn something…!

@me.

Written by Samuel Vanga

January 3, 2012 at 7:30 am

Posted in General

Tagged with

Why doesn’t my gauge show correct values?

leave a comment »

Here is an issue I encountered recently while using gauge in reporting services.
There are three gauges in the report one for each product; bikes, computers, and hardware. The value being showed is YTD sales and all gauges show a value of 100 for YTD Sales as seen below:

However the actual YTD sales are 100, 150 and 200 for bikes, computers, and hardware respectively. See the sample data below:

Okay, why is this happening? I am glad you asked!

Minimum and Maximum values for chart:

Starting and ending values (0,100) on the gauge scale are controlled by minimum and maximum values that can be configured on the general page of scale properties. Right click on scale and select scale properties.

Since maximum value is set to 100, scale doesn’t show beyond 100. And when the actual value (YTD sales in this case) exceeds the maximum value the pointer will be defaulted to point to the maximum value.

To resolve this problem, change the maximum value to the following expression.

=sum(Fields!YTDSales.Value)

~Sam.

Other Related Post: SSRS Error – An item with the same key

Written by Samuel Vanga

December 27, 2011 at 8:00 am

Posted in SSRS

Tagged with , ,

Full Load. Incremental Load. Whaaat?

leave a comment »

DinerSpecials4Back when i was in school, me and a bunch of friends used to hangout at a diner. The diner was popular for its specials, which were displayed on a board outside the entrance.

 The manager would roll in every day, wipe the board clean and list all the specials for that day. There will be a few new specials introduced daily, but most of them would be the same every day. Still the manager would erase everything completely (including those that weren’t changed) and rewrite. He was fine with having to spend a lot of time every day to do this, but he wanted to clearly display all the specials.

On the other hand, a manager at a different diner would erase only those specials that were discontinued and add the new ones. He didn’t mind that the specials were not clearly displayed, and only wanted to make the daily changes as quickly as possible.

Destructive Load

Also known as Full Load, is a process of completely destroying/deleting the existing data and reloading it from scratch. A lot of unchanged data is also deleted and reloaded in this process. But a destructive load ensures the highest data integrity easily.

“Delete destination data. Read data from source. Load into destination.”

Incremental Load

Incremental load is a process of loading data incrementally. Only new and changed data is loaded to the destination. Data that didn’t change will be left alone. Data integrity can be ensured in this process too, but ETL can get complicated.

“Read source data. Compare with destination. Filter for new and changed data. Write to destination.”. I stole this line from Anatomy of an Incremental Load by Andy Leonard (b|t).

Why Incremental?

A full load seems to be easiest the approach. Right?. Then why do we even care about incremental load?

  1. Speed. Opting to do a full load on larger datasets will take a great amount of time and other server resources. Ideally all the data loads are performed overnight with the expectation of completing them before users can see the data the next day. The overnight window may not be enough time for the full load to complete.
  2. Preserving history. When dealing with a OLTP source that is not designed to keep history, a full load will remove history from the destination as well, since full load will remove all the records first, remember! So a full load will not allow you to preserve history in the data warehouse.

Full Load vs. Incremental Load:

  Destructive Load Incremental Load
How it works Deletes all rows and reload from scratch. Only new or updated rows are processed
Time Requires more time. Requires less time.
Data Integrity Can easily be guaranteed Difficult. ETL must check for new/updated rows.
History Can be lost. Retained.


Conclusion

Full load is the easiest way to load daily. But consumes a lot of time and other server resources. Incremental load processes only the new or changed data, so when time is of essence with larger data sets, incremental load is the way to go.

Next up…

In my next post, I’ll continue with scripts to do a full and incremental loads.

@SamuelVanga.

Written by Samuel Vanga

December 20, 2011 at 7:30 am