PASS Summit 2012 Twitter Dashboard

As a business intelligence developer, my skills include taking the raw data, shaking it to remove all the junk, and presenting it to business users in the most sexiest way possible – smart people often use the words ETL, and BI to define this process, but I’m not smart – I neither have glasses nor a French beard. ^_^

Since I missed the Pass Summit 2012, I closely watched twitter feed for #sqlpass. Oh boy! That was amazing! There are believers (#PASSPrayers), first timers (#sqlfirsttimers), karaoke’rs (#sqlkaraoke), repeating offenders (Alumni), runners (#sqlrun) to name a few, making This a perfect family (#sqlfamily).

That was me watching #sqlpass tweets btw
OMG! So close.

Then, I put my skills to work. Through out the summit I collected tweets that were coming in with #sqlpass and other hashtags mentioned above, imported them to a database, and created the following dashboard.

See PASS Summit 2012 as it happened on twitter. Click the image below to interact (opens in a new window).

PASS Summit 2012 Twitter Dashboard - sqlpass

Charts on the top show the top tweeters and the busiest hours. Times are shown in Pacific Standard Time. Clearly, people tweeted more during the keynote sessions and there are cool people in the top tweeters list.

When you select a tweeter, all views in the dashboard are filtered for that particular user. For example, in the below image I select BrentO (Brent Ozar). You can see everything Brent tweeted and tweets over time.


Similarly, you can select a time to see everything for that hour.

Last functionality I show is searching tweets. Enter a term in the search tweets filter area in the bottom left. For example, I enter ‘Boring’ to get this:


Good news: there are only a few tweets that reflect a boring event. Bad news: there are tweets suggesting that the event (at some point) is boring. You can see when attendees felt boring. Don’t get me wrong – It’s not my intention to convey that Summit was boring. A good BI solution tells business users when and where there are problems.

Database Backup

Download the database backup from here. It’s a SQL Server 2012 RTM database. There are three tables:

dbo.Hashtag contains all the hashtags that were collected. HashtagID is an identity column.


dbo.Tweep is a table to store tweeters. UserID is the unique Twitter ID for a user, it’s not Identity in the database.


dbo.Tweet is used to store tweets. TweetID in this table is again an unique ID generated by Twitter. Note that TweetID alone isn’t unique in this table. Primary key is TweetID + HashtagID.


This query removes duplicates and gives you unique tweets. Change the filter conditions as you like.

USE HashtagArchive ;

WITH cte
SELECT tp.UserHandle
, tw.CreateDate
, tw.Tweet
, tw.Latitude
, tw.Longitude
, PST = DATEADD(HH,-8,CreateDate)
RowNum = Row_Number() OVER( Partition BY tw.TweetID ORDER BY (SELECT NULL))
FROM   dbo.Tweet tw
INNER JOIN dbo.Tweep tp
ON tw.UserID = tp.UserID
INNER JOIN dbo.Hashtag ht
ON ht.HashtagID = tw.HashtagID
WHERE  ht.Hashtag IN ('#Summit12', '#SqlPass','#Sqlrun','#sqlkaraoke','#sqlfamily','#ssisprecon','#PASSPrayers')
CreateDate > '2012-11-07 08:00:00.000'
SELECT UserHandle
, CreateDate
, [Date] = CAST(PST AS DATE)
[Time] = CONVERT(VARCHAR(8),PST,108)
, Latitude
, Longitude
FROM cte WHERE RowNum = 1
ORDER BY CreateDate ;


There is some noise with the #FirstTimers hashtag. I didn’t think young US citizens who voted for the first time would be on Twitter, strange! Make sure to avoid that and you’ve an archive of Twitter database for PASS Summit 2012.

Follow me and leave a comment

Too much to ask, I know! If you enjoyed this post, follow me on twitter (@SamuelVanga) – I’d love to get to know you. And leave a comment with what you discover from this dashboard.


Related Posts

SQL Saturday US Map – Posted 08/03/2012

Update: 11/12/2012 08:00 PM

Chris Webb (@Technitrain on Twitter), MVP and BI Expert posted  an example using NodeXL to analyze tweets. Check it out – It’s amazing!

Create Several Shell SSIS Packages In Minutes Using BIML

What do i mean by shell packages? You’re spot on – that’s an excellent question to begin with. For the purpose of this post, I define shell package as an empty package that is renamed appropriately and contains all the required connection managers.

It is a common pattern among SSIS developers to create separate packages for each unit of work. For example, you will create a package for each table that needs to be loaded.

The only problem: manually creating hundreds of packages, sometimes even more, is time consuming and boring. If you think about it, first steps of creating all these packages is always the same: Right click on SSIS Packages, select New SSIS Package, rename the package, and create Connection Managers.

Wouldn’t it be nice if there is a way to automate these boring steps? I hear you screaming yes.

Package templates help you to some extent. You create a package with the required connection managers and save it as a template. The subsequent packages will automatically contain all connection managers the template has, you don’t have to add them manually to all packages. You still have to add and rename packages though.

Let’s take a look at a much better and time saving method!

I use BIML to dynamically create multiple SSIS packages for each table that needs to be loaded. All packages will be renamed consistently and will contain connection managers.


Execute the following SQL to create source and destination databases.



Next, create some tables in the destination database.

USE DestDatabase;

CREATE TABLE dbo.DimProduct (
,ProductCode VARCHAR(10) UNIQUE
) ;

CREATE TABLE dbo.DimCustomer (
,CustomerCode VARCHAR(10) UNIQUE
) ;

CREATE TABLE dbo.DimEmployee (
,EmployeeCode VARCHAR(10) UNIQUE
) ;

Again, our objective is to create one package for each of the tables in the destination database.

Open BIDS, I’m using SSIS 2008R2 for this example. You’ll have to install BIDS Helper from here. Right click on the project and add New BIML File. Your solution explorer look like this:


Double click the BIML file and paste the following script. If you see issues while copy pasting, check this link.

<#@ template language=”C#” hostspecific=”true”#>
<#@ import namespace=”System.Data” #>

<Biml xmlns=””>
<Connection Name=”SrcDatabase” ConnectionString=”Data Source=(local);Initial Catalog=SrcDatabase;Provider=SQLNCLI10.1;Integrated Security=SSPI” RetainSameConnection =”true” DelayValidation =”false”/>
<Connection Name=”DestDatabase” ConnectionString=”Data Source=(local);Initial Catalog=DestDatabase;Provider=SQLNCLI10.1;Integrated Security=SSPI;” RetainSameConnection =”true” DelayValidation =”false”/>

<# string DestCnStr =”Data Source=(local);Initial Catalog=DestDatabase;Provider=SQLNCLI10.1;Integrated Security=SSPI”;
DataTable tables = ExternalDataAccess.GetDataTable(DestCnStr,”SELECT FROM sys.tables t”) ;
foreach (DataRow row in tables.Rows)
{ #>
<Package Name=”Load_<#=row[0]#>” ConstraintMode=”Linear” >
<Container Name=”Dummy — remove container” ConstraintMode=”Linear”>
<ExecuteSQL Name=”Dummy — remove task” ConnectionName=”SrcDatabase”>
<DirectInput>select 1</DirectInput>

<ExecuteSQL Name=”Dummy 1 — remove task” ConnectionName=”DestDatabase”>
<DirectInput>select 1</DirectInput>

<# } #>

After copying the code, right click on the BIML file and select Generate SSIS Packages.



Multiple packages with connections managers have been created in a matter of minutes.


You just have to delete the sequence container and proceed with package development. Why?  We created connection managers in the BIML script, but since we are not using them anywhere in the package, they won’t be included in the packages. So, I tricked it by creating a dummy task. This is still easier compared to the traditional process of creating multiple packages.


Don’t get me wrong. This example is probably the most basic of what BIML has to offer. I’m spending some time lately with BIML and I hope to do more posts as I learn new things.



I’m Not Attending #sqlpass Summit 2012 and I’m Sorry For Myself

I really wanted to attent PASS Summit this year, I’ve never attended before. I was preparing myself mentally since the last summit, but things didn’t work out. Life happened. I’ve some personal obligations that I need to take care. I can’t talk much about these obligations here, but I can say I’ll be a happy man if I complete them. So, I decided to trade Summit – not an easy decision though.

I know I’m going to miss out on a lot of things, Pre-Con Session: SSIS Design Patterns the most. Hopefully, I’ll be able to attend next year and get to meet all the awesome SQL people.

You, my friend, if you have a chance and haven’t registered yet, don’t wait. From what I’ve heard and learnt, it will be an amazing experience. Here is the registration link.


Generate Uniqueidentifier with SSIS

If you are trying to generate Uniqueidentifier/Newid() in SSIS data flow, you will soon realize that there is no out-of-the-box transformation you could use. Fear not. Script Component can be used to create Uniqueidentifier columns.

In the below package, I’ve an OLE DB source. Then, i dragged script component to the data flow. I selected transformation as the script component type and connected green arrow from source to script transformation.

Next, open the script transformation editor. In the Inputs and Outputs page, expand Output 0, select output columns and click add column in the bottom, give it a name, select unique identifier [DT_GUID] as the data type, and click ok. I named mine UniqueId.

SSIS NEWID() Script Add Column data type

In the script page, make sure Microsoft Visual Basic 2008 is selected as the scripting language. I just like working with VB. Click edit script, then copy paste the following script.

' Microsoft SQL Server Integration Services Script Component
' Write scripts using Microsoft Visual Basic 2008.
' ScriptMain is the entry point class of the script.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<microsoft.sqlserver.dts.pipeline.ssisscriptcomponententrypointattribute()> _
<clscompliant(False)> _
Public Class ScriptMain
Inherits UserComponent

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Row.UniqueId = System.Guid.NewGuid()
End Sub
End Class


When you add a destination, UniqueId will be available as a new column in the mappings page.

SSIS UniqueIdentifier Mappins page New Column


Book Review: SQL Server 2012 Integration Services Design Patterns

This book is written by Andy Leonard, Matt Masson, Tim Mitchell, Jessica Moss, and Michelle Ufford. These are all highly talented people we are talking about. Andy is constantly teaching SSIS patterns in his training classes and articles. Matt, for me, is the face of SSIS team at Microsoft. Same goes to other authors, they are common appearances in forums and community events helping people learn SSIS.

Within a few hours of reading SQL Server 2012 Integration Services Design Patterns, it stood out that none of the authors were trying to impress by showing what they all know in SSIS. Instead, they focused on describing solutions and patterns in a great detail (exactly why i paid for).

Each chapter is a collection of solutions and best practices to common data integration problems. For loading flat files go to chapter 7, for data warehouse patterns go to chapter 11. Each solution is written in detail with lots of pictures and step by step instructions. You could have the book open at work and follow through each step to solve a problem without running into any issues.

There is also pages and pages of code that you could immediately use. You’ll learn frameworks and metadata driven execution patterns. I didn’t find a companion cd or website from where you can simply copy and paste this code. That leaves you with having to type-in; a little depressing.

It is fun to read as well. I smiled when page 30 said VCR play button (referring to the debug button).Literature in the book is plain, clear, and casually written. It was like reading a blog post – simple and refreshing.

If you’re on a mission to learn everything about SSIS, I wouldn’t recommend this to you. This isn’t a know-all-SSIS book. Authors didn’t hide this fact either. Cover of the book says “Improve your efficiency as a data integration developer”. This focuses on patterns to data integration problems. If your job involves moving data using SSIS, this is a must have. SSIS 2012 Design Patterns teaches building faster, efficient, and reusable packages for your data integration needs.

I’m @SamuelVanga on Twitter.

SCRIPT: Find the row count of all tables and views

Today I’m sharing another script that I often find useful – Queries to find the count of rows from tables and views in a SQL Server database.

Count Rows In All Tables:

This query returns the count of rows from all tables in a database. The results can be seen instantaneously without causing any blocking or locking. It is safe to run this in production.

; WITH    A AS ( SELECT   TableName = O.Name ,
SchemaName = SCHEMA_NAME(O.Schema_id) ,
[Rows] = P.Rows
FROM     sys.objects O
SELECT  SchemaName ,
TableName ,
GROUP BY SchemaName ,
TableName ;


Count Rows In All Views:

Unlike for tables, there is no way to query any DMV to get the row count. The only way is to perform COUNT() on all the views. The following query outputs a select for each view. Copy the results and execute in a query window.

SELECT fullname+' UNION ALL'
FROM sys.views v
sys.schemas s
ON s.schema_id = v.schema_id
CROSS APPLY (SELECT 'select '+'''''.'''''+','+ 'count(*) from ''.' AS fullname) AS a


SELECT 'SELECT '+'''DUMMY'''+', '+'0'

Caution: The query for views can be dangerously slow depending on the view’s definition. Please, please, please test it in your development before running it in production.


SQL Saturday US Map

Apparently, SQL Saturday isn’t popular in the west. See for yourself. I wanted to see what cities frequently host SQL Saturday and ended up creating this dashboard. There are some interesting findings. Look at how the events are primarily confined to only (right) half of United States. There are several states that haven’t yet seen a SQL Saturday!


I created this using Tableau Public. Unfortunately, functionality of the dashboard on this blog is limited, so I encourage you to click on this link to see an interactive map.

Which state has conducted the most events so far? That’s right, it’s the Sunshine state. Good guys in Florida are outstanding by conducting 20 events already.

I hope this post is fun. Your comments are welcome. Cheers!

I’m @SamuelVanga on Twitter.