Sunday, June 26, 2011

50 New Features of SQL Server 2008

• Transparent Data Encryption
Enable encryption of an entire database, data files, or log files, without the need for application changes. Benefits include the ability to search encrypted data using both range and fuzzy searches, to secure data from unauthorized users, and to encrypt data without any required changes to existing applications.


• Extensible Key Management
SQL Server 2005 provided a comprehensive solution for encryption and key management. SQL Server 2008 builds on this by supporting third-party key management and hardware security module (HSM) products.

• Auditing
Create and manage auditing via DDL, while simplifying compliance by providing more comprehensive data auditing. This enables organizations to answer common questions, such as, "What data was retrieved?"

• Enhanced Database Mirroring
SQL Server 2008 builds on SQL Server 2005 by providing a more reliable platform that has enhanced database mirroring, including automatic page repair, improved performance, and enhanced supportability.

• Automatic Recovery of Data Pages
SQL Server 2008 enables the principal and mirror machines to recover from 823/824-type data page errors by requesting a fresh copy of the suspect page from the mirroring partner, transparently to end users and applications.

• Log Stream Compression
Database mirroring requires data transmissions between the participants of the mirroring implementations. With SQL Server 2008, compression of the outgoing log stream between the participants delivers optimal performance and minimizes the network bandwidth used by database mirroring.

• Resource Governor
Provide a consistent and predictable response to end users with the introduction of Resource Governor, which allows organizations to define resource limits and priorities for different workloads so that concurrent workloads deliver consistent performance to their end users.

• Predictable Query Performance
Enable greater query performance stability and predictability by providing functionality to lock down query plans, enabling organizations to promote stable query plans across hardware server replacements, server upgrades, and production deployments.

• Data Compression
Enable data to be stored more effectively, and reduce the storage requirements for your data. Data compression also provides significant performance improvements for large I/O bound workloads, like data warehousing.
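For instance, page compression can be enabled on an existing table with a single rebuild, and the potential savings can be estimated first (the table name below is illustrative):

--Estimate the space savings before enabling compression
EXEC sp_estimate_data_compression_savings
@schema_name = 'dbo',
@object_name = 'FactSales',
@index_id = NULL,
@partition_number = NULL,
@data_compression = 'PAGE';

--Enable page-level compression on the table
ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);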

• Hot Add CPU
Dynamically scale a database on demand by allowing CPU resources to be added to SQL Server 2008 on supported hardware platforms without forcing any downtime on applications. Note that SQL Server already supports the ability to add memory resources online.

• Policy-Based Management
Policy-Based Management is a policy-based system for managing one or more instances of SQL Server 2008. Use this with SQL Server Management Studio to create policies that manage entities on the server, such as the instance of SQL Server, databases, and other SQL Server objects.

• Streamlined Installation
SQL Server 2008 introduces significant improvements to the service life cycle for SQL Server through the re-engineering of the installation, setup, and configuration architecture. These improvements separate the installation of the physical bits on the hardware from the configuration of the SQL Server software, enabling organizations and software partners to provide recommended installation configurations.

• Performance Data Collection
Performance tuning and troubleshooting are time-consuming tasks for the administrator. To provide actionable performance insights to administrators, SQL Server 2008 includes more extensive performance data collection, a new centralized data repository for storing performance data, and new tools for reporting and monitoring.

• Language Integrated Query (LINQ)
Enable developers to issue queries against data, using a managed programming language, such as C# or VB.NET, instead of SQL statements. Enable seamless, strongly typed, set-oriented queries written in .NET languages to run against ADO.NET (LINQ to SQL), ADO.NET DataSets (LINQ to DataSets), the ADO.NET Entity Framework (LINQ to Entities), and to the Entity Data Service Mapping provider. Use the new LINQ to SQL provider that enables developers to use LINQ directly on SQL Server 2008 tables and columns.

• ADO.NET Object Services
The Object Services layer of ADO.NET enables the materialization, change tracking, and persistence of data as CLR objects. Developers using the ADO.NET framework can program against a database, using CLR objects that are managed by ADO.NET. SQL Server 2008 introduces more efficient, optimized support that improves performance and simplifies development.

• DATE/TIME
SQL Server 2008 introduces new date and time data types:
o DATE—A date-only type
o TIME—A time-only type
o DATETIMEOFFSET—A time-zone-aware datetime type
o DATETIME2—A datetime type with larger fractional seconds and year range than the existing DATETIME type
The new data types enable applications to have separate date and time types while providing larger date ranges and user-defined precision for time values.
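A quick sketch of the new types in use:

DECLARE @d DATE = '2008-08-15';
DECLARE @t TIME(7) = '23:59:59.9999999';
DECLARE @dto DATETIMEOFFSET = '2008-08-15 23:59:59.9999999 +05:30';
DECLARE @dt2 DATETIME2(7) = '2008-08-15 23:59:59.9999999';

SELECT @d AS DateOnly, @t AS TimeOnly, @dto AS WithTimeZone, @dt2 AS HighPrecision;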

• HIERARCHY ID
Enable database applications to model tree structures more efficiently than previously possible. The new system type HierarchyId can store values that represent nodes in a hierarchy tree. The type is implemented as a CLR UDT and exposes several efficient and useful built-in methods for creating and operating on hierarchy nodes, with a flexible programming model.

• FILESTREAM Data
Allow large binary data to be stored directly in an NTFS file system while remaining an integral part of the database and maintaining transactional consistency. Enable large binary data, traditionally managed by the database, to be scaled out and stored outside the database on more cost-effective storage without compromise.

• Integrated Full Text Search
Integrated Full Text Search makes the transition between full-text search and relational data seamless, while enabling users to use text indexes to perform high-speed text searches on large text columns.

• Sparse Columns
NULL data consumes no physical space, providing a highly efficient way of managing empty data in a database. For example, sparse columns allow object models that typically have numerous null values to be stored in a SQL Server 2008 database without incurring large space costs.
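A minimal sketch of a table using sparse columns (the product attributes here are illustrative):

CREATE TABLE dbo.ProductAttributes
(
ProductID int PRIMARY KEY,
Name varchar(100) NOT NULL,
Voltage int SPARSE NULL, --applies only to electrical goods
ScreenSize decimal(4,1) SPARSE NULL --applies only to displays
);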


• Large User-Defined Types
SQL Server 2008 eliminates the 8-KB limit for User-Defined Types (UDTs), allowing users to dramatically expand the size of their UDTs.

• Spatial Data Types
Build spatial capabilities into your applications by using the support for spatial data.
o Implement Round Earth solutions with the geography data type. Use latitude and longitude coordinates to define areas on the Earth's surface.
o Implement Flat Earth solutions with the geometry data type. Store polygons, points, and lines that are associated with projected planar surfaces and naturally planar data, such as interior spaces.
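A small sketch of both types (the coordinates are illustrative):

--Round Earth: a point from latitude/longitude (SRID 4326 = WGS 84)
DECLARE @g geography, @p geometry;
SET @g = geography::Point(47.651, -122.349, 4326);

--Flat Earth: a planar polygon from Well-Known Text
SET @p = geometry::STGeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))', 0);

SELECT @g.ToString() AS GeoPoint, @p.STArea() AS PolygonArea;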

• Backup Compression
Keeping disk-based backups online is expensive and time-consuming. With SQL Server 2008 backup compression, less storage is required to keep backups online, and backups run significantly faster since less disk I/O is required.
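For example (the database name and path are illustrative):

--Compress a single backup
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks.bak'
WITH COMPRESSION;

--Or make compression the server-wide default
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;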

• Partitioned Table Parallelism
Partitions enable organizations to manage large growing tables more effectively by transparently breaking them into manageable blocks of data. SQL Server 2008 builds on the advances of partitioning in SQL Server 2005 by improving the performance on large partitioned tables.

• Star Join Query Optimizations
SQL Server 2008 provides improved query performance for common data warehouse scenarios. Star Join Query optimizations reduce query response time by recognizing data warehouse join patterns.

• Grouping Sets
Grouping Sets is an extension to the GROUP BY clause that lets users define multiple groupings in the same query. Grouping Sets produces a single result set that is equivalent to a UNION ALL of differently grouped rows, making aggregation querying and reporting easier and faster.
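For example, the following query (table and column names are illustrative) returns subtotals by region and product, subtotals by region alone, and a grand total in a single pass:

SELECT Region, Product, SUM(SalesAmount) AS TotalSales
FROM dbo.Sales
GROUP BY GROUPING SETS ((Region, Product), (Region), ());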

• Change Data Capture
With Change Data Capture, changes are captured and placed in change tables. It captures complete content of changes, maintains cross-table consistency, and even works across schema changes. This enables organizations to integrate the latest information into the data warehouse.

• MERGE SQL Statement
With the introduction of the MERGE SQL Statement, developers can more effectively handle common data warehousing scenarios, like checking whether a row exists, and then executing an insert or update.
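A minimal upsert sketch (table and column names are illustrative):

MERGE dbo.DimCustomer AS target
USING dbo.StagingCustomer AS source
ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN
UPDATE SET Name = source.Name
WHEN NOT MATCHED BY TARGET THEN
INSERT (CustomerID, Name) VALUES (source.CustomerID, source.Name);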

• SQL Server Integration Services (SSIS) Pipeline Improvements
Data integration packages can now scale more effectively, making use of available resources and managing the largest enterprise-scale workloads. The new design improves the scalability of the runtime across multiple processors.

• SQL Server Integration Services (SSIS) Persistent Lookups
The need to perform lookups is one of the most common ETL operations. This is especially prevalent in data warehousing, where fact records need to use lookups to transform business keys to their corresponding surrogates. SSIS increases the performance of lookups to support the largest tables.

• Analysis Scale and Performance
SQL Server 2008 drives broader analysis with enhanced analytical capabilities and more complex computations and aggregations. New cube design tools help users streamline the development of the analysis infrastructure, enabling them to build solutions for optimized performance.

• Block Computations
Block Computations provide a significant improvement in processing performance, enabling users to increase the depth of their hierarchies and the complexity of the computations.

• Writeback
New MOLAP-enabled writeback capabilities in SQL Server 2008 Analysis Services remove the need to query ROLAP partitions. This provides users with enhanced writeback scenarios from within analytical applications without sacrificing traditional OLAP performance.

• Enterprise Reporting Engine
Reports can easily be delivered throughout the organization, both internally and externally, with simplified deployment and configuration. This enables users to easily create and share reports of any size and complexity.

• Internet Report Deployment
Customers and suppliers can effortlessly be reached by deploying reports over the Internet.

• Manage Reporting Infrastructure
Increase supportability and the ability to control server behavior with memory management, infrastructure consolidation, and easier configuration through a centralized store and API for all configuration settings.

• Report Builder Enhancements
Easily build and author ad hoc reports with any structure through Report Designer.

• Forms Authentication Support
Support for Forms authentication enables users to choose between Windows and Forms authentication.

• Report Server Application Embedding
Report Server application embedding enables the URLs in reports and subscriptions to point back to front-end applications.

• Microsoft Office Integration
SQL Server 2008 provides new Word rendering that enables users to consume reports directly from within Microsoft Office Word. In addition, the existing Excel renderer has been greatly enhanced to accommodate the support of features, like nested data regions, sub-reports, as well as merged cell improvements. This lets users maintain layout fidelity and improves the overall consumption of reports from Microsoft Office applications.

• Predictive Analysis
SQL Server Analysis Services continues to deliver advanced data mining technologies. Better Time Series support extends forecasting capabilities. Enhanced Mining Structures deliver more flexibility to perform focused analysis through filtering as well as to deliver complete information in reports beyond the scope of the mining model. New cross-validation enables confirmation of both accuracy and stability for results that you can trust. Furthermore, the new features delivered with SQL Server 2008 Data Mining Add-ins for Office 2007 empower every user in the organization with even more actionable insight at the desktop.

Tuesday, June 21, 2011

Working with SQL Table Partitions

When a database table grows to hundreds of gigabytes or more, it can become more difficult to load new data, remove old data, and maintain indexes.
Even when the data we want to load or remove amounts to only a small number of records, INSERT and DELETE operations against such a large table can become impractical.

In this scenario, how can we address the issue and still achieve good performance?

Partitioning a large table divides the table and its indexes into smaller partitions, so that maintenance operations can be applied on a partition-by-partition basis, rather than on the entire table.

In addition, the SQL Server optimizer can direct properly filtered queries to appropriate partitions rather than the entire table.

Table partitioning was introduced in SQL Server 2005 Enterprise Edition and enhanced in SQL Server 2008 Enterprise.

Every table in SQL Server 2005 and SQL Server 2008 may be looked upon as partitioned, in the sense that each table is given a default partition in the sys.partitions catalog view. However, a table is not truly partitioned unless it is created using a partition scheme and function, which ordinary database tables by default do not have.

In SQL Server, there are two ways to partition:
a. You can manually subdivide a large table's data into multiple physical tables.
b. You can use SQL Server's table partitioning feature to partition a single table.

1st way: There are two primary ways to partition data into multiple tables:

i. Horizontal Partitioning: selected subsets of rows are placed in different tables.
When a view is created over all the tables and queries are directed to the view, the result is a partitioned view.

Disadvantages: In a partitioned view, you have to manually apply the required constraints and operations, and maintenance can be complex and time-consuming.
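A minimal sketch of such a partitioned view, assuming two yearly member tables whose CHECK constraints let the optimizer route queries to the right table:

CREATE TABLE dbo.Orders2010
(
OrderID int NOT NULL,
OrderDate date NOT NULL CHECK (OrderDate >= '20100101' AND OrderDate < '20110101'),
CONSTRAINT PK_Orders2010 PRIMARY KEY (OrderID, OrderDate)
);
CREATE TABLE dbo.Orders2011
(
OrderID int NOT NULL,
OrderDate date NOT NULL CHECK (OrderDate >= '20110101' AND OrderDate < '20120101'),
CONSTRAINT PK_Orders2011 PRIMARY KEY (OrderID, OrderDate)
);
GO
CREATE VIEW dbo.Orders AS
SELECT OrderID, OrderDate FROM dbo.Orders2010
UNION ALL
SELECT OrderID, OrderDate FROM dbo.Orders2011;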

ii. Vertical Partitioning: where the columns of a very wide table are spread across multiple tables containing distinct subsets of the columns with the same number of rows. The result is multiple tables containing the same number of rows but different columns, usually with the same primary key column in each table.

Often a view is defined across the multiple tables and queries are directed against the view. SQL Server does not provide built-in support for vertical partitioning, but the new sparse columns feature of SQL Server 2008 can be a better solution for tables that require large numbers of columns.

2nd way: SQL Server's table partitioning differs from the above two approaches by partitioning a single table: multiple physical tables are no longer involved.

When a table is created as a partitioned table, SQL Server automatically places the table's rows in the correct partition, and SQL Server maintains the partitions behind the scenes.
You can then perform maintenance operations on individual partitions, and properly filtered queries will access only the correct partitions. But it is still one table as far as SQL Server is concerned.

A partitioned table is a unique kind of table in SQL Server. It depends on two pre-existing objects, the partition function and the partition scheme, which are used only for partitioned tables and indexes.


A partitioned table has a column identified as the partitioning column, and that column is referenced when the table is created. The partitioned table must be created on a partition scheme, which defines the filegroup storage locations for the partitioned table. The partition scheme in turn depends on a previously created partition function that defines the number of partitions the table will have and how the boundaries of the partitions are defined.


Planning for Table Partitioning:

In order to successfully partition a large table, you must make a number of decisions. In particular, you need to:


Plan the partitioning:
Decide which table or tables can benefit from the increased manageability and availability of partitioning.
Decide the column or column combination upon which to base the partition.
Specify the partition boundaries in a partition function.
Plan on how to store the partitions in filegroups using a partition scheme.

Choosing a Table to Partition:

There is no firm rule or formula that determines when a table is large enough to be partitioned, or whether even a very large table would benefit from partitioning.
Sometimes large tables do not require partitioning. Some conditions under which partitioning a table could be beneficial are:
Index maintenance on the table is costly or time-consuming and could benefit from re-indexing partitions of the table rather than the whole table at once.
Data must be aged out of the table periodically, and the delete process is currently too slow or blocks users trying to query the table.
New data is loaded periodically into the table, the load process is too slow or interferes with queries on the table, and the table data lends itself to a partition column based on ascending date or time.
In other words, make sure that the large table will actually benefit from partitioning; don’t partition it just because it’s big.


Choosing a Partition Column:

The partition column choice is critical, because after you have partitioned a table, choosing a different partition column will require re-creating the table, reloading all the data, and rebuilding all the indexes. Some things to note about the partition column:
The partition column must be a single column in the table (either a regular column or a computed column).
If you have a combination of columns that form the best partition column, you can add a persisted computed column to the table that combines the values of the original columns and then partition on it.
In a clustered table, the partition column must be part of either the primary key or the clustered index. If the partition column is not part of the table's primary key, the partition column might allow NULL. Any data with a NULL in the partition column will reside in the leftmost partition.
You should also try to choose a partition column that will be used as a filter criterion in most of the queries run against the table. This enables partition elimination, where the query processor can eliminate inapplicable partitions from the query plan and access only the partitions implied by the filter.

After you have identified the table and its partition column, you can then go through the steps of partitioning a table or index, which are:
Create or use an existing partition function that sets the correct range boundaries.
Create or use an existing partition scheme tied to the partition function.
Create the table using the partition scheme.

Benefits of Partitioned Tables and Indexes:

SQL Server’s partitioned tables and indexes offer a number of advantages when compared with partitioned views and other forms of partitioning data:
SQL Server automatically manages the placement of data in the proper partitions.
A partitioned table and its indexes appear as a normal database table with indexes, even though the table might have numerous partitions.
The table can be managed at the partition and filegroup level for ease of maintenance.
Partitioned tables support easier and faster data loading, aging, and archiving, as illustrated in the example above.
Application queries that are properly filtered on the partition column can perform better by making use of partition elimination and parallelism.
In cases where partitioned data will not be modified, you can mark some or most of a partitioned table's filegroups as read-only, making management of the filegroups easier.
In SQL Server 2008, you can compress individual partitions as well as control lock escalation at a partition level.

Partitioning large tables has some challenges:

There is a maximum of 1,000 partitions for a table.
You must manage filegroups and file placement if you place partitions on individual filegroups.
The metadata-only operations (SWITCH, MERGE, and SPLIT) can be blocked by other DML actions on the table at the time, until a schema-modification lock can be obtained.
Managing date or time-based data can be complex.
You cannot rebuild a partitioned index with the ONLINE option set to ON, because the entire table will be locked during the rebuild.
Automating changes to partitioned tables, as in a sliding window scenario, can be difficult, but Microsoft provides some tools to assist in automating the process.

Table Partitioning Best Practices:

Table partitioning is useful on very large data tables for primarily two reasons.
The major reason for partitioning is to gain better management and control of large tables by partitioning them. To gain better management of large tables, you can:
Rebuild and reorganize indexes by partition.
Use partition-aligned indexed views in switching operations.
Use a sliding window strategy for quickly bringing in new data and archiving old data.
Additionally, SQL Server's query optimizer can use partition elimination and parallelism to increase appropriately filtered query performance against partitioned tables. To make use of partition elimination:
Ensure that indexes are aligned with the partitioned table, and that indexed views are partition-aligned.
Ensure that queries against the partitioned tables have filters based on the partition column.
On data warehouse joins, keep the join column simple (such as an integer or date) and explicit, so as to take advantage of bitmap filtering for star joins.
In general, to take full advantage of table partitioning, you should:
Make sure that the configuration of max degree of parallelism is set sufficiently high to take advantage of parallel operations, or else add a MAXDOP query hint to fine-tune the degree of parallelism.
Maintain an empty partition on both ends of the partitioned table and ensure that only empty partitions are split and merged in a sliding window scenario.
Remember that RANGE RIGHT may be more convenient than RANGE LEFT in partition functions, especially when you are specifying date ranges.
Use data types without fractional components as partition columns, such as a date or an integer.
Always use a standard language-independent date format when specifying partition function boundary values.
Use an integer-based date and date dimension in data warehouses.
Use a single column of the table as the partitioned column whenever possible. If you must partition across more than one column, you can use a persisted computed column as the partitioning column. But then to achieve partition elimination, you must control queries to ensure they reference the partition column in their filters.
Use SWITCH with MERGE to drop partition data: switch out the partition and remove the partition's boundary value using MERGE (see the sketch after this list).
Use TRUNCATE TABLE to delete partition data by switching a partition out to a staging table and truncating the staging table.
Check for partition elimination in query plans.
Place read only data on read-only filegroups to reduce locking and simplify recovery for piecemeal restores.
Spread filegroups across all disks for maximum I/O performance.
Automate sliding window scenarios using available tools.
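A minimal sketch of that SWITCH-and-MERGE pattern, assuming a hypothetical table dbo.FactSales partitioned by a RANGE RIGHT function PF_Date whose oldest boundary is '20100101', with partition 1 stored on filegroup FG1:

--The staging table must match the partitioned table's structure and
--live on the same filegroup as the partition being switched out
CREATE TABLE dbo.FactSales_Old
(
SaleDate date NOT NULL,
Amount money NOT NULL
) ON FG1;
GO
ALTER TABLE dbo.FactSales SWITCH PARTITION 1 TO dbo.FactSales_Old; --metadata-only operation
ALTER PARTITION FUNCTION PF_Date() MERGE RANGE ('20100101'); --remove the now-empty boundary
TRUNCATE TABLE dbo.FactSales_Old; --cheap delete of the aged data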

Example:

USE [master]
GO

CREATE DATABASE [SampleDB] ON PRIMARY
(
NAME = N'SampleDB',
FILENAME = N'C:\All Projects\TestDB\SampleDB.mdf' ,
SIZE = 3048KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 1024KB
),

FILEGROUP [Filegroup1]
(
NAME = N'File1',
FILENAME = N'C:\All Projects\TestDB\File1.ndf' ,
SIZE = 2048KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 1024KB
),

FILEGROUP [Filegroup2]
(
NAME = N'File2',
FILENAME = N'C:\All Projects\TestDB\File2.ndf' ,
SIZE = 2048KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 1024KB
),

FILEGROUP [Filegroup3]
(
NAME = N'File3',
FILENAME = N'C:\All Projects\TestDB\File3.ndf' ,
SIZE = 2048KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 1024KB
),

FILEGROUP [Filegroup4]
(
NAME = N'File4',
FILENAME = N'C:\All Projects\TestDB\File4.ndf' ,
SIZE = 2048KB ,
MAXSIZE = UNLIMITED,
FILEGROWTH = 1024KB
)
LOG ON
(
NAME = N'SampleDB_log',
FILENAME = N'C:\All Projects\TestDB\SampleDB_log.ldf' ,
SIZE = 1024KB ,
MAXSIZE = 2048GB ,
FILEGROWTH = 10%
)
GO

USE SampleDB
GO

--RANGE LEFT: each boundary value belongs to the partition on its left
CREATE PARTITION FUNCTION PF2_Left (int)
AS RANGE LEFT FOR VALUES (0, 10, 20);

--RANGE RIGHT: each boundary value belongs to the partition on its right
CREATE PARTITION FUNCTION PF2_Right (int)
AS RANGE RIGHT FOR VALUES (0, 10, 20);

GO


CREATE PARTITION SCHEME PS2_Right
AS PARTITION PF2_Right
TO (Filegroup1, Filegroup2, Filegroup3, Filegroup4)


CREATE TABLE PartitionedTable
(
PartitionColumnID int not null,
Name varchar(100) not null
)
ON PS2_Right(PartitionColumnID);


INSERT PartitionedTable
select * from (values (31,'Amit'),(34,'Rahul'),(32,'Rajesh'),(33,'Ajay'),(28,'Ketan'))as tab(ID,NAME)


select * from PartitionedTable
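To verify where each row landed, apply the $PARTITION function with the partition function's name:

SELECT *, $PARTITION.PF2_Right(PartitionColumnID) AS PartitionNumber
FROM PartitionedTable;
--With boundaries (0, 10, 20) and RANGE RIGHT, all five rows above
--(IDs 28 through 34) fall into partition 4 (values >= 20).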


More Examples:

CREATE PARTITION FUNCTION myRangePF1 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO

--Each of the four partitions maps to its own filegroup
CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
TO (test1fg, test2fg, test3fg, test4fg);



CREATE PARTITION FUNCTION myRangePF2 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO

--Multiple partitions can share a filegroup
CREATE PARTITION SCHEME myRangePS2
AS PARTITION myRangePF2
TO ( test1fg, test1fg, test1fg, test2fg );


CREATE PARTITION FUNCTION myRangePF3 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO

--ALL TO places every partition on a single filegroup
CREATE PARTITION SCHEME myRangePS3
AS PARTITION myRangePF3
ALL TO ( test1fg );



CREATE PARTITION FUNCTION myRangePF4 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO

--The fifth filegroup is marked NEXT USED, ready for a future SPLIT
CREATE PARTITION SCHEME myRangePS4
AS PARTITION myRangePF4
TO (test1fg, test2fg, test3fg, test4fg, test5fg);

--Creating a RANGE LEFT partition function on an int column


CREATE PARTITION FUNCTION myRangePF1 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);

This function creates four partitions:
Partition 1: col1 <= 1
Partition 2: col1 > 1 AND col1 <= 100
Partition 3: col1 > 100 AND col1 <= 1000
Partition 4: col1 > 1000



--Creating a RANGE RIGHT partition function on an int column


CREATE PARTITION FUNCTION myRangePF2 (int)
AS RANGE RIGHT FOR VALUES (1, 100, 1000);

This function creates four partitions:
Partition 1: col1 < 1
Partition 2: col1 >= 1 AND col1 < 100
Partition 3: col1 >= 100 AND col1 < 1000
Partition 4: col1 >= 1000




--Creating a RANGE RIGHT partition function on a datetime column

CREATE PARTITION FUNCTION [myDateRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('20030201', '20030301', '20030401',
'20030501', '20030601', '20030701', '20030801',
'20030901', '20031001', '20031101', '20031201');

--Dropping a partition function (it must not be referenced by any partition scheme)
DROP PARTITION FUNCTION myRangePF;

Best Practices for Building a Large Relational Data Warehouse

Consider partitioning large fact tables
Build clustered index on the date key of the fact table
Choose partition grain carefully
Design dimension tables appropriately
Write effective queries for partition elimination
Efficiently delete old data
Efficiently load the initial data
Manage statistics manually
Consider efficient backup strategies

Sunday, June 19, 2011

Data Extract from XML file to SQL Server using SSIS

Data extraction from an XML file to SQL Server is simple; we just need to consider 2 things:
1. We must have an XML file as the source file.
2. We must have an XSD file for that source.

The XML file contains the tags and values, whereas the XSD file contains the definition (rules) of the file, so both files are required for data population. If we get only an XML file as the source, then we need to create an XSD file for the XML. The rest of the data-mapping formalities are the same.

Wednesday, June 15, 2011

Remove HTML and XML Tags

--Strip '<', '>', and '/' from the tag, then keep the text up to the
--'{0}' placeholder when one is present
select * ,
Left(t.CleanTag,
Case when CHARINDEX('{0}', t.CleanTag) = 0
Then LEN(t.CleanTag)
Else CHARINDEX('{0}', t.CleanTag) - 1
END) as TagText
from TableName
CROSS APPLY (select Replace(Replace(Replace(Ltrim(tag),'<',''),'>',''),'/','') as CleanTag) as t

Wednesday, June 8, 2011

Example on HierarchyID DataType

CREATE TABLE Employee
(EmployeeID int IDENTITY (1,1) PRIMARY KEY,
LastName nvarchar (50),
FirstName nvarchar (50),
Title nvarchar (50),
HireDate datetime,
PhoneExtention int,
EmailAddress varchar (255),
NodeRec hierarchyid)
GO

DECLARE @NodeRec hierarchyid
set @NodeRec = hierarchyid::GetRoot()
INSERT INTO Employee (LastName, FirstName, HireDate, Title, PhoneExtention, EmailAddress, NodeRec)
values
('Smith', 'Frank', '1/17/2005', 'CEO', 65428,'frank.smith@northwindtraders.com', @NodeRec.GetDescendant(null, null))
GO

select * from Employee

DECLARE @Manager hierarchyID, @Employee hierarchyID
SELECT @Manager = NodeRec FROM Employee WHERE EmployeeID = 1

SELECT @Employee = max(NodeRec)FROM Employee WHERE NodeRec.GetAncestor(1) = @Manager

INSERT INTO Employee(LastName, FirstName, HireDate, Title, PhoneExtention, EmailAddress, NodeRec)
VALUES
('Jefferson', 'John', '3/23/2006', 'VP Sales', 65647,'john.jefferson@northwindtraders.com', @Manager.GetDescendant(@Employee,null))

select * from Employee

SELECT @Employee = max(NodeRec)FROM Employee WHERE NodeRec.GetAncestor(1) = @Manager

INSERT INTO Employee
(LastName, FirstName, HireDate, Title, PhoneExtention, EmailAddress, NodeRec)
VALUES
('Karls', 'Kristin', '8/2/2006', 'VP IT', 63423,'kristin.karls@northwindtraders.com', @Manager.GetDescendant(@Employee,null))

select * from Employee
select EmployeeId, NodeRec, NodeRec.ToString()'Employee Path', LastName,FirstName from Employee

GO

CREATE PROCEDURE usp_AddEmployee
@SupervisorID int,
@FirstName nvarchar (50),
@LastName nvarchar (50),
@HireDate date,
@Title nvarchar (50),
@PhoneExtention int,
@EmailAddress varchar (255)
AS
BEGIN
DECLARE @Manager hierarchyID, @Employee hierarchyID
SELECT @Manager = NodeRec FROM Employee
WHERE EmployeeID = @SupervisorID

SELECT @Employee = max(NodeRec)FROM Employee
WHERE NodeRec.GetAncestor(1) = @Manager

INSERT INTO Employee
(LastName, FirstName, HireDate, Title, PhoneExtention, EmailAddress,NodeRec)
VALUES
(@LastName, @FirstName, @HireDate, @Title, @PhoneExtention,@EmailAddress, @Manager.GetDescendant(@Employee, null))
END
GO

exec usp_AddEmployee 3, 'Janice', 'Bing', '8/12/2004', 'Development Manager',53672, 'janice.bing@northwindtraders.com'
exec usp_AddEmployee 3, 'Jim', 'Frankenfurter', '6/30/2007', 'NOC Manager',76522, 'jim.frankerfurter@northwindtraders.com'
exec usp_AddEmployee 5, 'John', 'Willis', '7/3/2002', 'NOC Tech', 65242,'john.willis@northwindtraders.com'
exec usp_AddEmployee 5, 'Sarah', 'Jones', '8/4/2002', 'NOC Tech', 73625,'sarah.jones@northwindtraders.com'
exec usp_AddEmployee 5, 'Fred', 'Matt', '5/3/2004', 'NOC Tech', 26253,'fred.matt@northwindtraders.com'
exec usp_AddEmployee 2, 'Chris', 'Marshall', '2/5/2003', 'Sales Manager',82756, 'chris.marshall@northwindtraders.com'
exec usp_AddEmployee 8, 'Bob', 'Harris', '4/3/2003', 'Inside Sales', 63527,'bob.harris@northwindtraders.com'

GO

SELECT * FROM EMPLOYEE
SELECT EmployeeId, NodeRec, NodeRec.ToString()'Employee Path', LastName,FirstName from Employee
GO

Monday, June 6, 2011

CDC concept in SQL Server 2008

declare @rc int
exec @rc = sys.sp_cdc_enable_db
select @rc
-- new column added to sys.databases: is_cdc_enabled
select name, is_cdc_enabled from sys.databases

Go
create table dbo.customer1
(
id int identity not null
, name varchar(50) not null
, state varchar(2) not null
, constraint pk_customer primary key clustered (id)
)

Go

exec sys.sp_cdc_enable_table
@source_schema = 'dbo',
@source_name = 'customer1' ,
@role_name = 'CDCRole',
@supports_net_changes = 1

select name, type, type_desc, is_tracked_by_cdc from sys.tables
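Once the table is enabled, the captured changes can be read back through the generated capture-instance functions. A minimal sketch, assuming the default capture instance name dbo_customer1 (the capture job populates the change table asynchronously, so allow it a moment to catch up):

--generate some changes
insert dbo.customer1 (name, state) values ('abc company', 'md')
update dbo.customer1 set state = 'va' where id = 1

Go

--read everything captured so far for the dbo_customer1 capture instance
declare @from_lsn binary(10), @to_lsn binary(10)
select @from_lsn = sys.fn_cdc_get_min_lsn('dbo_customer1'),
@to_lsn = sys.fn_cdc_get_max_lsn()

select __$operation, id, name, state
from cdc.fn_cdc_get_all_changes_dbo_customer1(@from_lsn, @to_lsn, 'all')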


Friday, June 3, 2011

Limitations of SQL AZURE

1. SQL Azure Database supports tabular data stream (TDS) protocol client version 7.3 or later. Earlier versions of TDS protocol are not supported.
2. Connecting to SQL Azure Database by using OLE DB is not supported.
3. Only TCP/IP connections are allowed.
4. The SQL Server 2008 SQL Server browser is not supported because SQL Azure Database does not have dynamic ports, only port 1433.
5. When using the SQL Server 2008 Native Client ODBC driver, the Test Connection button may result in an error that master.dbo.syscharsets is not supported. Ignore this error, save the DSN, and use it. In addition, when you choose to connect to a different database other than master while configuring the DSN, the TESTS COMPLETED SUCCESSFULLY message may not show up even when there is no error.
6. With the SQL Server 2008 R2 Native Client ODBC driver, you will not receive the error stating that the master.dbo.syscharsets is not supported even when connected to a different database.
7. It does not support the RESTORE statement.
8. It does not support attaching a database to the SQL Azure server.
9. SQL Azure Database does not support SQL Server Agent or jobs. You can, however, run SQL Server Agent on your on-premise SQL Server and connect to SQL Azure Database.
10. SQL Azure Database does not support distributed transactions, which are transactions that affect several resources.
11. Both the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION database options are set to ON in SQL Azure Database. Because SET in the ALTER DATABASE Transact-SQL statement is not supported, these database options cannot be changed.
12. When using an on-premise SQL Server, you can set collations at server, database, column, and expression levels. SQL Azure Database does not allow setting the collation at the server or database level. To use the non-default collation with SQL Azure Database, set the collation at the column level or the expression level.
13. SQL Azure Database does not support tables without clustered indexes. A table must have a clustered index. If a table is created without a clustered constraint, a clustered index must be created before an insert operation is allowed on the table.
14. By default, SQL Azure Database supports up to 150 databases in each SQL Azure server, including the master database. You can create up to 149 databases in each SQL Azure server. An extension of this limit may be available for your SQL Azure server.
15. Certain user names are not allowed for security reasons. You cannot use the following names:
a. admin
b. administrator
c. guest
d. root
e. sa
16. Additionally, login and user names cannot contain the \ character (Windows Authentication is not supported).
17. SQL Azure Database provides a large-scale multi-tenant database service on shared resources. In order to provide a good experience to all SQL Azure Database customers, your connection to the service may be closed due to the following conditions:
a. Excessive resource usage
b. Connections that have been idle for 30 minutes or longer
c. Failover because of server failures
18. When your connection to the service is closed, you will receive one of the following errors: 40197, 40501, 40544, 40549, 40550, 40551, 40552, 40553, and 40613. For more information on these errors, see Connection-Loss Errors (http://msdn.microsoft.com/en-us/library/4cff491e-9359-4454-bd7c-fb72c4c452ca#bkmk_throt_errors)
19. The following Transact-SQL features are not supported by SQL Azure Database:
a. Common Language Runtime (CLR)
b. Database file placement
c. Database mirroring
d. Distributed queries
e. Distributed transactions
f. Filegroup management
g. Global temporary tables
h. SQL Server configuration options
i. SQL Server Service Broker
j. System tables
k. Trace Flags
20. The SQL Azure Database service is only available through TCP port 1433. To access a SQL Azure database from your computer, ensure that your firewall allows outgoing TCP communication on TCP port 1433. Before you can connect to the SQL Azure server for the first time, you must use the Windows Azure Platform Management Portal (http://go.microsoft.com/fwlink/?LinkID=161793)
to configure the SQL Azure firewall. You will need to create a firewall setting that enables connection attempts from your computer or Windows Azure.
21. SQL Server Management Studio does not support SQL Azure Database in versions prior to SQL Server 2008 R2.
22. For the server-level principal login, the following restrictions apply:
a. The database user in the master database corresponding to the server-level principal login cannot be altered or dropped.
b. Although the server-level principal login is not a member of the two server roles dbmanager and loginmanager, it has all permissions granted with these two roles.
23. To access the master database, every login must be mapped to a user account in the master database.
24. You must be connected to the master database when executing the CREATE/ALTER/DROP LOGIN and CREATE/ALTER/DROP DATABASE statements.
25. When executing the CREATE/ALTER/DROP DATABASE and CREATE/ALTER/DROP LOGIN statements, each of these statements must be the only statement in a Transact-SQL batch; otherwise, an error occurs. For example, the following Transact-SQL checks whether the database exists. If it exists, a DROP DATABASE statement is called to remove the database. Because the DROP DATABASE statement is not the only statement in the batch, executing this Transact-SQL will result in an error:
IF EXISTS (SELECT [name]
FROM [sys].[databases]
WHERE [name] = N'database_name')
DROP DATABASE [database_name];
GO
26. When executing the CREATE USER statement with the FOR/FROM LOGIN option, it must be the only statement in a Transact-SQL batch.
27. When executing the ALTER USER statement with the WITH LOGIN option, it must be the only statement in a Transact-SQL batch.
28. Only the server-level principal login and the members of the dbmanager role have permission to execute the CREATE DATABASE and DROP DATABASE statements.
29. Only the server-level principal login and the members of the loginmanager role have permission to execute the CREATE LOGIN, ALTER LOGIN, and DROP LOGIN statements.
30. When the owner of a database role tries to add/remove another database user to/from that database role, the following error may occur: User or role 'Name' does not exist in this database. This error occurs because the user is not visible to the owner. To resolve this issue, grant the owner VIEW DEFINITION permission on the user.
31. The following features that were new to SQL Server 2008 R2 are not supported by SQL Azure Database:
a. SQL Server Utility
b. SQL Server PowerShell Provider. PowerShell scripts can be run on an on-premise computer, however, and connect to SQL Azure Database using supported objects (such as System Management Objects or Data-tier Applications Framework).
c. Master Data Services
The following features that were new to SQL Server 2008 are not supported by SQL Azure Database:
a. Change Data Capture
b. Data Auditing
c. Data Compression
d. Extended Events
e. Extension of spatial types and methods through Common Language Runtime (CLR)
f. External Key Management / Extensible Key Management
g. FILESTREAM Data
h. Integrated Full-Text Search
i. Large User-Defined Aggregates (UDAs)
j. Large User-Defined Types (UDTs)
k. Performance Data Collection (Data Collector)
l. Policy-Based Management
m. Resource Governor
n. Sparse Columns
o. SQL Server Replication
p. Transparent Data Encryption
The following features that were new to SQL Server 2005 are not supported by SQL Azure Database:
a. Common Language Runtime (CLR) and CLR User-Defined Types
b. Database Mirroring
c. Service Broker
d. Table Partitioning
e. Typed XML and XML indexing is not supported. The XML data type is supported by SQL Azure Database.
The following features from earlier versions of SQL Server are not supported by SQL Azure Database:
a. Backup and Restore
b. Replication
c. Extended Stored Procedures
d. SQL Server Agent/Jobs

Thursday, June 2, 2011

SQL Server 2008 - Hierarchy ID Data Type

create table [dbo].Organisation
(
DepartmentID int primary key nonclustered,
DepartmentName varchar(100) not null,
DepartmentHierarchyNode hierarchyid not null,
DepartmentHierarchyLevel as DepartmentHierarchyNode.GetLevel() persisted
)

insert Organisation(DepartmentID, DepartmentHierarchyNode, DepartmentName)
values (1, hierarchyid::GetRoot(), 'Contoso')

select * from Organisation

create procedure AddDepartment
@DepartmentID int,
@DepartmentName varchar(100),
@ParentID int
as
begin
declare @ParentNode hierarchyid,
@MaxChildNode hierarchyid
begin transaction
select @ParentNode = DepartmentHierarchyNode
from Organisation
where DepartmentID = @ParentID

select @MaxChildNode = max(DepartmentHierarchyNode)
from Organisation
where @ParentNode = DepartmentHierarchyNode.GetAncestor(1)

insert Organisation (DepartmentID, DepartmentHierarchyNode, DepartmentName)
values (@DepartmentID, @ParentNode.GetDescendant(@MaxChildNode, null), @DepartmentName)
commit
end


exec AddDepartment 2, 'Operations', 1
exec AddDepartment 3, 'Development', 1
exec AddDepartment 4, 'Parking', 1
exec AddDepartment 5, 'Home Operations', 2
exec AddDepartment 6, 'Field Operations', 2
exec AddDepartment 7, 'North Territory', 6
exec AddDepartment 8, 'South Territory', 6
exec AddDepartment 9, 'Database', 3
exec AddDepartment 10, 'Services', 3
exec AddDepartment 11, 'Applications', 3
exec AddDepartment 12, 'Windows', 11
exec AddDepartment 13, 'Internet', 11
exec AddDepartment 14, 'Self', 4
exec AddDepartment 15, 'Valet', 4

select * from Organisation

create procedure ShowDepartmentChart
@DepartmentName varchar(50)
as
begin
declare @TopNode hierarchyid,
@TopLevel int

select @TopNode = DepartmentHierarchyNode,
@TopLevel = DepartmentHierarchyLevel
from Organisation
where DepartmentName = @DepartmentName

select DepartmentHierarchyNode.ToString() NodeText, space((DepartmentHierarchyLevel - @TopLevel) * 5) + DepartmentName Department
from Organisation
where DepartmentHierarchyNode.IsDescendantOf(@TopNode) = 1
order by DepartmentHierarchyNode
end
go

exec ShowDepartmentChart 'Contoso'

Populating a Table with Existing Hierarchical Data

To create the EmployeeDemo table

CREATE TABLE EmployeeDemo (EmployeeID int, LoginID varchar(200), ManagerID int);
INSERT INTO EmployeeDemo
VALUES (1, 'zarifin', Null),
(2, 'tplate', 1),
(3, 'hjensen', 1),
(4, 'schai', 2),
(5, 'elang', 2),
(6, 'gsmits', 2),
(7, 'sdavis', 3),
(8, 'norint', 3),
(9, 'jwang', 4),
(10, 'malexander', 4);

To examine the structure and data of the EmployeeDemo table
select * from EmployeeDemo

SELECT
Mgr.EmployeeID AS MgrID, Mgr.LoginID AS Manager,
Emp.EmployeeID AS E_ID, Emp.LoginID
FROM EmployeeDemo AS Emp
LEFT JOIN EmployeeDemo AS Mgr
ON Emp.ManagerID = Mgr.EmployeeID
ORDER BY MgrID, E_ID;

To create a new table named NewOrg

CREATE TABLE NewOrg
(
OrgNode hierarchyid,
EmployeeID int,
LoginID nvarchar(50),
ManagerID int,
CONSTRAINT PK_NewOrg_OrgNode
PRIMARY KEY CLUSTERED (OrgNode)
);
GO

To create a temporary table named #Children
CREATE TABLE #Children
(
EmployeeID int,
ManagerID int,
Num int
);
GO

CREATE CLUSTERED INDEX tmpind ON #Children(ManagerID, EmployeeID);
GO


To populate the NewOrg table

INSERT #Children (EmployeeID, ManagerID, Num)
SELECT EmployeeID, ManagerID,
ROW_NUMBER() OVER (PARTITION BY ManagerID ORDER BY ManagerID)
FROM EmployeeDemo
GO

SELECT * FROM #Children ORDER BY ManagerID, Num

Populate the NewOrg table. Use the GetRoot and ToString methods to concatenate the Num values into the hierarchyid format, and then update the OrgNode column with the resultant hierarchical values:

WITH paths(path, EmployeeID)
AS (
-- This section provides the value for the root of the hierarchy
SELECT hierarchyid::GetRoot() AS OrgNode, EmployeeID
FROM #Children AS C
WHERE ManagerID IS NULL

UNION ALL
-- This section provides values for all nodes except the root
SELECT
CAST(p.path.ToString() + CAST(C.Num AS varchar(30)) + '/' AS hierarchyid),
C.EmployeeID
FROM #Children AS C
JOIN paths AS p
ON C.ManagerID = P.EmployeeID
)
INSERT NewOrg (OrgNode, EmployeeID, LoginID, ManagerID)
SELECT P.path, O.EmployeeID, O.LoginID, O.ManagerID
FROM EmployeeDemo AS O
JOIN Paths AS P
ON O.EmployeeID = P.EmployeeID
GO

select * from EmployeeDemo
SELECT OrgNode.ToString() AS LogicalNode, *
FROM NewOrg
ORDER BY LogicalNode;

To index the NewOrg table for efficient searches

ALTER TABLE NewOrg
ADD H_level AS OrgNode.GetLevel() ;
CREATE UNIQUE INDEX EmpBFInd
ON NewOrg(H_level, OrgNode) ;

CREATE UNIQUE INDEX EmpIDs_unq ON NewOrg(EmployeeID) ;

SELECT OrgNode.ToString() AS LogicalNode,
OrgNode, H_Level, EmployeeID, LoginID
FROM NewOrg
ORDER BY OrgNode;

SELECT OrgNode.ToString() AS LogicalNode,
OrgNode, H_Level, EmployeeID, LoginID
FROM NewOrg
ORDER BY H_Level, OrgNode;

SELECT OrgNode.ToString() AS LogicalNode,
OrgNode, H_Level, EmployeeID, LoginID
FROM NewOrg
ORDER BY EmployeeID;
GO