BEGIN TRANSACTION (Transact-SQL)

BEGIN TRANSACTION represents a point at which the data referenced by a connection is logically and physically consistent. If errors are encountered, all data modifications made after the BEGIN TRANSACTION can be rolled back to return the data to this known state of consistency. Each transaction lasts until either it completes without errors and COMMIT TRANSACTION is issued to make the modifications a permanent part of the database, or errors are encountered and all modifications are erased with a ROLLBACK TRANSACTION statement.

Syntax:

BEGIN { TRAN | TRANSACTION }
    [ { transaction_name | @tran_name_variable }
      [ WITH MARK [ 'description' ] ]
    ]
[ ; ]

Example:

In my experience, you can catch errors in T-SQL by checking @@ERROR immediately after each statement and jumping to a rollback label when it is non-zero:

-- Exec spa_begintrnExample 'a'

Create Proc spa_begintrnExample
@flag char(1)

AS
if @flag='a'
begin

BEGIN TRANSACTION

Update table1 set clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback 

Update table2 set clm12='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback 

Delete from table3 where clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback 

Delete from table4 where clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback 

COMMIT TRANSACTION

GOTO  EndSave

QuitWithRollback:
  IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave: 

end
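On SQL Server 2005 and later, the same commit-or-rollback pattern can be written with TRY...CATCH instead of checking @@ERROR after every statement. The following is a minimal sketch, not a definitive implementation: the procedure name is hypothetical, the table and column names are placeholders in the style of the example above, and TRY...CATCH traps most run-time errors but not every kind (compile errors in the same batch, for instance).

```sql
-- Sketch only: spa_begintrnTryCatchExample is a hypothetical name, and
-- table1..table4 / clm1 / clm12 are placeholder objects.
-- Requires SQL Server 2005 or later.
CREATE PROC spa_begintrnTryCatchExample
    @flag char(1)
AS
IF @flag = 'a'
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE table1 SET clm1 = 'test';
        UPDATE table2 SET clm12 = 'test';
        DELETE FROM table3 WHERE clm1 = 'test';
        DELETE FROM table4 WHERE clm1 = 'test';

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- A run-time error in the TRY block transfers control here.
        IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION;

        -- Pass the original error text back to the caller.
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH
END
```

The design choice here is that a single CATCH block replaces the repeated IF (@@ERROR <> 0) checks, so a newly added statement cannot be left unchecked by accident.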

Remarks:

The local transaction started by the BEGIN TRANSACTION statement is escalated to a distributed transaction if the following actions are performed before the statement is committed or rolled back:

  • An INSERT, DELETE, or UPDATE statement that references a remote table on a linked server is executed. The INSERT, UPDATE, or DELETE statement fails if the OLE DB provider used to access the linked server does not support the ITransactionJoin interface.
  • A call is made to a remote stored procedure when the REMOTE_PROC_TRANSACTIONS option is set to ON.

SQL RAND Function

The SQL RAND() function returns a pseudorandom float value between 0 and 1 at run time. Here is the syntax:

Syntax

RAND ( [ seed ] )

Repetitive calls of RAND() with the same seed value return the same results.

For one connection, if RAND() is called with a specified seed value, all subsequent calls of RAND() produce results based on the seeded RAND() call. For example, the following query will always return the same sequence of numbers.

SELECT RAND(100), RAND(), RAND()

Examples:

The following example produces four different random numbers that are generated by the RAND function.

DECLARE @counter smallint
SET @counter = 1
WHILE @counter < 5
BEGIN
SELECT RAND() Random_Number
SET @counter = @counter + 1
END
GO

The following example returns random float numbers based on three different seed values.

CREATE TABLE Random (Seed1 float, Seed5 float, Seed10 float)
INSERT INTO Random Values (RAND(1), RAND(5), RAND(10))
SELECT * FROM Random

The RAND function is a pseudorandom number generator that operates in a manner similar to the C run-time library rand function. If no seed is provided, the system generates its own variable seed numbers. If you call RAND with a seed value, you must use variable seed values to generate random numbers. If you call RAND multiple times with the same seed value, it returns the same generated value. The following script returns the same value for the calls to RAND because they all use the same seed value:

SELECT RAND(159784)
SELECT RAND(159784)
SELECT RAND(159784)

A common way to generate random numbers from RAND is to include something relatively variable as the seed value, such as adding several parts of a GETDATE:

SELECT RAND( (DATEPART(mm, GETDATE()) * 100000 )
+ (DATEPART(ss, GETDATE()) * 1000 )
+ DATEPART(ms, GETDATE()) )

When you use an algorithm based on GETDATE to generate seed values, RAND can still generate duplicate values if the calls to RAND are made within the interval of the smallest datepart used in the algorithm. This is especially likely when the calls to RAND are included in a single batch. Multiple calls to RAND in a single batch can be executed within the same millisecond. This is the smallest increment of DATEPART. In this case, incorporate a value based on something other than time to generate the seed values.
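One commonly used non-time-based source for the seed is NEWID(). The sketch below is only one possible approach, not something the text above prescribes: NEWID() returns a new uniqueidentifier on every call, and CHECKSUM() reduces it to an int that RAND accepts as a seed.

```sql
-- Each call gets its own seed, even when both run within the same millisecond.
SELECT RAND(CHECKSUM(NEWID())) AS Random_Number
SELECT RAND(CHECKSUM(NEWID())) AS Random_Number
```

Because the seed no longer depends on the clock, the two statements are very unlikely to return the same value even inside a single batch.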

Tips And Tricks For Advanced MS SQL Server Developers

  1. Use the “TRUNCATE TABLE” statement instead of “DELETE” if you want to delete all rows from a table. It is much faster than a “DELETE” statement without any conditions. “TRUNCATE TABLE” frees all the space occupied by that table’s data and indexes, without logging the individual row deletes.
  2. Always use owner prefix in T-SQL  queries:

    SELECT mycolumn FROM dbo.mytable

    In this case the query optimizer does not have to decide whether to retrieve from dbo.mytable or another owner’s table, which avoids recompilation. Recompilation cancels out the performance advantage of using stored procedures.

  3. Don’t use “sp_” as the prefix for your stored procedures – it is a reserved prefix in MS SQL Server! MS SQL Server searches for a stored procedure with the “sp_” prefix among the system procedures first, and only after that looks among user-defined procedures.
  4. If you are unable to install MSDE at home because of an unknown error, check that you did not stop the “Server” system service on your PC.
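  5. There are thousands of examples where developers use the “SELECT COUNT(*)” statement. But there is another, much faster way to get the row count of a whole table, as long as an approximate figure is acceptable (the value stored in sysindexes is not guaranteed to be up to date):
    SELECT rows FROM sysindexes WHERE id = OBJECT_ID('Table_Name') AND indid < 2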
  6. Include the “SET NOCOUNT ON” statement in your stored procedures. It suppresses the “rows affected” message that is otherwise sent after every statement, which reduces network traffic.
  7. Use the “BETWEEN” clause instead of “IN” for greater performance:
    SELECT productId FROM customer
    WHERE productId BETWEEN 1 AND 9

    Instead of:

    SELECT productId
    FROM customer
    WHERE productId IN (1, 2, 3, 4,5,6,7,8,9)
  8. Consider table variables, a new feature of MS SQL 2000, instead of temp tables for small working sets. Contrary to a common belief, table variables are not held purely in memory; like temp tables they are backed by the tempdb database. They do, however, involve less locking, logging and recompilation, so they are often faster. Be careful to use them only for small amounts of data: they carry no statistics, so queries over a large table variable can perform badly and load the server.
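Several of the tips above can be combined in one short stored procedure. This is a sketch under stated assumptions: the procedure name usp_TipsSketch and the variable @work are hypothetical, and dbo.customer with its productId column is simply reused from tip 7.

```sql
-- Hypothetical procedure name; note that it avoids the reserved sp_ prefix (tip 3).
CREATE PROC usp_TipsSketch
AS
SET NOCOUNT ON   -- tip 6: suppress the "rows affected" messages

-- tip 8: a table variable for a small working set
DECLARE @work TABLE (productId int)

-- tips 2 and 7: owner-qualified table name and a BETWEEN range
INSERT INTO @work (productId)
SELECT productId
FROM dbo.customer
WHERE productId BETWEEN 1 AND 9

SELECT productId FROM @work
```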

Basic Definitions

Definition of a Database

A database is a collection of related information, accessed and managed by its DBMS. After experimenting with hierarchical and networked DBMSs during the 1970s, the IT industry became dominated by relational DBMSs (or object-relational database management systems) such as Informix, Oracle, Sybase, and, later on, Microsoft SQL Server and the like.

In a strictly technical sense, for any database to be defined as a “Truly Relational Model Database Management System,” it should, ideally, adhere to the twelve rules defined by Edgar F. Codd, pioneer in the field of relational databases. To date, while many come close, it is admitted that nothing on the market adheres 100% to those rules, any more than they are 100% ANSI-SQL compliant.

While IBM and Oracle technically were the earliest on the RDBMS scene, many others have followed, and while it is unlikely that miniSQL still exists in its original form, Monty’s MySQL is still extant and thriving, along with the Ingres-descended PostgreSQL. Microsoft Access (the 1995+ versions, not the prior versions) was, despite various limitations, technically the closest thing to a ‘Truly Relational’ DBMS for the desktop PC, with Visual FoxPro and many other desktop products marketed at that time far less compliant with Codd’s Rules.

A relational DBMS manages information about types of real-world things (entities) in the form of tables that represent the entities. A table is like a spreadsheet; each row represents a particular entity (instance), and each column represents a type of information about the entity (domain). Sometimes entities are made up of smaller related entities, such as orders and order lines, and so one of the challenges of a multi-user DBMS is to provide data about related entities from the standpoint of an instant of logical consistency.

Properly managed relational databases minimize the need for application programs to contain information about the physical storage of the data they access. To maximize the isolation of programs from data structures, relational DBMSs restrict data access to the messaging protocol SQL, a nonprocedural language that limits the programmer to specifying desired results. This message-based interface was a building block for the decentralization of computer hardware, because a program and data structure with such a minimal point of contact become feasible to reside on separate computers.

Recoverability

Recoverability means that, if a data entry error, program bug or hardware failure occurs, the DBA can bring the database backward in time to its state at an instant of logical consistency before the damage was done. Recoverability activities include making database backups and storing them in ways that minimize the risk that they will be damaged or lost, such as placing multiple copies on removable media and storing them outside the affected area of an anticipated disaster. Recoverability is the DBA’s most important concern.

The backup of the database consists of data with timestamps combined with database logs to change the data to be consistent to a particular moment in time. It is possible to make a backup of the database containing only data without timestamps or logs, but the DBA must take the database offline to do such a backup.

The recovery tests of the database consist of restoring the data, then applying logs against that data to bring the database backup to consistency at a particular point in time up to the last transaction in the logs. Alternatively, an offline database backup can be restored simply by placing the data in-place on another copy of the database.

If a DBA (or any administrator) attempts to implement a recoverability plan without the recovery tests, there is no guarantee that the backups are at all valid. In practice, in all but the most mature RDBMS packages, backups rarely are valid without extensive testing to be sure that no bugs or human error have corrupted the backups.

Security

Security means that users’ ability to access and change data conforms to the policies of the business and the delegation decisions of its managers. Like other metadata, a relational DBMS manages security information in the form of tables. These tables are the “keys to the kingdom” and so it is important to protect them from intruders.

Performance

Performance means that the database does not cause unreasonable online response times, and it does not cause unattended programs to run for an unworkable period of time. In complex client/server and three-tier systems, the database is just one of many elements that determine the performance that online users and unattended programs experience. Performance is a major motivation for the DBA to become a generalist and coordinate with specialists in other parts of the system outside of traditional bureaucratic reporting lines.

Techniques for database performance tuning have changed as DBAs have become more sophisticated in their understanding of what causes performance problems and in their ability to diagnose them.

In the 1990s, DBAs often focused on the database as a whole, and looked at database-wide statistics for clues that might help them find out why the system was slow. Also, the actions DBAs took in their attempts to solve performance problems were often at the global, database level, such as changing the amount of computer memory available to the database, or changing the amount of memory available to any database program that needed to sort data.

DBAs now understand that performance problems initially must be diagnosed, and this is best done by examining individual SQL statements, table process, and system architecture, not the database as a whole. Various tools, some included with the database and some available from third parties, provide a behind-the-scenes look at how the database is handling the SQL statements, shedding light on what’s taking so long. Having identified the problem, the individual SQL statement can be tuned.

Development/Testing Support

Development and testing support is typically what the database administrator regards as his or her least important duty, while results-oriented managers consider it the DBA’s most important duty. Support activities include collecting sample production data for testing new and changed programs and loading it into test databases; consulting with programmers about performance tuning; and making table design changes to provide new kinds of storage for new program functions.

Indexing Service

What is Indexing Service?

Indexing Service is a base service for Microsoft® Windows® 2000 or later that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching.

Indexing Service can extract both text and property information from files on the local host and on remote, networked hosts. The files can be simply members of a selected file system or part of a virtual Web hosted by, for example, Internet Information Services (IIS).

Indexing Service extracts the content by filtering—using filter components that understand a file’s format. The format could include multi-language features such as international languages and locales. A filter component implements the IFilter interface, which supplies methods to read a file to extract text and properties. Windows 2000 and Microsoft Windows XP supply filters for Microsoft Office files, Hypertext Markup Language (HTML) files, Multipurpose Internet Mail Extension (MIME) messages, and plain-text files.

Indexing Service then merges the extracted information into catalogs of indexes for efficient searches. Indexing is the overall process of filtering, creating index entries, and merging them into catalogs.

The final step in the indexing process is creation of a catalog that contains a master index (and any temporary word lists and shadow indexes) storing words and their locations within a set of indexed documents. Subsequently, searching, or querying, the catalogs for particular word combinations uses the master index as well as word lists and shadow indexes to execute queries quickly and efficiently.

Windows 2000 and Windows XP include basic facilities for querying the Indexing Service catalog and for managing the state and properties of Indexing Service itself. These facilities include:

  • When Indexing Service is running, Start/Search/For Files or Folders uses the Indexing Service catalog.
  • The Indexing Service snap-in for the Microsoft Management Console (MMC) provides the means to start, stop, and pause Indexing Service, and to administer many of its properties, such as those defining its catalogs.

The Platform Software Development Kit (SDK) provides additional versatile and flexible facilities for programmatically interacting with Indexing Service. These facilities include:

  • Admin and Query Helper objects and ActiveX® Data Object (ADO) methods for use with Microsoft Visual Basic®, Microsoft Visual Basic Scripting Edition (VBScript), Microsoft Visual J++® and Microsoft JScript® development software.
  • ISAPI Extensions for use in .idq, .ida, and .htx files.
  • OLE DB Helper functions for use with the Microsoft Visual C++® development system.
  • OLE DB Provider for Indexing Service interfaces for use with Visual C++.
  • IFilter interface for use with Visual C++.

Source: MSDN

SQL Performance Tuning using Indexes

This article looks at general guidelines to creating effective indexes using short keys, distinct keys, covering indexes and clustered indexes.

Effective indexes are one of the best ways to improve performance in a database application. Without an index, the SQL Server engine is like a reader trying to find a word in a book by examining each page. By using the index in the back of a book, a reader can complete the task in a much shorter time. In database terms, a table scan happens when there is no index available to help a query. In a table scan SQL Server examines every row in the table to satisfy the query results. Table scans are sometimes unavoidable, but on large tables, scans have a severe impact on performance.

One of the most important jobs for the database is finding the best index to use when generating an execution plan. Most major databases ship with tools to show you execution plans for a query and help in optimizing and tuning indexes. This article outlines several good rules of thumb to apply when creating and modifying indexes for your database. First, let’s cover the scenarios where indexes help performance, and when indexes can hurt performance.

Useful Index Queries

Just like the reader searching for a word in a book, an index helps when you are looking for a specific record or set of records with a WHERE clause. This includes queries looking for a range of values, queries designed to match a specific value, and queries performing a join on two tables. For example, both of the queries against the Northwind database below will benefit from an index on the UnitPrice column.

DELETE FROM Products WHERE UnitPrice = 1

SELECT * FROM PRODUCTS
WHERE UnitPrice BETWEEN 14 AND 16
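For reference, the index on UnitPrice that the text refers to could be created as follows. This is a minimal sketch: the index name IX_Products_UnitPrice is an assumption chosen for illustration, not a name from Northwind.

```sql
-- Nonclustered index on the UnitPrice column of the Northwind Products table.
-- IX_Products_UnitPrice is a hypothetical name.
CREATE INDEX IX_Products_UnitPrice
ON Products (UnitPrice)
```

Viewing the execution plan for the two queries before and after creating the index is a reasonable way to check whether the optimizer actually chooses it.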

Since index entries are stored in sorted order, indexes also help when processing ORDER BY clauses. Without an index the database has to load the records and sort them during execution. An index on UnitPrice will allow the database to process the following query by simply scanning the index and fetching rows as they are referenced. To order the records in descending order, the database can simply scan the index in reverse.

SELECT * FROM Products ORDER BY UnitPrice ASC

Grouping records with a GROUP BY clause will often require sorting, so a UnitPrice index will also help the following query to count the number of products at each price.

SELECT Count(*), UnitPrice FROM Products
GROUP BY UnitPrice

By retrieving the records in sorted order through the UnitPrice index, the database sees matching prices appear in consecutive index entries, and can easily keep a count of products at each price. Indexes are also useful for maintaining unique values in a column, since the database can easily search the index to see if an incoming value already exists. Primary keys are always indexed for this reason.

Index Drawbacks

Indexes are a performance drag when the time comes to modify records. Any time a query modifies the data in a table, the indexes on the data must change also. Achieving the right number of indexes will require testing and monitoring of your database to see where the best balance lies. Static systems, where databases are used heavily for reporting, can afford more indexes to support the read-only queries. A database with a large number of transactions that modify data will need fewer indexes to allow for higher throughput. Indexes also use disk space. The exact size will depend on the number of records in the table as well as the number and size of the columns in the index. Generally this is not a major concern, as disk space is easy to trade for better performance.

Building The Best Index

There are a number of guidelines to building the most effective indexes for your application. From the columns you select to the data values inside them, consider the following points when selecting the indexes for your tables.

Short Keys

Having a short index key is beneficial for two reasons. First, database work is inherently disk intensive. Larger index keys will cause the database to perform more disk reads, which limits throughput. Secondly, since index entries are often involved in comparisons, smaller entries are easier to compare. A single integer column makes the absolute best index key because an integer is small and easy for the database to compare. Character strings, on the other hand, require a character-by-character comparison and attention to collation settings.

Distinct Keys

The most effective indexes are the indexes with a small percentage of duplicated values. As an analogy, think of a phone book for a town where almost everyone has the last name of Smith. A phone book in this town is not very useful if sorted in order of last name, because you can only discount a small number of records when you are looking for a Smith.

An index with a high percentage of unique values is a selective index. Obviously, a unique index is highly selective since there are no duplicate entries. Many databases will track statistics about each index so they know how selective each index is. The database uses these statistics when generating an execution plan for a query.
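A rough way to gauge how selective a candidate index column would be is to compare the number of distinct values with the total number of rows. The query below is only an illustrative sketch against the Northwind Products table used earlier; the column aliases are made up for the example, and it is not how the database computes its own statistics.

```sql
-- Selectivity close to 1 means almost every value is distinct;
-- a value close to 0 means many duplicates (the "everyone is named Smith" case).
SELECT COUNT(DISTINCT UnitPrice) AS DistinctValues,
       COUNT(*)                  AS TotalRows,
       COUNT(DISTINCT UnitPrice) * 1.0 / NULLIF(COUNT(*), 0) AS Selectivity
FROM Products
```

Note that COUNT(DISTINCT ...) ignores NULLs, and NULLIF guards against dividing by zero on an empty table.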


Covering Queries

Indexes generally contain only the data values for the columns they index and a pointer back to the row with the rest of the data. This is similar to the index in a book: the index contains only the key word and then a page reference you can turn to for the rest of the information. Generally the database will have to follow pointers from an index back to a row to gather all the information required for a query. However, if the index contains all of the columns needed for a query, the database can save a disk read by not returning to the table for more information.

Take the index on UnitPrice we discussed earlier. The database could use just the index entries to satisfy the following query.

SELECT Count(*), UnitPrice FROM Products
GROUP BY UnitPrice

We call these types of queries covered queries, because all of the columns requested in the output are covered by a single index. For your most crucial queries, you might consider creating a covering index to give the query the best performance possible. Such an index would probably be a composite index (using more than one column), which appears to go against our first guideline of keeping index entries as short as possible. Obviously this is another tradeoff you can only evaluate with performance testing and monitoring.
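As a sketch of a composite covering index, suppose a crucial query filters on CategoryID and returns UnitPrice. The index name is hypothetical, and the choice of CategoryID (a column of the Northwind Products table) as the filter is an assumption made purely for illustration.

```sql
-- Composite index: both columns the query touches are in the index key.
-- IX_Products_CategoryID_UnitPrice is a hypothetical name.
CREATE INDEX IX_Products_CategoryID_UnitPrice
ON Products (CategoryID, UnitPrice)

-- This query can be answered from the index alone, without visiting the table rows.
SELECT CategoryID, UnitPrice
FROM Products
WHERE CategoryID = 1
```

Adding further columns to the SELECT list (ProductName, for example) would mean the index no longer covers the query, and the database would have to go back to the table for each matching row.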

Clustered Indexes

Many databases have one special index per table where all of the data from a row exists in the index. SQL Server calls this index a clustered index. Instead of an index at the back of a book, a clustered index is closer in similarity to a phone book, because each index entry contains all the information you need; there are no references to follow to pick up additional data values.

As a general rule of thumb, every non-trivial table should have a clustered index. If you only create one index for a table, make the index a clustered index. In SQL Server, creating a primary key will automatically create a clustered index (if none exists) using the primary key column as the index key. Clustered indexes are the most effective indexes (when used, they always cover a query), and in many database systems they help the database efficiently manage the space required to store the table.

When choosing the column or columns for a clustered index, be careful to choose a column with static data. If you modify a record and change the value of a column in a clustered index, the database might need to move the index entry (to keep the entries in sorted order). Remember, index entries for a clustered index contain all of the column values, so moving an entry is comparable to executing a DELETE statement followed by an INSERT, which can obviously cause performance problems if done often. For this reason, clustered indexes are often found on primary or foreign key columns. Key values will rarely, if ever, change.
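A minimal sketch of this guideline follows. The table OrderNotes and its columns are hypothetical; the point is only that the clustered index sits on an identity primary key whose values never change after insert.

```sql
-- Hypothetical table: the clustered index is placed on a static surrogate key.
CREATE TABLE OrderNotes (
    NoteID   int IDENTITY(1,1) NOT NULL,
    OrderID  int NOT NULL,
    NoteText varchar(200) NULL,
    CONSTRAINT PK_OrderNotes PRIMARY KEY CLUSTERED (NoteID)
)
```

Clustering on a frequently updated column such as NoteText would instead force the row to move whenever the text changes.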

Conclusion

Determining the correct indexes to use in a database requires careful analysis, benchmarking, and testing. The rules of thumb presented in this article are general guidelines. After applying these principles you need to retest your specific application in your specific environment of hardware, memory, and concurrent activity. See my previous article, SQL Server Indexes, for a more thorough introduction.

CREATE PARTITION FUNCTION (Transact-SQL)

Creates a function in the current database that maps the rows of a table or index into partitions based on the values of a specified column. Using CREATE PARTITION FUNCTION is the first step in creating a partitioned table or index.

CREATE PARTITION FUNCTION partition_function_name ( input_parameter_type )
AS RANGE [ LEFT | RIGHT ]
FOR VALUES ( [ boundary_value [ ,...n ] ] )
[ ; ]

partition_function_name
Is the name of the partition function. Partition function names must be unique within the database and comply with the rules for identifiers.
input_parameter_type
Is the data type of the column used for partitioning. All data types are valid for use as partitioning columns, except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types. The actual column, known as a partitioning column, is specified in the CREATE TABLE or CREATE INDEX statement.
boundary_value
Specifies the boundary values for each partition of a partitioned table or index that uses partition_function_name. If boundary_value is empty, the partition function maps the whole table or index using partition_function_name into a single partition. Only one partitioning column, specified in a CREATE TABLE or CREATE INDEX statement, can be used.

boundary_value is a constant expression that can reference variables. This includes user-defined type variables, or functions and user-defined functions. It cannot reference Transact-SQL expressions. boundary_value must either match or be implicitly convertible to the data type supplied in input_parameter_type, and cannot be truncated during implicit conversion in a way that the size and scale of the value does not match that of its corresponding input_parameter_type.

If boundary_value consists of datetime or smalldatetime literals, these literals are evaluated assuming that us_english is the session language. This behavior is deprecated. To make sure the partition function definition behaves as expected for all session languages, we recommend that you use constants that are interpreted the same way for all language settings, such as the yyyymmdd format; or explicitly convert literals to a specific style. For more information, see Writing International Transact-SQL Statements. To determine the session language of your connection, run SELECT @@LANGUAGE.

…n
Specifies the number of values supplied by boundary_value, not to exceed 999. The number of partitions created is equal to n + 1. The values do not have to be listed in order. If the values are not in order, the Database Engine sorts them, creates the function, and returns a warning that the values are not provided in order. The Database Engine returns an error if n includes any duplicate values.
LEFT | RIGHT
Specifies to which side of each boundary value interval, left or right, the boundary_value [ ,…n ] belongs, when interval values are sorted by the Database Engine in ascending order from left to right. If not specified, LEFT is the default. For more information, see Examples.

The scope of a partition function is limited to the database that it is created in. Within the database, partition functions reside in a separate namespace from the other functions.

Any rows whose partitioning column has null values are placed in the left-most partition, unless NULL is specified as a boundary value and RIGHT is indicated. In this case, the left-most partition is an empty partition, and NULL values are placed in the following partition.

Any one of the following permissions can be used to execute CREATE PARTITION FUNCTION:

  • ALTER ANY DATASPACE permission. This permission defaults to members of the sysadmin fixed server role and the db_owner and db_ddladmin fixed database roles.
  • CONTROL or ALTER permission on the database in which the partition function is being created.
  • CONTROL SERVER or ALTER ANY DATABASE permission on the server of the database in which the partition function is being created.

A. Creating a RANGE LEFT partition function on an int column

The following partition function will partition a table or index into four partitions.

CREATE PARTITION FUNCTION myRangePF1 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition | Values
1         | col1 <= 1
2         | col1 > 1 AND col1 <= 100
3         | col1 > 100 AND col1 <= 1000
4         | col1 > 1000

B. Creating a RANGE RIGHT partition function on an int column

The following partition function uses the same values for boundary_value [ ,…n ] as the previous example, except it specifies RANGE RIGHT.

CREATE PARTITION FUNCTION myRangePF2 (int)
AS RANGE RIGHT FOR VALUES (1, 100, 1000);

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition | Values
1         | col1 < 1
2         | col1 >= 1 AND col1 < 100
3         | col1 >= 100 AND col1 < 1000
4         | col1 >= 1000
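The difference between LEFT and RIGHT is easiest to see on a boundary value itself. As a small sketch, assuming both myRangePF1 and myRangePF2 from examples A and B have been created in the current database, the $PARTITION function reports the partition number a given value would map to:

```sql
-- 100 is a boundary value in both functions.
-- With RANGE LEFT it belongs to the partition on its left (col1 > 1 AND col1 <= 100).
-- With RANGE RIGHT it belongs to the partition on its right (col1 >= 100 AND col1 < 1000).
SELECT $PARTITION.myRangePF1(100) AS RangeLeftPartition,   -- returns 2
       $PARTITION.myRangePF2(100) AS RangeRightPartition   -- returns 3
```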

C. Creating a RANGE RIGHT partition function on a datetime column

The following partition function partitions a table or index into 12 partitions, one for each month of a year’s worth of values in a datetime column.

CREATE PARTITION FUNCTION [myDateRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('20030201', '20030301', '20030401',
               '20030501', '20030601', '20030701', '20030801',
               '20030901', '20031001', '20031101', '20031201');

The following table shows how a table or index that uses this partition function on partitioning column datecol would be partitioned.

Partition | Values
1         | datecol < February 1, 2003
2         | datecol >= February 1, 2003 AND datecol < March 1, 2003
...       | ...
11        | datecol >= November 1, 2003 AND datecol < December 1, 2003
12        | datecol >= December 1, 2003

D. Creating a partition function on a char column

The following partition function partitions a table or index into four partitions.

CREATE PARTITION FUNCTION myRangePF3 (char(20))
AS RANGE RIGHT FOR VALUES ('EX', 'RXE', 'XR');

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition | Values
1         | col1 < EX
2         | col1 >= EX AND col1 < RXE
3         | col1 >= RXE AND col1 < XR
4         | col1 >= XR
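Since creating the partition function is only the first step, the sketch below shows how the remaining steps might look for myRangePF1 from example A. The scheme name myRangePS1 and the table PartitionTable are names chosen for illustration, and mapping every partition to the PRIMARY filegroup is a simplifying assumption; a real deployment would usually spread partitions over several filegroups.

```sql
-- Step 2: a partition scheme maps the function's partitions to filegroups.
-- Here all four partitions go to PRIMARY for simplicity.
CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
ALL TO ([PRIMARY]);

-- Step 3: the table is created on the scheme, naming col1 as the partitioning column.
CREATE TABLE PartitionTable (col1 int, col2 char(10))
ON myRangePS1 (col1);
```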

Partitioning the Data in a Table

“Is it ever good database design practice (for speed’s sake, etc.) to essentially make copies of tables to hold a certain group of data?

For example, I have come across a database table that stores information for a housing subdivision, i.e. lot number, lot size, lot price, etc. And the database to which this table belongs stores this data for many subdivisions. However, instead of having one table that stores the subdivision information for ALL subdivisions (and having some ID that represents the specific subdivision), this database has one table for each subdivision. For example, ‘Clair Ridge Estates Subdivision Info’ and another table ‘Possum Bend Subdivision Info’, etc., with each table having the exact same fields. And, if they needed another subdivision, they would make yet another copy and give it a unique name.”

Yes, there are times this is a good idea. I like this question because it reminded me of one of my favorite features of SQL Server – partitioned views.

What you’re referring to is an optimization method called horizontal partitioning. That is, a table is split up into multiple smaller tables containing the same number of columns, but fewer rows. Compare this to vertical partitioning, in which the table is split into multiple smaller tables with the same number of rows, but fewer columns.

And yes, this design decision is often made to improve performance. Horizontally partitioning a table gives us some advantages:

  • Each partition table will have fewer rows; if you have to (heaven forbid) table-scan the data, it will take less time.
  • Indexes on each partition table will be smaller (=faster seeks) than a corresponding index on the unpartitioned table.
  • If you need to, you can put each partition table on a different filegroup and partition the data among multiple disks/RAID volumes/drive controllers.
  • If you’re trying to whomp Oracle’s TPC-C benchmark, then you may want to consider partitioning the data among multiple federated servers in SQL Server 2000. (Although for storing information about housing subdivisions, this may be a bit over the top.)
  • If you create a partitioned view on the partitioned tables, you can treat the view like it is the whole table, and the QP (query processor) will only touch the tables it needs to fulfill the query. You get the benefits of horizontal partitioning without the query headache.

So, as you can see, horizontal partitioning is all about splitting up the workload – spreading out data access among tables, indexes, disks, and servers.

Why would you want to do this? Well, maybe you have a big table – hundreds of millions of rows, for instance. Or maybe not so many rows, but large rows. Or maybe you have a table in a data warehouse that contains frequently and infrequently accessed rows. All of these situations are candidates for partitioning.

Now, there are two big downsides to all of this:

  • Unless you use a partitioned view to access the data, you’ll have to build logic into your application to access the correct table, and that has a high suck factor. Please, please, use the partitioned view instead.
  • You actually have to partition the data. And maintain it. And create the partitioned view. And balance the amount of data in each partitioned table, if needed. In other words, the dreaded “administrative overhead”.

Okay, since I’m touting the partitioned view, let me quickly explain how to create one. I’ll use the information from this question as an example. First, the tables:


CREATE TABLE Subdiv_ClaireRidgeEstates (SubdivID int, LotID int /*, etc.*/)
CREATE TABLE Subdiv_TibetianYakFarms (SubdivID int, LotID int /*, etc.*/)

You may notice that I included the Subdivision ID in each table. This is important; for the partitioned view to work most effectively, the QP must be able to know that each partition table will only contain a certain type of data. To do this, you need to build CHECK constraints on each table on the ID that you’re partitioning on. Since you’re partitioning the data by subdivision, you will build CHECK constraints on SubdivID:


ALTER TABLE Subdiv_ClaireRidgeEstates ADD CONSTRAINT CK_CRE_SubdivID CHECK (SubdivID = 42)
ALTER TABLE Subdiv_TibetianYakFarms ADD CONSTRAINT CK_TYF_SubdivID CHECK (SubdivID = 9538)

You could just as easily partition by using a surrogate key field and assigning a range of key values to each partition table. Or by partitioning on a date and using a range of dates for each partition value. Regardless, you still need those CHECK constraints in place on each table.
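As a rough sketch of the date-range variant, the partition tables below each accept one calendar year. The table and column names (Sales_2000, Sales_2001, SaleID, SaleDate) are hypothetical and only serve to show the shape of the CHECK constraints.

```sql
-- Hypothetical partition tables: one table per year of sales data.
-- Each CHECK constraint tells the query processor which dates the table can hold.
CREATE TABLE Sales_2000 (
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL
        CHECK (SaleDate >= '20000101' AND SaleDate < '20010101'),
    CONSTRAINT PK_Sales_2000 PRIMARY KEY (SaleID, SaleDate)
)

CREATE TABLE Sales_2001 (
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL
        CHECK (SaleDate >= '20010101' AND SaleDate < '20020101'),
    CONSTRAINT PK_Sales_2001 PRIMARY KEY (SaleID, SaleDate)
)
```

The ranges use a closed lower bound and an open upper bound so that no date can fall into two tables or into a gap between them.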

After actually creating the partition tables, distributing the data, and building the CHECK constraints, building the view is pretty easy. You just SELECT * from each partition table and use UNION ALL to combine the results of the query:


CREATE VIEW Subdivision
AS
SELECT * FROM Subdiv_ClaireRidgeEstates
UNION ALL
SELECT * FROM Subdiv_TibetianYakFarms

Now, if you’ve been following along with the example, try inserting some sample rows into each table:


INSERT Subdiv_ClaireRidgeEstates VALUES (42,9999)
INSERT Subdiv_TibetianYakFarms VALUES (9538,1234)

Now, turn on the “Show Execution Plan” option in Query Analyzer, and run the following queries:

SELECT * FROM Subdivision WHERE SubdivID = 42
SELECT * FROM Subdivision WHERE SubdivID = 9538
SELECT * FROM Subdivision


You’ll notice that for the first two queries, SQL Server only pulls information from the required partition table. Only in the last query, where we don’t filter by SubdivID, does the QP pull data from each partition table.

In SQL Server 7.0, you unfortunately cannot update data in a partitioned view. However, this IS possible in SQL Server 2000. Check out SQL Server Books Online (especially if you’re going to use distributed partitioned views) for the do’s and don’ts of partitioning data.

Backup/Restore Optimization Tips


  1. Try to perform backup to the local hard disk first, and copy backup file(s) to the tape later.
    While a backup is running, some SQL Server commands cannot be executed. For example, during a backup you cannot run an ALTER DATABASE statement with either the ADD FILE or REMOVE FILE option, you cannot shrink the database, you cannot run a CREATE INDEX statement, and so on. So, to decrease the backup time, you can back up to the local hard disk first and then copy the backup file(s) to tape, because tape devices are usually much slower than hard disks. The shorter the backup operation is, the less impact there will be on the server when the backup occurs.

  2. Perform backup on multiple backup devices.
    Using multiple backup devices forces SQL Server to create a separate backup thread for each backup device, so the backups will be written to all backup devices in parallel.
  3. Perform backup on a physical disk array; the more disks in the array, the more quickly the backup will be made.
    This can improve performance because a separate thread will be created for each backup device on each disk in order to write the backup’s data in parallel.
  4. Perform backups during periods of low database access.
    Because backup is very resource intensive, try to schedule it during CPU idle time and slow production periods.
  5. Use full backup to minimize the time to restore databases.
    Full backups take the longest to perform in comparison with differential and incremental backups, but are the fastest to restore.
  6. Use incremental backup to minimize the time to back up databases.
    Incremental (transaction log) backups are the fastest to perform in comparison with full and differential backups, but take the longest to restore.
  7. Use differential backup instead of incremental backup when the users update the same data many times.
    Because a differential backup captures only those data pages that have changed since the last full database backup, you can eliminate much of the time the server spends rolling transactions forward when recovering transaction logs from the incremental backups. Using differential backups in this case can make the recovery process several times faster.
  8. Try to separate your database into different files and filegroups so that you can back up only the appropriate file or filegroup.
    This can result in a shorter backup time. The shorter the backup operation is, the less impact there will be on the server when the backup occurs.
  9. Use Windows NT Performance Monitor or Windows 2000 System Monitor to check the impact of a backup on total system performance.
    You can check the following counters: SQL Server Backup Device: Device Throughput Bytes/sec, to determine the throughput of specific backup devices rather than the entire database backup or restore operation; SQL Server Databases: Backup/Restore Throughput/sec, to monitor the throughput of the entire database backup or restore operation; PhysicalDisk: % Disk Time, to monitor the percentage of time that the disk is busy with read/write activity; PhysicalDisk: Avg. Disk Queue Length, to determine how many system requests on average are waiting for disk access.
  10. To decrease the backup operation’s time, consider backing up more often.
    The more often you back up, the smaller the backups will be, and the less impact there will be on the server when a backup occurs. So, to avoid slowing users down for a long time during everyday work, you can back up more often. Note: the more often you back up, the less data you will lose if the database becomes corrupt.
  11. Place a tape drive on a different SCSI bus than disks or a CD-ROM drive.
    Tape drives perform better if each one has a dedicated SCSI bus. Using a separate SCSI bus for a tape drive can result in maximum backup performance and prevents conflicts with other drive array access. Microsoft recommends using a dedicated SCSI bus for tape drives whose native transfer rate exceeds 50 percent of the SCSI bus speed.
  12. Use SQL Server 2000 snapshot backups for very large databases.
    The SQL Server 2000 snapshot backup and restore technologies work in conjunction with third-party hardware and software vendors. The main advantages of snapshot backups and restores are that they can be done in a very short time, typically measured in seconds, not hours, and that they reduce the backup/restore impact on overall server performance. Snapshot backups are accomplished by splitting a mirrored set of disks or by creating a copy of a disk block when it is written, and they require special hardware and software.
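Several of the tips above map directly onto options of the BACKUP statement. The statements below are a minimal sketch, not a complete backup plan: the database name SalesDb, the filegroup name ArchiveFG, and all file paths are hypothetical, and the log backup assumes the database uses a recovery model that allows log backups.

```sql
-- Tip 2: stripe one full backup across two disk devices so both are written in parallel.
BACKUP DATABASE SalesDb
TO DISK = 'D:\Backup\SalesDb_stripe1.bak',
   DISK = 'E:\Backup\SalesDb_stripe2.bak'
WITH INIT;

-- Tip 7: differential backup (only pages changed since the last full backup).
BACKUP DATABASE SalesDb
TO DISK = 'D:\Backup\SalesDb_diff.bak'
WITH DIFFERENTIAL;

-- Tip 6: transaction log backup, the fastest to take but the slowest to restore.
BACKUP LOG SalesDb
TO DISK = 'D:\Backup\SalesDb_log.trn';

-- Tip 8: back up a single filegroup instead of the whole database.
BACKUP DATABASE SalesDb
FILEGROUP = 'ArchiveFG'
TO DISK = 'D:\Backup\SalesDb_ArchiveFG.bak';
```

A striped backup needs all of its files to be present at restore time, so the two stripe files should be copied to tape together.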

Analyze and Fix Index Fragmentation in SQL Server 2008


Over time, SQL Server tables and indexes tend to become fragmented. Fragmentation generally happens when data in the underlying tables on which an index exists is modified by insert, update, or delete operations. As indexes become fragmented, they become less effective. In this article you will see an example of how an index gets fragmented and the steps a database administrator needs to take to fix index fragmentation.

Example to Analyze and Fix Index Fragmentation in SQL Server 2008

Follow the steps below to see how index fragmentation occurs on a table which has indexes defined on it, and then the steps you need to take to fix it.

Create AnalyzeFragmentation Database

First let us create a new database named AnalyzeFragmentation for this example. The database can be created by executing the TSQL query below.

Use master
GO
IF EXISTS (SELECT name FROM sys.databases WHERE name = N'AnalyzeFragmentation')
DROP DATABASE [AnalyzeFragmentation]
GO
CREATE DATABASE AnalyzeFragmentation
GO

Create FindAndFixFragmentation Table in AnalyzeFragmentation Database

The next step will be to create a new table named FindAndFixFragmentation within the AnalyzeFragmentation database.


USE AnalyzeFragmentation
GO
IF OBJECT_ID (N'dbo.FindAndFixFragmentation', N'U') IS NOT NULL
DROP TABLE dbo.FindAndFixFragmentation;
GO

-- Create FindAndFixFragmentation Table --
CREATE TABLE [dbo].[FindAndFixFragmentation]
(

[AddressID] [int] NOT NULL,
[AddressLine1] [nvarchar](60) NOT NULL,
[City] [nvarchar](30) NOT NULL,
[PostalCode] [nvarchar](15) NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
[RowGUID] [UNIQUEIDENTIFIER] NOT NULL

)
ON [PRIMARY]

GO

Populate the FindAndFixFragmentation Table using the below TSQL code

The next step is to populate the FindAndFixFragmentation table which you created earlier by executing the TSQL code below. For this example we will use the data in the Person.Address table of the AdventureWorks database.


USE AnalyzeFragmentation
GO

-- Populate FindAndFixFragmentation table with data from AdventureWorks.Person.Address --

INSERT INTO FindAndFixFragmentation
SELECT
AddressID,
AddressLine1,
City,
PostalCode,
ModifiedDate,
RowGUID
FROM AdventureWorks.Person.Address
GO

Create a Clustered Index on FindAndFixFragmentation Table using the below TSQL code

The next step will be to create a clustered index named CL_FindAndFixFragmentation_Index on FindAndFixFragmentation table using the below mentioned TSQL code.

-- Drop the index if it is already existing--
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[FindAndFixFragmentation]') AND name = N'CL_FindAndFixFragmentation_Index')

DROP INDEX [CL_FindAndFixFragmentation_Index] ON [dbo].[FindAndFixFragmentation]
GO
-- Create Clustered Index on FindAndFixFragmentation(RowGUID) --
CREATE CLUSTERED INDEX [CL_FindAndFixFragmentation_Index] ON [dbo].[FindAndFixFragmentation]
(
[RowGUID] ASC
)
WITH (FILLFACTOR = 90) ON [PRIMARY]
GO

You can see that we are creating a clustered index on the FindAndFixFragmentation table with a fill factor of 90. The fill factor option is provided for fine-tuning index data storage and performance. Whenever an index is created or rebuilt, the fill factor value determines the percentage of space on each leaf-level page that is filled with data; the remainder is left as free space. The default fill factor is 0, which behaves the same as 100: leaf-level pages are filled completely, with no free space reserved. The fill factor is expressed as a percentage and can be any value from 1 to 100. In this example the fill factor is 90, which means 10 percent of every leaf-level page is left free to accommodate future growth.

Query to Find Existing Fragmentation on FindAndFixFragmentation Table

The next step is to execute the query below to find the existing fragmentation on the FindAndFixFragmentation table. The important values for database administrators to note are AvgPageFragmentation and PageCounts. The value for AvgPageFragmentation is 0.341296928327645, which means there is very little fragmentation on the table at this point. The value for PageCounts is 293, which means the data is stored in that many data pages. This query will be executed several times in this article.


-- Find index fragmentation --
SELECT
DB_NAME(DATABASE_ID) AS [DatabaseName],
OBJECT_NAME(OBJECT_ID) AS TableName,
SI.NAME AS IndexName,
INDEX_TYPE_DESC AS IndexType,
AVG_FRAGMENTATION_IN_PERCENT AS AvgPageFragmentation,
PAGE_COUNT AS PageCounts
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, N'LIMITED') DPS
INNER JOIN sysindexes SI
ON DPS.OBJECT_ID = SI.ID AND DPS.INDEX_ID = SI.INDID
GO

Perform Update Operation on FindAndFixFragmentation Table

Next step will be to perform updates on FindAndFixFragmentation table by executing the below mentioned TSQL code. This query will modify all the data for RowGUID column on which we have created clustered index with fill factor as 90.

-- Update all rows in the FindAndFixFragmentation table to create index fragmentation --

USE AnalyzeFragmentation
GO
UPDATE FindAndFixFragmentation
SET RowGUID =NEWID()
GO

Execute the fragmentation query shown earlier again.

Now you can see that the value for AvgPageFragmentation has changed from 0.341296928327645 to 99.0049751243781, which means the index is almost completely fragmented. At the same time the value for PageCounts has changed from 293 to 603, which means more data pages are required to store the same content. The question now is how this can be fixed.

There are two methods to fix index fragmentation in SQL Server 2005 and higher versions: Reorganize Index and Rebuild Index. Reorganize Index is an online operation, whereas Rebuild Index is an offline operation unless you specify the option ONLINE=ON while performing the rebuild. We will first perform the REORGANIZE operation, then perform the REBUILD, and see which option works better.

Perform Reorganize Index Operation on Clustered Index of FindAndFixFragmentation Table

First let us perform the REORGANIZE Index operation on the clustered index, and then execute the fragmentation query shown earlier to find the fragmentation on the FindAndFixFragmentation table.


-- Reorganize [CL_FindAndFixFragmentation_Index] index on FindAndFixFragmentation --

ALTER INDEX [CL_FindAndFixFragmentation_Index] ON FindAndFixFragmentation

REORGANIZE;

GO

Once we have performed the REORGANIZE Index operation, you can see that the value for AvgPageFragmentation has changed from 99.0049751243781 to 5.70469798657718, which means index fragmentation is much lower than it was. At the same time the value for PageCounts has come down from 603 to 298, which is a considerable improvement.

Perform Rebuild Index Operation on Clustered Index of FindAndFixFragmentation Table

Now let us perform the REBUILD Index operation on the clustered index. A rebuild drops and recreates the index. What we need to see is whether this reduces the index fragmentation below 5.70469798657718. Once you have performed the rebuild, execute the fragmentation query shown earlier to check the fragmentation on the FindAndFixFragmentation table.


-- Rebuild [CL_FindAndFixFragmentation_Index] index on FindAndFixFragmentation --

ALTER INDEX [CL_FindAndFixFragmentation_Index] ON FindAndFixFragmentation
REBUILD WITH (FILLFACTOR = 90, ONLINE=ON)
GO

You can see that the value for AvgPageFragmentation is back to 0.341296928327645, which means the fragmentation is the same as it was when we began this exercise. At the same time the value for PageCounts is back to 293. In this example, the REBUILD Index operation removed more fragmentation than the REORGANIZE Index operation.

Reorganize Index

Reorganize Index uses minimal system resources and is performed online. Its biggest advantage is that it does not hold locks for a long time, so it does not block updates or other user queries. If the index fragmentation is between 5% and 30%, it is better to perform Reorganize Index.

Rebuild Index

Rebuild Index drops and recreates the index, which removes fragmentation most thoroughly. If the index fragmentation is greater than 30%, the best strategy is to use Rebuild Index instead of Reorganize Index.
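The 5% and 30% thresholds above can be folded into the fragmentation query so that each index gets a suggested action. This is a minimal sketch under those thresholds; the SuggestedAction column and the aliases ps and i are names chosen here for illustration, and the query only reports a suggestion without changing any index.

```sql
-- Run in the database you want to inspect (for example, AnalyzeFragmentation).
SELECT OBJECT_NAME(ps.object_id)         AS TableName,
       i.name                            AS IndexName,
       ps.avg_fragmentation_in_percent   AS AvgPageFragmentation,
       ps.page_count                     AS PageCounts,
       CASE
           WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
           WHEN ps.avg_fragmentation_in_percent >= 5 THEN 'REORGANIZE'
           ELSE 'NONE'
       END                               AS SuggestedAction
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, N'LIMITED') AS ps
INNER JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id  = ps.index_id
WHERE ps.index_id > 0   -- skip heaps, which have no index to reorganize or rebuild
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

On very small indexes the fragmentation percentage can stay high even after a rebuild, so it may be worth also filtering on page_count before acting on the suggestion.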

Conclusion

Database Administrators should always make sure that fragmentation of indexes is handled on time. If the indexes are fragmented then the query response will not only be very slow; the data storage will also require more disk space. In this article you have seen an example where the clustered index gets fragmented over time and the steps which you need to perform to resolve index fragmentation issues.

How to Cluster SQL Server 2005


Believe it or not, the procedure to install a SQL Server 2005 instance onto a cluster is one of the easiest parts of getting your SQL Server 2005 cluster up and running. The SQL Server 2005 setup program is used for the install and does the hard work for you. All you have to do is make a few (but critically important) decisions, and then sit back and watch the installation take place. In fact, the setup program even goes to the trouble to verify that your nodes are all properly configured, and if not, will suggest how to fix most problems before the installation begins.

When the installation process does begin, the setup program recognizes all the nodes, and once you give it the go ahead to install on each one, it does, all automatically. SQL Server 2005 binaries are installed on the local drive of each node, and the system databases are stored on the shared array you designate.

In the next section are the step-by-step instructions for installing a SQL Server 2005 instance in a cluster. The assumption for this example is that you will be installing this instance in a 2-node active/passive cluster. Even if you will be installing a 2-node active/active or a multi-node cluster, the steps in this section are virtually the same. The only real difference is that you will have to run SQL Server 2005 setup for every instance you want to install on the cluster, and you will have to specify a different logical drive on the shared array.

Clustering SQL Server

To begin installing your SQL Server 2005 cluster, you will need the installation CD or DVD. You can either install it directly from the media, or copy the install files from the media to the current active node of the cluster, and run the setup program from there.

To begin the installation, run Setup.exe. After an introductory screen, you will get the first install dialog box as shown in the figure below.

The Installing Prerequisites dialog box lists the prerequisites that need to be installed before installation of SQL Server 2005 can begin. The number of components may vary from the above figure, depending on what you have already installed on your nodes. What is interesting to note here is that these prerequisite components will only be installed immediately on the active node. They will be installed on the passive node later during the installation process. This is done automatically and you don’t have to worry about it.

Click Install to install these components. When completed, you will get a dialog box telling you that they were installed successfully, and then you can click Next to proceed. On occasion, I have seen these components fail to install correctly. If this happens, you will have to troubleshoot the installation. Generally speaking, try rebooting both nodes of the cluster and try installing them again. This often fixes whatever caused the first setup try to fail.

Once the prerequisite components have been successfully installed, the SQL Server Installation Wizard launches

SQL Tuning or SQL Optimization


SQL statements are used to retrieve data from the database. We can get the same results by writing different SQL queries, but using the best query is important when performance is considered. So you need to tune your SQL queries based on the requirement. Here is a list of queries which we use regularly and how these SQL queries can be optimized for better performance.

SQL Tuning/SQL Optimization Techniques:

1) The SQL query becomes faster if you use the actual column names in the SELECT statement instead of ‘*’.

For Example: Write the query as

SELECT id, first_name, last_name, age, subject FROM student_details;

Instead of:

SELECT * FROM student_details;

2) HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes.
For Example: Write the query as

SELECT subject, count(subject)
FROM student_details
WHERE subject != 'Science'
AND subject != 'Maths'
GROUP BY subject;

Instead of:

SELECT subject, count(subject)
FROM student_details
GROUP BY subject
HAVING subject != 'Science' AND subject != 'Maths';

3) Sometimes you may have more than one subquery in your main query. Try to minimize the number of subquery blocks in your query.
For Example: Write the query as

SELECT name
FROM employee
WHERE (salary, age ) = (SELECT MAX (salary), MAX (age)
FROM employee_details)
AND dept = 'Electronics';

Instead of:

SELECT name
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details)
AND age = (SELECT MAX(age) FROM employee_details)
AND dept = 'Electronics';

4) Use operator EXISTS, IN and table joins appropriately in your query.
a) Usually IN has the slowest performance.
b) IN is efficient when most of the filter criteria is in the sub-query.
c) EXISTS is efficient when most of the filter criteria is in the main query.

For Example: Write the query as

Select * from product p
where EXISTS (select * from order_items o
where o.product_id = p.product_id)

Instead of:

Select * from product p
where product_id IN
(select product_id from order_items)

5) Use EXISTS instead of DISTINCT when using joins that involve tables having a one-to-many relationship.
For Example: Write the query as

SELECT d.dept_id, d.dept
FROM dept d
WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);

Instead of:

SELECT DISTINCT d.dept_id, d.dept
FROM dept d,employee e
WHERE e.dept = d.dept;

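6) Try to use UNION ALL in place of UNION when duplicate rows are acceptable or cannot occur, because UNION ALL skips the duplicate-elimination step.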
For Example: Write the query as

SELECT id, first_name
FROM student_details_class10
UNION ALL
SELECT id, first_name
FROM sports_team;

Instead of:

SELECT id, first_name
FROM student_details_class10
UNION
SELECT id, first_name
FROM sports_team;

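7) Be careful while using conditions in WHERE clause. A rewritten condition must still select the rows you intend; for instance, age > 10 and age != 10 below return the same rows only when no age is below 10.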
For Example: Write the query as

SELECT id, first_name, age FROM student_details WHERE age > 10;

Instead of:

SELECT id, first_name, age FROM student_details WHERE age != 10;

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE first_name LIKE 'Cha%';

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE SUBSTR(first_name,1,3) = 'Cha';

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE first_name LIKE NVL ( :name, '%');

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE first_name = NVL ( :name, first_name);

Write the query as

-- :min_price and :max_price are bind variables supplied by the caller
SELECT product_id, product_name
FROM product
WHERE unit_price BETWEEN :min_price AND :max_price

Instead of:

SELECT product_id, product_name
FROM product
WHERE unit_price >= :min_price
and unit_price <= :max_price

Write the query as

SELECT id, name, salary
FROM employee
WHERE dept = 'Electronics'
AND location = 'Bangalore';

Instead of:

SELECT id, name, salary
FROM employee
WHERE dept || location= 'ElectronicsBangalore';

Use non-column expression on one side of the query because it will be processed earlier.

Write the query as

SELECT id, name, salary
FROM employee
WHERE salary < 25000;

Instead of:

SELECT id, name, salary
FROM employee
WHERE salary + 10000 < 35000;

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE age > 10;

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE NOT age = 10;

8) Use DECODE to avoid scanning the same rows or joining the same table repeatedly. DECODE can also be used in place of a GROUP BY or ORDER BY clause.
For Example: Write the query as

SELECT id FROM employee
WHERE name LIKE 'Ramesh%'
and location = 'Bangalore';

Instead of:

SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee
WHERE name LIKE 'Ramesh%';

9) To store large binary objects, first place them in the file system and add the file path in the database.

10) To write queries which provide efficient performance follow the general SQL standard rules.

a) Use single case for all SQL verbs
b) Begin all SQL verbs on a new line
c) Separate all words with a single space
d) Right or left aligning verbs within the initial SQL verb
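Applied to a query against the student_details table used earlier, the layout rules above might look like the following sketch. The particular filter and sort order are made up for illustration.

```sql
-- Verbs in a single case, each on its own line, right-aligned under SELECT,
-- with single spaces between words.
SELECT id, first_name, age
  FROM student_details
 WHERE subject = 'Science'
   AND age > 10
 ORDER BY first_name;
```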

SQL Subquery


Subquery or Inner query or Nested query is a query in a query. A subquery is usually added in the WHERE Clause of the sql statement. Most of the time, a subquery is used when you know how to search for a value using a SELECT statement, but do not know the exact value.

Subqueries are an alternate way of returning data from multiple tables.


Subqueries can be used with the following SQL statements along with comparison operators like =, <, >, >=, <=, etc.

  • SELECT
  • INSERT
  • UPDATE
  • DELETE

For Example:

1) Usually, a subquery should return only one record, but sometimes it can also return multiple records when used with operators like IN, NOT IN in the where clause. The query would be like,

SELECT first_name, last_name, games
FROM student_details
WHERE games NOT IN ('Cricket', 'Football');

The output would be similar to:

first_name     last_name     games
-------------     -------------     ----------
Shekar     Gowda     Badminton
Priya     Chandra     Chess

2) Let’s consider the student_details table which we have used earlier. If you know the names of the students who are studying the Science subject, you can get their ids by using the query below,

SELECT id, first_name
FROM student_details
WHERE first_name IN ('Rahul', 'Stephen');

but, if you do not know their names, then to get their ids you need to write the query in this manner,

SELECT id, first_name
FROM student_details
WHERE first_name IN (SELECT first_name
FROM student_details
WHERE subject= 'Science');

Output:


id     first_name
--------     -------------
100     Rahul
102     Stephen

In the above SQL statement, the inner query is processed first and then the outer query is processed.

3) A subquery can be used with an INSERT statement to add rows of data from one or more tables to another table. Let’s try to group all the students who study Maths in a table ‘maths_group’.


INSERT INTO maths_group(id, name)
SELECT id, first_name || ' ' || last_name
FROM student_details WHERE subject= 'Maths'

4) A subquery can be used in the SELECT statement as follows. Let’s use the product and order_items tables defined in the sql_joins section.

select p.product_name, p.supplier_name,
(select order_id from order_items where product_id = 101) as order_id
from product p
where p.product_id = 101


product_name     supplier_name     order_id
------------------     ------------------     ----------
Television     Onida     5103

Correlated Subquery

A subquery is called a correlated subquery when the inner query references columns of the outer query. The inner query is evaluated once for every row processed by the outer query, so it depends on the outer query and cannot be processed on its own.


SELECT p.product_name FROM product p
WHERE p.product_id = (SELECT o.product_id FROM order_items o
WHERE o.product_id = p.product_id);

NOTE:
1) You can nest as many queries as you want, but it is recommended not to nest more than 16 subqueries in Oracle.
2) If a subquery is not dependent on the outer query it is called a non-correlated subquery.
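To make the difference concrete, here are two ways of listing products that have at least one order item, using the product and order_items tables from the examples above. This is a sketch that assumes both tables have a product_id column, as the earlier queries do.

```sql
-- Non-correlated: the inner query does not mention the outer row
-- and could be run on its own.
SELECT p.product_name
FROM product p
WHERE p.product_id IN (SELECT o.product_id FROM order_items o);

-- Correlated: the inner query references p.product_id from the outer row,
-- so it is evaluated for each product.
SELECT p.product_name
FROM product p
WHERE EXISTS (SELECT 1
              FROM order_items o
              WHERE o.product_id = p.product_id);
```

EXISTS is used in the correlated form because it only tests whether a matching row exists, so it works no matter how many order items a product has.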

Using Stored Procedures

Most of us database programmers have used stored procedures. Maybe not all of us know why we use them. This article is for those who have used (or never used) stored procedures and are yet to understand why everyone suggests using them in your database.

Stored Procedures – What are they?

A stored procedure is a set of pre-defined Transact-SQL statements used to perform a specific task. There can be multiple statements in a stored procedure, and all of them are grouped into one database object.

How to create a stored procedure?

Creating a stored procedure is as easy as running the “Create Procedure” statement followed by the SQL script. You can run your Create Procedure statement from the SQL Query Analyzer, or can use the New Procedure menu item in the Enterprise Manager.

The simplest skeleton of a stored procedure.

CREATE PROC procedure_name
[ { @parameter data_type }
]
AS sql_statement

Check the basic building blocks of a stored procedure.

A stored procedure includes:

1.      A CREATE PROC (CREATE PROCEDURE) statement;
2.      The procedure name;
3.      The parameter list
4.      And the SQL statements.

Even though there are numerous other options available while we define a stored procedure, I kept it simple, just to give you a basic idea about creating stored procedures.
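As a concrete instance of the skeleton above, the sketch below defines a procedure with one parameter and then calls it. The procedure name usp_GetStudentsBySubject and the student_details table with its id, first_name, last_name and subject columns are hypothetical, chosen only for illustration.

```sql
-- Hypothetical procedure: list the students who study a given subject.
CREATE PROCEDURE usp_GetStudentsBySubject
    @subject varchar(50)
AS
    SELECT id, first_name, last_name
    FROM student_details
    WHERE subject = @subject
GO

-- Run it, passing the parameter by name.
EXEC usp_GetStudentsBySubject @subject = 'Science'
GO
```

The GO lines are batch separators: CREATE PROCEDURE has to be the first statement in its batch, so the definition and the call are sent separately.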

Advantages!

Almost every database guru that you meet will suggest using stored procedures. To you, it may seem as if most of them blindly believe in stored procedures. But there are reasons for this, which is what I am trying to explore in this article.

1. Performance

All the SQL statements that you send to your database server pass through a series of actions, called execution. These are the steps that your SQL statement passes through before the data is returned to the client:

1.      The user sends a request to execute the stored procedure.
2.      SQL Server checks for syntax errors.
3.      It identifies and checks the aliases in the FROM clause.
4.      It creates a query plan.
5.      It compiles the query.
6.      It executes the query plan and returns the requested data.

See, lots of things are happening inside that we didn’t know about. Now, the crucial question: does a stored procedure bypass all these?

In a way, yes. Earlier versions of SQL Server stored the compiled execution plan in system tables, making stored procedures partially pre-compiled. This improved performance, because the server did not have to compile the stored procedure each and every time it was called.

In later versions of SQL Server, there were a large number of changes in statement processing. Now, the execution plan of a stored procedure is kept in the procedure cache when it is called, making subsequent calls faster.

2. Security

Stored procedures provide significant benefits when it comes to security. By using a stored procedure, you can grant certain users permission to access data only through the procedure, which reduces the amount of access-control code you need to write in your client applications. This is one of the best ways to control access to your data.
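As a rough illustration, the statement below grants a user the right to run a procedure without granting any rights on the tables it reads. The names dbo.usp_GetOrdersByCustomer and report_user are hypothetical placeholders. The sketch assumes the procedure and the tables it touches have the same owner, so that SQL Server's ownership chaining applies, and that the procedure does not build dynamic SQL.

```sql
-- Hypothetical names: replace with a real procedure and database user.
-- report_user can execute the procedure ...
GRANT EXECUTE ON dbo.usp_GetOrdersByCustomer TO report_user;

-- ... but has been given no SELECT, INSERT, UPDATE, or DELETE permission
-- on the underlying tables, so the procedure is its only route to the data.
```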

3. Modifications/Maintenance

If you use stored procedures for database access, any change in the database can be handled without much effort in the client application. This is because you know exactly where the data is accessed from, and you also know exactly where you need to make the change. This means no scuba diving into thousands of lines of source code to identify the areas you need to alter, and no headache of re-deploying the client application.

4. Minimal processing at the client.

When creating a client/server application, it was normally the client that took care of the integrity of the data that went into the database. Managing primary keys, foreign keys, and cascaded deletions was all done by the client, and the database server just had to store the data it was given.

Well friends, things have changed. Stored procedures let you write batches of SQL statements that manage transactions, constraints, and so on. Only a little data-aware code has to be written into the client application, making it a thin client. Such applications are concerned mainly with displaying data the way the user needs it, and they know little about the database.

Take another scenario. You have a database with millions of rows and hundreds of tables, and you need to do some calculations before updating each record. If you fetch all the data to the client and ask the client machine to do the processing, think about the overhead that creates. But when the client can execute a stored procedure that performs the calculations before updating the records, the client does not need to know about the calculations at all. This reduces the amount of computing done on the client, and the server takes care of the tedious calculations.
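The scenario above might look something like the following sketch, in which the calculation and the update both run on the server inside a transaction. The table dbo.Products, its UnitPrice column, and the procedure name are assumptions made for this example only.

```sql
-- Hypothetical example: dbo.Products and UnitPrice are assumed to exist.
CREATE PROCEDURE dbo.usp_ApplyPriceIncrease
    @Percent DECIMAL(5, 2)
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRANSACTION;

    -- The calculation happens on the server; no rows travel to the client.
    UPDATE dbo.Products
    SET UnitPrice = UnitPrice * (1 + @Percent / 100.0);

    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION;
        RETURN 1;                -- non-zero return code signals failure
    END

    COMMIT TRANSACTION;
    RETURN 0;
END
GO

-- The client only needs to send this one line:
EXEC dbo.usp_ApplyPriceIncrease @Percent = 5.00;
```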

5. Network traffic

Client applications constantly send requests to the database server and receive data from it. This data is sent in packets that travel across the network.

To see how stored procedures can help reduce network traffic, consider another scenario, in which a request for data is sent from the client. The request is sent as an SQL statement, and here it is.

SELECT dbo.Tbl_Tablename.fieldID,
dbo.Tbl_Tablename.fieldName,
dbo.Tbl_Tablename.Title,
dbo.TBl_otherTableName.fieldID,
dbo.Tbl_Tablename.Published,
dbo.Tbl_Tablename.Updated,
dbo.Tbl_Tablename.SomeText,
dbo.Tbl_Tablename.TransactionDate,
dbo.Tbl_Tablename.Approved,
dbo.Tbl_Tablename.ApprovedBy,
dbo.Tbl_Tablename.ApprovalID
FROM
dbo.Tbl_Tablename
LEFT OUTER JOIN
dbo.TBl_otherTableName on dbo.Tbl_Tablename.fieldID=dbo.TBl_otherTableName.ID
Where
DateDiff ( wk, dbo.Tbl_Tablename.TransactionDate, getdate()) <= 1
and dbo.Tbl_Tablename.Approved = 0

These 518 characters travel through the network every time the query runs. When 20 client applications each send this query 20 times a day, the number of characters passing through the network for just this request is 518 × 20 × 20 = 207,200!

Now see the difference. If the query were wrapped in a stored procedure, let's call it SP_fetchSomething, only 17 × 20 × 20 = 6,800 characters would cross the network for the same requests. A saving of 200,400!
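The arithmetic can be checked with a few lines of T-SQL. This is only a back-of-the-envelope sketch: it takes the 518-character figure from the text as given, counts only the characters of the request itself, and ignores protocol overhead and the result set, which is the same size in both cases.

```sql
DECLARE @query_chars INT, @proc_chars INT, @clients INT, @calls_per_client INT;

SET @query_chars = 518;                        -- length of the ad hoc query, as stated in the text
SET @proc_chars = LEN('SP_fetchSomething');    -- 17 characters
SET @clients = 20;
SET @calls_per_client = 20;

SELECT
    @query_chars * @clients * @calls_per_client AS ad_hoc_chars_per_day,          -- 207200
    @proc_chars * @clients * @calls_per_client AS proc_chars_per_day,             -- 6800
    (@query_chars - @proc_chars) * @clients * @calls_per_client AS chars_saved;   -- 200400
```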

Now that you have seen the five major points I use to explain why I use stored procedures, I hope you will also choose to make intelligent use of this technology in your next database design.

SQL Server Date Formats CAST and CONVERT

Date Formats CAST and CONVERT (Transact-SQL)

Converts an expression of one data type to another.

Syntax for CAST:

CAST ( expression AS data_type [ (length ) ])

Syntax for CONVERT:

CONVERT ( data_type [ ( length ) ] , expression [ , style ] )
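A short sketch of both functions follows. It uses a fixed DATETIME value instead of GETDATE() so that the results are repeatable; the variable name @d and the column aliases are arbitrary. Note that only CONVERT accepts the optional style argument.

```sql
DECLARE @d DATETIME;
SET @d = '2006-04-28T12:32:29.253';    -- ISO 8601 literal, independent of language settings

SELECT
    CAST(123.45 AS INT)             AS cast_to_int,    -- 123 (fraction is truncated)
    CAST('20060428' AS DATETIME)    AS cast_to_date,   -- 2006-04-28 00:00:00.000
    CONVERT(VARCHAR(10), @d, 101)   AS usa_date,       -- 04/28/2006
    CONVERT(VARCHAR(10), @d, 103)   AS british_date,   -- 28/04/2006
    CONVERT(VARCHAR(8),  @d, 112)   AS iso_date,       -- 20060428
    CONVERT(VARCHAR(23), @d, 126)   AS iso8601;        -- 2006-04-28T12:32:29.253
```

The length given for VARCHAR matters: if it is shorter than the formatted string, the result is silently truncated.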

SQL Server Date Formats

Standard | Statement | Sample output
Default | SELECT CONVERT(VARCHAR(20), GETDATE(), 100) | Jan 1 2005 1:29PM
USA | SELECT CONVERT(VARCHAR(8), GETDATE(), 1) AS [MM/DD/YY] | 11/23/98
USA | SELECT CONVERT(VARCHAR(10), GETDATE(), 101) AS [MM/DD/YYYY] | 11/23/1998
ANSI | SELECT CONVERT(VARCHAR(8), GETDATE(), 2) AS [YY.MM.DD] | 72.01.01
ANSI | SELECT CONVERT(VARCHAR(10), GETDATE(), 102) AS [YYYY.MM.DD] | 1972.01.01
British/French | SELECT CONVERT(VARCHAR(8), GETDATE(), 3) AS [DD/MM/YY] | 19/02/72
British/French | SELECT CONVERT(VARCHAR(10), GETDATE(), 103) AS [DD/MM/YYYY] | 19/02/1972
German | SELECT CONVERT(VARCHAR(8), GETDATE(), 4) AS [DD.MM.YY] | 25.12.05
German | SELECT CONVERT(VARCHAR(10), GETDATE(), 104) AS [DD.MM.YYYY] | 25.12.2005
Italian | SELECT CONVERT(VARCHAR(8), GETDATE(), 5) AS [DD-MM-YY] | 24-01-98
Italian | SELECT CONVERT(VARCHAR(10), GETDATE(), 105) AS [DD-MM-YYYY] | 24-01-1998
- | SELECT CONVERT(VARCHAR(9), GETDATE(), 6) AS [DD MON YY] | 04 Jul 06
- | SELECT CONVERT(VARCHAR(11), GETDATE(), 106) AS [DD MON YYYY] | 04 Jul 2006
- | SELECT CONVERT(VARCHAR(10), GETDATE(), 7) AS [Mon DD, YY] | Jan 24, 98
- | SELECT CONVERT(VARCHAR(12), GETDATE(), 107) AS [Mon DD, YYYY] | Jan 24, 1998
- | SELECT CONVERT(VARCHAR(8), GETDATE(), 108) | 03:24:53
Default + milliseconds | SELECT CONVERT(VARCHAR(26), GETDATE(), 109) | Apr 28 2006 12:32:29:253PM
USA | SELECT CONVERT(VARCHAR(8), GETDATE(), 10) AS [MM-DD-YY] | 01-01-06
USA | SELECT CONVERT(VARCHAR(10), GETDATE(), 110) AS [MM-DD-YYYY] | 01-01-2006
- | SELECT CONVERT(VARCHAR(8), GETDATE(), 11) AS [YY/MM/DD] | 98/11/23
- | SELECT CONVERT(VARCHAR(10), GETDATE(), 111) AS [YYYY/MM/DD] | 1998/11/23
ISO | SELECT CONVERT(VARCHAR(6), GETDATE(), 12) AS [YYMMDD] | 980124
ISO | SELECT CONVERT(VARCHAR(8), GETDATE(), 112) AS [YYYYMMDD] | 19980124
Europe default + milliseconds | SELECT CONVERT(VARCHAR(24), GETDATE(), 113) | 28 Apr 2006 00:34:55:190
- | SELECT CONVERT(VARCHAR(12), GETDATE(), 114) AS [HH:MI:SS:MMM(24H)] | 11:34:23:013
ODBC Canonical | SELECT CONVERT(VARCHAR(19), GETDATE(), 120) | 1972-01-01 13:42:24
ODBC Canonical (with milliseconds) | SELECT CONVERT(VARCHAR(23), GETDATE(), 121) | 1972-02-19 06:35:24.489
ISO8601 | SELECT CONVERT(VARCHAR(23), GETDATE(), 126) | 1998-11-23T11:25:43.250
Kuwaiti | SELECT CONVERT(VARCHAR(26), GETDATE(), 130) | 28 Apr 2006 12:39:32:429AM
Kuwaiti | SELECT CONVERT(VARCHAR(25), GETDATE(), 131) | 28/04/2006 12:39:32:429AM

SQL Statement OR Syntax

SQL Quick Reference:

Each entry gives the SQL statement, followed by its syntax.

AND / OR
    SELECT column_name(s)
    FROM table_name
    WHERE condition
    AND|OR condition

ALTER TABLE
    ALTER TABLE table_name
    ADD column_name datatype
  or
    ALTER TABLE table_name
    DROP COLUMN column_name

AS (alias)
    SELECT column_name AS column_alias
    FROM table_name
  or
    SELECT column_name
    FROM table_name AS table_alias

BETWEEN
    SELECT column_name(s)
    FROM table_name
    WHERE column_name
    BETWEEN value1 AND value2

CREATE DATABASE
    CREATE DATABASE database_name

CREATE TABLE
    CREATE TABLE table_name
    (
    column_name1 data_type,
    column_name2 data_type,
    column_name3 data_type
    )

CREATE INDEX
    CREATE INDEX index_name
    ON table_name (column_name)
  or
    CREATE UNIQUE INDEX index_name
    ON table_name (column_name)

CREATE VIEW
    CREATE VIEW view_name AS
    SELECT column_name(s)
    FROM table_name
    WHERE condition

DELETE
    DELETE FROM table_name
    WHERE some_column=some_value
  or
    DELETE FROM table_name
    (Note: Deletes every row in the table!)
  or
    DELETE * FROM table_name
    (Note: MS Access syntax. Deletes every row in the table!)

DROP DATABASE
    DROP DATABASE database_name

DROP INDEX
    DROP INDEX table_name.index_name (SQL Server)
    DROP INDEX index_name ON table_name (MS Access)
    DROP INDEX index_name (DB2/Oracle)
    ALTER TABLE table_name
    DROP INDEX index_name (MySQL)

DROP TABLE
    DROP TABLE table_name

GROUP BY
    SELECT column_name, aggregate_function(column_name)
    FROM table_name
    WHERE column_name operator value
    GROUP BY column_name

HAVING
    SELECT column_name, aggregate_function(column_name)
    FROM table_name
    WHERE column_name operator value
    GROUP BY column_name
    HAVING aggregate_function(column_name) operator value

IN
    SELECT column_name(s)
    FROM table_name
    WHERE column_name
    IN (value1, value2, ...)

INSERT INTO
    INSERT INTO table_name
    VALUES (value1, value2, value3, ...)
  or
    INSERT INTO table_name
    (column1, column2, column3, ...)
    VALUES (value1, value2, value3, ...)

INNER JOIN
    SELECT column_name(s)
    FROM table_name1
    INNER JOIN table_name2
    ON table_name1.column_name=table_name2.column_name

LEFT JOIN
    SELECT column_name(s)
    FROM table_name1
    LEFT JOIN table_name2
    ON table_name1.column_name=table_name2.column_name

RIGHT JOIN
    SELECT column_name(s)
    FROM table_name1
    RIGHT JOIN table_name2
    ON table_name1.column_name=table_name2.column_name

FULL JOIN
    SELECT column_name(s)
    FROM table_name1
    FULL JOIN table_name2
    ON table_name1.column_name=table_name2.column_name

LIKE
    SELECT column_name(s)
    FROM table_name
    WHERE column_name LIKE pattern

ORDER BY
    SELECT column_name(s)
    FROM table_name
    ORDER BY column_name [ASC|DESC]

SELECT
    SELECT column_name(s)
    FROM table_name

SELECT *
    SELECT *
    FROM table_name

SELECT DISTINCT
    SELECT DISTINCT column_name(s)
    FROM table_name

SELECT INTO
    SELECT *
    INTO new_table_name [IN externaldatabase]
    FROM old_table_name
  or
    SELECT column_name(s)
    INTO new_table_name [IN externaldatabase]
    FROM old_table_name

SELECT TOP
    SELECT TOP number|percent column_name(s)
    FROM table_name

TRUNCATE TABLE
    TRUNCATE TABLE table_name

UNION
    SELECT column_name(s) FROM table_name1
    UNION
    SELECT column_name(s) FROM table_name2

UNION ALL
    SELECT column_name(s) FROM table_name1
    UNION ALL
    SELECT column_name(s) FROM table_name2

UPDATE
    UPDATE table_name
    SET column1=value, column2=value, ...
    WHERE some_column=some_value

WHERE
    SELECT column_name(s)
    FROM table_name
    WHERE column_name operator value
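
To show how several of these statements fit together, here is a small worked sketch. The Customers and Orders tables and all the data in them are made up for illustration.

```sql
-- Hypothetical tables and data, for illustration only.
CREATE TABLE Customers
(
    CustomerID INT PRIMARY KEY,
    Name       VARCHAR(50),
    City       VARCHAR(50)
);

CREATE TABLE Orders
(
    OrderID    INT PRIMARY KEY,
    CustomerID INT,
    Amount     DECIMAL(10, 2)
);

INSERT INTO Customers (CustomerID, Name, City) VALUES (1, 'Asha', 'Pune');
INSERT INTO Customers (CustomerID, Name, City) VALUES (2, 'Ben', 'Leeds');

INSERT INTO Orders (OrderID, CustomerID, Amount) VALUES (10, 1, 250.00);
INSERT INTO Orders (OrderID, CustomerID, Amount) VALUES (11, 1, 100.00);
INSERT INTO Orders (OrderID, CustomerID, Amount) VALUES (12, 2, 40.00);

-- UPDATE with a WHERE clause changes only the matching row.
UPDATE Customers SET City = 'Mumbai' WHERE CustomerID = 1;

-- INNER JOIN, WHERE ... BETWEEN, GROUP BY, HAVING, and ORDER BY in one query.
SELECT c.Name, SUM(o.Amount) AS TotalAmount
FROM Customers AS c
INNER JOIN Orders AS o ON c.CustomerID = o.CustomerID
WHERE o.Amount BETWEEN 50 AND 500
GROUP BY c.Name
HAVING SUM(o.Amount) > 100
ORDER BY TotalAmount DESC;
```

With this data the query returns a single row, Asha with a total of 350.00: Ben's only order (40.00) is filtered out by the WHERE clause before the rows are grouped.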

SQL Server 2005 Failover Clustering White Paper

Brief Description:
Comprehensive document about implementing failover clustering for SQL Server 2005 and Analysis Services.

Overview:
This white paper is intended for a technical audience and not technical decision makers. It complements the existing documentation around planning, implementing, and administering of a failover cluster that can be found in Microsoft SQL Server 2005 Books Online. To ease the upgrade process for existing users of failover clustering, this white paper also points out differences in the failover clustering implementation of SQL Server 2005 compared to SQL Server 2000.

System Requirements:

  • Supported Operating Systems: Windows Server 2003 Service Pack 1; Windows XP Service Pack 2
  • Microsoft Word

Roles of DBA

A database administrator (DBA) is a person who is responsible for the environmental aspects of a database. In general, these include:

  • Recoverability – Creating and testing backups
  • Integrity – Verifying or helping to verify data integrity
  • Security – Defining and/or implementing access controls to the data
  • Availability – Ensuring maximum uptime
  • Performance – Ensuring maximum performance
  • Development and testing support – Helping programmers and engineers to efficiently utilize the database.

The role of a database administrator has changed according to the technology of database management systems (DBMSs) as well as the needs of the owners of the databases. For example, although logical and physical database design are traditionally the duties of a database analyst or database designer, a DBA may be tasked to perform those duties.


The Database Administrator is responsible for designing, developing and implementing programs, as required, to support the technical capabilities.

  • Develop new or maintain existing databases based on specifications
  • Develop, implement and maintain unit tests of database programs (e.g., SQL)
  • Share knowledge by effectively documenting work
  • Respond quickly and effectively to production & development issues and take responsibility for seeing those issues through to resolution.
  • Resolve database performance issues, database capacity issues, replication, and other distributed data issues.
  • Design & implement data models and database designs into physical databases.
  • Install and maintain database software.
  • Manage backup and recovery of databases.
  • Manage security of database structures and corporate data held within databases.
  • Develop database procedures, triggers and SQL scripts for development teams.
  • Manage database changes across Development, QA, and Production.
  • Assist in the definition and implementation of database standards.
  • Monitor databases for errors and perform problem determination when necessary.
  • Design and implement highly available production systems.

Qualifications:

  • Bachelor's degree in a technical discipline, or equivalent professional experience
  • Experience writing complex SQL, triggers, and procedures
  • Ability to work with minimal direction, yet also able to work in a team environment.
  • Relational database analysis and modeling experience.
  • Experience configuring database network connectivity.
  • Understanding of database backup and recovery techniques.
  • Experience in a fast-paced production or operational system environment
  • Clear and effective written and verbal communication skills
  • Hands-on approach and a strong sense of ownership.

Easy step to install SQL Server 2000

Installing SQL Server 2000

To Install SQL Server 2000 Basic Local Installation

  1. Insert the Microsoft SQL Server 2000 compact disc in your CD-ROM drive (if the compact disc does not run automatically, double-click Autorun.exe in the root directory of the compact disc), select SQL Server 2000 Components, and then select Install Database Server. Setup prepares the SQL Server Installation Wizard. At the Welcome page, click Next.
  2. In the Computer Name dialog box, Local Computer is the default option, and the local computer name appears in the text box. Click Next.
  3. In the Installation Selection dialog box, click Create a new instance of SQL Server, or install Client Tools, and then click Next. Follow directions on the User Information, Software License Agreement and related pages. In the Installation Definition dialog box, click Server and Client Tools, and then click Next.
  4. In the Instance Name dialog box, if the Default check box is available, you can install either the default or a named instance. If the Default check box is not available, a default instance has already been installed, and you can install only a named instance.
    • To install the default instance, click to select the Default check box, and then click Next.
    • To install a named instance, click to clear the Default check box, type a new named instance in the Instance Name box, and then click Next.
  5. In the Setup Type dialog box, click Typical or Minimum, and then click Next.
  6. In the Service Accounts dialog box, accept the default settings, type your domain password, and then click Next. In the Authentication Mode dialog box, accept the default setting, and then click Next. When you finish specifying options, click Next in the Start Copying Files dialog box.
  7. In the Choose Licensing Mode dialog box, make selections according to your license agreement, and then click Continue to begin the installation. In the Setup Complete dialog box, click Yes, I want to restart my computer now, and then click Finish.

To Install Client Tools Only for SQL Server 2000

  1. Insert the Microsoft SQL Server 2000 compact disc in your CD-ROM drive (if the compact disc does not run automatically, double-click Autorun.exe in the root directory of the compact disc), select SQL Server 2000 Components, select Install Database Server, and then click Next at the Welcome page of the SQL Server Installation Wizard.
  2. In the Computer Name dialog box, Local Computer is the default option, and the local computer name appears in the edit box. Click Next.
  3. In the Installation Selection dialog box, click Create a new instance of SQL Server, or install Client Tools, and then click Next.
  4. Follow the directions on the User Information, Software License Agreement, and related pages.
  5. In the Installation Definition dialog box, click Client tools only, and then click Next.
  6. In the Select Components dialog box, accept the defaults or select the components you want, and then click Next. You can select an item in the Components list, such as Management Tools, and then select items from the related Sub-Components list, such as Enterprise Manager. Click to select items that you want to install, and click to clear the check box for the items you do not want to install. For information about each component, select the item, and view the Description box.
  7. In the Start Copying Files dialog box, click Next to complete the installation of the client tools.

To Install Connectivity Only for SQL Server 2000

  1. Insert the Microsoft SQL Server 2000 compact disc into your CD-ROM drive (if the compact disc does not run automatically, double-click Autorun.exe in the root directory of the compact disc), and then select SQL Server 2000 Components.
  2. Select Install Database Server. Setup prepares the SQL Server Installation Wizard. At the Welcome page, click Next.
  3. In the Computer Name dialog box, Local Computer is the default option, and the local computer name appears in the text box. Click Next.
  4. In the Installation Selection dialog box, click Create a new instance of SQL Server, or install Client Tools, and then click Next.
  5. Follow the directions on the User Information, Software License Agreement and related pages.
  6. In the Installation Definition dialog box, click Connectivity Only, and then click Next.
  7. In the Start Copying Files dialog box, click Next to complete the installation.

For more information, see “Basic Installation Options” in Microsoft SQL Server Books Online. For additional information, see the following article in the Microsoft Knowledge Base:
    257716 (http://support.microsoft.com/kb/257716/EN-US/ ) Frequently Asked Questions – SQL Server 2000 – Setup

Introduction to SQL

SQL is a standard language for accessing and manipulating databases.

What is SQL?

  • SQL stands for Structured Query Language
  • SQL lets you access and manipulate databases
  • SQL is an ANSI (American National Standards Institute) standard

What Can SQL do?

  • SQL can execute queries against a database
  • SQL can retrieve data from a database
  • SQL can insert records in a database
  • SQL can update records in a database
  • SQL can delete records from a database
  • SQL can create new databases
  • SQL can create new tables in a database
  • SQL can create stored procedures in a database
  • SQL can create views in a database
  • SQL can set permissions on tables, procedures, and views

SQL is a Standard – BUT….
Although SQL is an ANSI (American National Standards Institute) standard, there are many different versions of the SQL language.

However, to be compliant with the ANSI standard, they all support at least the major commands (such as SELECT, UPDATE, DELETE, INSERT, WHERE) in a similar manner.

Note: Most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard!

Using SQL in Your Web Site

To build a web site that shows some data from a database, you will need the following:

  • An RDBMS database program (e.g., MS Access, SQL Server, MySQL)
  • A server-side scripting language, like PHP or ASP
  • SQL
  • HTML / CSS

RDBMS

RDBMS stands for Relational Database Management System.

RDBMS is the basis for SQL, and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.

The data in RDBMS is stored in database objects called tables.
A table is a collection of related data entries and it consists of columns and rows.