Find Table Locked Session

Sometimes a table is locked by another session, and you want to know which session holds the lock and why you cannot access the table.

This usually means a BEGIN TRANSACTION has been issued and an INSERT/UPDATE/DELETE has run, but the COMMIT TRANSACTION has not yet been executed.

You can reproduce the situation by running the query below:


BEGIN TRANSACTION
DELETE FROM Table1

The script below finds which session is locking your table:

SELECT
    OBJECT_NAME(P.object_id) AS TableName,
    L.resource_type,
    L.request_session_id
FROM sys.dm_tran_locks L
JOIN sys.partitions P
    ON L.resource_associated_entity_id = P.hobt_id
WHERE OBJECT_NAME(P.object_id) = 'Table1'

To release the lock, you can kill that session (using the request_session_id returned above):

KILL 54
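
Here 54 stands for the request_session_id returned by the query above. Before killing a session, it is worth checking what it was executing; a quick check using the standard DBCC INPUTBUFFER command:

-- Shows the last statement sent by session 54
DBCC INPUTBUFFER(54)

Note that KILL rolls back the session's open transaction, which can itself take time for a large transaction.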


If you want to know in detail which session is blocking, what it is running, and for how long, the script below is useful:


SELECT
blocking_session_id AS BlockingSessionID,
session_id AS VictimSessionID,
(SELECT [text] FROM sys.sysprocesses
CROSS APPLY sys.dm_exec_sql_text([sql_handle])
WHERE spid = blocking_session_id) AS BlockingQuery,
[text] AS VictimQuery,
wait_time/1000 AS WaitDurationSecond,
wait_type AS WaitType,
percent_complete AS BlockingQueryCompletePercent
FROM sys.dm_exec_requests
CROSS APPLY sys.dm_exec_sql_text([sql_handle])
WHERE blocking_session_id > 0

How to pass parameters dynamically

How to pass parameters dynamically?
MS SQL Server offers many ways to call a procedure, from other applications and from SQL itself; see the examples.
Sometimes you need an optional parameter in a procedure; assign NULL as its default so the caller is not forced to supply it.

CREATE PROCEDURE dbo.TESTPROCEDURE
@param1 VARCHAR(32) = NULL,
@param2 INT = NULL
AS
BEGIN
SET NOCOUNT ON

SELECT Param1 = COALESCE
(
@param1,
'@param1 was Blank'
)

SELECT Param2 = COALESCE
(
CONVERT(VARCHAR(12), @param2), -- RTRIM(int) is invalid; convert the INT to a string instead
'@param2 was Blank'
)
END
GO

EXEC dbo.TESTPROCEDURE @param1='HELLO', @param2=1
EXEC dbo.TESTPROCEDURE @param1='HELLO'
EXEC dbo.TESTPROCEDURE @param2=1
EXEC dbo.TESTPROCEDURE 'HELLO',1
EXEC dbo.TESTPROCEDURE 'HELLO'

As the calls above show, there are many ways to pass parameters to a procedure and get its results; choose whichever suits you.
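
If the call itself must be built at run time (for example, when the statement text is assembled dynamically), sp_executesql accepts a parameterized statement. A minimal sketch using the procedure defined above:

DECLARE @sql NVARCHAR(200), @p1 VARCHAR(32)
SET @sql = N'EXEC dbo.TESTPROCEDURE @param1 = @p1'
SET @p1 = 'HELLO'
-- The second argument declares the parameters used inside @sql
EXEC sp_executesql @sql, N'@p1 VARCHAR(32)', @p1 = @p1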

Format money/decimal with commas

SQL format money/decimal with commas:

Sometimes you want to have your money fields properly formatted with commas like this: 13,243,543.57
You can use the CONVERT function with a style value of 0, 1, or 2; the output format depends on the style you choose.

Below is an example:

DECLARE @v MONEY
SELECT @v = 1322323.6666

SELECT CONVERT(VARCHAR,@v,0)  --1322323.67
Rounded but no formatting

SELECT CONVERT(VARCHAR,@v,1)    --1,322,323.67
Formatted with commas

SELECT CONVERT(VARCHAR,@v,2)    --1322323.6666
No formatting

These styles apply only to the MONEY data type; with a DECIMAL field, CONVERT will not add the commas.
The workaround is to convert the value to MONEY first:

DECLARE @v2 DECIMAL (36,10)
SELECT @v2 = 13243543.56565656

SELECT CONVERT(VARCHAR,CONVERT(MONEY,@v2),1) --13,243,543.57
Formatted with commas


Difference between (nolock) and WITH (nolock) as a SQL Table Hint

Difference between (nolock) and WITH (nolock) as a SQL Table Hint:

select sum(amount) from reward_expired (nolock)

The question is the difference between writing the table hint as (nolock) and as WITH (nolock). MSDN says, for SQL Server 2005:

“In SQL Server 2005, with some exceptions, table hints are supported in the FROM clause only when the hints are specified with the WITH keyword. Table hints also must be specified with parentheses.

The following table hints are allowed with and without the WITH keyword: NOLOCK, READUNCOMMITTED, UPDLOCK, REPEATABLEREAD, SERIALIZABLE, READCOMMITTED, FASTFIRSTROW, TABLOCK, TABLOCKX, PAGLOCK, ROWLOCK, NOWAIT, READPAST, XLOCK, and NOEXPAND. When these table hints are specified without the WITH keyword, the hints should be specified alone.”

Microsoft plans to remove the ability to use just (nolock); to future-proof your SQL, write all table hints using the WITH keyword.

Omitting the WITH keyword is a deprecated feature: This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.
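
In practice the change is mechanical. The same query from above, in the deprecated and the supported form:

-- Deprecated: table hint without WITH
SELECT SUM(amount) FROM reward_expired (NOLOCK)

-- Supported: always specify table hints with WITH
SELECT SUM(amount) FROM reward_expired WITH (NOLOCK)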

HOW TO USE ROLLUP and CUBE

HOW TO USE CUBE:

The CUBE operator generates a result set that is a multidimensional cube. A multidimensional cube is an expansion of fact data, or data that records individual events. The expansion is based on columns that the user wants to analyze. These columns are called dimensions. The cube is a result set containing a cross tabulation of all the possible combinations of the dimensions.

The CUBE operator is specified in the GROUP BY clause of a SELECT statement. The select list contains the dimension columns and aggregate function expressions. The GROUP BY specifies the dimension columns and the keywords WITH CUBE. The result set contains all possible combinations of the values in the dimension columns, along with the aggregate values from the underlying rows that match that combination of dimension values.

For example, a simple table Inventory contains:

Item                 Color                Quantity
-------------------- -------------------- --------------------------
Table                Blue                 124
Table                Red                  223
Chair                Blue                 101
Chair                Red                  210

This query returns a result set that contains the Quantity subtotal for all possible combinations of Item and Color:

SELECT Item, Color, SUM(Quantity) AS QtySum
FROM Inventory
GROUP BY Item, Color WITH CUBE

Here is the result set:

Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                Blue                 101.00
Chair                Red                  210.00
Chair                (null)               311.00
Table                Blue                 124.00
Table                Red                  223.00
Table                (null)               347.00
(null)               (null)               658.00
(null)               Blue                 225.00
(null)               Red                  433.00

The following rows from the result set are of special interest:

Chair (null) 311.00

This row reports a subtotal for all rows having the value Chair in the Item dimension. The value NULL is returned for the Color dimension to show that the aggregate reported by the row includes rows with any value of the Color dimension.

Table (null) 347.00

This row is similar, but reports the subtotal for all rows having Table in the Item dimension.

(null) (null) 658.00

This row reports the grand total for the cube. Both the Item and Color dimensions have the value NULL showing that all values of both dimensions are summarized in the row.

(null) Blue 225.00
(null) Red 433.00

These two rows report the subtotals for the Color dimension. Both have NULL in the Item dimension to show that the aggregate data came from rows having any value for the Item dimension.

Using GROUPING to Distinguish Null Values
The null values generated by the CUBE operation present a problem: How can a NULL generated by the CUBE operation be distinguished from a NULL returned in the actual data? This is achieved using the GROUPING function. The GROUPING function returns 0, if the column value came from the fact data, and 1 if the column value is a NULL generated by the CUBE operation. In a CUBE operation, a generated NULL represents all values. The SELECT statement can be written to use the GROUPING function to substitute the string ALL in place of any generated NULL. Because a NULL from the fact data indicates the data value is unknown, the SELECT can also be coded to return the string UNKNOWN in place of any NULL from the fact data. For example:

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'
ELSE ISNULL(Item, 'UNKNOWN')
END AS Item,
CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'
ELSE ISNULL(Color, 'UNKNOWN')
END AS Color,
SUM(Quantity) AS QtySum
FROM Inventory
GROUP BY Item, Color WITH CUBE

Multidimensional Cubes
The CUBE operator can be used to generate n-dimensional cubes, or cubes with any number of dimensions. A single dimension cube can be used to generate a total, for example:

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'
ELSE ISNULL(Item, 'UNKNOWN')
END AS Item,
SUM(Quantity) AS QtySum
FROM Inventory
GROUP BY Item WITH CUBE
GO

This SELECT statement returns a result set showing both the subtotals for each value of Item and the grand total for all values of Item:

Item                 QtySum
-------------------- --------------------------
Chair                311.00
Table                347.00
ALL                  658.00

SELECT statements that contain a CUBE with many dimensions can generate large result sets, because these statements generate rows for all combinations of the values in all the dimensions. These large result sets may contain too much data to be easily read and understood. One solution to this problem is to put the SELECT statement into a view:

CREATE VIEW InvCube AS
SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'
ELSE ISNULL(Item, 'UNKNOWN')
END AS Item,
CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'
ELSE ISNULL(Color, 'UNKNOWN')
END AS Color,
SUM(Quantity) AS QtySum
FROM Inventory
GROUP BY Item, Color WITH CUBE

The view can then be used to query only the dimension values of interest:

SELECT *
FROM InvCube
WHERE Item = 'Chair'
AND Color = 'ALL'

Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                ALL                  311.00

HOW TO USE ROLLUP:

The ROLLUP operator is useful in generating reports that contain subtotals and totals. The ROLLUP operator generates a result set that is similar to the result sets generated by the CUBE operator. For more information, see Summarizing Data Using CUBE.

The differences between CUBE and ROLLUP are:

CUBE generates a result set showing aggregates for all combinations of values in the selected columns.

ROLLUP generates a result set showing aggregates for a hierarchy of values in the selected columns.
For example, a simple table Inventory contains:

Item                 Color                Quantity
-------------------- -------------------- --------------------------
Table                Blue                 124
Table                Red                  223
Chair                Blue                 101
Chair                Red                  210

This query generates a subtotal report:

SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL'
ELSE ISNULL(Item, 'UNKNOWN')
END AS Item,
CASE WHEN (GROUPING(Color) = 1) THEN 'ALL'
ELSE ISNULL(Color, 'UNKNOWN')
END AS Color,
SUM(Quantity) AS QtySum
FROM Inventory
GROUP BY Item, Color WITH ROLLUP

Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                Blue                 101.00
Chair                Red                  210.00
Chair                ALL                  311.00
Table                Blue                 124.00
Table                Red                  223.00
Table                ALL                  347.00
ALL                  ALL                  658.00
(7 row(s) affected)

If the ROLLUP keyword in the query is changed to CUBE, the CUBE result set is the same, except these two additional rows are returned at the end:

ALL Blue 225.00
ALL Red 433.00

The CUBE operation generated rows for possible combinations of values from both Item and Color. For example, not only does CUBE report all possible combinations of Color values combined with the Item value Chair (Red, Blue, and Red + Blue), it also reports all possible combinations of Item values combined with the Color value Red (Chair, Table, and Chair + Table).

For each value in the columns on the right in the GROUP BY clause, the ROLLUP operation does not report all possible combinations of values from the column (or columns) on the left. For example, ROLLUP does not report all the possible combinations of Item values for each Color value.

The result set of a ROLLUP operation has functionality similar to that returned by a COMPUTE BY; however, ROLLUP has these advantages:

ROLLUP returns a single result set; COMPUTE BY returns multiple result sets that increase the complexity of application code.

ROLLUP can be used in a server cursor; COMPUTE BY cannot.

The query optimizer can sometimes generate more efficient execution plans for ROLLUP than it can for COMPUTE BY.

SQL running serial number generation tips

SQL running serial number generation tips:

Traditionally, developers and database administrators used temporary tables and correlated subqueries to generate calculated row numbers in a query. SQL Server 2005 provides the ROW_NUMBER() function, which replaces all of those extra resources.

ROW_NUMBER() returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.

Example:

SELECT ROW_NUMBER() OVER (ORDER BY CLM1) AS RowNum, * FROM Table1

Remark:

The ORDER BY clause determines the sequence in which the rows are assigned their unique
ROW_NUMBER within a specified partition.
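
To restart the numbering for each group of rows, add a PARTITION BY clause. A sketch against the same hypothetical Table1, assuming a second column CLM2 to group by:

SELECT ROW_NUMBER() OVER (PARTITION BY CLM2 ORDER BY CLM1) AS RowNum, *
FROM Table1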

Enable remote connection SQL Server

Error Description:

An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server)

Enable remote connections for SQL Server 2005 Express or SQL Server 2005 Developer Edition

You must enable remote connections for each instance of SQL Server 2005 that you want to connect to from a remote computer. To do this, follow these steps:

  1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Surface Area Configuration.
  2. On the SQL Server 2005 Surface Area Configuration page, click Surface Area Configuration for Services and Connections.
  3. On the Surface Area Configuration for Services and Connections page, expand Database Engine, click Remote Connections, click Local and remote connections, click the appropriate protocol to enable for your environment, and then click Apply. Note: Click OK when you receive the following message:
    Changes to Connection Settings will not take effect until you restart the Database Engine service.
  4. On the Surface Area Configuration for Services and Connections page, expand Database Engine, click Service, click Stop, wait until the MSSQLSERVER service stops, and then click Start to restart the MSSQLSERVER service.

Enable the SQL Server Browser service:

  1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Surface Area Configuration.
  2. On the SQL Server 2005 Surface Area Configuration page, click Surface Area Configuration for Services and Connections.
  3. On the Surface Area Configuration for Services and Connections page, click SQL Server Browser, click Automatic for Startup type, and then click Apply. Note: When you click the Automatic option, the SQL Server Browser service starts automatically every time that you start Microsoft Windows.
  4. Click Start, and then click OK.

Create exceptions in Windows Firewall:

  1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Configuration Manager.
  2. In SQL Server Configuration Manager, click the SQL Server Browser service in the right pane, right-click the instance name in the main window, and then click Properties.
  3. On the SQL Server Browser Properties page, click the Advanced tab, locate the instance ID in the property list, and then click OK.

To open Windows Firewall, click Start, click Run, type firewall.cpl, and then click OK

Create an exception for SQL Server 2005 in Windows Firewall

To create an exception for SQL Server 2005 in Windows Firewall, follow these steps:

  1. In Windows Firewall, click the Exceptions tab, and then click Add Program.
  2. In the Add a Program window, click Browse.
  3. Click the C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\sqlservr.exe executable program, click Open, and then click OK. Note: The path may be different depending on where SQL Server 2005 is installed. MSSQL.1 is a placeholder for the instance ID that you obtained in step 3 of the previous procedure.
  4. Repeat steps 1 through 3 for each instance of SQL Server 2005 that needs an exception.

Create an exception for the SQL Server Browser service in Windows Firewall

To create an exception for the SQL Server Browser service in Windows Firewall, follow these steps:

  1. In Windows Firewall, click the Exceptions tab, and then click Add Program.
  2. In the Add a Program window, click Browse.
  3. Click the C:\Program Files\Microsoft SQL Server\90\Shared\sqlbrowser.exe executable program, click Open, and then click OK.

Case sensitive comparisons in SQL

Case sensitive comparisons in SQL Server

Sometimes we need to compare data case-sensitively. One option is to compare the values as binary; see the examples:

SELECT *  FROM TABLE1 WITH (NOLOCK) WHERE convert(varbinary(255), PASS) = convert(varbinary(255), 'RAGHU')

Combining a binary comparison with an ordinary comparison:

SELECT *  FROM TABLE1 WITH (NOLOCK) WHERE convert(varbinary(255), PASS) = convert(varbinary(255), 'Raghu') AND PASS= 'TEST'

Use the BINARY_CHECKSUM function:

SELECT *  FROM TABLE1 WITH (NOLOCK) WHERE BINARY_CHECKSUM(answer) = BINARY_CHECKSUM('TEST')

Use a case-sensitive SQL Server collation:

SELECT *  FROM TABLE1 WITH (NOLOCK) WHERE PASSWORD COLLATE SQL_Latin1_General_CP1_CS_AS = 'RAGHU' COLLATE SQL_Latin1_General_CP1_CS_AS

Sub Query Update

Sub Query Update:
If you want to update rows based on a join or a comparison with another table, here are two examples.

UPDATE Table1
SET Price = Price * 2
WHERE ProductID IN
    (SELECT ID
     FROM Table2
     WHERE ID= 51);
GO

Here is an equivalent UPDATE statement using a join:

UPDATE t1
SET Price = t1.Price * 2
FROM Table1 AS t1
INNER JOIN Table2 AS t2
    ON t1.ProductID = t2.ProductID AND t2.ID = 51;
GO

BEGIN TRANSACTION (Transact-SQL)

BEGIN TRANSACTION represents a point at which the data referenced by a connection is logically and physically consistent. If errors are encountered, all data modifications made after the BEGIN TRANSACTION can be rolled back to return the data to this known state of consistency. Each transaction lasts until either it completes without errors and COMMIT TRANSACTION is issued to make the modifications a permanent part of the database, or errors are encountered and all modifications are erased with a ROLLBACK TRANSACTION statement.

Syntax:

BEGIN { TRAN | TRANSACTION }
    [ { transaction_name | @tran_name_variable }
      [ WITH MARK [ 'description' ] ]
    ]
[ ; ]

Example:

Here is how you can catch errors in T-SQL using @@ERROR, based on my experience:

-- Exec spa_begintrnExample 'a'

Create Proc spa_begintrnExample
@flag char(1)

AS
if @flag='a'
begin

BEGIN TRANSACTION

Update table1 set clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback

Update table2 set clm12='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback

Delete from table3 where clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback

Delete from table4 where clm1='test'
IF (@@ERROR <> 0) GOTO QuitWithRollback

COMMIT TRANSACTION

GOTO EndSave

QuitWithRollback:
  IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:

end
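
On SQL Server 2005 and later, TRY...CATCH is usually cleaner than checking @@ERROR after every statement. A sketch of the same logic rewritten that way, using the same hypothetical tables:

CREATE PROC spa_begintrnExample_TryCatch
@flag char(1)
AS
IF @flag = 'a'
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION

        UPDATE table1 SET clm1 = 'test'
        UPDATE table2 SET clm12 = 'test'
        DELETE FROM table3 WHERE clm1 = 'test'
        DELETE FROM table4 WHERE clm1 = 'test'

        COMMIT TRANSACTION
    END TRY
    BEGIN CATCH
        -- Any error in the TRY block jumps here; roll back whatever is open
        IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
    END CATCH
END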

Remarks:

The local transaction started by the BEGIN TRANSACTION statement is escalated to a distributed transaction if the following actions are performed before the statement is committed or rolled back:

  • An INSERT, DELETE, or UPDATE statement that references a remote table on a linked server is executed. The INSERT, UPDATE, or DELETE statement fails if the OLE DB provider used to access the linked server does not support the ITransactionJoin interface.
  • A call is made to a remote stored procedure when the REMOTE_PROC_TRANSACTIONS option is set to ON.

SQL RAND Function

SQL RAND Function

The SQL RAND() function generates a pseudorandom float value between 0 and 1 at run time. Here is the syntax:

Syntax

RAND ( [ seed ] )

Repetitive calls of RAND() with the same seed value return the same results.

For one connection, if RAND() is called with a specified seed value, all subsequent calls of RAND() produce results based on the seeded RAND() call. For example, the following query will always return the same sequence of numbers.

SELECT RAND(100), RAND(), RAND()

Examples:

The following example produces four different random numbers that are generated by the RAND function.

DECLARE @counter smallint
SET @counter = 1
WHILE @counter < 5
BEGIN
SELECT RAND() Random_Number
SET @counter = @counter + 1
END
GO

The following example returns random float numbers based on three different seed values.

CREATE TABLE Random (Seed1 float, Seed5 float, Seed10 float)
INSERT INTO Random Values (RAND(1), RAND(5), RAND(10))
SELECT * FROM Random

The RAND function is a pseudorandom number generator that operates in a manner similar to the C run-time library rand function. If no seed is provided, the system generates its own variable seed numbers. If you call RAND with a seed value, you must use variable seed values to generate random numbers. If you call RAND multiple times with the same seed value, it returns the same generated value. The following script returns the same value for the calls to RAND because they all use the same seed value:

SELECT RAND(159784)
SELECT RAND(159784)
SELECT RAND(159784)

A common way to generate random numbers from RAND is to include something relatively variable as the seed value, such as adding several parts of a GETDATE:

SELECT RAND( (DATEPART(mm, GETDATE()) * 100000 )
+ (DATEPART(ss, GETDATE()) * 1000 )
+ DATEPART(ms, GETDATE()) )

When you use an algorithm based on GETDATE to generate seed values, RAND can still generate duplicate values if the calls to RAND are made within the interval of the smallest datepart used in the algorithm. This is especially likely when the calls to RAND are included in a single batch. Multiple calls to RAND in a single batch can be executed within the same millisecond. This is the smallest increment of DATEPART. In this case, incorporate a value based on something other than time to generate the seed values.
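
Another common approach, which avoids the duplicate-seed problem within a single batch, is to derive the seed from NEWID(), since each call produces a unique value:

-- CHECKSUM(NEWID()) yields a different int seed on every call, even within one batch
SELECT RAND(CHECKSUM(NEWID()))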

How to use Bulk Insert

Introduction Bulk Insert:

The Bulk Insert task provides the quickest way to copy large amounts of data into a SQL Server table or view. For example, suppose your company stores its million-row product list on a mainframe system, but the company’s e-commerce system uses SQL Server 2005 to populate Web pages. You must update the SQL Server product table nightly with the master product list from the mainframe. To update the table, you save the product list in a tab-delimited format and use the Bulk Insert task to copy the data directly into the SQL Server table.

To ensure high-speed data copying, transformations cannot be performed on the data while it is moving from the source file to the table or view.

You can configure the Bulk Insert task in the following ways:

  • Specify the OLE DB connection manager to connect to the destination SQL Server database and the table or view into which data is inserted.
  • Specify the File or Flat File connection manager to access the source file and provide information about the source data file, such as the code page and file type.
  • Define the format used by the Bulk Insert task, either by using a format file or by defining the column and row delimiters of the source data. If using a format file, specify the File connection manager to access the format file.
  • Specify actions to perform on the destination table or view when inserting the data. The options include whether to check constraints, enable identity inserts, keep nulls, fire triggers, or lock the table.
  • Provide information about the batch of data to insert, such as the batch size, the first and last row from the file to insert, the number of insert errors that can occur before the task stops inserting rows, and the names of the columns that will be sorted.

If the Bulk Insert task uses a Flat File connection manager to access the source file, the task does not use the format specified in the Flat File connection manager. Instead, the Bulk Insert task uses either the format specified in a format file, or the values of the RowDelimiter and ColumnDelimiter properties of the task.

BULK INSERT (Transact-SQL)

BULK INSERT
   [ database_name . [ schema_name ] . | schema_name . ] [ table_name | view_name ]
      FROM 'data_file'
     [ WITH
    (
   [ [ , ] BATCHSIZE = batch_size ]
   [ [ , ] CHECK_CONSTRAINTS ]
   [ [ , ] CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } ]
   [ [ , ] DATAFILETYPE  =
      { 'char' | 'native'| 'widechar' | 'widenative' } ]
   [ [ , ] FIELDTERMINATOR = 'field_terminator' ]
   [ [ , ] FIRSTROW  =first_row ]
   [ [ , ] FIRE_TRIGGERS ]
   [ [ , ] FORMATFILE = 'format_file_path' ]
   [ [ , ] KEEPIDENTITY ]
   [ [ , ] KEEPNULLS ]
   [ [ , ] KILOBYTES_PER_BATCH =kilobytes_per_batch ]
   [ [ , ] LASTROW = last_row ]
   [ [ , ] MAXERRORS = max_errors ]
   [ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
   [ [ , ] ROWS_PER_BATCH = rows_per_batch ]
   [ [ , ] ROWTERMINATOR = 'row_terminator' ]
   [ [ , ] TABLOCK ]
   [ [ , ] ERRORFILE = 'file_name' ]
    )]

Examples

A. Using pipes to import data from a file:

This example imports order detail information into the AdventureWorks.Sales.SalesOrderDetail table from the specified data file by using a pipe (|) as the field terminator and |\n as the row terminator.

BULK INSERT AdventureWorks.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.txt'
WITH
(
   FIELDTERMINATOR = ' |',
   ROWTERMINATOR = ' |\n'
)

B. Using the FIRE_TRIGGERS argument

This example specifies the FIRE_TRIGGERS argument.

BULK INSERT AdventureWorks.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.txt'
WITH
(
   FIELDTERMINATOR = ' |',
   ROWTERMINATOR = ':\n',
   FIRE_TRIGGERS
)

C. Using line feed as a row terminator

This example loads a file that uses the line feed as a row terminator such as a UNIX output:

DECLARE @bulk_cmd varchar(1000)  
SET @bulk_cmd = 'BULK INSERT AdventureWorks.Sales.SalesOrderDetail  
FROM ''<drive>:\<path>\<filename>''
WITH (ROWTERMINATOR = '''+CHAR(10)+''')'  
EXEC(@bulk_cmd)

What is BCP Utility in SQL server?

Introduction BCP Utility:

The bcp utility bulk copies data between an instance of Microsoft SQL Server 2005 and a data file in a user-specified format. The bcp utility can be used to import large numbers of new rows into SQL Server tables or to export data out of tables into data files. Except when used with the queryout option, the utility requires no knowledge of Transact-SQL. To import data into a table, you must either use a format file created for that table or understand the structure of the table and the types of data that are valid for its columns.

Syntax:

bcp {[[database_name.][owner].]{table_name | view_name} | "query"}
    {in | out | queryout | format} data_file
    [-mmax_errors] [-fformat_file] [-x] [-eerr_file]
    [-Ffirst_row] [-Llast_row] [-bbatch_size]
    [-n] [-c] [-w] [-N] [-V (60 | 65 | 70 | 80)] [-6]
    [-q] [-C { ACP | OEM | RAW | code_page } ] [-tfield_term]
    [-rrow_term] [-iinput_file] [-ooutput_file] [-apacket_size]
    [-Sserver_name[\instance_name]] [-Ulogin_id] [-Ppassword]
    [-T] [-v] [-R] [-k] [-E] [-h"hint [,...n]"]

Examples:

Copying Table Rows into a Data File (with a Trusted Connection)

The following example illustrates the out option on the AdventureWorks.Sales.Currency table. This example creates a data file named Currency.dat and copies the table data into it using character format. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At a command prompt, enter the following command:

bcp AdventureWorks.Sales.Currency out Currency.dat -T -c


Copying Table Rows into a Data File (with Mixed-Mode Authentication)

The following example illustrates the out option on the AdventureWorks.Sales.Currency table. This example creates a data file named Currency.dat and copies the table data into it using character format.

The example assumes that you are using mixed-mode authentication, so you must use the -U switch to specify your login ID. Also, unless you are connecting to the default instance of SQL Server on the local computer, use the -S switch to specify the system name and, optionally, an instance name.

bcp AdventureWorks.Sales.Currency out 
Currency.dat -c -U<login_id> -S<server_name\instance_name>

The system will prompt you for your password.

Copying Data from a File to a Table

The following example illustrates the in option by using the file created in the preceding example (Currency.dat). First, however, this example creates an empty copy of the AdventureWorks Sales.Currency table, Sales.Currency2, into which the data is copied. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

To create the empty table, in Query Editor, enter the following command:

USE AdventureWorks;
GO
SELECT * INTO AdventureWorks.Sales.Currency2
FROM AdventureWorks.Sales.Currency WHERE 1=2

To bulk copy the character data into the new table, that is, to import the data, enter the following command at a command prompt:

bcp AdventureWorks.Sales.Currency2 
in Currency.dat -T -c

To verify that the command succeeded, display the contents of the table in Query Editor, and enter

USE AdventureWorks;
GO
SELECT * FROM Sales.Currency2


Copying a Specific Column into a Data File

To copy a specific column, you can use the queryout option. The following example copies only the Name column of the Sales.Currency table into a data file. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At the Windows command prompt, enter:

bcp "SELECT Name FROM AdventureWorks.Sales.Currency" 
queryout Currency.Name.dat -T -c


Copying Data From a Query to a Data File

To copy the result set from a Transact-SQL statement to a data file, use the queryout option. The following example copies the names from the AdventureWorks.Person.Contact table, ordered by last name then first name, into the Contacts.txt data file. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At the Windows command prompt, enter:

bcp "SELECT FirstName, LastName FROM 
AdventureWorks.Person.Contact 
ORDER BY LastName, Firstname" 
queryout Contacts.txt -c -T


Creating a Non-XML Format File

The following example creates a non-XML format file, Currency.fmt, for the Sales.Currency table in the AdventureWorks database. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At the Windows command prompt, enter:

bcp AdventureWorks.Sales.Currency 
format nul -T -c  -f Currency.fmt

Creating an XML Format File

The following example creates an XML format file named Currency.xml for the Sales.Currency table in the AdventureWorks database. The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At the Windows command prompt, enter:

bcp AdventureWorks.Sales.Currency 
format nul -T -c -x -f Currency.xml

Using a Format File to Bulk Import with bcp

To use a previously created format file when importing data into an instance of SQL Server, use the -f switch with the in option. For example, the following command bulk copies the contents of a data file, Currency.dat, into a copy of the Sales.Currency table (Sales.Currency2) by using the previously created format file (Currency.xml). The example assumes that you are using Windows Authentication and have a trusted connection to the server instance on which you are running the bcp command.

At the Windows command prompt, enter:

bcp AdventureWorks.Sales.Currency2 
in Currency.dat -T -f Currency.xml

Tips And Tricks For Advanced MS SQL Server Developers

Tips And Tricks For Advanced MS SQL Server Developers:

  1. Use the “TRUNCATE TABLE” statement instead of “DELETE” if you want to delete all rows from a table. It is much faster than a “DELETE” statement without any conditions, because “TRUNCATE TABLE” frees all the space occupied by the table’s data and indexes without logging the individual row deletes.
  2. Always use the owner prefix in T-SQL queries:

    SELECT mycolumn FROM dbo.mytable

    In this case the query optimizer does not have to decide whether to retrieve from dbo.mytable or another owner’s table, which avoids recompilation. Recompilation throws away the performance advantage of using stored procedures.

  3. Don’t use “sp_” as the prefix for your stored procedures – it is a reserved prefix in MS SQL Server! SQL Server searches for a stored procedure with the “sp_” prefix among the system procedures first, and only after that looks among client procedures.
  4. If you are unable to install MSDE at home because of an unknown error, check that you did not stop the “Server” system service on your PC.
  5. There are thousands of examples where developers use a “SELECT COUNT(*)” statement. But there is another, much faster way to get an (approximate) row count:
    SELECT rows FROM sysindexes WHERE id = OBJECT_ID('Table_Name') AND indid < 2
  6. Include “SET NOCOUNT ON” statement in your stored procedures to greatly reduce network traffic.
  7. Use the “BETWEEN” clause instead of “IN” for greater performance:
    SELECT productId FROM customer
    WHERE productId BETWEEN 1 AND 9

    Instead of:

    SELECT productId
    FROM customer
    WHERE productId IN (1, 2, 3, 4,5,6,7,8,9)
  8. Use table variables, a new feature of MS SQL 2000, instead of temp tables where appropriate. Table variables carry less overhead than temp tables (no statistics, less logging and recompilation), so they are often faster, but only for fairly small amounts of data; with large row counts they can perform much worse than temp tables, so use them carefully. A minimal sketch of the syntax follows this list.
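
A minimal sketch of declaring and using a table variable; the table and column names here are hypothetical:

DECLARE @Recent TABLE (ID INT, Name VARCHAR(50))

-- dbo.SomeTable is a hypothetical source table
INSERT INTO @Recent (ID, Name)
SELECT ID, Name FROM dbo.SomeTable WHERE ID < 100

SELECT * FROM @Recent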

Database Settings Optimization Tips

Database Settings Optimization Tips

  • You can turn off the ‘auto create statistics’ database option. When this option is set to true, statistics are automatically created on columns used in a predicate; it is true by default. Because automatic statistics creation causes some performance degradation, you can turn this option off and create statistics manually during off-peak times by using the CREATE STATISTICS statement. In most cases, however, this will not provide a noticeable performance benefit.
  • You can turn off the ‘auto update statistics’ database option. When this option is set to true, existing statistics are automatically updated when they become out of date; it is true by default. Because automatic statistics updates cause some performance degradation, you can turn this option off and update statistics manually during off-peak times by using the UPDATE STATISTICS statement. Again, in most cases this will not provide a noticeable performance benefit.
  • Turn off the ‘autoclose’ database option. When this option is turned on, the database’s resources are freed after the last user exits. When a new user connects, the database must be reopened, which takes some time. So, do not set this option to true on your production server. By default, it is set to true in SQL Server Desktop Edition and to false in all other editions.
  • Turn off the ‘autoshrink’ database option. When this option is set to true, the database files are periodically shrunk. Autoshrinking causes some performance degradation, so you should shrink the database manually, or create a scheduled task to shrink it periodically during off-peak times, rather than turn the autoshrink feature on. By default, it is set to true in SQL Server Desktop Edition and to false in all other editions.
  • You can turn on the ‘read-only’ database option to prevent users from modifying the database’s data. By default, this option is set to false. If you have data that should not be modified, you can place it in a separate database and set that database’s ‘read-only’ option to true; this can increase the speed of your queries. If you need permissions management (for example, preventing some users from selecting data from some tables), create another filegroup and make only that filegroup read-only instead, because when the ‘read-only’ database option is true, the system tables are also read-only, which prevents permissions management.
  • You can turn on the ‘select into/bulkcopy’ database option to allow SELECT INTO statements and nonlogged bulk copies. A nonlogged bulk copy is much faster than a logged one, but to use it you must meet all of the following conditions:
    1. The database option ‘select into/bulkcopy’ is set to true.
    2. The target table is not being replicated.
    3. The TABLOCK hint is specified.
    4. The target table has no indexes, or if the table has indexes, it is empty when the bulk copy starts.
    By default, this database option is set to false.
  • You can turn off the ‘trunc. log on chkpt.’ database option to prevent the transaction log from being truncated on checkpoint. Turning it on keeps a fast-growing transaction log from filling rapidly and running out of disk space; however, while it is true the transaction log cannot be backed up, so you cannot restore your data to the point of failure (only to the time of the last full backup). The general recommendation is therefore to leave this option turned off and back up the transaction log periodically to truncate it. By default, it is set to true in SQL Server Desktop Edition and to false in all other editions.

Note. You can set the database options above by using the sp_dboption system stored procedure or Enterprise Manager. If you want these options to apply to newly created databases, set them on the model database.
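
For example, with a hypothetical database named MyDB, the autoshrink option can be turned off either way (ALTER DATABASE is the preferred form on SQL Server 2005 and later):

EXEC sp_dboption 'MyDB', 'autoshrink', 'false'

-- or, on SQL Server 2005 and later:
ALTER DATABASE MyDB SET AUTO_SHRINK OFF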

SQL Server Settings Optimization Tips

SQL Server Settings Optimization Tips:

  1. You can increase the ‘min memory per query’ option to improve the performance of queries that use hashing or sorting operations, if your SQL Server has a lot of memory available and there are many queries running concurrently on the server.
    The SQL Server will automatically allocate, at a minimum, the amount of memory set in this configuration setting. The default ‘min memory per query’ option is equal to 1024 Kb.
  2. You can increase the ‘max async IO’ option if your SQL Server works on a high performance server with high-speed intelligent disk subsystem (such as hardware-based RAID with more than 10 disks).
    This option specifies the maximum number of outstanding asynchronous disk I/O requests that the entire server can issue against a file. By the way, the ‘max async IO’ SQL Server option is no longer supported in SQL Server 2000.
  3. You can change the ‘network packet size’ option to the appropriate value.
    This option can improve performance on networks whose base topology supports larger packets than TCP/IP’s default of 4096 bytes. For example, if a client sends or receives large amounts of data, a larger packet size can improve performance because it results in fewer network reads and writes. The default value for the ‘network packet size’ option is 4096 bytes. Microsoft does not recommend changing this option, because for most applications the default packet size of 4096 bytes is best.
  4. You can change the ‘fill factor’ option to the appropriate value.

    The ‘fill factor’ option specifies how full SQL Server will make each index page. When there is no free space to insert a new row on an index page, SQL Server creates a new index page and transfers some rows from the previous page to the new one. This operation is called a page split. You can reduce the number of page splits by setting an appropriate fill factor to reserve free space on each index page. The fill factor is a value from 1 through 100 that specifies how full each index page should be. The default value is 0; it is treated much like a fill factor of 100, the difference being that SQL Server leaves some space within the upper levels of the index tree when FILLFACTOR = 0. The fill factor percentage is used only at the time the index is created. If a table contains read-only data (or data that changes very rarely), you can set the ‘fill factor’ option to 100. When the table’s data is modified very often, you can decrease the ‘fill factor’ option to 70 percent, for example.

  5. You can increase the ‘recovery interval’ value.
    The ‘recovery interval’ option specifies the maximum number of minutes per database that SQL Server needs to complete its recovery procedures. The default value is 0, which means SQL Server configures this option automatically. SQL Server issues checkpoints based on the ‘recovery interval’ option. Microsoft does not recommend changing this option in the general case, but sometimes you can improve performance by doing so. Monitor disk-write activity on the data files, and if you see periodic spikes that send disk utilization to 100 percent, you can increase the recovery interval. In this case, Microsoft suggests setting the ‘recovery interval’ option to 5 and continuing to monitor.
  6. You can set the ‘priority boost’ SQL Server options to 1.
    Set this option to 1 if you want SQL Server to run at a higher priority than other processes on the same computer. The default value is 0. Setting ‘priority boost’ to 1 can degrade the performance of other applications running on the same computer, so you should set it to 1 only on a server dedicated to SQL Server. Otherwise, do not change this option.
  7. Set the ‘max worker threads’ options to the maximum number of the user connections to your SQL Server box.
    The default setting for the ‘max worker threads’ option is 255. If the number of user connections is less than the ‘max worker threads’ value, a separate operating system thread is created for each client connection; if the number of user connections exceeds this value, thread pooling is used. For example, if the maximum number of user connections to your SQL Server box is 50, you can set ‘max worker threads’ to 50, which frees up resources for SQL Server to use elsewhere. If the maximum number of user connections is 500, you can set ‘max worker threads’ to 500, which can improve performance because thread pooling will not be used.
  8. You can specify the ‘min server memory’ and ‘max server memory’ options.
    These options can be used to specify a fixed amount of memory to allocate to SQL Server. In this case, set ‘min server memory’ and ‘max server memory’ to the same value (equal to the maximum amount of physical memory that SQL Server will use), and set the ‘set working set size’ option to 1. This can improve performance because SQL Server will not allocate memory dynamically. You can also change these options when SQL Server shares the computer with other applications: the ‘min server memory’ option guarantees SQL Server a working minimum when other applications compete for all available memory, and the ‘max server memory’ option leaves room for other applications to work when SQL Server would otherwise try to use all available resources.
  9. You can specify the ‘set working set size’ SQL Server option to reserve the amount of physical memory space for SQL Server.
    Unlike SQL Server 6.5, SQL Server 7.0/2000 can allocate memory automatically (it can take more memory if it needs it, and give memory back to the operating system). This is one of its main advantages over previous versions, but dynamic memory allocation takes some time. If you know the maximum amount of physical memory that SQL Server will use, you can fix this amount by setting ‘min server memory’ and ‘max server memory’ to the same value (equal to that maximum) and setting the ‘set working set size’ option to 1, as described in the previous item.
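
All of the instance-wide options in this list are changed with sp_configure. A sketch for one of them; ‘priority boost’ is an advanced option, so advanced options must be shown first, and the value 1 is only an illustration:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'priority boost', 1
RECONFIGURE
GO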

Basic Definitions

Definition of a Database

A database is a collection of related information, accessed and managed by its DBMS. After experimenting with hierarchical and networked DBMSs during the 1970s, the IT industry became dominated by relational DBMSs (or object-relational database management systems) such as Informix, Oracle, Sybase and, later on, Microsoft SQL Server and the like.

In a strictly technical sense, for any database to be defined as a “Truly Relational Model Database Management System,” it should, ideally, adhere to the twelve rules defined by Edgar F. Codd, pioneer in the field of relational databases. To date, while many come close, it is admitted that nothing on the market adheres 100% to those rules, any more than they are 100% ANSI-SQL compliant.

While IBM and Oracle technically were the earliest on the RDBMS scene, many others have followed. While it is unlikely that miniSQL still exists in its original form, Monty’s MySQL is still extant and thriving, along with the Ingres-descended PostgreSQL. Microsoft Access (the 1995+ versions, not the prior ones) was, despite various limitations, technically the closest thing to a ‘truly relational’ DBMS for the desktop PC, with Visual FoxPro and many other desktop products marketed at that time far less compliant with Codd’s rules.

A relational DBMS manages information about types of real-world things (entities) in the form of tables that represent the entities. A table is like a spreadsheet; each row represents a particular entity (an instance), and each column represents a type of information about the entity (a domain). Sometimes entities are made up of smaller related entities, such as orders and order lines; one of the challenges of a multi-user DBMS is to provide data about related entities from the standpoint of an instant of logical consistency.

Properly managed relational databases minimize the need for application programs to contain information about the physical storage of the data they access. To maximize the isolation of programs from data structures, relational DBMSs restrict data access to the messaging protocol SQL, a nonprocedural language that limits the programmer to specifying desired results. This message-based interface was a building block for the decentralization of computer hardware, because a program and data structure with such a minimal point of contact become feasible to reside on separate computers.

Recoverability

Recoverability means that, if a data entry error, program bug or hardware failure occurs, the DBA can bring the database backward in time to its state at an instant of logical consistency before the damage was done. Recoverability activities include making database backups and storing them in ways that minimize the risk that they will be damaged or lost, such as placing multiple copies on removable media and storing them outside the affected area of an anticipated disaster. Recoverability is the DBA’s most important concern.

The backup of the database consists of data with timestamps combined with database logs to change the data to be consistent to a particular moment in time. It is possible to make a backup of the database containing only data without timestamps or logs, but the DBA must take the database offline to do such a backup.

The recovery tests of the database consist of restoring the data, then applying logs against that data to bring the database backup to consistency at a particular point in time up to the last transaction in the logs. Alternatively, an offline database backup can be restored simply by placing the data in-place on another copy of the database.

If a DBA (or any administrator) attempts to implement a recoverability plan without the recovery tests, there is no guarantee that the backups are at all valid. In practice, in all but the most mature RDBMS packages, backups rarely are valid without extensive testing to be sure that no bugs or human error have corrupted the backups.
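
In SQL Server terms, the backup-plus-logs approach described above maps onto the BACKUP and RESTORE statements. A minimal point-in-time sketch; the database name, file paths, and STOPAT time are all hypothetical:

BACKUP DATABASE MyDB TO DISK = 'E:\Backup\MyDB_full.bak'
BACKUP LOG MyDB TO DISK = 'E:\Backup\MyDB_log.trn'

-- Restore the full backup, then roll the log forward to a point in time
RESTORE DATABASE MyDB FROM DISK = 'E:\Backup\MyDB_full.bak' WITH NORECOVERY
RESTORE LOG MyDB FROM DISK = 'E:\Backup\MyDB_log.trn'
    WITH RECOVERY, STOPAT = '2008-01-01 12:00:00'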

Security
Security means that users’ ability to access and change data conforms to the policies of the business and the delegation decisions of its managers. Like other metadata, a relational DBMS manages security information in the form of tables. These tables are the “keys to the kingdom” and so it is important to protect them from intruders.

Performance

Performance means that the database does not cause unreasonable online response times, and it does not cause unattended programs to run for an unworkable period of time. In complex client/server and three-tier systems, the database is just one of many elements that determine the performance that online users and unattended programs experience. Performance is a major motivation for the DBA to become a generalist and coordinate with specialists in other parts of the system outside of traditional bureaucratic reporting lines.

Techniques for database performance tuning have changed as DBA’s have become more sophisticated in their understanding of what causes performance problems and their ability to diagnose the problem.

In the 1990s, DBAs often focused on the database as a whole, and looked at database-wide statistics for clues that might help them find out why the system was slow. Also, the actions DBAs took in their attempts to solve performance problems were often at the global, database level, such as changing the amount of computer memory available to the database, or changing the amount of memory available to any database program that needed to sort data.

DBAs now understand that performance problems initially must be diagnosed, and this is best done by examining individual SQL statements, table processes, and system architecture, not the database as a whole. Various tools, some included with the database and some available from third parties, provide a behind-the-scenes look at how the database is handling SQL statements, shedding light on what is taking so long. Having identified the problem, the DBA can then tune the individual SQL statement.

Development/Testing Support

Development and testing support is typically what the database administrator regards as his or her least important duty, while results-oriented managers consider it the DBA’s most important duty. Support activities include collecting sample production data for testing new and changed programs and loading it into test databases; consulting with programmers about performance tuning; and making table design changes to provide new kinds of storage for new program functions.

Indexing Service

What is Indexing Service?

Indexing Service is a base service for Microsoft® Windows® 2000 or later that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching.

Indexing Service can extract both text and property information from files on the local host and on remote, networked hosts. The files can be simply members of a selected file system or part of a virtual Web hosted by, for example, Internet Information Services (IIS).

Indexing Service extracts the content by filtering—using filter components that understand a file’s format. The format could include multi-language features such as international languages and locales. A filter component implements the IFilter interface, which supplies methods to read a file to extract text and properties. Windows 2000 and Microsoft Windows XP supply filters for Microsoft Office files, Hypertext Markup Language (HTML) files, Multipurpose Internet Mail Extension (MIME) messages, and plain-text files.

Indexing Service then merges the extracted information into catalogs of indexes for efficient searches. Indexing is the overall process of filtering, creating index entries, and merging them into catalogs.

The final step in the indexing process is creation of a catalog that contains a master index (and any temporary word lists and shadow indexes) storing words and their locations within a set of indexed documents. Subsequently, searching, or querying, the catalogs for particular word combinations uses the master index as well as word lists and shadow indexes to execute queries quickly and efficiently.

Windows 2000 and Windows XP include basic facilities for querying the Indexing Service catalog and for managing the state and properties of Indexing Service itself. These facilities include:

  • When Indexing Service is running, Start/Search/For Files or Folders uses the Indexing Service catalog.
  • The Indexing Service snap-in for the Microsoft Management Console (MMC) provides the means to start, stop, and pause Indexing Service, and to administer many of its properties, such as those defining its catalogs.
  • The Platform Software Development Kit (SDK) provides additional versatile and flexible facilities for programmatically interacting with Indexing Service. These facilities include:
  • Admin and Query Helper objects and ActiveX® Data Object (ADO) methods for use with Microsoft Visual Basic®, Microsoft Visual Basic Scripting Edition (VBScript), Microsoft Visual J++® and Microsoft JScript® development software.
  • ISAPI Extensions for use in .idq, .ida, and .htx files.
  • OLE DB Helper functions for use with Microsoft Visual C++® development system.
  • OLE DB Provider for Indexing Service interfaces for use with Visual C++.
  • IFilter interface for use with Visual C++

Source: MSDN

SQL Performance Tuning using Indexes

SQL Performance Tuning using Indexes

This article looks at general guidelines to creating effective indexes using short keys, distinct keys, covering indexes and clustered indexes.

Effective indexes are one of the best ways to improve performance in a database application. Without an index, the SQL Server engine is like a reader trying to find a word in a book by examining each page. By using the index in the back of a book, a reader can complete the task in a much shorter time. In database terms, a table scan happens when there is no index available to help a query. In a table scan SQL Server examines every row in the table to satisfy the query results. Table scans are sometimes unavoidable, but on large tables, scans have a severe impact on performance.

One of the most important jobs for the database is finding the best index to use when generating an execution plan. Most major databases ship with tools to show you execution plans for a query and help in optimizing and tuning indexes. This article outlines several good rules of thumb to apply when creating and modifying indexes for your database. First, let’s cover the scenarios where indexes help performance, and when indexes can hurt performance.

Useful Index Queries

Just like the reader searching for a word in a book, an index helps when you are looking for a specific record or set of records with a WHERE clause. This includes queries looking for a range of values, queries designed to match a specific value, and queries performing a join on two tables. For example, both of the queries against the Northwind database below will benefit from an index on the UnitPrice column.

DELETE FROM Products WHERE UnitPrice = 1

SELECT * FROM PRODUCTS
WHERE UnitPrice BETWEEN 14 AND 16

Since index entries are stored in sorted order, indexes also help when processing ORDER BY clauses. Without an index the database has to load the records and sort them during execution. An index on UnitPrice will allow the database to process the following query by simply scanning the index and fetching rows as they are referenced. To order the records in descending order, the database can simply scan the index in reverse.

SELECT * FROM Products ORDER BY UnitPrice ASC

Grouping records with a GROUP BY clause will often require sorting, so a UnitPrice index will also help the following query to count the number of products at each price.

SELECT Count(*), UnitPrice FROM Products
GROUP BY UnitPrice

By retrieving the records in sorted order through the UnitPrice index, the database sees matching prices appear in consecutive index entries, and can easily keep a count of products at each price. Indexes are also useful for maintaining unique values in a column, since the database can easily search the index to see if an incoming value already exists. Primary keys are always indexed for this reason.
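
As a sketch, a unique index that would enforce distinct product names (Northwind does not actually constrain ProductName this way, and the index name is illustrative):

CREATE UNIQUE INDEX IX_Products_ProductName
ON Products (ProductName)

Any INSERT or UPDATE that would create a duplicate ProductName will then fail with an error.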

Index Drawbacks

Indexes are a performance drag when the time comes to modify records. Any time a query modifies the data in a table, the indexes on the data must change also. Achieving the right number of indexes will require testing and monitoring of your database to see where the best balance lies. Static systems, where databases are used heavily for reporting, can afford more indexes to support the read-only queries. A database with a heavy volume of data-modifying transactions will need fewer indexes to allow for higher throughput. Indexes also use disk space. The exact size will depend on the number of records in the table as well as the number and size of the columns in the index. Generally this is not a major concern, as disk space is easy to trade for better performance.
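
To gauge how much space the indexes on a table actually consume, you can use the built-in sp_spaceused procedure:

EXEC sp_spaceused 'Products'   -- the index_size column shows space used by indexes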

Building The Best Index

There are a number of guidelines to building the most effective indexes for your application. From the columns you select to the data values inside them, consider the following points when selecting the indexes for your tables.

Short Keys

Having a short index key is beneficial for two reasons. First, database work is inherently disk intensive. Larger index keys will cause the database to perform more disk reads, which limits throughput. Second, since index entries are often involved in comparisons, smaller entries are easier to compare. A single integer column makes the absolute best index key because an integer is small and easy for the database to compare. Character strings, on the other hand, require a character-by-character comparison and attention to collation settings.

Distinct Keys

The most effective indexes are the indexes with a small percentage of duplicated values. As an analogy, think of a phone book for a town where almost everyone has the last name of Smith. A phone book in this town is not very useful if sorted in order of last name, because you can only discount a small number of records when you are looking for a Smith.

An index with a high percentage of unique values is a selective index. Obviously, a unique index is highly selective since there are no duplicate entries. Many databases will track statistics about each index so they know how selective each index is. The database uses these statistics when generating an execution plan for a query.
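
In SQL Server you can inspect these statistics yourself with DBCC SHOW_STATISTICS (a sketch, assuming the IX_Products_UnitPrice index from earlier exists):

DBCC SHOW_STATISTICS ('Products', 'IX_Products_UnitPrice')

The density figures in the output indicate how selective the index is; lower density means more distinct values.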


Covering Queries

Indexes generally contain only the data values for the columns they index and a pointer back to the row with the rest of the data. This is similar to the index in a book: the index contains only the key word and then a page reference you can turn to for the rest of the information. Generally the database will have to follow pointers from an index back to a row to gather all the information required for a query. However, if the index contains all of the columns needed for a query, the database can save a disk read by not returning to the table for more information.

Take the index on UnitPrice we discussed earlier. The database could use just the index entries to satisfy the following query.

SELECT Count(*), UnitPrice FROM Products
GROUP BY UnitPrice

We call these types of queries covered queries, because all of the columns requested in the output are covered by a single index. For your most crucial queries, you might consider creating a covering index to give the query the best performance possible. Such an index would probably be a composite index (using more than one column), which appears to go against our first guideline of keeping index entries as short as possible. Obviously this is another tradeoff you can only evaluate with performance testing and monitoring.
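
As a sketch, a composite index covering a query that needs both the price and the name (the column choice and index name are illustrative):

CREATE INDEX IX_Products_UnitPrice_ProductName
ON Products (UnitPrice, ProductName)

-- Covered: every output column lives in the index itself
SELECT ProductName, UnitPrice
FROM Products
WHERE UnitPrice BETWEEN 14 AND 16

On SQL Server 2005 and later, the INCLUDE clause of CREATE INDEX offers another way to add non-key columns to an index purely for covering purposes.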

Clustered Indexes

Many databases have one special index per table where all of the data from a row exists in the index. SQL Server calls this index a clustered index. Rather than the index at the back of a book, a clustered index is closer to a phone book, because each index entry contains all the information you need; there are no references to follow to pick up additional data values.

As a general rule of thumb, every non-trivial table should have a clustered index. If you only create one index for a table, make the index a clustered index. In SQL Server, creating a primary key will automatically create a clustered index (if none exists) using the primary key column as the index key. Clustered indexes are the most effective indexes (when used, they always cover a query), and in many database systems they help the database efficiently manage the space required to store the table.

When choosing the column or columns for a clustered index, be careful to choose a column with static data. If you modify a record and change the value of a column in a clustered index, the database might need to move the index entry (to keep the entries in sorted order). Remember, index entries for a clustered index contain all of the column values, so moving an entry is comparable to executing a DELETE statement followed by an INSERT, which can obviously cause performance problems if done often. For this reason, clustered indexes are often found on primary or foreign key columns. Key values will rarely, if ever, change.
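
Creating a clustered index explicitly is straightforward. A minimal sketch, assuming the table does not already have one (SQL Server allows only one clustered index per table, and Northwind's Products already clusters on its primary key):

CREATE CLUSTERED INDEX IX_Products_ProductID
ON Products (ProductID)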

Conclusion

Determining the correct indexes to use in a database requires careful analysis, benchmarking, and testing. The rules of thumb presented in this article are general guidelines. After applying these principles you need to retest your specific application in your specific environment of hardware, memory, and concurrent activity. See my previous article, SQL Server Indexes, for a more thorough introduction.

CREATE PARTITION FUNCTION

CREATE PARTITION FUNCTION (Transact-SQL)

Creates a function in the current database that maps the rows of a table or index into partitions based on the values of a specified column. Using CREATE PARTITION FUNCTION is the first step in creating a partitioned table or index.

CREATE PARTITION FUNCTION partition_function_name ( input_parameter_type )
AS RANGE [ LEFT | RIGHT ]
FOR VALUES ( [ boundary_value [ ,...n ] ] )
[ ; ]

partition_function_name
Is the name of the partition function. Partition function names must be unique within the database and comply with the rules for identifiers.
input_parameter_type
Is the data type of the column used for partitioning. All data types are valid for use as partitioning columns, except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types. The actual column, known as a partitioning column, is specified in the CREATE TABLE or CREATE INDEX statement.
boundary_value
Specifies the boundary values for each partition of a partitioned table or index that uses partition_function_name. If boundary_value is empty, the partition function maps the whole table or index using partition_function_name into a single partition. Only one partitioning column, specified in a CREATE TABLE or CREATE INDEX statement, can be used. boundary_value is a constant expression that can reference variables. This includes user-defined type variables, or functions and user-defined functions. It cannot reference Transact-SQL expressions. boundary_value must either match or be implicitly convertible to the data type supplied in input_parameter_type, and cannot be truncated during implicit conversion in a way that the size and scale of the value does not match that of its corresponding input_parameter_type.

If boundary_value consists of datetime or smalldatetime literals, these literals are evaluated assuming that us_english is the session language. This behavior is deprecated. To make sure the partition function definition behaves as expected for all session languages, we recommend that you use constants that are interpreted the same way for all language settings, such as the yyyymmdd format; or explicitly convert literals to a specific style. For more information, see Writing International Transact-SQL Statements. To determine the language session of your server, run SELECT @@LANGUAGE.

…n
Specifies the number of values supplied by boundary_value, not to exceed 999. The number of partitions created is equal to n + 1. The values do not have to be listed in order. If the values are not in order, the Database Engine sorts them, creates the function, and returns a warning that the values are not provided in order. The Database Engine returns an error if n includes any duplicate values.
LEFT | RIGHT
Specifies to which side of each boundary value interval, left or right, the boundary_value [ ,…n ] belongs, when interval values are sorted by the Database Engine in ascending order from left to right. If not specified, LEFT is the default. For more information, see Examples.
The scope of a partition function is limited to the database that it is created in. Within the database, partition functions reside in a separate namespace from the other functions. Any rows whose partitioning column has null values are placed in the left-most partition, unless NULL is specified as a boundary value and RIGHT is indicated. In this case, the left-most partition is an empty partition, and NULL values are placed in the following partition.
Any one of the following permissions can be used to execute CREATE PARTITION FUNCTION:

  • ALTER ANY DATASPACE permission. This permission defaults to members of the sysadmin fixed server role and the db_owner and db_ddladmin fixed database roles.
  • CONTROL or ALTER permission on the database in which the partition function is being created.
  • CONTROL SERVER or ALTER ANY DATABASE permission on the server of the database in which the partition function is being created.

A. Creating a RANGE LEFT partition function on an int column

The following partition function will partition a table or index into four partitions.

CREATE PARTITION FUNCTION myRangePF1 (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition    Values
1            col1 <= 1
2            col1 > 1 AND col1 <= 100
3            col1 > 100 AND col1 <= 1000
4            col1 > 1000
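
To actually use the function you still need the remaining steps: map its partitions to filegroups with a partition scheme, then create the table on that scheme. A minimal sketch; the scheme and table names are illustrative, and ALL TO ([PRIMARY]) simply places every partition on the PRIMARY filegroup:

CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
ALL TO ([PRIMARY]);

CREATE TABLE PartitionTable (col1 int, col2 char(10))
ON myRangePS1 (col1);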

B. Creating a RANGE RIGHT partition function on an int column

The following partition function uses the same values for boundary_value [ ,…n ] as the previous example, except it specifies RANGE RIGHT.

CREATE PARTITION FUNCTION myRangePF2 (int)
AS RANGE RIGHT FOR VALUES (1, 100, 1000);

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition    Values
1            col1 < 1
2            col1 >= 1 AND col1 < 100
3            col1 >= 100 AND col1 < 1000
4            col1 >= 1000
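
You can verify where a given value lands with the $PARTITION function, which evaluates a partition function for a value:

SELECT $PARTITION.myRangePF1 (100) AS LeftBoundary,   -- returns 2: 100 falls in partition 2
       $PARTITION.myRangePF2 (100) AS RightBoundary   -- returns 3: 100 starts partition 3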

C. Creating a RANGE RIGHT partition function on a datetime column

The following partition function partitions a table or index into 12 partitions, one for each month of a year’s worth of values in a datetime column.

CREATE PARTITION FUNCTION [myDateRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('20030201', '20030301', '20030401',
               '20030501', '20030601', '20030701', '20030801',
               '20030901', '20031001', '20031101', '20031201');

The following table shows how a table or index that uses this partition function on partitioning column datecol would be partitioned.

Partition    Values
1            datecol < February 1, 2003
2            datecol >= February 1, 2003 AND datecol < March 1, 2003
...
11           datecol >= November 1, 2003 AND datecol < December 1, 2003
12           datecol >= December 1, 2003

D. Creating a partition function on a char column

The following partition function partitions a table or index into four partitions.

CREATE PARTITION FUNCTION myRangePF3 (char(20))
AS RANGE RIGHT FOR VALUES ('EX', 'RXE', 'XR');

The following table shows how a table that uses this partition function on partitioning column col1 would be partitioned.

Partition    Values
1            col1 < EX
2            col1 >= EX AND col1 < RXE
3            col1 >= RXE AND col1 < XR
4            col1 >= XR