Tech Splurge: SQL

Showing posts with label SQL. Show all posts

Friday, 20 February 2009

70-431 Hints and Tips

I passed 70-431 (TS: Microsoft SQL Server 2005 - Implementation and Maintenance) a while ago and it’s shamefully taken months to get round to blogging the tips. As ever, the general tips are read a book, do some sample exams, fill the gaps from MSDN, do the exam and good luck. More specific tips are

The Book
As I’ve said before the Microsoft Press books are of an extremely variable quality. In this case the book is pretty good and covers the right stuff. You’ll still need to plug the gaps from MSDN.

The Simulation
This is the big tip. Unlike every other MS exam I’ve done this one has a simulation of some typical database tasks at the end. The simulation mocks up the Database Management console and asks you to click around to complete the different tasks. Make sure you can do the simple stuff such as

Perform a full backup.
Perform a differential backup.
Perform a transaction log backup
Restore a database
Create a login and user for a database
Access and run a profile

This list isn’t exhaustive but is the type of things that could be on the sim.

The Links
Below are the links that I used to supplement the Microsoft books. Some of these are pretty dry reading and again this isn’t exhaustive.

Installing and Configuring SQL Server 2005
Returning the login name associated with a security identification number (SID)
Ownership Chains

Maintaining Databases
Backup T-SQL
DBCC CHECKDB
Creating full DB backup
RESTORE T-SQL
Dedicated Administrator Connection

Creating and Implementing Database Objects
Creating Certificates
Rollup T-SQL
Xml Indexes
Auto Mode for Xml Queries
User Defined Types T-SQL
Indexed Views

Monitoring and Troubleshooting SQL Server Performance
Sp_addmessage

Supporting Data Consumers
SQL Server Browser Service
Server Broker Contract
Service Broker Endpoints
Service Broker Endpoints with Kerebos
Endpoint authentication

And of course best of luck if you’re taking the exam.

Saturday, 24 January 2009

Estimating Lines of Code

I’ve always been a bit puzzled on why people would use lines of code to estimate projects. It seems that estimating (more realistically guessing) lines of code (LOC) to estimate effort and schedule is way harder that splitting the project into jobs and undertaking an estimate for each job. However reading Steve McConnell’s Rapid Development book has changed my mind.

In it he has charts linking lines of code to effort over a range of criteria. So if you know the LOC for your projects you can plug them into the tables and see how you projects measure up. Moreover you could get lines of code for a project to build a historic set of data tailored to your team to take some of the guess work out of that blackest of all arts – software estimation.

Which got me thinking of how you would quickly get non-commented, non-whitespace lines of code for a Web Application written it ASP.Net

C#
Count the semicolons C#.

CSS
I would count the number of {

HTML
LOC are more difficult here. Possibility to count open angled brackets (<) and maybe divide by 5 – making the assumption that there is 5 tags per line JavaScript and VB.Net
Semicolons doesn’t work so well here as they are optional in the code. You’d need to be a bit cleverer and count the number of carriage returns without comments and with a length of more then one when been trimmed

OR

Quick and dirty: arbitrarily estimate the number of characters per line then whip through all the files and could the characters and divide by the per line estimate.

SQL stored procedures and functions
Like JavaScript you’d be looking at using carriage returns or using the quick and dirty method

Other elements.
For example SQL reports, Data extract components etc…
Can’t do anything with these so get the back of the envelope out and estimate what fraction of the project was spend on these activities. Multiple the LOC by this ratio.

In the best of all worlds we would all use function points or some kind of automated formula based approach but I moved out of the best of all worlds sometime ago and sadly my software development occurs in the dirty old real world. In this world one frequently doesn’t have time to spend half a day calculating historical effort but you’ll always have the opportunity to spend a sneaky 15 minutes seeing how many LOC were wrote and how long it took. It’s always good to marvel at how much work you’ve done and it may even make software estimation a slightly less black art in the future.

Saturday, 10 January 2009

T-SQL Random Data Generator

I always think that creating a component to generate random data for a table in SQL Server should be easy but it never is. I’ve written a few windows components in the past to do this but I always leave them in random folders and lose them in my sock draw. So I’ve spend a bit of time writing a stored procedure that will do the job – so I can keep it associated with the database du jour and not lose it. The proc isn’t perfect but then neither am I.

What it does

Will output an insert T-SQL statement for any named table in the database.
Only tested on SQL Server 2005. The INFORMATION_SCHEMA use should make it forwardly compatible but doesn’t do backwards compatibly any favours.
Detects foreign keys and inserts and makes a random but valid link to the child table (this was the hard part)
Detect different data types and do the necessary SQL formatting
Accounts for max character length in string type fields
Accounts for identity columns
Detects nullable columns and leaves them null

What it doesn’t do

Doesn’t account for unique indexes
There are omitted datatypes (cursor, timestamp, uniqueidentifier, binary, varbinary, image)

Horrible hacks
Where the foreign key links where linked to child table via a unique index rather than a primary key it assumes a foreign key of 1. The INFORMATION_SCHEMA views don’t include index information. So this would necessitate delving into the sys.index table. Not impossible but the proc was complicated enough as it was. I didn’t need it so I didn’t do it.

Assumptions

Foreign keys are of a numeric type i.e. int, float et. al.
Foreign key constraints exist on the database. It isn’t magic – it can’t work out the relationships if they don’t exist.



Create Procedure usp_RandomInsertGenerator
(
 @TableName nvarchar(128)
)
As

Begin

 Set NoCount On

 Declare @ColumnName nvarchar(128)
 Declare @FK_TableName nvarchar(128)
 Declare @FK_ColumnName nvarchar(128)
 Declare @ConstraintName nvarchar(128)
 Declare @DataType nvarchar(128)
 Declare @CharacterMaximumLength int
 Declare @Sql nvarchar(max)
 Declare @MaxValue int
 Declare @InsertValue nvarchar(200)
 Declare @SqlOutputFields nvarchar(max)
 Declare @SqlOutputValues nvarchar(max)
 Declare @FirstLoop bit
 Declare @IsIdentity bit
 Declare @StringInsert bit
 Declare @DummyText varchar(48)

 Set @FirstLoop = 1
 Set @SqlOutputFields = 'INSERT INTO ' + @TableName + ' ('
 Set @SqlOutputValues = ' VALUES ('
 Set @DummyText = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

 -- cursor loops through every column for named table
 Declare procCursor CURSOR FORWARD_ONLY
 For
 Select
  COLUMN_NAME,
  DATA_TYPE,
  CHARACTER_MAXIMUM_LENGTH
 From
  INFORMATION_SCHEMA.COLUMNS col
 Where
  TABLE_NAME = @TableName
  And IS_NULLABLE = 'NO'
 Order By
  ORDINAL_POSITION

 Open procCursor
 Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength

 while @@fetch_status <> -1
 Begin
  
  -- datatypes i haven't bothered implementing
  if @DataType = 'cursor' OR @DataType = 'timestamp'
   Or @DataType = 'uniqueidentifier' Or @DataType = 'binary'
   Or @DataType = 'varbinary' Or @DataType = 'image'
  Begin
   Raiserror('Unsupported Data Type', 1, 16)
  End

  --reset variables
  Set @FK_TableName = ''
  Set @FK_ColumnName = ''
  Set @StringInsert = 0
  Set @ConstraintName = ''
  
  -- Don't add in an insert value if the loop is an identity
  Select @IsIdentity = COLUMNPROPERTY(OBJECT_ID(@TableName),@ColumnName,'IsIdentity')
  if @IsIdentity = 1  
  Begin
   Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
   continue
  End
  
  -- getting the value to be inserted for this data type
  if @DataType = 'varchar' Or @DataType = 'nvarchar'
   Or @DataType = 'char' Or @DataType = 'nchar'
   Or @DataType = 'text' Or @DataType = 'ntext'
  Begin
   Set @StringInsert = 1
   Set @InsertValue = SubString(@DummyText, 1, @CharacterMaximumLength)
  End
  Else if @DataType = 'datetime' Or @DataType = 'smalldatetime'
  Begin
   Set @StringInsert = 1
   Set @InsertValue = Cast(GetDate() as varchar(20))
  End
  Else -- it is some key of numeric type
  Begin
   
   -- getting the child table indexes
   Set @Sql = '
   Select
    @FK_TableName = pkconst.TABLE_NAME,
    @FK_ColumnName = pkconst.COLUMN_NAME,
    @ConstraintName = coluse.CONSTRAINT_NAME
   From
   INFORMATION_SCHEMA.KEY_COLUMN_USAGE coluse
   Join INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS fkconst
     On fkconst.CONSTRAINT_NAME = coluse.CONSTRAINT_NAME
   Left Join INFORMATION_SCHEMA.KEY_COLUMN_USAGE pkconst
     On pkconst.CONSTRAINT_NAME = fkconst.UNIQUE_CONSTRAINT_NAME
   Where
    coluse.TABLE_NAME = @TableName
    And coluse.COLUMN_NAME = @ColumnName'


   Execute sp_executesql @Sql, N'@TableName nvarchar(128),
    @ColumnName nvarchar(128),
    @FK_TableName nvarchar(128) OUTPUT,
    @FK_ColumnName nvarchar(128) OUTPUT,
    @ConstraintName nvarchar(128) OUTPUT',
    @TableName=@TableName,
    @ColumnName=@ColumnName,
    @FK_TableName=@FK_TableName OUTPUT,
    @FK_ColumnName=@FK_ColumnName OUTPUT,
    @ConstraintName=@ConstraintName OUTPUT
   
   if Len(@FK_TableName) > 0 And Len(@FK_ColumnName) > 0
   Begin  

    /* have found foreign key and the lookup table
    so pick a random primary key from the lookup table */
    Set @Sql = 'Select top 1 @InsertValue = Cast(' + @FK_ColumnName +
       ' as varchar(200)) From '
       + @FK_TableName + ' Order By newid()'
    Execute sp_executesql @Sql, N'@InsertValue nvarchar(128) OUTPUT', @InsertValue=@InsertValue OUTPUT

   End
   Else if(Len(@ConstraintName) > 0)
   Begin
     
    /* OK we've found the foreign key constraint but have no idea what the
    lookup table is. This is because we are joining on a unique index to the
    lookup table not a primary key. Make a MASSIVE assumption in this instance
    - that the lookup table has a link value of 1 */
    Set @InsertValue = '1' 

   End
   Else
   Begin
    -- no foreign key so the max that can be inserted is based on the datatype
    -- don't bother with any thing greater than thirty thousand - big enough
    if @DataType = 'bit'
     Set @MaxValue = 1
    else if @DataType = 'tinyint'
     Set @MaxValue = 255
    else if @DataType = 'smallint'
     Set @MaxValue = 32767
    else
     Set @MaxValue = 32767
    
    -- randomly generate a number to insert up to maximum
    Set @InsertValue = Cast(ROUND(((@MaxValue - 1) * RAND() + 1), 0) as varchar(200)) 
    
   End
  End -- end of numeric processing
  
  -- building up output string
  Declare @Delimiter char(1)
  if @FirstLoop = 1
  Begin
   Set @FirstLoop = 0
   Set @Delimiter = ''
  End
  Else
  Begin
   Set @Delimiter = ','
  End

  Set @SqlOutputFields = @SqlOutputFields + @Delimiter + @ColumnName
  
  if @StringInsert = 1
  Begin
   Set @SqlOutputValues = @SqlOutputValues + @Delimiter + '''' + @InsertValue + ''''
  End
  Else
  Begin
   Set @SqlOutputValues = @SqlOutputValues + @Delimiter + @InsertValue  
  End

  Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
 End -- finished this column = go to next

 close procCursor
 deallocate procCursor

 -- outputting the sql string
 Set @SqlOutputFields = @SqlOutputFields + ')'
 Set @SqlOutputValues = @SqlOutputValues + ')'

 select @SqlOutputFields + ' ' + @SqlOutputValues

End 
Go

It’s very easy to create a wrapper sproc to get multiple statements.
One row of random data isn’t that much use – 10,000 might be.


Create Procedure usp_RandomInsertGeneratorWrapper
(
 @TableName nvarchar(128),
 @NumberOfInserts int
)
As
Begin
 
 Set NoCount On

 Declare @StatementCount int
 Set @StatementCount = 0

 While @StatementCount < @NumberOfInserts
 Begin
 
  exec usp_RandomInsertGenerator @TableName
  Set @StatementCount = @StatementCount + 1
 End

End

Room for improvement
Here’s some extensions that my be useful for individual projects

Implement missing datatypes. Things like uniqueidenifier would be very easy indeed to implement. It’s just that I didn’t need it.
Create custom inserts for know field names to give more realistic data for your application
Randomise the character data properly.

So that’s it. Live it, love it and fiddle about with it a bit.

Saturday, 6 December 2008

T-SQL Identity

This is probably as old as the hills but I still see this everywhere. What’s wrong with this line of T-SQL

Declare @PrimaryKeyVariable int
Set @PrimaryKeyVariable =  @@IDENTITY

Nothing until it gets deployed into a high volume usage situation and the number of users ramps up after a few months. Then it will blow up and someone will be in at 3a.m. desperately trying to restore the referential integrity of the database.

The problem is that it will set the variable to the most recently set identity flag in the database no matter who has done it. That might be you or it might be another connection – thus linking up/deleting/updating the wrong records. If you’re lucky the transaction will fail. If not then you’ll be digging around in those two hourly transactional backups to get it back. Oops.

And the answer is really straight forward. Just use this instead

Declare @PrimaryKeyVariable int
Set @ PrimaryKeyVariable = Scope_Identity()

The identity is now only taken from current connection. It’s scoped – just like mother always used to scope her identities. Lovely.

Saturday, 15 November 2008

ExecuteNonQuery Gotcha

Here’s something that made me roll my eyes this week. I’ve been trying to get the number of rows executed returned from SQL Server to an ASP.Net data layer. It promised to be straight forward but it had a cheeky little gotcha nestled in the middle of it. To take the simplest example for demo purposes (all error handling type stuff omitted)

The stored procedure


CREATE PROCEDURE usp_TestProc
      @PersonId int
AS
BEGIN
      Set Nocount On

      Update Person
      Set
     FirstName = 'Willy',
            Surname = 'Wonka'
      Where
            Id = @PersonId
END

The calling business layer


conn.Open();
cmdNonQuery = conn.CreateCommand();
cmdNonQuery.CommandType = CommandType.StoredProcedure;
cmdNonQuery.CommandText = “usp_TestProc”;
cmdNonQuery.Parameters.AddRange(sqlParameters);
numberOfRecords = cmdNonQuery.ExecuteNonQuery();

Logic (and the .Net API documentation) dictates that the ExecuteNonQuery method returns the number of rows. But it steadfastly returns -1 no matter what is occurring in the database.

So what is happening?? It turns out that the Nocount setting stops this information coming back. Removing the statement – or setting NoCount Off causes the ExecuteNonQuery method to behave as expected. It returns -1 if no rows affected otherwise the number of rows actually affected.

This would probably be too arcane to worry about except the SQL Server standard template for creating stored procedures includes the Set NoCount On statement. In other words – select ‘create new procedure from the SQL Server menu’ and you get this template


CREATE PROCEDURE <Procedure_Name, sysname, ProcedureName>
      -- Add the parameters for the stored procedure here
      <@Param1, sysname, @p1> <Datatype_For_Param1, , int> = <Default_Value_For_Param1, , 0>,
      <@Param2, sysname, @p2> <Datatype_For_Param2, , int> = <Default_Value_For_Param2, , 0>
AS
BEGIN
      -- SET NOCOUNT ON added to prevent extra result sets from
      -- interfering with SELECT statements.
      SET NOCOUNT ON;

    -- Insert statements for procedure here
      SELECT <@Param1, sysname, @p1>, <@Param2, sysname, @p2>
END
GO

Very lovely – except it contains the very code that disables the proc return values. Ouch.

Friday, 20 February 2009

70-431 Hints and Tips

Saturday, 24 January 2009

Estimating Lines of Code

Saturday, 10 January 2009

T-SQL Random Data Generator

Saturday, 6 December 2008

T-SQL Identity

Saturday, 15 November 2008

ExecuteNonQuery Gotcha

Tech Splurge

The Author

Labels

Blog Archive

Great Blogs Of Our Time