Showing posts with label SQL. Show all posts
Showing posts with label SQL. Show all posts

Friday, 20 February 2009

70-431 Hints and Tips

I passed 70-431 (TS: Microsoft SQL Server 2005 - Implementation and Maintenance) a while ago and it’s shamefully taken months to get round to blogging the tips. As ever, the general tips are read a book, do some sample exams, fill the gaps from MSDN, do the exam and good luck. More specific tips are

The Book
As I’ve said before the Microsoft Press books are of an extremely variable quality. In this case the book is pretty good and covers the right stuff. You’ll still need to plug the gaps from MSDN.

The Simulation
This is the big tip. Unlike every other MS exam I’ve done this one has a simulation of some typical database tasks at the end. The simulation mocks up the Database Management console and asks you to click around to complete the different tasks. Make sure you can do the simple stuff such as
  1. Perform a full backup.
  2. Perform a differential backup.
  3. Perform a transaction log backup
  4. Restore a database
  5. Create a login and user for a database
  6. Access and run a profile

This list isn’t exhaustive but is the type of things that could be on the sim.

The Links
Below are the links that I used to supplement the Microsoft books. Some of these are pretty dry reading and again this isn’t exhaustive.

Installing and Configuring SQL Server 2005
Returning the login name associated with a security identification number (SID)
Ownership Chains

Maintaining Databases
Backup T-SQL
DBCC CHECKDB
Creating full DB backup
RESTORE T-SQL
Dedicated Administrator Connection

Creating and Implementing Database Objects
Creating Certificates
Rollup T-SQL
Xml Indexes
Auto Mode for Xml Queries
User Defined Types T-SQL
Indexed Views

Monitoring and Troubleshooting SQL Server Performance
Sp_addmessage

Supporting Data Consumers
SQL Server Browser Service
Server Broker Contract
Service Broker Endpoints
Service Broker Endpoints with Kerebos
Endpoint authentication

And of course best of luck if you’re taking the exam.

Saturday, 24 January 2009

Estimating Lines of Code

I’ve always been a bit puzzled on why people would use lines of code to estimate projects. It seems that estimating (more realistically guessing) lines of code (LOC) to estimate effort and schedule is way harder that splitting the project into jobs and undertaking an estimate for each job. However reading Steve McConnell’s Rapid Development book has changed my mind.

In it he has charts linking lines of code to effort over a range of criteria. So if you know the LOC for your projects you can plug them into the tables and see how you projects measure up. Moreover you could get lines of code for a project to build a historic set of data tailored to your team to take some of the guess work out of that blackest of all arts – software estimation.

Which got me thinking of how you would quickly get non-commented, non-whitespace lines of code for a Web Application written it ASP.Net



C#
Count the semicolons C#.

CSS
I would count the number of {

HTML
LOC are more difficult here. Possibility to count open angled brackets (<) and maybe divide by 5 – making the assumption that there is 5 tags per line JavaScript and VB.Net
Semicolons doesn’t work so well here as they are optional in the code. You’d need to be a bit cleverer and count the number of carriage returns without comments and with a length of more then one when been trimmed

OR

Quick and dirty: arbitrarily estimate the number of characters per line then whip through all the files and could the characters and divide by the per line estimate.

SQL stored procedures and functions
Like JavaScript you’d be looking at using carriage returns or using the quick and dirty method

Other elements.
For example SQL reports, Data extract components etc…
Can’t do anything with these so get the back of the envelope out and estimate what fraction of the project was spend on these activities. Multiple the LOC by this ratio.



In the best of all worlds we would all use function points or some kind of automated formula based approach but I moved out of the best of all worlds sometime ago and sadly my software development occurs in the dirty old real world. In this world one frequently doesn’t have time to spend half a day calculating historical effort but you’ll always have the opportunity to spend a sneaky 15 minutes seeing how many LOC were wrote and how long it took. It’s always good to marvel at how much work you’ve done and it may even make software estimation a slightly less black art in the future.

Saturday, 10 January 2009

T-SQL Random Data Generator

I always think that creating a component to generate random data for a table in SQL Server should be easy but it never is. I’ve written a few windows components in the past to do this but I always leave them in random folders and lose them in my sock draw. So I’ve spend a bit of time writing a stored procedure that will do the job – so I can keep it associated with the database du jour and not lose it. The proc isn’t perfect but then neither am I.

What it does
  1. Will output an insert T-SQL statement for any named table in the database.
  2. Only tested on SQL Server 2005. The INFORMATION_SCHEMA use should make it forwardly compatible but doesn’t do backwards compatibly any favours.
  3. Detects foreign keys and inserts and makes a random but valid link to the child table (this was the hard part)
  4. Detect different data types and do the necessary SQL formatting
  5. Accounts for max character length in string type fields
  6. Accounts for identity columns
  7. Detects nullable columns and leaves them null

What it doesn’t do
  1. Doesn’t account for unique indexes
  2. There are omitted datatypes (cursor, timestamp, uniqueidentifier, binary, varbinary, image)

Horrible hacks
Where the foreign key links where linked to child table via a unique index rather than a primary key it assumes a foreign key of 1. The INFORMATION_SCHEMA views don’t include index information. So this would necessitate delving into the sys.index table. Not impossible but the proc was complicated enough as it was. I didn’t need it so I didn’t do it.

Assumptions
  1. Foreign keys are of a numeric type i.e. int, float et. al.
  2. Foreign key constraints exist on the database. It isn’t magic – it can’t work out the relationships if they don’t exist.


Create Procedure usp_RandomInsertGenerator
(
@TableName nvarchar(128)
)
As

Begin

Set NoCount On

Declare @ColumnName nvarchar(128)
Declare @FK_TableName nvarchar(128)
Declare @FK_ColumnName nvarchar(128)
Declare @ConstraintName nvarchar(128)
Declare @DataType nvarchar(128)
Declare @CharacterMaximumLength int
Declare @Sql nvarchar(max)
Declare @MaxValue int
Declare @InsertValue nvarchar(200)
Declare @SqlOutputFields nvarchar(max)
Declare @SqlOutputValues nvarchar(max)
Declare @FirstLoop bit
Declare @IsIdentity bit
Declare @StringInsert bit
Declare @DummyText varchar(48)

Set @FirstLoop = 1
Set @SqlOutputFields = 'INSERT INTO ' + @TableName + ' ('
Set @SqlOutputValues = ' VALUES ('
Set @DummyText = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

-- cursor loops through every column for named table
Declare procCursor CURSOR FORWARD_ONLY
For
Select
COLUMN_NAME,
DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH
From
INFORMATION_SCHEMA.COLUMNS col
Where
TABLE_NAME = @TableName
And IS_NULLABLE = 'NO'
Order By
ORDINAL_POSITION

Open procCursor
Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength

while @@fetch_status <> -1
Begin

-- datatypes i haven't bothered implementing
if @DataType = 'cursor' OR @DataType = 'timestamp'
Or @DataType = 'uniqueidentifier' Or @DataType = 'binary'
Or @DataType = 'varbinary' Or @DataType = 'image'
Begin
Raiserror('Unsupported Data Type', 1, 16)
End

--reset variables
Set @FK_TableName = ''
Set @FK_ColumnName = ''
Set @StringInsert = 0
Set @ConstraintName = ''

-- Don't add in an insert value if the loop is an identity
Select @IsIdentity = COLUMNPROPERTY(OBJECT_ID(@TableName),@ColumnName,'IsIdentity')
if @IsIdentity = 1
Begin
Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
continue
End

-- getting the value to be inserted for this data type
if @DataType = 'varchar' Or @DataType = 'nvarchar'
Or @DataType = 'char' Or @DataType = 'nchar'
Or @DataType = 'text' Or @DataType = 'ntext'
Begin
Set @StringInsert = 1
Set @InsertValue = SubString(@DummyText, 1, @CharacterMaximumLength)
End
Else if @DataType = 'datetime' Or @DataType = 'smalldatetime'
Begin
Set @StringInsert = 1
Set @InsertValue = Cast(GetDate() as varchar(20))
End
Else -- it is some key of numeric type
Begin

-- getting the child table indexes
Set @Sql = '
Select
@FK_TableName = pkconst.TABLE_NAME,
@FK_ColumnName = pkconst.COLUMN_NAME,
@ConstraintName = coluse.CONSTRAINT_NAME
From
INFORMATION_SCHEMA.KEY_COLUMN_USAGE coluse
Join INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS fkconst
On fkconst.CONSTRAINT_NAME = coluse.CONSTRAINT_NAME
Left Join INFORMATION_SCHEMA.KEY_COLUMN_USAGE pkconst
On pkconst.CONSTRAINT_NAME = fkconst.UNIQUE_CONSTRAINT_NAME
Where
coluse.TABLE_NAME = @TableName
And coluse.COLUMN_NAME = @ColumnName'


Execute sp_executesql @Sql, N'@TableName nvarchar(128),
@ColumnName nvarchar(128),
@FK_TableName nvarchar(128) OUTPUT,
@FK_ColumnName nvarchar(128) OUTPUT,
@ConstraintName nvarchar(128) OUTPUT',
@TableName=@TableName,
@ColumnName=@ColumnName,
@FK_TableName=@FK_TableName OUTPUT,
@FK_ColumnName=@FK_ColumnName OUTPUT,
@ConstraintName=@ConstraintName OUTPUT

if Len(@FK_TableName) > 0 And Len(@FK_ColumnName) > 0
Begin

/* have found foreign key and the lookup table
so pick a random primary key from the lookup table */
Set @Sql = 'Select top 1 @InsertValue = Cast(' + @FK_ColumnName +
' as varchar(200)) From '
+ @FK_TableName + ' Order By newid()'
Execute sp_executesql @Sql, N'@InsertValue nvarchar(128) OUTPUT', @InsertValue=@InsertValue OUTPUT

End
Else if(Len(@ConstraintName) > 0)
Begin

/* OK we've found the foreign key constraint but have no idea what the
lookup table is. This is because we are joining on a unique index to the
lookup table not a primary key. Make a MASSIVE assumption in this instance
- that the lookup table has a link value of 1 */
Set @InsertValue = '1'

End
Else
Begin
-- no foreign key so the max that can be inserted is based on the datatype
-- don't bother with any thing greater than thirty thousand - big enough
if @DataType = 'bit'
Set @MaxValue = 1
else if @DataType = 'tinyint'
Set @MaxValue = 255
else if @DataType = 'smallint'
Set @MaxValue = 32767
else
Set @MaxValue = 32767

-- randomly generate a number to insert up to maximum
Set @InsertValue = Cast(ROUND(((@MaxValue - 1) * RAND() + 1), 0) as varchar(200))

End
End -- end of numeric processing

-- building up output string
Declare @Delimiter char(1)
if @FirstLoop = 1
Begin
Set @FirstLoop = 0
Set @Delimiter = ''
End
Else
Begin
Set @Delimiter = ','
End

Set @SqlOutputFields = @SqlOutputFields + @Delimiter + @ColumnName

if @StringInsert = 1
Begin
Set @SqlOutputValues = @SqlOutputValues + @Delimiter + '''' + @InsertValue + ''''
End
Else
Begin
Set @SqlOutputValues = @SqlOutputValues + @Delimiter + @InsertValue
End

Fetch Next From procCursor Into @ColumnName, @DataType, @CharacterMaximumLength
End -- finished this column = go to next

close procCursor
deallocate procCursor

-- outputting the sql string
Set @SqlOutputFields = @SqlOutputFields + ')'
Set @SqlOutputValues = @SqlOutputValues + ')'

select @SqlOutputFields + ' ' + @SqlOutputValues

End
Go


It’s very easy to create a wrapper sproc to get multiple statements.
One row of random data isn’t that much use – 10,000 might be.



Create Procedure usp_RandomInsertGeneratorWrapper
(
@TableName nvarchar(128),
@NumberOfInserts int
)
As
Begin

Set NoCount On

Declare @StatementCount int
Set @StatementCount = 0

While @StatementCount < @NumberOfInserts
Begin

exec usp_RandomInsertGenerator @TableName
Set @StatementCount = @StatementCount + 1
End

End

Room for improvement
Here’s some extensions that my be useful for individual projects
  1. Implement missing datatypes. Things like uniqueidenifier would be very easy indeed to implement. It’s just that I didn’t need it.
  2. Create custom inserts for know field names to give more realistic data for your application
  3. Randomise the character data properly.
So that’s it. Live it, love it and fiddle about with it a bit.

Saturday, 6 December 2008

T-SQL Identity

This is probably as old as the hills but I still see this everywhere. What’s wrong with this line of T-SQL
Declare @PrimaryKeyVariable int
Set @PrimaryKeyVariable = @@IDENTITY
Nothing until it gets deployed into a high volume usage situation and the number of users ramps up after a few months. Then it will blow up and someone will be in at 3a.m. desperately trying to restore the referential integrity of the database.

The problem is that it will set the variable to the most recently set identity flag in the database no matter who has done it. That might be you or it might be another connection – thus linking up/deleting/updating the wrong records. If you’re lucky the transaction will fail. If not then you’ll be digging around in those two hourly transactional backups to get it back. Oops.

And the answer is really straight forward. Just use this instead
Declare @PrimaryKeyVariable int
Set @ PrimaryKeyVariable = Scope_Identity()
The identity is now only taken from current connection. It’s scoped – just like mother always used to scope her identities. Lovely.

Saturday, 15 November 2008

ExecuteNonQuery Gotcha

Here’s something that made me roll my eyes this week. I’ve been trying to get the number of rows executed returned from SQL Server to an ASP.Net data layer. It promised to be straight forward but it had a cheeky little gotcha nestled in the middle of it. To take the simplest example for demo purposes (all error handling type stuff omitted)

The stored procedure

CREATE PROCEDURE usp_TestProc
@PersonId int
AS
BEGIN
Set Nocount On

Update Person
Set
FirstName = 'Willy',
Surname = 'Wonka'
Where
Id = @PersonId
END

The calling business layer

conn.Open();
cmdNonQuery = conn.CreateCommand();
cmdNonQuery.CommandType = CommandType.StoredProcedure;
cmdNonQuery.CommandText = “usp_TestProc”;
cmdNonQuery.Parameters.AddRange(sqlParameters);
numberOfRecords = cmdNonQuery.ExecuteNonQuery();
 

Logic (and the .Net API documentation) dictates that the ExecuteNonQuery method returns the number of rows. But it steadfastly returns -1 no matter what is occurring in the database.

So what is happening?? It turns out that the Nocount setting stops this information coming back. Removing the statement – or setting NoCount Off causes the ExecuteNonQuery method to behave as expected. It returns -1 if no rows affected otherwise the number of rows actually affected.

This would probably be too arcane to worry about except the SQL Server standard template for creating stored procedures includes the Set NoCount On statement. In other words – select ‘create new procedure from the SQL Server menu’ and you get this template

CREATE PROCEDURE <Procedure_Name, sysname, ProcedureName>
-- Add the parameters for the stored procedure here
<@Param1, sysname, @p1> <Datatype_For_Param1, , int> = <Default_Value_For_Param1, , 0>,
<@Param2, sysname, @p2> <Datatype_For_Param2, , int> = <Default_Value_For_Param2, , 0>
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

-- Insert statements for procedure here
SELECT <@Param1, sysname, @p1>, <@Param2, sysname, @p2>
END
GO

Very lovely – except it contains the very code that disables the proc return values. Ouch.