 Thursday, February 23, 2006

My colleague John recently blogged about the unexpected results of using the T-SQL IN operator on a set that contains NULL. Here is another solution to the problem. There are three main ways to test for existence in another table, listed here in order of increasing performance:

  1. LEFT JOIN and look for NULLs in the right table
  2. Use the IN operator
  3. Use the EXISTS operator

In general, the EXISTS operator will generate the most efficient query plan, and it isn't subject to the NULL pitfall that plagues the IN operator. The query that John posted:

SELECT COUNT(*)
FROM TableTarget
WHERE PrimaryKeyField NOT IN (
  SELECT ForeignKeyField
  FROM TableSource
  WHERE ForeignKeyField IS NOT NULL
)

could be written like this:

SELECT COUNT(*)
FROM TableTarget t
WHERE NOT EXISTS (
  SELECT *
  FROM TableSource
  WHERE ForeignKeyField = t.PrimaryKeyField
)
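
To see the pitfall in action, here is a minimal sketch that runs the queries above against a tiny in-memory database. It uses SQLite through Python's sqlite3 module rather than SQL Server, on the assumption that NOT IN follows the same three-valued NULL logic in both; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableTarget (PrimaryKeyField INTEGER PRIMARY KEY);
    CREATE TABLE TableSource (ForeignKeyField INTEGER);
    INSERT INTO TableTarget VALUES (1), (2), (3);
    -- One matching row plus one NULL; the NULL is what trips up NOT IN.
    INSERT INTO TableSource VALUES (1), (NULL);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]

# NOT IN without the NULL filter: every comparison against NULL is unknown,
# so no row of TableTarget qualifies.
not_in_unfiltered = count("""
    SELECT COUNT(*) FROM TableTarget
    WHERE PrimaryKeyField NOT IN (SELECT ForeignKeyField FROM TableSource)
""")

# NOT IN with the IS NOT NULL workaround.
not_in_filtered = count("""
    SELECT COUNT(*) FROM TableTarget
    WHERE PrimaryKeyField NOT IN (
        SELECT ForeignKeyField FROM TableSource
        WHERE ForeignKeyField IS NOT NULL)
""")

# NOT EXISTS needs no workaround.
not_exists = count("""
    SELECT COUNT(*) FROM TableTarget t
    WHERE NOT EXISTS (
        SELECT * FROM TableSource
        WHERE ForeignKeyField = t.PrimaryKeyField)
""")

print(not_in_unfiltered, not_in_filtered, not_exists)
```

With this data the unfiltered NOT IN counts 0 rows, while the filtered NOT IN and the NOT EXISTS version both count the 2 unmatched rows.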

posted on February 23, 2006 by Adam Anderson
 Wednesday, February 22, 2006

Danc of Lost Garden has written many interesting and thought-provoking articles, mostly about game design. Today's article, however, is wider in scope, addressing the evolution of product design, and is well worth the read.

posted on February 22, 2006 by Adam Anderson
 Saturday, February 18, 2006

Krugle is a search engine for source code that is supposed to make life easier for us developers. You can sign up for the Beta on their website and try it out sometime next month.  http://www.krugle.com/

posted on February 18, 2006 by Mike Dugan
 Monday, February 06, 2006

Is Courier New just not doing it for you anymore? This page summarizes and rates different fixed-width fonts based on their suitability for use in a programming environment. The reviewer evaluates several factors, including how easily the characters 'l', '1', and 'i' are distinguished, as well as the characters '0', 'o', and 'O'. After browsing several of the handy font previews, simply follow the associated link to download your new favorite programming font. The top-rated font, Bitstream Vera Sans Mono, looks great without ClearType and awful with it, so if you have font smoothing enabled, I recommend you look further down the list for something that plays better with ClearType.

posted on February 6, 2006 by Adam Anderson
 Friday, January 27, 2006

Being able to compare the results of two different SQL queries and verify that they match 100% is a useful ability for any database application developer. I've encountered two main reasons for wanting to compare query results in the course of development:

  1. After importing data from an older system, verifying that equivalent reports in the old system and the new one agree
  2. When refactoring a complex query, to verify that the changes made haven't altered the output

In both cases, I find it convenient to run data comparisons of queries run in SQL Server Query Analyzer by using MS Excel. The technique is simple, but the payoff is huge: 100% confidence that every row and every column matches. Here are the steps I use:

  1. Create a new 3-page workbook and rename the tabs "Old", "New", and "Compare" 
  2. Hand-enter column names in the top row of each worksheet. I like to make them bold and underlined so Excel can tell that they're headers. 
     
  3. Run the "old" query in Query Analyzer. Once the results come back, click in the result pane and select all using Ctrl+A or Edit | Select All
  4. In the "Old" worksheet, select the leftmost cell in the row under the header and paste the results.
  5. Repeat the same steps for the "New" query and worksheet.
  6. If the queries don't explicitly set their own sort order, I recommend sorting both the "Old" and "New" data in a way that should guarantee that they are ordered the same on the two worksheets.
  7. In the "Compare" worksheet, enter the following formula in the leftmost cell under the header row: =Old!A2=New!A2. This formula will return TRUE if the cells match, and FALSE if they don't. 
  8. To make mismatches easier to spot, I like to apply some conditional formatting as well.

    1. Select the cell and then select Format | Conditional Formatting...
    2. Set up the condition so that it reads "Cell Value Is" "equal to" "FALSE"
    3. Click the Format... button. On the Patterns tab, select a nice bright color that will stand out, like red. Click OK.
    4. Click OK on the Conditional Formatting dialog
  9. Select the span from this initial cell to the rightmost column in the result set and press Ctrl+R or Edit | Fill | Right.
  10. Check the "Old" and "New" worksheets and note the number of the last row containing query results.
  11. In the "Compare" worksheet, select the range that starts at the row of formulas you just filled and extends down to the last row number noted in the previous step, then press Ctrl+D or Edit | Fill | Down.
  12. Differences in data will result in a value of FALSE (optionally highlighted if you used conditional formatting). The values can be scanned for visually, or you can use the Find dialog to search for them. When using the Find dialog, make sure to click the Options button and change "Look in" from "Formulas" to "Values" 
  13. Once differences are found, the old and new data can be examined to determine the cause for the difference.
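
For readers who would rather script the check, the same cell-by-cell comparison can be sketched in Python. This is an alternative to the Excel steps above, not part of them: the function name compare_results and the sample rows are hypothetical, and the sketch assumes both result sets have already been fetched as lists of tuples in the same sort order.

```python
from itertools import zip_longest

# Sentinel for a cell that exists on one side only.
_MISSING = object()

def compare_results(old_rows, new_rows):
    """Return 0-based (row, column) positions where the two result sets differ.

    Mirrors the "Compare" worksheet: each cell is checked against the cell
    in the same position on the other side. Extra rows or columns on either
    side are reported as mismatches.
    """
    mismatches = []
    for r, (old, new) in enumerate(zip_longest(old_rows, new_rows, fillvalue=())):
        for c, (a, b) in enumerate(zip_longest(old, new, fillvalue=_MISSING)):
            if a != b:
                mismatches.append((r, c))
    return mismatches

# Made-up sample data: the salary in the second row differs.
old = [(1, "Ann", 50000), (2, "Bob", 60000)]
new = [(1, "Ann", 50000), (2, "Bob", 61000)]
diffs = compare_results(old, new)
print(diffs)
```

An empty list plays the role of a "Compare" sheet with no FALSE cells.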
posted on January 27, 2006 by Adam Anderson

The MSDN Library says this about the DataGrid.Items property:

"Only items bound to the data source are contained in the Items collection. The header, footer, and separator are not included in the collection."

So how do we get to these other items? Most people handle the ItemCommandEvent for the grid, but there is a way to access them directly.

If these items are not in the DataGrid.Items collection, then where are they? To find out, turn on tracing for your ASP.NET webpage. You will see that the grid is rendered something like this:

DataGrid
   DataGridTable
      DataGridItem
         TableCell
         TableCell
      DataGridItem
         TableCell
         LiteralControl
         Label
      DataGridItem
      ...

What we are seeing is that the first object in the DataGrid's Control hierarchy is a DataGridTable. That DataGridTable contains ALL of the DataGridItems, including the header and footer. To get these DataGridItems we just need to grab the first or last control out of the DataGridTable's control collection.

So to get a DataGrid's footer, this code will do the trick:

// First get the DataGridTable (the first control in the DataGrid's control collection).
Control table = DataGrid1.Controls[0];
// Then grab the last control in the DataGridTable's control collection.
DataGridItem footer = table.Controls[table.Controls.Count - 1] as DataGridItem;
posted on January 27, 2006 by Mike Dugan
 Friday, January 20, 2006

I love working with MS SQL Server. Oh sure, Oracle is supposedly faster and all that jazz, but when it comes to rapid development, MSSQL's ease of use just can't be beat. However, there are still some things to watch out for. Today I want to warn you all off of using UNION.

Conceptually, there's nothing wrong with UNION. It takes two result sets with congruent columns and returns them as a single result set. There are two potential performance issues, however, one of which can be quite serious.

The first problem is that the UNION operator automatically performs a DISTINCT operation on the combined result set. If it's unnecessary, or worse, if it is necessary, but you've also used SELECT DISTINCT in one or both of the UNIONed queries, then UNION is causing a loss of performance for nothing. The workaround to this is simple; use UNION ALL instead.
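
The difference in results is easy to demonstrate. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, and the tables and rows are invented for illustration; it shows only the duplicate-removal behavior, not the performance cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (FirstName TEXT, LastName TEXT, Salary REAL);
    CREATE TABLE Managers (FirstName TEXT, LastName TEXT, Salary REAL);
    INSERT INTO Employees VALUES ('Ann', 'Lee', 50000), ('Bob', 'Ray', 60000);
    -- The same person appears in both tables with identical values.
    INSERT INTO Managers VALUES ('Ann', 'Lee', 50000);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute("""
    SELECT FirstName, LastName, Salary FROM Employees
    UNION
    SELECT FirstName, LastName, Salary FROM Managers
""").fetchall()

# UNION ALL simply appends the second result to the first.
union_all_rows = conn.execute("""
    SELECT FirstName, LastName, Salary FROM Employees
    UNION ALL
    SELECT FirstName, LastName, Salary FROM Managers
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

Here UNION returns 2 rows and UNION ALL returns 3, so UNION ALL is only a safe substitute when duplicates are impossible or acceptable.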

The second problem can be far worse. I've run into situations many times where the UNION (or UNION ALL) operator caused inexplicably poor performance. In one recent scenario, I used UNION ALL to merge the results of two queries, each of which ran in about 7 seconds. When merged using UNION ALL, query execution time leapt up to 1:20. Neither the estimated execution plan nor the actual one reflected this; the total subtree cost was exactly the same as the sum of the two queries' costs when run separately. The fix is a little more complex, but extremely effective.

My solution to the problem is to use a table variable. In general, I recommend the use of table variables over temp tables unless you absolutely need a feature that temp tables support but table variables do not.

Example

The example below is not intended to demonstrate good database design; it merely serves to illustrate the solution. Given the original query:

SELECT FirstName, LastName, Salary
FROM Employees
UNION ALL
SELECT FirstName, LastName, Salary
FROM Managers

First, declare a table variable with columns that match those of the result set:

DECLARE @Result TABLE (
  FirstName varchar(50),
  LastName varchar(50),
  Salary money,
  -- Example of how to declare a PK within a table variable
  PRIMARY KEY ( LastName, FirstName )
)

Next, insert each query into the result table:

INSERT @Result
SELECT FirstName, LastName, Salary
FROM Employees

INSERT @Result
SELECT FirstName, LastName, Salary
FROM Managers

Finally, select the results:

SELECT *
FROM @Result

This workaround will not only avoid potential performance problems with UNION, but will actually allow you to improve join performance through the definition of a good primary key, if the UNION result is joined to other data.

posted on January 20, 2006 by Adam Anderson
 Monday, January 16, 2006
DPack is a collection of free tools that extend the VS IDE. I originally got it because of the Delphi keyboard shortcut scheme, but I've come to appreciate it for many of its other features, especially Surround With, which surrounds selected code with many different kinds of code constructs, and the Code Browser, which incrementally searches code for members matching a filter expression. Get it at http://www.usysware.com/dpack/
posted on January 16, 2006 by Adam Anderson
 Friday, January 06, 2006

Add vertical guidelines to Visual Studio by adding a simple registry value under this key:

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\7.1\Text Editor]

Now, add a string value named "Guides".

Set its value to something like RGB(128,0,0) 80, 100

The setting above will draw red guidelines in Visual Studio at columns 80 and 100. You can have up to 13 guidelines. Adjust the values as you see fit.
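
If you would rather import the setting than type it by hand, the value can be wrapped in a .reg file. The sketch below only builds the text of such a file; the helper names guides_value and reg_file are made up for this example, the 13-guideline limit is taken from the note above, and the output has not been tested against Visual Studio.

```python
def guides_value(columns, rgb=(128, 0, 0)):
    """Build the "Guides" string, e.g. RGB(128,0,0) 80, 100."""
    if not 1 <= len(columns) <= 13:
        raise ValueError("expected between 1 and 13 guideline columns")
    r, g, b = rgb
    return "RGB({},{},{}) {}".format(r, g, b, ", ".join(str(c) for c in columns))

def reg_file(columns, rgb=(128, 0, 0), version="7.1"):
    """Return .reg file text that sets the Guides value for one VS version."""
    key = r"HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\{}\Text Editor".format(version)
    return 'Windows Registry Editor Version 5.00\n\n[{}]\n"Guides"="{}"\n'.format(
        key, guides_value(columns, rgb))

value = guides_value([80, 100])
print(reg_file([80, 100]))
```

Save the printed text with a .reg extension and double-click it to merge it into the registry.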

Giving credit where credit is due, I believe Sara Ford first blogged about this about a year ago.

posted on January 6, 2006 by Mike Dugan