Tuesday, November 29

Test Driven Development with SQL Server Databases

Test-driven development is easy when you're programming in a language such as C# or Java. Tools such as NUnit (or JUnit for Java) help you write tests that can be run regularly to ensure that changes you make to your application don't break other things.

However, when much of your business logic is in the form of SQL Server 2000 stored procedures, unit testing can be more problematic, and testing tends to be done on a more ad-hoc basis. Subtle changes in table or view structures can break things very easily.

To simplify testing and allow test-driven development, where tests are written first, I propose creating a test stored procedure for each real stored procedure, as follows:

1) Begin a transaction
2) Set up any initial data
3) Run the stored procedure
4) Verify the data in the database
5) Display a success or failure message
6) Rollback the transaction
7) Return a value for success or failure

It'll be easy to write a GUI front end that runs all the test procedures and displays red or green bars based on their return values. As each test runs in its own transaction that is rolled back whether it succeeds or fails, the database ends up in the same state it started in.
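As a minimal sketch of this pattern, here is roughly what a test procedure might look like for a hypothetical usp_AddOrder stored procedure (the procedure, table and column names are all invented for illustration):


-- Test for the hypothetical usp_AddOrder procedure.
-- Returns 0 for success, 1 for failure, and rolls back all changes.
CREATE PROCEDURE test_usp_AddOrder
AS
BEGIN
    DECLARE @result int
    SET @result = 1

    -- 1) Begin a transaction
    BEGIN TRANSACTION

    -- 2) Set up any initial data
    INSERT INTO Customers (CustomerID, Name) VALUES (9999, 'Test customer')

    -- 3) Run the stored procedure under test
    EXEC usp_AddOrder @CustomerID = 9999, @Quantity = 5

    -- 4) Verify the data in the database
    IF EXISTS (SELECT * FROM Orders WHERE CustomerID = 9999 AND Quantity = 5)
        SET @result = 0

    -- 5) Display a success or failure message
    IF @result = 0
        PRINT 'test_usp_AddOrder: PASS'
    ELSE
        PRINT 'test_usp_AddOrder: FAIL'

    -- 6) Roll back the transaction, leaving the database unchanged
    ROLLBACK TRANSACTION

    -- 7) Return a value for success or failure
    RETURN @result
END


The GUI runner could then execute every procedure whose name starts with test_ (or however you choose to identify them) and treat a return value of 0 as a green bar.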

Monday, November 28

Word automation - replace a placeholder with a file

There appears to be surprisingly little in the way of tutorials on the web on the subject of controlling Word from a Windows application.

I was trying to get Word to load a document and replace the string "<>" with another document. The first document was a standard template, and the second was generated by my application. I followed a five-step process:

1) Generate a Word macro
2) Look up any constants
3) Copy this to C# code, adding a reference to the Word object library to your application
4) Create reference objects for any function parameters
5) Encapsulate this in a C# function

1) Generating a Word macro

The best way I found to get started was to record a macro in Word and view its source. This gives you a clue as to the names of the required functions, which didn't prove easy to find any other way.


With Selection.Find
    .Text = "searchtext"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindAsk
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute
Selection.InsertFile FileName:="myfile.doc", _
    Range:="", ConfirmConversions:=False, _
    Link:=False, Attachment:=False


2) Looking up constants

It was then necessary to convert this into C# code. The first thing was to find out what the value of wdFindAsk was - entering this into a search engine gave a page that listed its value as zero.

I'd already set up a Word Automation class and opened the document using the following code:


// wordApp, oDoc and missing are fields on the Word automation class
wordApp = new Word.Application();
object o = filename;
oDoc = wordApp.Documents.Open(ref o,
    ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing,
    ref missing, ref missing);
wordApp.Visible = true;


3) Converting the macro to C#

To change the macro into C#, I noted that the macro first set up the Selection.Find object and then called the Execute function.

Setting up the Selection.Find object meant prefixing Selection with wordApp. This became:


wordApp.Selection.Find.Text = placeholder;
wordApp.Selection.Find.Replacement.Text = "";


To call the Execute function, I typed "wordApp.Selection.Find.Execute" into Visual Studio and noted all the parameters it required. The only one that seemed to matter was the first one, the text to find. Word functions seem to take all their arguments in the form ref object, so a missing object to pass for the other arguments needed to be set up:

4) Creating reference objects


// 'missing' is an object field on the automation class
missing = System.Reflection.Missing.Value;


I then had to insert the file. Again, I took the macro as a starting point, and typed wordApp.Selection.InsertFile into Visual Studio to see what arguments were needed. I set up objects for an empty string and the Boolean false, and called the function in the form


wordApp.Selection.InsertFile(filename,
ref emptystring, ref falseobject,
ref falseobject, ref falseobject);


5) Encapsulating this in a C# function

The code for the complete ReplacePlaceholder function was as follows:


// Replace the placeholder in the document we have
// opened with the contents of another file

public void ReplacePlaceholder(String filename,
    String placeholder)
{
    object falseobject = false;
    object emptystring = "";
    object osearchtext = placeholder;

    // Find the placeholder text

    wordApp.Selection.Find.ClearFormatting();
    wordApp.Selection.Find.Text = placeholder;
    wordApp.Selection.Find.Replacement.Text = "";

    bool found = wordApp.Selection.Find.Execute(
        ref osearchtext,
        ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing,
        ref missing, ref missing);

    // Does the search text exist?

    if (found)
    {
        // Insert the new document at the selection
        wordApp.Selection.InsertFile(filename,
            ref emptystring,
            ref falseobject, ref falseobject, ref falseobject);
    }
}

Thursday, November 24

HttpRuntime.Cache in Windows Application

I read a great tip on Scott Hanselman's Blog.

You can use the Cache class from within a windows application just as easily as a web application. This will be a lot more powerful than caching objects into a global Hashtable, as expiration policies can be set to ensure you don't use too many resources, and dependencies can be set using the CacheDependency class, forcing cached objects to be reloaded if something happens in the environment.

If you get the error "Unable to load DLL (aspnet_isapi.dll)", you can get the Cassini web server from here.

Here is some code to set up a cached item with a file dependency (in a Windows application this needs a reference to System.Web, plus using directives for System.Web.Caching and System.IO):


private void button1_Click(object sender,
    System.EventArgs e)
{
    // Set cache

    CacheDependency dependency =
        new CacheDependency("c:\\temp\\b.txt");
    String cached = "cached";
    Cache cc = HttpRuntime.Cache;
    cc.Remove("a");
    cc.Insert("a", cached, dependency);
}

private void button2_Click(object sender,
    System.EventArgs e)
{
    // Display cache

    Cache cc = HttpRuntime.Cache;
    String c = (String)(cc["a"]);
    label1.Text = c;
}

private void button3_Click(object sender,
    System.EventArgs e)
{
    // Change dependency file

    StreamWriter s = System.IO.File.CreateText
        ("c:\\temp\\b.txt");
    s.WriteLine("a");
    s.Flush();
    s.Close();

    // Cache should be reset to nothing next
    // time you click button 2
}
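The Insert method also has an overload that takes absolute and sliding expiration parameters, which is where the expiration policies mentioned above come in. As a rough sketch, reusing the cc and cached variables from button1_Click:


// Expire the item ten minutes after it is added
cc.Insert("a", cached, null,
    DateTime.Now.AddMinutes(10),
    Cache.NoSlidingExpiration);

// Or expire it if it hasn't been read for five minutes
cc.Insert("b", cached, null,
    Cache.NoAbsoluteExpiration,
    TimeSpan.FromMinutes(5));


A CacheDependency can be passed in place of null to combine a dependency with an expiration policy.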

Well-being and social capital in the workplace

This interesting report (found via here and here) describes studies that examine the factors relating to job satisfaction.

It seems that by far the most important factor determining satisfaction is trust in management. The more openness there is, the more management will be trusted. Methodologies like Scrum make the status of projects very visible, and improve the satisfaction of those working on them (and hence the chance of success of the projects).

The worrying thing for some is that a feeling of doing better than one's peers is also positively correlated with satisfaction, and many believe that having exclusive access to information gives them this advantage. It's a difficult job to convince them that the successful projects that come from adopting open methodologies will ultimately do more for them than the feeling of power that comes from exclusive access to information.

Monday, November 21

ObsoleteAttribute

When changing code it's difficult to be sure there aren't any calls to functions that you are now replacing. Search and replace isn't always perfect, especially when new functions have the same name but different arguments.

You might not want to remove the old functions immediately. They might have worked well, and you might need to think about what you will need to test to minimise risks.

The [Obsolete] attribute is very useful here: applying it to the old functions flags any remaining calls as compiler warnings.
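A minimal example (the class and method names are invented for illustration):


using System;

public class PriceCalculator
{
    // Old version - any remaining callers now get a compiler warning
    // pointing them at the replacement
    [Obsolete("Use GetPrice(int quantity, decimal discount) instead")]
    public decimal GetPrice(int quantity)
    {
        return GetPrice(quantity, 0m);
    }

    // New version with a different signature
    public decimal GetPrice(int quantity, decimal discount)
    {
        return (quantity * 9.99m) - discount;
    }
}


Passing true as a second argument to the attribute turns the warning into a compile error once you're ready to force callers to move over.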

Saturday, November 19

Price as Signal

Joel Spolsky describes how pricing sends a signal as to whether a product is good. People don't buy cheaply priced products because they perceive them to be "cheap" and therefore not very good.

For products where physical manufacturing costs are low, such as music, you should charge each customer as much as they are prepared to pay. If you sell for a low price to someone who would not have bought your product at all at a high price, you still make a profit.

The book industry manages to do this without making books appear cheap by having two versions: a hardback (expensive) version and a paperback (cheap) version released later. Keen readers of an author will buy the hardback as soon as it comes out; more casual readers will wait for the paperback. Music could do the same thing. It's a bit more difficult with online downloads, but maybe people who buy early at a premium price could have access to websites with exclusive pictures and interviews with the band, while later purchasers (at a cheap price) would only get the music itself.

Thursday, November 17

SQL Server 2005, Visual Studio 2005 and Biztalk Server 2006 Launch Event - London

On Monday, I attended the SQL Server 2005, Visual Studio 2005 and Biztalk Server 2006 Launch event in London.

1000 people crammed into the Novotel in Hammersmith to hear Bruce Lynn introduce the event and explain the theme of "Organisational Productivity". Previous products have concentrated on individual productivity, but the idea behind this release is to encourage the organisation as a whole to become more efficient by reducing cultural barriers to communication and allowing people to work more efficiently in teams.

This was followed by four more technical presentations. The first of these, by Maris Berzins, described Biztalk 2006. This is currently still in beta, but will be released in Q1 2006. The main enhancements seem to be a management console based on MMC, application level management and better integration with Sharepoint.

A demonstration showed how you can see workflows with the number of people at each stage of the workflow, and be notified if needed – e.g. if orders cannot be fulfilled because you are waiting for stock, you can take a business decision to order more stock. The business rules can be easily coded using .NET programming languages.

Many business processes involve complex workflows, and can quite easily get stuck at a stage, for example, if the person responsible is on holiday. Biztalk could keep track of these stages and allow action to be taken to reroute work as necessary. It is also easy to report on the situation, so if a particular stage of a process proves a regular bottleneck, it could be redesigned.

The second presentation, by Keith Burns, described various ways in which SQL Server 2005 is more fault-tolerant than its predecessors. SQL Server 2005 can keep working in part when parts of the system fail. You should put filegroups on different physical disks to get the benefits from this. Parts of tables can be partitioned between different filegroups.

Databases can now be mirrored to a different server for faster recovery when the database fails, without the need for hardware support.

There is better support for peer to peer replication, so that data can be updated locally, and propagated around your network without the need for a central distributor.

The presentation also described how concurrency has been improved. This focussed on the fact that index rebuilds do not lock tables, and that there is support for versioning as well as locking, which improves performance by reducing the number of transactions in which deadlocks occur.

When we upgrade to 2005, it seems like it's certainly worth considering filegroups a lot more than I have done in the past, and distributing files in a way that maximises both fault-tolerance and performance.

The third presentation, by Mark Quirk, described Visual Studio Team System (the most expensive version, aimed at large development teams) and the tools within it to allow teams to work together. There are different versions of this for a number of roles (Solution Architect, Software Developer, Tester, and Infrastructure Architect). Views are available describing the status of the project for lower forms of life such as project managers and business sponsors. Typically these are created using Sharepoint. Microsoft seem to think that everyone has a job that fits nicely into one of those roles.

Each person in the process can see a view of the project from their perspective, and there are also business intelligence functions, e.g. to allow the numbers of bugs to be graphed.

There are two development methodologies built into the system, Agile and CMMI, and rules can be set up to, for example, make source code conform to certain specifications before it is checked into the source control system (which replaces SourceSafe). There is also basic unit-testing functionality in the system.

The testing tools were then demonstrated and looked impressive. You could record a test script using a special version of IE. You can then change your script so that forms might read from a database (e.g. if you want to try and log on to a system multiple times with different accounts). You can set up tests to run in different browsers and simulate different network bandwidths. This could be used for load testing a system with 80% of users on IE, 20% on Netscape, 60% on broadband and 40% on dial-up connections. Graphs can be generated of overall response times, and this could be scripted to run regularly so that any performance trends could be understood.

The next part of the presentation looked at Smart Clients. Traditionally, there has been a trade-off between thick client applications, which give a rich user experience but are difficult to deploy and web-based applications which are easy to deploy, but have a less rich experience. Microsoft’s preference is to develop rich applications but greatly simplify the deployment process, so that when an application is running it automatically detects updates and updates itself. They seemed to prefer this approach to AJAX type technologies where applications run in a web browser but use JavaScript and web services to give a richer and more responsive user experience.

After that, ASP.NET master pages were discussed. Master pages are another way of creating pages with a standard layout. A content page inherits its layout from a master page, which contains the headers and footers plus a ContentPlaceHolder tag where the content page's markup is inserted. This looks marginally easier to use than traditional include files.
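As a rough sketch of how the two pieces might fit together (the file names and markup are invented for illustration):


<%-- Site.master : the shared layout --%>
<%@ Master Language="C#" %>
<html>
<body>
    <h1>Standard header</h1>
    <asp:ContentPlaceHolder ID="MainContent" runat="server" />
    <p>Standard footer</p>
</body>
</html>

<%-- Default.aspx : a content page that uses the master --%>
<%@ Page Language="C#" MasterPageFile="~/Site.master" %>
<asp:Content ContentPlaceHolderID="MainContent" runat="server">
    <p>This content appears between the header and footer.</p>
</asp:Content>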

Finally, there have been some improvements in security, and it is easier to hide parts of a web site, such as options on a navigation menu that should only be visible to certain security roles.

I think the main thing I'll look at will be the testing tool. It's difficult at the moment to simulate many concurrent logins with different browsers and connections over varying network bandwidths. Testing could be scripted to run overnight, and response times and any problems could be emailed automatically to developers before they became too serious.

I'm less convinced about Smart Clients, however. I think AJAX type technologies are more able to deliver responsive systems that are easy to deploy. They also have the advantage of being more platform independent.

The final presentation, by Rob Gray and Mark Anderson, examined the business intelligence functions that are integrated with SQL Server 2005 Enterprise edition. The aim in SQL Server 2005 is to move this from being a tool that only a few high level strategic decision makers use towards being a more general tool.

There are three parts to this: Integration Services, Analysis Services and Reporting Services.

Integration Services replaces the current DTS packages, and allows data to be cleansed as it is imported into a SQL Server database. "Fuzzy lookups" were demonstrated, where records that are an approximate match for existing records in a database can be found (e.g. names that are spelled slightly differently).

Analysis Services aim to create a “Unified Dimensional Model” of a database that can be queried by OLAP type queries, and Reporting Services allow this to be queried and drilled-down on in various user friendly ways.

An impressive demonstration showed "fuzzy lookups" and how they could be used in the common situation where two database tables have to be matched but the data differs slightly.

Overall, I felt there were many "quick wins" where introducing these technologies could provide a rapid benefit. I was less convinced by the team system functionality, as I believe teams should organise themselves to reflect the strengths and characters of their members, and not have roles and ways of working imposed on them. It looks like it's going to be an involved process for a smaller team to set this up for their own ways of working.

However, I'm looking forward to receiving my free copy of Visual Studio Professional and SQL Server Standard in the post any time soon, when I'll get the chance to see if it's really as easy to use as the demonstrations made out.

Hundred Dollar Laptop

I was delighted to read of a project to distribute a 100 dollar laptop computer to children in developing countries.

Giving these children access to information and the chance to learn more effectively will, I think, help them out of poverty. More educated people would be more likely to elect better governments and would know how to make the best use of the resources available to them.

For $100, a laptop would have an open-source operating system, presumably Linux. Large numbers of these would mean that software publishers have to write software for these laptops. Would the advent of this software mean Linux becomes more common on laptops and desktops in the West?

Sunday, November 13

Software testing and risk reduction

Joel Spolsky describes five different worlds of software development, each with their own problems, and describes how we should consider which world we are operating in when deciding how to go about developing something.

Often, people move from one world to another, from e.g. shrinkwrap development to internal applications, but still think the approach they initially learned is the correct one. This is unlikely to be the case. An application used by a small number of people dealing with medical data will need to be developed in a different way to a program used by many thousands of people to design their gardens.

The former will need to ensure that data does not become corrupt, and is kept securely. The latter will need to run on a wide variety of operating systems and be easy and intuitive to use. The two different types of applications call for different testing methodologies.

When testing, you should consider what the risks are in your application, and then ask, for each risk:
  • How many people might be affected if it goes wrong?
  • What is the chance it will go wrong?
  • What is the cost per person if it does go wrong?


You can then multiply the three figures together to get a measure of exposure for each risk, and work out a strategy to reduce the significant ones. It may be that this is to use test-driven development when developing the software. It may be that it is to invest in usability labs, to ensure people find your application easy to use. It may be that it's best to test on a wide range of different hardware configurations.

If none of the risks are significant, as with throwaway applications, you need to do very little. If all of them are, you should do a lot more testing.
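As a rough sketch of the arithmetic, with invented figures for three hypothetical risks:


using System;

class RiskCalculation
{
    static void Main()
    {
        // exposure = people affected x chance of failure x cost per person
        double corruptMedicalData = 50 * 0.05 * 2000.0;  // = 5000
        double confusingGardenUi  = 10000 * 0.30 * 1.0;  // = 3000
        double throwawayScriptBug = 3 * 0.50 * 5.0;      // = 7.5

        Console.WriteLine(corruptMedicalData); // worth serious testing effort
        Console.WriteLine(confusingGardenUi);  // worth time in a usability lab
        Console.WriteLine(throwawayScriptBug); // not worth much effort at all
    }
}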

Thursday, November 10

Cannot resolve collation conflict for equal to operation

I had the error message "Cannot resolve collation conflict for equal to operation" today when exporting tables from one SQL Server database into another. It seems this was because the collation settings of the two databases were different.

I found the script below here; it loops through information_schema.columns (for base tables only) and calls ALTER TABLE ... ALTER COLUMN to reset the collation of every character column to the database default. (I've updated it slightly so that it resets nvarchar and char columns as well as varchar.)


ALTER PROCEDURE p1 AS

DECLARE COL_CURSOR CURSOR READ_ONLY FOR
    SELECT table_schema, table_name, column_name,
           column_default, is_nullable, data_type,
           character_maximum_length, collation_name
    FROM information_schema.columns
    INNER JOIN
        (SELECT TABLE_NAME TN
         FROM information_schema.tables
         WHERE TABLE_TYPE = 'BASE TABLE') IT
        ON (TABLE_NAME = TN)
    WHERE data_type = 'varchar'
       OR data_type = 'nvarchar'
       OR data_type = 'char'

DECLARE @table_schema varchar(10),
        @table_name varchar(100),
        @column_name varchar(100),
        @column_default varchar(100),
        @is_nullable varchar(5),
        @data_type varchar(100),
        @character_maximum_length varchar(10),
        @collation_name varchar(200)

DECLARE @Execstr VARCHAR(2000)

OPEN COL_CURSOR

FETCH NEXT FROM COL_CURSOR INTO @table_schema,
    @table_name, @column_name,
    @column_default, @is_nullable, @data_type,
    @character_maximum_length, @collation_name

WHILE (@@fetch_status > -1)
BEGIN
    IF (@@fetch_status > -2)
    BEGIN
        SET @Execstr = 'ALTER TABLE ' + @table_schema
            + '.' + @table_name
            + ' ALTER COLUMN [' + @column_name + '] '
            + @data_type + ' (' + @character_maximum_length + ') '
            + CASE WHEN @is_nullable = 'no' THEN ' NOT NULL'
                   ELSE ' NULL ' END

        EXEC (@Execstr)
        PRINT ('Executing -->' + @Execstr)
        PRINT ('Orig COLLATION WAS -->' + @collation_name)
    END

    FETCH NEXT FROM COL_CURSOR INTO @table_schema,
        @table_name, @column_name,
        @column_default, @is_nullable, @data_type,
        @character_maximum_length, @collation_name
END

CLOSE COL_CURSOR
DEALLOCATE COL_CURSOR
GO

Wednesday, November 9

Avoiding images being cached

I saw a question on the Developer Fusion web site today.

This reminded me of a problem we had with image files that were generated from a database, containing graphs of web site usage.

We found the images were frequently cached, and old graphs were shown to our users. To resolve this, we added a QueryString containing a random number after the image file name in the HTML, using simple C# code as follows:

<% System.Random r = new System.Random(); %>
<img src="myimage.gif?<% Response.Write(r.NextDouble()); %>">

Profilers and The Perils of Micro-Optimization

Ian Griffiths writes about the perils of optimising code where it is not required. He correctly points out that most parts of a typical application are not perceived as too slow, and should be optimised for reliability and maintainability, not speed.

Those that are too slow are typically too slow because of infrastructure issues, and code is responsible in only a small number of cases.

Ian criticises profilers as they encourage developers to chase performance gains where they are not a serious problem. However, I'd recommend running code through a profiler occasionally as a learning experience. Early in my programming career, I had some code with performance problems. Running it through a profiler, I found one loop was slow. This was because I had set it up in the form:

for (int i = 0; i < strlen(s); i++) { s[i] = 'A'; }

I hadn't realised that strlen was evaluated on every iteration of the loop, and that the time it takes is proportional to the length of the string, which makes the whole loop quadratic. The profiler found this was the cause of the bottleneck, and I learned from the experience to move the call to strlen() outside the loop.

For a production web site, I'd recommend working out what the acceptable time is to load each page. This should be based on how real users make use of each page, not what you think might be technically possible.

You should then record the time taken by a percentage of requests, chosen at random. If these are greater than the acceptable level, then an alert should be automatically sent to the developer. There may be many causes of slowness, e.g. unusually high load, databases not being properly indexed, etc., and it's not possible to anticipate them all, but this will ensure you spend your time tackling issues that are causing users genuine problems and minimise the numbers of users affected by them.
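As a rough sketch of the sampling idea (the class name, threshold and logging are invented for illustration; in a real site the calls would sit in a page base class or HttpModule and the alert would be an email rather than a console line):


using System;
using System.Diagnostics;

// Times a randomly chosen percentage of requests and flags any
// that exceed the acceptable load time agreed for the page.
public class RequestTimingSampler
{
    private static Random random = new Random();

    private double sampleRate;        // e.g. 0.05 = time 5% of requests
    private TimeSpan acceptableTime;  // agreed acceptable load time

    public RequestTimingSampler(double sampleRate, TimeSpan acceptableTime)
    {
        this.sampleRate = sampleRate;
        this.acceptableTime = acceptableTime;
    }

    // Returns a running stopwatch if this request was chosen for
    // sampling, otherwise null.
    public Stopwatch BeginRequest()
    {
        if (random.NextDouble() < sampleRate)
        {
            return Stopwatch.StartNew();
        }
        return null;
    }

    // Call at the end of the request with the stopwatch (if any)
    // returned by BeginRequest.
    public void EndRequest(string pageName, Stopwatch timer)
    {
        if (timer == null) return;

        timer.Stop();
        if (timer.Elapsed > acceptableTime)
        {
            Console.WriteLine("SLOW PAGE {0}: {1} ms",
                pageName, timer.ElapsedMilliseconds);
        }
    }
}


Sampling only a small percentage of requests keeps the measurement overhead low while still catching the slowness that real users see.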

Monday, November 7

Internet Explorer Developer Toolbar

Oliver Sturm writes about the Internet Explorer Developer Toolbar.

This looks like it will save a lot of time debugging sites and finding things that you might not notice during development, but that might cause problems when a site goes live, such as large image file sizes.

It's not perfect yet, but hopefully the bugs will be fixed, and it will evolve over time into a standard part of IE.

Sunday, November 6

The Best Software Writing I

I’ve just finished reading The Best Software Writing I: Selected and Introduced by Joel Spolsky.

This is a collection of 29 essays about software development by a range of authors, some of whom have published works of their own. They are all available on the web if you look for them, but having them in book form makes them a lot easier to read on the train.

Of particular note was Danah Boyd's essay entitled "Autistic Social Software", in which the author describes three ways to make technology work in the context of people: to demand it fits into people's lives, to see what catches on and evolve, and to understand people's needs first. Most web sites, especially in the dot-com boom a few years ago, used the first approach. More modern web sites are using the second and third approaches, to better reflect the way people do things, and the technologies are evolving in the same direction.

Related to this is Eric Sink's essay entitled "Closing the Gap", in which the author describes how you should develop in the open, with public discussions and readily available preview downloads. This combines Boyd's second and third approaches. The weakness of understanding people's needs first is that people often can't express their needs until they see what is possible. If technology is developed in the open, people can better express their needs and the technology can be adapted.

Recommended.

Wednesday, November 2

Temporary Outlook Buttons

I have written a C# add-in that creates a button in Outlook. There was a problem with this: if you closed Outlook, uninstalled the add-in and then restarted Outlook, the button was still there but did not do anything. I had tried without success to delete the button in the OnDisconnection event handler, but found a much simpler fix:

1) Detect if the button is present in the OnConnection event handler.
2) Remove it if it is present.
3) Create a temporary button using code like this:

bool temporary = true;
this.toolbarButton = (CommandBarButton)commandBars["Standard"].Controls.Add(
    1, System.Reflection.Missing.Value, System.Reflection.Missing.Value,
    System.Reflection.Missing.Value, temporary);

The temporary button will be deleted when Outlook closes. There isn't any need to do anything in any of the other event handlers in the IDTExtensibility2 interface.
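For steps 1 and 2, a rough sketch of detecting and removing a leftover button might look like the code below. It assumes the add-in gives its button a known caption ("My Add-in Button" is invented for illustration - use whatever caption your add-in sets):


CommandBarControls controls = commandBars["Standard"].Controls;

// Walk backwards so that deleting a control doesn't upset the loop
for (int i = controls.Count; i >= 1; i--)
{
    CommandBarControl control = controls[i];
    if (control.Caption == "My Add-in Button")
    {
        control.Delete(System.Reflection.Missing.Value);
    }
}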