# Monday, March 9, 2009

Episode 14

Jim Holmes takes a break from organizing and coordinating CodeMash to talk about the conference.

2 minutes, 38 seconds

Monday, March 9, 2009 4:11:04 AM (GMT Standard Time, UTC+00:00)
# Saturday, March 7, 2009

Episode 13

Matt Pizzimenti is involved in a2geeks.org - a social network for techies and entrepeneurs.  He discusses it in this interview.

Saturday, March 7, 2009 12:45:47 PM (GMT Standard Time, UTC+00:00)
# Thursday, March 5, 2009

Episode 12

Mark Hindsbo is the General Manager of the Microsoft Evangelism group in the United States. In this interview, Mark talks about what Microsoft can do to help the developer community succeed.

2 minutes, 24 seconds

Thursday, March 5, 2009 11:52:25 AM (GMT Standard Time, UTC+00:00)
# Wednesday, March 4, 2009

I woke up this morning and was surprised and deleted to see the following e-mail in my inbox

Congratulations! We are pleased to present you with the 2009 NVP Award! The NVP Award is our way to say thank you for promoting bad behavior and general revelry in the office. We appreciate your extraordinary efforts at “Video interviewing everyone on the planet” during the past year..

To celebrate this tremendous achievement, please join your other NVPers at the NVP Summit at SRT Solutions Thursday afternoon where we will talk about the future of <__>, drink lots of <__>, play a little <__>, and not do any <__>. As we NVPers like to say – “Anything that happens at NVP Summit, stays at NVP Summit”. NDA restrictions apply. You will also receive a special gift for being a NVP and some useless swag.

--Mike (NVP Regional Director)

I cannot tell you how proud I am to receive this award. I certainly would have dreamed of it all my life had I heard of it before yesterday.

I would be thrilled to share all I learn with you - my readers - were it not for the attached 423-page non-disclosure agreement I was forced to sign. But rest assured that this knowledge and its accompanying free drinks will ultimately help the community and be good for all software developers.

Thank you to the committee, to those who nominated me and seconded my nomination and especially to <__>, without whom I would not have won this award.

Wednesday, March 4, 2009 12:58:00 PM (GMT Standard Time, UTC+00:00)
# Tuesday, March 3, 2009

Episode 11

After writing a distributed application, software architect Phil Japikse needed a way to deploy updates to users across the state.  In this conversation, Phil describes the deployment strategy he implemented using tools provided by the .Net framework. 

Tuesday, March 3, 2009 11:55:01 AM (GMT Standard Time, UTC+00:00)
# Friday, February 27, 2009

Episode 10

Tim Adams of Microsoft was filming a series of "Man On The Street" video interviews at CodeMash when I caught up with him. I interviewed him while he interviewed me and we each filmed the conversation.

Here is the conversation from my side:

Here is the same conversation from Tim's point of view:

Friday, February 27, 2009 5:05:59 AM (GMT Standard Time, UTC+00:00)
# Thursday, February 26, 2009

In my last article, I defined the basic concepts of database, table, column and row. Using these constructs, you can organize data into a rectangular format. This paradigm often works really well, because

  • You can group related information into a single container (a table)
  • Each row represents a single entity (such as a customer, employee, or invoice) and
  • Each column represents an attribute of the entity (such as FirstName, LastName, or TotalSales).

Using this model, we can create a table containing information about a customer's purchases. Each row in this item might represent a single purchase.

When a customer purchases an item, we probably would want to store some information about that purchase. These bits of information about each purchase are attributes of the purchase and are therefore candidates for columns. Below are examples of the information we might want to save about a customer's purchase.

  • Date of Purchase
  • Customer First Name
  • Customer Last Name
  • Customer Street Address
  • Customer City
  • Customer Zip Code
  • Item Purchased
  • Quantity Purchased
  • Price per Item

We can create a table CustomerPurchase with a column for each of the above attributes and begin populating with data each time a customer purchases something. The data would look something like this:

PurchaseDate Customer
ItemPurchased Quantity PricePerItem
2/26/2009 John Smith 123 Elm Bigg City 48222 Lamp 1 40
2/26/2009 Bill Jones 456 Maple Smallville 48333 Chair 2 100
2/26/2009 Mary Brown 789 Oak Middleton 48444 Table 1 50

This model seems to capture the information we want. Do you see any problems with it?

What happens if a customer orders more than one item? If John Smith purchases a Chair in addition to his Lamp, we can just add another row to the table, like so.

PurchaseDate Customer
ItemPurchased Quantity PricePerItem
2/26/2009 John Smith 123 Elm Bigg City 48222 Lamp 1 40
2/26/2009 Bill Jones 456 Maple Smallville 48333 Chair 2 100
2/26/2009 Mary Brown 789 Oak Middleton 48444 Table 1 50
2/26/2009 John Smith 123 Elm Bigg City 48222 Chair 1 100
2/27/2009 John Smith 123 Elm Bigg City 48222 Table 1 50

But notice that now we are storing John Smith's name and address multiple times.  Assuming John Smith will never change his name, this is a waste of space.  Granted, this isn't very much wasted space when we have only a few orders, but imagine a system with thousands of customers and millions of orders.  Do you really want all that redundant information cluttering up your database?

Also, imagine that we want to correct an error in the spelling of John's name.  With the current model, we must correct that error three times due to the redundant storage.

To address these issues, we can normalize the data.  Data normalization refers to structuring our data in order to remove redundancy. 

In our example, we accomplish this by creating a table of customers with the following structure

  • FirstName
  • LastName
  • StreetAddress
  • City
  • ZipCode

and and moving the customer data to this table - one row per customer.

FirstName LastName StreetAddress City ZipCode
John Smith 123 Elm Bigg City 48222
Bill Jones 456 Maple Smallville 48333
Mary Brown 789 Oak Middleton 48444

Then we add an extra column to the ustomerPurchase tab table.  This new column is special in that the value in it will uniquely identify each row - in other words, no two rows will have the same value.  This unique column goes by many names but we will call it a Primary Key here.  In this case, the Primary Key column will be named "CustomerID" and will hold an integer.

CustomerID FirstName LastName StreetAddress City ZipCode
1 John Smith 123 Elm Bigg City 48222
2 Bill Jones 456 Maple Smallville 48333
3 Mary Brown 789 Oak Middleton 48444

Now we can go back to the ustomerPurchase tab table, and replace the columns that describe customer with a column to hold the CustomerID.  This replacement column is known as a "Foreign Key".  It references a Primary Key in another table and is used to point to a single unique record in that other table.

PurchaseDate CustomerID ItemPurchased Quantity PricePerItem
2/26/2009 1 Lamp 1 40
2/26/2009 2 Chair 2 100
2/26/2009 3 Table 1 50
2/26/2009 1 Chair 1 100
2/27/2009 1 Table 1 50

This is all we need because, given the CustomerID, we can look in the Customer table, find the record for that customer and get all information about that customer.

This concept of using a key value to point to a row in another table is known as a relationship.  We say that the Customer table is related to the CustomerPurchase tab table. 

This type of relationship is known as a one-to-many relationship, every customer may have many orders.  In this type of relationship the table with one row is known as the parent and the table with (potentially) many rows is known as the child table.  

This relationship is typically represented by a drawing similar to the one below.

Organizing data in this way can make storage of that data far more efficient and flexible.

Thursday, February 26, 2009 1:17:33 PM (GMT Standard Time, UTC+00:00)
# Wednesday, February 25, 2009

Episode 9

Mike Wood is the Lead Director for the Cincinnati .Net User Group.  He and I spoke about how to build a strong community and what makes the Microsoft Heartland Community so special.

Wednesday, February 25, 2009 11:53:42 AM (GMT Standard Time, UTC+00:00)
# Tuesday, February 24, 2009

In this article, we will define a database, a table and the main parts of a table - rows and columns.

A database is an organized (or structured) collection of information.

Most companies spend a lot of time and effort collecting information and storing it somewhere, but not all that information is organized, which makes it difficult to retrieve anything relevant later on.  A database adds structure to the information making it easier to maintain and query it.

Database engines like SQL Server provide a structure to organize data in a way that makes sense to a user.  Specifically, SQL Server uses a relational model* to organize its data.

In a relational database, data is partitioned into tables.  Tables are a way of data storing data in rows and columns, kind of like in an Excel worksheet. 

Figure 1 - Data in an Excel workbook

I've always found this rectangular view of data very intuitive.

Just as in a workbook, each table row represents a discrete record.  All information in that row serves to describe the row.  

Similarly, a table column is a placeholder that describes a single attribute for each row.  The advantage SQL Server has over Excel is that you can easily place rules onto a column, restricting the type of data that can be stored there. 

If a SQL Server column is designed to hold a date, a property of that column can be set to throw an error if a person or program tries to store a string.   We can set up such restrictions for many data types, so that a column can be restricted to allow only integers, only TRUE/FALSE values, or only binary objects.  We can even restrict the maximum length of a string or require users to always enter a value into a column - all simply by setting properties on the column.**

For example, a table named "Customers" might be used to store information about your company's customers.  Each row in this table would represent a single customer.  Each column would hold an attribute of that customer, so you could create columns such as FirstName, LastName and StreetAddress that would hold the appropriate values for each customer. 

Figure 2 - Data in a SQL Server table

Looking at the first row, gives us information about the customer.  It should be obvious that this customer has a first name of "David", a last name of "Giard" and an address of "123 Main St".

*SQL Server does provide some non-relational ways of storing data but those are beyond the scope of this article.
** It is possible to configure Microsoft Excel to restrict data input, but this task is relatively advanced and far more easily accomplished in SQL Server.

Tuesday, February 24, 2009 6:14:48 PM (GMT Standard Time, UTC+00:00)
# Monday, February 23, 2009

Episode 8

Brian Prince joined Microsoft last year as an Architect Evangelist.  In this interview, he talks about the experience and how it is different from any company he has worked for in the past.

Monday, February 23, 2009 11:30:08 AM (GMT Standard Time, UTC+00:00)