Anil on Software: July 2010

Tuesday, July 27, 2010

World Will Run Out of Internet Addresses in Less Than a Year, Experts Predict

Experts predict the world will run out of internet addresses in less than a year, the Sydney Morning Herald reported Monday.

The internet protocol used by the majority of web users, IPv4, provides for about four billion IP addresses -- the unique 32-digit number used to identify each computer
, website or internet-connected device.

There are currently only 232 million IP addresses left -- enough for about 340 days -- thanks to the explosion in smartphones and other web-enabled devices.

"When the IPv4 protocol was developed 30 years ago, it seemed to be a reasonable attempt at providing enough addresses," carrier relations manager at Australian internet service provider (ISP) Internode John Lindsay told the Herald.

"Bearing in mind that at that point personal computers didn't really exist, the idea that mobile phones might want an IP address hadn't occurred to anybody because mobile phones hadn't been invented [and] the idea that air-conditioners and refrigerators might want them was utterly ludicrous."

The solution to the problem is IPv6, which uses a 128-digit address. It would give everyone in the world more than four billion addresses each, but most of the internet industry has so far been reluctant to introduce it.

It would require each device that connects to the internet to be reconfigured or upgraded, with some users even being forced to buy new hardware, the Sydney Morning Herald reported.

In the meantime ISPs may force multiple customers to share IP addresses, which may lead to common applications
, such as Gmail and iTunes, ceasing to work.

There are also fears a black market of IP addresses may spring up.

Wednesday, July 14, 2010

How to Increase Query Speed by 3 Orders of Magnitude with no Indexes

Many years ago, shortly after I had first switched from embedded to relational database development, I saw my boss optimize performance so elegantly, so beautifully that it changed the way I approach the whole of development. He indeed figured out how to cut the running time of a query from about three hours to about 15 seconds without spending any time at all with the mechanics of query optimization.

First, some background. We were developing an integrated solution for managing a vaccine laboratory (using Oracle, but that is not relevant here). We were well along in the development process, basically starting the user acceptance phase. There was a very important screen that was taking about 3 hours to populate. Since this was a transactional screen, with performance like that the system was unusable.

Managers were starting to freak out. Important Meetings were being scheduled. The DBA was hauled onto the carpet, his explanations of "it's only a development instance" ignored. Diagnostics were being run, the database was in extents (in Oracle World, A Very Bad Thing (tm)), database monitors were flashing red. The project sponsor was called in.

My boss, a true genius, kept his head while others were losing theirs. He calmly went to the business people, the lab techs who were going to actually use the system. He asked a simple question: does the data populating this screen need to be current as of this instant, or would data as of the previous midnight be OK?

"As of midnight would be perfectly fine", the business people said. My boss then created a snapshot (an indexed view in MSSQL, now called a materialized view in Oracle) based on the query, to be run at midnight every day. The screen was now populated in 15 seconds, as opposed to 3 hours. The crisis was averted, the problem fixed, and everybody was happy.

Captain Kirk employed a similar strategy for the Kobayashi Maru game. For those non-Trekkies out there, Captain Kirk won a rigged game by changing the rules of the game. Unfortunately, we are programmed by school to think of that as "cheating", but I think it a shame that we don't do it more often in our lives of creating tools for people to use. All too often, we struggle with tasks that are not useful or are unimportant for the people who actually use our work because it is written somewhere, or because we learned the "proper" way in school.

In addition, all too often, we only talk directly with our users after experiencing a lot of drama. Then, once we sit together, we find out what they really need and want and then we slap our heads and say, that's easy!

That's a game no one enjoys playing. Here are some tips to avoid getting caught in that trap.

* Spend some time with your end users, if possible. Get to know them as people, not just users. Find out what they actually do, and what they actually want to use your tool to do. Be patient with their lack of technical skill; remember that they probably know things you don't, and that the more knowledge they have about what you do, the easier your job will be.
* Look at production data. If you get the chance to create prototypes, use production data in your prototypes.
* If you're in a position of some authority, eliminate the distinction between "analysts" and "programmers". Set it up so that everyone does both.
* Finally, always remember what Liberace said: without the business, there's no show.

Great Hacker != Great Hire

I thoroughly enjoyed reading Paul Graham's recent essay Great Hackers. His sermon is well-written, and I assume it played very well when he preached it to the choir at OSCON.
Graham describes the notion of a "great hacker", which he seems to roughly define as a programmer who is several times more productive than average. (Please note that some people use the word "hacker" to describe programmers who engage in illegal activity. That connotation is not applicable here or in Graham's essay.) He then asks the following questions:

How do you recognize [great hackers]? How do you get them to come and work for you?"

Note carefully: Graham proceeds from the assumption that we do in fact want to hire these great hackers, but he never explains why.
I concede that this assumption is intuitive. After all, doesn't every company want the most productive employees they can hire?
But this assumption deserves to be examined and challenged.

For the love of the code

Graham begins his description of great hackers by explaining the intrinsic motivation and passion they have for writing code:

Their defining quality is probably that they really love to program. Ordinary programmers write code to pay the bills. Great hackers think of it as something they do for fun, and which they're delighted to find people will pay them for.

On this point, we agree. The best developers simply love to create software. They get paid, and their compensation is important, but it isn't really the primary reason why they write code. They wrote code before they were getting paid for it. They would continue to write code even after winning the lottery. When I hire developers, I am looking for this quality.
However, the remainder of Graham's essay does a pretty good job of explaining why many small ISVs might not want to hire a "great hacker". In a nutshell, great hackers are often very fussy people.

Fussy about tools and platforms

Graham explains the well-known fact that great hackers are extremely picky about the tools, platforms and technologies they use:

Good hackers find it unbearable to use bad tools. They'll simply refuse to work on projects with the wrong infrastructure.

Bad tools? Wrong infrastructure? Graham sounds like he is right on the money. Nobody could object to the idea that great hackers care deeply about these kinds of choices, right?
Unfortunately, Graham goes on to explain what great hackers mean by "bad tools" and "wrong infrastructure". Painting with a very wide brush, he observes that great hackers don't use technologies like Windows and Java. They prefer languages like Python and Perl. They prefer to use open source technologies whenever possible.
I'm not saying I am a great hacker, but I do sympathize with this fussiness. I have similar religious preferences about technologies. I really do like Python. My personal server runs Debian. The first things I install when I repave my Windows machine are emacs and cygwin.
However, I work at an ISV. I love building software, but SourceGear is not my hobby -- it is my profession. We sell products to users. We have learned to value the needs of the users over our own preferences.
Graham seems to suggest that if we choose to build our products on the technologies that great hackers prefer, then it is more likely that we will be able to hire them. This opinion may be true, but it ignores the fact that technology choices have marketing implications. I've written about this topic several times now:

Law #21: "These may not seem like marketing decisions, but they are. Technology choices have big marketing implications. When you choose a platform, you define the maximum size of your market."
Geek Gauntlets: "We need to talk about what customers want, but our own preferences get in the way. We bring our technology prejudices and biases to the discussion, often without ever being aware of the problems they can cause."
Be Careful Where you Build: "As developers in a small ISV, our productivity is important, but it must be secondary to the comfort and preferences of our users."

The higher productivity of a great hacker is a big advantage, but probably not big enough to overcome our attempts to sell something that users don't want.
If Graham is right, a great hacker is someone who believes that his own preferences are more important than doing what is best for the users. Small ISVs don't need people like that.

Fussy about doing interesting projects

Graham goes on to explain how important it is for great hackers to be doing interesting projects:

Along with good tools, hackers want interesting projects. It's pretty easy to say what kinds of problems are not interesting: those where instead of solving a few big, clear, problems, you have to solve a lot of nasty little ones. One of the worst kinds of projects is writing an interface to a piece of software that's full of bugs.

Here again, I am sympathetic. I like interesting stuff too. My To-Do list tells me that I am supposed to be working on some enhancements to our online store website. Frankly, that programming task doesn't interest me very much. I'll confess that I'm procrastinating on that particular task.
However, I work at an ISV. I love building software, but SourceGear is not my hobby -- it is my profession. We sell products to users. The reality is that lots of highly profitable software development tasks are just not very interesting.
Graham strikes particularly close to home when he says, "One of the worst kinds of projects is writing an interface to a piece of software that's full of bugs." Our SourceOffSite product provides an Internet-based interface to a piece of software that's full of bugs (SourceSafe). That same product supports integration with IDEs by means of a piece of software that's full of bugs (the MSSCCI API). If we had been great hackers and refused to do this work because it is not interesting, we would have missed out on millions of dollars of revenue.
If Graham is right, a great hacker is someone who is not willing to do any of the un-fun things that need to be done. Small ISVs don't need people like that.

Fussy about interacting with users

Graham characterizes great hackers as people don't want to be involved with users:

Bigger companies solve the problem by partitioning the company. They get smart people to work for them by establishing a separate R&D department where employees don't have to work directly on customers' nasty little problems.
You may not have to go to this extreme. Bottom-up programming suggests another way to partition the company: have the smart people work as toolmakers. ... This way you might be able to get smart people to write 99% of your code, but still keep them almost as insulated from users as they would be in a traditional research department.

This kind of attitude is a big problem in a small ISV. I concede that it is frustrating to be interrupted when I'm working on a coding problem. I concede that users sometimes ask dumb questions. I concede that writing a great piece of code is more fun than figuring out how somebody screwed up their Web.config file.
However, I work at an ISV. I love building software, but SourceGear is not my hobby -- it is my profession. We sell products to users. Nothing here is more important than our users. Nothing.
Last year I wrote an article in which I claim that small ISVs should only hire "developers", which I define as "programmers who also contribute in non-coding ways". The thesis statement of this article said:

"For the purpose of this article, a "programmer" is someone who does nothing but code new features and [if you're lucky] fix bugs. They don't write specs. They don't write automated test cases. They don't help keep the automated build system up to date. They don't help customers work out tough problems. They don't help write documentation. They don't help with testing. They don't even read code. All they do is write new code. In a small ISV, you don't want any of these people in your company."

If Graham is right, a great hacker is someone who is not willing to help the people who use the software he creates. Small ISVs don't need people like that.

Bottom Line

Like I said, I enjoyed Graham's essay very much. He describes great hackers by enumerating all of their worst qualities, and yet, the essay still makes us want to admire these super-productive people. That's good writing.
But the essay causes concern. I worry that lots of small ISVs will read his article and believe that they need to hire great hackers. When great hackers are as fussy as Graham says they are, they're not worth the trouble. We want the super-productivity, and we want the innate love of software development, but we don't want all the extra baggage. Instead:

Hire people who care about users.
Hire people who understand the difference between a job and a hobby.
Hire people who want to contribute in lots of different ways to the success of the product.

It's okay to be in awe of these great hackers. But as a practical matter, small ISVs would be much better off hiring professionals.

10 Simple Ways To Speed Up Your Windows XP

One of the factors that slow the performance of the computer is disk fragmentation. When files are fragmented, the computer must search the hard disk when the file is opened to piece it back together. To speed up the response time, you should monthly run Disk Defragmenter, a Windows utility that defrags and consolidates fragmented files for quicker computer response.

Follow Start > All Programs > Accessories > System Tools > Disk Defragmenter
Click the drives you want to defrag and click Analyze
Click Defragment

2. Detect and Repair Disk Errors

Over time, your hard disk develops bad sectors. Bad sectors slow down hard disk performance and sometimes make data writing difficult or even impossible. To detect and repair disk errors, Windows has a built-in tool called the Error Checking utility. Itâ€™ll search the hard disk for bad sectors and system errors and repair them for faster performance.

Follow Start > My Computer
In My Computer right-click the hard disk you want to scan and click Properties
Click the Tools tab
Click Check Now
Select the Scan for and attempt recovery of bad sectors check box
Click Start

3. Disable Indexing Services

Indexing Services is a little application that uses a lot of CPU. By indexing and updating lists of all the files on the computer, it helps you to do a search for something faster as it scans the index list. But if you know where your files are, you can disable this system service. It wonâ€™t do any harm to you machine, whether you search often or not very often.

Go to Start
Click Settings
Click Control Panel
Double-click Add/Remove Programs
Click the Add/Remove Window Components
Uncheck the Indexing services
Click Next

4. Optimize Display Settings

Windows XP is a looker. But it costs you system resources that are used to display all the visual items and effects. Windows looks fine if you disable most of the settings and leave the following:

Show shadows under menus
Show shadows under mouse pointer
Show translucent selection rectangle
Use drop shadows for icons labels on the desktop
Use visual styles on windows and buttons

5. Speedup Folder Browsing

You may have noticed that everytime you open My Computer to browse folders that there is a little delay. This is because Windows XP automatically searches for network files and printers everytime you open Windows Explorer. To fix this and to increase browsing speed, you can disable the â€œAutomatically search for network folders and printersâ€� option.

6. Disable Performance Counters

Windows XP has a performance monitor utility which monitors several areas of your PCâ€™s performance. These utilities take up system resources so disabling is a good idea.

Download and install the Extensible Performance Counter List
Then select each counter in turn in the â€˜Extensible performance countersâ€™ window and clear the â€˜performance counters enabledâ€™ checkbox at the bottom button below

7. Optimize Your Pagefile

You can optimize your pagefile. Setting a fixed size to your pagefile saves the operating system from the need to resize the pagefile.

Right click on My Computer and select Properties
Select the Advanced tab
Under Performance choose the Settings button
Select the Advanced tab again and under Virtual Memory select Change
Highlight the drive containing your page file and make the initial Size of the file the same as the Maximum Size of the file.

Windows XP sizes the page file to about 1.5X the amount of actual physical memory by default. While this is good for systems with smaller amounts of memory (under 512MB) it is unlikely that a typical XP desktop system will ever need 1.5 X 512MB or more of virtual memory. If you have less than 512MB of memory, leave the page file at its default size. If you have 512MB or more, change the ratio to 1:1 page file size to physical memory size.

8. Remove Fonts for Speed

Fonts, especially TrueType fonts, use quite a bit of system resources. For optimal performance, trim your fonts down to just those that you need to use on a daily basis and fonts that applications may require.

Open Control Panel
Open Fonts folder
Move fonts you donâ€™t need to a temporary directory (e.g. C:\FONTBKUP?) just in case you need or want to bring a few of them back. The more fonts you uninstall, the more system resources you will gain.

9. Use a Flash Memory to Boost Performance

To improve performance, you need to install additional RAM memory. Itâ€™ll let you boot your OS much quicker and run many applications and access data quicker. There is no easiest and more technically elegant way to do it than use eBoostr.
eBoostr is a little program that lets you improve a performance of any computer, powered by Windows XP in much the same way as Vistaâ€™s ReadyBoost. With eBoostr, if you have a flash drive, such as a USB flash thumb drive or an SD card, you can use it to make your computer run better. Simply plug in a flash drive through a USB socket and Windows XP will use eBoostr to utilize the flash memory to improve performance.
The product shows the best results for frequently used applications and data, which becomes a great feature for people who are using office programs, graphics applications or developer tools. Itâ€™ll surely attract a special attention of laptop owners as laptop upgrade is usually more complicated and laptop hard drives are by definition slower than those of desktops.

10. Perform a Boot Defragment

There's a simple way to speed up XP startup: make your system do a boot defragment, which will put all the boot files next to one another on your hard disk. When boot files are in close proximity to one another, your system will start faster.
On most systems, boot defragment should be enabled by default, but it might not be on yours, or it might have been changed inadvertently. To make sure that boot defragment is enabled:

Run the Registry Editor
Go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Dfrg\BootOptimizeFunction
Set the Enable string value to Y if it is not already set to Y.
Exit the Registry
Reboot

Hope you find these 10 tips useful. Have a nice day!

Difference between 32 bit and 64 bit OS.

During my research about 64 and 32 bit operating system I found many people confused about what computer they should purchase or what bit system they should choose…I think this article should be a great help for people to choose the correct machine for their use.

I found many people asking about that what is the difference between 32 and 64 bit operating system?... I hope after reading this article people would feel free to choose the best machine for them.

A 32 bit processor is faster than a 64 bit processor, 64 bit processors are very commonly used that you can find it easily in any home pc but the main difference is the hardware you are having on your machine. For 32 bits there isn’t any need of any wide main bus to carry 32 bits at a time but for 64 bits its must that you should have a wider bus to carry 64bits.

The main difference between a 32 bit and 64 is that 32 bit system has 4gb(gigabytes) of space for addressing means that the 32 bit system has a limit of 4GB RAM to process data where as the 64 bit operating system has 2^64 bits of space to address and supports 16 hexabytes of RAM to process data.

A person who does not care about the category will find 64 bit more beneficial than 32 bit because he can use 64 bit OS with 32 bit Os and softwares. If we compare 64 bit with 32 bit OS, 64 bit is faster and performs more upgraded silicon processes and have more no of transistors which proves it to be more advantageous than 32 bit.

Now most of the software companies are developing their softwares in accordance to the 64 bit environment, it’s really hard for the consumers to run a 64 bit application in 32 bit environment, in this case they have to upgrade their hardware’s such as RAM which is also a big issue because most of the computer users have less than 1Gb in their systems.

So instead of changing your hardware day by day I would recommend d32 bit OS which is more user friendly and enables it users to run applications of 64 bit in 32 bit environment.

Monday, July 12, 2010

HTTP Status Codes

HTTP status codes
When a request is made to your server for a page on your site (for instance, when a user accesses your page in a browser or when Googlebot crawls the page), your server returns an HTTP status code in response to the request.

This status code provides information about the status of the request. This status code gives Googlebot information about your site and the requested page.

Some common status codes are:

* 200 - the server successfully returned the page
* 404 - the requested page doesn't exist
* 503 - the server is temporarily unavailable

A complete list of HTTP status codes is below. You can also visit the W3C page on HTTP status codes for more information.

1xx (Provisional response)
Status codes that indicate a provisional response and require the requestor to take action to continue.
Code Description
100 (Continue) The requestor should continue with the request. The server returns this code to indicate that it has received the first part of a request and is waiting for the rest.
101 (Switching protocols) The requestor has asked the server to switch protocols and the server is acknowledging that it will do so.

2xx (Successful)

Status codes that indicate that the server successfully processed the request.
Code Description
200 (Successful) The server successfully processed the request. Generally, this means that the server provided the requested page. If you see this status for your robots.txt file, it means that Googlebot retrieved it successfully.
201 (Created) The request was successful and the server created a new resource.
202 (Accepted) The server has accepted the request, but hasn't yet processed it.
203 (Non-authoritative information) The server successfully processed the request, but is returning information that may be from another source.
204 (No content) The server successfully processed the request, but isn't returning any content.
205 (Reset content) The server successfully proccessed the request, but isn't returning any content. Unlike a 204 response, this response requires that the requestor reset the document view (for instance, clear a form for new input).
206 (Partial content) The server successfully processed a partial GET request.

3xx (Redirected)
Further action is needed to fulfill the request. Often, these status codes are used for redirection. Google recommends that you use fewer than five redirects for each request. You can use Webmaster Tools to see if Googlebot is having trouble crawling your redirected pages. The Crawl errors page under Diagnostics lists URLs that Googlebot was unable to crawl due to redirect errors.
Code Description
300 (Multiple choices) The server has several actions available based on the request. The server may choose an action based on the requestor (user agent) or the server may present a list so the requestor can choose an action.
301 (Moved permanently) The requested page has been permanently moved to a new location. When the server returns this response (as a response to a GET or HEAD request), it automatically forwards the requestor to the new location. You should use this code to let Googlebot know that a page or site has permanently moved to a new location.
302 (Moved temporarily) The server is currently responding to the request with a page from a different location, but the requestor should continue to use the original location for future requests. This code is similar to a 301 in that for a GET or HEAD request, it automatically forwards the requestor to a different location, but you shouldn't use it to tell the Googlebot that a page or site has moved because Googlebot will continue to crawl and index the original location.
303 (See other location) The server returns this code when the requestor should make a separate GET request to a different location to retrieve the response. For all requests other than a HEAD request, the server automatically forwards to the other location.
304 (Not modified)

The requested page hasn't been modified since the last request. When the server returns this response, it doesn't return the contents of the page.

You should configure your server to return this response (called the If-Modified-Since HTTP header) when a page hasn't changed since the last time the requestor asked for it. This saves you bandwidth and overhead because your server can tell Googlebot that a page hasn't changed since the last time it was crawled
.
305 (Use proxy) The requestor can only access the requested page using a proxy. When the server returns this response, it also indicates the proxy that the requestor should use.
307 (Temporary redirect) The server is currently responding to the request with a page from a different location, but the requestor should continue to use the original location for future requests. This code is similar to a 301 in that for a GET or HEAD request, it automatically forwards the requestor to a different location, but you shouldn't use it to tell the Googlebot that a page or site has moved because Googlebot will continue to crawl and index the original location.

4xx (Request error)
These status codes indicate that there was likely an error in the request which prevented the server from being able to process it.
Code Description
400 (Bad request) The server didn't understand the syntax of the request.
401 (Not authorized) The request requires authentication. The server might return this response for a page behind a login.
403 (Forbidden) The server is refusing the request. If you see that Googlebot received this status code when trying to crawl valid pages of your site (you can see this on the Web crawl page under Diagnostics in Google Webmaster Tools), it's possible that your server or host is blocking Googlebot's access.
404 (Not found)

The server can't find the requested page. For instance, the server often returns this code if the request is for a page that doesn't exist on the server.

If you don't have a robots.txt file on your site and see this status on the robots.txt page of the Diagnostic tab in Google Webmaster Tools, this is the correct status. However, if you do have a robots.txt file and you see this status, then your robots.txt file may be named incorrectly or in the wrong location. (It should be at the top-level of the domain and named robots.txt.)

If you see this status for URLs that Googlebot tried to crawl (on the HTTP errors page of the Diagnostic tab), then Googlebot likely followed an invalid link from another page (either an old link or a mistyped one).
405 (Method not allowed) The method specified in the request is not allowed.
406 (Not acceptable) The requested page can't respond with the content characteristics requested.
407 (Proxy authentication required) This status code is similar 401 (Not authorized); but specifies that the requestor has to authenticate using a proxy. When the server returns this response, it also indicates the proxy that the requestor should use.
408 (Request timeout) The server timed out waiting for the request.
409 (Conflict) The server encountered a conflict fulfilling the request. The server must include information about the conflict in the response. The server might return this code in response to a PUT request that conflicts with an earlier request, along with a list of differences between the requests.
410 (Gone) The server returns this response when the requested resource has been permanently removed. It is similar to a 404 (Not found) code, but is sometimes used in the place of a 404 for resources that used to exist but no longer do. If the resource has permanently moved, you should use a 301 to specify the resource's new location.
411 (Length required) The server won't accept the request without a valid Content-Length header field.
412 (Precondition failed) The server doesn't meet one of the preconditions that the requestor put on the request.
413 (Request entity too large) The server can't process the request because it is too large for the server to handle.
414 (Requested URI is too long) The requested URI (typically, a URL) is too long for the server to process.
415 (Unsupported media type) The request is in a format not support by the requested page.
416 (Requested range not satisfiable) The server returns this status code if the request is for a range not available for the page.
417 (Expectation failed) The server can't meet the requirements of the Expect request-header field.

5xx (Server error)
These status codes indicate that the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request.
Code Description
500 (Internal server error) The server encountered an error and can't fulfill the request.
501 (Not implemented) The server doesn't have the functionality to fulfill the request. For instance, the server might return this code when it doesn't recognize the request method.
502 (Bad gateway) The server was acting as a gateway or proxy and received an invalid response from the upstream server.
503 (Service unavailable) The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.
504 (Gateway timeout) The server was acting as a gateway or proxy and didn't receive a timely request from the upstream server.
505 (HTTP version not supported) The server doesn't support the HTTP protocol version used in the request.

Friday, July 9, 2010

Google building Facebook rival

Web world is abuzz with rumours that Google's Facebook rival is in works. The rumour mill began churning after Digg founder Kevin Rose posted a tweet last weekend (saturday to be precise), saying, "Ok, umm, huge rumor: Google to launch facebook competitor very soon 'Google Me,' very credible source."

Wondering how Google Me will really work? Experts believe that Google already has almost 30 different social properties that it has acquired or built including YouTube, Picassa, Google Profiles, Google Docs, Google Friends Connect and Google Latitude. So, the company largely has almost every component that it requires to built a Facebook killer. All it needs to do is a bit of organisation and create a common platform for all these networks.

However, so far Google is still to get a hit in the social networking space. Orkut, one of the pioneers in social network, today has little following except in Brazil and India. More recently, the company made another foray into social networking arena with Google Buzz, which aims to upgrade Gmail into a social networking hub from just an e-mail service. However, the service has been relegating to more of a Web 2.0 sharing tool than a social networking hub.

Several analysts also wonder if Google would really risk waging a full-scale war against Facebook, which has seen its fortunes soar in the past year.

Writes eWeek's Clint Boulton, "Google challenging Facebook in social is like Facebook challenging Google in search," he writes. "People are comfortable socializing on Facebook, which is where their friends (and their friends of friends are) and they are comfortable searching on Google, which is where all of the data about businesses, places and other facts live. Unless and until there are technological improvements on both sides, paired with practical user behavior shifts from consumers, never the twain shall meet."

While analysts may continue to debate the issue, the rumour has been further fanned by Adam D’Angelo, former Facebook CTO and now founder of Q&A service Quora (on Quora only).

Here's what Angelo wrote on Quora.

* This is not a rumor. This is a real project. There are a large number of people working on it. I am completely confident about this.

* They realized that Buzz wasn't enough and that they need to build out a full, first-class social network. They are modeling it off of Facebook.

* Unlike previous attempts (before Buzz at least), this is a high-priority project within Google.

* They had assumed that Facebook's growth would slow as it grew, and that Facebook wouldn't be able to have too much leverage over them, but then it just didn't stop, and now they are really scared.

Thursday, July 8, 2010

Painless Bug Tracking

TRS-80 Level-I BASIC could only store two string variables, A$ and B$. Similarly, I was born with only two bug-storing-slots in my brain. At any given time, I can only remember two bugs. If you ask me to remember three, one of them will fall on the floor and get swept under the bed with the dust bunnies, who will eat it.

Keeping a database of bugs is one of the hallmarks of a good software team. I never cease to be amazed at how few teams are actually doing this. One of the biggest incorrect facts that programmers consistently seem to believe is that they can remember all their bugs or keep them on post-it notes.

If I can bend your ear a moment, I'd like to explain a pretty painless way to do bug tracking, in the spirit of my previous articles on painless schedules and painless specs.

First of all, you need a real database. On teams of 2 people writing a little bit of code over the course of a long weekend, it's probably OK to use a text file as the database. Anything larger, and you're going to need a real bug tracking database. There are zillions of bug tracking databases you can buy. (Blatant self-promotion: the one we wrote at Fog Creek Software, called FogBUGZ, is web based, pretty easy to use, and quite powerful, if I may say so myself.)

Mikey the Programmer is hacking away on the new FTP client feature of his groovy Macintosh software. At some point, because he's feeling frisky, he writes his own string-copy function. That'll teach them pesky reusability police! Bwa ha ha!

Bad things happen when you don't reuse code, Mikey. And today, the bad thing that happened was that Mikey forgot to null-terminate the copied string. But he never noticed the problem because most of the time he happened to be copying strings into pre-zeroed memory.

Later that week, Jill the Very, Very Good Tester is banging away at the code, rolling her forehead back and forth on the keyboard or some equally cruel test. (Incidentally, most good testers are named Jill, or some variation like Gillian.) Suddenly something very strange happens: the ftp daemon she was testing against crashed. Yes, I know it's a Linux machine and Linux machines never crash (no snorting sounds from the slashdot crowd, please) but this dang thing crashed. And she wasn't even touching the server, she was just FTPing files to it using Mikey's Mac code.

Now, Jill is a very, very good tester, so she's kept a careful log of what she was doing (the precise pitch and yaw of her head as she rolled it on the keyboard is in her lab book, for example). She reboots everything, starts with a clean machine, repeats the steps, and -- Lo and Behold -- it happens again! The Linux ftp daemon crashed again! That's twice in one day, now! Take that, Linus.

Jill squints at the list of repro steps. There are about 20 steps. Some of them don't seem related. After a bit of experimentation, Jill is able to whittle the problem down to four steps that always cause the same behavior. Now she's ready to file a bug.

Jill enters the new bug in the bug tracking database. Now, just the act of entering a bug requires some discipline: there are good bug reports and bad bug reports.
Three Parts To Every Good Bug Report

And the Lord spake, saying, "First shalt thou take out the Holy Pin. Then, shalt thou count to three, no more, no less. Three shall be the number thou shalt count, and the number of the counting shalt be three. Four shalt thou not count, nor either count thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then lobbest thou thy Holy Hand Grenade of Antioch towards thou foe, who being naughty in my sight, shall snuff it."

-- Monty Python and the Holy Grail

It's pretty easy to remember the rule for a good bug report. Every good bug report needs exactly three things.

1. Steps to reproduce,
2. What you expected to see, and
3. What you saw instead.

Seems easy, right? Maybe not. As a programmer, people regularly assign me bugs where they left out one piece or another.

If you don't tell me how to repro the bug, I probably will have no idea what you are talking about. "The program crashed and left a smelly turd-like object on the desk." That's nice, honey. I can't do anything about it unless you tell me what you were doing. Now, I admit that there are two cases where it's hard to get exact steps to repro. Sometimes you just don't remember, or you're just transcribing a bug from "the field." (By the way, why do they call it "the field"? Is it, like, a field of rye or something? Anyway...) The other time it's OK not to have repro steps is when the bug happens sometimes but not all the time, but you should still provide repro steps, with a little annotation that says that it doesn't happen too often. In these cases, it's going to be really hard to find the bug, but we can try.

If you don't specify what you expected to see, I may not understand why this is a bug. The splash screen has blood on it. So what? I cut my fingers when I was coding it. What did you expect? Ah, you say that the spec required no blood! Now I understand why you consider this a bug.

Part three. What you saw instead. If you don't tell me this, I don't know what the bug is. That one is kind of obvious.
Back To Our Story

Anyhoo. Jill enters the bug. In a good bug tracking system it gets automatically assigned to the lead developer for that project. And therein lies the second concept: every bug needs to be assigned to exactly one person at all times, until it is closed. A bug is like a hot potato: when it's assigned to you, you are responsible to resolve it, somehow, or assign it to someone else.

Willie, the lead developer, looks at the bug, decides it's probably something to do with the ftp server, and resolves it as "won't fix." After all, they didn't write the ftp server.

When a bug is resolved, it gets assigned back to the person who opened it. This is a crucial point. It does not go away just because a programmer thinks it should. The golden rule is that only the person who opened the bug can close the bug. The programmer can resolve the bug, meaning, "hey, I think this is done," but to actually close the bug and get it off the books, the original person who opened it needs to confirm that it was actually fixed or agree that it shouldn't be fixed for some reason.

Jill gets an email telling her that the bug is back in her court. She looks at it and reads Willie the Lead Developer's comments. Something doesn't sound right. People have been using this ftp server for years and it doesn't crash. It only crashes when you use Mikey's code. So Jill reactivates the bug explaining her position, and the bug goes back to Willie. This time Willie assigns the bug to Mikey to fix.

Mikey studies the bug, thinks long and hard, and completely misdiagnoses the bug. He fixes some altogether different bug, and then resolves the one Jill opened.

The bug is back to Jill, this time marked "RESOLVED-FIXED". Jill tries her repro steps with the latest build, and, lo and behold, the Linux server crashes. She reactivates the bug again and assigns it straight back to Mikey.

Mikey is perplexed, but he finally tracks down the source of the bug. (Know what it is yet? I'll leave it as an exercise to the reader. I've given you enough clues!) He fixes it, tests it, and -- Eureka! The repro case no longer crashes the ftp server. Once again, he resolves it as FIXED. Jill also tries the repro steps, discovers that the bug is good 'n' fixed, and closes it out.
Top Ten Tips for Bug Tracking

1. A good tester will always try to reduce the repro steps to the minimal steps to reproduce; this is extremely helpful for the programmer who has to find the bug.
2. Remember that the only person who can close a bug is the person who opened it in the first place. Anyone can resolve it, but only the person who saw the bug can really be sure that what they saw is fixed.
3. There are many ways to resolve a bug. FogBUGZ allows you to resolve a bug as fixed, won't fix, postponed, not repro, duplicate, or by design.
4. Not Repro means that nobody could ever reproduce the bug. Programmers often use this when the bug report is missing the repro steps.
5. You'll want to keep careful track of versions. Every build of the software that you give to testers should have a build ID number so that the poor tester doesn't have to retest the bug on a version of the software where it wasn't even supposed to be fixed.
6. If you're a programmer, and you're having trouble getting testers to use the bug database, just don't accept bug reports by any other method. If your testers are used to sending you email with bug reports, just bounce the emails back to them with a brief message: "please put this in the bug database. I can't keep track of emails."
7. If you're a tester, and you're having trouble getting programmers to use the bug database, just don't tell them about bugs - put them in the database and let the database email them.
8. If you're a programmer, and only some of your colleagues use the bug database, just start assigning them bugs in the database. Eventually they'll get the hint.
9. If you're a manager, and nobody seems to be using the bug database that you installed at great expense, start assigning new features to people using bugs. A bug database is also a great "unimplemented feature" database, too.
10. Avoid the temptation to add new fields to the bug database. Every month or so, somebody will come up with a great idea for a new field to put in the database. You get all kinds of clever ideas, for example, keeping track of the file where the bug was found; keeping track of what % of the time the bug is reproducible; keeping track of how many times the bug occurred; keeping track of which exact versions of which DLLs were installed on the machine where the bug happened. It's very important not to give in to these ideas. If you do, your new bug entry screen will end up with a thousand fields that you need to supply, and nobody will want to input bug reports any more. For the bug database to work, everybody needs to use it, and if entering bugs "formally" is too much work, people will go around the bug database.

If you are developing code, even on a team of one, without an organized database listing all known bugs in the code, you are simply going to ship low quality code. On good software teams, not only is the bug database used universally, but people get into the habit of using the bug database to make their own "to-do" lists, they set their default page in their web browser to the list of bugs assigned to them, and they start wishing that they could assign bugs to the office manager to stock more Mountain Dew.

Tuesday, July 6, 2010

The Guerrilla Guide to Interviewing

A motley gang of anarchists, free-love advocates, and banana-rights agitators have hijacked The Love Boat out of Puerto Vallarta and are threatening to sink it in 7 days with all 616 passengers and 327 crew members unless their demands are met. The demand? A million dollars in small unmarked bills, and a GPL implementation of WATFIV, that is, the esteemed Waterloo Fortran IV compiler. (It’s surprising how few things the free-love people can find to agree on with the banana-rights people.)

As chief programmer of the Festival Cruise programming staff, you’ve got to decide if you can deliver a Fortran compiler from scratch in seven days. You’ve got a staff of two programmers to help you.

Can you do it?

“Well, I suppose, it depends,” you say. One of the benefits of writing this article is that I get to put words into your mouth and you can’t do a darn thing about it.

On what?

“Um, will my team be able to use UML-generating tools?”

Does that really matter? Three programmers, seven days, Waterloo Fortran IV. Are UML tools going to make or break it?

“I guess not.”

OK, so, what does it depend on?

“Will we have 19 inch monitors? And will we have access to all the Jolt we can drink?”

Again, does this matter? Is caffeine going to determine whether you can do it?

“I guess not. Oh, wait. You said I have a staff of two programmers?”

Right.

“Who are they?”

Does that matter?

“Sure! If the team doesn’t get along, we’ll never be able to work together. And I know a few superstar programmers who could crank out a Fortran compiler by themselves in one week, and lots of programmers who couldn’t write the code to print the startup banner if they had six months.”

Now we’re on to something!

Everybody gives lip service to the idea that people are the most important part of a software project, but nobody is quite sure what you can do about it. The very first thing you have to do right if you want to have good programmers is to hire the right programmers, and that means you have to be able to figure out who the right programmers are, and this is usually done in the interview process. So this chapter is all about interviewing.

(The in-person interview is just one part of the hiring process, which starts with sorting resumes and a phone screen. This article only covers in-person interviews.)

You should always try to have at least six people interview each candidate that gets hired, including at least five who would be peers of that candidate (that is, other programmers, not managers). You know the kind of company that just has some salty old manager interview each candidate, and that decision is the only one that matters? These companies don’t have very good people working there. It’s too easy to fake out one interview, especially when a non-programmer interviews a programmer.

If even two of the six interviewers thinks that a person is not worth hiring, don’t hire them. That means you can technically end the “day” of interviews after the first two if the candidate is not going to be hired, which is not a bad idea, but to avoid cruelty you may not want to tell the candidate in advance how many people will be interviewing them. I have heard of companies that allow any interviewer to reject a candidate. This strikes me as a little bit too aggressive; I would probably allow any senior person to reject a candidate but would not reject someone just because one junior person didn’t like them.

Don’t try to interview a bunch of people at the same time. It’s just not fair. Each interview should consist of one interviewer and one interviewee, in a room with a door that closes and a whiteboard. I can tell you from extensive experience that if you spend less than one hour on an interview you’re not going to be able to make a decision.

You’re going to see three types of people in your interviews. At one end of the scale, there are the unwashed masses, lacking even the most basic skills for this job. They are easy to ferret out and eliminate, often just by asking two or three quick questions. At the other extreme you’ve got your brilliant superstars who write lisp compilers for fun, in a weekend, in Assembler for the Nintendo DS. And in the middle, you have a large number of “maybes” who seem like they might just be able to contribute something. The trick is telling the difference between the superstars and the maybes, because the secret is that you don’t want to hire any of the maybes. Ever.

At the end of the interview, you must be prepared to make a sharp decision about the candidate. There are only two possible outcomes to this decision: Hire or No Hire. There is no other possible answer. Never say, “Hire, but not for my team.” This is rude and implies that the candidate is not smart enough to work with you, but maybe he’s smart enough for those losers over in that other team. If you find yourself tempted to say “Hire, but not in my team,” simply translate that mechanically to “No Hire” and you’ll be OK. Even if you have a candidate that would be brilliant at doing your particular task, but wouldn’t be very good in another team, that’s a No Hire. In software, things change so often and so rapidly that you need people that can succeed at just about any programming task that you throw at them. If for some reason you find an idiot savant that is really, really, really good at SQL but completely incapable of ever learning any other topic, No Hire. You’ll solve some short term pain in exchange for a lot of long term pain.

Never say “Maybe, I can’t tell.” If you can’t tell, that means No Hire. It’s really easier than you’d think. Can’t tell? Just say no! If you are on the fence, that means No Hire. Never say, “Well, Hire, I guess, but I’m a little bit concerned about…” That’s a No Hire as well. Mechanically translate all the waffling to “no” and you’ll be all right.

Why am I so hardnosed about this? It’s because it is much, much better to reject a good candidate than to accept a bad candidate. A bad candidate will cost a lot of money and effort and waste other people’s time fixing all their bugs. Firing someone you hired by mistake can take months and be nightmarishly difficult, especially if they decide to be litigious about it. In some situations it may be completely impossible to fire anyone. Bad employees demoralize the good employees. And they might be bad programmers but really nice people or maybe they really need this job, so you can’t bear to fire them, or you can’t fire them without pissing everybody off, or whatever. It’s just a bad scene.

On the other hand, if you reject a good candidate, I mean, I guess in some existential sense an injustice has been done, but, hey, if they’re so smart, don’t worry, they’ll get lots of good job offers. Don’t be afraid that you’re going to reject too many people and you won’t be able to find anyone to hire. During the interview, it’s not your problem. Of course, it’s important to seek out good candidates. But once you’re actually interviewing someone, pretend that you’ve got 900 more people lined up outside the door. Don’t lower your standards no matter how hard it seems to find those great candidates.

OK, I didn’t tell you the most important part—how do you know whether to hire someone?

In principle, it’s simple. You’re looking for people who are

1. Smart, and
2. Get things done.

That’s it. That’s all you’re looking for. Memorize that. Recite it to yourself before you go to bed every night. You don’t have enough time to figure out much more in a short interview, so don’t waste time trying to figure out whether the candidate might be pleasant to be stuck in an airport with, or whether they really know ATL and COM programming or if they’re just faking it.

People who are Smart but don’t Get Things Done often have PhDs and work in big companies where nobody listens to them because they are completely impractical. They would rather mull over something academic about a problem rather than ship on time. These kind of people can be identified because they love to point out the theoretical similarity between two widely divergent concepts. For example, they will say, “Spreadsheets are really just a special case of programming language,” and then go off for a week and write a thrilling, brilliant whitepaper about the theoretical computational linguistic attributes of a spreadsheet as a programming language. Smart, but not useful. The other way to identify these people is that they have a tendency to show up at your office, coffee mug in hand, and try to start a long conversation about the relative merits of Java introspection vs. COM type libraries, on the day you are trying to ship a beta.

People who Get Things Done but are not Smart will do stupid things, seemingly without thinking about them, and somebody else will have to come clean up their mess later. This makes them net liabilities to the company because not only do they fail to contribute, but they soak up good people’s time. They are the kind of people who decide to refactor your core algorithms to use the Visitor Pattern, which they just read about the night before, and completely misunderstood, and instead of simple loops adding up items in an array you’ve got an AdderVistior class (yes, it’s spelled wrong) and a VisitationArrangingOfficer singleton and none of your code works any more.

How do you detect smart in an interview? The first good sign is that you don’t have to explain things over and over again. The conversation just flows. Often, the candidate says something that shows real insight, or brains, or mental acuity. So an important part of the interview is creating a situation where someone can show you how smart they are. The worst kind of interviewer is the blowhard. That’s the kind who blabs the whole time and barely leaves the candidate time to say, “yes, that’s so true, I couldn’t agree with you more.” Blowhards hire everyone; they think that the candidate must be smart because “he thinks so much like me!”

The second worst kind of interviewer is the Quiz Show Interviewer. This is the kind of person who thinks that smart means “knows a lot of facts.” They just ask a bunch of trivia questions about programming and give points for correct answers. Just for fun, here is the worst interview question on Earth: “What’s the difference between varchar and varchar2 in Oracle 8i?” This is a terrible question. There is no possible, imaginable correlation between people that know that particular piece of trivia and people that you want to hire. Who cares what the difference is? You can find out online in about fifteen seconds! Remember, smart does not mean “knows the answer to trivia questions.” Anyway, software teams want to hire people with aptitude, not a particular skill set. Any skill set that people can bring to the job will be technologically obsolete in a couple of years, anyway, so it’s better to hire people that are going to be able to learn any new technology rather than people who happen to know how to make JDBC talk to a MySQL database right this minute.

But in general, the way to learn the most about a person is to let them do the talking. Give them open-ended questions and problems.

So, what do you ask?

My personal list of interview questions originates from my first job at Microsoft. There are actually hundreds of famous Microsoft interview questions. Everybody has a set of questions that they really like. You, too, will develop a particular set of questions and a personal interviewing style which helps you make the Hire/No Hire decision. Here are some techniques that I have used that have been successful.

Before the interview, I read over the candidates resume and jot down an interview plan on a scrap of paper. That’s just a list of questions that I want to ask. Here’s a typical plan for interviewing a programmer:

1. Introduction
2. Question about recent project candidate worked on
3. Easy Programming Question
4. Pointer/Recursion Question
5. Are you satisfied?
6. Do you have any questions?

I am very, very careful to avoid anything that might give me some preconceived notions about the candidate. If you think that someone is smart before they even walk into the room, just because they have a Ph.D. from MIT, then nothing they can say in one hour is going to overcome that initial prejudice. If you think they are a bozo because they went to community college, nothing they can say will overcome that initial impression. An interview is like a very, very delicate scale—it’s very hard to judge someone based on a one hour interview and it may seem like a very close call. But if you know a little bit about the candidate beforehand, it’s like a big weight on one side of the scale, and the interview is useless. Once, right before an interview, a recruiter came into my office. “You’re going to love this guy,” she said. Boy did this make me mad. What I should have said was, “Well, if you’re so sure I’m going to love him, why don’t you just hire him instead of wasting my time going through this interview.” But I was young and naïve, so I interviewed him. When he said not-so-smart things, I thought to myself, “gee, must be the exception that proves the rule.” I looked at everything he said through rose-colored glasses. I wound up saying Hire even though he was a crappy candidate. You know what? Everybody else who interviewed him said No Hire. So: don’t listen to recruiters; don’t ask around about the person before you interview them; and never, ever talk to the other interviewers about the candidate until you’ve both made your decisions independently. That’s the scientific method.

The introduction phase of the interview is intended to put the candidate at ease. I ask them if they had a nice flight. I spend about 30 seconds telling the person who I am and how the interview will work. I always reassure candidates that we are interested in how they go about solving problems, not the actual answer.

Part two is a question about some recent project that the candidate worked on. For interviewing college kids, ask them about their senior thesis, if they had one, or about a course they took that involved a long project that they really enjoyed. For example, sometimes I will ask, “what class did you take last semester that you liked the most? It doesn’t have to be computer-related.” When interviewing experienced candidates, you can talk about their most recent assignment from their previous job.

Again, ask open-ended questions and sit back and listen, with only the occasional “tell me more about that” if they seem to stall.

What should you look for during the open ended questions?

One: Look for passion. Smart people are passionate about the projects they work on. They get very excited talking about the subject. They talk quickly, and get animated. Being passionately negative can be just as good a sign. “My last boss wanted to do everything on VAX computers because it was all he understood. What a dope!” There are far too many people around that can work on something and not really care one way or the other. It’s hard to get people like this motivated about anything.

Bad candidates just don’t care and will not get enthusiastic at all during the interview. A really good sign that a candidate is passionate about something is that when they are talking about it, they will forget for a moment that they are in an interview. Sometimes a candidate comes in who is very nervous about being in an interview situation—this is normal, of course, and I always ignore it. But then when you get them talking about Computational Monochromatic Art they will get extremely excited and lose all signs of nervousness. Good. I like passionate people who really care. (To see an example of Computational Monochromatic Art try unplugging your monitor.) You can challenge them on something (try it—wait for them to say something that’s probably true and say “that couldn’t be true”) and they will defend themselves, even if they were sweating five minutes ago, because they care so much they forget that you are going to be making Major Decisions About Their Life soon.

Two: Good candidates are careful to explain things well, at whatever level. I have rejected candidates because when they talked about their previous project, they couldn’t explain it in terms that a normal person could understand. Often CS majors will just assume that everyone knows what Bates Theorem is or what O(log n) means. If they start doing this, stop them for a minute and say, “could you do me a favor, just for the sake of the exercise, could you please explain this in terms my grandmother could understand.” At this point many people will still continue to use jargon and will completely fail to make themselves understood. Gong! You don’t want to hire them, basically, because they are not smart enough to comprehend what it takes to make other people understand their ideas.

Three: If the project was a team project, look for signs that they took a leadership role. A candidate might say, “We were working on X, but the boss said Y and the client said Z.” I’ll ask, “So what did you do?” A good answer to this might be “I got together with the other members of the team and wrote a proposal…” A bad answer might be, “Well, there was nothing I could do. It was an impossible situation.” Remember, Smart and Gets Things Done. The only way you’re going to be able to tell if somebody Gets Things Done is to see if historically they have tended to get things done in the past. In fact, you can even ask them directly to give you an example from their recent past when they took a leadership role and got something done—overcoming some institutional inertia, for example.

Most of the time in the interview, though, should be spent letting the candidate prove that they can write code.

Reassure candidates that you understand that it’s hard to write code without an editor, and you will forgive them if the whiteboard gets really messy. Also you understand that it’s hard to write bug-free code without a compiler, and you will take that into account.

For the first interview of the day, I’ve started including a really, really easy programming problem. I had to start doing this during the dotcom boom when a lot of people who thought HTML was “programming” started showing up for interviews, and I needed a way to avoid wasting too much time with them. It’s the kind of problem that any programmer working today should be able to solve in about one minute. Some examples:

1. Write a function that determines if a string starts with an upper-case letter A-Z
2. Write a function that determines the area of a circle given the radius
3. Add up all the values in an array

These softball questions seem too easy, so when I first started asking them, I had to admit that I really expected everyone to sail right through them. What I discovered was that everybody solved the problem, but there was a lot of variation in how long it took them to solve.

That reminded me of why I couldn’t trade bonds for a living.

Jared is a bond trader. He is always telling me about interesting deals that he did. There’s this thing called an option, and there are puts, and calls, and the market steepens, so you put on steepeners, and it’s all very confusing, but the weird thing is that I know what all the words mean, I know exactly what a put is (the right, but not the responsibility, to sell something at a certain price) and in only three minutes I can figure out what should happen if you own a put and the market goes up, but I need the full three minutes to figure it out, and when he’s telling me a more complicated story, where the puts are just the first bit, there are lots of other bits to the story, I lose track very quickly, because I’m lost in thought (“let’s see, market goes up, that mean interest rates go down, and now, a put is the right to sell something…”) until he gets out the graph paper and starts walking me through it, and my eyes glazeth over and it’s very sad. Even though I understand all the little bits, I can’t understand them fast enough to get the big picture.

And the same thing happens in programming. If the basic concepts aren’t so easy that you don’t even have to think about them, you’re not going to get the big concepts.

Serge Lang, a math professor at Yale, used to give his Calculus students a fairly simple algebra problem on the first day of classes, one which almost everyone could solve, but some of them solved it as quickly as they could write while others took a while, and Professor Lang claimed that all of the students who solved the problem as quickly as they could write would get an A in the Calculus course, and all the others wouldn’t. The speed with which they solved a simple algebra problem was as good a predictor of the final grade in Calculus as a whole semester of homework, tests, midterms, and a final.

You see, if you can’t whiz through the easy stuff at 100 m.p.h., you’re never gonna get the advanced stuff.

But like I said, the good programmers stand up, write the answer on the board, sometimes adding a clever fillip (Ooh! Unicode compliant! Nice!), and it takes thirty seconds, and now I have to decide if they’re really good, so I bring out the big guns: recursion and pointers.

15 years of experience interviewing programmers has convinced me that the best programmers all have an easy aptitude for dealing with multiple levels of abstraction simultaneously. In programming, that means specifically that they have no problem with recursion (which involves holding in your head multiple levels of the call stack at the same time) or complex pointer-based algorithms (where the address of an object is sort of like an abstract representation of the object itself).

I’ve come to realize that understanding pointers in C is not a skill, it’s an aptitude. In first year computer science classes, there are always about 200 kids at the beginning of the semester, all of whom wrote complex adventure games in BASIC for their PCs when they were 4 years old. They are having a good ol’ time learning C or Pascal in college, until one day they professor introduces pointers, and suddenly, they don’t get it. They just don’t understand anything any more. 90% of the class goes off and becomes Political Science majors, then they tell their friends that there weren’t enough good looking members of the appropriate sex in their CompSci classes, that’s why they switched. For some reason most people seem to be born without the part of the brain that understands pointers. Pointers require a complex form of doubly-indirected thinking that some people just can’t do, and it’s pretty crucial to good programming. A lot of the “script jocks” who started programming by copying JavaScript snippets into their web pages and went on to learn Perl never learned about pointers, and they can never quite produce code of the quality you need.

That’s the source of all these famous interview questions you hear about, like “reversing a linked list” or “detect loops in a tree structure.”

Sadly, despite the fact that I think that all good programmers should be able to handle recursion and pointers, and that this is an excellent way to tell if someone is a good programmer, the truth is that these days, programming languages have almost completely made that specific art unnecessary. Whereas ten years ago it was rare for a computer science student to get through college without learning recursion and functional programming in one class and C or Pascal with data structures in another class, today it’s possible in many otherwise reputable schools to coast by on Java alone.

A lot of programmers that you might interview these days are apt to consider recursion, pointers, and even data structures to be a silly implementation detail which has been abstracted away by today’s many happy programming languages. “When was the last time you had to write a sorting algorithm?” they snicker.

Still, I don’t really care. I want my ER doctor to understand anatomy, even if all she has to do is put the computerized defibrillator nodes on my chest and push the big red button, and I want programmers to know programming down to the CPU level, even if Ruby on Rails does read your mind and build a complete Web 2.0 social collaborative networking site for you with three clicks of the mouse.

Even though the format of the interview is, superficially, just a candidate writing some code on the whiteboard, my real goal here is to have a conversation about it. “Why did you do it that way?” “What are the performance characteristics of your algorithm?” “What did you forget?” “Where’s your bug?”

That means I don’t really mind giving programming problems that are too hard, as long as the candidate has some chance of starting out, and then I’m happy to dole out little hints along the way, little toeholds, so to speak. I might ask someone, say, to project a triangle onto a plane, a typical graphics problem, and I don’t mind helping them with the trig (SOH-CAH-TOA, baby!), and when I ask them how to speed it up, I might drop little hints about look-up tables. Notice that the kinds of hints I’m happy to provide are really just answers to trivia questions—the kinds of things that you find on Google.

Inevitably, you will see a bug in their function. So we come to question five from my interview plan: “Are you satisfied with that code?” You may want to ask, “OK, so where’s the bug?” The quintessential Open Ended Question From Hell. All programmers make mistakes, there’s nothing wrong with that, they just have to be able to find them. With string functions in C, most college kids forget to null-terminate the new string. With almost any function, they are likely to have off-by-one errors. They will forget semicolons sometimes. Their function won’t work correctly on 0 length strings, or it will GPF if malloc fails… Very, very rarely, you will find a candidate that doesn’t have any bugs the first time. In this case, this question is even more fun. When you say, “There’s a bug in that code,” they will review their code carefully, and then you get to see if they can be diplomatic yet firm in asserting that the code is perfect.

As the last step in an interview, ask the candidate if they have any questions. Remember, even though you’re interviewing them, the good candidates have lots of choices about where to work and they’re using this day to figure out if they want to work for you.

Some interviewees try to judge if the candidate asks “intelligent” questions. Personally, I don’t care what questions they ask; by this point I’ve already made my decision. The trouble is, candidates have to see about five or six people in one day, and it’s hard for them to ask five or six people different, brilliant questions, so if they don’t have any questions, fine.

I always leave about five minutes at the end of the interview to sell the candidate on the company and the job. This is actually important even if you are not going to hire them. If you’ve been lucky enough to find a really good candidate, you want to do everything you can at this point to make sure that they want to come work for you. Even if they are a bad candidate, you want them to like your company and go away with a positive impression.

Ah, I just remembered that I should give you some more examples of really bad questions.

First of all, avoid the illegal questions. Anything related to race, religion, gender, national origin, age, military service eligibility, veteran status, sexual orientation, or physical handicap is illegal here in the United States. If their resume says they were in the Marines, you can’t ask them, even to make pleasant conversation, if they were in Iraq. It’s against the law to discriminate based on veteran status. If their resume says that they attended the Technion in Haifa, don’t ask them, even conversationally, if they are Israeli, even if you’re just making conversation because your wife is Israeli, or you love Felafel. It’s against the law to discriminate based on national origin.

Next, avoid any questions which might make it seem like you care about, or are discriminating based on, things which you don’t actually care about or discriminate based on. The best example of this I can think of is asking someone if they have kids or if they are married. This might give the impression that you think that people with kids aren’t going to devote enough time to their work or that they are going to run off and take maternity leave. Basically, stick to questions which are completely relevant to the job for which they are interviewing.

Finally, avoid brain teaser questions like the one where you have to arrange 6 equal length sticks to make exactly 4 identical perfect triangles. Or anything involving pirates, marbles, and secret codes. Most of these are “Aha!” questions—the kind of question where either you know the answer or you don’t. With these questions knowing the answer just means you heard that brain teaser before. So as an interviewer, you don’t get any information about “smart/get things done” by figuring out if they happen to make a particular mental leap.

In the past, I’ve used “impossible questions,” also known as “back of the envelope questions.” Classic examples of this are “How many piano tuners are there in Seattle?” The candidate won’t know the answer, but smart candidates won’t give up and they’ll be happy to try and estimate a reasonable number for you. Let’s see, there are probably… what, a million people in Seattle? And maybe 1% of them have pianos? And a piano needs to be tuned every couple of years? And it takes 35 minutes to tune one? All wrong, of course, but at least they’re attacking the problem. The only reason to ask a question like this is that it lets you have a conversation with the candidate. “OK, 35 minutes, but what about travel time between pianos?”

“Good point. If the piano tuner could take reservations well in advance they could probably set up their schedule to minimize travel time. You know, do all the pianos in Redmond on Monday rather than going back and forth across 520 three times a day.”

A good back-of-the-envelope question allows you to have a conversation with the candidate that helps you form an opinion about whether they are smart. A bad “Aha!” pirate question usually results in the candidate just sort of staring at you for a while and then saying they’re stuck.

If, at the end of the interview, you’ve convinced yourself that this person is smart and gets things done, and four or five other interviewers agree, you probably won’t go wrong in hiring them. But if you have any doubts, you’re better off waiting for someone better.

The optimal time to make a decision about the candidate is about three minutes after the end of the interview. Far too many companies allow interviewers to wait days or weeks before turning in their feedback. Unfortunately, the more time that passes, the less you’ll remember.

I ask interviewers to write immediate feedback after the interview, either a “hire” or “no hire”, followed by a one or two paragraph justification. It’s due 15 minutes after the interview ends.

If you’re having trouble deciding, there’s a very simple solution. NO HIRE. Just don’t hire people that you aren’t sure about. This is a little nerve wracking the first few times—what if we never find someone good? That’s OK. If your resume and phone-screening process is working, you’ll probably have about 20% hires in the live interview. And when you find the smart, gets-things-done candidate, you’ll know it. If you’re not thrilled with someone, move on.

The Joel Test: 12 Steps to Better Code

Have you ever heard of SEMA?
It's a fairly esoteric system for measuring how good a software team is. No, wait! Don't follow that link! It will take you about six years just to understand that stuff. So I've come up with my own, highly irresponsible, sloppy test to rate the quality of a software team. The great part about it is that it takes about 3 minutes. With all the time you save, you can go to medical school.

The Joel Test

1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

The neat thing about The Joel Test is that it's easy to get a quick yes or no to each question. You don't have to figure out lines-of-code-per-day or average-bugs-per-inflection-point. Give your team 1 point for each "yes" answer. The bummer about The Joel Test is that you really shouldn't use it to make sure that your nuclear power plant software is safe.

A score of 12 is perfect, 11 is tolerable, but 10 or lower and you've got serious problems. The truth is that most software organizations are running with a score of 2 or 3, and they need serious help, because companies like Microsoft run at 12 full-time.

Of course, these are not the only factors that determine success or failure: in particular, if you have a great software team working on a product that nobody wants, well, people aren't going to want it. And it's possible to imagine a team of "gunslingers" that doesn't do any of this stuff that still manages to produce incredible software that changes the world. But, all else being equal, if you get these 12 things right, you'll have a disciplined team that can consistently deliver.

1. Do you use source control?
I've used commercial source control packages, and I've used CVS, which is free, and let me tell you, CVS is fine. But if you don't have source control, you're going to stress out trying to get programmers to work together. Programmers have no way to know what other people did. Mistakes can't be rolled back easily. The other neat thing about source control systems is that the source code itself is checked out on every programmer's hard drive -- I've never heard of a project using source control that lost a lot of code.

2. Can you make a build in one step?
By this I mean: how many steps does it take to make a shipping build from the latest source snapshot? On good teams, there's a single script you can run that does a full checkout from scratch, rebuilds every line of code, makes the EXEs, in all their various versions, languages, and #ifdef combinations, creates the installation package, and creates the final media -- CDROM layout, download website, whatever.

If the process takes any more than one step, it is prone to errors. And when you get closer to shipping, you want to have a very fast cycle of fixing the "last" bug, making the final EXEs, etc. If it takes 20 steps to compile the code, run the installation builder, etc., you're going to go crazy and you're going to make silly mistakes.

For this very reason, the last company I worked at switched from WISE to InstallShield: we required that the installation process be able to run, from a script, automatically, overnight, using the NT scheduler, and WISE couldn't run from the scheduler overnight, so we threw it out. (The kind folks at WISE assure me that their latest version does support nightly builds.)

3. Do you make daily builds?
When you're using source control, sometimes one programmer accidentally checks in something that breaks the build. For example, they've added a new source file, and everything compiles fine on their machine, but they forgot to add the source file to the code repository. So they lock their machine and go home, oblivious and happy. But nobody else can work, so they have to go home too, unhappy.

Breaking the build is so bad (and so common) that it helps to make daily builds, to insure that no breakage goes unnoticed. On large teams, one good way to insure that breakages are fixed right away is to do the daily build every afternoon at, say, lunchtime. Everyone does as many checkins as possible before lunch. When they come back, the build is done. If it worked, great! Everybody checks out the latest version of the source and goes on working. If the build failed, you fix it, but everybody can keep on working with the pre-build, unbroken version of the source.

On the Excel team we had a rule that whoever broke the build, as their "punishment", was responsible for babysitting the builds until someone else broke it. This was a good incentive not to break the build, and a good way to rotate everyone through the build process so that everyone learned how it worked.

Read more about daily builds in my article Daily Builds are Your Friend.

4. Do you have a bug database?
I don't care what you say. If you are developing code, even on a team of one, without an organized database listing all known bugs in the code, you are going to ship low quality code. Lots of programmers think they can hold the bug list in their heads. Nonsense. I can't remember more than two or three bugs at a time, and the next morning, or in the rush of shipping, they are forgotten. You absolutely have to keep track of bugs formally.

Bug databases can be complicated or simple. A minimal useful bug database must include the following data for every bug:

* complete steps to reproduce the bug
* expected behavior
* observed (buggy) behavior
* who it's assigned to
* whether it has been fixed or not

If the complexity of bug tracking software is the only thing stopping you from tracking your bugs, just make a simple 5 column table with these crucial fields and start using it.

For more on bug tracking, read Painless Bug Tracking.

5. Do you fix bugs before writing new code?
The very first version of Microsoft Word for Windows was considered a "death march" project. It took forever. It kept slipping. The whole team was working ridiculous hours, the project was delayed again, and again, and again, and the stress was incredible. When the dang thing finally shipped, years late, Microsoft sent the whole team off to Cancun for a vacation, then sat down for some serious soul-searching.

What they realized was that the project managers had been so insistent on keeping to the "schedule" that programmers simply rushed through the coding process, writing extremely bad code, because the bug fixing phase was not a part of the formal schedule. There was no attempt to keep the bug-count down. Quite the opposite. The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote "return 12;" and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs. In the post-mortem, this was referred to as "infinite defects methodology".

To correct the problem, Microsoft universally adopted something called a "zero defects methodology". Many of the programmers in the company giggled, since it sounded like management thought they could reduce the bug count by executive fiat. Actually, "zero defects" meant that at any given time, the highest priority is to eliminate bugs before writing any new code. Here's why.

In general, the longer you wait before fixing a bug, the costlier (in time and money) it is to fix.

For example, when you make a typo or syntax error that the compiler catches, fixing it is basically trivial.

When you have a bug in your code that you see the first time you try to run it, you will be able to fix it in no time at all, because all the code is still fresh in your mind.

If you find a bug in some code that you wrote a few days ago, it will take you a while to hunt it down, but when you reread the code you wrote, you'll remember everything and you'll be able to fix the bug in a reasonable amount of time.

But if you find a bug in code that you wrote a few months ago, you'll probably have forgotten a lot of things about that code, and it's much harder to fix. By that time you may be fixing somebody else's code, and they may be in Aruba on vacation, in which case, fixing the bug is like science: you have to be slow, methodical, and meticulous, and you can't be sure how long it will take to discover the cure.

And if you find a bug in code that has already shipped, you're going to incur incredible expense getting it fixed.

That's one reason to fix bugs right away: because it takes less time. There's another reason, which relates to the fact that it's easier to predict how long it will take to write new code than to fix an existing bug. For example, if I asked you to predict how long it would take to write the code to sort a list, you could give me a pretty good estimate. But if I asked you how to predict how long it would take to fix that bug where your code doesn't work if Internet Explorer 5.5 is installed, you can't even guess, because you don't know (by definition) what's causing the bug. It could take 3 days to track it down, or it could take 2 minutes.

What this means is that if you have a schedule with a lot of bugs remaining to be fixed, the schedule is unreliable. But if you've fixed all the known bugs, and all that's left is new code, then your schedule will be stunningly more accurate.

Another great thing about keeping the bug count at zero is that you can respond much faster to competition. Some programmers think of this as keeping the product ready to ship at all times. Then if your competitor introduces a killer new feature that is stealing your customers, you can implement just that feature and ship on the spot, without having to fix a large number of accumulated bugs.

6. Do you have an up-to-date schedule?
Which brings us to schedules. If your code is at all important to the business, there are lots of reasons why it's important to the business to know when the code is going to be done. Programmers are notoriously crabby about making schedules. "It will be done when it's done!" they scream at the business people.

Unfortunately, that just doesn't cut it. There are too many planning decisions that the business needs to make well in advance of shipping the code: demos, trade shows, advertising, etc. And the only way to do this is to have a schedule, and to keep it up to date.

The other crucial thing about having a schedule is that it forces you to decide what features you are going to do, and then it forces you to pick the least important features and cut them rather than slipping into featuritis (a.k.a. scope creep).

Keeping schedules does not have to be hard. Read my article Painless Software Schedules, which describes a simple way to make great schedules.

7. Do you have a spec?
Writing specs is like flossing: everybody agrees that it's a good thing, but nobody does it.

I'm not sure why this is, but it's probably because most programmers hate writing documents. As a result, when teams consisting solely of programmers attack a problem, they prefer to express their solution in code, rather than in documents. They would much rather dive in and write code than produce a spec first.

At the design stage, when you discover problems, you can fix them easily by editing a few lines of text. Once the code is written, the cost of fixing problems is dramatically higher, both emotionally (people hate to throw away code) and in terms of time, so there's resistance to actually fixing the problems. Software that wasn't built from a spec usually winds up badly designed and the schedule gets out of control. This seems to have been the problem at Netscape, where the first four versions grew into such a mess that management stupidly decided to throw out the code and start over. And then they made this mistake all over again with Mozilla, creating a monster that spun out of control and took several years to get to alpha stage.

My pet theory is that this problem can be fixed by teaching programmers to be less reluctant writers by sending them off to take an intensive course in writing. Another solution is to hire smart program managers who produce the written spec. In either case, you should enforce the simple rule "no code without spec".

Learn all about writing specs by reading my 4-part series.

8. Do programmers have quiet working conditions?
There are extensively documented productivity gains provided by giving knowledge workers space, quiet, and privacy. The classic software management book Peopleware documents these productivity benefits extensively.

Here's the trouble. We all know that knowledge workers work best by getting into "flow", also known as being "in the zone", where they are fully concentrated on their work and fully tuned out of their environment. They lose track of time and produce great stuff through absolute concentration. This is when they get all of their productive work done. Writers, programmers, scientists, and even basketball players will tell you about being in the zone.

The trouble is, getting into "the zone" is not easy. When you try to measure it, it looks like it takes an average of 15 minutes to start working at maximum productivity. Sometimes, if you're tired or have already done a lot of creative work that day, you just can't get into the zone and you spend the rest of your work day fiddling around, reading the web, playing Tetris.

The other trouble is that it's so easy to get knocked out of the zone. Noise, phone calls, going out for lunch, having to drive 5 minutes to Starbucks for coffee, and interruptions by coworkers -- especially interruptions by coworkers -- all knock you out of the zone. If a coworker asks you a question, causing a 1 minute interruption, but this knocks you out of the zone badly enough that it takes you half an hour to get productive again, your overall productivity is in serious trouble. If you're in a noisy bullpen environment like the type that caffeinated dotcoms love to create, with marketing guys screaming on the phone next to programmers, your productivity will plunge as knowledge workers get interrupted time after time and never get into the zone.

With programmers, it's especially hard. Productivity depends on being able to juggle a lot of little details in short term memory all at once. Any kind of interruption can cause these details to come crashing down. When you resume work, you can't remember any of the details (like local variable names you were using, or where you were up to in implementing that search algorithm) and you have to keep looking these things up, which slows you down a lot until you get back up to speed.

Here's the simple algebra. Let's say (as the evidence seems to suggest) that if we interrupt a programmer, even for a minute, we're really blowing away 15 minutes of productivity. For this example, lets put two programmers, Jeff and Mutt, in open cubicles next to each other in a standard Dilbert veal-fattening farm. Mutt can't remember the name of the Unicode version of the strcpy function. He could look it up, which takes 30 seconds, or he could ask Jeff, which takes 15 seconds. Since he's sitting right next to Jeff, he asks Jeff. Jeff gets distracted and loses 15 minutes of productivity (to save Mutt 15 seconds).

Now let's move them into separate offices with walls and doors. Now when Mutt can't remember the name of that function, he could look it up, which still takes 30 seconds, or he could ask Jeff, which now takes 45 seconds and involves standing up (not an easy task given the average physical fitness of programmers!). So he looks it up. So now Mutt loses 30 seconds of productivity, but we save 15 minutes for Jeff. Ahhh!

9. Do you use the best tools money can buy?
Writing code in a compiled language is one of the last things that still can't be done instantly on a garden variety home computer. If your compilation process takes more than a few seconds, getting the latest and greatest computer is going to save you time. If compiling takes even 15 seconds, programmers will get bored while the compiler runs and switch over to reading The Onion, which will suck them in and kill hours of productivity.

Debugging GUI code with a single monitor system is painful if not impossible. If you're writing GUI code, two monitors will make things much easier.

Most programmers eventually have to manipulate bitmaps for icons or toolbars, and most programmers don't have a good bitmap editor available. Trying to use Microsoft Paint to manipulate bitmaps is a joke, but that's what most programmers have to do.

At my last job, the system administrator kept sending me automated spam complaining that I was using more than ... get this ... 220 megabytes of hard drive space on the server. I pointed out that given the price of hard drives these days, the cost of this space was significantly less than the cost of the toilet paper I used. Spending even 10 minutes cleaning up my directory would be a fabulous waste of productivity.

Top notch development teams don't torture their programmers. Even minor frustrations caused by using underpowered tools add up, making programmers grumpy and unhappy. And a grumpy programmer is an unproductive programmer.

To add to all this... programmers are easily bribed by giving them the coolest, latest stuff. This is a far cheaper way to get them to work for you than actually paying competitive salaries!

10. Do you have testers?
If your team doesn't have dedicated testers, at least one for every two or three programmers, you are either shipping buggy products, or you're wasting money by having $100/hour programmers do work that can be done by $30/hour testers. Skimping on testers is such an outrageous false economy that I'm simply blown away that more people don't recognize it.

Read Top Five (Wrong) Reasons You Don't Have Testers, an article I wrote about this subject.

11. Do new candidates write code during their interview?
Would you hire a magician without asking them to show you some magic tricks? Of course not.

Would you hire a caterer for your wedding without tasting their food? I doubt it. (Unless it's Aunt Marge, and she would hate you forever if you didn't let her make her "famous" chopped liver cake).

Yet, every day, programmers are hired on the basis of an impressive resumé or because the interviewer enjoyed chatting with them. Or they are asked trivia questions ("what's the difference between CreateDialog() and DialogBox()?") which could be answered by looking at the documentation. You don't care if they have memorized thousands of trivia about programming, you care if they are able to produce code. Or, even worse, they are asked "AHA!" questions: the kind of questions that seem easy when you know the answer, but if you don't know the answer, they are impossible.

Please, just stop doing this. Do whatever you want during interviews, but make the candidate write some code. (For more advice, read my Guerrilla Guide to Interviewing.)

12. Do you do hallway usability testing?
A hallway usability test is where you grab the next person that passes by in the hallway and force them to try to use the code you just wrote. If you do this to five people, you will learn 95% of what there is to learn about usability problems in your code.

Good user interface design is not as hard as you would think, and it's crucial if you want customers to love and buy your product. You can read my free online book on UI design, a short primer for programmers.

But the most important thing about user interfaces is that if you show your program to a handful of people, (in fact, five or six is enough) you will quickly discover the biggest problems people are having. Read Jakob Nielsen's article explaining why. Even if your UI design skills are lacking, as long as you force yourself to do hallway usability tests, which cost nothing, your UI will be much, much better.

Four Ways To Use The Joel Test

1. Rate your own software organization, and tell me how it rates, so I can gossip.
2. If you're the manager of a programming team, use this as a checklist to make sure your team is working as well as possible. When you start rating a 12, you can leave your programmers alone and focus full time on keeping the business people from bothering them.
3. If you're trying to decide whether to take a programming job, ask your prospective employer how they rate on this test. If it's too low, make sure that you'll have the authority to fix these things. Otherwise you're going to be frustrated and unproductive.
4. If you're an investor doing due diligence to judge the value of a programming team, or if your software company is considering merging with another, this test can provide a quick rule of thumb.

Anil on Software