Wednesday, August 11, 2010

Incentive Pay Considered Harmful

Mike Murray, a surprisingly hapless HR manager at Microsoft, made a number of goofs, but the doozie was introducing a "Ship It" award shortly after he started the job. The idea was that you would get a big lucite tombstone the size of a dictionary when your product shipped. This was somehow supposed to give you an incentive to work, you see, because if you didn't do your job-- no lucite for you! Makes you wonder how Microsoft ever shipped software before the lucite slab.

The Ship It program was announced with an incredible amount of internal fanfare and hoopla at a big company picnic. For weeks before the event, teaser posters appeared all over the corporate campus with a picture of Bill Gates and the words "Why is this man smiling?" I'm not sure what that meant. Was Bill smiling because he was happy that we now had an incentive to ship software? But at the company picnic, it became apparent that the employees felt like they were being treated like children. There was a lot of booing. The Excel programming team held up a huge sign that said "Why is the Excel team yawning?" The Ship It award was so despised that there is even a (non-fiction) episode in Douglas Coupland's classic Microserfs in which a group of programmers try to destroy one with a blow torch.

Treating your rocket scientist employees as if they were still in kindergarten is not an isolated phenomenon. Almost every company has some kind of incentive program that is insulting and demeaning.

At two of the companies I've worked for, the most stressful time of year was the twice-yearly performance review period. For some reason, the Juno HR department and the Microsoft HR department must have copied their performance review system out of the same Dilbertesque management book, because both programs worked exactly the same way. First, you gave "anonymous" upward reviews for your direct manager (as if that could be done in an honest way). Then, you filled out optional "self-evaluation" forms, which your manager "took into account" in preparing your performance review. Finally, you got a numerical score, in lots of non-scalar categories like "works well with others", from 1-5, where the only possible scores were actually 3 or 4. Managers submitted bonus recommendations upwards, which were completely ignored and everybody received bonuses that were almost completely random. The system never took into account the fact that people have different and unique talents, all of which are needed for a team to work well.

Performance reviews were stressful for a couple of reasons. Many of my friends, especially the ones whose talents were very significant but didn't show up on the traditional scales, tended to get lousy performance reviews. For example, one friend of mine was a cheerful catalyst, a bouncy cruise director who motivated everyone else when the going got tough. He was the glue that held his team together. But he tended to get negative reviews, because his manager didn't understand his contribution. Another friend was incredibly insightful strategically; his conversations with other people about how things should be done allowed everyone else to do much better work. He tended to spend more time than average trying out new technologies; in this area he was invaluable to the rest of the team. But in terms of lines of code, he wrote less than average, and his manager was too stupid to notice all his other contributions, so he always got negative reviews, too. Negative reviews, obviously, have a devastating effect on morale. In fact, giving somebody a review that is positive, but not as positive as that person expected, also has a negative effect on morale.

The effect of reviews on morale is lopsided: while negative reviews hurt morale a lot, positive reviews have no effect on morale or productivity. The people who get them are already working productively. For them, a positive review makes them feel like they are doing good work in order to get the positive review... as if they were Pavlovian dogs working for a treat, instead of professionals who actually care about the quality of the work that they do.

And herein lies the rub. Most people think that they do pretty good work (even if they don't). It's just a little trick our minds play on us to keep life bearable. So if everybody thinks they do good work, and the reviews are merely correct (which is not very easy to achieve), then most people will be disappointed by their reviews. The cost of this in morale is hard to overstate. On teams where performance reviews are done honestly, they tend to result in a week or so of depressed morale, moping, and some resignations. They tend to drive wedges between team members, often because the poorly-rated are jealous of the highly-rated, in a process that DeMarco and Lister call teamicide: the inadvertent destruction of jelled teams.

Alfie Kohn, in a now-classic Harvard Business Review article, wrote:

... at least two dozen studies over the last three decades have conclusively shown that people who expect to receive a reward for completing a task or for doing that task successfully simply do not perform as well as those who expect no reward at all. [HBR Sept/Oct 93]

He concludes that "incentives (or bribes) simply can't work in the workplace". DeMarco and Lister go further, stating unequivocally that any kind of workplace competition, any scheme of rewards and punishments, and even the old-fashioned trick of "catching people doing something right and rewarding them," all do more harm than good. Giving somebody positive reinforcement (such as stupid company ceremonies where people get plaques) implies that they only did it for the lucite plaque; it implies that they are not independent enough to work unless they are going to get a cookie; and it's insulting and demeaning.

Most software managers have no choice but to go along with performance review systems that are already in place. If you're in this position, the only way to prevent teamicide is to simply give everyone on your team a gushing review. But if you do have any choice in the matter, I'd recommend that you run fleeing from any kind of performance review, incentive bonus, or stupid corporate employee-of-the-month program.

McAfee says malware threat at a new high

BANGALORE (Reuters) – McAfee Inc, the No. 2 security software maker, said production of software code known as malware, which can harm computers and steal user passwords, reached a new high in the first six months of 2010.

McAfee said total malware production continued to soar and 10 million new pieces of malicious code were catalogued.

McAfee also warned users of Apple's Mac computers, considered relatively safe from virus attacks, that they may also be subjected to malware attacks in the future.

"For a variety of reasons, malware has rarely been a problem for Mac users. But those days might end soon," McAfee said.

"Our latest threat report depicts that malware has been on a steady incline in the first half of 2010," Mike Gallagher, chief technology officer of Global Threat Intelligence for McAfee, said in the report that was obtained by Reuters.

In April, McAfee Labs detected the Mac-based Trojan known as "OSX/HellRTS," which reads or modifies the contents of the clipboard or plays tricks on the user like opening and closing the CD drive.

"We do not want to overstate this threat. But it serves as a reminder that in this age of cybercrime, data theft and identity theft users of all operating systems and devices must take precautions," McAfee said.

After reaching a high point last year, the spread of spam messages has plateaued in the second quarter, McAfee said.

Computer attackers bank on major events such as the football World Cup to poison Internet searches and lure web users to click on infected links.

Thursday, August 5, 2010

Should all apps be n-tier?

I am often asked whether n-tier (where n>=3) is always the best way to go when building software.

Of course the answer is no. In fact, it is more likely that n-tier is not the way to go!

By the way, this isn’t the first time I’ve discussed this topic – you’ll find previous blog entries on this blog and an article at www.devx.com where I’ve covered much of the same material. Of course I also cover it rather a lot in my Expert VB.NET and C# Business Objects books.

Before proceeding further, however, I need to get some terminology out of the way. There’s a huge difference between logical tiers and physical tiers. Personally, I typically refer to logical tiers as layers and physical tiers as tiers to avoid confusion.

Logical layers are merely a way of organizing your code. Typical layers include Presentation, Business and Data – the same as the traditional 3-tier model. But when we’re talking about layers, we’re only talking about the logical organization of code. Nothing is implied about whether these layers run on different computers, in different processes on a single computer, or all within a single process on a single computer. All we are doing is discussing a way of organizing code into a set of layers defined by specific function.

Physical tiers, however, are only about where the code runs. Specifically, tiers are places where layers are deployed and where layers run. In other words, tiers are the physical deployment of layers.
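
To make the distinction concrete, here is a minimal sketch in C# (with hypothetical names – this is not CSLA .NET code) of three logical layers expressed purely as code organization. Nothing in it says where each piece will run; that is a tier decision made at deployment time.

    // Hypothetical layer organization: each layer is just a namespace boundary.
    // Whether these types end up in one process or several is a separate,
    // physical-tier decision.
    namespace App.Presentation
    {
        public class CustomerScreen
        {
            private readonly App.Business.CustomerService _service = new App.Business.CustomerService();
            public string Show(int id) => _service.GetDisplayName(id);
        }
    }

    namespace App.Business
    {
        public class CustomerService
        {
            private readonly App.Data.CustomerRepository _repository = new App.Data.CustomerRepository();
            public string GetDisplayName(int id) => _repository.Load(id).ToUpperInvariant();
        }
    }

    namespace App.Data
    {
        public class CustomerRepository
        {
            // Stand-in for real data access code.
            public string Load(int id) => "customer-" + id;
        }
    }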

Why do we layer software? Primarily to gain the benefits of logical organization and grouping of like functionality. Translated to tangible outcomes, logical layers offer reuse, easier maintenance and shorter development cycles. In the final analysis, proper layering of software reduces the cost to develop and maintain an application. Layering is almost always a wonderful thing!

Why do we deploy layers onto multiple tiers? Primarily to obtain a balance between performance, scalability, fault tolerance and security. While there are various other reasons for tiers, these four are the most common. The funny thing is that it is almost impossible to get optimum levels of all four attributes – which is why it is always a trade-off between them.

Tiers imply process and/or network boundaries. A 1-tier model has all the layers running in a single memory space (process) on a single machine. A 2-tier model has some layers running in one memory space and other layers in a different memory space. At the very least these memory spaces exist in different processes on the same computer, but more often they are on different computers. Likewise, a 3-tier model has two boundaries. In general terms, an n-tier model has n-1 boundaries.

Crossing a boundary is expensive. It is on the order of 1000 times slower to make a call across a process boundary on the same machine than to make the same call within the same process. If the call is made across a network it is even slower. It is very obvious, then, that the more boundaries you have, the slower your application will run, because each boundary has a geometric impact on performance.

Worse, boundaries add raw complexity to software design, network infrastructure, manageability and overall maintainability of a system. In short, the more tiers in an application, the more complexity there is to deal with – which directly increases the cost to build and maintain the application.

This is why, in general terms, tiers should be minimized. Tiers are not a good thing; they are a necessary evil required to obtain certain levels of scalability, fault tolerance or security.

As a good architect you should be dragged kicking and screaming into adding tiers to your system. But there really are good arguments and reasons for adding tiers, and it is important to accommodate them as appropriate.

The reality is that almost all systems today are at least 2-tier. Unless you are using an Access or dBase style database your Data layer is running on its own tier – typically inside of SQL Server, Oracle or DB2. So for the remainder of my discussion I’ll primarily focus on whether you should use a 2-tier or 3-tier model.

If you look at the CSLA .NET architecture from my Expert VB.NET and C# Business Objects books, you’ll immediately note that it has a construct called the DataPortal which is used to abstract the Data Access layer from the Presentation and Business layers. One key feature of the DataPortal is that it allows the Data Access layer to run in-process with the business layer, or in a separate process (or machine) all based on a configuration switch. It was specifically designed to allow an application to switch between a 2-tier or 3-tier model as a configuration option – with no changes required to the actual application code.
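
Here is a rough sketch of that idea in C#. It is not the actual CSLA .NET DataPortal API – the type names and the "DataPortalProxy" config key are hypothetical – but it shows how the 2-tier/3-tier decision can be pushed into configuration instead of application code.

    using System;
    using System.Configuration;

    public interface IDataPortalProxy
    {
        object Fetch(Type businessType, object criteria);
    }

    // 2-tier: the Data Access layer runs in-process with the Business layer.
    public class LocalProxy : IDataPortalProxy
    {
        public object Fetch(Type businessType, object criteria)
        {
            var obj = Activator.CreateInstance(businessType);
            // ...invoke the object's data access code directly...
            return obj;
        }
    }

    // 3-tier: the call is forwarded to an application server.
    public class RemoteProxy : IDataPortalProxy
    {
        public object Fetch(Type businessType, object criteria)
        {
            // ...serialize the request, send it to the app server, return the result...
            throw new NotImplementedException("remote transport omitted from this sketch");
        }
    }

    public static class DataPortal
    {
        // The proxy type comes from configuration, so switching between 2-tier
        // and 3-tier is a config change, not an application code change.
        private static readonly IDataPortalProxy Proxy =
            (IDataPortalProxy)Activator.CreateInstance(
                Type.GetType(ConfigurationManager.AppSettings["DataPortalProxy"]
                             ?? typeof(LocalProxy).AssemblyQualifiedName));

        public static object Fetch(Type businessType, object criteria) =>
            Proxy.Fetch(businessType, criteria);
    }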

But even so, the question remains whether to configure an application for 2 or 3 tiers.

Ultimately this question can only be answered by doing a cost-benefit analysis for your particular environment. You need to weigh the additional complexity and cost of a 3-tier deployment against the benefits it might bring in terms of scalability, fault tolerance or security.

Scalability flows primarily from the ability to get database connection pooling. In CSLA .NET the Data Access layer is entirely responsible for all interaction with the database. This means it opens and closes all database connections. If the Data Access layer for all users is running on a single machine, then all database connections for all users can be pooled. (This does assume, of course, that all users employ the same database connection string, including the same database user id – that’s a prerequisite for connection pooling in the first place.)
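
As a rough illustration (assuming SQL Server and ADO.NET; the connection string, table and column names are placeholders), the Data Access layer opens and closes its connection around each operation. Because every call on the application server uses the same connection string – including the same database user id – ADO.NET can serve all of those calls from one shared pool of physical connections.

    using System.Data.SqlClient;   // ADO.NET pools connections per connection string

    public class CustomerDal
    {
        // One shared connection string (same server, same database user id for
        // every caller) is what lets the pool reuse physical connections.
        private const string ConnectionString =
            "Server=dbserver;Database=AppDb;User Id=app_user;Password=...;";

        public string GetName(int customerId)
        {
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand(
                       "SELECT Name FROM Customers WHERE Id = @id", connection))
            {
                command.Parameters.AddWithValue("@id", customerId);
                connection.Open();                        // borrows a pooled connection
                return (string)command.ExecuteScalar();   // Dispose returns it to the pool
            }
        }
    }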

The scalability proposition is quite different for web and Windows presentation layers.

In a web presentation the Presentation and Business layers are already running on a shared server (or server farm). So if the Data Access layer also runs on the same machine, database connection pooling is automatic. In other words, the web server is an implicit application server, so there’s really no need to have a separate application server just to get scalability in a web setting.

In a Windows presentation the Presentation and Business layers (at least with CSLA .NET) run on the client workstation, taking full advantage of the memory and CPU power available on those machines. If the Data Access layer is also deployed to the client workstations then there’s no real database connection pooling, since each workstation connects to the database directly. By employing an application server to run the Data Access layer all workstations offload that behavior to a central machine where database connection pooling is possible.

The big question with Windows applications is at what point to use an application server to gain scalability. Obviously there’s no objective answer, since it depends on the IO load of the application, pre-existing load on the database server and so forth. In other words, it is very dependent on your particular environment and application. This is why the DataPortal concept is so powerful, because it allows you to deploy your application using a 2-tier model at first, and then switch to a 3-tier model later if needed.

There’s also the possibility that your Windows application will be deployed to a Terminal Services or Citrix server rather than to actual workstations. Obviously this approach totally eliminates the massive scalability benefits of utilizing the memory and CPU of each user’s workstation, but it does have the upside of reducing deployment cost and complexity. I am not an expert on either server environment, but it is my understanding that each user session has its own database connection pool on the server, thus acting the same as if each user had their own separate workstation. If this is actually the case, then an application server would provide a benefit by enabling database connection pooling. However, if I’m wrong and all user sessions share database connections across the entire Terminal Services or Citrix server, then having an application server would offer no more scalability benefit here than it does in a web application (which is to say virtually none).

Fault tolerance is a bit more complex than scalability. Achieving real fault tolerance requires examination of all failure points that exist between the user and the database – and of course the database itself. And if you want to be complete, you must also consider the user to be a failure point, especially when dealing with workflow, process-oriented or service-oriented systems.

In most cases adding an application server to either a web or Windows environment doesn’t improve fault tolerance. Rather it merely makes it more expensive because you have to make the application server fault tolerant along with the database server, the intervening network infrastructure and any client hardware. In other words, fault tolerance is often less expensive in a 2-tier model than in a 3-tier model.

Security is also a complex topic. For many organizations however, security often comes down to protecting access to the database. From a software perspective this means restricting the code that interacts with the database and providing strict controls over the database connection strings or other database authentication mechanisms.

Security is a case where 3-tier can be beneficial. By putting the Data Access layer onto its own application server tier we isolate all code that interacts with the database onto a central machine (or server farm). More importantly, only that application server needs to have the database connection string or the authentication token needed to access the database server. No web server or Windows workstation needs the keys to the database, which can help improve the overall security of your application.

Of course we must always remember that switching from 2-tier to 3-tier decreases performance and increases complexity (cost). So any benefits from scalability or security must be sufficient to outweigh these costs. It all comes down to a cost-benefit analysis.

Tuesday, July 27, 2010

World Will Run Out of Internet Addresses in Less Than a Year, Experts Predict

Experts predict the world will run out of internet addresses in less than a year, the Sydney Morning Herald reported Monday.

The internet protocol used by the majority of web users, IPv4, provides for about four billion IP addresses -- the unique 32-bit number used to identify each computer, website or internet-connected device.

There are currently only 232 million IP addresses left -- enough for about 340 days -- thanks to the explosion in smartphones and other web-enabled devices.

"When the IPv4 protocol was developed 30 years ago, it seemed to be a reasonable attempt at providing enough addresses," carrier relations manager at Australian internet service provider (ISP) Internode John Lindsay told the Herald.

"Bearing in mind that at that point personal computers didn't really exist, the idea that mobile phones might want an IP address hadn't occurred to anybody because mobile phones hadn't been invented [and] the idea that air-conditioners and refrigerators might want them was utterly ludicrous."

The solution to the problem is IPv6, which uses a 128-bit address. It would give everyone in the world more than four billion addresses each, but most of the internet industry has so far been reluctant to introduce it.
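
For a sense of the numbers involved (a back-of-the-envelope calculation, not a figure from the article), the two address spaces compare roughly as follows:

    using System;
    using System.Numerics;

    class AddressSpace
    {
        static void Main()
        {
            BigInteger ipv4 = BigInteger.Pow(2, 32);    // 4,294,967,296 -- about 4.3 billion
            BigInteger ipv6 = BigInteger.Pow(2, 128);   // about 3.4 x 10^38

            Console.WriteLine("IPv4 addresses: " + ipv4.ToString("N0"));
            Console.WriteLine("IPv6 addresses: " + ipv6.ToString("N0"));

            // Even split among billions of people and devices, IPv6 leaves each
            // of them with vastly more than the entire IPv4 address space.
            Console.WriteLine("IPv6 per 7 billion people: " + (ipv6 / 7000000000L).ToString("N0"));
        }
    }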

It would require each device that connects to the internet to be reconfigured or upgraded, with some users even being forced to buy new hardware, the Sydney Morning Herald reported.

In the meantime ISPs may force multiple customers to share IP addresses, which may lead to common applications, such as Gmail and iTunes, ceasing to work.

There are also fears a black market of IP addresses may spring up.

Wednesday, July 14, 2010

How to Increase Query Speed by 3 Orders of Magnitude with no Indexes

Many years ago, shortly after I had first switched from embedded to relational database development, I saw my boss optimize performance so elegantly, so beautifully that it changed the way I approach the whole of development. He indeed figured out how to cut the running time of a query from about three hours to about 15 seconds without spending any time at all with the mechanics of query optimization.

First, some background. We were developing an integrated solution for managing a vaccine laboratory (using Oracle, but that is not relevant here). We were well along in the development process, basically starting the user acceptance phase. There was a very important screen that was taking about 3 hours to populate. Since this was a transactional screen, with performance like that the system was unusable.

Managers were starting to freak out. Important Meetings were being scheduled. The DBA was hauled onto the carpet, his explanations of "it's only a development instance" ignored. Diagnostics were being run, the database was in extents (in Oracle World, A Very Bad Thing (tm)), database monitors were flashing red. The project sponsor was called in.

My boss, a true genius, kept his head while others were losing theirs. He calmly went to the business people, the lab techs who were going to actually use the system. He asked a simple question: does the data populating this screen need to be current as of this instant, or would data as of the previous midnight be OK?

"As of midnight would be perfectly fine", the business people said. My boss then created a snapshot (an indexed view in MSSQL, now called a materialized view in Oracle) based on the query, to be run at midnight every day. The screen was now populated in 15 seconds, as opposed to 3 hours. The crisis was averted, the problem fixed, and everybody was happy.

Captain Kirk employed a similar strategy for the Kobayashi Maru game. For those non-Trekkies out there, Captain Kirk won a rigged game by changing the rules of the game. Unfortunately, we are programmed by school to think of that as "cheating", but I think it a shame that we don't do it more often in our work of creating tools for people to use. All too often, we struggle with tasks that are not useful or important to the people who actually use our work, just because it is written down somewhere or because we learned the "proper" way in school.

In addition, all too often, we only talk directly with our users after experiencing a lot of drama. Then, once we sit together, we find out what they really need and want and then we slap our heads and say, that's easy!

That's a game no one enjoys playing. Here are some tips to avoid getting caught in that trap.

* Spend some time with your end users, if possible. Get to know them as people, not just users. Find out what they actually do, and what they actually want to use your tool to do. Be patient with their lack of technical skill; remember that they probably know things you don't, and that the more knowledge they have about what you do, the easier your job will be.
* Look at production data. If you get the chance to create prototypes, use production data in your prototypes.
* If you're in a position of some authority, eliminate the distinction between "analysts" and "programmers". Set it up so that everyone does both.
* Finally, always remember what Liberace said: without the business, there's no show.