Saturday, May 21, 2011

DataStage Custom Routine to Get a File Size

Below is a DataStage custom transform routine to get the size of a file. The full path of the file is passed in as a parameter called "Filename".

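* Build a shell command that prints the fifth column (size in bytes) of the ls listing.
CMD = "ls -la " : Filename : " | awk '{print $5}'"
* Run the command through the UNIX shell; Output receives whatever the command prints.
CALL DSExecute("UNIX",CMD,Output,SystemReturnCode)
* Keep only the first field-mark-delimited line of the output.
size = Group(Output, @FM, 1)
* Return the size when it is numeric; otherwise return -1.
Ans = If Num(size) Then size Else -1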

If the file doesn't exist, -1 is returned.

Monday, May 16, 2011

Securing ELMAH with Independent HTTP Authentication

Introduction

ELMAH is an open source, plug-&-play solution for logging and reporting unhandled errors in ASP.NET web applications. By deploying it to the GAC (global assembly cache) and configuring it at server level (machine.config or root web.config), you can achieve a zero footprint at application level. Existing and future applications automatically get error logging and reporting capability without a single line of code modification or configuration change. This is extremely attractive to IT shops with a lot of home-grown ASP.NET web applications. ELMAH's inadequate security features, however, might limit its adoption within enterprises. In this article, I will show how to secure ELMAH with independent HTTP authentication.

Inadequate security in ELMAH

ELMAH's built-in security can only turn remote access on or off. In other words, it uses server accessibility as its security defense. This is inadequate because the people who want to use ELMAH may or may not have server access. Application developers are the ones most interested in ELMAH, since it provides important clues that help them reproduce, identify and fix bugs. The people who can log on to servers, however, are server administrators. In open source projects or small IT shops, developers are usually also administrators, so the built-in security may be sufficient. In enterprises, the two roles are almost always separated. Even if developers have direct access to servers, they might prefer to read ELMAH logs remotely on their development machines rather than locally on the servers. Once remote access is turned on, there is nothing within ELMAH to prevent anyone, from innocent surfers to malicious hackers, from reading error logs.

Phil Haack proposed an enhancement on his blog using a "location" tag in web.config to allow access only for authenticated users, as shown below.
The access can be further restricted to a subset of users, e.g. developers.

Basically, the enhancement secures ELMAH as a virtual subdirectory, taking advantage of the host web application's security. In other words, ELMAH's security is outsourced to individual host web applications. This is an ingenious workaround, but it has the following limitations.
  1. You cannot do zero-footprint plug-&-play by putting location tags in machine.config or root web.config. Although location tags are allowed in higher-level configuration files, you can only point one to a specific application (e.g. /DefaultSite/MyApplication/admin/elmah.axd). You cannot point one tag to all applications (/*/*/admin/elmah.axd is invalid). Therefore, each application requires a location tag, placed either individually in the application-level web.config or collectively in a higher-level configuration file. Put another way, since we outsource ELMAH's security to individual host applications, we have to sign a separate contract with every application.
  2. The contracts are dependent on host applications' authentication mechanisms. Some applications may not require authentication at all. While these applications are open to the general public or to internal (intranet) users, it does not mean that you want to open their ELMAH logs as well.
  3. They are also dependent on host applications' authorization roles. Some applications assign roles based on business functions, such as Director, Manager, and Staff. Developers are out of the picture.
In order to make ELMAH more appealing to enterprises, we need to keep its zero-footprint plug-&-play ability while adding an authentication and authorization mechanism that is independent of the host web applications.

Solution part I: HTTP authentication

First of all, in order to authenticate and authorize an ELMAH request, we have to know who is making the request. As explained above, we are not able to use ASP.NET authentication mechanisms (i.e. Windows or Forms authentication), since they may differ from application to application. We will resort to HTTP authentication instead.

HTTP authentication is a sequence of communications between a web server and a client browser based on the 401 (Unauthorized) HTTP status code. If the server needs additional authentication information (username and password) for an incoming anonymous request, it responds with a 401 status code along with a WWW-Authenticate response header. Also included in the header is the authentication scheme name (Basic or Digest), indicating how the browser should send in the username and password. For the Basic scheme, they are concatenated with a colon (username:password) and Base64-encoded. For the Digest scheme, the password is never sent; the browser instead sends an MD5 hash computed from the credentials and a server-supplied nonce. Note that Base64 is an encoding, not encryption, so the Basic scheme should only be used over a secure channel. Upon receiving the 401 response, the browser pops up a login dialog, allowing the user to enter the username and password. The credentials (encoded or hashed according to the scheme) are inserted into the original request as an Authorization request header, and the request is then resent to the server. Once the credentials are recovered or verified at the server, there are many options to authenticate and authorize the user.
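To make the Basic exchange concrete, here is a minimal, language-neutral sketch in Python (not the C# module discussed later). The function names `build_challenge`, `encode_basic` and `decode_basic` are my own, chosen for illustration only.

```python
import base64

def build_challenge(realm):
    """Return the status code and header a server sends to ask for Basic credentials."""
    return 401, ("WWW-Authenticate", 'Basic realm="%s"' % realm)

def encode_basic(username, password):
    """Build the Authorization header value a browser would send (Basic scheme)."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def decode_basic(header_value):
    """Recover (username, password) from an Authorization header value.

    Returns None when the header is missing or is not a well-formed Basic header.
    """
    if not header_value:
        return None
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        # binascii.Error and UnicodeDecodeError are both subclasses of ValueError.
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except ValueError:
        return None
    # Split on the first colon only, since the password itself may contain colons.
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password
```

The server side only needs `build_challenge` for anonymous requests and `decode_basic` for the resent request; what it then does with the recovered credentials is a separate decision.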

Solution part II: Don’t call me (I’ll call you)

The next question is when and where we initiate the HTTP authentication so that it does not hurt ELMAH's zero-footprint plug-&-play capability. Since ELMAH requests are handled by ErrorLogPageFactory, it makes sense to have a look over there first. After poking around in ELMAH source code and a couple of Google searches, I found a discussion about an undocumented feature in ELMAH for implementing custom authorization. Below are the main points in the discussion.
  • Register an HttpModule that implements interface IRequestAuthorizationHandler.
  • The interface defines only one method: bool Authorize(HttpContext context).
  • The method is called every time an ELMAH request is made via ErrorLogPageFactory.
  • In order for ErrorLogPageFactory to discover the module in a medium-trust environment, it should inherit from HttpModuleBase and override SupportDiscoverability to return true.
  • They ran into an issue while trying to authorize a user with credentials stored in session. The Authorize method is called too early, before the ASP.NET engine associates session state with a request.
The session availability issue they discussed confirms the validity of the independent HTTP authentication approach described in the previous section. The first several bullet points answer the when-and-where question above. What we need to do is derive a class from HttpModuleBase, implement the interface IRequestAuthorizationHandler, wait for a call from ErrorLogPageFactory, and then initiate HTTP authentication from within the Authorize method.

That, however, is putting the cart before the horse. Being an HttpModule, the authorization class has the privilege to examine all incoming requests BEFORE any HttpHandlers or HttpHandlerFactories. Instead of waiting passively for a call from ErrorLogPageFactory, the module can actively intercept ELMAH requests, initiate HTTP authentication, and pass authorized requests to ErrorLogPageFactory. Don't call me, I'll call you.
By taking the active approach, the authorization module does not have to inherit from HttpModuleBase, nor implement interface IRequestAuthorizationHandler. It is completely decoupled from ELMAH, not relying on any ELMAH API (documented or undocumented). So upgrading ELMAH to newer versions won't cause any compatibility issues. It also won't affect how ELMAH is deployed or configured, keeping its zero-footprint plug-&-play capability intact.
Note: The authorization module intercepts a request as an ELMAH request if the request's URL contains "elmah.axd". This is the only requirement on ELMAH's configuration.
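The active approach boils down to a small decision made for every incoming request. The sketch below expresses that decision in Python for illustration only; it is not the C# module itself. The names `intercept`, `parse_basic` and `check_user` are hypothetical, and `check_user` stands in for whatever authentication/authorization service you plug in.

```python
import base64

ELMAH_MARKER = "elmah.axd"  # the only assumption made about ELMAH's configuration

def parse_basic(header_value):
    """Return (username, password) from a Basic Authorization header, or None."""
    if not header_value or not header_value.lower().startswith("basic "):
        return None
    try:
        # binascii.Error and UnicodeDecodeError are both subclasses of ValueError.
        decoded = base64.b64decode(header_value[6:].strip(), validate=True).decode("utf-8")
    except ValueError:
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None

def intercept(url, headers, check_user, realm="ELMAH"):
    """Decide what to do with one request before any handler sees it.

    Returns ("continue", {}) when the request may proceed to its handler, or
    ("challenge", response_headers) when a 401 response should be sent back.
    """
    # Requests that do not target ELMAH are none of this module's business.
    if ELMAH_MARKER not in url.lower():
        return "continue", {}
    credentials = parse_basic(headers.get("Authorization"))
    # Known and authorized user: hand the request over to ELMAH untouched.
    if credentials is not None and check_user(*credentials):
        return "continue", {}
    # Anonymous or rejected user: start (or restart) HTTP authentication.
    return "challenge", {"WWW-Authenticate": 'Basic realm="%s"' % realm}
```

Because the decision depends only on the URL and the request headers, it needs neither session state nor any ELMAH API, which is the point of the active approach.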

Example implementation

Below is an example implementation of the authorization module.

In order to illustrate the main points of the solution and minimize the distraction from implementation details, I simplified the implementation by hard-coding the Basic HTTP authentication scheme and the Active Directory authentication/authorization service. Although simplified, it is still good enough for practical use, as long as the Basic scheme and AD service fit your environment. To use the example, simply drop AuthModule.cs into your website's App_Code folder, or SecurElmah.dll into the bin folder, and register it in the "httpModules" section in web.config.


Also add the following 2 entries in the "appSettings" section to set up the AD server and the ELMAH authorized role.


Of course, SecurElmah can be deployed and configured globally just like ELMAH to achieve zero-footprint plug-&-play.

If the Basic scheme doesn't fit your environment, it is fairly straightforward to switch to the more secure Digest scheme. It is also possible to implement both schemes and make the choice configurable through web.config. Another place for improvement is the authentication and authorization service. You can use the ASP.NET MembershipProvider and RoleProvider to make it more flexible and configurable.
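For readers weighing the Digest scheme, the sketch below shows how the response hash is computed according to RFC 2617 with qop set to "auth". It is written in Python for illustration and is not part of SecurElmah; the function names are my own. Note that the server must be able to obtain the password, or the precomputed first hash, in order to verify the response, which is an assumption that may not hold for every user store.

```python
import hashlib

def md5_hex(text):
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce, nc, cnonce, qop="auth"):
    """Compute the 'response' value of a Digest Authorization header (qop=auth)."""
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))  # identifies the user
    ha2 = md5_hex("%s:%s" % (method, uri))                   # identifies the request
    # The server-supplied nonce ties the hash to this particular challenge.
    return md5_hex("%s:%s:%s:%s:%s:%s" % (ha1, nonce, nc, cnonce, qop, ha2))
```

The server performs the same computation with the values from the request header and compares the result with the response sent by the browser.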

Summary

Combining the convenience of zero-footprint plug-&-play with the security of an independent authentication and authorization mechanism, ELMAH is enterprise-ready.