30 November 2009

JSP Considered Harmful

Why JSPs aren't good in Model 2 web applications

Model 1 is a design approach for web applications. In this approach, one servlet is responsible for handling a browser request from beginning to end - it receives the request, runs the business logic that performs the necessary operations (which more often than not involves database access), and generates output in HTML (or possibly in other forms such as XML, PDF and image data; for the purposes of this article I'm concerned primarily with HTML).

Model 1 is okay for small web applications but it really isn't very well suited to larger ones; because the view - the part that generates the HTML - is embedded in the same servlet that performs the business logic, making modifications to the view risks damaging the business logic. Imagine that you have a fairly large application and you decide to update the look and feel of the whole website; you'd have to modify every servlet.

Early Model 1 applications written as servlets had another problem in that the generation of HTML was performed by Java code. That meant that the servlet developers had to write code to generate that HTML, usually working with a web designer who provided the HTML. Obviously this wasn't a very efficient way of working.

The answer to this was JSP, which has often been described as 'inside-out servlets'. The idea is simple: instead of writing servlets that perform business logic and embed the HTML, a JSP is essentially an HTML file that has the business logic embedded as 'scriptlets' of explicit Java code. At run time a JSP compiler translates the JSP to Java source, then compiles that source to class binaries that are then loaded in the normal way. (In some systems the JSP compiler generates bytecode directly, without the intermediate Java source.) This means that HTML designers can produce basic JSPs then hand the result over to a Java developer who can add in the scriptlets to perform the business logic.

However the business logic and the view are still contained in one file, which can still lead to the same kinds of problems - if you change the page layout you risk damaging the business logic and vice-versa.

Model 2 is intended to solve this problem. Model 2 decouples the business logic from the view, minimising the dependencies and making it possible, at least in principle, to make changes to the business logic or to the view without requiring changes on the other side of the fence.

This is where the problems begin for JSP. It's very easy to embed business logic in a JSP - that's the whole point when you consider that JSP is a Model 1 technology. However it's this very same point that makes JSP a bad fit for Model 2; it's simply too easy to drop code that really belongs in the business logic into a JSP. As a result novice developers often do just that and even seasoned developers do it from time to time, especially when a quick'n'dirty fix is needed to cure a production problem in a hurry. The result is business logic that's split between the servlet where it belongs and the JSP that should really contain only view logic. Ideally the scriptlets in a JSP for a Model 2 application should be doing no more than pulling data from the model and formatting it for presentation; any more than that probably belongs in the business logic. Maintaining this separation is a matter of discipline on the part of the developer because there's nothing in JSP to enforce it.

Struts was probably the first Model 2 framework for Java web applications although it's difficult to be certain. According to an article I read a year or two ago (I don't remember where I read it or who wrote it, and I haven't been able to find it through Google, so I can't back any of this up...), the people who developed Struts recognised that this was a problem. However they released Struts and advocated JSP as the view layer, for a number of reasons: first, there was no other mature view technology available at the time. They could have developed a more suitable package as part of Struts but getting something like that into a stable condition would have delayed the release of Struts by close to a year. Further, it would have meant that web developers wanting to use Struts would have had to learn the new view technology, making Struts a less attractive package. By using JSP, Struts would allow Model 1 developers to continue using a technology they were familiar with - they'd just have to take a more disciplined approach to JSP development.

The way I understand it (and again, I can't substantiate this), the Struts developers expected a more suitable view layer to become available before too long after Struts was released. Unfortunately it didn't work out quite that way; the first really useful package to fit the need was probably Velocity and that wasn't on the scene until something like two years later. In the meantime the JSP snowball had been rolling down the hill and picking up speed. Thanks to that momentum, JSP is still being used even today for new applications using Struts 2, Spring MVC and others, regardless of the problems it continues to cause in Model 2 environments.

The Alternatives

If JSP isn't a good fit for Model 2 applications, then what is?

The answer is quite simple: templating software. The idea is that you create HTML templates which contain embedded markers that tell the templating engine how to get dynamic data from the data model produced by the business logic and format it for presentation. The templating engine merges the template with the data model to produce the HTML and deliver it to the client browser.
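To make the idea of merging a template with a data model concrete, here is a minimal sketch in plain Java. It is not Velocity or FreeMarker code; the class name TemplateSketch, the ${key} marker syntax and the merge() method are all assumptions made up for this illustration, and a real engine adds loops, conditionals, escaping and much more.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy template merge (hypothetical, for illustration only). */
class TemplateSketch {
    // Matches markers of the form ${name}
    private static final Pattern MARKER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Replaces each ${key} marker with the model value; unknown markers are left as-is. */
    static String merge(String template, Map<String, ?> model) {
        Matcher m = MARKER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Object value = model.get(m.group(1));
            String replacement = (value == null) ? m.group() : value.toString();
            // quoteReplacement stops '$' and '\' in the value being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "<h1>Hello, ${user}!</h1><p>You have ${count} new messages.</p>";
        // The model would normally be produced by the business logic
        System.out.println(merge(template, Map.of("user", "Alice", "count", 3)));
    }
}
```

The point to notice is that the template can only read from the model it is handed; there is nowhere to put business logic, which is exactly the separation Model 2 asks for.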

Velocity and FreeMarker are probably the two most widely used templating packages today. Integrating either one is pretty simple. Both packages include ready-to-use servlets that fit with Struts 1, so in both cases all you have to do is configure your application to forward the request/response to the relevant servlet instead of the JSP servlet. Struts 2 has "Results" classes to provide Velocity and FreeMarker views; similarly Spring MVC includes view resolvers and views for Velocity and FreeMarker.

Which is better, Velocity or FreeMarker? I'll probably be writing an in-depth article comparing the two quite soon, but as a short answer... both have their own advantages and drawbacks. Velocity is simpler to learn, but FreeMarker offers more built-in functionality. Velocity is a bit more forgiving than FreeMarker but that may not be such a good thing, especially for larger applications. If I were asked to lay my cards on the table right now, I'd have to say that while I've used Velocity extensively over the last few years, I will almost certainly be using FreeMarker in the future.

In Summary...


  • Don't use JSP for Model 2 applications.

  • Use Velocity or FreeMarker instead.

  • Once you've developed a web application with Velocity or FreeMarker, I can almost guarantee you'll never want to write another JSP again.


27 June 2008

ClassLoaders and Web Applications

This post has been updated to correct some errors

In the previous posting I described the class loader hierarchy, and mentioned that you can use standard Java API classes and methods to create a branching tree structure of loaders. I also touched on the fact that Java EE application servers use this to ensure that each web application gets its own private load path for classes.

However, the loading rules as I described them don't quite apply to web applications, so this article is intended to cover the differences.

I'm describing the way things work in Tomcat; other servers should be doing something similar.

Tomcat Loader Hierarchy

As I mentioned in the previous post, ordinary Java applications have three loaders at run time: the Boot loader, which is usually native, platform-specific code built into the JVM and searches the standard class libraries; the Extension class loader, which searches JARs in the $JAVA_HOME/jre/lib/ext directory; and the System class loader, which searches directories and JARs specified in the CLASSPATH. The Extension loader is the parent of the System loader, and the Boot loader is (in practice) the parent of the Extension loader. Slight correction: some JVMs combine the Boot and Extension loaders into one.

Tomcat adds two more levels to this hierarchy for web applications: at the bottom, a WebappClassLoader (one for each deployed web application) with the application's WEB-INF/classes and all the archive files in the WEB-INF/lib directory set as the search path; and above that a StandardClassLoader (one only for the whole server), with its search path set as Tomcat's lib directory and all the archive files in it.

Tomcat's system class loader is the parent of this StandardClassLoader and has its search path set to include Tomcat's bootstrap.jar file and little else.

The StandardClassLoader class is a subclass of URLClassLoader but doesn't add or override anything from that class - it's functionally identical.

The WebappClassLoader class, on the other hand, subclasses URLClassLoader but reimplements most if not all of the methods. Here's why...

Servlet API loader rules

The Servlet API specification is the root of things here; it says that the loader search algorithm for web applications should ensure that classes and JARs packaged in the WAR should be searched before any of the servlet container's library JAR files, but should not allow the application to override any of the standard Java classes.

Tomcat's WebappClassLoader achieves the first part of this (searching the WAR contents first) by not delegating searches to its parent until after it has already searched its own repositories - the opposite of the usual procedure. (I'm not clear on how it ensures that standard classes don't get overridden - the code in that area is tricky to follow.) Major correction required here: In fact the WebappClassLoader delegates directly to the System class loader first, then searches its own repositories, then finally delegates to its parent. This ensures that JRE classes can't be overridden (they get searched first) and that repositories are searched before Tomcat's lib contents.
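The corrected search order (system loader first, then the loader's own repositories, then the parent) can be sketched as a small URLClassLoader subclass. This is not Tomcat's code; WebappStyleLoader and its canLoad() helper are hypothetical names used only to show the order of the three steps.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

/** Illustrative loader using the order: system loader, own repositories, parent. */
class WebappStyleLoader extends URLClassLoader {
    private static final ClassLoader SYSTEM = ClassLoader.getSystemClassLoader();

    WebappStyleLoader(URL[] repositories, ClassLoader parent) {
        super(repositories, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);      // already loaded by this loader?
            if (c == null) {
                try {
                    c = SYSTEM.loadClass(name);      // 1. system loader, so JRE classes always win
                } catch (ClassNotFoundException e) {
                    // not a JRE/system class, keep looking
                }
            }
            if (c == null) {
                try {
                    c = findClass(name);             // 2. this loader's own repositories
                } catch (ClassNotFoundException e) {
                    // not in the repositories either
                }
            }
            if (c == null) {
                ClassLoader parent = getParent();
                if (parent == null) {
                    throw new ClassNotFoundException(name);
                }
                c = parent.loadClass(name);          // 3. finally, the parent loader
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Convenience for experiments: true if a loader with an empty search path can load the class. */
    static boolean canLoad(String name) {
        try (WebappStyleLoader loader = new WebappStyleLoader(new URL[0], SYSTEM)) {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canLoad("java.lang.String"));
        System.out.println(canLoad("no.such.Clazz"));
    }
}
```

Note that in a stand-alone program the system loader also sees everything on the CLASSPATH, so step 1 finds far more than it would inside Tomcat, where the system class path holds bootstrap.jar and little else.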

That covers the important differences that apply to web applications. There is one other small aspect that I'd like to mention.

Getting Resources

The Servlet API specification also mandates that web applications must be able to locate their own resources using ClassLoader.getResource(). The hierarchy as described achieves this. However the specification says nothing about the static ClassLoader.getSystemResource() method and in fact this method is of little use in web applications because the System class loader doesn't know anything about the web application's resources (as you can see from where it is in the hierarchy as described).


06 June 2008

Modifying the CLASSPATH at run time

Here's a question that comes my way occasionally: How can you change the search path for class loading at run time?

For example, let's say I have an application that reads the name of a JAR file from an external source, and then needs to add that JAR to the classpath so that it can load classes from it. This is something that's more likely to come up in a server environment, where you need the server to be able to add plug-in classes dynamically. For example, application servers like Tomcat need to be able to unpack a WAR file when requested; after unpacking there will be a 'classes' directory and a 'lib' directory full of JARs, all of which have to be added to the loading path so that the application can be started.

The solution to this problem requires an understanding of how the ClassLoader hierarchy works, so I'm going to cover that in some detail first.

ClassLoaders

The JVM includes a loader, usually referred to as the Boot loader. The default search path that this loader uses includes the Java runtime classes - java.lang, java.util, etc.

When the JVM is started it creates a ClassLoader object (loaded by the boot loader), usually referred to as the Extension ClassLoader. Its search path includes several JAR files found in the JVM's jre/lib/ext/ directory.

Then, the System loader is created. The search path for this loader is initialized from the CLASSPATH environment variable or from the value passed as the -cp option on the command line. (The label 'System' is a bit confusing; personally I think 'Application' class loader would be a more accurate and descriptive name.)

ClassLoaders are arranged in a hierarchy; each loader has a parent loader. The extension loader is the parent of the system loader; the parent of the extension loader is usually set as null. The boot loader is something of an exception in this respect - it's technically the parent of the extension loader but because it's part of the native implementation of the JVM and hence not a Class in the usual sense, it normally can't be accessed as a Java object.
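You can see this chain for yourself by starting at the system loader and following getParent() until it returns null. The sketch below does just that; the class name LoaderChain is made up for the example, and the loader names it prints vary between JVM versions (Java 9 and later show a platform loader where older JVMs show the extension loader).

```java
import java.util.ArrayList;
import java.util.List;

/** Walks a class loader's parent chain (illustrative helper). */
class LoaderChain {
    /** Returns the given loader followed by its parent, grandparent and so on. */
    static List<ClassLoader> chain(ClassLoader start) {
        List<ClassLoader> loaders = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            loaders.add(cl);
        }
        return loaders;
    }

    public static void main(String[] args) {
        for (ClassLoader cl : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // Classes loaded by the boot loader report null as their loader,
        // because the boot loader isn't available as a Java object.
        System.out.println("Loader of java.lang.String: " + String.class.getClassLoader());
    }
}
```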

When the JVM recognizes that it needs to load a new class, it calls a loader to do that. The loader it chooses is the same loader that loaded the class where the new class is first referenced at run time - that means that by default, when your application code first references a class that hasn't yet been loaded, it will call the system loader (the one that loaded your application classes).

Before searching its own path, the loader delegates the request to its parent if it has one. The result of this is that all requests for new class references get delegated all the way up to the boot loader. So, if your code has requested a class that's to be found in the Java runtime, such as java.util.Map or java.text.Format, the boot loader will find and load the class.

If the loader can't locate the class it tells the caller - so if the class you requested is not in the boot loader path, the boot loader reports that back to the extension loader that called it. If the class isn't in any of the extension JARs, it gets passed back to the system loader. The system loader then attempts to find the class and in the case of your application classes, this would be where those get resolved. (Of course, if the request makes it all the way back down the hierarchy without the class being found, you'll get a ClassNotFoundException.)

To expand slightly: when a loader is called to load a class, this is the sequence of actions:


  • 1 - the loader checks its local data to see if it has already loaded the class. If it finds it, the loader returns at this point.

  • 2 - otherwise the loader delegates to its parent (or to the boot loader if there is no parent). If the parent finds the class, the loader returns to its caller at this point.

  • 3 - if the parent doesn't find the class, the loader attempts to find the class definition in its own search path. If the class definition is found in the path, the class is loaded and added to the loader's local data, and the loader returns.

  • 4 - if this point is reached, the class hasn't been found - the loader returns control to its caller indicating such.
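As a rough sketch, the parent-first sequence looks like this when written out as a loadClass() override. It mirrors what java.lang.ClassLoader.loadClass() already does by default (which looks in the loader's own record of loaded classes before delegating), so you would never need to write it yourself; ParentFirstLoader and tryLoad() are hypothetical names for this illustration, and the sketch assumes a non-null parent.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Objects;

/** The standard parent-first algorithm, spelled out for illustration. */
class ParentFirstLoader extends URLClassLoader {
    ParentFirstLoader(URL[] searchPath, ClassLoader parent) {
        super(searchPath, Objects.requireNonNull(parent, "this sketch needs a parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);       // loaded earlier by this loader?
            if (c == null) {
                try {
                    c = getParent().loadClass(name);  // delegate up the chain
                } catch (ClassNotFoundException e) {
                    // the parent chain couldn't find it
                }
            }
            if (c == null) {
                c = findClass(name);                  // search our own path; throws if not found
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Returns the class, or null if it can't be found (empty search path, system loader as parent). */
    static Class<?> tryLoad(String name) {
        try (ParentFirstLoader loader =
                 new ParentFirstLoader(new URL[0], ClassLoader.getSystemClassLoader())) {
            return loader.loadClass(name);
        } catch (ClassNotFoundException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("java.util.Map"));
        System.out.println(tryLoad("no.such.Clazz"));
    }
}
```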



The Answer

Back to the original question: How to add more places to the search? The way to do that is to create a new loader with the locations you want to search set as its search path, and add this new loader into the hierarchy.

ClassLoader is an abstract class, and so can't be instantiated. Instead you'd normally use a URLClassLoader, which does everything you would usually need. You can create your own loader classes by extending ClassLoader, but normally this is unnecessary.

The search path for URLClassLoader is provided as an array of java.net.URL objects; each URL identifies a directory or an archive file (.jar or .zip) to be searched when loading.

Let's say I have a JAR named /tmp/my-jar.jar and it contains a class called com.example.MyClass. I need to create an instance of this class. This code should do the trick:

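    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;

    // First, set the search path
    URL[] searchPath = new URL[1];
    searchPath[0] = new File("/tmp/my-jar.jar")
            .toURI()
            .toURL();

    // Now create a new loader
    ClassLoader cl = new URLClassLoader(searchPath);

    // Now we can load from the JAR
    // (Class.newInstance() is deprecated since Java 9, so go via the constructor):
    Object o = Class.forName("com.example.MyClass",
                             true,
                             cl)
            .getDeclaredConstructor()
            .newInstance();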


A few notes about this code:

First, the URL array can contain URLs for directories as well as JAR and ZIP files. The example here has only one entry but you could provide an array containing hundreds of entries if you needed to. Note that you can't use wildcards here - each entry must point to a single archive file or directory.

Second, the loader created by the URLClassLoader constructor will have the system loader as a parent by default. You can provide a different parent as a second parameter to the constructor - this allows you to build a full-blown hierarchical tree of loaders within your application if you so wish.

Third, note the three-parameter call to Class.forName() - the first parameter is the class name, of course, as in the one-parameter call. The third parameter specifies our new loader as the one to use to load our class; the default is to use the same loader that loaded the calling class (this.getClass().getClassLoader()). The second parameter determines whether or not the class should be initialized (i.e. have its static initializer called) and you'd normally set this to true (offhand I can't think of a circumstance where you wouldn't want to do this).

Lastly, note that the new loader becomes the default for classes referenced by the newly-loaded classes. This means that MyClass can reference other classes in my-jar.jar implicitly or explicitly (i.e. using the one-parameter Class.forName() method) and the classes will be loaded correctly.

Using this you can create a structure of loaders organized as you need to implement different search paths for different requirements. For example, Tomcat uses one branch of a loader tree for its own server classes and another as a connection point for loading web applications; each webapp gets its own subtree. That's how multiple webapps can coexist even with conflicting class names or versions, and without being able to access the server's internal classes.

Where the Class definitions are kept

Each loader keeps the Class objects that it loads in its own local space.

This means that if you create two loaders, each with the system loader as parent but with common directories and/or archive files in their search paths, it becomes possible to load the same class twice by invoking both class loaders to load the same class.

Other things URLClassLoader can do

To finish up, here are a couple of other useful things that you can do:

First, there's a method URLClassLoader.getURLs() that returns the loader's current search path as an array of URLs. This can be useful for debugging.

Second, loaders aren't limited to finding .class files - you can use them to find other resources that are in the search path. This applies to all loaders (i.e. ClassLoader and all its subclasses, not just URLClassLoader). This is extremely useful because it allows you to, for example, read from a property file embedded inside a JAR. Some methods that are especially useful are:

ClassLoader.getResource() - returns the URL of a named resource;

ClassLoader.getResourceAsStream() - returns an InputStream allowing you to read a named resource directly (handy for loading .properties files);

ClassLoader.getSystemResource() and ClassLoader.getSystemResourceAsStream() - static methods that do the same as the above methods, but use the system loader rather than a specific one that you may have created.
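Here is a small self-contained sketch of getURLs(), getResource() and getResourceAsStream() working together. To avoid depending on any particular JAR it writes a properties file into a temporary directory and puts that directory on the loader's search path; the names ResourceDemo, demo.properties and the 'greeting' key are assumptions invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Properties;

/** Reads a properties file through a URLClassLoader (illustrative only). */
class ResourceDemo {
    static String readGreeting() {
        try {
            // Set up a directory containing one resource (left behind in the temp area)
            Path dir = Files.createTempDirectory("loader-demo");
            Files.writeString(dir.resolve("demo.properties"), "greeting=hello\n");

            // For an existing directory, toUri() ends in '/', which URLClassLoader
            // requires in order to treat the entry as a directory rather than an archive
            URL[] searchPath = { dir.toUri().toURL() };
            try (URLClassLoader cl = new URLClassLoader(searchPath)) {
                System.out.println("Search path:  " + Arrays.toString(cl.getURLs()));
                System.out.println("Resource URL: " + cl.getResource("demo.properties"));

                Properties props = new Properties();
                try (InputStream in = cl.getResourceAsStream("demo.properties")) {
                    props.load(in);
                }
                return props.getProperty("greeting");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("greeting = " + readGreeting());
    }
}
```

The same calls work unchanged when the search path entry is a JAR instead of a directory, which is the more common case for embedded property files.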


02 March 2008

Ginger 1.4 released

I've just made the latest release of Ginger available on SourceForge.net. This version removes the code that was deprecated in version 1.3 and cleans up a few comments (to fix references to the old "Point" classes and methods that got held over from Lynx, Ginger's predecessor). I've also replaced use of the StringBuffer class with the StringBuilder class that was introduced in Java 5, and done some other general code cleanup (nothing that affects the way it works, though - it's fully backward compatible with Ginger 1.3).

This is likely to be the final version of Ginger, as it appears to work just fine and I see no need to add or change anything more at this time. I want to change gears a bit and start working on one or two other small packages I have in mind for the Bohemia project.


15 December 2007

How to set up a Derby server as a Tomcat application

What is Derby?

Derby is a 100% pure Java database engine. Its main strength is that it's embeddable, so if you have a standalone application that could benefit from keeping persistent data in a database but you don't want to have to set up a separate database server just to support it, Derby may be the answer. In the embedded mode your application starts Derby during initialization and shuts it down during termination. In the meantime it can work with database tables using SQL statements through a JDBC driver in the usual way.

Why this article?

Derby can also run as a network server, like any other database. Not only that, you can install it as a web application in a server such as Tomcat so that your database server is available whenever your web server is. In fact, the Derby distribution includes a WAR file that you can deploy that makes this very easy.

The problem is that Derby's documentation seems to concentrate on the embedded mode of operation and it's not easy to find information about setting it up as a Tomcat application without quite a bit of digging. Since I've been through this particular loop I decided to write this, partially as a reference for myself in case I need to do it again, and also for others needing to do the same thing.

This article describes the setup with Tomcat 6 running on Windows (XP Pro, specifically). Unix/Linux setup is similar and it shouldn't be a problem for anyone familiar with those systems to adapt the steps as required.

Download and unpack the distribution

Derby is distributed as a .zip or .tar.gz archive. There are a couple of different downloads available depending on whether you just want the executable JAR files or the full package including documents and sample code. Download the appropriate one then simply unpack it into a convenient place. I used the 'bin' distribution which includes everything.

The archive unpacks such that everything is inside one directory (the current version puts itself into 'db-derby-10.3.2.1-bin' for example, for the 'bin' distribution). This is Derby's 'home' directory, which we'll need to know in a moment.

Setting the environment and Java system properties

Derby only really needs one environment setting: DERBY_HOME should be set to the path of the home directory as noted above. On a Windows system use 'Control Panel -> System' then open the Advanced tab. You can add the new environment setting as a System or User value in the dialog there. 'System' is probably better, especially if you run Tomcat as a service.

You may also want to add %DERBY_HOME%\bin to your PATH setting; this isn't required but it makes it easier to run the Derby command-line tools.

One Java system property really needs to be set, too: derby.system.home identifies a directory where Derby will keep all its databases. Without this, when you run Derby under Tomcat it will default to '.', which equates to the Tomcat installation directory - the place where it keeps its conf and webapps directories. This is almost certainly not a good idea (imagine what might happen if an application tries to create a database called 'conf').

I created a "DerbyData" directory to hold all my databases. Setting the derby.system.home property is a little bit of a problem, but I found a solution that works fine; Tomcat 5 and 6 include a "Monitor" program that you can run, at least on Windows. You can use this to set up the system properties (run the monitor, right-click on the system tray icon that appears, then open the 'Configure' option and click the 'Java' tab). Add -Dderby.system.home=your-database-directory to the system properties and click OK. Tomcat doesn't need to be running to do this.

Deploying derby.war

Start Tomcat (if it's already running you should probably shut it down and restart). Start your browser, find the Tomcat Manager page and log in.

Deploy the Derby WAR file in the usual way. You'll find the WAR file (derby.war) in Derby's lib directory.

You should now be able to hit the Derby service in your browser - it'll be at /derby/derbynet. This starts the database running, and if everything's ok you'll see a status page.

Modifying Derby's web.xml

We're not quite done yet. With the default installation you have to hit the Derby URL as above to start the database running every time you restart Tomcat. What's happening is that the database is started by the servlet's init() method, and that isn't happening until Tomcat gets a request. This probably isn't what you want, especially if you're planning to create other web applications that use the service. You probably want the thing to come up on its own.

You'll find Derby's web.xml file under your Tomcat installation at webapps/derby/WEB-INF/web.xml. Open the file in a text editor and add the following tag inside the <servlet> tag:

<load-on-startup>0</load-on-startup>


This will cause Tomcat to call the servlet's initialization during startup instead of waiting for a request.

Save your text change and restart Tomcat.

Testing

After restarting Tomcat you should see a 'derby.log' file appear in your database directory. It may take a few seconds for this to happen, so give it time - if it hasn't shown up after, say, a minute, you may want to check your Tomcat logs to see if anything bad happened. Also, check your Tomcat directory to make sure the log file didn't show up there - if it did, it means that the derby.system.home setting hasn't taken for some reason, in which case you should go back and check for misspellings, etc.

If the log file appears in the right place, that probably means everything's ok so far. You can test the installation using Derby's ij command-line tool. Open a command window and do this (this assumes that you set the PATH environment as suggested earlier):

> ij
ij version 10.3
ij> connect 'jdbc:derby://localhost/testdata;create=true';
ij> exit;
>


This tells ij to open a connection to the database named 'testdata', creating it if it doesn't exist (which it shouldn't, at this point).
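The same check can be made from Java through JDBC. The sketch below is only an outline: it assumes Derby's network client driver (derbyclient.jar) is on the classpath and that the server is listening on Derby's default port, 1527; the class and method names are invented for the example. Without the driver on the classpath, or on older setups where the driver has to be loaded explicitly, tryConnect() simply reports the error rather than connecting.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Quick connectivity check against a Derby network server (illustrative only). */
class DerbyClientCheck {
    /** Builds a Derby network-client JDBC URL. */
    static String url(String host, int port, String database, boolean create) {
        return "jdbc:derby://" + host + ":" + port + "/" + database
                + (create ? ";create=true" : "");
    }

    /** Returns null if a connection could be opened, otherwise a description of the failure. */
    static String tryConnect(String jdbcUrl) {
        try (Connection connection = DriverManager.getConnection(jdbcUrl)) {
            return null;
        } catch (SQLException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        String jdbcUrl = url("localhost", 1527, "testdata", true);
        String error = tryConnect(jdbcUrl);
        System.out.println(error == null ? "Connected to " + jdbcUrl : "Failed: " + error);
    }
}
```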

Now check your database directory - there should be a subdirectory called 'testdata' if all went well. You can delete it manually, although I'd recommend shutting Derby down first - either shut down Tomcat or hit the Derby URL as above and click the Stop button.

Congratulations! You now have a database server that will be available whenever Tomcat is running.


Ginger 1.3 adds better handling of HTTP request methods

No plans for more changes in the near future

I realized after some thought that up until now Ginger was written with a bad assumption in the back of my mind - the assumption that it would only ever have to deal with GET and POST requests. This is not a good assumption, because there are six other types of request that it may have to deal with depending on how your servlet mappings are set up in the deployment descriptor.

HttpServlet is designed to identify the request type for you and takes default actions for the ones you don't explicitly set up to handle in a subclass, but for several reasons Ginger couldn't subclass from that. In fact, Ginger isn't a subclass at all, really - it just implements the Servlet interface directly.

So, in Ginger 1.3 I've added a mechanism to give you control over which HTTP request methods will be handled. By default, it will deal with GET, POST, HEAD, OPTIONS and TRACE. If it receives a request with a type not in that set, it'll respond with a 405 error. That means that by default your subclass servlet won't have to worry about CONNECT, DELETE and PUT requests.
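The general idea of filtering on the request method can be sketched in a few lines of plain Java. To be clear, this is not Ginger's API or source code: MethodFilterSketch, accept() and statusFor() are hypothetical names, and the sketch works on method names as strings so that it runs without the servlet API.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical illustration of accepting or rejecting HTTP request methods. */
class MethodFilterSketch {
    enum HttpMethod { GET, POST, HEAD, OPTIONS, TRACE, CONNECT, DELETE, PUT }

    // The default set described above
    private final Set<HttpMethod> accepted = EnumSet.of(
            HttpMethod.GET, HttpMethod.POST, HttpMethod.HEAD,
            HttpMethod.OPTIONS, HttpMethod.TRACE);

    /** Adds a method to the accepted set, e.g. during servlet initialization. */
    void accept(HttpMethod method) {
        accepted.add(method);
    }

    /** Returns 200 if the method would be handled, 405 (Method Not Allowed) otherwise. */
    int statusFor(String method) {
        try {
            // HTTP method names are case-sensitive, so an exact match is what we want
            return accepted.contains(HttpMethod.valueOf(method)) ? 200 : 405;
        } catch (IllegalArgumentException e) {
            return 405; // not one of the eight known methods
        }
    }

    public static void main(String[] args) {
        MethodFilterSketch filter = new MethodFilterSketch();
        System.out.println("GET -> " + filter.statusFor("GET"));
        System.out.println("PUT -> " + filter.statusFor("PUT"));
        filter.accept(HttpMethod.PUT);
        System.out.println("PUT after accept -> " + filter.statusFor("PUT"));
    }
}
```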

You can change the set in your servletInit(), to define the set of requests you want your servlet to accept. Note though that Ginger itself will handle OPTIONS and TRACE requests in a way you can't override; it will automatically create responses and never calls any command methods for these requests. Also, by default it handles HEAD requests by running your configured commands in the normal way and rendering output from your templates, but the rendered data is only used to generate the character count for the response header - it isn't sent back to the client. (A nice benefit of the code change that makes this work is that Ginger now buffers the output internally for GET and POST responses; this makes it possible to set the response length, which in turn makes it possible for the servlet container to use a persistent connection, which in turn improves response times.) It is possible for your command methods to modify their behaviour for HEAD requests, but doing that really isn't recommended.

Another fairly big change I've made is to deprecate the Context class and replace it with a new class called Dynamic which is otherwise identical. The reason for this is that the name 'Context' is used in several packages all related to web applications - it's used in JNDI, servlet containers use the name to refer to a web application (Tomcat does, anyway), Velocity uses it to mean something completely different again, and there are probably other places too. That means you may have to fully-qualify com.codexombie.bohemia.ginger.Context in every command method that also accesses a JNDI context, for example. That's messy and error-prone, which is why I decided to make the change. Dynamic is so named because it's intended to hold dynamic data for the template merge (among other things) and I couldn't think of a better name that wasn't already in common use. Existing code can still use the old Context class but you'll get deprecation warnings, and in any case I don't intend to leave the deprecated code in place forever - it'll be removed entirely in a future release.

On the subject of future releases, I'm not planning to make any more changes for a while; the 1.3 release is the fourth in just a few weeks, which is a bit rushed. Unless I find that there's a major problem that needs fixing, there won't be any new releases for at least a couple of months (and even then, the only change may be to remove that deprecated code I mentioned).

The new release is, of course, available for download at the usual place.


07 December 2007

Ginger 1.2

Adds basic JSP support

I've added very basic support for JSPs to Ginger through the addition of a new template processor. It's not intended to be a replacement for Velocity and doesn't indicate a shift in that direction - it's just to provide some level of support in case JSP is necessary for one or two pages in an application, and for some reason Velocity won't do what's needed. Check it out in the usual place.

I have no plans for further releases in the near future - I'm hoping it's stable enough to stand without more changes for a while to come.


30 November 2007

New Ginger Version Released

Version 1.1 is now available

The new version adds three new attributes to render command configuration; these allow control of content-type, character set encoding and also allow for a 'no-render' mode of operation in which the final rendering phase is not performed. This makes it possible for a render command to set up the HTTP response with a redirection URL or an error code return, without needing to specify a superfluous template to be rendered.

The demo application has also undergone a facelift and adds the ability to switch between HTTP and HTTPS protocols. The primary purpose of this isn't to make the demo any more functional; it's so that developers can read the source code to see how it's done under Ginger.

Last, the Quick Start manual has been updated to include the changes.

See the Bohemia home page for full details.
