A Diaspora Code Review

I was curious to see the Diaspora code base when it was released on September 15.  Ever since I heard about Diaspora, I – like many other developers before me – have been pondering how I might architect a distributed social network myself, so I wanted to see how they had solved certain problems I was running into.  The short answer is:  They didn’t.

First of all, let me get this out of the way:  I’m looking at this purely from an engineering perspective.  From the business and marketing perspectives – arguably more important than engineering in these modern times – Diaspora is fine.  They have generated a huge amount of PR buzz, raised a bunch of money from nowhere, moved to Silicon Valley and probably have a bright future ahead of them.  They’re even leveraging the gullible open source community to do most of their work for them.  Most people can’t do any of that.

But…

It’s clear that the September 15 release of Diaspora is a prototype – a “technology preview” as Microsoft might say.  The functionality is roughly equivalent to what you’d expect from a college class project.  You can log in, type status messages, upload pictures, and… well, that’s pretty much it.  (Okay, you can create “aspects,” too – whatever those are.  What’s wrong with naming them “groups?”)  I would estimate it needs a minimum of 6 more months of serious effort before it has any value to consumers, and even then it probably won’t have a tenth of Facebook’s features.  This release should never have been publicized, and the goal of an “Alpha” release in October seems pretty far-fetched.

The code itself is written in Ruby, so I can’t comment too much on it.  This is my first exposure to working code written in this quizzical language the kids like so much.  One comment I will make is that there aren’t very many comments. :)  And would it have killed someone to write some documentation so people could find their way around the code without individually opening every file to see what’s in it?  Just sayin’.

From what I can tell, the majority of work seems to have gone into creating a nice presentation layer – the HTML and Javascript and whatnot – and of course from their perspective, that’s the most important part anyway.  Their primary business goal, after all, is to launch Diaspora.com and sign up as many users as possible.

But the parts that I’m interested in are the “back end” parts – in particular, the parts that deal with exchanging data between “seeds” – the parts that are vitally important for an open, secure, distributed social network, in other words.  Those parts, unfortunately, appear to be an afterthought, and more-or-less delegated to other libraries (eg. Redfinger).  To me, they’re approaching the project bass-ackwards.  If you really want to create a secure, privacy-aware network, you need to think about securing the traffic between the seeds first, as that is the most vulnerable part.

I’m not the only one to notice problems with this release, by the way.  Others have pointed out serious flaws in the front-end:  Code for open-source Facebook littered with landmines – The Register.  Trouble With Diaspora – Steve Klabnik.  Alert raised over Diaspora security – THINQ.  ComputerWorld had one of the kinder articles I found:  Diaspora: It’s no Facebook…yet.  This list goes on and on.  Journalists love to tear people down.

Here’s the sad thing (to me, at least).  Because it’s open source and anyone can commit code (after agreeing to share the copyright, that is), an army of college kids are probably going to fix all of those cross-site security problems for them.  For free.  Then they’ll be able to launch their Diaspora hosting service on the backs of those poor idealistic helpers and rake in tons of “Facebook killer” venture capital.  (I’ll be interested to see how they are planning to appease advertisers that will want more and more access to private data, though.)  The cycle will continue:  I and every other developer looking to write their own social network “node” probably won’t be able to interact with Diaspora any better than we can interact with Facebook now.

I’m weirdly fascinated by this project, so I’m going to continue poring over the code to identify the protocols used.  If nothing else, it’ll be a Ruby and Git learning experience for me, and who knows, maybe I’ll contribute something.  Now if I could just figure out how to run it inside Eclipse.

Installation Notes

If you’re trying to install and run Diaspora, the instructions leave a couple of things out.  First, don’t bother trying to run it on Windows; I got a headache just reading all of the dependencies, let alone trying to find and install and configure them.  I created a blank Ubuntu 10.04 install on VirtualBox, and followed the installation instructions step-by-step.  It all went great until I got to “bundle install,” which failed with “bundle: command not found.”  Some Googling found the answer:  You have to make a symbolic link to put it into your path.

After sudo gem install bundler:

sudo ln -s /var/lib/gems/1.8/bin/bundle /usr/local/bin/bundle
cd /usr/src/diaspora
sudo bundle install

(Btw, just go ahead and put “sudo” in front of everything; almost nothing works in Ubuntu without it.)

I had to get the rake package, too:

sudo apt-get install rake

I couldn’t get ./script/server to work, so I run the app and websocket servers separately.

(Side question: How is this going to work through firewalls if it needs two ports?)

The Importance Of The Robots NoIndex Meta Tag, Or: What’s The Opposite of SEO?

I’m not much into SEO (search engine optimization).  It’s a booming industry for some, but I still have this crazily naive idea that good content will just naturally rise to the top of search results.  That said, there are some basic SEO guidelines that I try to follow on my site.  When I recently discovered my site was disappearing from Google results, I learned a very important SEO lesson.

There are some pages on my site that I actually don’t want in Google search results.  For example, perhaps counter-intuitively, my blog pages:  The content there is constantly changing, so I don’t want it to appear as a search result, because it will quickly be out of date, and users will be confused when they arrive.  Instead, I want people to find the individual blog post pages where the content will remain pretty constant.  So I placed meta tags at the top of my blog pages to prevent them from being indexed:

<meta name="robots" content="noindex" />

Now fast forward to a few weeks ago, when I was idly going over some web site logs.  According to my logs, the most popular pages on my site are “How To” posts, and the most requested page of all time is How To Rename a Windows Service.  Out of curiosity, I ran a Google search to see how far down this one particular page would be in the results, thinking it should be pretty high if people keep finding it.

It was indeed on the first page of results… on krehbiel.blogspot.com.  (I used to crosspost there.)  But on thomaskrehbiel.com?  It wasn’t there.  Like, anywhere.  Not on result page 1, 2, or even 20.  I typed in the exact title of the page, “How To Rename A Windows Service by Thomas Krehbiel,” in quotes, and got nothing.

I checked Google Webmaster Tools and quickly found the reason.  Of the 1221 pages in my sitemap, only 58 pages were indexed.  Several days later, only 4 were indexed!  Dubya Tee Eff?

Well, I think I found the cause.  Some time ago, I made some code changes in the area of my blog that renders the <head> tags.  A colossal blunder on my part allowed <meta name="robots" content="noindex" /> to go on not just the blog pages, but on every single page of the site.

I’ve corrected the problem and resubmitted my sitemap; now I just have to see how long it will take to get my pages back into the index.

So here’s the lesson for anyone looking to improve their SEO skills:  Excluding your entire website from the Google index definitely does not improve your search result rankings. :)  But seriously, make sure the meta robots tag is correct.  (Also, it would probably help to check your logs and statistics more than once or twice a year.)

(However, if you want to make a relatively private site, robots noindex is a very effective solution.)
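A mistake like this is easy to catch with an automated check.  What follows is only a rough sketch, not the code my site uses: the class name RobotsMetaCheck and the method hasNoIndex are made up for illustration, and it scans the HTML with regular expressions rather than a real HTML parser, so treat it as a sanity check at best.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports whether a page's HTML carries a robots "noindex" directive.
final class RobotsMetaCheck {
    // Any <meta ...> tag, in any letter case.
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // name="robots" (single or double quotes).
    private static final Pattern NAME_ROBOTS =
            Pattern.compile("\\bname\\s*=\\s*[\"']robots[\"']", Pattern.CASE_INSENSITIVE);
    // content="..." with the value captured in group 1.
    private static final Pattern CONTENT =
            Pattern.compile("\\bcontent\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    static boolean hasNoIndex(String html) {
        Matcher tags = META_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (!NAME_ROBOTS.matcher(tag).find()) {
                continue; // some other meta tag (description, keywords, ...)
            }
            Matcher content = CONTENT.matcher(tag);
            if (content.find()) {
                // The content value is a comma-separated list of directives.
                for (String directive : content.group(1).toLowerCase(Locale.ROOT).split(",")) {
                    if (directive.trim().equals("noindex")) {
                        return true;
                    }
                }
            }
        }
        return false;
    }
}
```

The idea would be to fetch every URL in the sitemap and flag any page where hasNoIndex returns true but shouldn’t.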

Thoughts on Diaspora and Distributed Social Networks

Like many people, I read about Diaspora a while back and thought it was a great idea.  It’s one of the few open-source projects I could see myself contributing to.  Unfortunately, it’s not “open” in the sense that the technical architecture is open to discussion – it will only become open after they define the architecture, good, bad or indifferent*.  So, as any programmer would, I thought, “Okay, I have some ideas on this, so if they don’t want my help, I’ll just write my own distributed social network.  How hard could it be?”

Pretty hard, as it turns out.  But a lot of other people have thought about this too, and many of the building blocks for a distributed social network are already out there.  OpenID is a convenient standard for universal identities that is already supported by many big-name companies (even the U.S. government is looking at it), and WebFinger is a promising standard for mapping easy-to-remember email addresses to metadata (such as an OpenID provider).  Atom, Activity Streams and PubSubHubbub can handle most, if not all, of the content distribution among servers.

As a side note, OStatus has been mentioned by Diaspora as a standard they wish to implement, however as I peruse the OStatus specification, it appears to be more of a model for a Twitter-style (follow) architecture than a Facebook-style (friend) architecture.  It does nothing to address what I believe is the biggest missing piece, described below.

On the browser side, Diaspora is using something called WebSockets to push real-time notifications to the user’s browser, but I’m not sure that’s a wise move since currently only a couple of browsers support it.  For the time being, some other push method seems like a good idea.  In any case, that’s not the most pressing problem for a social network.

In my opinion, the biggest missing piece in the distributed social network puzzle is the mutual authorization required to protect private content from strangers, while allowing approved friends to see it.  There is no open protocol (that I know of) for person A on server X to become “friends” with person B on server Y.  That problem might be easier to solve if we could assume both servers were running the same software (eg. Facebook), but what if server X is running a homegrown PHP app on Linux while server Y is running a totally different ASP.NET app on Windows?

It boils down to finding a lightweight protocol for authenticating both ends of the communication channel between one social server and another (aka. mutual authentication), in a way that is relatively easy to implement on any shared web host.  (Authenticating from a user’s browser to the server is another matter, and in my opinion handled by OpenID.)  That is, ensuring that a request to view the private content of person B is really coming from person A via server X, and not some hacker or a search bot or a man-in-the-middle attack.  (I make the assumption that in a distributed social network, users will only be communicating directly with their own server, as shown below, and not with their friend’s server.)

Server X —— Server Y
   |               |
Browser         Browser
Person A        Person B

Mutual authentication of HTTP traffic is usually only done in enterprise situations with pricey, proprietary solutions.  As far as I know, there aren’t any open standards that would be feasible for this kind of situation.
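To make the problem a little more concrete, here is a minimal sketch of one way two servers could vouch for requests to each other.  It is purely an assumption for illustration, not an existing protocol and not anything Diaspora does: it supposes server X and server Y agreed on a shared secret when the friendship was approved, and that each request carries an HMAC-SHA256 tag over the two user names and a timestamp.  The class name, method names, message layout and the five-minute window are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: signing and checking server-to-server requests with a shared secret.
final class ServerRequestSigner {
    // Assumed tolerance for clock drift and replay, in seconds.
    private static final long MAX_AGE_SECONDS = 300;

    // Computes an HMAC-SHA256 tag over "fromUser|toUser|timestamp".
    static byte[] sign(byte[] sharedSecret, String fromUser, String toUser, long timestampSeconds) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
            String message = fromUser + "|" + toUser + "|" + timestampSeconds;
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    // Recomputes the tag on the receiving server and rejects stale or altered requests.
    static boolean verify(byte[] sharedSecret, String fromUser, String toUser,
                          long timestampSeconds, long nowSeconds, byte[] tag) {
        if (Math.abs(nowSeconds - timestampSeconds) > MAX_AGE_SECONDS) {
            return false;
        }
        byte[] expected = sign(sharedSecret, fromUser, toUser, timestampSeconds);
        // MessageDigest.isEqual compares without leaking timing information.
        return MessageDigest.isEqual(expected, tag);
    }
}
```

Server Y would sign its responses the same way, so each end can check the other.  The hard parts this sketch skips are exactly the ones that need a standard: how the two servers agree on the secret in the first place, and how it gets revoked.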

I’ll be curious to see how Diaspora addresses this issue, but I suspect they’ll be focusing on other things.

* My sense is that Diaspora is more of a branding and marketing effort than a technical effort.  They have created a sort of mythical image of four kids taking on Facebook, and they present themselves almost like a garage band.  That kind of “rock star” programmer image was ubiquitous in the mid-1980s, but can it still work in 2010?  Who knows.  In any case, they have essentially stated that their goal is to get something out quick and dirty, and worry about the “implementation details” later, which is clearly a business-driven goal.  (Incidentally, that’s exactly how Facebook started, too.)

NoSQL is Coming

There has been an explosion of talk about “NoSQL” lately (ie. I’ve seen a few posts about it), and since it is every blogger’s obligation to follow the crowd and re-write what everyone else is saying, I shall now present my thoughts on NoSQL.

My first thought about NoSQL was:  What the heck is NoSQL?

Wikipedia (as of this writing) defines NoSQL as “a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees.”

Word salad.  To translate, NoSQL (also known as “structured storage”) is a new kind of database — popularized by such big name companies as Google, Amazon and Facebook — designed to store and quickly retrieve bajillions of terabytes of data, something that is challenging to pull off with your average enterprise relational database.  Another attractive feature of a NoSQL database (to places like Google, Amazon and Facebook, at least) is the inherent ability to scale up to accommodate bajillions of simultaneous users.

The NoSQL concept also seems to focus on a “non-fixed table schema.”  Presumably, that means columns (or whatever the NoSQL equivalent is — “keys,” I guess) could be added or updated at any point in the life of an application without too much trauma.  I could see this being useful for rapid software iterations, where you don’t necessarily know what the final schema is going to be when you start out.  (Eg. you roll out a version with a new table, and then in the next iteration a week later you find out you need to add or delete a column, which can be a massive pain with a large relational database.)  In olden days, the schema would be ironed out in the “design” and “alpha testing” and “beta testing” stages, but obviously we in the industry don’t have time for that stuff anymore.

From an application developer perspective, NoSQL databases appear to shift the burden of data integrity from the database to the application.  To the application, I presume the NoSQL database would look like a big dictionary or hashtable (key/value collection) – ie. a big dumb storage area whose only purpose is to read and write bits.  (Similar to, you know, a hard drive — see The Daily WTF’s April Fool’s Joke)  I wouldn’t think there’d be any need for a database administrator or database developer in a NoSQL shop; only application developers and system administrators.  There wouldn’t be any “query optimization” or “stored procedures” because there’s… wait for it… NoSQL.
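Here is a toy sketch of what I mean, under the assumption that the store really does look like a dictionary of schema-less records.  The in-memory HashMap stands in for a real NoSQL server, and the class name, keys and the “email” rule are invented for illustration.  The point is that the check a relational database would enforce with a column constraint now lives in application code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a key/value store: keys map to records with no fixed schema.
final class KeyValueStoreSketch {
    private final Map<String, Map<String, Object>> records = new HashMap<>();

    // The application, not the database, decides what a valid record is.
    void put(String key, Map<String, Object> record) {
        Object email = record.get("email");
        if (!(email instanceof String) || !((String) email).contains("@")) {
            throw new IllegalArgumentException("record needs a plausible email");
        }
        // Store a copy so later changes by the caller don't leak into the store.
        records.put(key, new HashMap<>(record));
    }

    // Returns a copy of the record, or empty if the key is unknown.
    Optional<Map<String, Object>> get(String key) {
        Map<String, Object> record = records.get(key);
        return record == null ? Optional.empty() : Optional.of(new HashMap<>(record));
    }
}
```

Two records under different keys can carry different fields (one with a “nickname,” one without) and nothing needs to be altered or migrated, which is the “non-fixed schema” idea in miniature.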

My second thought about NoSQL was:  Why should I care about NoSQL?  I’m not writing Google, Amazon or Facebook.  My database of choice works fine.  I already know how to build columns and tables, and I already know how to write applications against them.

If you’re an IT veteran, you’ll know the answer is:  Because someday your gullible CEO will drop by and say, “A 20-year-old consultant told me about how great this new-fangled NoSQL is, so we’re paying him tons of money to migrate our data warehouse to it.”  Afterwards, when your whole system is lying on the server room floor in shattered pieces and angry customers are jamming the phone lines, you’ll be the one that has to undo everything the consultant did, so knowing about NoSQL will help you do that.

Of course I kid.  There are plenty of cases where I could easily see this being a good idea.  (Like, say, if you’re writing Google, Amazon or Facebook.)

But I’m a Microsoft .NET developer, so that’s about as far as I can go in researching NoSQL.  Every available server implementation I’ve seen (http://en.wikipedia.org/wiki/Structured_storage) runs on Linux, which means Mono or Java or some other Linux-capable language as a client.  So I won’t be firing up Visual Studio 2008 to check out Cassandra anytime soon, and I don’t have the motivation to set up a Linux development environment just to play around with NoSQL.

It may not sound like it, but — unlike most new whizbang trends in the industry — NoSQL appeals to me because I’m primarily an application developer.  I’ve often found myself wishing (rashly, in most cases) that the database (or the DBA) would get out of my way and let me handle the data storage.  Generating SQL to feed to the database has always been a pain, even if it’s disguised behind ADO.NET or LINQ or some other ORM.  NoSQL sounds like it should integrate better with applications.

So I say, bring on the NoSQL!

Looking Back at 2009 Goals

A look back at how I did on my goals for 2009.

I find it helpful to write down some project goals and check back with them from time to time.  The following were the goals I wrote down last year, with the results noted after each one.

  • Finish my home page update.  Check.
  • Finish setting up the new home server.  Check.
  • Finish Microsoft MCTS certification.  I’ve already taken the 70-536 exam, I just need to finish studying and take the 70-528 exam.  I’m also required to get some certifications for work this year.   Check:  MCTS and Security+ certs obtained.
  • Learn more about Silverlight and WPF.  This is something I want to get at least a rudimentary knowledge of, but I don’t want to sacrifice everything else for it.  I peeked at Silverlight now and then but nothing significant.
  • Learn more about Flash.  My goal is to create a flash “thing” for my home page.  Not really sure what the “thing” will be yet.  I need to look at some other flash things on other sites to get some ideas.  A first step in this process would be to find some software for actually creating flash things.  Did not do anything more than glance at Flash – enough to see I don’t care for it.
  • Use more jQuery in web projects.  Check: Used in some work projects and my home page.
  • Do something with my hostgator web host or cancel it.  Maybe convert it to an ASP.NET host so I can try out some Microsoft stuff there.  UPDATE:  Oops, hostgator doesn’t have Windows hosting.  Check:  Moved my home page to Hostgator.
  • Move some more of my projects to Google Code.  Check:  I have a few projects there now, including my home page code.
  • Give Ubuntu or some other Linux variant a serious trial.  Check:  I used Ubuntu for quite a while in my media center PC.
  • Write some music.  I haven’t done much of anything musical since about 2001, and I think it’s about time to change that.  I need to organize the studio and make it as easy as possible to just sit down and start creating something.  Check:  I wrote a new song during Christmas vacation, giving me a total of two new songs for the year.
  • Try creating a regular podcast.  I don’t really have a good topic, though.  I’m thinking it should be something with a local flavor, though.  Maybe something instructional.  Check:  I did record 3 or 4 episodes of a podcast, but I didn’t think it was very good.
  • I also have this crazy idea to do some short cel animations.  Didn’t do much of anything with this.

Goals for 2010 will be in another post.

A Peek at Google Web Toolkit

So I’m looking over this Google Web Toolkit thing since someone around here thinks it’s the greatest thing since sliced bread.  I hate to disappoint but it’s conceptually the same as the much-hated ASP.NET WebForms – it’s a framework to abstract HTML and Javascript away from the programmer.  But instead of .NET and Visual Studio, Google’s version is based on Java and Eclipse.

Unfortunately this abstraction is pretty much exactly what I’m getting tired of in ASP.NET.  Microsoft went to great lengths to build this huge Web Forms framework to try to make HTML development look exactly the same as Windows Forms development, and I think there’s a growing body of evidence to support the assertion that it was a mistake (hence ASP.NET MVC).

So, not to be outdone, Google built this Web Toolkit to make HTML look like X-Windows widgets.  Deja vu.

So setting aside philosophy and getting into practicalities, the first thing I’m noticing about GWT is that there doesn’t seem to be any way to build an interface declaratively.  All the examples in the Showcase build things programmatically.  That harkens back to the early 90s when it took a thousand lines of code just to make one little gadget on the screen.  (Not to mention the inability to separate designers and programmers on the same project.)

Second thing I’m noticing (which is related to the first) is the almost 100% reliance on Javascript for the finished product.  The HTML host page is basically empty except for a <script> tag*.  I like Javascript as much as the next guy, but I’m nowhere near the point where I think all web apps should be 100% pure Javascript.  Maybe it’s better for the Googles and Facebooks of the world to offload all the processing onto the client, but is it really necessary for every web site?

(Yeah, I know I was just talking about Silverlight which is all client-side processing, too.  But that’s different. 🙂)

* Okay I see that you can insert GWT functionality into any element of the HTML, it doesn’t have to be the entire page.  But it’s hard to think of scenarios where that makes sense.  The only reason I can think of is to plug in one of the fancy GWT widgets (like the TreeView or RichText) somewhere on an existing page.  In that case, though, it would be just as easy to plug in some Javascript.

Basically I’m not seeing why this is the greatest thing since sliced bread, or why this is better than or even as good as ASP.NET.  But I guess if you already prefer Eclipse and Java, it’s probably worth checking out, so you too can experience the disappointment and frustration that ASP.NET developers have.

Exploring Java Web Development, Part 3

I’m happy to report that I’ve completed resurrecting JWebTrack, the terribly feature-incomplete bug tracking project I did for a Java class oh so long ago.  After building an appropriate database and populating it with some data, the app worked like a charm.

Well, except for one thing:  I had to change statement.executeQuery() to statement.executeUpdate() for the INSERTs.  That must be a recent development in either the MySQL connector or Java because it wasn’t like that in 2003.
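For reference, the JDBC API documents executeQuery() as the call for statements that return a ResultSet, and executeUpdate() as the call for INSERT, UPDATE and DELETE, which returns the affected row count.  Below is a minimal sketch of the kind of change involved.  It is not the actual JWebTrack code: the class name, the “bugs” table and its columns are made up, and it uses a PreparedStatement where my old code used a plain Statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch: inserting a row with executeUpdate() instead of executeQuery().
final class BugInsertSketch {
    // Returns the number of rows inserted (normally 1).
    static int insertBug(Connection conn, String title, String description) throws SQLException {
        // Table and column names are assumptions for illustration.
        String sql = "INSERT INTO bugs (title, description) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, title);
            ps.setString(2, description);
            // An INSERT produces no ResultSet, so executeUpdate() is the right call.
            return ps.executeUpdate();
        }
    }
}
```

The caller is expected to supply an open Connection, for example one obtained from DriverManager.getConnection with the MySQL connector on the classpath.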

One thing I have to admit about JSP web pages and servlets:  They load fast.  Admittedly these pages aren’t doing a lot, but they come up like lightning compared to the PHP and ASP.NET pages I’ve done lately.  (This was in the Eclipse browser… don’t know what it’s using to render pages.)

I have this crazy idea to port some of my PHP blog code to JSP and ASP.NET MVC.  Working on them in parallel should give a pretty fair estimate of the pros and cons of each development environment.

GridView, UpdatePanel and PopupControlExtender

This is a nightmarish combination to deal with in ASP.NET 3.5.  I will attempt to document what I learned today about how to get this combination working.

Here is the scenario:  Start with a GridView and a bunch of templated columns.  The first four columns of the GridView contain labels with magnifying glass icons next to them, and the rest of the columns are just TextBoxes.  The idea is that when you click the magnifying glass, a “popup” window appears that lets you pick from a list of choices.  It’s just like a DropDownList, except we can’t use a DropDownList in this case because the descriptive text of each choice could be huge.  Each selection affects the list that appears in the next column over, so the selections “cascade.”  We don’t want any postback inside the grid to redraw the full page.

The grid looks something like this, with anything that my boss might consider sensitive fuzzed out (not that it’s the slightest bit sensitive):

[Screenshot: GridView Popups]

I won’t get into the details of binding data to the grid and all the controls inside the grid; that’s a whole different subject (in brief: don’t call GridView.DataBind on every postback from Page_Load, or you won’t see any events from controls inside the grid; the usual fix is to guard the call with !IsPostBack).  I’ll just focus on the AJAX functionality.

Here is the template for the first column (oops, I guess you know it’s a Task column now):

<asp:TemplateField HeaderText="Task">
  <ItemTemplate>
    <asp:UpdatePanel ID="updateTaskPanel" runat="server" UpdateMode="Always">
      <ContentTemplate>
        <asp:Label ID="lblTask" runat="server" />
        <asp:Image ID="imgTask" runat="server" ImageUrl="~/Images/Hourglass.png" />
        <cc1:PopupControlExtender ID="pceTask" runat="server"
          TargetControlID="imgTask"
          PopupControlID="popupPanelTask"
          Position="Bottom" />
        <asp:Panel ID="popupPanelTask" runat="server" CssClass="modalPopup" style="display:none;">
          <asp:UpdatePanel ID="updatePopupTaskPanel" runat="server">
            <ContentTemplate>
              <asp:Label ID="lblTaskPopup" runat="server">Title</asp:Label>
              <asp:RadioButtonList ID="radioTasks" runat="server"
                DataTextField="DisplayLabel" DataValueField="Id" />
              <asp:Button ID="btnPopupTaskOkay" runat="server" Text="OK"
                OnClick="btnPopupTaskOkay_Click" UseSubmitBehavior="false" />
              <asp:Button ID="btnPopupTaskCancel" runat="server" Text="Cancel"
                OnClientClick='AjaxControlToolkit.PopupControlBehavior.__VisiblePopup.hidePopup(); return false;'
                UseSubmitBehavior="false" />
            </ContentTemplate>
          </asp:UpdatePanel>
        </asp:Panel>
      </ContentTemplate>
    </asp:UpdatePanel>
  </ItemTemplate>
  <ItemStyle Wrap="false" />
</asp:TemplateField>

There are several important things to note:

  • There are two ContentTemplates; one for the contents of the column and one for the contents of the popup panel.  I believe both are necessary to avoid a full-page refresh.
  • The outer UpdatePanel is set to UpdateMode="Always".  I found this necessary because changing the value of one column affected what was in the other columns.
  • The Image (aka. TargetControlID) and the PopupControlExtender are both in the same UpdatePanel.  That’s a requirement.
  • The popup Panel is also inside the ItemTemplate.  I needed to do this because each row could have different popup contents.  If they were all the same, I believe it could have been moved entirely outside the GridView.
  • There is a style="display:none;" attribute on the popup Panel.  That is necessary to avoid seeing the panel flash for a second when the page first loads.  (Don’t set Visible="false".)
  • Both Buttons in the popup panel have the attribute UseSubmitBehavior="false".  I had all kinds of problems without those.
  • The Cancel button has the attribute OnClientClick='AjaxControlToolkit.PopupControlBehavior.__VisiblePopup.hidePopup(); return false;'.  This causes the popup to close with a client-side call, rather than posting back to the server.  This results in much faster performance.

Here is the code-behind for one of the Okay buttons in a popup Panel:

protected void btnPopupTaskOkay_Click(object sender, EventArgs e)
{
    Button btn = (Button)sender;
    GridViewRow row = (GridViewRow)btn.NamingContainer;
    RadioButtonList rbl = (RadioButtonList)row.FindControl("radioTasks");
    TaskRowView dataItem = this.editingRows[row.RowIndex];
    dataItem.TaskId = int.Parse(rbl.SelectedValue);
    dataItem.IsModified = true;
 
    PopupControlExtender pce = AjaxControlToolkit.PopupControlExtender.GetProxyForCurrentPopup(Page);
    pce.Cancel();
}

The important part is the Cancel() at the end.  That is what actually dismisses the popup.  If you don’t do that, the popup will just sit there on the screen.

One other important thing to note about UpdatePanel, if you don’t already know:  Even with a partial-page refresh, the entire page is generated on the server during the round-trip; only a subset of the output is sent back to the client.  All the Page events and all the control events are fired, even for controls outside the UpdatePanel being refreshed.  That means you need to be careful in tuning the performance of partial-page renders.  For example, if your Page_Load event does a time-intensive database load, it will occur on every asynchronous refresh and performance will suffer, perhaps more than necessary.

Exploring Java Web Development, Part 2

Day two of reacquainting myself with JSP development, wherein we learn that IDEs are powerful tools but they are not very friendly to newcomers.

(Btw thanks to Red for helping out with this stuff.)

My first test was a hello world page using a simple one-property User class.  I got a static page working fine, but for some reason I couldn’t get Eclipse to import the class (<%@ page import="User" %>).  It kept saying it couldn’t find it.

I figured it had something to do with GlassFish, so I installed Apache Tomcat 6 and set it up as a server in Eclipse.  This also didn’t work at first.  I kept getting “Project facet Sun Deployment Descriptors Files version 9 is not supported” errors.  Thinking GlassFish was still interfering somehow, I uninstalled the entire EE SDK.  Suddenly the import worked and the page loaded perfectly.  (I may not have those steps exactly right, so don’t quote me on it.  I think somewhere in there I also changed some project settings that had something to do with “facets.”)

(Around this time Eclipse stopped working, complaining that I didn’t have the JRE anymore.  I had both a 32-bit and 64-bit JRE and I think it got confused or something, so I uninstalled everything, then reinstalled a 32-bit JRE, the 32-bit SE SDK and the EE SDK (without starting GlassFish!).)

Feeling empowered, I next started a new project and imported some code I wrote many years ago for a Java class, which was a very, very simple bug-tracking app connected to a MySQL database.  I downloaded the latest MySQL JDBC connector, fixed a whole bunch of warnings and errors in the code that made me wonder how it ever worked before, and everything compiled cleanly (at least according to Eclipse).  But it didn’t come up in Tomcat with or without debug, complaining of a 404 Not Found.

It turned out all the .html and .jsp files need to go in the WebContent directory.  Okay, no problem – after moving the files, the index.html page came up.  But it bombed when I clicked into the .jsp pages because it couldn’t find the JDBC driver class.  After some more trial and error I discovered that the mysql-connector-java-5.1.10-bin.jar file needed to go in the WebContent/WEB-INF/lib directory.  (Not in the Project’s Libraries where I originally put it.)
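A quick way to confirm whether the driver is visible to the running web app is to try loading its class by name from a servlet or JSP.  The sketch below is just an assumption of how such a check might look; the class and method names are made up.  The driver class name for Connector/J 5.1 is com.mysql.jdbc.Driver.

```java
// Hypothetical helper: reports whether a class can be loaded by the current class loader.
final class DriverCheck {
    static boolean isClassAvailable(String className) {
        try {
            // Inside a web app this searches WEB-INF/classes and WEB-INF/lib,
            // among other places the container makes visible.
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Calling DriverCheck.isClassAvailable("com.mysql.jdbc.Driver") from inside the app should come back false while the jar is only on the Eclipse build path, and true once it sits in WEB-INF/lib.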

And that’s where I left things, since I don’t actually have a copy of the database for this silly app, so I’ll need to recreate one before I can look at anything else.

Exploring Java Web Development, Part 1

Out of curiosity and recent disgruntlement with ASP.NET I decided to look into Java web development.  I haven’t done this since around 2002, so of course I’ve forgotten everything I ever knew about it.  Herein I will attempt to document the knowledge I uncover.

Immediately after arriving at java.sun.com, I remembered that the Java world has a bewildering array of cryptically-named technologies to sort through.  There are approximately 100,000 different choices and variations of what you can download and there is almost nothing to guide you toward what you’re supposed to get.

First of all, for web development, get the Java EE SDK, not the Java SE SDK.  (SE doesn’t have the servlets and stuff in it.)  The Java EE download page has a bunch of different options – the one I got was called “GlassFish Java EE + JDK.”  (GlassFish is apparently Sun’s alternative to Tomcat.)

Do not be alarmed at the incredibly ugly, 1990s-looking Swing window that the Java EE SDK installer brings up.  Ah, Java Swing.  It was one of the main reasons I switched to .NET.  Showing a Swing-based installer that looks nothing like a normal Windows app is an especially horrible way to introduce Windows people to Java development.  (I guess I should be thankful it even has an installer.)

[Screenshot: Java EE 5 SDK Installation Wizard - Swing components]

Beware that Windows 7 64-bit is an “unsupported installation platform” for Java EE 5 SDK Update 7.  You have to resize the installer window for the Next button to appear so you can continue anyway.

[Screenshots: Java EE 5 SDK Installation Wizard; Java EE 5 SDK Installation Wizard after Resizing]

(I couldn’t help but notice that the SE install goes to C:\Program Files\Java but the EE install goes to some crazy thing like C:\Sun\SDK.  Not very Windows-friendly.)

The installation sets up a GlassFish application server so you should be able to browse to http://localhost:8080 when it’s done.  I suppose theoretically you could start writing code at this point with a text editor and compile with command-line tools, but I’m definitely not that ambitious.

So I downloaded the Eclipse IDE for Java EE Developers.  This was a far, far less painful experience than getting the SDK.  This is the greatest thing about Eclipse:  You just unzip it somewhere, double-click the executable and it just works.  Wouldn’t life be grand if everything worked like that?

Because the SDK installed a GlassFish server and I didn’t particularly feel like trying to set up a Tomcat server, I got a GlassFish plugin for Eclipse.  Follow the handy instructions on that page and when it asks for the directory of the GlassFish application server, enter the same directory where you installed the SDK (eg. C:\Sun\SDK).  (It took me a while to figure out that was where the GlassFish application server directory was.)

So now theoretically I should be able to create a Java web app.  I’ll save that for another time, though, because I have to dig through my backups to find my old Java code.