A Diaspora Code Review

I was curious to see the Diaspora code base when it was released on September 15. Ever since I heard about Diaspora, I – like many other developers before me – have been pondering how I might architect a distributed social network myself, so I was eager to see how they had solved some of the problems I kept running into. The short answer is: they didn’t.

First of all, let me get this out of the way:  I’m looking at this purely from an engineering perspective.  From the business and marketing perspectives – arguably more important than engineering in these modern times – Diaspora is fine.  They have generated a huge amount of PR buzz, raised a bunch of money from nowhere, moved to Silicon Valley and probably have a bright future ahead of them.  They’re even leveraging the gullible open source community to do most of their work for them.  Most people can’t do any of that.

But…

It’s clear that the September 15 release of Diaspora is a prototype – a “technology preview,” as Microsoft might say. The functionality is roughly equivalent to what you’d expect from a college class project. You can log in, type status messages, upload pictures, and… well, that’s pretty much it. (Okay, you can create “aspects,” too – whatever those are. What’s wrong with naming them “groups”?) I would estimate it needs a minimum of six more months of serious effort before it has any value to consumers, and even then it probably won’t have a tenth of Facebook’s features. This release should never have been publicized, and the goal of an “Alpha” release in October seems pretty far-fetched.

The code itself is written in Ruby, so I can’t comment too much on it.  This is my first exposure to working code written in this quizzical language the kids like so much.  One comment I will make is that there aren’t very many comments. :)  And would it have killed someone to write some documentation so people could find their way around the code without individually opening every file to see what’s in it?  Just sayin’.

From what I can tell, the majority of the work seems to have gone into creating a nice presentation layer – the HTML and JavaScript and whatnot – and of course from their perspective, that’s the most important part anyway. Their primary business goal, after all, is to launch Diaspora.com and sign up as many users as possible.

But the parts that I’m interested in are the “back end” parts – in particular, the parts that deal with exchanging data between “seeds” – in other words, the parts that are vitally important for an open, secure, distributed social network. Those parts, unfortunately, appear to be an afterthought, more or less delegated to other libraries (e.g., Redfinger). To me, they’re approaching the project bass-ackwards. If you really want to create a secure, privacy-aware network, you need to think about securing the traffic between the seeds first, because that’s the most vulnerable part.
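To make that concrete, here’s roughly what the seed-to-seed discovery step looks like at the protocol level. This is only my sketch of the Webfinger flow that Redfinger implements – the host is made up, and the second URL is just an example of what the lrdd template returned by the first request typically looks like:

# 1. Ask the remote seed for its host metadata (an XRD document)
curl http://example-seed.org/.well-known/host-meta

# 2. The XRD contains an "lrdd" link template; plug an account into it to
#    fetch that user's profile links, public key, etc. (example URL only)
curl 'http://example-seed.org/webfinger?q=acct:alice@example-seed.org'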

I’m not the only one to notice problems with this release, by the way. Others have pointed out serious flaws in the front end: “Code for open-source Facebook littered with landmines” (The Register), “Trouble With Diaspora” (Steve Klabnik), and “Alert raised over Diaspora security” (THINQ). ComputerWorld had one of the kinder articles I found: “Diaspora: It’s no Facebook…yet.” The list goes on and on. Journalists love to tear people down.

Here’s the sad thing (to me, at least). Because it’s open source and anyone can commit code (after agreeing to share the copyright, that is), an army of college kids will probably fix all of those cross-site security problems for them. For free. Then they’ll be able to launch their Diaspora hosting service on the backs of those poor idealistic helpers and rake in tons of “Facebook killer” venture capital. (I’ll be interested to see how they plan to appease advertisers who will want more and more access to private data, though.) The cycle will continue: I and every other developer looking to write their own social network “node” probably won’t be able to interact with Diaspora any better than we can interact with Facebook now.

I’m weirdly fascinated by this project, so I’m going to continue poring over the code to identify the protocols used. If nothing else, it’ll be a Ruby and Git learning experience for me, and who knows, maybe I’ll contribute something. Now if I could just figure out how to run it inside Eclipse.

Installation Notes

If you’re trying to install and run Diaspora, the instructions leave a couple of things out. First, don’t bother trying to run it on Windows; I got a headache just reading all of the dependencies, let alone trying to find, install, and configure them. I created a blank Ubuntu 10.04 install on VirtualBox and followed the installation instructions step-by-step. It all went great until I got to “bundle install,” which failed with “bundle: command not found.” Some Googling found the answer: you have to create a symbolic link to put bundle on your path.

After sudo gem install bundler:

sudo ln -s /var/lib/gems/1.8/bin/bundle /usr/local/bin/bundle
cd /usr/src/diaspora
sudo bundle install
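
(If that 1.8 path doesn’t exist on your system, gem can tell you where it actually put its executables, and you can adjust the symlink to match:)

gem environment | grep "EXECUTABLE DIRECTORY"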

(Btw, just go ahead and put “sudo” in front of everything; almost nothing works in Ubuntu without it.)

I had to get the rake package, too:

sudo apt-get install rake

I couldn’t get ./script/server to work, so I run the app server and the websocket server separately instead.
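
For reference, script/server appears to just wrap those two pieces, so you can read it and run each piece in its own terminal. Roughly (assuming thin is the app server in the bundle – treat this as a sketch and check script/server for the exact commands):

cat ./script/server          # see exactly which two commands it wraps
sudo bundle exec thin start  # the Rails app server, in one terminal
# ...then run the websocket command from script/server in a second terminal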

(Side question: How is this going to work through firewalls if it needs two ports?)

The Importance Of The Robots NoIndex Meta Tag, Or: What’s The Opposite of SEO?

I’m not much into SEO (search engine optimization). It’s a booming industry for some, but I still have this crazily naive idea that good content will just naturally rise to the top of search results. That said, there are some basic SEO guidelines that I try to follow on my site. When I recently discovered my site was disappearing from Google results, I learned a very important SEO lesson.

There are some pages on my site that I actually don’t want in Google search results.  For example, perhaps counter-intuitively, my blog pages:  The content there is constantly changing, so I don’t want it to appear as a search result, because it will quickly be out of date, and users will be confused when they arrive.  Instead, I want people to find the individual blog post pages where the content will remain pretty constant.  So I placed meta tags at the top of my blog pages to prevent them from being indexed:

<meta name="robots" content="noindex" />

Now fast forward to a few weeks ago, when I was idly going over some web site logs.  According to my logs, the most popular pages on my site are “How To” posts, and the most requested page of all time is How To Rename a Windows Service.  Out of curiosity, I ran a Google search to see how far down this one particular page would be in the results, thinking it should be pretty high if people keep finding it.

It was indeed on the first page of results… on krehbiel.blogspot.com.  (I used to crosspost there.)  But on thomaskrehbiel.com?  It wasn’t there.  Like, anywhere.  Not on result page 1, 2, or even 20.  I typed in the exact title of the page, “How To Rename A Windows Service by Thomas Krehbiel,” in quotes, and got nothing.

I checked Google Webmaster Tools and quickly found the reason.  Of the 1221 pages in my sitemap, only 58 pages were indexed.  Several days later, only 4 were indexed!  Dubya Tee Eff?

Well, I think I found the cause. Some time ago, I made some code changes in the area of my blog that renders the <head> tags. A colossal blunder on my part allowed <meta name="robots" content="noindex" /> to go on not just the blog pages, but on every single page of the site.

I’ve corrected the problem and resubmitted my sitemap; now I just have to see how long it will take to get my pages back into the index.

So here’s the lesson for anyone looking to improve their SEO skills:  Excluding your entire website from the Google index definitely does not improve your search result rankings. :)  But seriously, make sure the meta robots tag is correct.  (Also, it would probably help to check your logs and statistics more than once or twice a year.)
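
An easy way to double-check which pages are actually carrying the tag is to fetch a few of them and grep the output (substitute your own URLs, of course):

curl -s http://thomaskrehbiel.com/ | grep -i "robots"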

(However, if you want to make a relatively private site, robots noindex is a very effective solution.)

Froyo Arrives on the HTC Incredible

Verizon pushed Froyo (Android 2.2) to my Droid Incredible this past week.  In a nutshell, the changes are unremarkable.

There is one new app called 3G Mobile Hotspot, which would be super cool, except Verizon doesn’t let you use it without paying an extra $30 a month.

The browser supposedly supports Flash 10.1, but the only improvement in my browsing experience is that I can now visit live.twit.tv. Not that I would want to, since I’m sure that would drain the battery in about 2 minutes. Other Flash video sites like CNET still don’t work, so there’s still no listening to Buzz Out Loud’s occasional live event coverage at work.

There is no evidence of the much-hyped speed improvements.  I probably wouldn’t notice anyway since there was never any sluggishness on the Incredible to begin with.

Honestly, the most obvious change is that the Gmail app has an extra navigation button: there are four buttons at the bottom of the screen now instead of three when I delete my spam.

All in all, it’s nothing special.  Kind of a disappointment, actually.