Netbeans + Subversion + Windows XP


For my teaching I’ve been using Netbeans this semester, and overall it has been wonderful.  It has been an even better experience than Eclipse for teaching, though both have a steeper learning curve than I’d prefer.

I’ve enjoyed Netbeans’ built-in Subversion support.  (This is not a differentiator with Eclipse, just a comment.)  However, getting Subversion working reliably with Netbeans on a Windows box is a bit fiddly, and the online documentation makes it seem easier than it is.  It’s easiest to break the setup into steps, and get each of them working before moving on to the next.  (Part of what makes the documentation complicated is that there are many alternatives.  I’m just going to describe one simple alternative, which assumes that you have a shell account on the Unix computer that hosts your Subversion repository.)  Here are the steps:

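1. Get plink (from putty) working on your box.  Plink will be used by CollabNet’s Subversion client to tunnel svn+ssh connections.  First install the full putty package from the web site.  Then create a key pair in putty’s .ppk format using PuTTYgen, which comes with the full putty install.  (ssh-keygen produces OpenSSH-format keys, which plink can’t read directly, though PuTTYgen can import and convert them.)  Store the private key in a safe place on your Windows computer, and add the public key to the authorized_keys file on your Unix server.  Then test with: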

./PLINK.EXE -v -l <username> -i c:/path/to/key/file/id_rsa_putty.ppk <remote-host>

The result should be an ssh session to your remote host.  (plink is not a good client to actually use for ssh; prefer putty.  This is just a simple test that it’s working.)  (I’m using forward slashes in the above because I run it in cygwin shells.  You’ll need backward slashes if you run it in the traditional Windows command console.)
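For the authorized_keys part of step 1, here is a rough sketch of what I’d run on the Unix server.  It assumes you have copied the one-line OpenSSH-style public key out of the PuTTYgen window into a file on the server; the function name and the file name id_rsa_putty.pub are just placeholders of mine, and your server’s sshd may have its own requirements.

```shell
# Run on the Unix server.  Appends a one-line OpenSSH-format public key
# to authorized_keys inside the given .ssh directory, and sets the strict
# permissions that sshd commonly insists on.
install_pubkey() {
    pubkey_file=$1                # placeholder name, e.g. id_rsa_putty.pub
    ssh_dir=${2:-"$HOME/.ssh"}    # defaults to the usual location

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Example (hypothetical file name):
# install_pubkey id_rsa_putty.pub
```

The key is appended rather than overwriting the file, so any keys you already have installed keep working.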

2. Install CollabNet’s Subversion Client.  They have a simple installer.

3. Look in your Application Data directory (%APPDATA%) for the Subversion subdirectory.  (It’s possible you have to run the Subversion Client once to cause this directory to be created.)  Edit the config file in that directory.  Look for the section headed “[tunnels]”.  In that section, after all the comments, add a line:

ssh = c:/Program Files/putty-0.60/plink.exe -v -l <username> -i c:/path/to/keyfile/id_rsa_putty.ppk

Here you use forward slashes, because the Subversion Client will translate them.  The path to plink.exe should be changed to wherever you put plink. Adding this line to the config file tells the Subversion Client what command to use with URLs of the form svn+ssh.
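To show where the line lands, this is roughly what the end of that section looks like in my config file once the edit is made.  The comment line is my own stand-in for the commented-out examples that ship with the file, and the paths are the same placeholders as above.

```
[tunnels]
# (the commented-out examples that come with the file stay here)
ssh = c:/Program Files/putty-0.60/plink.exe -v -l <username> -i c:/path/to/keyfile/id_rsa_putty.ppk
```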

4. Test the subversion client from the command line with:

./svn ls svn+ssh://<remote-host>/path/to/remote/svn-repo

If this works you have a working Subversion client on Windows, which is 80% of the battle!

5. In Netbeans go to Tools/Options/Miscellaneous/Versioning and set the Path to the SVN Client to:

C:\Program Files\CollabNet\Subversion Client

(or wherever you installed Subversion).

6. Right click on a directory and you should be able to use Subversion Update and Commit commands!

Occasionally, when things get tricky, the Netbeans client gets confused.  I just use the command-line client to do an svn update, and all is usually well after that.

One issue to watch out for: Subversion is very sensitive to version changes.  The working copy (checked out version) will be upgraded by the Subversion client to the format that version of the client expects, after which older clients can no longer read it.  So if you use both a Netbeans client and a command-line client you should make sure they’re the same “point” (minor) version number.  (E.g., they should both be 1.6.x, though they can have different xs.)
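If you want to check this from a cygwin shell, something like the following sketch compares the major.minor part of two version strings.  The function names are mine, not part of Subversion; the only real command used is svn --version --quiet, which prints just the client’s version number.

```shell
# Prints the "major.minor" part of a version string such as 1.6.11.
minor_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeeds (exit status 0) when both versions share the same major.minor.
same_series() {
    [ "$(minor_series "$1")" = "$(minor_series "$2")" ]
}

# Example: compare the command-line client against a version you read
# off the other client by hand (1.6.5 here is only an illustration).
# cli_version=$(svn --version --quiet)
# same_series "$cli_version" "1.6.5" && echo "same series" || echo "mismatch"
```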

Good luck!


An Exciting Time for Cyclopath


One of the premier research platforms around here is Cyclopath, a geowiki and route-finding service for Twin Cities bicyclists.

Now, we’ve expected Google’s announcement that they were getting into the bicycle routing business for some time. But that doesn’t mean yesterday was relaxed for us. 🙂

After sleeping on it (and speaking only for myself), I think this development is actually either neutral or good. We’re in a different niche than Google: we’re focused on open content and community, not just maps, and we’re strongly local, with personal connections to the cycling community and local agencies. And on the plus side: almost all of the reactions from the community I saw on the social web were very supportive of us, and I’ve never seen so much passion at Cyclopath Headquarters as I did yesterday!

We’ll continue to write and publish, in keeping with our excellent track record (e.g., of the 5 papers we’ve submitted to top-tier conferences, 4 have been accepted on the first try and 2 have been nominated for Best Paper).

Details on what Google’s announcement means for Cyclopath, from the user perspective, are here.

Lastly, and off-topic, please follow @grouplens and @cyclopath_hq on Twitter!

Datasets and availability


Occasionally, GroupLens receives requests for datasets that we possess. In many cases we are able to provide the data, as we have with the MovieLens rating datasets. One of our data collections is a 10% sample of Wikipedia page requests (essentially every 10th HTTP request), collected since April 2007. This data accumulates at a rate of about 5 GB/day, and we currently have around 4 TB of unprocessed compressed data, or approximately 40 TB uncompressed. While we sometimes get requests for this data, its sheer size makes it difficult for us to offer it for download.

Although we cannot make this data available for download, depending on your request and our availability, we may be able to collaborate with you by performing the analysis you need on our data.

Also, we are not the only ones who have Wikipedia page view data. Several other sources publish data on page views. Here are some of these resources and the type of data that they have available:

  • – Provides data on per-page view counts by month.
  • – Has files containing hourly per-page view count snapshots, with archives that currently go back to October 2009.
  • Wikipedia Page Traffic Statistics on AWS – Hourly traffic statistics for a 7 month window (October 2008 – April 2009) are available on Amazon Web Services. This data was assembled from files that were available from at the time.