Daniel Allen (da_lj) wrote,

How to: export your LJ "Watching" list to another RSS feed-reader

I've given up on using LJ for my RSS feeds. I've got 88 of them, which means I sometimes don't see real people's posts for pages and pages. I'm jumping to Google Reader. Google Reader will import "OPML" data, so that's how I wanted to do the transfer. This is a how-to.

(If you are hoping to read your LJ friends list directly from an RSS reader, you might instead try this exporter, which grabs the necessary data for friends. It doesn't include feeds, and it's beyond the scope of this how-to.)

The short version:

1) If you're handy with Perl, grab the code at the bottom of this entry, change the user name in the profile URL, and run the code.
1b) If you're not handy with Perl, give me a shout and I'll run it for you on my server. :)

2) Output is a list of RSS URLs. To translate these to OPML, feed them to this page. Copy and paste them into the big text box, then hit the "Create OPML" link. Seconds later, you will have an output file, which you should save to disk (the file name doesn't matter).

3) Optionally, open the file in a text editor and change each "title" value from the URL to a sensible title for that feed. Yeah, I was too lazy to write my own OPML and fix that.

4) In Google Reader, in the lower left-hand corner, choose "Manage Subscriptions". Choose "Import/Export". Browse to your OPML file and upload it.

Done!

So far, I like the Google Reader interface, and now I can actually pay attention to the real people on my list who do still post. (I appreciate y'all! I did this for yoooou!)

---
Don't bother reading the rest unless you want technical details; it's mostly here for Google searching. Let me know if this helped anybody!

The code:

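#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize;

# Change "da-lj" to your own user name.
my $base_url = "http://da-lj.livejournal.com/profile";

# autocheck => 1 makes Mechanize die on any failed request.
my $m = WWW::Mechanize->new( autocheck => 1 );

$m->get( $base_url );

# Grab the line of the profile page that lists the watched feeds,
# then pull every single-quoted link out of it.
my $profile_html = $m->content;
my @feed_lines = ($profile_html =~ /watchingfeeds_body.*/g);
die "No watched feeds found on $base_url\n" unless @feed_lines;
my @feed_urls = ($feed_lines[0] =~ /href='(.*?)'/g);

# Visit each feed's LJ page and print the target of its "XML" link.
foreach my $lj_url (@feed_urls) {
    $m->get( $lj_url );
    my $xml_link = $m->find_link( text => 'XML' );
    if ($xml_link) {
        print $xml_link->url() . "\n";
    } else {
        warn "No XML link found on $lj_url\n";
    }
}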


And that's it. I'm using WWW::Mechanize, which is the bee's knees if you have to do screen-scraping in Perl.
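
If you want to see what the two regular expressions in the script are doing without hitting the network, here is a small sketch run against a made-up fragment of HTML. The markup and host names below are invented for illustration and the real profile page may well look different. All the script relies on is that the watched feeds sit on a single line containing "watchingfeeds_body", with single-quoted href attributes.

```perl
use strict;
use warnings;

# Invented sample markup, NOT copied from a real LJ profile page.
my $profile_html = <<'HTML';
<div id='friends_body'><a href='http://someone.example.com/'>someone</a></div>
<div id='watchingfeeds_body'><a href='http://feed-one.example.com/profile'>feed_one</a>, <a href='http://feed-two.example.com/profile'>feed_two</a></div>
<div id='footer'><a href='http://www.example.com/'>home</a></div>
HTML

# Step 1: keep everything from "watchingfeeds_body" to the end of that line.
# "." does not match a newline, so the match stops at the line break.
my @feed_lines = ( $profile_html =~ /watchingfeeds_body.*/g );

# Step 2: pull out each single-quoted href value. The non-greedy (.*?)
# stops at the first closing quote instead of the last one on the line.
my @feed_urls = ( $feed_lines[0] =~ /href='(.*?)'/g );

print "$_\n" for @feed_urls;
```

Only the two links on the "watchingfeeds_body" line come out; the friends link and the footer link are on other lines, so the first pattern never sees them.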

I started off with a manual grab of my "watching" page and a word-processor search-and-replace, and was about to run a batch of wget commands to grab the lj-feed pages when I realized it would be quicker in Perl.

The biggest drawback to this method is the cost of installing WWW::Mechanize in the first place. CPAN makes it easy(ish) but it has a tonne of dependencies. ...I guess it's just one step if you're on a reasonably recent Debian/Ubuntu.
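
For anyone who would rather skip the web converter in step 2 and the hand-editing in step 3, here is a rough sketch of writing the OPML directly in Perl. This is not what I actually ran. The function names (xml_escape, feeds_to_opml), the sample titles and the sample URLs are all made up for the example, and I'm assuming a feed reader is happy with a bare-bones OPML file of one "outline" element per feed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special inside XML attribute values.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;',   '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Build a minimal OPML document from a list of [title, feed URL] pairs.
sub feeds_to_opml {
    my (@feeds) = @_;
    my $opml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
             . qq{<opml version="1.0">\n}
             . qq{  <head><title>LJ feeds</title></head>\n}
             . qq{  <body>\n};
    foreach my $feed (@feeds) {
        my ($title, $url) = map { xml_escape($_) } @$feed;
        $opml .= qq{    <outline type="rss" text="$title" title="$title" xmlUrl="$url"/>\n};
    }
    $opml .= qq{  </body>\n</opml>\n};
    return $opml;
}

# Made-up example data; swap in your own titles and the RSS URLs
# printed by the scraping script.
my @feeds = (
    [ 'Example Comics', 'http://example.com/comics/rss.xml' ],
    [ 'News & Notes',   'http://example.org/feed?format=rss&n=10' ],
);

print feeds_to_opml(@feeds);
```

Redirect the output to a file and upload that in step 4. The escaping matters more than it looks: feed URLs often contain "&", which has to become "&amp;" inside an XML attribute or the import is likely to choke.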

Anyhow.
Tags: geek, lj, perl