Upgrade rtorrent in karmic 9.10

Here is a quick and easy way to upgrade to the latest rtorrent on karmic. Basically, we just snag the package from lucid.

There are two basic methods to do this, and which one is easier is debatable. Personally, I only tried method one, but I’m sure method two would work just as well.

Method 1:

  1. Head to http://packages.ubuntu.com/lucid/rtorrent
  2. Download the deb for your architecture
  3. Download the deb for each of the following dependent packages:
    1. libssl
    2. libtorrent
    3. libxmlrpc
    4. libxmlrpc-core
  4. Install all of the packages by running sudo dpkg -i *.deb in the directory you downloaded them to

Method 2:

Point your apt sources at lucid, run apt-get update, and then install rtorrent. This should pull in the lucid packages and their dependencies. Just remember to revert your changes to the apt sources afterwards.

Like I said, I went with method one, mostly because I was curious which packages I would need to backport to get rtorrent upgraded to version 0.8.6. If you are curious why one might do this, it is because significant changes were made to libtorrent and rtorrent, and this is an easy way to get access to them.

There may be some remaining issues to deal with regarding libxmlrpc. This is something I will look into once I have had a chance to work with this new version of rtorrent.

Making Google Chrome Work With a SOCKS5 proxy (i.e. putty ssh tunnel)

I really like Chrome, but one absolute must for me is SOCKS5 proxy support. My corporate firewall is ultra restrictive, so I need to send HTTP traffic through an ssh tunnel, and ssh creates a SOCKS5 proxy when you use the -D option. Chrome seems to assume that your proxy is SOCKS4 and just fails on the ssh tunnel proxy. But there is hope: I found a way to work around this, and it isn’t even complicated!

Just a quick note: I actually use a Chrome extension called Switchy!, which helps me quickly switch to and from the ssh tunnel proxy. It is certainly no FoxyProxy, but it works well enough that I can use it to solve most proxy-related problems.

Now, the secret to this solution is to use Proxy Auto-Configuration (PAC) scripts. These scripts allow you to specify which version of SOCKS to use for your proxy. So all you need to do is create a file somewhere on your computer (say, pac-ssh-tunnel.pac) and add the following to it:

function FindProxyForURL(url, host)
{
   return "SOCKS5 localhost:8080";
}

Now, take note that I am creating the ssh tunnel proxy using the following command:

ssh -N -g -D 8080 username@remote_server

Just FYI, “-N” means don’t execute any commands on the server (i.e. strictly a tunnel connection only). “-g” allows remote hosts to connect to locally forwarded ports (I must admit most people will never need this, but it is handy if you do complex tunneling of data). Finally, “-D 8080” sets up dynamic forwarding, so ssh listens on local port 8080 and acts as the SOCKS proxy.

So once you have your SSH SOCKS5 tunnel up and running, set Chrome to use the proxy auto-configuration script that you created (either through Chrome’s options, or through Switchy! if you prefer). Now you can proxy traffic over your ssh tunnel.

One final note: PAC files are basically JavaScript files (with a few built-in functions). You can actually create some complicated PAC files that do all your complex proxy selection for you (basically a lot of regular expression matching). So if you get adventurous, you can just add all your proxy selection logic to that one function (or break it into multiple functions), and then you won’t even need to change your proxy around ever!
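
As a rough sketch of what that selection logic might look like, here is a PAC file that sends local and intranet hosts directly and everything else through the tunnel. The domain suffix and the helper function hostHasSuffix are made up for illustration, so swap in your own. It sticks to plain JavaScript string operations rather than the PAC built-in helpers, which also means you can sanity-check it outside the browser.

```javascript
// Hypothetical domain suffixes that should bypass the tunnel; replace with your own.
var DIRECT_SUFFIXES = [".intranet.example.com"];

// True when host is the bare domain itself or ends with the dotted suffix.
function hostHasSuffix(host, suffix) {
    var bare = suffix.slice(1);
    return host === bare || host.slice(-suffix.length) === suffix;
}

function FindProxyForURL(url, host) {
    host = host.toLowerCase();

    // Never tunnel traffic aimed at the local machine.
    if (host === "localhost" || host === "127.0.0.1") {
        return "DIRECT";
    }

    // Hosts under the listed suffixes are reached without the proxy.
    for (var i = 0; i < DIRECT_SUFFIXES.length; i++) {
        if (hostHasSuffix(host, DIRECT_SUFFIXES[i])) {
            return "DIRECT";
        }
    }

    // Everything else goes through the ssh tunnel on local port 8080.
    return "SOCKS5 localhost:8080";
}
```

Returning "DIRECT" tells the browser to skip the proxy for that request, so with rules like these the same PAC file can stay selected whether you are browsing internal or external sites.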

Happy surfing!

A quick guide to getting Hpricot to work for you

Hpricot is a cool tool for Ruby that performs fast and efficient web page scraping by leveraging the DOM of a web page and the power of XPath. The DOM is just a tree structure (defined by the HTML code), and XPath lets you query this tree structure as if it were XML. (Recall that most web pages are now XHTML compliant.)

The problem with Hpricot is that XPath is not all that fun to work with. But there are some things you can do to make things easier on yourself. For one, there are several tools out there that will help you get the explicit (absolute) XPath for any element in the DOM (the Firebug extension for Firefox comes to mind). Something to note when taking this approach is that the resulting XPath query is not always reliable, because some browsers interpret HTML differently. In the case of Firefox, I found that it would transparently insert <tbody> tags under <table> tags in the DOM tree, even though they were not in the HTML code.

Then there is MY solution: make Hpricot work for you. Hpricot has the ability to build your query string for you. The steps are as follows:

  1. first search for the DOM node
  2. reverse-generate the xpath query string

To find the DOM node, you will need to use the following query string:

//text()[text()*='text to search for']

The above query string will locate any nodes that contain the text “text to search for” somewhere. If you want to search for an exact match, remove the asterisk from the query. The query returns a list of matching nodes, and you can ask each node for its XPath. That’s it. Here is some basic Ruby code to get the job done.

require 'rubygems'
require 'hpricot'
require 'open-uri'

uri = 'http://www.google.ca/'
query = "//text()[text()*='more']"
doc = Hpricot(open(uri))

doc.search(query).each do |row|
    # print the XPath query that leads to this node
    puts "[#{row.xpath}] => "
    # print the row data
    puts "#{row.to_html}\n"
end

This will print out a list of the DOM tree nodes that contain the text you’re searching for, along with the XPath query it takes to get to each one specifically. Happy scraping!
