Showing posts with label ruby. Show all posts

Friday, June 27, 2008

Ruby for mp3 file organizing

So there we are, me and my friend Oliver, stuck on a business trip. We're already bored of the only decent pub in the village and our families are a long way away, so what's a developer to do? Code in a programming language he or she isn't allowed to use on the job, of course! Oliver seemed interested in Ruby and I had already written a couple of small scripts with it, so we were curious to see what the fuss was about.

It goes without saying that if you want to learn to program (in a particular language) you should not rely too much on books. The only way is to find a task you want automated, and then code it in your language of choice. Of course, you must pick the task (or the language) carefully, since not all languages are suitable for all tasks.

One of the things Oliver had been struggling with was organizing all the podcasts on his player, sorted neatly into directories by author and title. Having found both a hammer and a nail, we were ready to start pounding.

After a bit of research we found the mp3info and id3tag Ruby libraries. id3tag had separate fields for ID3v1 and ID3v2 data and no write support (not that we needed it). mp3info didn't support ID3v2.2, but I found an interesting link about ID3 internals; the format of the fields looked like something that could be useful.

After a while our pair programming session reached a milestone: our script worked. It didn't seem very modular though, so we spent some time creating classes and discussing the responsibility of each class. Should there be a manager class? Or should the objects manage themselves? I went with the second approach, and here's the result:


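require 'fileutils'
require 'mp3info'   # ruby-mp3info gem; Ruby 1.8 also needs require 'rubygems' first

# Class for handling information of the mp3 file
class Mp3File
  attr_reader :artist, :album

  def initialize(filename)
    @artist = "unknown"
    @album = "unknown"
    @filename = filename
    @title = File.basename(filename, ".mp3")

    read_attributes
  end

  # Title from the tag, or the file name when the tag has none,
  # cleaned up for use in a path
  def title
    @title = File.basename(@filename, ".mp3") if @title == "unknown"
    @title = sanitize(@title)
  end

  # Read title, artist and album from the tag; anything missing becomes "unknown"
  def read_attributes
    Mp3Info.open(@filename) do |mp3info|
      @title, @artist, @album = %w{title artist album}.collect { |attrib|
        begin
          (result = mp3info.tag.send(attrib)).empty? ? "unknown" : result
        rescue
          "unknown"
        end
      }
    end
  rescue
    # unreadable file or tag: keep the defaults set in initialize
  end

  # Replace characters that are awkward in file names.
  # tr_s returns a new string; tr_s! would return nil when nothing changed.
  def sanitize(str)
    str.tr_s("?'", "_")
  end

  # Copy the file to newPath, a template in which #@artist, #@album and
  # #@title are replaced by the values of these fields
  def transfer(newPath)
    title   # resolve and sanitize @title before the template reads it
    newPath = eval('"' + newPath + '"')   # eval: only pass templates you trust
    FileUtils.mkdir_p File.dirname(newPath)
    FileUtils.cp @filename, newPath
  end
end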


This is the class which is initialized with the location of the file and then extracts the artist, album and title. The read_attributes method is meant to show off our new knowledge about the dynamic nature of Ruby: we build a list of methods to invoke on the Mp3Info tag object, and if one gives no meaningful result, we use "unknown" instead. Finally, as the class knows the current location and the mp3 meta-info, it has a method for copying the file to a new location. The new path is passed as a template, in which #@artist, #@album and #@title are substituted with the values of these fields.
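
Both tricks can be tried without an mp3 file at hand. What follows is only a sketch: FakeTag is a stand-in struct for the real tag object, and read_tags and expand are hypothetical helper names, not part of mp3info. It also swaps the eval-based template for Kernel#format with %{name} placeholders, which is a different technique from the one used in the script.

```ruby
# Stand-in for the tag object; the real Mp3Info tag is assumed to answer
# to the same three method names.
FakeTag = Struct.new(:title, :artist, :album)

# Call each attribute by name with send; fall back to "unknown" when the
# value is empty or missing (nil.empty? raises NoMethodError).
def read_tags(tag)
  %w[title artist album].map do |attrib|
    begin
      value = tag.send(attrib)
      value.empty? ? "unknown" : value
    rescue NoMethodError
      "unknown"
    end
  end
end

# Fill a path template with format's named references instead of eval.
def expand(template, title, artist, album)
  format(template, title: title, artist: artist, album: album)
end

title, artist, album = read_tags(FakeTag.new("Episode 12", "", nil))
puts expand("/tmp/music/%{artist}/%{album}/%{title}.mp3", title, artist, album)
# prints /tmp/music/unknown/unknown/Episode 12.mp3
```

The format variant never evaluates the template as code, at the price of a slightly different placeholder syntax.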


class Mp3List
  # Wrap each selected path in an Mp3File
  def files
    @files.map { |file| Mp3File.new(file) }
  end

  def initialize(sourcePath, days = 7)
    @sourcePath = sourcePath
    @days = days
    @files = read_new
  end

  # Paths of all mp3 files under @sourcePath modified within the last @days days
  def read_new
    Dir["#@sourcePath/**/*.mp3"].find_all do |path|
      File.mtime(path) > (Time.now - (@days * 60 * 60 * 24))
    end
  end

  def to_s
    @files.inspect
  end
end


Here comes the class which represents a list of mp3 files in a certain directory (and its subdirectories) that satisfy some criterion, in this case how recently the files were modified. Could it be made more general? Certainly, but in an 80-line script? Maybe next time.
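
As for making it more general, one possible direction (a sketch, not something we wrote during the session) is to let the caller pass the criterion as a block. select_mp3s and newer_than are hypothetical helper names; the age check mirrors the one in read_new.

```ruby
require 'tmpdir'
require 'fileutils'

# Return the mp3 files under root for which the block returns true.
def select_mp3s(root)
  Dir.glob(File.join(root, "**", "*.mp3")).select { |path| yield path }
end

# The "modified within the last N days" rule as a reusable lambda.
def newer_than(days)
  cutoff = Time.now - days * 24 * 60 * 60
  lambda { |path| File.mtime(path) > cutoff }
end

# Try it on a throwaway directory with one fresh and one 30-day-old file.
Dir.mktmpdir do |dir|
  fresh = File.join(dir, "show", "fresh.mp3")
  stale = File.join(dir, "show", "stale.mp3")
  FileUtils.mkdir_p(File.dirname(fresh))
  FileUtils.touch(fresh)
  FileUtils.touch(stale, mtime: Time.now - 30 * 24 * 60 * 60)

  recent = select_mp3s(dir, &newer_than(7))
  puts recent.map { |path| File.basename(path) }.inspect
end
```

Any other rule (by size, by name pattern) would then be just another block.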


list = Mp3List.new("/home/whoami/Music", 730)
list.files.each do |mp3|
  #~ puts "Processing #{mp3.title}"
  # single quotes: #@artist etc. are expanded later, inside Mp3File#transfer
  mp3.transfer('/tmp/music/#@artist/#@album/#@title.mp3')
end


What's left is an example of how to use these classes. Seems good to me, and best of all, it works.

The only thing left was to prepare a patch for the mp3info library for ID3v2.2 support. I actually implemented one (still not incorporated upstream), and it also initializes the common fields with either the v2 or the v1 data, whichever is present (v2 takes precedence if both are present).

Conclusions from our short session:

  • Ruby is neat for quick hack jobs

  • mp3info does not provide an exhaustive ID3 handling support, but is good enough and workable

  • Pair programming might not be smooth from the start, but you will learn a lot about yourself

  • Organizing your music can sometimes take longer than total time spent looking for your tracks

  • You should choose your business trip accommodation carefully if you can

Friday, April 18, 2008

Twitter: do you follow me?

This week's hacking task was to implement a "follow all" function for Twitter.

Even for Twitter users, this needs some explanation: the follow functionality now means "enable notifications". However, the command interface in IM/SMS wasn't changed, so the command name remains "follow". For brevity, I will use the word "follow" instead of "enable notifications".

The reason for having this command is that there used to be a "follow all" function in Twitter. It instantly turned on notifications for all your friends (users you're following, in the new terminology). Now there's a user called "all", and the function no longer works (ok, maybe that's not the real reason). This put an end to a very useful feature for users who often rely on the Twitter IM integration.

Having a quick look at the Twitter API, it seemed pretty straightforward to fetch all users and enable notifications for them one by one. It would be fairly slow, because the user list carries no information on whether notifications are already enabled for a user; had it been there, we could have skipped the requests for users we already get notifications from. Ah well...

The first tool I reach for in my toolbox is Ruby. I tried using JSON, but had to give up: I simply couldn't get past the Unicode issues:

/usr/lib/ruby/1.8/json.rb:288:in `chr': 1090 out of char range (RangeError)

It turned out to be much smoother with REXML, and it really is a superior library for XML processing (Python's are either easy or full-featured, REXML seems to be both).
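
For illustration, a minimal sketch of pulling the ids out of a friends list with REXML. The sample XML is an assumption reconstructed from the XPath used in the script at the end of this post, not a captured Twitter response, and it deliberately contains a non-ASCII name. On recent Ruby versions REXML ships as a bundled gem rather than as part of the standard library.

```ruby
require 'rexml/document'

# Hypothetical sample; only the /users/user/id layout is taken from the script.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <users>
    <user><id>101</id><name>Петър</name></user>
    <user><id>202</id><name>Oliver</name></user>
  </users>
XML

doc = REXML::Document.new(xml)

# Two equivalent ways of running an XPath query
ids = doc.elements.to_a("/users/user/id").map { |entry| entry.text }
names = REXML::XPath.match(doc, "/users/user/name").map { |entry| entry.text }

puts ids.inspect
puts names.inspect
```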

I initially took the path of using 'open-uri' for fetching the data over HTTP. After all, it handled even HTTP basic authentication and abstracted away the nitty-gritty details, so it was easy to use.

But it isn't meant to be used for more fine-grained control, and I soon ran into performance problems which required special treatment. I found that I quickly exhausted the rate limit of the Twitter API: it's only 70 requests per hour, and with one request per user... you get the picture. The web interface wasn't actually subject to such restrictions, so I wanted to check how it was doing it. A slightly different URL, but it worked like a charm, and rate limits seemed to be no problem now!
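
To put numbers on it, here is a back-of-the-envelope sketch. The 70 requests per hour and the 100 users per page come from the text and the script; the friend count of 250 is an arbitrary example, and the helper names are made up for the occasion.

```ruby
RATE_LIMIT = 70    # API requests allowed per hour
PAGE_SIZE  = 100   # users on a full page of the friends list

# Requests needed through the API: the list pages plus one toggle per friend.
# Paging stops at the first page that is not full, so an exact multiple of
# PAGE_SIZE costs one extra (empty) page.
def requests_needed(friends)
  pages = friends / PAGE_SIZE + 1
  pages + friends
end

# Number of hourly windows needed to stay within the limit.
def hours_needed(friends)
  (requests_needed(friends) / RATE_LIMIT.to_f).ceil
end

puts requests_needed(250)  # 253
puts hours_needed(250)     # 4
```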

This time, though, the script ran much longer: 80 seconds compared to about 30 before the change. I analyzed the requests and found out that each one received a 302 response, redirecting back to the home page. That meant that open-uri was downloading the whole home page for each user!

At that point open-uri had to go and make way for Net::HTTP. It took more lines to rewrite it, but now I had the choice not to follow redirect responses. I only needed to toggle notifications and didn't care what I got back (as long as it's not an error code). In addition, I could use the same Net::HTTP object, meaning that I use the same HTTP keep-alive connection (not sure if open-uri can do this).
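
The point about redirects can be sketched without touching the network. Net::HTTP does not follow a redirect on its own: request simply hands back the 3xx response, so the home page is never fetched. In the sketch below FakeConnection is a hypothetical stand-in for the object yielded by Net::HTTP.start, and toggle_all is an illustrative helper, not code from the script.

```ruby
require 'net/http'

# Send one toggle request per id over the same connection object and
# count a 2xx or 3xx answer as success, without following redirects.
def toggle_all(http, ids, action)
  ids.count do |id|
    response = http.request(Net::HTTP::Get.new("/friends/#{action}/#{id}"))
    response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
  end
end

# Hypothetical stand-in for a live connection: answers 302 to everything
# and records the requested paths.
class FakeConnection
  attr_reader :paths

  def initialize
    @paths = []
  end

  def request(req)
    @paths << req.path
    Net::HTTPFound.new("1.1", "302", "Found")
  end
end

http = FakeConnection.new
puts toggle_all(http, %w[101 202], "follow")
puts http.paths.inspect
```

With a real connection the same helper would be called inside the Net::HTTP.start block, which keeps one connection open for all the requests.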

And here's the result: dirty, but still quick. You can set the action to "follow" or "leave" (to disable all notifications). You need to configure the user and password. Putting the configuration options into command-line arguments is left as an exercise to the reader.

#!/usr/bin/env ruby

require 'uri'
require 'net/http'
require 'rexml/document'
include REXML

user = "lazyuser"
pass = "notmypassword"
action = "follow"   # or "leave" to disable all notifications
PAGE_USERS = 100    # number of users on a full page of the friends list

# One Net::HTTP session, so all requests share the same keep-alive connection
Net::HTTP.start("twitter.com") do |http|
  page = 0
  begin
    page += 1
    req = Net::HTTP::Get.new("/statuses/friends.xml?lite=true&page=#{page}")
    req.basic_auth(user, pass)

    doc = Document.new(http.request(req).body)
    ids = doc.elements.to_a("/users/user/id")
    ids.each do |entry|
      # Toggle notifications for this user; the response (a redirect) is ignored
      req_follow = Net::HTTP::Get.new("/friends/#{action}/" + entry.text)
      req_follow.basic_auth(user, pass)
      http.request(req_follow)
    end
  end while ids.size == PAGE_USERS   # a short page means it was the last one
end