Speaking my (programming) language?: twitter

Tuesday, June 23, 2009

The ultimate Twitter client

...is a Feed reader!

OK, not quite, but after some thinking about the top features the ideal Twitter client should have, I found feed readers have most. For the minor drawback of not being able to post messages, you get:

Marking messages as read

Fixed replies

Favorites of friends

others

friends

recommend

Tracking

No follow/unfollow counting

But some of these "features" require you to import the feed of every single one of your friends. No one in their right mind would do that manually, but there's a way to generate a list of feeds in the form of an OPML file (which many readers can import). First of all, get a list of your friends. Then apply the following XSL stylesheet:


<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" encoding="ISO-8859-1" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <opml>
   <body>
    <outline title="twitter">
    <xsl:for-each select="/ids/id">
     <outline>
      <xsl:attribute name="title"><xsl:value-of select="."/></xsl:attribute>
      <xsl:attribute name="xmlUrl">http://twitter.com/statuses/user_timeline/<xsl:value-of select="."/>.atom</xsl:attribute>
     </outline>
    </xsl:for-each>
    </outline>
   </body>
  </opml>
 </xsl:template>

</xsl:stylesheet>

This contains only the most essential elements of a feed list, and the title of each feed is the user's ID, not the name, but Google Reader imports the file successfully and some readers can rename the feed from information provided in the RSS/Atom format.

For an OPML of your friends' favorites, replace "statuses/user_timeline" with favorites.
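If you'd rather not touch XSLT, the same output can be sketched in a few lines of JavaScript. This is only an illustration under my own assumptions: the function name `buildOpml` and its `path` parameter are made up, and the IDs are expected to be in an array already.

```javascript
// Hypothetical helper: produces the same OPML structure as the stylesheet.
// `path` is "statuses/user_timeline" for tweets or "favorites" for favorites.
function buildOpml(ids, path) {
  var lines = ['<opml>', ' <body>', '  <outline title="twitter">'];
  ids.forEach(function (id) {
    // Escape the ID so it is safe both in the URL and in the attribute value
    var safeId = encodeURIComponent(String(id));
    lines.push('   <outline title="' + safeId + '" xmlUrl="http://twitter.com/' +
               path + '/' + safeId + '.atom"/>');
  });
  lines.push('  </outline>', ' </body>', '</opml>');
  return lines.join('\n');
}
```

Calling `buildOpml(ids, "favorites")` gives the favorites variant without editing any template.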

Wednesday, January 21, 2009

Apply search to following

There are still some things that don't make sense in search.twitter.com. For instance, it would be perfectly logical to search in just your followees' timelines. Why, TwitterSpy is already doing it. You could, of course, generate a huge query with "from:" and every possible person you're following, but search.twitter.com restricts the length of the query. Good luck if you have more than 10 friends. Uh, are there people like this after 1 month of using Twitter?
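To make the length problem concrete, here is a rough sketch of packing "from:" operators into as few queries as will fit. The helper name `buildQueries` and the `maxLength` parameter are hypothetical, the actual limit is whatever search.twitter.com enforces, and I'm assuming the search accepts OR between "from:" operators.

```javascript
// Hypothetical helper: splits followees into queries no longer than maxLength.
// maxLength is an assumed value; the real limit is not stated in this post.
function buildQueries(term, screenNames, maxLength) {
  var queries = [];
  var current = [];
  function render(names) {
    return term + ' ' + names.map(function (n) { return 'from:' + n; }).join(' OR ');
  }
  screenNames.forEach(function (name) {
    var candidate = current.concat([name]);
    if (current.length > 0 && render(candidate).length > maxLength) {
      // Adding this name would overflow: close the current query, start a new one
      queries.push(render(current));
      current = [name];
    } else {
      current = candidate;
    }
  });
  if (current.length > 0) queries.push(render(current));
  return queries;
}
```

With a few hundred followees this still produces a pile of queries, which is the point: the limit makes the manual approach impractical.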

You could also write a console script, except that you don't want to type your password and reauthenticate once per followee. So doing it from the browser might be a better fit for a quick & dirty approach. This way your session would be reused.

Here's an educational bookmarklet. You can paste it in the URL bar of Firefox (didn't bother with other browsers) when you've opened twitter.com (won't work if window has a link open under another domain). And you can bookmark it- you thought it would be called a bookmarklet for nothing?

javascript:query = prompt("Twitter search:");
xmlhttp = new XMLHttpRequest();
xmlhttp.open("GET", "http://twitter.com/statuses/friends.json");
xmlhttp.onload=function (){
        list=eval(xmlhttp.responseText);
        for (i=0; i<list.length; i++)
                window.open("http://search.twitter.com/search?q="
                        + encodeURIComponent(query) + "+from:" + list[i].screen_name)
};
xmlhttp.send(null)

This will open one search window per followee- so don't try this if you have loads of 'em. You will hit a limit in Firefox's default config, and you can increase it using the property:


browser.tabs.maxOpenBeforeWarn

Still, you will hit a hardware/hardcoded limit if you're not careful with how many people you're following. I told you this only has educational value, right?

Friday, April 18, 2008

Twitter: do you follow me?

This week's hacking task was to implement a "follow all" function for Twitter.

Even for Twitter users, this needs some explanation: the follow functionality now means "enable notifications". However, the command interface in IM/SMS wasn't changed, so the command name remains "follow". For brevity, I will use the word "follow" instead of "enable notifications".

The reason for having this command is that there used to be a function "follow all" in Twitter. It used to instantly turn on notifications for all your friends (users you're following in new terminology). Now there's a user, called "all" and the function doesn't work (ok, maybe that's not the real reason). This put an end to a very useful feature for users who rely often on the Twitter IM integration.

Having a quick look at the Twitter API, it seemed pretty straightforward to fetch all users and enable notifications for all of them one by one. It would be fairly slow, because there was no information in the user list whether notifications are enabled for a user or not. Having it would have eliminated the need to send requests for users for whom we already have notifications enabled. Ah well...

The first tool I reach in my toolbox is Ruby. I tried using JSON, but had to give up- I simply couldn't handle Unicode issues:

/usr/lib/ruby/1.8/json.rb:288:in `chr': 1090 out of char range (RangeError)

It turned out that it was much smoother with REXML, and it really is a superior library for XML processing (Python's are either easy or full-featured, REXML seems to be both).

I initially took the path of using 'open-uri' for fetching the data over HTTP. After all, it handled even HTTP basic authentication and abstracted the nitty-gritty details, and so was easy to use.

But it isn't meant to be used for more fine-grained control, and I soon ran into performance problems, which required special treatment. I found that I quickly exhausted the rate limit of the Twitter API- it's only 70 requests per hour, and with one request per user... you get the picture. The web interface wasn't actually subject to such restrictions, so I wanted to check how it's doing it. A slightly different URL, but worked like a charm, and rate limits seemed to be no problem now!

This time, though, the script ran much longer- 80 seconds compared to about 30 before the change. I analyzed the requests and found out that each received a 302 response, forwarding back to the home page. That meant that open-uri was downloading the whole home page for each user!

At that point open-uri had to go and make way for Net::HTTP. It took more lines to rewrite it, but now I had the choice not to follow redirect responses. I only needed to toggle notifications and didn't care what I got back (as long as it's not an error code). In addition, I could use the same Net::HTTP object, meaning that I use the same HTTP keep-alive connection (not sure if open-uri can do this).

And here's the result- dirty, but still quick. You can configure the action to "follow" or "leave" (to disable all notifications). You need to configure the user and password. Putting the configuration options as command-line arguments is left as an exercise to the reader.

#!/usr/bin/env ruby

require 'uri'
require 'net/http'
require 'rexml/document'
include REXML

user = "lazyuser"
pass = "notmypassword"
action = "follow"
PAGE_USERS = 100

Net::HTTP.start("twitter.com") do |http|
    page = 0
    begin
        # Fetch the next page of friends (up to PAGE_USERS entries per page)
        page += 1
        req = Net::HTTP::Get.new("/statuses/friends.xml?lite=true&page=#{page}")
        req.basic_auth(user, pass)

        doc = Document.new(http.request(req).body)
        ids = doc.elements.to_a("/users/user/id")
        ids.each do |entry|
            # Toggle notifications over the same connection; the 302 reply is not followed
            req_follow = Net::HTTP::Get.new("/friends/#{action}/" + entry.text)
            req_follow.basic_auth(user, pass)
            http.request(req_follow)
        end
    # A page with fewer than PAGE_USERS entries is the last one
    end while ids.size == PAGE_USERS
end

Wednesday, April 2, 2008

Playing with Javascript or what binds Greasemonkey, Twitter and Ambient Avatars together

It's been a while since I tried JavaScript hacking (almost 2 years). This time I had the haunting idea to create a Greasemonkey mashup so I can see my twitter page with the avatar next to each tweet exactly as it looked at the time the tweet was posted.

To do this, the avatar history must be stored somewhere. That's where chinposin.com comes in. Having originated as a refreshing avatar on Friday, it evolved into the Ambient Avatar Platform (TM) (credit goes to @monkchips and @yellowpark- you're great). In simple words- you follow @chinposin on twitter, and when you change your avatar, the old one is saved. So you have a gallery of all of your previous avatars for your previewing pleasure, along with the dates they were changed.

For those of you wondering what's twitter, that's a topic for an entire new blog post... or a whole blog, so start at wikipedia, so we can continue with the interesting stuff, shall we?

So there we are- we want to include info from one site into another- a task where Greasemonkey excels (normally JavaScript cannot just fetch info from any other site at whim).

I've obviously lost some of my JavaScript knowledge since it took me an obscene amount of time to get this tiny piece of code working. To start off, I had forgotten that Greasemonkey had also some restrictions, not only enhancements. For security purposes, a lot of objects were wrapped in XPCNativeWrapper and I had to use loads of wrappedJSObject as a workaround. Yes, I know it's not secure, and you should know this too.

Another issue I had a problem with was passing an argument to a closure. I eventually remembered that the closure is an object and you can just assign any field to an object, because each object is also an associative array. Accessing the function object from itself also took some googling- arguments.callee did the trick.
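Stripped of the Greasemonkey parts, the trick looks roughly like this. It's a standalone sketch with made-up names, not the script's actual code. A named function expression stands in for arguments.callee here, since strict mode forbids the latter, and an immediately invoked wrapper is shown as the other common way to get one value per callback.

```javascript
var dates = ['2008-03-01', '2008-04-01'];
var naive = [];
var tagged = [];
var captured = [];

for (var i = 0; i < dates.length; i++) {
  // Pitfall: every callback shares the same `i`, which equals dates.length later
  naive.push(function () { return dates[i]; });

  // Trick from the post: store the value as a field on the function object.
  // The name `self` lets the function reach its own fields.
  var handler = function self() { return self.date; };
  handler.date = dates[i];
  tagged.push(handler);

  // Alternative: a wrapper function gives each callback its own `date` variable
  captured.push((function (date) {
    return function () { return date; };
  })(dates[i]));
}
```

After the loop, `tagged[0]()` and `captured[0]()` both return the first date, while `naive[0]()` returns undefined because `i` has already run past the end of the array.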

So is there anything that can be improved in this shoddy script? You bet. For starters, it loads the chinposin site a lot, sending 20 simultaneous requests right off the bat, even for duplicate user pages. I could cache the avatar history, but that would require that I synchronize the requests. This script could be modified into a Firefox extension, which has fewer restrictions than Greasemonkey. And I really should use a prototype for those twenty closures I create, but I gotta have something to do for next time, right?
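For the caching idea, one possible shape in a present-day JavaScript engine is to cache the pending request itself, so that duplicate user pages share a single fetch. This is a sketch under assumptions: `cachedFetcher` is a made-up name, and `fetchPage` stands for any function that returns a Promise (in Greasemonkey, GM_xmlhttpRequest would first have to be wrapped in one).

```javascript
// Hypothetical wrapper: fetchPage(username) must return a Promise.
function cachedFetcher(fetchPage) {
  var cache = {};
  return function (username) {
    // Store the Promise, not the result, so concurrent callers share one request
    if (!Object.prototype.hasOwnProperty.call(cache, username)) {
      cache[username] = fetchPage(username);
    }
    return cache[username];
  };
}
```

Because the cache holds Promises, no explicit synchronization is needed: a second request for the same user simply waits on the first one.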

Without further ado, here's the script. Copy it and paste it into twitteravatarhistory.user.js (OK, you can come up with a longer name if you're so inclined). Then open it with Firefox and if Greasemonkey is installed you will be presented with a dialog prompting you to install it. It's tested with Firefox 2.0.0.13, 3a9, 3b4 and Greasemonkey 0.6.6.20061017 and 0.7.20080121.0. Considering the rate of change, I would be surprised if it still works in a year.

// ==UserScript==
// @name          TwitterAvatarHistory
// @description   Shows tweets with the avatar at time of posting
// @include       http://twitter.com/*
// ==/UserScript==

// Assumptions:
// -chinposin.com has a special date string under the pic
// -avatars are listed chronologically
// -many others regarding DOM position

const avatar_home = "http://www.chinposin.com/home/";
var twitter_images = document.evaluate('//.[contains(@class, "hentry")]', document, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);
while (message = twitter_images.iterateNext()) {
 message = message.wrappedJSObject;

 // Read user name
 var url = message.getElementsByClassName("url")[0];
 if (!url) continue;
 var username = url.getAttribute("href").match("[^/]*$");

 // Read date of message and extract fields with a regexp
 var date_string = message.getElementsByClassName("published")[0].getAttribute("title");
 var match = date_string.match(/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\+(\d{2}):(\d{2})/);
 // Date months are zero-based, hence the - 1
 var date = new Date(match[1], match[2] - 1, match[3], match[4], match[5], match[6]);

 var http = function(responseDetails) {
  // add dummy element so we can operate on its DOM
  var elem = document.createElement("html");
  document.body.appendChild(elem);
  elem.innerHTML = responseDetails.responseText;

  // getElementById is only found in document object, will use XPath
  var gallery = document.evaluate('//.[@id="gallery"]', elem.wrappedJSObject, null,
                       XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue.wrappedJSObject;

  // Might be better to couple these more tightly than creating two separate arrays
  var images = gallery.getElementsByTagName("img");
  var dates = gallery.getElementsByClassName("mainText");

  // Find avatar date not more recent than message date
  for (var i = 0; i < dates.length; i++) {
   var match = dates[i].textContent.match(/(\d{4})-(\d{2})-(\d{2}) +(\d{2}):(\d{2}):(\d{2})/);
   var avatar_date = new Date(match[1], match[2] - 1, match[3], match[4], match[5], match[6]);

   if (avatar_date < arguments.callee.date) {
    // Replace message pic with avatar corresponding to date
    arguments.callee.img.firstChild.setAttribute("src", images[i].getAttribute("src"))
    // TMTOWTDI:
    //~ arguments.callee.img.replaceChild(images[i].cloneNode(false), arguments.callee.img.firstChild);
    break;
   }
  }

  // clean up temp structure
  document.body.removeChild(elem);

 }

 // Trick to pass data to the closure
 http.date = date;
 http.img = message.getElementsByClassName("url")[0];

 // Reach list of pix from user page
 GM_xmlhttpRequest({method : "GET", url : avatar_home + username, onload : http});

}

Update: code formatting had munched some of the Greasemonkey header, that should be fixed now.

Update 10 April 2008: New code's on Greasemonkey repository since last week, today a fix was issued that adapts to twitter interface changes.