Here’s a bit of a dud write-up. Dud because the risks are minimal, as I realised when I started looking into cross-domain iframe DOM scraping… But potentially interesting reading for web developers nonetheless.
If you have been browsing the internet lately, you have more than likely seen a Facebook Connect box. It looks like this:
This particular screenshot is taken from Gameplanet Forums, but Facebook Connect makes it easy for any website developer to embed the panel into their website. I could put one on tom.net.nz, if I was so inclined.
Now, Facebook would argue that this frame is a naive, harmless feature, because the information is never passed directly to the website in question, but rather the website just embeds a little piece of code, and Facebook generates the actual content of the pane. The code embedded is identical for any user visiting the website.
This is all well and good, however (!), the content that Facebook generates for this pane will differ depending on whether or not the user viewing it is currently logged into Facebook. If they are, then Facebook tries to show information more “relevant” to that user. For example, in the above screenshot, “Matt” is my friend on Facebook (and the only person in my friends who has “liked” Gameplanet on Facebook). The other people are generated randomly, but no matter how many times I refresh the page, Matt will always appear in the list. Do you see where this is going?
One might assume that Facebook has put some measures in place to stop the site from scraping this information, however the tech savvy can follow the following link which generates the box for Gameplanet’s Facebook Connect pane: link. If you view the source, there are all the names, in plain HTML, with links to both the photo, and (perhaps more disturbingly), the profile of each person. I was also able to scrape my own user ID from the HTML. Fortunately, in most modern browsers, XSS (Cross-Site Scripting) protection prevents the parent page from accessing the DOM of Facebook’s frame, but this is still a major potential security problem for older browsers which don’t have such protection built-in. By inspecting the list of users on a few page loads and looking for duplicate names, a malicious site could ascertain who is friends with the user browsing the page. The parameters passed to the frame source allow significant customization of the response, for example with a bit of tweaking I was able to come up with the following source, which now shows 100 users instead of the default 15. To take it a step further, a malicious page could potentially load this frame without even showing it, meaning the user would be completely unaware that the site was doing anything to do with Facebook, meanwhile it’s scraping the user’s private information.
Pinax is a framework upon a framework, effectively. It is built as an additional layer above Django, that adds many niceties out of the box, such as email verification, profile management, login/registration, OpenID support, user messaging, groups, wikis, blogging, etc etc. I have a basic structure in place that currently displays a plain home page, and allows login/registration. Hopefully using Pinax will allow me to quickly develop and extend Money Mouth when I actually sit down to write the meat of the code.
I also set myself up with a nice little deployment script on my WebFaction hosting for pulling down the latest copy of my code from SVN, which should save me additional time.