Tuesday, November 10, 2009

Visual Studio Error Error

Came across a serious issue while installing the new Visual Studio Beta today. Someone ought to notify Microsoft about this rarely seen issue; maybe it can be included as a feature?

Monday, October 19, 2009

Dynamic JavaScript Text-To-Speech using Acrobat Reader

Adobe Acrobat includes a text-to-speech engine which you can access through the JavaScript functionality built into PDF. Because you can also post messages between the browser and an embedded PDF, it is possible to use an embedded PDF as a text-to-speech engine that reads text aloud dynamically in the browser. Try it below!

TTS Example

How does it work? A combination of JavaScript in the browser and in an embedded PDF. To use this method on your site, do the following:

First you need to embed the PDF:
<object id="PdfHost" name="PdfHost" type="application/pdf" width="0" height="0" data="http://sites.google.com/site/aplacetostorethings/tts.pdf"></object>

The browser script below initialises the message handler and creates the bridge between the browser and Acrobat:
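  <script type="text/javascript">
    <!--
        // Function bodies were cut off in this listing; reconstructed on the assumption
        // that text is passed to the PDF through the plug-in's postMessage bridge.
        function sayIt(msg) {
            var pdf = document.getElementById('PdfHost');
            pdf.postMessage([msg]);
        }
        function onPdfMsg(msg) {} // placeholder for messages sent back by the PDF
        function init() {
            var pdf = document.getElementById('PdfHost');
            if (!pdf.messageHandler)
                pdf.messageHandler = {};
            pdf.messageHandler.onMessage = onPdfMsg;
        }
        setTimeout(init, 1000); // delay is an assumed value
    // -->
  </script>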

The setTimeout is used to attempt to set up the message handler, which has to happen after the PDF has finished loading. You can then call it using script like:
javascript:sayIt("read this text");

If you are interested in the source of the PDF, I have posted the JavaScript it contains here: tts.js
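
For readers who just want the shape of the PDF side, here is a minimal sketch. It is not the contents of tts.js; it assumes the script runs as a document-level script in Acrobat, where `tts` and `this.hostContainer` are provided by Acrobat's JavaScript environment. The function name `installSpeechHandler` and the allow-everything `onDisclose` are my own choices for illustration.

```javascript
// Sketch only: registers a handler that speaks whatever the host page posts.
// doc    - the Acrobat Doc object (`this` in a document-level script)
// speech - Acrobat's global `tts` object
function installSpeechHandler(doc, speech) {
    doc.hostContainer.messageHandler = {
        // Receives the array passed to postMessage() in the browser.
        onMessage: function (messages) {
            if (!speech.available) {
                return; // no speech engine installed
            }
            for (var i = 0; i < messages.length; i++) {
                speech.qText(String(messages[i])); // queue each string
            }
            speech.talk(); // start speaking the queue
        },
        onError: function (error, messages) {},
        // Assumption: accept messages from any host page. Tighten for real use.
        onDisclose: function (hostUrl, documentUrl) {
            return true;
        }
    };
}

// Only runs inside Acrobat, where the `tts` global exists.
if (typeof tts !== 'undefined' && typeof this.hostContainer !== 'undefined') {
    installSpeechHandler(this, tts);
}
```

Passing the document and speech objects in as parameters keeps the handler testable outside Acrobat with simple stand-in objects.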

For this page I used the Origami framework to inject the script into a valid PDF.

Saturday, July 4, 2009

Google Chrome and Data URIs

Google Chrome's data URIs appear to work with plug-ins like Acrobat (which isn't the case in Firefox). This allows you to do silly things like the one below:


(Requires Chrome, Acrobat 7+ and speakers :)

Under the hood, this is a JavaScript URI containing a small script that replaces the document body with the following:


The encoded blob above is a base64-encoded PDF that in turn contains JavaScript to call the Acrobat text-to-speech API.
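
As a rough illustration of the mechanics, the sketch below builds a `data:` URI from raw PDF bytes and wraps it in an embed tag. The helper names (`toBase64`, `toDataUri`, `buildEmbedMarkup`) and the choice of an `<embed>` element are assumptions for the example, not the markup used on this page.

```javascript
// Base64-encode a binary string (one character per byte).
function toBase64(binaryString) {
    if (typeof btoa === 'function') {
        return btoa(binaryString); // browsers and recent Node.js
    }
    return Buffer.from(binaryString, 'binary').toString('base64'); // older Node.js
}

// Build a data: URI for the given MIME type and payload.
function toDataUri(mimeType, binaryString) {
    return 'data:' + mimeType + ';base64,' + toBase64(binaryString);
}

// Hypothetical markup: a zero-sized embed whose source is the inline PDF.
function buildEmbedMarkup(pdfBinaryString) {
    return '<embed type="application/pdf" width="0" height="0" src="' +
        toDataUri('application/pdf', pdfBinaryString) + '">';
}

// In a page, a javascript: URI could then do something like:
//   document.body.innerHTML = buildEmbedMarkup(pdfBytes);
```

Every PDF starts with a `%PDF-` header, which is why base64-encoded PDFs are easy to spot: they begin with `JVBERi0`.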

P.S. For anyone interested in Acrobat hacking, I'd strongly recommend a look at the "Origami" framework, created by Guillaume Delugré & Fred Raynal. This Ruby-based, open-source tool lets you construct, manipulate and analyse PDF documents.

Saturday, April 25, 2009

Internationalized Domain Names

Internationalized Domain Names (IDNs) are converted into Punycode in most browsers by default, although it depends somewhat on the TLD. (For example, Firefox will show the Punycode domain name for all TLDs except those specified here.) In the quest for the domain-name-to-rule-them-all, I have been searching for domains that are readable in both Unicode and Punycode. For example, http://ωaψward.gr/ converts to http://xn--award-beef.gr. Pretty much a waste of time, but there are a few interesting side-effects that I noticed.
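
The Unicode-to-Punycode direction can be tried out with the standard `URL` parser, which applies the IDNA conversion to host names. This is a small sketch; the helper name `toPunycode` and the `münchen.de` example are mine, not from the experiments described here.

```javascript
// Convert a Unicode domain to its ASCII (Punycode) form using the URL parser.
// Throws a TypeError if the parser rejects the host name.
function toPunycode(domain) {
    return new URL('http://' + domain).hostname;
}

console.log(toPunycode('münchen.de')); // xn--mnchen-3ya.de
```

Plain ASCII domains pass through unchanged, so the same helper can be run over a mixed list of names.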

To start with, I played around manually, using a simple conversion app I wrote in Java with the IDN class. After a few minutes it was fairly clear that it was never going to happen with this approach. Working backwards was faster: build a Punycode name from two valid English words (e.g. xn--apple-banana) and decode it to Unicode. It still involved far too much effort, but introducing a few filters helped to cull URLs that are obviously not words. Some of the more interesting ones were:
  • http://www.ωaψward.gr --> http://xn--award-beef.gr
  • кill.gr --> xn--ill-bed.gr
  • ႦႲႶႻႸႽ.org --> xn--endymion.org (endymion is SO a word)
  • 汤.cn --> xn--ftw (汤 means soup in Chinese, and soup is clearly ftw)
Then there were a few weird ones which deserve special mention:
  • f̹ace̸̸bo̸ok.com--> xn--facebook-deface.com (facebook defaced, get it?)
  • ̱yahoo.cn --> xn--yahoo-end.cn ( Yahoo ended by homograph attack!)
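
The word-pair search described above can be sketched roughly as follows, assuming Node.js and its built-in `url.domainToUnicode`. The function name `decodeCandidates` and the word lists are hypothetical, and the original search was done in Java, so treat this as an approximation of the idea rather than the code that was used.

```javascript
const { domainToUnicode } = require('url');

// Try every "xn--<first>-<second>.<tld>" combination and keep those that
// decode to something other than the ASCII input.
function decodeCandidates(firstWords, secondWords, tld) {
    const results = [];
    for (const first of firstWords) {
        for (const second of secondWords) {
            const ascii = 'xn--' + first + '-' + second + '.' + tld;
            const unicode = domainToUnicode(ascii);
            // An empty or unchanged result is treated as "could not be decoded".
            if (unicode && unicode !== ascii) {
                results.push({ ascii: ascii, unicode: unicode });
            }
        }
    }
    return results;
}

// The part after the last hyphen encodes the non-ASCII characters, so not
// every word decodes; the filters mentioned above would be applied to
// whatever survives.
console.log(decodeCandidates(['award', 'ill'], ['beef', 'bed'], 'gr'));
```
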
As it turns out, however, the man blocks you from making up cool domain names with bourgeois neo-capitalist rules. The main restrictions are IDNA (RFC 3490) & NAMEPREP (RFC 3491), which prohibit a lot of the Unicode goodness. All a bit TL;DR, but basically it means "Bad luck Genghis Botherder, no MONGOLIAN VOWEL SEPARATOR for you".

I tried registering http://www.ωaψward.gr with Greek registrars since it only mixes Greek letters with ASCII, but I was thwarted, mainly by my complete lack of Greek and by Verified by Visa completely failing. (wtf is that shit anyway...srsly) If anyone more clueful than me can shed some light on Unicode domain rules, that would be cool. (Yes Chris Weber, I am talking to you.)
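
Which scripts a label mixes can at least be checked mechanically. The sketch below uses Unicode property escapes in regular expressions; the function name `scriptsIn` and the three-script table are assumptions for illustration, and this says nothing about what any particular registry actually allows.

```javascript
// Scripts to look for; extend the table as needed.
const SCRIPTS = {
    Latin: /\p{Script=Latin}/u,
    Greek: /\p{Script=Greek}/u,
    Cyrillic: /\p{Script=Cyrillic}/u
};

// Return the sorted list of known scripts that appear in a label.
function scriptsIn(label) {
    const found = new Set();
    for (const ch of label) {
        for (const [name, pattern] of Object.entries(SCRIPTS)) {
            if (pattern.test(ch)) {
                found.add(name);
            }
        }
    }
    return Array.from(found).sort();
}

console.log(scriptsIn('\u03c9a\u03c8ward')); // [ 'Greek', 'Latin' ]
console.log(scriptsIn('\u043aill'));         // [ 'Cyrillic', 'Latin' ]
```

A label that reports more than one script is the kind that browsers and registries tend to treat with suspicion as a possible homograph.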