Ben Clifford Technical Blog: 2012

20 November, 2012

columns

(Sorry I'm making up the code in this post rather than actually distilling down the real implementations - it probably doesn't run but you can get the idea)

A few times recently I've wanted to output column-like data from haskell: HTML tables in two cases, and CSV in another.

In these two cases, I wanted column headings (<th> tags in the HTML case; and a CSV heading line in the CSV case.

Previously I've written code that looks roughly like:

mapM_ putHeading ["heading1","heading2","heading3"]
forM_ rows $ \(entry1, entry2, entry3) -> do
  putCell entry1
  putCell entry2
  putCell entry3

The annoyance here was that nothing ties together the headings and the data values: although in the case of two or three columns, it is relatively simple to see the correspondence, it was getting hard in some wider cases.

Vaguely inspired by lenses, I rewrote some of this code to look like this:

cols = [("heading 1", \(entry1, _, _) -> entry1),
        ("heading 2", \(_, entry2, _) -> entry2),
        ("heading 3", \(_, _, entry3) -> entry3)
       ]
mapM_ putHeading (map fst cols)
forM_ rows $ \row -> forM_ cols \col -> (putCell ((snd col) row))

What this does is package up the column heading and the code to generate (for each row) the appropriate content. This makes it easier (I hope) to keep the headings and the data aligned. Also, all the boilerplate that you don't see here (putHeading and putCell disguise it) can be shared, with only a new cols defined for each different table.

13 November, 2012

functor

One of the first cool things you encounter in functional programming is map.

Say you have a function length :: String -> Int which gives the length of a string:

> length "hello"
5

(in real haskell, thats not actually the type of length but its close enough for now)

Now you can apply that to a list of strings, like this:

> map length ["hello","goodbye"]
[5,7]

For most of I've thought of this as meaning "apply the function length to each element of the list ["hello", "goodbye"].

But theres a slightly different interpretation thats a bit more "functional" feeling, that I've come across recently.

Consider only applying the first argument to map (you can do that in haskell...):

map length

Whats the type of this expression? It is [String] -> [Int]. So what its done is converted a function from string to int, into a new function from lists-of-strings to lists-of-ints.
And now we have that function that converts a list of strings to lists of ints, we can apply it to a list of strings:

> (map length) ["hello","goodbye"]
[5,7]

So the different reading that I see now is "lift this function to work on lists", first, followed by application to a list.

The same new intuition applies to functors in general and fmap, and its from thinking more about category theory that this view of things starts to appeal to me.

06 November, 2012

direct.uk consultation notes

Nominet, the authority for the .uk top level domain has a consultation on opening up registration of second level domains, such as example.uk. (At present, people generally can register third level domains such as example.co.uk, example.org.uk, example.ac.uk or example.gov.uk)
Tomorrow I'm going to a round table session (will King Arthur be there?), and so in preparation I made some notes on my response to their consulation document.My notes are here.

05 November, 2012

why didn't I do this before?

I finally got round to making a shell helper, sr:

#!/bin/bash
echo ssh+screen reattach to $1
ssh -tv $1 screen -x

Oh the hours I've wasted over the last 10 years on those very keystrokes...

25 October, 2012

london HUG

well I just got back from the 1st meeting of v2 of the london haskell users group (apparently it used to exist before; and the ghosts of its former incarnation floated around the room in the form of code kata people)
dude (derek) gave a talk on why do monads matter? - a brave thing to do, given how many have tried their own take on a monad tutorial (myself included). nothing spectacular but certainly another take on monads, and it did tickle my brain in the right areas enough into realising that <$> only needs a Functor so it certainly paid off in the OH! sense; even though that leap was personal to me and wouldn't be apparent if you were at the talk - there was no mention of functors at all, really
Turnout was better than the average dutchhug turnout (sorry Shaun)
It also turns out theres a regular Haskell coding dojo in London, hoodlums, already happening (apparently a spinoff or somehow related to v1 of the london hug)
Went to pub afterwards. room booked (or at least some upstairs space that was otherwise empty). chocolate orange beer, which was less disgusting than it sounds. it was cool to meet a bunch of people using haskell for $ (although I count myself in their ranks these days).
after rapidly throwing down a few of those chocolate orange beers (hence the incoherency and lack of case), I shouted out suggestions for future talks on: agda; quickcheck; parsec; and functors/monads/arrows/applicative (turned out some fucker already had a talk on that...)
next meeting 28th nov 2012. i'll probably be there.
ps also at the pub I met another programmer also called Ben - I asked him if he's going to BenConf but although he'd heard of it, it hadn't suckered him in.

18 October, 2012

mifi vs ipv6

today's ipv6 bug: mifi internal nameserver that redirects you to a "mifi not connected" web page when its not connected ... returns some really random shit when you ask it for AAAA. at least sometimes - I don't think it always does that, but maybe? I don't have it switched on but disconnected much.

08 October, 2012

yield zipper

Oleg wrote about converting an arbitrary traversable into a zipper.

His code uses delimited continuations, and I puzzled a while (years...) before starting to understand what was going on.

I just read Yield: mainstream delimited continuations.

It looked to me like I could easily change Oleg's zipper can be expressed using "yield" which gives a different view, that I think I might have understood more easily - because I know yield from other languages, and don't properly have my head around continuations (which is basically the point of the "Yield" paper, I think)

So then, my altered version of the zipper on Oleg's page, using yield:

>  import Data.Traversable as T


>  type Zipper t a = Iterator (Maybe a) a (t a)

>  make_zipper :: T.Traversable t => t a -> Zipper t a
>  make_zipper t = run $ T.mapM f t
>   where
>   f a = do
>     r <- yield a
>     return $ maybe a id r

This is run and yield pretty much as defined on page 10 of the yield paper:

>  data Iterator i o r = Result r | Susp o (i -> Iterator i o r)
>  yield x = shift (\k -> return $ Susp x k)
>  run x = reset $ x >>= return . Result

and some test code:

>  sample = [1,2,3]

>  main = do
>    let (Susp a1 k1) = make_zipper sample
>    print a1
>    let (Susp a2 k2) = k1 Nothing
>    print a2
>    let (Susp a3 k3) = k2 $ Just 100
>    print a3
>    let (Result end) = k3 Nothing
>    print end

and below, to make this posting properly executable, here's Oleg's library code for shift/reset:

> -- The Cont monad for delimited continuations, implemented here to avoid
> -- importing conflicting monad transformer libraries

>  newtype Cont r a = Cont{runCont :: (a -> r) -> r}


>  instance Monad (Cont r) where
>     return x = Cont $ \k -> k x
>     m >>= f  = Cont $ \k -> runCont m (\v -> runCont (f v) k)

>  reset :: Cont r r -> r
>  reset m = runCont m id

>  shift :: ((a -> r) -> Cont r r) -> Cont r a
>  shift e = Cont (\k -> reset (e k))

Update 1: Changed types from Oleg's Z | ZDone to the yield paper's Susp | Result

02 October, 2012

cd

# cd
bash: cd: write error: Success

25 September, 2012

asian letters

I'd got quite used to my os x machine (at least in all its non-terminal windows) being able to deal with chinese (and related) scripts. which was neat because i can read a little bit of that shit. and its the future, after all, so i'd expect that to work. This new debian install though, they're back to being funny squares indicating unknown characters. like they were on the last linux desktop machine i just trashed from 12y ago.

gonna party like its 1999.

18 September, 2012

cut paste paste

I'm running desktop (well, laptop) linux for the first time since I left pygar.isi.edu behind in early 2005.

Trying to cut and paste from an xterm into firefox (actually iceweasel). I copy something (in as much as I remember how xterm copy-on-highlight works). Go to iceweasel and choose edit paste. Get something else completely different. Not random shit. Just some other paste.

A day later, after chatting on #lug, I realise ... *OF COURSE* ... the clipboard I get by pressing my new middle mouse button (I was on a mac before) pastes from a different clipboard than edit... paste does.

#lug also gave me this:

20:34 < philsnow> autocutsel should be installed by default
20:35 < philsnow> it will synchronize the primary and clipboard x selection 
                  buffers
20:35 < milki> but thats like losing a buffer
20:37 < philsnow> i don't know of anybody who _likes_ the separation of primary 
                  and clipboard in x
20:38 < philsnow> i generally use primary exclusively until i find some PoS 
                  site that uses some contrivance to not actually select things 
                  when you select them

I've been using linux since ~1995 and I pretty much have no fucking idea what they are talking about.

Truly this year will be the year of the linux desktop.

11 September, 2012

linux wifi

got a thinkpad x230.

put debian on it

the wifi doesn't work out of the box. grr. just like the last thinkpad I got in 2003...

28 August, 2012

27 o'clock

In Japan, I was surprised to see Osaka FM802 Funky Music Station listing their times in 24h notation all the way up to around 2759... turns out in radio (even apparently in the rest of the world), the day changes at 4am, so the clock runs from 0400 to 2800-δ.

14 August, 2012

os x disk eject

Since 2006 I've been slightly frustrated by having to go to the desktop to eject removable media on my Mac laptop.

Finally I got round discovering the built-in diskutil command.

$ df -h
/dev/disk7s1   1.8Gi  1.8Gi   71Mi    97%    /Volumes/CANON_DC

$ diskutil unmountDisk /dev/disk7
Unmount of all volumes on disk7 was successful

Much nicer than sync... sync... pull.

11 August, 2012

undervoltage

It sounds like a common cause of problems sticking peripherals onto a Raspberry Pi come from unsufficient power supply.

They have a USB port for feeding power in, and I think that encourages people (including myself) to use any old USB compatible supply (even though they explicitly tell you to not do that in the docs...). I tried mine first feeding from my laptop and then from my mifi power supply.

That mifi supply seems to work for regular usage but when I start plugging things into USB it gets all wanky. So I measured the voltage with my trusty old multimeter - it seems to be down around 4.6v, (varying by about 0.1v depending on what I have wired in). That's well below the 4.75v minimum that apparently I should be seeing.

Off the the RS online shop to buy a grown up power supply then...

31 July, 2012

setting up live resize on ec2

ec2 doesn't let you do a live resize on an attached elastic block store; and the procedure for resizeng offline is a bit awkward - make a snapshot and restore that snapshot into a bigger EBS volume (here's a stack overflow article about that).

LVM lets you add space to a volume dynamically, and ext2 can cope with live resizing of a filesystem now. So if I was using LVM, I think I'd be able to do this live.

So what I'm going to do is:

firstly move this volume to LVM without resizing. This will involve downtime as it will be roughly a variant of the above-mentioned "go offline and restore to a different volume"
secondly use LVM to add more space: by adding another EBS to use in addition (rather than as a replacement) for my existing space; adding that to LVM; and live resizing the ext2 partition.

First, move this volume to LVM without resizing.

The configuration at the start is that I have a large data volume mounted at /backup, directly on an attached EBS device, /dev/xvdf.

$ df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvdf              99G   48G   52G  49% /backup

in AWS web console, create a volume that is a little bit bigger than the volume i already have. so 105 gb. no snapshot. make sure its in same availability zone as the instance/other volume.

attach volume to instance, in the aws console.

on the linux instance, it should now appear:

$ dmesg | tail
[15755792.707506] blkfront: regular deviceid=0x860 major,minor=8,96, assuming parts/disk=16
[15755792.708148]  xvdg: unknown partition table
$ cat /proc/partitions 
major minor  #blocks  name
 202        1    8388608 xvda1
 202       80  104857600 xvdf
 202       96  110100480 xvdg

xvdg is the new EBS device.

Despite that dmesg warning, screw having a partition table - I'm using this as a raw device. It might suit your tastes at this moment to create partitions though, but it really doesn't matter.

Now I'm going to make that 105Gb on xvdg into some LVM space: (there's a nice LVM tutorial here if you want someone else's more detailed take)

 # pvcreate /dev/xvdg
  Physical volume "/dev/xvdg" successfully created
# vgcreate backups /dev/xvdg
  Volume group "backups" successfully created

Now we've created a volume group backups which contains one physical volume - /dev/xvdg. Later on we'll add more space into this backups volume group, but for now we'll make it into some space that we can put a file system onto:

# vgdisplay | grep 'VG Size'
  VG Size               105.00 GiB

so we have 105.00 GiB available - the size of the whole new EBS volume created earlier. It turns out not quite, so I'll create a logical volume with only 104Gb of space. What's a wasted partial-gigabyte in the 21st century?

# lvcreate --name backup backups --size 105g
  Volume group "backups" has insufficient free space (26879 extents): 26880 required.
# lvcreate --name backup backups --size 104g
  Logical volume "backup" created

Now that new logical volume has appeared and can be used for a file system:

$ cat /proc/partitions 
major minor  #blocks  name

 202        1    8388608 xvda1
 202       80  104857600 xvdf
 202       96  110100480 xvdg
 253        0  109051904 dm-0
# ls -l /dev/backups/backup
lrwxrwxrwx 1 root root 7 Jul 25 20:35 /dev/backups/backup -> ../dm-0

It appears both as /dev/dm-0 and as /dev/backups/backup - this second name based on the parameters we supplied to vgcreate and lvcreate.

Now we'll do the bit that involves offline-ness: I'm going to take the /backup volume (which is /dev/xvdf at the moment) offline and copy it into this new space, /dev/dm-0.

# umount /backup
# dd if=/dev/xvdf of=/dev/dm-0

This dd takes quite while (hours) - its copying 100gb of data. While I was waiting, I discovered that you can SIGUSR1 a dd process on linux to get IO stats: (thanks mdm)

$ sudo killall -USR1 dd
$ 41304+0 records in
41303+0 records out
43309334528 bytes (43 GB) copied, 4303.97 s, 10.1 MB/s

Once that is finished, we can mount the copied volume:

# mount /dev/backups/backup /backup
# df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                       99G   68G   32G  69% /backup

Now we have the same sized volume, with the same data on it, but now inside LVM.

Second, add more space

Now we've got our filesystem inside LVM, we can start doing interesting things.

The first thing I'm going to do is reuse the old space on /dev/xvdf as additional space.

To do that, add it as a physical volume; add that physical volume to the volume group; allocate that new space to the logical volume; and then resize the ext2 filesystem.

These commands add the old space into the volume group:

# pvcreate /dev/xvdf
  Physical volume "/dev/xvdf" successfully created
# vgextend backups /dev/xvdf
  Volume group "backups" successfully extended

... and these commands show you how much space is available (by trying to allocate too much) and then add to the space:

# lvresize /dev/backups/backup -L+500G
  Extending logical volume backup to 604.00 GiB
  Insufficient free space: 128000 extents needed, but only 25854 available
# lvresize /dev/backups/backup -l+25854
  Rounding up size to full physical extent 25.25 GiB
  Extending logical volume backup to 129.25 GiB
  Logical volume backup successfully resized

Even though we've now made the dm-0 / /dev/backups/backup device much bigger, the filesystem on it is still the same size:

 df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                       99G   68G   32G  69% /backup

But not for long...

Unfortunately:

# resize2fs /dev/backups/backup
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/backups/backup is mounted on /backup; on-line resizing required
old desc_blocks = 7, new_desc_blocks = 9
resize2fs: Kernel does not support online resizing

the version of the kernel on this host doesn't allow online resizing (some do). So I'll have to unmount it briefly to resize:

# umount /backup
# resize2fs /dev/backups/backup
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/backups/backup to 33882112 (4k) blocks.
The filesystem on /dev/backups/backup is now 33882112 blocks long.

# mount /dev/backups/backup /backup
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                      128G   68G   60G  53% /backup

So there's the bigger fs. (though not as big as I had expected... I only seem to have got 30G extra worth of storage, not 100 as I was expecting...

Well it turns out that all the space wasn't allocated to this LV even though I thought I'd done that:

# vgdisplay
...
  Alloc PE / Size       33088 / 129.25 GiB
  Free  PE / Size       19390 / 75.74 GiB
...

but no matter. I can repeat this procedure a second time without too much trouble (indeed doing this procedure easily is the whole reason I want LVM installed...

Having done that, I end up with the expected bigger filesystem:

# df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/backups-backup
                      202G   68G  135G  34% /backup

Now whenever I want to add more space, I can repeat step 2 with just a tiny bit of downtime for that particular filesystem; and if I get round to putting on a kernel with online resizing (my raspberry pi has it, why doesn't this?) then I won't need downtime at all...

23 July, 2012

autogenerating reverse DNS for ipv6

I was getting annoyed by manually configuring an IPv6 reverse domain.

For reverse DNS, you need to break the IP address up into pieces (bytes for IPv4, nibbles for IPv6), reverse them, put dots between pieces, to get a domain name. Then at that domain name, you put a reference to the hostname for that IP.

So an IP address like 2001:8b0:7c:1:216:76ff:fe16:755a turns into a domain name a.5.5.7.6.1.e.f.f.f.6.7.6.1.2.0.1.0.0.0.c.7.0.0.0.b.8.0.1.0.0.2.ip6.arpa., and there you can find a PTR record pointing to the hostname dildano.hawaga.org.uk

Forming those long domain names was/is quite awkward, and its a task well suited to automation. All of the hosts already have forward DNS entries, so there's not even much additional information needed to generate the reverse zone.

I wrote a tool (in an unholy alliance of Haskell and dig) which queries a bunch of forward zones and outputs the appropriate reverse DNS records ready for pasting into a zone file.

You specify zones (and appropriate servers) that will be asked for AAAA records; then all of the AAAA records which refer to IPv6 addresses on the specified network will be converted into PTR records and sent to stdout, ready to paste into a zone file.

$ dnsrz hawaga.org.uk@dildano.hawaga.org.uk clifford.ac@malander.clifford.ac charlottevrinten.org@dildano.hawaga.org.uk mrsclifford.eu@malander.clifford.ac --prefix=200108b0007c0001
 
3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0 PTR clifford.ac.
3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0 PTR malander.clifford.ac.
3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0 PTR malander.mrsclifford.eu.
4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0 PTR fecolith.clifford.ac.
4.1.2.0.0.f.e.f.f.f.3.9.d.0.2.0 PTR pomade.clifford.ac.
a.5.5.7.6.1.e.f.f.f.6.7.6.1.2.0 PTR dildano.hawaga.org.uk.
c.0.2.a.4.c.e.f.f.f.3.6.b.1.2.0 PTR newsnowdrop.mrsclifford.eu.
0.a.0.c.b.a.e.f.f.f.3.6.1.2.2.0 PTR tenesmus.clifford.ac.
7.2.f.0.1.9.e.f.f.f.b.4.5.2.2.0 PTR coprolith.clifford.ac.
c.2.5.d.b.f.e.f.f.f.b.e.7.2.a.b PTR pygar.hawaga.org.uk.
c.2.5.d.b.f.e.f.f.f.b.e.7.2.a.b PTR pygar-6.hawaga.org.uk.
b.6.5.8.2.f.e.f.f.f.8.c.c.b.a.c PTR laptop.hawaga.org.uk.

I wanted to use the Haskell dns package which I've used a bit before; but it didn't have enough features: no zone transfer capability, for a start... so I invoke dig and parse that out.

The commandline syntax is: <zonename>@<DNS server> where zonename is a forward zone, and the specified server will answer AXFRs for that zone. Thats quite icky but it gets around needing a full Haskell DNS implementation.

The code is on github under benclifford/dnsrz.

(later: as fits my tradition of writing a tool and then finding someone has done something similar first, bind comes with a tool arpaname which will convert an IP address into a reverse name, though it doesn't do all the other stuff above, but does work for ipv4 too: http://ftp.isc.org/isc/bind9/cur/9.9/doc/arm/man.arpaname.html

18 July, 2012

first impressions of raspberry pi

I just got my Raspberry Pi, left on doorstep from Farnell in a package labelled "tshirt"; indeed inside the package was a (free) element14/pi tshirt, and wrapped inside that was my actual raspberry pi. Its taken a while to arrive (like most people's) - it was ordered on the 9th of May.

The first afternoon I spent playing with it came with an excited nostalgic feeling to my teenage years, scrabbling to put random pieces of hardware together scrounged around the house in order to get a working system: a keyboard off one host, a usb hub stolen from someone, a long ethernet wire borrowed from someone because the TV is in one room and the ethernet ports are in another. It was made more nostalgic because my best friend from secondary school who used to be my main partner in this kind of activity turned up to have a play too.

I fairly easily got the example debian distribution downloaded (my first linux distro downloaded by bittorrent) and running on a 4gb SD card; and once that was in place, it was easily to get a browser up and watch the bunnycam.

After giving up on getting flash running in order to get BBC iPlayer, I discovered get-iplayer (on the way getting annoyed by this blog post by a BBC dude about how people shouldn't use 3rd party clients). I can just about play BBC radio audio at the same time as the bunny webcam is streaming video. But it didn't seem to have enough power to decode an iplayer video stream instead of the bunny webcam, which was a bit disappointing.

There seem to be a few funny presumably hardware related problems: the device doesn't reboot at all, and sometimes when I boot, sound does not work at all. I haven't had time or inclination to track these down at all. Most concerning for me is that the experience of different devices seems slightly different - for example, other people seem to be able to reboot just fine.

After downloading the recommended Debian distro, I discovered that there was a beta release of a much more recent version, wheezy. This was a lot smoother experience than the recommended distro - it resized the image automatically to fill the whole 4g disk space, and just seemed more polished. Either way, those distributions are debian based, so the amazingness (if you're being nostalgic about 80s/90s era hacking) of apt comes into play to get a big range of software installed.

It was cool that the wheezy distribution came with Scratch - a visual programming language that is sort of a year-2012 version of Logo. (That's cool - one of the first things I thought about when I heard of the goals of the raspberry pi was to think: why don't you get people programming scratch instead?)

Something interesting I discovered was that the ARM architecture does floating point emulations completely differently to intel: there are different calling conventions for software floating point and hardware floating point, and it seems that all the packages in your distro need recompiling to make use of hardware floating point, if they were originally built for software floating point. I'll have a play with that later.

The GPIO port had pins soldered on, which i wasn't expecting. I'm not sure what to use those for. I've had a user port on a computer before, but never used them much - just for flashing LEDs and making model trains drive around. maybe some geeky weather station thing?

So what do you need to get started, in addition to the pi? Well a keyboard and video (composite or HDMI) and a power supply - I'm using the USB-based power supply I stole from my mifi. Probably you'd like a mouse and an ethernet connection.

If I get up to anything cool, I'll write about it on this blog, no doubt.

10 July, 2012

eye in the wrong place.

lights turn on automatically in shitter when you walk in. great! lights turn off after a while taking a dump. wave arms. its all dark. get out phone for light. realise sensor is *outside* the toilet room in sink area. wait for someone to walk by for light. wonder if I should submit technical fault for mislocated sensor?

03 July, 2012

dnstracer

I started getting into writing a program for checking all paths of a DNS query. Then I discovered that there are two (1, 2) tools, both called dnstracer around for doing that... no new ideas under the sun. I guess I'll have to make mine cooler.

26 June, 2012

cheapish mifi

I got a three huawei mifi from store.three.co.uk.

It cost 50 quid for the mifi and 1 gb prepay sim card that I didn't really want as I already had a sim card.

put in sim card, plugged it in to charge, chose network on my laptop, put in preset password, and it worked without any further fiddling.

lots of menu options to play with, esp to do with port forwarding and the like. but for what I'm using it for I don't think I need that. maybe if I want mobile SMTP one day(?)

they didn't send a vat receipt which is annoying but not disasterous. I eventually found an email form on their website to ask for one.

cool to get on train, and almost seamlessly get a network connection. marred by the fact that my mifi is lower in my OS's priority list than a nearby broken BT Openzone hotspot, so it went there first.

19 June, 2012

oops

oops: 18:39:47 up 62 days, 6:24, 1 user, load average: 1943.85, 447.11, 147.51

12 June, 2012

fcgi, haskell, cpanel, php, drupal

I played with fastcgi, which is like CGI but doesn't have to spawn a new process each time.

The initial motivation for this was a server which has a bunch of drupal websites. It was previously running in plain CGI mode, which forks a PHP process for every page request (about 15 spawns per second on this server), with each site's PHP running under a different user account. (The other mode we've tried with this is using mod_php, which runs things much faster but doesn't provide as much isolation between web sites as using CGI as everything runs as the www-data unix user, rather than as a per-site user).

I thought I'd have to do more compiling, but it turns out fastcgi support for both apache and for PHP was already available. On my dev server I needed to apt-get the fastcgi apache module; on the production server, which uses cpanel, fastcgi support was already installed and switching it on was a single mouse click.

Here's a plot of the server CPU load before and after the switch:

There's a clearly visible daily cycle, using up almost 8 cores worth of CPU before the change. At the end of the 30th, I switched on fastcgi, and woo, the load drops right down and stays down. That's just what I wanted.

Reading more, cpanel disrecommends using fastcgi, and recommends somethign else - ruid2 - which looks like it does something similar but different. That recommendation seems mostly because fastcgi has a lot of tweakables that are hard to get right. see this thread.

caveats

I discovered a few interesting things during deployment:

Firstly, a potential attack on directories that have the ExecCGI option enabled - this is discussed in the context of the nginx web server here.

Another was a bug with a specific version of mod_fcgid and the specific configuration I set up, which resulted in a new PHP process being spawned for every page request, and then staying resident (!). Other people have experienced this and it was straightforward to tweak it so that it didn't happen.

haskell

I have a few apps for my own use written in Haskell, and one (a photo ranking app) struggles when called through the regular CGI interface, due to loading the vote/photo database each time. I've considered putting that into snap, a haskell framework, but it seemed interesting to see if I could get fastcgi running under Haskell.

apt-get install libcfgi-dev; cabal install fcgi got me the modules installed. I had some trouble running the hello-world app here

that came down to me not compiling with the -threaded option.

(I also tried the haskell direct-fastcgi module, but the home page for it is gone, and there is no example code so I rapidly gave up)

barwen.ch

I made an fcgi-bin directory available to all barwen.ch users, running FastCGI code under user accounts. There isn't much CGI going on on barwen.ch, but it seemed easy enough to deploy and make available, and is one more feature for the feature list.

06 June, 2012

3ffe:1900:4545:3:200:f8ff:fe21:67cf

happy ipv6 launch day! the world is different now.

some of us noticed 3ffe:1900:4545:3:200:f8ff:fe21:67cf referred to in a BBC article; that address is "clearly" an old 6bone IP from the dawn of the century, which seemed strange to see in an article published this week.

I typed it into google and its used in a lot of places as an example IP address: IBM documentation, an XKCD discussion thread, software unit tests, youtube comments about minecraft....

But nothing appeared to give an original source.

Did someone just make it up years ago as an example and everyone else just copied it off each other?

Who was it originally allocated to on the 6bone?

Answers on an ipv6-enabled postcard...

05 June, 2012

slight niggle with permissions.

On most unixes, you don't need to own a file to delete it. Instead, you need write permission on the containing directory (and if you don't have write permissions on the directory, you can't delete a file even if you own it)

That's not true for directories though. If a directory (c) has files in it, the owner of the containing directory (..) can't delete it because they can't (necessarily) delete the contents of the directory (c/*). And the owner of the directory (c) can't necessarily delete it unless they have write permission on parent (..).

I've only just noticed that difference in behaviour between files and directories. Its never been a problem. (of course, I have root on most systems where it would be so its easy to work around). So I guess this counts as obscure?

29 May, 2012

wine on os x - almost just worked.

I needed to run some windows only stats software on os x (WinBUGS/OpenBUGS).

The authors suggested running it in wine.

port install wine

That almost worked. MacPorts can't seem to deal with upgrades of anything at all, so I ended up getting rid of the macports directory and building everything from scratch.

Then, it worked. Really well.

I was surprised. I guess wine has got a lot better in the 10 years since I last tried to use it. And maybe this program was simple enough that it didn't trip on anything fancy.

22 May, 2012

google ads on my blog

I put google ads in the top corner of my blog for a few weeks to see what happened.

When i've visited it, the ads seem kinda non-specific. I guess my content isn't good enough to get good ads?

In the 5 weeks I've had googles showing, I've had 796 page views (as far as adwords is concerned) and a single click worth 28p (= USD 0.45).

So pretty much not worth polluting my pages for and I'll take them off again, I think.

15 May, 2012

bunny webcam

I got this webcam for looking at pet rabbits for my girlfriend's birthday. (webcams, dear children, are something still fascinating for people born before, say, 1985)

The set up was pretty straightforward. I plugged it into the local ethernet and it appeared at 192.168.1.239. The local ethernet uses 192.168.1.0/24 but I think maybe have been just chance that it configured itself that way - it wasn't using DHCP out of the box. When I switched on DHCP it reconfigured itself to a DHCP-allocated address in the same range.

The supplied manual (on paper) describes downloading and installing some setup.exe to configure, but I ignored that, and using nmap discovered it was running an http server on port 81. Logging into this as admin, with no password, I found myself able to view and control the camera without need for any software.

I couldn't get the wifi working, but only spent 10 mins or so on that - we decided it would be best placed right by a wired ethernet port, so there was no need for that.

We attached it to the underside of an Ikea table using electrical tape. There was a balance to strike between being low enough to get a good angle, but high enough that they can't eat the camera or the wires. I would have loved to have put it in the cage with them, but the lifetime of the wires would have been measured in minutes, or possibly seconds.

The camera has motors to pan and tilt, though where its installed that isn't really necessary, and the whirring of the servos seems to scare the white rabbit a bit. That one likes to sit looking at the camera on the other side of the fence, hanging off the underside of the table.

There's a night vision mode too. The camera is surrounded by a ring of infrared LEDs - actually also a bit visible red too. These are turned on by a CdS cell above the camera lens (so you can trigger them with your finger rather than needing to put the camera in the dark).

The user interface is clunky but functional. The main page looks like this, with arrows at the top left to drive the servos. There are admin menus too, which appear even if you aren't authorised - they just don't work for non-authorised users. This clunky interface means its not a good camera for streaming to the public at large.

So, for £40, was this worth it? yes

(btw, not all the pictures in this post were made with the webcam - for example, the pictures of the webcam were taken with an iPhone rather than a complicated mirroring optical arrangement)

08 May, 2012

Manhattan would be squarer if it had diagonal streets too

There are different ways of measuring distance - that's the abstraction of metric spaces.

The usual distance that people use looks like this:

Points at a certain distance form a circle around the point that you're measuring from.

Another metric is the Manhattan distance, where rather than moving in any direction you can only move along one axis at once. This leads to diamond contours instead of circles.

Those are both well known metrics. I wondered what the Manhattan distance would look like if you were allowed to move diagonally as well as along the axes. The distance seems to be max(abs(dx),abs(dy)), and that looks like this, all square:

04 May, 2012

suPHP vs CVE-2012-1823

I needed to investigate CVE-2012-1823 for a few sites that I help look after.

They all use suPHP (some of them via cPanel, some directly configured).

I couldn't find anything in Google about whether CVE-2012-1823 affects suPHP - they all talk about php-cgi, and suPHP does something very similar, but with a bit more functionality.

As far as I can tell, the exploit comes specifically from CGI handling; and relates to how a URL turns into an invocation of PHP.

From looking at the suPHP source code, it looks like that exploit path is not available. The arguments to pass to PHP seem to be formed totally differently in suPHP compared to a CGI execution.

I'd love to hear anyone else's opinion though...

24 April, 2012

fire upon the deep unix timestamps are NNNNN bits

Vernor Vinge talks about people still using unix timestamps in his future universe.

Unix timestamps, when stored in fixed width fields, run out of bits. 32 bits runs out in 2038: the year 2038 problem.

I wondered how many bits would be needed to be to represent timestamps in his future.

Vinge is not very clear when his novels are set. Someone on slashdot thinks 38000 years from now.

Well, the next obvious size up from 32 bits is 64 bits. And wikipedia says that will run out in the year 292,277,026,596. Which seems far in advance of Vinge's future.

I was hoping the answer would be cooler than that. But 64 bits seems to be enough.

17 April, 2012

machine ordering restaurant

I went to a sushi restaurant in Brisbane which used touch screens for ordering. That was pretty nice for ordering a few things at once without having to interact with a human being (something hackers hate to do...).

It reminded me of a place like that I went to near Red Square in Moscow, which was an internet cafe/pub. Unfortunately, that place had a bug: it de-duplicated your order so that you didn't accidentally order the same thing twice. But that meant it was difficult to order pint after pint after pint - it detected all but the first as duplicates; and worse, silently discarded the subsequent orders rather than telling you.

10 April, 2012

commandline RSS->text tool using Haskell arrows

I wanted barwen.ch to display news updates at login. I already have an RSS feed from the drupal installation on the main page; and that RSS feed is already gatewayed into the IRC channel. So that seemed an obvious place to get news updates.

I wrote a tool, rsstty, to output the headlines to stdout. Then, I wired it into the existing update-motd installation to fire everything someone logs in.

So you can say:

$ rsstty http://s0.barwen.ch/rss.xml
 * ZNC hosting(Thu, 01 Mar 2012 10:09:15 +0000)
 * finger server with cgi-like functionaity(Wed, 22 Feb 2012 18:43:08 +0000)
 * Welcome, people who are reading the login MOTD(Fri, 17 Feb 2012 23:56:44 +0000)
 * resized and rebooted(Wed, 25 Jan 2012 12:23:39 +0000)
 * One time passwords (HOTP/TOTP)(Wed, 18 Jan 2012 11:33:45 +0000)

I wrote the code in Haskell, using the arrow-xml package.

arrow-xml is a library for munging XML data. Programming using it is vaguely reminiscent of XSLT, but it is embedded inside Haskell, so you get to use Haskell syntax and Haskell libraries.

The interesting arrow bit of the code is this. Arrow syntax is kinda awkward to get used to Haskell and sufficiently different from regular syntax and monad syntax that even if you know those you have to get used to it. If you want to get even more confused, try to figure out how it ties into category theory - possibly the worst possible way to learn arrows ever.

But basically, the below definition make a Haskell arrow which turns a url (to an RSS feed) into a stream of one line text headlines with title and date (as above)

> arrow1 urlstring =
>  proc x -> do
>   url <- (arr $ const urlstring) -< x

This turns the supplied filename into a stream of just that single filename. (i.e. awkward plumbing)

>   rss <- readFromDocument [withValidate no, withCurl []] -< url

This uses that unixy favourite, curl (which already has Haskell bindings), to convert a stream of URLs into a stream of XML documents retrieved from those URLs - for each URL, there will be one corresponding XML document.

>   item <- deep (hasName "item" <<< isElem) -< rss

Now convert a stream of XML documents into a stream of <item> XML elements. Each XML document might have multiple item elements (and probably will - each RSS news item is supplied as an <item>) so there will be more things in the output stream than in the input stream.

>   title <- textOfChild "title" -< item
>   pubdate <- textOfChild "pubDate" -< item

Next, I'm going to pull out the text of the <title> and <pubdate> child elements of the items - there should be one each per item

>   returnA -< " * " ++ title ++ "(" ++ pubdate ++ ")\n"

When we get to this point, we should have a stream of items, a stream of titles corresponding to each item, and a stream of pubdates corresponding to each title. So now I can return (using the arrow-specific returnA) what I want using regular Haskell string operations: a stream of strings describing each item.

The above arrow is wrapped in code which feeds in the URL from the command line, and displays the stream of one-line news items on stdout.

The other interesting bit is a helper arrow, textOfChild which extracts the text content of a named child of each element coming through an XML stream. Each part of this helper arrow is another arrow, and they're wired together using <<<. To read it, imagine feeding in XML elements at the right hand side, with each arrow taking that stream and outputting a different stream: first each element is converted into a stream of its children; then only the element children are allowed through; then only the elements with the supplied name; then all of the children of any elements so selected; and then the text content of those. (its quite a long chain, but thats what the XML infoset looks like...)

> textOfChild name =
>  textNodeToString <<< getChildren <<< hasName name <<< isElem <<< getChildren

03 April, 2012

program a little bit every day

Something I've been trying for the last year or so is to make sure that I program a little bit every day.

I've seen lots of my friends drift upwards into management or other fields, where their default program becomes Word or Powerpoint rather than vi.

I've seen other people already in a management position when I met them, desperate to hack on things rather than manage things, but finding time to do it.

Having seen all that, and eventually realising what I saw, I became fearful that the same would happen to me.

So I started trying to program a little bit every day, at least on workdays.

The rules are pretty simple: I try to edit some code, compile and run it.

Often, I'm being paid by someone to write code for them. That makes things real easy. Not only to do I get my resolution done, but I get a wad of money too.

Sometimes I don't - perhaps because I've spent a day on paperwork, on design documentation, on configuring things, on phoning other people to find out why shit is broken.

In those cases, I have to make time to program. Sometimes the change I make is as small as renaming a variable I didn't like. Or rephasing an error message.

But the important thing is that it keeps me touching a programming editor (vi in my case, but whatever) and compiling stuff and running it and testing it, every day. Stopping me falling down the slope of "its been N days since I last programmed" which seems so easily to become "its been N years since I last programmed".

26 March, 2012

email addresses are half case sensitive

The left hand part of an email address, the bit before the @ is case sensitive, in email in general. I've known that for a while - it seems to be an obscure-ish part of SMTP folklore.

Individual mail domains are perfectly at liberty to fold multiple distinct addresses into one, in their own domain, which is what most mail systems do: BENC and benc and bEnC all go to the same place @hawaga.org.uk. This leads many people to think that the left hand side is case insensitive.

This is just as they are at liberty to do that folding in other ways: for example, gmail ignores . in addresses, giving me b.clifford@gmail.com and bclifford@gmail.com. As well as b.c.l.i.ff.o.r.d@gmail.com

This came up on a mailing list (for browserid) that I watch, and I ended up being challenged in private email to cite a source. Luckily there's plenty of stuff around. RFC2821 section 2.4 seems to be the authority: The local-part of a mailbox MUST BE treated as case sensitive.

In the preparation of this blog post, I discovered something I didn't know before. It seems you cannot have multiple dots in a row in a plain email address. The RFC2821 production rules are:

      Local-part = Dot-string / Quoted-string
            ; MAY be case-sensitive

      Dot-string = Atom *("." Atom)

      Atom = 1*atext

where atext seems to come from the companion RFC2822 section 3.2.4. So b...clifford@gmail.com is not a valid address. Shame.

22 March, 2012

city of a thousand broken access points

Auckland seems to have wifi access points all over the central business district. But for the most part they don't seem to work. They seem either charge-based or needing-a-login-even-though-not-charging, and they all seem screwed up in that my equipment either can't associate to them, or when it does associate to them, it doesn't get an IP address via DHCP.

What kind of madness is this? that the equipment is deployed and then apparently neglected so much by all involved that it doesn't work enough to generate revenue...

13 March, 2012

yubikey input fail.

I've been using a yubikey to generate numeric OTP codes for logging into one of my servers when using friends PCs.

That's work well so far. But today I tried to use it on a french keyboard, where the "number" keys generate punctuation (the numbers are on the shifted version). Well, now, my yubikey then just enters a load of punctuation in place of the OTP code. Fail.

06 March, 2012

Automatic beep on arrival

I have an iPhone with no gsm connection, just wifi. It knows the password for a lot of places I hang out which has led to pretty much it always beeps when I'm standing in the doorway of where ever I was headed - just in case I hadn't noticed.

28 February, 2012

syntactically correct type-checkable /* NOTIMPL */

Agda, a dependently typed programming langauge Agda has a neat feature for partially written code.

Often I'll flesh out code and write something like a TODO inline. I'm sure lots of other people do to.

(For example, in C:

  int x;
  x = TODO_CALCULATEX();

This won't compile, so you don't get the benefit of compile time checking on your code until you've fixed up all your TODOs into something that makes sense to the compiler: either implementing it, or implementing a stub (an extreme case of which might be to replace the whole function call above by the constant 0).

In C and other languages which don't do much in the way of correctness checking at compiler time, thats ok.

For a lot of uses of Agda, the compile time checking is where all the interesting stuff is, though: for example, Agda types are where you put proofs like "this sort function really does sort".

Its a bit more awkward to make up stubs that claim in their comment or name to do something, whilst not doing it, because there is usually a lot more stuff in the type signature (such as the assertion that this sort function really does sort). You can't just put a return 0; and have it type check ok.

So, Agda uses a special extra bit of syntax: _ (an underscore) to mean "I have no value for this; but please pretend to all intents and purposes that you do." That way, compile time checking can carry on. Agda understand that its a TODO that you'll get to later on. Or even in some circumstances figure out the value for you.

27 February, 2012

2y

this blog is 2y old today! I'm writing this way back in May 2011, though, so there's a pretty high chance that it could have been dead for a few months by the time you read this though ;)

21 February, 2012

bugs that never go away

I had a problem trying to get mrtg to read ssCpuRawUser.0 and related snmp variables. it was giving some error, even though it can read various other snmp variables fine. i googled. I got a few pages of people going all the way back to 2001 with the same error. no solutions. there's even a bug where the mrtg author WORKSFORMEs it. fuck this. I think there's an XKCD about this. So, I can read the variable with the net-snmp commandline tool and feed it in that way. yay for being able to write my own plugins.

#!/bin/bash
# $1 = name1
# $2 = name
# $2 = host
snmpget -m + -c public -v 1 $3 $1 | sed 's/^.*Counter32: //'
snmpget -m + -c public -v 1 $3 $2 | sed 's/^.*Counter32: //'

17 February, 2012

dmarc

A bunch of email providers announced DMARC which builds on top of SPF and DKIM to allow domains to specify more policy when SPF and/or DKIM fail.

I already have SPF and DKIM set up on my personal domain, hawaga.org.uk, which has been round for over a decade. I run mail servers for various other domains, but those are much younger and much less widely used.

Its been hard to quantify how much this has helped/not helped. I don't get complaints about spam originating from my address. I used to get a lost of postmaster backscatter but not any more - not sure why, though I can invent various possible reasons.

One of the interesting things with DMARC is that it claims to provide feedback about what filtering is happening, from receiving/filtering parties - that makes it especially interesting, I think.

So, given that I already have DKIM and SPF, what extra do I need to do to get something useful from DMARC?

I need to publish a policy in DNS, under my sending domain. (this is also how SPF and DKIM do things)

So I've put in this policy on the 4th of Feb:
_dmarc.hawaga.org.uk. 3583 IN TXT "v=DMARC1\;p=none\;rua=mailto:benc@hawaga.org.uk\;ruf=mailto:benc@hawaga.org.uk\;ri=3600"
That says to not enforce any policy, but to email benc@hawaga.org.uk with reports every 3600 seconds (= 1 hour).

I set this up at about 5pm on a Saturday and about 11am on Sunday morning my first report arrived, with a timestamp range of a day, which must extend back before I turned this on...

In there, three messages from my main outbound mail server, and no others.

Lets see what else I get...

A couple of weeks later...

I got daily reports most days from Google (I think maybe the day I didn't get a report was because I hadn't sent any mail into google all(?)).

A few days after the above I added in two other domains: my company domain with only occasionally sends mail, and my girlfriend's vanity domain. Neither of those have SPF or DKIM on them, even though they come from the same mail servers as hawaga.org.uk.

There was a noticeable lack of reports from anyone other than Google. I asked around (on Google+) to see if anyone had reports from elsewhere (eg AOL or Yahoo, because those were also listed) but no one said yes.

So what about the reports?

Well, there were surprisingly more mail servers than I expected: along with my own two outbound servers, there were about 10 other servers, being the outbound mail servers of a handful of research institutes that I work with. Those reports were tagged by google as being via a mailing list. Its not clear to me what defines a message as being via a mailing list, but I guess it would mean that they'll put less weight on my SPF records? It also highlights how a naive interaction between mailing lists and SPF can result in your message being treated as spam.

I also got some DKIM fails reported from my own legitimate mail server. The best I have been able to diagnose there is that I had sendmail set to deliver mail without a DKIM signature if the dkim milter timed out; but if that's going to contribute negatively to spam treatment, then I think a better configuration is to have the milter set to retry later, resulting in more delayed mail, but more DKIM-signed mail.

The extra domains I added had no DKIM on them, but those weren't treated as DKIM-fails. Instead they were reported as DKIM 'none'. I'm not sure what causes none rather than fail, but my guess is its something to do with the fact that hawaga.org.uk has DKIM records in its DNS, and thats being treated as an indication that there should be DKIM signatures on messages. I think that's extra meaning that I hadn't understood DKIM DNS records to mean.

I have a similar confusion with the interaction between SPF and DMARC: SPF has multiple output states, not just pass or fail, and its not clear to me how those are treated by DMARC.

Processingwise: the reports come as zipped XML documents. It was relatively straightforward to munge these like any other XML (though I made it harder for myself by learning a new Haskell XML library rather than using ones I already knew).

Its unclear to me how I know that a report really is from a particular sender, and what the threat model is for people injecting false DKIM reports - perhaps injecting them to suggest that people's use of DKIM and SPF is causing their mail to be dropped, and thus encouraging them to turn off SPF and DKIM?

So for now, I'll keep this switched on, in monitoring-only mode. I don't feel I understand it well enough to turn it on in enforcement mode (especially as I'm not the only user sending mail under hawaga.org.uk. I think its very interesting and probably useful to be able to specify policy this way; but the policy language at the moment feels either vaguely defined, or at the least not concisely described, in a way that makes me comfortable.

11 February, 2012

DKIM - domainkeys identified mail

Looks like I never wrote a blog posting on setting up DKIM. I just realised one of my servers wasn't set up after a re-install, so I'm having to remember how to do it again.

I'm using sendmail. (yes, shut up) and DKIM hooks in using its milter (mail filter) mechanism.

# apt-get install dkim-filter

Now wire it into sendmail.mc:
INPUT_MAIL_FILTER(`dkim', `S=/var/run/dkim-filter/dkim-filter.sock')

Now when mail comes in, you should see it gets headers like this added by your mail server (dildano.hawaga.org.uk in this case) when DKIM verification happens (eg in mail from gmail).:

Authentication-Results: dildano.hawaga.org.uk; dkim=pass (1024-bit key)
 header.i=@hawaga.org.uk; dkim-adsp=none

The other half of the equation is DKIM signing my outbound mail, so that other people who do checks like this can verify/not-verify my email.

DKIM needs a public/private keypaid

# dkim-genkey -b 1024 -d hawaga.org.uk -s hampshire

-s specifies a selector name. This is a fairly arbitrary identifier used to identify this keypair, because a domain can have multiple keypairs (for example, one per mail server). In the hawaga.org.uk domain, I seem to use names of English counties.

# ls
hampshire.private  hampshire.txt
# cat hampshire.txt
hampshire._domainkey IN TXT "v=DKIM1; g=*; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDUP+5f0nEWyYICxr8rLN8xannlteBg4WF2Fat/MS8CiAa1lE2wgvhKYJJD/ydJ//5B9fBZAwSXTAq2ZCQYIfRf985Yip0BK80ECTlOunaSnMY/4/RzmkXGpndJaHIFqmSWDhML1yBP6W6owJDXIPDCAbV80kd5Z5aAkv8518lk+wIDAQAB" ; ----- DKIM hampshire for hawaga.org.uk

That .txt file is a DNS record to install under hawaga.org.uk. When you've installed it, you can check with:
dig -t txt hampshire._domainkey.hawaga.org.uk @localhost

That's the public key installed. Now the private key.

In /etc/dkim-filter.conf:

Domain hawaga.org.uk
KeyFile /etc/mail/hampshire.private
Selector hampshire

# /etc/init.d/dkim-filter restart

Now send out some mail to some other account. It should have a DKIM signature header added like this:

DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=hawaga.org.uk;
 s=hampshire; t=1328349841;
 bh=hGo8Oadbgx3cVNwLr3hGDRfMX5LwWwXuz2PzqEowx0I=;
 h=Date:From:To:Subject:Message-ID:MIME-Version:Content-Type;
 b=oBeSDSzxz7/awSnxuos6jyJuBoYH2MbiB3HDpbZfLQnTTdEJdx2WD0ubSVAaKAJmV
  ma5xuSaNGeS7X3Xg49obL6nWA89tiOeVAq9FO+7NP+v2DmUPFxEYkLeQJUANYKzAw/
  r8ag9XnbRkxvY+J/rrmeaAjJdnfgUQlKSHlV5CWE=

... and if that other account happens to do DKIM verification, you should see its version of:

Authentication-Results: paella.hawaga.org.uk; dkim=pass (1024-bit key) header.i=@hawaga.org.uk

03 February, 2012

yubikey for encryption, not verification, passwords

I previously mentioned that my yubikey has a mode where it can enter a 64 character fixed string. I've been regarding that as useful in systems that are too closed to support HOTP. But I just realised that they also have a more "sensible" use on systems that due to more fundamental technical reasons cannot have a changing password - where that password is used to actually encrypt data, rather than being verified against an expected password - for example, GPG or encrypted home directories.

28 January, 2012

one line to make your site look nice on iPhone

Well, I came across a magic line of HTML for making a website look basically readable on an iPhone. Not magic in the sense that I don't understand what it does. But magic in the sense that its a single line that is the first big step to making a site look OK.

The line is (to go in your <head> section:
<meta name="viewport" content="width = device-width" />

What it does is make the iPhone web browser render the HTML at a sensible readable font size, with word wrapping at the end of the screen. (the default, otherwise, is to try to fit a regular screen worth of pixels across, then zoom out to make it fit on the small iPhone screen - that means the user has to zoom and pan to do anything).

Now my pages still look like crappy hand written HTML, but at least they're readable on an iPhone now.

I added this to the shellinabox installation I have on barwen.ch, and now its much prettier to use a browser-based shell on an iphone - you get a 30 character terminal thats at sensible font size, rather than a wide wide terminal at unreadable font size.

21 January, 2012

Two hardware OTP keys

I got a couple of OTP (one time password) keys to try out. These are hardware dongles that generate a unique code number every time you use them, which you then use in addition to a password when you log in to places (eg your server, some website).

The goal is to make things more secure by not having a password someone can steal.

The mechanics of this used to be hard to describe but enough people use online banking with security tokens now (at least in western europe) that the idea is pretty widely known already. (You can read about two factor authentication on wikipedia)

First, I tried a yubikey. Yubikey comes with a silly tagline, "the key to the cloud", but don't let that put you off. The yubikey plugs into a USB port on your computer and when you press its single button, it types in the next code in sequence as if it was a USB keyboard. I've tried this on a linux box and and an OS X box and it had no problem on either.

Pressing a button seems much less hassle than typing in a code off an LCD screen, but it does come with downsides: you need to have the device in a USB slot when you press the button. On a tower desktop, that's possibly down by the floor. Even worse, maybe you don't have a USB slot at all, in which case the device is useless.

yubikey has a number of different modes, and can store two configurations at once (yes, even though it has just one button).

There's a yubikey proprietary mode which generates a long key string which contains a bunch of stuff (for example, a device ID); an HOTP mode which generates a 6 or 8 character code (with programmable extra decoration); and a static mode which types in a preprogrammed fixed 64 character string. This is all configured by some software that you get for free off the yubikey website. I'm always a bit wary of vendor software to support hardware devices, because it often seems a much lower priority than the hardware device itself. But this worked well enough.

I only tried out the HOTP mode, because I wanted interoperability with other OTP implementations.

The two configurations are accessed by holding down the button either for less than 2s or for more than 2s. I've only used the first configuration, and I haven't had any trouble with accidentally falling through to the second one. But it sounds a bit cracked out and if I was giving it to the kind of user who would hold the button extra long "just to be sure," then maybe there would be trouble. I was hoping there would be an option to switch that second configuration off, but I didn't see one.

My only interaction with the supplier, yubico, was to order the key online, for $30+VAT. This arrived 24h after I order it, by regular mail (from the next village over!).

The second device I got was an OTP C200 token from Gooze. This has a more traditional user interface with a 6 digit LCD display and a button which turns on/off the display. The C200 is a TOTP token: HOTP, except the code changes every 30 seconds instead of when you press the button.

Gooze also makes a C100 which is regular HOTP. I haven't tried one of these, but the design of the C200 case makes me think that the button would get pressed a bunch randomly if you're carrying this with your keys in your pocket. With TOTP, that's not a problem - the code is not related to the button press. But I've encountered loss-of-sync troubles with other hardware tokens before due to this and I don't think the C100 solves that problem.

There is no configuration of the device itself needed - it only does TOTP, and the seed value is preloaded. You get sent that by Gooze. This was sent by GPG encrypted mail so I could cut and paste it into the configuration of my server easily. It means Gooze knows your secret key (although they claim to delete them after sending). I'm not too fussed by that because I'm not aiming for über-high security, but I'm sure some people will.

Worse though was that through some mess up in customer service, it took them over a week to get the codes to me after the devices arrived, and they were pretty silent during that week despite repeated enquiries. I think this is due to the company being pretty small. This is almost enough to make me not order from them again.

Because the C200 has a screen, you have to read the code and type it into your computer by hand. So some properties are inverted from the yubikey: it's a hassle to type in the code; but it doesn't matter if you have a USB port. Because of that, I think this is more appropriate than the yubikey for "I'm going on holiday but want to be able to access my email from public terminals" uses.

I wired both of these up to pam_oath to log into my linux servers; maybe I'll write about that side of things another time. Neither device has beaten the other in being my favourite - the yubikey has substantially higher geek value for plugging into a USB port and greater convenience in some circumstances, but the C200 feels more practical for other use cases such as connecting from unfamilar devices. I've only had a few days to form the above opinions and I expect I'll form more opinions over time.

15 January, 2012

server availability like uptime

I wondered if I could get a measure of server availability as a single number, automatically (for calculating things like how tragically few nines of uptime my own servers have)

So, I wrote a tool called long-uptime which you use like this:

The first time you run the code, initialise the counter. You can specify your estimate, or let it default to 0:

$ long-uptime --init

and then every minute in a cronjob run this:

$ long-uptime
0.8974271427587808

which means that the site has 89.7% uptime.

It computes an exponentially weighted average with a decay constant (which is a bit like a half life) of a month. This is how unix load averages (the last three values that come out of the uptime command) are calculated, though with much shorter decay constants of 1, 5, and 15 minutes.

When the machine is up (that is, you are running long-uptime in a cron job), then the average moves towards 1. When the machine is down (that is, you are not running long-uptime), then the average moves towards 0. Or rather, the first time you run long-uptime after a break, it realises you haven't run it during the downtime and recomputes the average as if it had been accumulating 0 scores.

Download the code:

$ wget http://www.hawaga.org.uk/tmp/long-uptime-0.1.tar.gz
$ tar xzvf long-uptime-0.1.tar.gz
$ cabal install
$ long-uptime --init

09 January, 2012

cuba

was in cuba for a month. internet is hard there (though possible) so no updates on this blog in that time. sorry.