A very apt metaphor

“we don’t have electricity on the south side of the building, but I know water conducts electricity, so just connect the water main to the power grid, then install an outlet in every faucet”

Brad Montgomery on what we do as software engineers. Original at https://bradmontgomery.net/blog/blue-collar-programmer/

I don’t know why or how I ended up at that post but that situation evokes many fond memories of failed attempts at explaining what I was doing sometimes.

For all of its sins (or our inability to understand it back then) XML coupled with XSLT & friends is a nice basis to do ETL transformations.

Hey, I wrote an assembler.

(It was about time I interrupted a never ending pile of drafts with something)

So, for a bit more than a month I’ve been attending a seminar on VHDL microcontroller design.

One of the workshops involved doing some simple exercises on eval boards. While the overall instruction set is small (fits on less than a page) the idea of programming and then assembling the sources using pencil and paper wasn’t very appealing at the moment.

It sure is a fun way to keep the mind fresh but given time constraints I couldn’t cope with such a long debugging cycle.

And… The obvious path was to build a tool.

I reused parts of python-lx200 because smart data structures and dumb code are nice. Most of this could be implemented with M4 too, perhaps for another code golf session.

A couple of years ago I’d probably tried to use Bison and Flex but doubt that I could manage something like this in just a handful of hours during the weekend.

There are some rough edges but as it is it supports labels, variable definitions and emits a valid IHEX file. I made a couple of dumb mistakes but they were really evident when looking at what Quartus made of the output file.

I’m quite proud of the result, I don’t know when was the last time I had so much fun doing a one off project, even if it wouldn’t be used anymore after the seminar.

The source code lives here: https://github.com/pardo-bsso/islyd-asm

Traitlets

Lately I’ve been hacking on IPython and Jupyter.

That journey led me into Traitlets and I wish I knew about them before.

For most of my stuff I use argparse, blinker (when not using GLib) and plain properties.

I knew about traits from long ago but what really liked from a first look is their solution to configuration management. It’s clean, self documenting, type checked, composable and can be python code.

I’m still a bit torn about having a complete programming language as a user configuration tool instead of a DSL or stuff like ini-files, json or the good dialect of yaml.

Wisdom

Quite a while ago (in the last century nonetheless) my idea of a productive day entailed writing a lot of code, measured by size in any suitable metric.

Lately I’ve been writing less in volume but I realize that I spend a greater time thinking about the problem at hand as a whole and that it happens mostly in the background while I’m doing something else. By the time I’m again at the workstation everything falls into place.

Also, when stepping aside and contemplating whatever I engineered I can’t help to feel anything but pride. Perhaps except for the documentation I build things from the get go thinking of what I would like to have were I a library user, on terms of building blocks.

During the last two weeks I built a library to parse a protocol called LX200 used to control telescopes and I can’t be happier with the result (for now it’s at https://github.com/telescopio-montemayor/python-lx200 ). The first one was a roller coaster, due to some other issues I went back to a night owl schedule and I can’t remember when was the last time I had such prolonged and intense periods of flow. I also taught myself asyncio.

It’s terse, concise, and (mostly) well structured. My former self would’ve made a mess of a state machine tied together with pages of if statements that worked, for sure, but was a pain to extend or correct. Of course looking down the path and leveraging years of experience this things seem obvious now.

Coincidentally, the other day Eric wrote about the advantage of declarative/table driven approaches.

It’s Over.

Yesterday, after almost exactly a year it was over.

We met for the first time on November 2016 while at PyCon. It was just like a dream.

The farewell party was at a very large house with pool and a park.

After a while of all the noise, shallow chit chat and drunken jokes I went for a walk at the park. It was dimly lit and among the shadows I catched a glimpse of a cat. I followed it until it decided to stop and allowed me to pet him.

And then, She joined me. We talked a lot about how we like cats, the clear sky full of stars, funny moments of the event. We got bored of sitting near the pool and I offered to grab a big puff so we can lay and watch the stars.

But by the time I reached the house some of my friends wanted to come back to the hostel, so I had to leave as I was the designated driver.

Later on December we matched on Tinder and started to chat, moved over to whatsapp and later telegram. She was about 1000km away back then.

On our first date I was a tad nervous, almost innocent. But then the fire lit again.

We had our ups and downs, wonderful moments and some not so much. We did a couple of trips, one of the most beautiful to Bariloche.

Without doubt she was one of the few Womans that I truly cared and Loved in the last years (Love with capital L).
She was one of the two that really cared about me and gave me support in one of my darkest eras (among other things She convinced me to start seeing a psychiatrist). Also, we explored our imagination and our kinky sides. She was one of the few woman that made me open up and reveal myself with all my insecurities and weaknesses. She let my human side flourish. Perhaps a bit too much, sometimes just the right word made me cry.

But not everything was roses. I became distant and tormented with troubles, some of them real, others just a byproduct of my mind. I focused on my studies and regaining my place on the society as an active member and so I put her on a second place for most of the week.

We had previous instances where we were about to call it quits. On one of them I applied some techniques of the Dual Control model and against all odds we ended in a whirlwind of lust.

But the writing was on the wall and we saw this coming.

She put it very concisely into words:

“We Love each other a Lot.
But we are not in love.”

Burning down the house

(Watch out)

So many drafts, some stories and pictures from the last PyCon at Bahía Blanca.

I was happily hacking on the kitchen the other Saturday when I hear a strange noise coming from the garden.

To my dismal surprise I see that the shed is on fire and part of the roof collapsed. I went in to take out a propane can to avoid an impending catastrophe and called the firemen (lucky us, they are a few blocks away).

We lost the roof, tools, vinyls and books on an adjacent room but nothing that can’t be replaced. Still fuck.

Some pictures of PyCon at flickr (not mine) https://www.flickr.com/photos/70871182@N04/sets/72157677377824525

PyConAR 2016 Days 1, 2 and 3

The first day we were quite busy handling the admission and all that stuff at Club Emprendedores Bahía Blanca, so no pictures (at least from me, there are others from the official photographer)

The following days were at Complejo Palihue. It’s a lovely campus at the outside of the city within very wealthy neighborhoods.

The lecture rooms and amphitheaters are ample and well stocked and the view is lovely:

I also spent a while at our booth

And attended to a couple of talks. This one is about MicroPython running on the EDU-CIAA

A small detour on my way to PyConAR 2016

It’s that time of the year (again) when I rent a car and hit the route.

This time I’m heading to Bahía Blanca in order to help a bit and attend to PyConAR.

At the side of Ruta 51 on Coronel Pringles, a bit after the crossing with Ruta 72 there’s a wonderful lake.
Even if there were about 60 kms left I had to stop to enjoy the day and stretch a bit my legs.

Breaking a simple captcha with Python and Pillow

A while ago one of our long time customers approached us to automate tasks on a government portal. At least here most of them are kind of ugly, work on a specific set of browser versions and are painfully slow. We already helped him with problems like this before, so instead of having someone enter manually all the data they just populate a database and then our robot does all the work, simulating the actions on the web portal.

This one is a bit different, because they introduced a captcha in order to infuriate users (seriously, it looks like they don’t want people logging in).

Most of the time they look like this:

The first thing I tried was to remove the lines and feed the result into an ocr engine. So I made a very simple filter using Pillow:


#!/usr/bin/python

from PIL import Image
import sys, os

def filter_lines(src):
    w,h = src.size

    stripes = []
    ss = {}

    for x in range(w):
        count = 0
        for y in range(h):
            if src.getpixel( (x,y) ) != (248, 255, 255):
                count += 1
        if count == h:
            stripes.append(x)

    for x in stripes:
        for y in range(h):
            src.putpixel( (x,y),  (248, 255, 255) )
    return src

if __name__ == '__main__':
    src = Image.open(sys.argv[1])
    region = filter_lines(src)
    region.save(sys.argv[2])

Now it looks better but after trying gocr and tesseract it still needs more work:

Just for kicks I decided to filter 100 images and overlap them, this is what I got:

That is interesting… I used this script (not the most efficient approach, but still..)


#!/usr/bin/python

from PIL import Image
import sys, os

dst = Image.new('RGB', (86, 21) )

w,h = 86, 21

for x in range(w):
    for y in range(h):
        dst.putpixel( (x,y),  (255, 255, 255) )

for idx in range(30):
    src = Image.open('filtradas/%i.bmp'%idx)

    for x in range(w):
        for y in range(h):
            if src.getpixel( (x,y) ) != (248, 255, 255):
                dst.putpixel( (x,y),  (255, 0, 0) )

dst.save('overlapeada.bmp')

With this piece of information I can focus my efforts on that area only.
That font, even distorted, looks quite familiar to me. And indeed it is, it’s Helvetica.
This makes the problem a lot easier.

I grabbed a bitmapped version of the same size and made a grid that shows were can a number land assuming 8×13 symbols:

This shows that there is a slightly overlap between digits.
I went for a brute force approach, dividing the captcha in cells and comparing each one with every digit on the font with a small amount of overlap between them.
The symbols are smaller than the cell, so for every one of them I build regions on the cell and assign a score for the number of pixels that are equal on both.
The one that has a highest score is (likely) the correct number.

This is really simple, event tough we do a lot of comparisons performs ok (the images are quite small), and without tunning we got about 30% success rate (the server also adds noise and more aggressive distortions from time to time).

Have a difficult or non conventional problem? Give us a call, we are like the A-Team of technology.

This is the complete algorithm (it’s in Spanish but shouldn’t be hard to follow), can also be found here: https://gist.github.com/pardo-bsso/a6ab7aa41bad3ca32e30


#!/usr/bin/python

from PIL import Image
import sys, os


imgpatrones = []
pixelpatrones = []

for idx in range(10):
    img = Image.open("patrones/%i.png" % idx).convert('RGB')
    imgpatrones.append(img)
    pixelpatrones.append( list(img.getdata()) )


def compara(region, patron):
    pixels = list(region.getdata())
    size = min(len(pixels), len(patron))

    res = 0.0
    for idx in range(size):
        if pixels[idx] == patron[idx]:
            res = res + 1

    return res / size


def elimina_lineas(src):
    cropeada = src.crop( (4, 1, 49, 19) )
    w,h = cropeada.size
    stripes = []

    for x in range(w):
        count = 0
        for y in range(h):
            if cropeada.getpixel( (x,y) ) != (248, 255, 255):
                count += 1

        if count == h:
            stripes.append(x)

    for x in stripes:
        for y in range(h):
            cropeada.putpixel( (x,y),  (248, 255, 255) )
            cropeada.putpixel( (x,y),  (255, 0, 0) )

    return cropeada

def crear_crops(src, celda):
    limites = range(38)
    xceldas = [0, 8, 16, 24, 32, 40]
    xoffsets = range(-3,4)
    yceldas = range(6)
    boxes = []
    crops = []

    x = xceldas[celda]
    x = [ (x+off) for off in xoffsets if (x+off) in limites ]

    for left in x:
        for top in yceldas:
            boxes.append( (left, top, left+8, top+13) )

    for box in boxes:
        crops.append( src.crop(box) )

    return crops

def compara_crops_con_patron(crops, patron):
    scores = []
    for crop in crops:
        scores.append( compara(crop, pixelpatrones[patron] ))
    return max(scores)

def decodifica_celda(src, celda):
    pesos = []
    crops = crear_crops(src, celda)

    for patron in range(10):
        pesos.append( compara_crops_con_patron(crops, patron) )

    return pesos.index( max(pesos) )

def decodifica(filename):
    original = Image.open(filename)
    src = elimina_lineas(original)
    res = []

    for celda in range(6):
        res.append( decodifica_celda(src, celda) )

    return ''.join( str(x) for x in res )

if __name__ == '__main__':
    print decodifica(sys.argv[1])

(trying to) Measure temperature

In a while I’ll need to characterize an oven and perhaps build a new one.
Just to start I have to apply a power step and measure how the internal temperature evolves.

In order to save time I searched my local distributors and bought a K type thermocouple with amplifier and cold junction compensation. It is not the most accurate but it is more than enough for now. There are a couple of ics available that give a direct digital output but the work needed to breadboard them and have a meaningful reading is beyond the scope at this stage.

This is what I bought:

Appears on many places as a “Grove High Temperature Sensor”. It sports an OPA333 precision opamp and a CJ432 adjusted to provide a 1.5V reference. The rest of the circuit is nothing special, except that the manufacturer called the thermistor “light”. It can be consulted here.

First ligths

While I have more capable hardware at hand I grabbed an Arduino Nano and the official library from https://github.com/Seeed-Studio/Grove_HighTemp_Sensor and lo and behold I had it streaming temperature to my terminal.

Let’s get graphical

I cooked a simple gui on python using Qt and Qwt while listening to Olivia Newton.
It is pretty barebones, only has facilities to export into csv, a couple of tracking cursors and gracefully handles device disconnections (say, I yank the cable). I expect to post process the data using QtiPlot or Kst.

Tweaking

One of the first things I noted was that the measured temperature jumped in big steps of about 2°C.
Using the default setup with a 5V “reference” and considering the amplifier gain every adc bit amounts to:

 Vbit = \frac{5000mV}{1023*54.16} = 0.09024 mV

Looking at the polynomial coefficients used by the library (ITS90) and taking a first order approximation one bit corresponds to a 2.26°C step and it grows bigger with the measured temperature as other terms start to influence the result. Even tough the output is low pass filtered at about 1.6KHz and it is averaged over 32 points there’s still noise.

Changing the reference to use the regulated 3.3V makes it about 1.5°C but even if it is more than enough for what I need it can be better.

With a couple of bits more I can achieve better resolution. Instead of using an external adc I took advantage of the inherent noise on the reference and output and chose to apply a 16 times oversample in order to have 12 bits out of the 10 bit adc. Application note AVR121 explains that nicely. Now I am limited (in theory…) to 0.37°C steps and I can average on top of that to further reduce variations.

The last source of error (besides not knowing for sure the “real” value of the references) is that the library assumes a fixed 350mV output, the circuit ideally floats the amplified thermocouple voltage around that. In order to measure it I added a small relay from my stash (TQ2SA-5V) to short the input. It is not meant to be used as a dry relay but does fine so far.
Upon startup it reads 348 mV; while a 2mV difference may not seem that big it turns out to be at least 185m°C. Anyway the main sources of error now are the thermocouple and adc reference.

Using WebKitGTK as the UI for GStreamer applications.

Lately I’ve been thinking a lot about how can I make nice and easily customizable interfaces for video applications. My idea of ‘nice’ is kind of orthogonal to what most of my expected user base will want, and by ‘easily customizable’ I don’t mean ‘go edit this glade file / json stage / etc’.

Clutter and MX are great to make good looking interfaces and like Gtk have something that resembles css to style stuff and can load an ui from an xml or json file. However, they will need sooner or later a mix developer and a designer. And unless you do something up front, the interface is tied to the backend process that does the heavy video work.

So, seeing all the good stuff we are doing with Caspa, the VideoEditor, WebVfx and our new magical synchronization framework I questioned:

Why instead of using Gtk, can’t I make my ui with html and all the fancy things that are already made?

And while we are at it I want process isolation, so if the ui crashes (or I want to launch more than one to see side by side different ui styles) the video processing does not stop. Of course, should I want more tightly coupling I can embed WebKit on my application and make a javascript bridge to avoid having to use something like websockets to interact.

One can always dream…

Then my muse appeared and commanded me to type. Thankfully, mine is not like the poor soul on “Blank Page” had.

So I type, and I type, and I type.

‘Till I made this: two GStreamer pipelines, outputting to auto audio and video sinks and also to a webkit process. Buffers travel thru shared memory, still they are copied more than I’d like to but that makes things a bit easier and helps decoupling the processes, so if one stalls the others don’t care (and anyway for most of the things I want to do I’ll need to make a few copies). Lucky me I can throw beefier hardware and play with more interesting things.

I expect to release this in a couple of weeks when it’s more stable and usable, as of today it tends to crash if you stare at it a bit harder.

“It’s an act of faith, baby”

Using WebKit to display video from a GStreamer application.

Using WebKit to display video from a GStreamer application.
Something free to whoever knows who the singer is without using image search.

 

 

That thrill.

Lately I’ve been working with a lot of technologies that are a bit outside of my comfort zone of hardware and low level stuff. Javascript, html-y things and node.js.  At first it was a tad difficult to wrap my head around all that asynchronism and things like hoisting and what is the value of ‘this’ here. And inheritance.

Then, out of a sudden I had an epiphany and I wrote a truly marvellous piece of software. Now I can use Backbone.io on the browser and the server, the same models and codebase on both without a single change. Models are automatically synchronized. On top of that there’s a redis transport so I can sync models between different node instances in real time without hitting the storage (mongo in this case). And the icing of the cake is that a python compatibility module is about to come.

The bragging tax.

This is no news but I don’t get people. I really don’t.

When a potential client approached me for a quote normally I gave two estimates. One if I am allowed to write something about it and another one (substantially higher) if they refuse.

I never said a word about open sourcing it, naming names or something like that.

Most of the time I explain, as politely as I can, that nobody is going to ‘steal’ they wonderful idea. And also that it is just a very simple variation on stuff found on textbooks and, the only original thing they did was to put a company logo on it.

It is such a shame that I honour my word in these cases.