Pages with tips about Debian.

Tips on using Python's datetime module

Python's datetime module is one of those bits of code that tend not to do what one would expect them to do.

I have come to adopt some extra usage guidelines in order to preserve my sanity:

  • Avoid using str(datetime_object) or isoformat to serialize a datetime: there is no function in the library that can parse all its possible outputs
  • datetime.strptime silently throws away all timezone information. If you look very closely, it even says so in its documentation
  • Timezones do not exist, all datetime objects have to be naive. aware means broken.
  • datetime objects must always contain UTC information
  • datetime.now() is never to be used. Always use datetime.utcnow()
  • Be careful with third-party Python modules: people have a dangerous tendency to use datetime.now()
  • If a conversion to some local time is needed, it shall be done via either some ugly thing like time.localtime(int(dt.strftime("%s"))) or via the pytz module
  • pytz must be used directly, and never via timezone aware datetime objects, because datetime objects fail in querying pytz:

That’s right, the datetime object created by a call to datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time” which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be. And without knowing it pytz seems to default to the first historical definition. Now, some of you fellow readers could insist on the problem going away simply by defaulting to the latest time zone definition. However, the problem would still persist: For example, Venezuela switched to GMT-04:30 on 9th December, 2007, causing the datetime objects representing dates either before, or after the change to become invalid.

From: http://blog.redinnovation.com/2008/06/30/relativity-of-time-shortcomings-in-python-datetime-and-workaround/

  • Timezone-aware datetime objects have other bugs: for example, they fail to compute Unix timestamps correctly. The following example shows two timezone-aware objects that represent the same instant but produce two different timestamps.
>>> import datetime as dt
>>> import pytz
>>> utc = pytz.timezone("UTC")
>>> italy = pytz.timezone("Europe/Rome")
>>> a = dt.datetime(2008, 7, 6, 5, 4, 3, tzinfo=utc)
>>> b = a.astimezone(italy)
>>> str(a)
'2008-07-06 05:04:03+00:00'
>>> a.strftime("%s")
'1215291843'
>>> str(b)
'2008-07-06 07:04:03+02:00'
>>> b.strftime("%s")
'1215299043'
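
The guidelines above can be sketched with the standard library alone. This is only a minimal sketch, not a definitive recipe: the names WIRE_FORMAT, serialize, parse, to_unix, from_unix and to_local_struct are hypothetical helpers made up for illustration, and the fixed format drops microseconds. It keeps every datetime naive and in UTC, serializes with one explicit format so that parsing is never ambiguous, and computes Unix timestamps with calendar.timegm, which does not depend on the machine's timezone the way strftime("%s") does.

```python
import calendar
import time
from datetime import datetime, timedelta

# Hypothetical fixed format: a single format means parsing is never ambiguous.
# Note that it drops microseconds.
WIRE_FORMAT = "%Y-%m-%d %H:%M:%S"
EPOCH = datetime(1970, 1, 1)


def serialize(dt):
    """Format a naive UTC datetime with the fixed format."""
    return dt.strftime(WIRE_FORMAT)


def parse(text):
    """Parse a string produced by serialize() back into a naive UTC datetime."""
    return datetime.strptime(text, WIRE_FORMAT)


def to_unix(dt):
    """Unix timestamp of a naive UTC datetime, whatever the local timezone is."""
    return calendar.timegm(dt.utctimetuple())


def from_unix(ts):
    """Naive UTC datetime for a Unix timestamp."""
    return EPOCH + timedelta(seconds=ts)


def to_local_struct(dt):
    """Broken-down local time, for display only, using the system timezone."""
    return time.localtime(to_unix(dt))


a = datetime(2008, 7, 6, 5, 4, 3)  # naive, understood to be UTC
print(serialize(a))                # 2008-07-06 05:04:03
print(to_unix(a))                  # 1215320643
```

The conversion to local time is kept at the very edge, in to_local_struct, so that nothing else in the program ever holds a non-UTC value.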
Posted Thu Jun 25 19:18:25 2009 Tags:

Conversation starting tool

  1. Run ept-cache info and make sure that the popcon and debtags data sources are enabled
  2. Run ept-cache search -t clean -s t- | less

This will show you all packages that are not shared libraries or application data, and will show you at the top of the list those packages that you have installed and other people are very unlikely to have installed.

You can use that list in case someone asks you "tell me of the cool packages that you use".

If you need to improvise a talk at some Linux event, that list could very easily get you started and talking for hours.

Posted Sat Jun 6 00:57:39 2009 Tags:

Send a fax from the laptop

My bank sent me a PDF form via e-mail. I needed to fill it in, then send it back via fax. Sending it back via e-mail would not work because it's not secure. The bank agrees that this is fantastically silly, but apparently this requirement is not their fault.

Step 1: send a fax with the laptop

  1. apt-get source sl-modem-daemon efax-gtk
  2. patch as instructed in the Debian BTS
  3. pbuilder-satisfydepends, debuild, dpkg -i
  4. slmodemd -c ITALY --alsa hw:0,6
  5. echo ATDmymobilenumber > /dev/ttySL0 and my mobile phone rang
  6. efax-gtk

Believe it or not, at this point I managed to successfully send a test fax.

Background: the laptop's modem is actually a sound card, and is ashamed to admit that it can also work as a modem:

$ lspci
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)

But the sound card actually has its own bus, which you can query with aplay -l:

$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: Intel [HDA Intel], device 0: ALC861VD Analog [ALC861VD Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: Intel [HDA Intel], device 6: Si3054 Modem [Si3054 Modem]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

Then you learn that sl-modem-daemon can drive it both on i386 and on amd64, but you get "period size 48 is not supported by playback (64)" when trying to dial. But then you find the patch to get rid of that, and it works.

The modem was the last device in the new laptop that I had not yet attempted to use. I can now claim that every single piece of hardware on my ASUS F9E-2P119E laptop can be made to work with Debian. Oh, yes!

Step 2: fill in the form

Much to my surprise, evince allowed me to just click in the form fields and type text. Even checkboxes worked. "Save a copy", however, did not retain the field contents: I had to print to file to get another PDF with the fields filled in. Update: this could be a limitation of that specific PDF, see this thread on the Adobe forums (thanks to Tomas Weber).

However, evince did not allow me to import an image with my signature and paste it in the right place. Inkscape, however, successfully managed to import the PDF as an editable vector drawing that I could change at will. Again, that was impressive.

From there, it was just a matter of pasting the signature in the right place, saving as PostScript, giving it to efax-gtk and phoning the bank to learn that, in fact, the fax was received and was perfectly readable.

Posted Sat Jun 6 00:57:39 2009 Tags:

Editing ChangeLog with vim

Turns out vim has a changelog.vim plugin to edit ChangeLog files.

With \o you start a new entry.

A look at /usr/share/vim/vim70/ftplugin/changelog.vim can show some more.

Posted Sat Jun 6 00:57:39 2009 Tags:

Introducing the Humongous Merged Packages File From Hell

Suppose you want to build some service that requires information about all packages in all architectures. Like, for example, the debtags web tagging interface.

If that is the case, then you may be interested in the Humongous Merged Packages File From Hell.

It contains a merge of all Packages and Sources files of etch, lenny, sid and experimental, for main, contrib and non-free, in all architectures.

The merge is done according to completely arbitrary criteria of mine:

  • One record per (package, version) couple found; multiple records per (package, version) are merged together
  • Size and Installed-Size are the average of all values found
  • MD5sum, SHA1, SHA256, Filename, Files, Directory, Checksums-Sha1, Checksums-Sha256, Binary are thrown away
  • Information (such as Homepage, Vcs* fields, Uploaders) found in the Sources file is repeated for every binary package
  • All other fields get as value a list of all the values found, with duplicates removed
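
As a rough illustration of the criteria above, here is a minimal sketch in Python. It is not the code that builds the real file: merge_records and the constants are hypothetical names, records are assumed to be already parsed into dictionaries, averages are rounded down to integers as an arbitrary choice, and the step that copies Sources information into every binary package is left out.

```python
# Fields thrown away during the merge
DROPPED = {
    "MD5sum", "SHA1", "SHA256", "Filename", "Files", "Directory",
    "Checksums-Sha1", "Checksums-Sha256", "Binary",
}
# Fields whose merged value is the average of all values found
AVERAGED = {"Size", "Installed-Size"}
# Fields that identify a record
KEY_FIELDS = ("Package", "Version")


def merge_records(records):
    """Merge parsed records (dicts of field -> string) per (package, version)."""
    groups = {}
    for rec in records:
        key = (rec["Package"], rec["Version"])
        groups.setdefault(key, []).append(rec)

    merged = {}
    for key, recs in groups.items():
        # Collect every value seen for every field that is kept
        values = {}
        for rec in recs:
            for field, value in rec.items():
                if field in DROPPED or field in KEY_FIELDS:
                    continue
                values.setdefault(field, []).append(value)

        out = {"Package": key[0], "Version": key[1]}
        for field, vals in values.items():
            if field in AVERAGED:
                nums = [int(v) for v in vals]
                out[field] = sum(nums) // len(nums)  # assumption: round down
            else:
                # List of all values found, duplicates removed, order kept
                unique = []
                for v in vals:
                    if v not in unique:
                        unique.append(v)
                out[field] = unique
        merged[key] = out
    return merged


example = merge_records([
    {"Package": "foo", "Version": "1.0", "Architecture": "i386",
     "Size": "100", "MD5sum": "aaa", "Section": "utils"},
    {"Package": "foo", "Version": "1.0", "Architecture": "amd64",
     "Size": "200", "MD5sum": "bbb", "Section": "utils"},
])
print(example[("foo", "1.0")])
```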

It's there in case anyone other than me may find it useful.

Posted Sat Jun 6 00:57:39 2009 Tags:

A couple of BTS utilities

I had the idea of creating a script that would tell me which bug numbers have unanswered mails.

The idea is simple: get me all the bug numbers for which the last message posted has been sent neither by me nor by any of the comaintainers.

Firstly, I needed a script that gives the e-mail address of the sender of the last message received for a given bug:

$ mkdir cache
$ ./getlast 370102
enrico@debian.org

Then I needed a script that gives the bug numbers currently in open state for a given source package:

$ ./getopen debtags
246678
277626
290457
[...]

Finally I needed to correlate the information from a maintainer database with the results of the previous scripts:

$ ./show-unanswered enrico@enricozini.org enrico@debian.org
Buffy comaintained by xxxxxxx@debian.org, xxxxxxxxx@xxxxxxxx.xx
  bug #269386 answered by me
  bug #293898 answered by me
  bug #294436 answered by comaintainer xxxxxxx@debian.org
  bug #300444 answered by me
  bug #372513 answered by comaintainer xxxxxxx@debian.org
  bug #383610 answered by comaintainer xxxxxxx@debian.org
  bug #384108 unanswered message from xxxxxxxx@debian.org
  bug #384116 unanswered message from xxxxxxxx@debian.org
[...]
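
The decision behind each line of that output is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the code of the scripts: classify and report are hypothetical names, and the addresses in the example are made up.

```python
def classify(last_sender, my_addresses, comaintainers):
    """Say who sent the last message of a bug, from my point of view."""
    sender = last_sender.strip().lower()
    if sender in {a.lower() for a in my_addresses}:
        return "answered by me"
    if sender in {a.lower() for a in comaintainers}:
        return "answered by comaintainer %s" % sender
    return "unanswered message from %s" % sender


def report(last_senders, my_addresses, comaintainers):
    """last_senders maps a bug number to the sender of its last message."""
    return [
        "bug #%d %s" % (bug, classify(last_senders[bug], my_addresses, comaintainers))
        for bug in sorted(last_senders)
    ]


# Made-up data, in the shape that the first two scripts would provide
lines = report(
    {1001: "me@example.org", 1002: "comaint@example.org", 1003: "user@example.net"},
    my_addresses=["me@example.org"],
    comaintainers=["comaint@example.org"],
)
for line in lines:
    print("  " + line)
```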

This could make some people happy.

You can get it with bzr branch http://people.debian.org/~enrico/bts-utils

Posted Sat Jun 6 00:57:39 2009 Tags:

libept 0.5.3 hit unstable

I prepared a new toy to play with at Debconf and uploaded it to unstable:

Package: libept-dev
Description: High-level library for managing Debian package information
 The library defines a very minimal framework in which many sources of data
 about Debian packages can be implemented and queried together.
 .
 The library includes four data sources:
 .
  * APT: access the APT database
  * Debtags: access the Debtags tag information
  * Popcon: access Popcon package scores
  * TextSearch: fast Xapian-based full text search on package description
 .
 This is the development library.

Package: ept-cache
Description: Commandline tool to search the package archive
 ept-cache is a simple commandline interface to the functions of libept.
 .
 It can currently search and display data from four sources:
 .
  * The APT database
  * The Debtags tag information
  * Popcon package scores
  * A fast Xapian-based full text index on package descriptions

Yes, this finally brings lots of very cool data sources about packages together.

Try this one:

# Check if all data providers are active and give instructions on how
# to activate those that aren't
ept-cache info

# Follow the instructions to activate everything

# Show all GUI image editors, sorted by popularity, in reverse order
ept-cache search image editor -t gui -s p-

If you have the Xapian data provider enabled, the results of a search are given in relevance order, the most relevant first. And also, searches are done with proper stemming, so if you look for image editor it will also find image editing, although it would score image editor higher.

It's also quite lovely to work with it in C++. I'll improvise here a few examples:

Print name and short description of every package

#include <ept/apt/apt.h>
#include <ept/apt/packagerecord.h>
#include <iostream>

using namespace std;
using namespace ept::apt;

void playWithApt()
{
    // Apt data source
    Apt apt;

    // Parser of package records
    PackageRecord rec;

    // Iterate all package records
    for (Apt::record_iterator i = apt.recordBegin();
        i != apt.recordEnd(); ++i) 
    {
        rec.scan(*i);
        cout << rec.package() << " - " << rec.shortDescription() << endl;
    }
}

Show all image editors

#include <ept/debtags/debtags.h>
#include <set>

using namespace ept::debtags;

void playWithDebtags()
{
    // Apt data source
    Apt apt;
    // Parser of package records
    PackageRecord rec;
    // Debtags data source
    Debtags debtags;

    if (!debtags.hasData())
        return;

    set<Tag> tags;
    tags.insert(debtags.vocabulary().tagByName("works-with::image:raster"));
    tags.insert(debtags.vocabulary().tagByName("use::editing"));
    tags.insert(debtags.vocabulary().tagByName("role::program"));
    set<string> results = debtags.getItemsHavingTags(tags);
    for (set<string>::const_iterator i = results.begin();
        i != results.end(); ++i)
    {
        rec.scan(apt.rawRecord(*i));
        cout << rec.package() << " - " << rec.shortDescription() << endl;
    }
}

Print all package names, sorted by popularity

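#include <ept/popcon/popcon.h>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

using namespace ept::popcon;

// STL comparator: orders package names by increasing popcon score
struct PopconCompare
{
    Popcon& popcon;
    PopconCompare(Popcon& popcon) : popcon(popcon) {}
    bool operator()(const std::string& pkg1, const std::string& pkg2) const
    {
        return popcon[pkg1] < popcon[pkg2];
    }
};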

void playWithPopcon()
{
    // Apt data source
    Apt apt;
    // Popcon data source
    Popcon popcon;
    vector<string> sorted;

    if (!popcon.hasData())
        return;

    // Get all package names in the vector
    copy(apt.begin(), apt.end(), back_inserter(sorted));

    // Sort it by popularity
    sort(sorted.begin(), sorted.end(), PopconCompare(popcon));

    // Print it out
    for (vector<string>::const_iterator i = sorted.begin();
        i != sorted.end(); ++i)
        cout << *i << endl;
}

Search for image viewer, but we don't want to view kernel images

#include <ept/textsearch/textsearch.h>
#include <xapian.h>

using namespace ept::textsearch;

void playWithXapian()
{
    TextSearch textsearch;
    vector<string> wanted;
    vector<string> notwanted;

    Xapian::Enquire enq(textsearch.db());
    // This will tokenise the search query into terms, stem them
    // and OR them together in a query.  Xapian will score higher
    // those results in which more ORed terms match, which is what
    // we want.
    Xapian::Query want = textsearch.makeOrQuery("image viewer");
    Xapian::Query dontWant = textsearch.makeOrQuery("linux kernel");

    enq.set_query(Xapian::Query(Xapian::Query::OP_AND_NOT, want, dontWant));

    // Print the top 20 results, with their relevance percentage
    Xapian::MSet matches = enq.get_mset(0, 20);
    for (Xapian::MSetIterator i = matches.begin(); i != matches.end(); ++i)
    {
        // The get_data() of a document is the package name
        cout << i.get_document().get_data() << " ("
             << i.get_percent() << "%)" << endl;
    }
}
Posted Sat Jun 6 00:57:39 2009 Tags:

Telling gpg not to use the key in the card

So, I created the subkeys for the OpenPGP card, and it works.

Now I'd like to upload some Debian packages, but the uploads fail because my new subkeys aren't yet known to the Debian keyring. I tried to push my subkeys to keyring.debian.org, but uploads afterwards were still rejected. Maybe it takes some time for propagation, maybe there's some other procedure to follow: I don't know. I didn't manage to figure out the procedure for getting a new subkey into the Debian keyring. I will replace this paragraph with proper details if I ever find out.

Now, failing to use the subkeys, I had to convince gpg to use my good old main key. The quick and dirty way was to make a backup of the keyring, delete the subkeys, sign and upload.

Seconds after hours of searching ended in the above crude hack, as normally happens, someone (Holger in this case) suggested the correct way to do it: use --default-key and append an exclamation mark to the key ID.

This was in the gpg manpage, but nowhere near the documentation of --default-key:

Note that you can append an exclamation mark (!) to key IDs or fingerprints.
This flag tells GnuPG to use the specified primary or secondary key and not
to try and calculate which primary or secondary key to use.

So, now I'm happy:

$ gpg --sign  --default-key '797ebfab!'

You need a passphrase to unlock the secret key for
user: [...]

$ gpg --sign
gpg: signatures created so far: xx

Please enter the PIN
[sigs done: xx]
Posted Sat Jun 6 00:57:39 2009 Tags:

mod_proxy_html and compressed pages

After putting it behind a reverse proxy, our phpmyadmin setup started showing empty pages.

After one morning of deep cursing, this is what turned out to be happening:

  1. the web server where phpmyadmin runs generates compressed html pages;
  2. mod_proxy_html tries to edit them, and "normalises" them, adding <html>...</html> headers around the compressed data;
  3. Firefox fails to decompress because there is extra garbage, and shows a blank page instead of complaining.

Other things to note:

  • firebug showed 4kb of data downloaded, but the page sources were empty;
  • using curl to get the pages would show the data.

How I found it:

  1. nc -l -p 444;
  2. configure mod_proxy to send connections to netcat instead of the web server;
  3. compare curl headers and Firefox headers;
  4. add the headers from Firefox to curl one by one, until the output breaks.

How to solve it:

  1. a2enmod deflate
  2. Replace SetOutputFilter proxy-html with SetOutputFilter INFLATE;proxy-html;DEFLATE so that mod_proxy_html always works on decompressed HTML: the page is decompressed first, then edited, then compressed again.
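
The failure and the fix can be reproduced in miniature with Python's gzip module. This is only a sketch of the mechanism, not of what Apache does internally: naive_filter, inflate_edit_deflate and can_decompress are hypothetical names, and the page content is made up.

```python
import gzip

page = b"<html><body>phpMyAdmin</body></html>"
compressed = gzip.compress(page)  # what the backend web server sends


def naive_filter(body):
    """Wrap the body in HTML tags without noticing that it is compressed."""
    return b"<html>" + body + b"</html>"


def inflate_edit_deflate(body):
    """Decompress first, edit the real HTML, then compress again."""
    html = gzip.decompress(body)
    html = html.replace(b"phpMyAdmin", b"proxied phpMyAdmin")
    return gzip.compress(html)


def can_decompress(body):
    """True if a client could decompress this body as gzip."""
    try:
        gzip.decompress(body)
        return True
    except (OSError, EOFError):
        return False


print(can_decompress(naive_filter(compressed)))          # False
print(can_decompress(inflate_edit_deflate(compressed)))  # True
```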
Posted Sat Jun 6 00:57:39 2009 Tags:

How to read the Freerunner's accelerometers

This code has been taken from moko_eightball by Jakob Westhoff: it just continuously prints the value of the three accelerometers.

#include <stdio.h>
#include <stdint.h>

void processInputEvents(FILE* in)
{
    int x = 0, y = 0, z = 0;
    while (1)
    {
        char padding[16];
        uint16_t type, code;
        int32_t value;

        // Skip the timestamp; stop on end of file or read error
        if (fread(padding, 1, 8, in) != 8) break;

        // Read the type
        if (fread(&type, 1, 2, in) != 2) break;

        // Read the code
        if (fread(&code, 1, 2, in) != 2) break;

        // Read the value
        if (fread(&value, 1, 4, in) != 4) break;

        switch( type )
        {
            case 0:
                switch( code )
                {
                    case 0:
                        fprintf(stdout, "x%d y%d z%d\n", x, y, z);
                        break;
                    default:
                        //warning( "Unknown code ( 0x%02x ) for type 0x%02x\n", code, type );
                        break;
                }
                break;
            case 2:
                switch ( code )
                {
                    case 0:
                        // Update to the new value
                        x = value;
                        break;
                    case 1:
                        // Update to the new value
                        y = value;
                        break;
                    case 2:
                        // Update to the new value
                        z = value;
                        break;
                    default:
                        //warning( "Unknown code ( 0x%02x ) for type 0x%02x\n", code, type );
                        break;
                }
                break;

            default:
                //warning( "Unknown type ( 0x%02x ) in accelerometer input stream\n", type );
                break;
        }


    }
}


int main()
{
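    FILE* in = fopen("/dev/input/event2", "r");
    if (in == NULL)
    {
        // The device may be missing or not readable by this user
        perror("/dev/input/event2");
        return 1;
    }
    processInputEvents(in);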
    fclose(in);
    return 0;
}
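
The same record layout can also be decoded with Python's struct module. This is a minimal sketch that mirrors the reads in the C code above: 8 bytes of timestamp, a 16-bit type, a 16-bit code and a 32-bit signed value. The little-endian byte order is an assumption, read_samples and EVENT are hypothetical names, and the example feeds it hand-made events rather than a real device.

```python
import io
import struct

# 8 skipped timestamp bytes, unsigned 16-bit type, unsigned 16-bit code,
# signed 32-bit value. "<" (little-endian, no padding) is an assumption.
EVENT = struct.Struct("<8xHHi")


def read_samples(stream):
    """Yield (x, y, z) every time a type 0, code 0 event closes a sample."""
    x = y = z = 0
    while True:
        data = stream.read(EVENT.size)
        if len(data) < EVENT.size:
            return  # end of stream or truncated event
        ev_type, code, value = EVENT.unpack(data)
        if ev_type == 2:
            # One axis has a new value
            if code == 0:
                x = value
            elif code == 1:
                y = value
            elif code == 2:
                z = value
        elif ev_type == 0 and code == 0:
            yield (x, y, z)


# Hand-made events standing in for the device: x, y, z, then a sync event
fake_device = io.BytesIO(b"".join([
    EVENT.pack(2, 0, 10),
    EVENT.pack(2, 1, -20),
    EVENT.pack(2, 2, 30),
    EVENT.pack(0, 0, 0),
]))
samples = list(read_samples(fake_device))
print(samples)  # [(10, -20, 30)]
```

On the phone itself one would pass open("/dev/input/event2", "rb") instead of the BytesIO object.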
Posted Sat Jun 6 00:57:39 2009 Tags: