Today

Today it was a lazy day.

As usual I attended some meetings, worked on reports and negotiated dates for one of the projects I manage.

I have also checked some expenses from the department using a new report which I still don’t know if it is an improved version or a worsened version of the previous report. Mixed feelings…

I have also played for a while with my new toy: a Galaxy Note. Expensive in Brazil, but incredibly useful.

image

And here it was the sky I had all day long while working from home.

Creating an application for checking lottery numbers – Design, Modelling and Code creation

In Brazil we have a lottery that pays millions of reais / dollars twice a week.  This value is paid for the winner that got all 6 numbers (out of 60) correctly.  We can buy a ticket with the numbers we choose (or random numbers) for up to 8 draws, what means that we can win at any draw within the next 30 days or so.

But they also pay money if you got 4 or 5 numbers drawn.  It can be a tedious process to check if you got any number right, since there are 15 combinations for 4 numbers, 6 combinations for 5 numbers and a single combination for all the 6 numbers, with a total of 22 paying combinations.

In the first version of my code, I am checking all those 22 combinations automatically for a user that plays only 6 numbers (yes, we can play more numbers, up to 15, with increasing costs).  But how to be sure that this is done in an efficient manner?

I started with the process that I usually start any application:

  • What I want to be done?
  • What interfaces will I have to work with the user?  (e.g. command line, graphical interface, web interface, etc.)
  • What interfaces will I have to work with the data? (e.g. one single app accessing the data, more than one app, web app, external access, just internal access, etc.)
  • Where do I want things done? (e.g. and the frontend code, at backend code, at database, in the server, etc.)
  • etc.

This makes me think and define some simple rules that will be the foundations of my API and that will guide my code.

Most of the time, I do not have answers to all those questions and need to play on the safe side.  The usual answers — and something I have to be very careful with since it is already a bias — are:

  • <something>
  • Web Interface
  • A web app that might have external parties consuming the informaiton from it
  • Database, browser, backend — in that order.

With that in mind, I started the development of the database based on the data that the lottery provides and that I thought would be useful for some statistical analysis.

This would be analyzing the customer’s requirements (myself, in this case) and modelling the database to be able to represent that information in a manner that will be able to answer the basic questions from the customer.

The second step would be analyzing indices and other database specific questions that will allow me to deal with the volume of data, the type of queries that will be performed, the number of concurrent accesses, the type of concurrent queries, etc.  It is the work of a DBA to plan that.

The third step is creating the code.  The first question is: where?

As I wanted to try it out fast, and the situation is not all that complicated, I could code it inside the database itself (PostgreSQL allows me to code very complex things inside of it, in many different programming languages, so this is one of the reasons why I must be careful to not have everything inside the database itself).  The next question becomes: in which language?

The fastest language is plain SQL, then plpgsql, and then plpython (that is Python embedded on the database). My first thought, out of convenience was Python.  Then, I do not needed a full set of it to justify requiring plpython to run a simple query, so I moved to plpgsql.  But, this meant that I was not generating the combinations dynamically, so why not trying to go to plain SQL?  And that is what I ended up doing.

The following code helped me to get all possible combinations for 4 and 5 numbers out of 6 possible numbers:

>>> a=itertools.combinations([1, 2, 3, 4, 5, 6], 4)
>>> for x in a:
...    print x
...

and

>>> a=itertools.combinations([1, 2, 3, 4, 5, 6], 5)
>>> for x in a:
...    print x
...

This I copied onto my SQL code, applied some magic text replacement commands and ended up with my SQL function in PostgreSQL:

CREATE OR REPLACE FUNCTION f_v_is_possible_result(
       i_dezena1 INTEGER,
       i_dezena2 INTEGER,
       i_dezena3 INTEGER,
       i_dezena4 INTEGER,
       i_dezena5 INTEGER,
       i_dezena6 INTEGER)
       RETURNS SETOF INTEGER
AS $_$
SELECT contest_number
       FROM megasena
       WHERE
-- Checking four numbers
        ARRAY[$1, $2, $3, $4] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
-- Checking five numbers
        ARRAY[$1, $2, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
-- Checking six numbers
        ARRAY[$1, $2, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6]
       ORDER BY contest_number;
$_$ LANGUAGE SQL;

In this particular case, I exploited some advantages from PostgreSQL and worked with array comparisons to check any of the possible combinations, since the number of conditions was small.  If I had been working with all 15 numbers that one can play, then I would have to dynamically generate that in the code.

The “<@” operator working with arrays means “is contained in”, so I check that the combinations from the left side are contained in the array of drawn numbers.  “dezena1” to “dezena6” are the drawn numbers — their column names, actually –, already ordered in ascending order (even though with arrays on PostgreSQL this ordering does not matter, the only requirement is that all elements from the left side are available on the right side to be a successful match).

This approach, as I mentioned before, would not work with all the 15 numbers since there are 1365 combinations for 4 numbers out of 15, 3003 combinations for 5 numbers out of 15 and 5005 combinations for 6 numbers out of 15, adding up to 9373 possible combinations.  Even with magic text replacement, this would mean a lot of work and a huge code for a SQL function.  Not the best idea at all…  In this case, I would go with a Python function.

With that being done, now I can move on to the next step: the web interface building.

As with the latest applications, I will be using ExtJS and grids.  The numbers from my bet will be the filters and this function above will help me limiting the number of lines I will be showing on the grid.

Around version 2 or 3, I will be changing it to allow up to 15 numbers to be played, with a minimum of 6 numbers.  But, then, it is another project, with another design phase, decision making, etc.  I will try remembering to post about it here when that time comes (and if I get there, since I never play more than 6 numbers anyway…).

Why this post?  Because the process is the same for every application and because this is a somewhat common problem.  Using arrays and their special operators shows up as an elegant solution to a problem.

Happy hacking and wish me luck with my bets. 🙂

 

UPDATE1:. I won a lesser prize.  Found out with the above code. About US$ 300, last week.

UPDATE2:. After a series of interactions with a colleague, I changed the above code to the one below:

CREATE OR REPLACE FUNCTION f_v_is_possible_result(
       i_dezena1 INTEGER,
       i_dezena2 INTEGER,
       i_dezena3 INTEGER,
       i_dezena4 INTEGER,
       i_dezena5 INTEGER,
       i_dezena6 INTEGER,
       i_matches INTEGER = 4,
       OUT contest_number INTEGER,
       OUT numbers INTEGER[],
       OUT quantity INTEGER)
       RETURNS SETOF RECORD
AS $_$
WITH numbers AS (SELECT contest_number, ARRAY_INTERSECT (ARRAY[$1, $2, $3, $4, $5, $6],
                                                   ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6]) AS intersect
                 FROM megasena 
                 ORDER BY contest_number)
     SELECT numbers.contest_number,
            numbers.intersect AS numbers,
            ARRAY_LENGTH(numbers.intersect, 1) AS quantity
     FROM numbers
     WHERE ARRAY_LENGTH(numbers.intersect, 1) >= $7;
$_$ STABLE LANGUAGE SQL;

The “WITH” clause will allow me to process the information more than once while preventing the query from being repeated at every time I use it.

The “i_matches” parameter allow me to show cases where there was no prize won or filter out specific types of prizes (4, 5 or 6 numbers matching).

The new code also shows me which numbers matched.  Here’s the new output:

megasena=# select * FROM f_v_is_possible_result(03, 11, 20, 49, 51, 59);
 contest_number |   numbers    | quantity 
----------------+--------------+------------
           1335 | {3,20,51,49} |          4
(1 row)

megasena=# select * FROM f_v_is_possible_result(03, 11, 20, 49, 51, 59, 3);
 contest_number |   numbers    | quantity 
----------------+--------------+------------
             56 | {20,51,59}   |          3
            217 | {3,51,49}    |          3
            296 | {3,20,51}    |          3
            548 | {51,49,59}   |          3
            658 | {11,3,20}    |          3
            857 | {3,49,59}    |          3
           1034 | {3,20,49}    |          3
           1035 | {11,51,59}   |          3
           1057 | {3,49,59}    |          3
           1192 | {20,49,59}   |          3
           1247 | {11,3,51}    |          3
           1335 | {3,20,51,49} |          4
(12 rows)

megasena=#

My prize was on contest number 1335.

New beer glasses

I bought two new beer glasses yesterday.

From two classic beers: Erdinger (their Urweisse beer was the one I wrote about a few minutes ago) and Paulaner.

Both glasses are tall, beautiful and very different from the ones we use here in Brazil.

Every country has its tradition and here we’re used to small glasses and sharing our beers.  In England, for example, people are used to pints and drinking their beers on their own.

Nice acquisitions to what might start as a new kind of hobby or collection…  I haven’t decided yet. 🙂

Anyway, drinking my regular beer on them made it taste better.  I credit that to the shape of those glasses…

Erdinger Urweisse

Erdinger UrweisseJust bought this beer yesterday…  Delicious.

A bit stronger than most that I’m used to drink and it is great. 🙂

Erdinger makes great beers and if you haven’t tried one, you should taste it.

In fact, even though it is stronger, it is easier to drink than most Brazilian beers.  (And this includes my favorite one: Bohemia.)

There is a bad thing, though: it is over too soon, so one bottle is not enough 🙂

The cost, here in Brazil, is also a problem: it costs from 3 to 4 times more than a premium beer.  It is an imported beer, though, but you drink one a day, not much more than that.

KDE 4.2 on OpenSuSE 11.0

Continuing with updates from Factory, I’m now using KDE 4.2.

Several things have improved, the interface is as clean as 4.1’s was and I feel that some things take less resources.

It is worth a change for anyone running KDE 4.x to update to KDE 4.2.  But be aware that some configurations might be lost, so backup your settings before the change, just in case. 🙂

On what to do prior to contact another company’s tech support

I work for a company that does off shore for other companies from the same group.  There are two types of activities that are done worldwide (100+ countries): internal support and support to external business partners (mostly clients, some suppliers, etc.).

As I work with networks and data transmission, and I lead a team that interfaces with external business partners almost 100% of the time, it is common to get some support requests for problems that really aren’t our problems.

It is something that is annoying because when there is a problem — specially in production environment — we need to stop working on everything else and solve that.  Out of those, 99.999% of the times the problem isn’t on our end: we work with thousands of business partners and have the structure in place to work with them: servers, network, firewalls, etc.  Everything is working fine, debugged and stable — there are some destabilizing changes from time to time, of course, but those are rare and very well tested.  Then, there comes the question from the external business partner: “We can’t connect to your server.  What did you change?  We need to submit that data today!”

Last time this happened, I had a meeting with an Indian company from 1:30 AM to 3 AM here…  And what was the issue?  They could manually submit data, but their automation couldn’t.

Anyone that can solve a hard problem — such as adding 2+2 — can see that there is something wrong with that and that something is probably at their end…  But no, this didn’t happen.  So, we scheduled a meeting and after one hour asking questions and asking about changes, they said that they changed the server: manual process (working) was from one host, automated process from another.  When asked to manually test from the new host, it didn’t work as well.  The problem?  They changed from one server to another and didn’t update their own firewall policies…  And then, it is our fault… 🙂

Second time there even was a paging sent to our team: data corrupted.  Oh well, the customer had to fix it and not us as we don’t change data in any way.  Even within our own company, business first assumes it is our fault and then we have to insist to have them go to the customer.

Now, back to the subject:

  1. Make a change procedure, schedule times, analyze impact, etc.
  2. Keep a backup of all of your changes.  Never demise the server after the change, you might need something from it back or reverting back to the old server in case something goes wrong
  3. Check everything that has to be changed (servers, configurations, user profiles, firewalls, IP addresses, etc.)
  4. Have a “recent changes” log and don’t assume that one change is irrelevant, specially when talking to other companies trying to solve a problem
  5. Check your company first: go to your IT team or contractor, ask what has been changed, why it worked before and what is exactly the problem you’re having
  6. If it is a connectivity problem, check that you can connect to other sites that provide a similar service / interface, check that you can get out of your network to the Internet from the same host that will be used and that is presenting the failure
  7. Supply the support team a traceroute / tracert output from the failing host to their server

Give the support team from both companies enough information to solve your problem.  Guessing is too time consuming, error prone and isn’t something specialists like doing without enough information to discard the most obvious problems first.

“Help us to help you”, I always say (no, I didn’t create this expression, but I really enjoyed it!).