Today

Today it was a lazy day.

As usual I attended some meetings, worked on reports and negotiated dates for one of the projects I manage.

I have also checked some expenses from the department using a new report which I still don’t know if it is an improved version or a worsened version of the previous report. Mixed feelings…

I have also played for a while with my new toy: a Galaxy Note. Expensive in Brazil, but incredibly useful.

image

And here it was the sky I had all day long while working from home.

Creating an application for checking lottery numbers – Design, Modelling and Code creation

In Brazil we have a lottery that pays millions of reais / dollars twice a week.  This value is paid for the winner that got all 6 numbers (out of 60) correctly.  We can buy a ticket with the numbers we choose (or random numbers) for up to 8 draws, what means that we can win at any draw within the next 30 days or so.

But they also pay money if you got 4 or 5 numbers drawn.  It can be a tedious process to check if you got any number right, since there are 15 combinations for 4 numbers, 6 combinations for 5 numbers and a single combination for all the 6 numbers, with a total of 22 paying combinations.

In the first version of my code, I am checking all those 22 combinations automatically for a user that plays only 6 numbers (yes, we can play more numbers, up to 15, with increasing costs).  But how to be sure that this is done in an efficient manner?

I started with the process that I usually start any application:

  • What I want to be done?
  • What interfaces will I have to work with the user?  (e.g. command line, graphical interface, web interface, etc.)
  • What interfaces will I have to work with the data? (e.g. one single app accessing the data, more than one app, web app, external access, just internal access, etc.)
  • Where do I want things done? (e.g. and the frontend code, at backend code, at database, in the server, etc.)
  • etc.

This makes me think and define some simple rules that will be the foundations of my API and that will guide my code.

Most of the time, I do not have answers to all those questions and need to play on the safe side.  The usual answers — and something I have to be very careful with since it is already a bias — are:

  • <something>
  • Web Interface
  • A web app that might have external parties consuming the informaiton from it
  • Database, browser, backend — in that order.

With that in mind, I started the development of the database based on the data that the lottery provides and that I thought would be useful for some statistical analysis.

This would be analyzing the customer’s requirements (myself, in this case) and modelling the database to be able to represent that information in a manner that will be able to answer the basic questions from the customer.

The second step would be analyzing indices and other database specific questions that will allow me to deal with the volume of data, the type of queries that will be performed, the number of concurrent accesses, the type of concurrent queries, etc.  It is the work of a DBA to plan that.

The third step is creating the code.  The first question is: where?

As I wanted to try it out fast, and the situation is not all that complicated, I could code it inside the database itself (PostgreSQL allows me to code very complex things inside of it, in many different programming languages, so this is one of the reasons why I must be careful to not have everything inside the database itself).  The next question becomes: in which language?

The fastest language is plain SQL, then plpgsql, and then plpython (that is Python embedded on the database). My first thought, out of convenience was Python.  Then, I do not needed a full set of it to justify requiring plpython to run a simple query, so I moved to plpgsql.  But, this meant that I was not generating the combinations dynamically, so why not trying to go to plain SQL?  And that is what I ended up doing.

The following code helped me to get all possible combinations for 4 and 5 numbers out of 6 possible numbers:

>>> a=itertools.combinations([1, 2, 3, 4, 5, 6], 4)
>>> for x in a:
...    print x
...

and

>>> a=itertools.combinations([1, 2, 3, 4, 5, 6], 5)
>>> for x in a:
...    print x
...

This I copied onto my SQL code, applied some magic text replacement commands and ended up with my SQL function in PostgreSQL:

CREATE OR REPLACE FUNCTION f_v_is_possible_result(
       i_dezena1 INTEGER,
       i_dezena2 INTEGER,
       i_dezena3 INTEGER,
       i_dezena4 INTEGER,
       i_dezena5 INTEGER,
       i_dezena6 INTEGER)
       RETURNS SETOF INTEGER
AS $_$
SELECT contest_number
       FROM megasena
       WHERE
-- Checking four numbers
        ARRAY[$1, $2, $3, $4] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
-- Checking five numbers
        ARRAY[$1, $2, $3, $4, $5] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $4, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $3, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $2, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$1, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
        ARRAY[$2, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6] OR
-- Checking six numbers
        ARRAY[$1, $2, $3, $4, $5, $6] <@ ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6]
       ORDER BY contest_number;
$_$ LANGUAGE SQL;

In this particular case, I exploited some advantages from PostgreSQL and worked with array comparisons to check any of the possible combinations, since the number of conditions was small.  If I had been working with all 15 numbers that one can play, then I would have to dynamically generate that in the code.

The “<@” operator working with arrays means “is contained in”, so I check that the combinations from the left side are contained in the array of drawn numbers.  “dezena1″ to “dezena6″ are the drawn numbers — their column names, actually –, already ordered in ascending order (even though with arrays on PostgreSQL this ordering does not matter, the only requirement is that all elements from the left side are available on the right side to be a successful match).

This approach, as I mentioned before, would not work with all the 15 numbers since there are 1365 combinations for 4 numbers out of 15, 3003 combinations for 5 numbers out of 15 and 5005 combinations for 6 numbers out of 15, adding up to 9373 possible combinations.  Even with magic text replacement, this would mean a lot of work and a huge code for a SQL function.  Not the best idea at all…  In this case, I would go with a Python function.

With that being done, now I can move on to the next step: the web interface building.

As with the latest applications, I will be using ExtJS and grids.  The numbers from my bet will be the filters and this function above will help me limiting the number of lines I will be showing on the grid.

Around version 2 or 3, I will be changing it to allow up to 15 numbers to be played, with a minimum of 6 numbers.  But, then, it is another project, with another design phase, decision making, etc.  I will try remembering to post about it here when that time comes (and if I get there, since I never play more than 6 numbers anyway…).

Why this post?  Because the process is the same for every application and because this is a somewhat common problem.  Using arrays and their special operators shows up as an elegant solution to a problem.

Happy hacking and wish me luck with my bets. :-)

 

UPDATE1:. I won a lesser prize.  Found out with the above code. About US$ 300, last week.

UPDATE2:. After a series of interactions with a colleague, I changed the above code to the one below:

CREATE OR REPLACE FUNCTION f_v_is_possible_result(
       i_dezena1 INTEGER,
       i_dezena2 INTEGER,
       i_dezena3 INTEGER,
       i_dezena4 INTEGER,
       i_dezena5 INTEGER,
       i_dezena6 INTEGER,
       i_matches INTEGER = 4,
       OUT contest_number INTEGER,
       OUT numbers INTEGER[],
       OUT quantity INTEGER)
       RETURNS SETOF RECORD
AS $_$
WITH numbers AS (SELECT contest_number, ARRAY_INTERSECT (ARRAY[$1, $2, $3, $4, $5, $6],
                                                   ARRAY[dezena1, dezena2, dezena3, dezena4, dezena5, dezena6]) AS intersect
                 FROM megasena 
                 ORDER BY contest_number)
     SELECT numbers.contest_number,
            numbers.intersect AS numbers,
            ARRAY_LENGTH(numbers.intersect, 1) AS quantity
     FROM numbers
     WHERE ARRAY_LENGTH(numbers.intersect, 1) >= $7;
$_$ STABLE LANGUAGE SQL;

The “WITH” clause will allow me to process the information more than once while preventing the query from being repeated at every time I use it.

The “i_matches” parameter allow me to show cases where there was no prize won or filter out specific types of prizes (4, 5 or 6 numbers matching).

The new code also shows me which numbers matched.  Here’s the new output:

megasena=# select * FROM f_v_is_possible_result(03, 11, 20, 49, 51, 59);
 contest_number |   numbers    | quantity 
----------------+--------------+------------
           1335 | {3,20,51,49} |          4
(1 row)

megasena=# select * FROM f_v_is_possible_result(03, 11, 20, 49, 51, 59, 3);
 contest_number |   numbers    | quantity 
----------------+--------------+------------
             56 | {20,51,59}   |          3
            217 | {3,51,49}    |          3
            296 | {3,20,51}    |          3
            548 | {51,49,59}   |          3
            658 | {11,3,20}    |          3
            857 | {3,49,59}    |          3
           1034 | {3,20,49}    |          3
           1035 | {11,51,59}   |          3
           1057 | {3,49,59}    |          3
           1192 | {20,49,59}   |          3
           1247 | {11,3,51}    |          3
           1335 | {3,20,51,49} |          4
(12 rows)

megasena=#

My prize was on contest number 1335.

Stakeholders Influence

When categorizing the stakeholders one thing that we have to do is write down the type of influence they have on the project: positive or negative.

But what to do when we don’t know what is it? What if we are tempted to say it is “neutral”, in the sense of “this stakeholder doesn’t care if the project is successful as well as if it is unsuccessful”?

Discussing this today, with a fellow PMP at work, I got to some conclusions / decisions.

The first one, and the most obvious one, is that there is no rule on what types of classifications we can have. As with most things, the Project Manager is free to customize the project as needed. So, adding a “neutral” influence is perfectly possible.

The second conclusion is that there is no “neutral” stakeholder. There is always a small tendency on being pro or against the project. What can be done, then, is adding the “neutral” influence type and using it as a marker for reviewing this classification as soon as a better judgment of the stakeholder’s influence type is known.

The third thing is that not knowing the correct classification is a project risk. That stakeholder can be working against the project and a misclassification of him / her can lead to ignoring key factors as well as loosing opportunities to neutralize / mitigate the bad influence from the stakeholder.

The correct classification and identification of stakeholders is one of the factors that can help with project management and lead to the success of the project.

Project Management Certification

It is being a tough week since I have started studying to get my PMP – Project Management Professional – certification.

A lot of reading, studying, training, etc. going on. I am thankful that the company I work for has access to all the resources I need to study and also has many certified professionals that I can refer to and that have reviewed some documents and training courses I am using.

I have laid out a plan with some colleagues and we are doing weekly meetings that take from 1h to 1h30 to discuss what we have learned from the week’s targets.

Since we’re on the first week, our target was taking two training courses and reading the first two chapters of the PMBOK (Project Management Body of Knowledge).

I am taking notes and using a mindmapper software to map my learnings. So, I expect to have a lot of things written here in the next months all related to my studies.

And, of course, since these are a personal effort, I still have all my usual activities going on at work.

There are some changes and I will move from full responsibility on Communications Systems and Services to Enterprise Technologies, on the platforms area, assuming the global leadership for IBM’s System I (former iSeries) platform.

It will be an interesting challenge since I don’t know anything about this technology and it will then become my first managerial experience where I don’t fully understand the technology I am managing.

I will have to rely a lot on our technical specialists, self study and documentation on the technology and processes related to it.

I also have to work on globalizing this service and keep the globalization of one of my former projects going on. I was – and I still am – responsible for laying out the foundations for it to happen and now it will be my responsibility to keep it moving, even if I am not managing the teams that work with that technology anymore.

VERY interesting challenges coming up and, of course, I will keep some news about how I am dealing with them here.

Stay tuned and enjoy :-)

And, if you want to share some information on any of the topics I am writing about, feel free to do so.

Updates, News and New Implementations

Even though I have been quiet for a while, things are a lot busy.

I have been spending a lot of time on preparing myself to get a PMI certication and become a PMP. I had classes with the PMBOK while at College, I participated on several training courses after that, I have read many books… and even then, I don’t feel ready to apply for the certification.

It is part of my daily tasks, but what makes me believe I am not ready are the subtleties of things and some of the questions I remember from College, when studying that and thinking about applying for the certification.

On the IT field, I have been playing more with org-mode and I am almost ready with making it “talk” to my Google Calendar. Here are some hints for that:

http://emacsworld.blogspot.com/2011/05/integrating-emacs-org-mode-and-google.html

http://thread.gmane.org/gmane.emacs.help/80824

http://code.google.com/p/googlecl/

And, I am also happy with finally being able to enable the transparent background on Emacs. I have added this code to my .emacs (actually, I use org-mode to manage it as well, so the process was a bit different, but the idea is the same):

(defun djcb-opacity-modify (&optional dec)
  "modify the transparency of the emacs frame; if DEC is t, decrease the transparency, otherwise increase it in 1%-steps"
  (let* ((alpha-or-nil (frame-parameter nil 'alpha)) ; nil before setting           (oldalpha (if alpha-or-nil alpha-or-nil 100))
          (newalpha (if dec (- oldalpha 1) (+ oldalpha 1))))
    (when (and (>= newalpha frame-alpha-lower-limit) (<= newalpha 100))
      (modify-frame-parameters nil (list (cons 'alpha newalpha))))))

 ;; C-8 will increase opacity (== decrease transparency)  ;; C-9 will decrease opacity (== increase transparency  ;; C-0 will returns the state to normal (global-set-key (kbd "C-8") '(lambda()(interactive)(djcb-opacity-modify)))
(global-set-key (kbd "C-9") '(lambda()(interactive)(djcb-opacity-modify t)))
(global-set-key (kbd "C-0") '(lambda()(interactive)
                               (modify-frame-parameters nil `((alpha . 100)))))

The source for that is available at http://emacs-fu.blogspot.com/2009/02/transparent-emacs.html

The original code had the steps move in steps of 10%. I have changed the code a little to make it move in 1% steps.

And, finally, this weekend it was Valentine’s Day here in Brazil. It was really cold – with some ice on the glass ceiling from my neighbor – what contributed a lot to the romance and for my wife and I spending the time together.

Every time I am able to spend some time like this, I stop and think about how many times I am doing that and if it is not something that I need to improve. It is hard to manage all the work tasks, studies, and still get some free time to spend without worrying, but I am sure that my Number One priority should be my family.

I think that these updates were not enough for the time I didn’t post, but I am trying. This blog already has more – relevant – posts than anything I was able to keep before (maybe Multiply had more posts, but it was a different time in my life).

Using matplotlib / pylab without a DISPLAY

I’ve been having issues with a web system I coded for a while now and couldn’t find a solution that worked to my satisfaction to eliminate the need of a Xvnc running to provide a virtual DISPLAY to run some commands to plot data using pylab.

Finally, today, I found the solution here and I want to share it.

import matplotlib
matplotlib.use('Agg') # Must be before importing matplotlib.pyplot or pylab!
import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot(range(10))
fig.savefig('temp.png')

The solution, here, is the Agg option as the “device” to be used to plot files.

Solution in place, and I can now eliminate the requirement for Xvnc from my code and all the issues that came with it are also automatically gone.

So, here’s the code for Pylab:

import matplotlib
matplotlib.use('Agg') # Must be before importing matplotlib.pyplot or pylab!
import pylab

pylab.plot(range(10))
pylab.savefig('temp.png')

Enjoy!