Diversity Visa Lottery 2012 Data Analysis

Alright, well… the 2012 Diversity Visa Lottery selection results are now out!  If you’ve applied, you can check your status at the usual place.  You’d better remember your confirmation number, though, because otherwise you’re out of luck: supposedly nobody can tell you what it was, and checking online is the only way to know if you’ve been selected.

Update:  It looks like the lottery’s been invalidated due to a computer programming error!  New results on the 15th of June.

If you don’t know what the Diversity Visa lottery is, well… in short it’s 50,000 “free” “green cards” given to random people throughout the world who have a minimum of a high school education and/or who have worked at least two years out of the last five in an occupation requiring at least two years of training.  It’s supposedly (conspiracy theories aside) a “random” program with no restrictions on age, gender or nationality (apart from some quotas intended to give everyone an “even” chance).  People have a period of two months every year (usually Nov-Dec) to apply, and the winners of the lottery are selected a few months later (usually April-May).

Anyway, you can read more about it here if you’re interested.  It’s a supposedly fair program with equal chance for everyone, but with the quota limitations and some other societal issues it can easily be argued that it’s quite ugly and unfair by design.  Like all immigration issues, it’s quite controversial, but it’s a lottery at its core and it all comes down to pure chance in the end (again, conspiracy theories aside).

The main reason I’m making this post is that I was wondering whether they publish any data about the demographics of the lottery “winners”.  They do, at an obscure link I found through a search engine.  It’s an ugly PDF alright, but after half an hour of work I had cleaned it up fairly well.  Then I decided to take a deeper look at it.

Dumping all the data into a chart revealed some interesting… limitations and annoyances of Excel 2007.

First of all, I couldn’t figure out a good way to display such a large amount of overlapping data.  The only idea I could come up with was to make the chart extremely tall, but that didn’t really “deliver” the results I was looking for.  I also found a bug in Excel 2007: making the chart taller than 2000-ish “inches” (Excel’s own unit of measurement) caused the graphs to flatline.  I’m guessing some variable in the code overflows.

I also found a limitation in Excel where data labels couldn’t be applied to more than 1 series at a time – so I was faced with the problem of scrolling through 50+ data series and clicking a couple of ticks on each.  I wasn’t going to do that, so I took a look around, and my problem was solved surprisingly quickly through a macro I found/adapted to my problem in about 10 minutes.

Here it is (if anyone is interested):

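Sub Macro1()
    ' Macro1: label every series in the chart with its series name and category name
    Dim s As Series
    ActiveSheet.ChartObjects("Chart 3").Activate
    For Each s In ActiveChart.SeriesCollection
        s.HasDataLabels = True          ' make sure labels exist before configuring them
        s.DataLabels.ShowSeriesName = True
        s.DataLabels.ShowCategoryName = True
        s.DataLabels.Separator = vbLf   ' line break between the two names
    Next s
End Sub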

Anyway, in the end I did make the graph as tall as Excel allowed and applied labels to it.  Because of the height limitation, things were quite murky at the bottom, so I made a series of graphs “cutting off” the top of the graph at predefined values.
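For anyone who would rather do the “cutting off” step outside Excel, here is a minimal Python sketch of the idea.  It assumes the cut-off is equivalent to capping every value at a chosen maximum, the way a chart with a fixed y-axis maximum would show it.  The country names, the numbers and the cutoff values are all hypothetical placeholders, not figures from the published data or the values I actually used.

```python
# Hypothetical placeholder data: entrants per year for two made-up countries.
entrants_by_country = {
    "Country A": [120, 4500, 98000],
    "Country B": [15, 40, 60],
}

# Assumed cutoffs, chosen only for illustration.
CUTOFFS = [100, 5000]


def clip_series(values, cutoff):
    """Cap every value at the cutoff, like a chart whose y-axis stops there."""
    return [min(v, cutoff) for v in values]


def clipped_views(data, cutoffs):
    """Build one capped copy of the whole data set per cutoff."""
    return {
        cutoff: {name: clip_series(vals, cutoff) for name, vals in data.items()}
        for cutoff in cutoffs
    }


views = clipped_views(entrants_by_country, CUTOFFS)
for cutoff, series in views.items():
    print(cutoff, series)
```

Each entry in `views` could then be charted separately, so the small countries stay readable in the low-cutoff view while the big ones simply run off the top.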

As I was posting this article, I was going to insert the images here for you to look at, but I ran into yet another limitation, this time of both WordPress 3 and Firefox 4: processing and showing extremely tall images.  Surprisingly enough, Internet Explorer is capable of showing them.  I’m not sure about other browsers, but Firefox simply displays “The image “https://hristo.gangov.com/DV2012Stuff/Full.png” cannot be displayed because it contains errors.”

So, fun stuff.  You can download the images for desktop viewing with something like IrfanView or view them with IE here.  No “proper” embedding, because WordPress can’t process them either – it gave me some kind of memory overflow error on one of its scripts.

Besides failing to figure out how to properly display the “trend over time” for all countries at once without it looking like a bunch of gibberish, I also wanted to visualize the data for just 2012 (the most recent year) more clearly.  I chose bar graphs, sorted alphabetically and by number of applications (descending).  Luckily, those images are smaller.  Here they are:

I also wanted to have some more fun by looking up the population for every country and charting the percentage of population who have applied for the DV lottery for all countries.  I think I got the population numbers from the US Census website.  Here was the result:

Interesting social-political-economic insight there.  I think this might be an interesting idea for a book or a documentary.
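The percentage-of-population figure is just applications divided by population, times 100.  Here is a small Python sketch of that calculation, with a ranking step added on top.  The function names are made up for illustration and the numbers are hypothetical placeholders, not values from the DV data or the census figures.

```python
def percent_applied(applications, population):
    """Share of a country's population that applied, as a percentage."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * applications / population


def rank_by_percent(applications, populations):
    """Return (country, percent) pairs, highest percentage first.

    Countries without a population figure are skipped rather than guessed.
    """
    rows = [
        (country, percent_applied(count, populations[country]))
        for country, count in applications.items()
        if country in populations
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical placeholder numbers, for illustration only.
applications = {"Country A": 500, "Country B": 3000, "Country C": 10}
populations = {"Country A": 100000, "Country B": 200000}

for country, pct in rank_by_percent(applications, populations):
    print(f"{country}: {pct:.2f}%")
```

Skipping countries with no population figure keeps the chart honest: a missing number is left out instead of being shown as zero.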

In any case, the whole project was fun.  Here’s the Excel file, if you’d like to play with the data yourself.

2 comments

  1. Edossa says:

    I and my families are win america dv lotory and live, learn and work in America

  2. Mixbit says:

    Thank you very much for the invitation :). Best wishes.
    PS: How are you? I am from France 🙂
