Archive for February, 2010

Chip? No, Thanks

An awesome example of how banks and the payment industry try to screw customers.

In the Netherlands, the payment cards issued by banks usually have both a magnetic strip and a smart card chip. However, stores here gladly accept the magnetic strip, and I’ve never been asked to use the chip. I only had to use the chip in Italy, where angry clerks would say “no, turn the card around, can’t you see the chip?”. Apparently, the chip is going big everywhere else in Europe, and banks are marketing the EMV system – the technology behind chip payments – as a safe way to prevent fraud.

They don’t tell you, however, that the EMV system has a serious design flaw which makes the magnetic strip a safer alternative. And they don’t tell you that if you are defrauded, you lose the money.

My colleague Radi, in fact, just wrote a post about this paper that shows how a stolen chip card can be used in a successful transaction without knowing the PIN. The worst part is that with this method the bank thinks that the PIN was used during the transaction, and if the bank thinks you used the PIN, then, legally, you are responsible for the fraud.

The whole issue behind this flaw is that the PIN verification is left to the card, and the bank never sees the PIN. This is different from the magnetic strip method, in which the PIN is sent – encrypted – to the bank for verification.
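To make the problem concrete, here is a toy model in Python. It is not the real EMV message flow: the `Card`, `Wedge` and `terminal_transaction` names are made up for illustration, and the idea of a device sitting between the terminal and the stolen card is my assumption about how such an attack could look. It only shows why trusting the card's own "PIN OK" answer is dangerous.

```python
class Card:
    """Toy chip card that checks the PIN itself (verification left to the card)."""

    def __init__(self, pin):
        self._pin = pin
        self.pin_checked = False  # what the card believes happened

    def verify_pin(self, pin):
        self.pin_checked = True
        return pin == self._pin


class Wedge:
    """Hypothetical device placed between the terminal and a stolen card."""

    def __init__(self, card):
        self.card = card

    def verify_pin(self, pin):
        # Never forwards the request to the card; simply answers "PIN OK".
        return True


def terminal_transaction(card_like, entered_pin):
    """The terminal trusts whatever answer comes back over the card interface."""
    pin_ok = card_like.verify_pin(entered_pin)
    return {"approved": pin_ok, "terminal_says_pin_verified": pin_ok}


stolen = Card(pin="4921")

# Thief guesses a wrong PIN directly against the card: rejected.
direct = terminal_transaction(stolen, entered_pin="0000")

# Same wrong PIN through the wedge: approved, and recorded as PIN-verified.
wedged = terminal_transaction(Wedge(stolen), entered_pin="0000")

print(direct)
print(wedged)
print(stolen.pin_checked)
```

In the wedged run the terminal records a PIN-verified transaction even though the real card was never asked to check anything. With the magnetic strip, the encrypted PIN goes to the bank, so there is no local answer to fake.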

This is why I just put a piece of plastic tape on my chip. The next time a clerk takes my card and inserts it face down in the chip reader, I will smile and say “No, thanks – the chip is broken”.

Touchdown: HiSam Will Be Going Live Soon!

A few days ago I finished the exhausting re-labeling effort that I had talked about previously, and I started running tests to see where I finally stand against my stated goal.

At first the results were a bit discouraging. With the final set of labeled documents – 2,000 sentences added to the previous 2,500 sentences – the overall F score went down a bit. I was kind of expecting this though, as the more training data from heterogeneous domains you add to the mix, the more variance you have in the thing being learned, resulting in both more potential for errors and more difficulty in learning. This fact is nicely explained by the following excerpt from this paper on static code analysis tools:

The result of summing many independent random variables? A Gaussian distribution, most of it not on the points you saw and adapted to in the lab. Furthermore, Gaussian distributions have tails. As the number of samples grows, so, too, does the absolute number of points several standard deviations from the mean. The unusual starts to occur with increasing frequency.

The final overall F score – 0.864 at 90% – is still far away from my goal of 0.900 at 60%, and the entity-specific F scores (e.g. 0.900 for GeoLocation entities and 0.915 for Person entities) are far from the F scores boasted by research projects in entity extraction – which are all around 0.93.

So, this very morning I decided to do a real-world test: I took a few articles from CNN, fed these to my HMM, and observed the results. I was astonished!!!!!! The little guy did extremely well with these pieces of text it had never seen before. Here are a few examples – colors correspond to entity types and numbers indicate the probabilities of the extracted entities:

Example 1:

Greene: ” This is about the limitless capacity of the human heart. ” Bob Greene says a small town in Ohio is one of the most inspiring places in the United States.

  • Greene (Person: 5.99677679935634E-05)
  • Bob Greene (Person: 1.25848620925595E-07)
  • Ohio (GeoLocation: 0.001397929451232)
  • United States (GeoLocation: 0.00286421623850843)

Example 2:

Until, on July 20, 1969, Neil Armstrong, of Wapakoneta, walked on the moon.

  • July 20 , 1969 (Time: 0.000150556184495453)
  • Neil Armstrong (Person: 1.33506658714912E-07)
  • Wapakoneta (GeoLocation: 6.91960283284816E-07)
  • moon (AstronomicalPlace: 0.351350422734393)

Example 3:

A soldier mans a weapon at the rear of a U.S. Army helicopter over Afghanistan in May.

  • U.S. Army (Organization: 3.40883387762237E-06)
  • Afghanistan (GeoLocation: 0.000349482362808)
  • May (Time: 0.00299625468107284)

Example 4:

Senate Judiciary Committee considers Sotomayor nomination on Tuesday.

  • Senate Judiciary Committee (Organization: 3.66993148614553E-10)
  • Sotomayor (Person: 5.60905081933395E-07)
  • Tuesday (Time: 0.0389513108539469)

So, why the poor F score and the good results? Well, I think I’ve found the explanation. As I said here, when I calculate the performance of my HMMs I’m being extremely strict with myself: all the papers I’ve read, in fact, count the number of tokens correctly tagged by their systems, while I count the number of tags (whole entities) extracted exactly. This means that when my HMM extracts “Ohio” from “I’m going to Northern Ohio”, I count that as zero recall – the expected tag is “Northern Ohio” and my guy hasn’t found it. On the other hand, research papers would count that as one token out of two, which yields a 0.5 recall.
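The difference between the two ways of counting can be sketched in a few lines of Python. This is a simplified illustration, not my actual evaluation code: the function names are made up, and it ignores token positions and entity types, which a real scorer would have to take into account.

```python
def exact_match_recall(expected, predicted):
    """Entity-level recall: an expected entity counts only if extracted whole."""
    predicted_set = set(predicted)
    hits = sum(1 for entity in expected if entity in predicted_set)
    return hits / len(expected)


def token_recall(expected, predicted):
    """Token-level recall: every expected token that was tagged counts."""
    predicted_tokens = {tok for entity in predicted for tok in entity.split()}
    expected_tokens = [tok for entity in expected for tok in entity.split()]
    hits = sum(1 for tok in expected_tokens if tok in predicted_tokens)
    return hits / len(expected_tokens)


expected = ["Northern Ohio"]  # what the labeled data says
predicted = ["Ohio"]          # what the HMM extracted

print(exact_match_recall(expected, predicted))  # 0.0
print(token_recall(expected, predicted))        # 0.5
```

Same extraction, same sentence: the strict count gives 0.0 and the token count gives 0.5, which is why my numbers are not directly comparable to the ones in the papers.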

With this in mind, the results are so good that I’ve decided to set in motion the “release” machine. It took me a couple of years but the first piece of HiSam will finally be live soon!!!!

These are the last TODO items before I start working on the commercial offering:

  1. Add an option to calculate the F score using the research papers’ method, and compare this score to their score;
  2. Label a few more documents in order to reach better stability and see whether the learning curve shifts up;
  3. Compress the XML serialization of the model – the current XML takes up 800 MB of disk space and takes forever to load…

Canary Hell

I’m finally back from my London stay, and I’m finding the time to answer a question that many asked me with incredulity: how come it took you an hour and a half to commute between Chelsea and Canary Wharf?!

Guess what, this is the Canary Wharf tube station at the time I used to get there:

(photo by Radi)

IMHO, getting from one point to the other through this crowd in an hour and a half is an achievement… but still, I had to endure skeptical looks over and over whenever I complained about the commute.


