Tuesday, October 26, 2010

Perl library for Google Safe Browsing v2

Google offers URL blacklists that identify malicious websites and phishing sites through the Google Safe Browsing API. It is used in nearly all browsers (Firefox, Safari, etc.) except Internet Explorer. Version 2 has been available for a few months, but there are only two implementations so far: Python (from Google) and PHP.

Versions 1 and 2 are supposed to provide the same coverage, but I've found that the Google Safe Browsing v2 lists are updated much more frequently than the v1 lists. Since I'm using the API in several of my Perl projects, I wanted to get up-to-date information from Google. Until now, there was a Perl implementation for v1 only.

Net::Google::SafeBrowsing2, which I've developed, is the first implementation of the Google Safe Browsing v2 API for Perl, and I've made it available on CPAN.

Here is a quick example of how to use it:

  use Net::Google::SafeBrowsing2;
  use Net::Google::SafeBrowsing2::Sqlite;

  # Local SQLite file used to store the Safe Browsing database
  my $storage = Net::Google::SafeBrowsing2::Sqlite->new(file => 'google-v2.db');
  my $gsb = Net::Google::SafeBrowsing2->new(
    key     => "my key",   # your Google Safe Browsing API key
    storage => $storage,
  );

  # Download the latest list updates into the local database
  $gsb->update();

  # Check a URL against the local database
  my $match = $gsb->lookup(url => 'http://www.gumblar.cn/');

  if ($match eq MALWARE) {
    print "http://www.gumblar.cn/ is flagged as a dangerous site\n";
  }

Version 0.1

0.1 is the first version available on CPAN. It does not include Message Authentication Code (MAC) support, but otherwise it is fully functional. I've been using the library successfully for a couple of weeks now. There may be bugs left, as more unit tests are required. I encourage you to help me further develop the code. You can report bugs by posting comments here or by sending me an e-mail.

Despite the low version number, the library works well!

Multiple backends

Net::Google::SafeBrowsing (API version 1) can only use SQLite to store the database locally. My new library is designed to work with multiple backends: SQLite, MySQL, Memcached, etc. I have uploaded a storage module that uses SQLite to CPAN. I hope others will develop new backends. Check the documentation of Net::Google::SafeBrowsing2::Storage to create your own module.

Documentation and examples are available on CPAN.

-- Julien

4 comments:

Luis Alberto said...

Hi Julien, I've installed your library and followed the example that you provide. However, every time I run it, the DB storage file grows by ~20M and the website "http://www.gumblar.cn" is not flagged as suspicious.
This is my output:
---
DBD::SQLite::db do failed: table updates already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 128.
DBD::SQLite::db do failed: table a_chunks already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 143.
DBD::SQLite::db do failed: index a_chunks_hostkey already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 150.
DBD::SQLite::db do failed: index a_chunks_num_list already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 158.
DBD::SQLite::db do failed: table s_chunks already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 174.
DBD::SQLite::db do failed: index s_chunks_hostkey already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 181.
DBD::SQLite::db do failed: index s_chunks_num already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 188.
DBD::SQLite::db do failed: table full_hashes already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 204.
DBD::SQLite::db do failed: index hash already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 213.
DBD::SQLite::db do failed: table full_hashes_errors already exists(1) at dbdimp.c line 271 at /Library/Perl/5.8.8/Net/Google/SafeBrowsing2/Sqlite.pm line 228.
http://www.gumblar.cn/ is NOT flagged as a dangerous site
---

May I have your opinion about what is wrong?

Thanks in advance.

Luis A Perez.

Julien Sobrier said...

@Luis What version of DBI are you using? I'm using 1.6.11. It looks like the tables() function does not work for you, so it tries to re-create the tables.

I should mention in the doc that it takes several updates before the database is up to date. For reference, my database is about 47MB.

Luis Alberto said...

Hi Julien, thank you very much for your answer.

I'm using DBI version 1.615 (I upgraded it a few days ago).
You're right, it's not finding the tables in Sqlite.pm.init(), and then it tries to create them.
I'm making some small changes to your code to make it work in my environment:
e.g.,
instead of:
if (! defined first { $_ eq '"main"."updates"' }...

I wrote:
if (! defined first { $_ eq '"updates"' } ...

(maybe it should be: $_ eq '"main"."updates"' or $_ eq '"updates"')

I'll let you know once it works.

Thanks.

Luis A Perez.

Julien Sobrier said...

@Luis I've uploaded a new version of the library to CPAN; it should be available in a few hours. It includes the change you suggested for SQLite, and it supports MAC.
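
The table check discussed in these comments can be made independent of whether the driver reports names as "main"."updates" or as "updates". The following is only a minimal sketch of that idea, not the code shipped in the library: the helper names bare_table_name and has_table are hypothetical, and the literal lists stand in for what DBI's tables() might return in each environment.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: reduce a table name as reported by the driver
# to its bare form, so '"main"."updates"' and '"updates"' both
# become 'updates'.
sub bare_table_name {
    my ($name) = @_;
    $name =~ s/^.*\.//;   # drop a leading schema qualifier such as "main".
    $name =~ tr/"//d;     # drop the double quotes around the name
    return $name;
}

# Hypothetical helper: true if $table is in the list, in either form.
sub has_table {
    my ($table, @names) = @_;
    return defined( first { bare_table_name($_) eq $table } @names );
}

# With a live database handle the list would come from $dbh->tables();
# these two lists imitate the two formats seen in the comments.
my @formats = (
    [ '"main"."updates"', '"main"."a_chunks"' ],
    [ '"updates"',        '"a_chunks"'        ],
);

for my $names (@formats) {
    print has_table('updates', @$names) ? "found\n" : "missing\n";
}
```

Both lists report the table as found, so the CREATE TABLE statements would be skipped in either environment. The sketch assumes table names that do not themselves contain dots or double quotes.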