Archive

Posts Tagged ‘Development’

Bulk domain name resolver with perl

May 7th, 2010 Ivan Villareal No comments

I was in the need of getting the ip address from a long list of domain names, the domains were stored in a MySQL db, so I quickly turned into perl for making a script that got the list and then use an external command to resolve the domain.

At first I was going to use the dig command, but that would have taken to much effort, then I thought about using the ping command, but again to much effort, instead I’ve decided to leverage the IO::Socket::INET module, trying to open a socket to a domain name, will resolve the name and I could use that.

My implementation was very simple and it worked well, here is the relevant part:

 

 

 

#include the module
use IO::Socket::INET;
 
#create a socket to uncommon udp port, because tcp is likely to be firewalled
my $sock = IO::Socket::INET->new(
                           PeerAddr=> "$domain",
                           PeerPort =>40125,
                           Proto=>'udp');
 
#if we can create a socket, then we have an ip
if (defined $sock) {
         $result = $sock->peerhost;
         logOut("Success!");
    } else {
        logOut("Failed trying to connect to $domain");
        return;
    }

 

This was Pretty simple no?, here is the complete script, which gathers all the rows with no ip address and then it will try to get the ip, update the row and continue.

 

#!/usr/bin/perl
use strict;
use DBI;
use IO::Socket::INET;
 
####Global Variables###
 
our %db = ('host' => 'localhost','name' => 'domains','user' => 'ivan','pass' => '1234','port' => '3306');
our $dsn;
our $dbh;
 
#log function
sub logOut {
    my ($msg) = @_;
    print $msg . "\n";
}
 
#db connection
sub initialize {
    logOut("Connecting to db server on $db{'host'}");
    $dsn = "DBI:mysql:database=$db{'name'};host=$db{'host'};port=$db{'port'}";
    $dbh = DBI->connect($dsn, $db{'user'}, $db{'pass'}) or die('failed to connect');
}
 
#function to gather the ip fo a domain
sub getIpAddress {
    my ($domain) = @_;
    logOut("Trying to connect to: $domain...");
    my $result;
    my $sock = IO::Socket::INET->new(
                           PeerAddr=> "$domain",
                           PeerPort =>40125,
                           Proto=>'udp');
    if (defined $sock) {
         $result = $sock->peerhost;
         logOut("Success!");
    } else {
        logOut("Failed trying to connect to $domain");
        return;
    }
    return $result;
}
 
##This function
# gets all domains from domains table
sub getDomains {
    logOut("Fetching db for domains...");
    my $result = $dbh->selectall_hashref("SELECT domain_name FROM domains WHERE ip_address IS NULL", 'domain_name');
    logOut(keys( %$result ) . " domains returned.");
    my $updatedDomains = 0;
    my $noIpDomains    = 0;
    foreach my $id (keys %$result) {
        my $ip = getIpAddress($result->{$id}->{domain_name});
        if ($ip) {
            logOut("Ip found $ip for $result->{$id}->{domain_name}");
            my @values = ($ip, $result->{$id}->{domain_name});
            my $re = $dbh->do("UPDATE domains SET ip_address=? WHERE domain_name=?",undef, @values);
                if ($re == 1) {
                    $updatedDomains++;
                }
        } else {
            logOut("no ip for $result->{$id}->{domain_name}");
            $noIpDomains++;
        }
    }
    logOut("Updated Domains: $updatedDomains");
    logOut("Domains with no Ip found: $noIpDomains");
}
 
#script starts here
initialize();
getDomains();

Maybe there are more efficient methods but as I said, this worked pretty well, it didn’t consume a lot of results and it performs fast enough for my needs.

Sync mysql tables from one db to another with php

February 16th, 2010 Ivan Villareal No comments

I needed a process to move a table from one database to another, the databases are on separate servers, so renaming the table will not work, I did a little research and found this Table Syncer it seemed fine, the problem is that it was on ruby gem, and I said the problem because the server where this will be running don’t have this installed.

So instead of doing more research I spent that time developing my own php class to sync two tables on different server, I have little time to create this so I’m re-using a class I already did for handling several databases and it already has some useful methods, right now its 3:12pm I plan to finish this in 30 minutes lets see how it goes.

In the end I spent about an hour but I have a working file that can put in a cron job to sync the table on a regular basis.

I’ve changed the way I fetch the entire result, to avoid having problems exhausting the dedicated php memory with large arrays, instead I’m fetching one row at a time, another problem I had, was that the null values came blank so, when I was inserting on a unique index, I had an error, to avoid this I remove the empty values from the insert, and use the default table value instead, this results in a bit different tables Null instead of blanks, but because this isn’t important I didn’t want to spend more with this.

The final file, contains the db class and the sync class, and here it is:

< ?php
 
/**
* The purpose of this file is to sync
* tables between 2 databases
*/
 
class tableSync {
    private $_sourceDbUser = 'ivan';
    private $_sourceDbPass = 'pass';
    private $_sourceDbName = 'bounty';
    private $_sourceDbHost = 'localhost';
 
    private $_targetDbUser = 'ivan';
    private $_targetDbPass = 'pass';
    private $_targetDbName = 'testing';
    private $_targetDbHost = 'localhost';
 
    private $_sourceDb, $_targetDb, $_primaryKey, $_table;
 
    public function __construct($table)
    {
        $this->_sourceDb = new Db($this->_sourceDbUser, $this->_sourceDbPass, $this->_sourceDbName, $this->_sourceDbHost);
        $this->_targetDb = new Db($this->_targetDbUser, $this->_targetDbPass, $this->_targetDbName, $this->_targetDbHost);
        $this->_table = $table;
        $this->begin();
    }
 
    public function begin()
    {
      //Check that the table exists
        $sql = 'SHOW TABLES';
        $tbl = $this->_targetDb->fetchAssoc($sql);
 
        if (!in_array($this->_table, current($tbl))) {
                //create the table
                $sql = "SHOW CREATE TABLE `".$this->_table."`";
                $tbl = $this->_sourceDb->fetchAssoc($sql);
                $createQry = $tbl[0]['Create Table'];
                $res = $this->_targetDb->query($createQry);
        }
 
        //Get The primary Key
        $sql     = "SHOW indexes FROM ".$this->_table." WHERE Key_name = 'PRIMARY'";
        $indexes = $this->_sourceDb->fetchAssoc($sql);
        $this->_primaryKey = $indexes[0]['Column_name'];
        //Query source  table
        $sql     = 'SELECT * FROM '.$this->_table;
        $rowset  = $this->_sourceDb->query($sql);
        while ($row = mysql_fetch_assoc($rowset)) {
            //remove empty fields
            $newRow = array();
            foreach ($row as $field => $value) {
                if ($value != '') {
                    $newRow[$field] = $value;
                }
            }
            if ($this->_primaryKey != '') {
                echo "Inserting row ".$row[$this->_primaryKey] . "\n";
                $res = $this->_targetDb->insertUpdate($this->_table, $newRow, $this->_primaryKey, $row[$this->_primaryKey]);
            } else {
                $res = $this->_targetDb->queryInsert($this->_table, $newRow);
            }
        }
    }
}
 
class Db
{
	private	$_link;
	private $_affectedRows;
	private $_lastQueryStatus;
 
	/**
	* Instantiate the object
	**/
	public function __construct( $user, $pass, $dbName, $host = 'localhost')
	{
            $this->_link = mysql_connect($host, $user, $pass, true);
            mysql_select_db($dbName, $this->_link) or die('Could not select database');
            if (mysql_error()) {
                    printf("Connect failed: %s\n", mysql_error());
                    exit();
            } else {
                $sql = "SET NAMES `utf8`";
                mysql_query($sql, $this->_link);
                mysql_query("SET CHARACTER SET 'utf8';", $this->_link);
            }
	}
 
        public function query($sql)
	{
            $this->_lastQueryStatus = @mysql_query($sql, $this->_link) or die('Query failed: ' . mysql_error() . '<br /> SQL: '. $sql);
            if (!$this->_lastQueryStatus) {
                    $this->error("<b>MySQL Query fail:</b> $sql");
            }
            $this->_affectedRows    = @mysql_affected_rows();
            return $this->_lastQueryStatus;
	}
 
        public function fetchAssoc($sql)
	{
            $result = $this->query($sql);
            $table = array();
            while ($row = mysql_fetch_assoc($result)) {
                    $table[] = $row;
            }
            return $table;
	}
 
        public function escape($string)
	{
		if(get_magic_quotes_gpc()) $string = stripslashes($string);
		return mysql_real_escape_string($string);
	}
 
        public function fetchField($table, $field, $where = '1')
	{
	    $sql = "SELECT `$field` FROM `$table` WHERE $where";
	    $result = $this->query($sql);
	    $field = mysql_fetch_row($result);
	    return $field[0];
	}
 
        public function insertUpdate($table, $data, $field, $uniqueValue)
	{
	    $where = is_int($uniqueValue) ? "$field = $uniqueValue" : "$field = '".$this->escape($uniqueValue)."'";
	    $e = $this->fetchField($table, $field, $where);
	    if ($e) {
	        $res = $this->queryUpdate($table, $data, $where);
	    } else {
	        $res = $this->queryInsert($table, $data);
	    }
	    return $res;
	}
 
        public function queryInsert($table, $data)
	{
            $q="INSERT INTO `".$table."` ";
            $v='';
            $n='';
 
            foreach($data as $key=>$val) {
                    $n.="`$key`, ";
                    if(strtolower($val)=='null') $v.="NULL, ";
                    elseif(strtolower($val)=='now()') $v.="NOW(), ";
                    elseif(strtolower($val)=='utc_timestamp()') $v.="UTC_TIMESTAMP(), ";
                    else $v.= "'".$this->escape($val)."', ";
            }
            $q .= "(". rtrim($n, ', ') .") VALUES (". rtrim($v, ', ') .");";
            if($this->query($q)){
                    $result = mysql_insert_id();
            } else {
                    $result = false;
            }
            return $result;
	}
 
        public function queryUpdate($table, $data, $where='1')
        {
            $q="UPDATE `".$table."` SET ";
 
            foreach($data as $key=>$val) {
                if(strtolower($val)=='null') $q.= "`$key` = NULL, ";
                elseif(strtolower($val)=='now()') $q.= "`$key` = NOW(), ";
                else $q.= "`$key`='".$this->escape($val)."', ";
            }
            $q = rtrim($q, ', ') . ' WHERE '.$where.';';
 
            return $this->query($q);
        }
}
 
//Instantiate the tableSync sending the table name I want to sync
$sync = new tableSync('tod_whois_raider');

Categories: Development, PHP Tags: , ,

Mysql unique keys and collations

October 7th, 2009 Ivan Villareal No comments

I’m working on an application that stores unique values on a field, so I used the unique index for this column, everything was ok, until I’ve started inserting multi byte characters in it.

The charachters inserted fine, the problem was that I’m using INSERT … ON DUPLICATE KEY UPDATE syntax to avoid looking if the field exists, so after debugging where the application was breaking I got this:

Query failed: Cannot add or update a child row: a foreign key constraint fails (`trubaseek/mapKeywordName`, CONSTRAINT `fkDomainMapNameId` FOREIGN KEY (`NameId`) REFERENCES `names` (`nameId`) ON DELETE NO ACTION ON UPDATE NO ACTION)<br> SQL: INSERT INTO `mapKeywordName` (`keywordId`, `nameId`) VALUES (’1′, ‘326′);

so I’v spent about an hour changing the encodings without success, but after some testing I discovered that hôteles was the same as hotelës, I did several testing from the linux console, the mysql browser, and from a php script, and all revealed the same.

Whe I did this

INSERT INTO `names` (`name`, `price`, `priceCurrency`, `bids`, `traffic`, `lastUpdate`) 
VALUES ('hotelës', '0', '$US', '0', '0', NOW()) 
ON DUPLICATE KEY UPDATE `price`='0', `priceCurrency`='$US', `bids`='0', `traffic`='0', `lastUpdate` = NOW();

I’ve got this:

Query OK, 2 rows affected (0.01 sec)

Called from a php script mysql_insert_id(); returned me an Id like if it was inserted not updated, but when if I select the returned Id, an error were thrown, because it didn’t existed.

So after some research, I’ve found that adding a collation utf8_bin to that column fixed this issue.

I haven’t tested this enough but for now it appears to be working.

Categories: Development Tags: , , ,

Parsing HTML documents with Simple Html DOM

September 30th, 2009 Ivan Villareal No comments

I had to get some information from an XHTML document, the information required was on nested tables, I’ve started working on that and my first approach was to get the needed info using regular expressions.

After 2 hours I’ve got a set of regex that worked pretty well, but when I received a new document, the layout was changed a little bit, and my didn’t worked as expected, the content was on the same table but this time it had nested tables, which were a really big problem.

I already have used the PHP DOM, but only on xml files, I didn’t was aware that it had the ability to parse HTML, so I’ve started working with this and within an hour I had it working, and this time the changes in the document didn’t affect the scrapping.

In about an hour I had a larger class with several methods to get all the elements I needed, but suddenly I was presented with another challenge, again with nested tables, sometimes the number of childs were shorter than expected, I’ve experimented several things until I found this Simple Html Dom it is pretty straight forward, and it does an excellent job scrapping html documents, all the methods I did were replaced by this:

 
            $html   = curl_exec($ch);
            $dom = new simple_html_dom();
            $dom->load($html);
            $items = array();
            $tabla = $dom->find('table[cellpadding^=2]', 0);
 
            foreach ($dom->find('table[cellpadding^=2]') as $table) {
                foreach ($table->find('tr') as $tr) {
                    $link = trim($tr->find('a', 0)->title);
                    if ($link) {
                        $item['item']  = $link;
                        $item['price'] = trim($tr->children(2)->plaintext);
                        $item['bids'] = trim($tr->children(3)->plaintext);
                        $items[] = $item;
                    }
                }
            }

Categories: Development Tags: , , ,