Odi's astoundingly incomplete notes

Code

Sorting by domain

Sorting a list of email addresses by domain is as simple as

sort -t @ -k 2

posted on 2005-02-18 10:41 UTC in Code | 0 comments | permalink

PHP timeouts

Earlier this morning I posted an entry about configuration of the maximum execution time. The reason I have to do this is actually to work around a bad flaw in PHP.

I want to do some large file uploads, up to 100 MB. Over a slow connection this takes maybe half an hour. Normally the script timeout is set to something like 30 seconds, but you can change it from inside the script. The catch is that this does not work for me: the POST request takes around 30 minutes, so the timeout hits before the first line of the script is ever executed. So there is no way for me to prevent the timeout at the script level. Took me ages to figure out the actual problem...


posted on 2005-02-04 11:29 UTC in Code | 0 comments | permalink

PHP config woes

Today's PHP feature is about configuration. It's nice that you can put special configuration for individual files into the Apache config file or an .htaccess file.

php_value include_path ".:/usr/local/lib/php"

The config options of PHP are divided into two groups: normal ones and admin ones. Unfortunately there is no documentation of which options are admin ones and which are normal ones...

So I had to find out the hard way that max_execution_time is an admin option. It is therefore not allowed in an .htaccess file and must be set in httpd.conf like so:

<Files script.php>
  php_admin_value max_execution_time 300
</Files>

posted on 2005-02-04 11:23 UTC in Code | 0 comments | permalink

RBL check in PHP

I am going to use blacklists for all the interactive parts of this website to get rid of comment spammers. If you would like to implement something similar, you can use the sample code below:

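// Returns true if $ip is listed on the DNS blacklist $rbl.
function rbl_is_listed($ip, $rbl) {
  $comp = explode('.', $ip);
  if (count($comp) != 4) die("Not a valid IPv4: $ip");
  // The query name is the reversed octets followed by the blacklist zone
  $revip = implode('.', array_reverse($comp));
  // gethostbyname() returns the unmodified hostname if the lookup fails
  $result = gethostbyname("$revip.$rbl");
  return ($result == '127.0.0.2');
}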

$ip = $_SERVER["REMOTE_ADDR"];
$rbl = 'bl.spamcop.net';
echo rbl_is_listed($ip,$rbl) ? 'listed' : 'not listed';

There are many DNS blacklists out there. Choose carefully.


posted on 2005-02-02 15:32 UTC in Code | 1 comment | permalink
Not all RBLs return 127.0.0.2
Some use the last octet as a status code that explains the reason for the listing. Perhaps check only the first three octets:
127.0.0.
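
The suggestion in this comment amounts to a prefix test on the lookup result. Here is a minimal Java sketch of just the string handling, with no DNS query involved. The class and method names are made up for illustration, and the prefix 127.0.0. is simply the one proposed in the comment, so check what your chosen blacklist actually returns.

```java
final class RblResponse {
    // Builds the name to query: reversed IPv4 octets followed by the blacklist zone.
    static String queryName(String ip, String rbl) {
        String[] comp = ip.split("\\.");
        if (comp.length != 4) {
            throw new IllegalArgumentException("Not a valid IPv4: " + ip);
        }
        return comp[3] + "." + comp[2] + "." + comp[1] + "." + comp[0] + "." + rbl;
    }

    // Treats any answer of the form 127.0.0.x as "listed", not only 127.0.0.2.
    static boolean isListed(String lookupResult) {
        return lookupResult != null && lookupResult.startsWith("127.0.0.");
    }

    public static void main(String[] args) {
        System.out.println(queryName("192.0.2.1", "bl.spamcop.net"));
        System.out.println(isListed("127.0.0.4"));
        System.out.println(isListed("192.0.2.1"));
    }
}
```

A failed lookup that hands back something other than a 127.0.0.x address is reported as not listed.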

URL encoding in PHP

The list of flaws in PHP is never-ending. Today's favourite is the PHP function urlencode. The flaw: it does the wrong thing for non-ASCII characters.

STD 66 (RFC 3986) section 2.5 defines UTF-8 as the encoding to use for characters outside ASCII. RFC 2616 (HTTP) does not define anything different. But the PHP function uses ISO-8859-1 as the encoding, which is just plain wrong.
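
To make the difference between the two encodings concrete, here is a small Java sketch. Java is used only because its standard URLEncoder takes the charset as an explicit argument, so it says nothing about PHP internals. It percent-encodes the same character both ways. Note that URLEncoder implements form encoding, so a space would come out as +. The class name EncodingDemo and its helper are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class EncodingDemo {
    // Percent-encodes text after converting it to bytes in the named charset.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u00fc"; // LATIN SMALL LETTER U WITH DIAERESIS
        System.out.println(encode(text, "UTF-8"));      // %C3%BC (two bytes)
        System.out.println(encode(text, "ISO-8859-1")); // %FC (one byte)
    }
}
```

A receiver that expects UTF-8 cannot decode %FC on its own as a valid character, which is why the choice of encoding matters here.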


posted on 2005-01-31 17:26 UTC in Code | 0 comments | permalink

Ambiguous semantics of Calendar.YEAR in GregorianCalendar

Today I had to file a bug report to Sun:

java.util.GregorianCalendar uses the field Calendar.YEAR in an ambiguous way. This problem surfaces when dealing with Calendar.WEEK_OF_YEAR.

When setting the week of the year:

Calendar cal = new GregorianCalendar();
cal.set(Calendar.DAY_OF_WEEK, Calendar.MONDAY);
cal.set(Calendar.YEAR, 2003);
cal.set(Calendar.WEEK_OF_YEAR, 1);

the YEAR field is interpreted together with the WEEK_OF_YEAR field as the year of the WEEK_OF_YEAR value. When getting the YEAR field again:

assertEquals(1, cal.get(Calendar.WEEK_OF_YEAR));
assertEquals(2003, cal.get(Calendar.YEAR));

the YEAR field is interpreted as the year of the date represented by the calendar. This introduces a modality in two ways:

The two years are not the same, which can be verified for Monday of week 1 of year 2003, which is Dec 30 2002.

As you can see, there are clearly two semantically different years associated with a calendar, but they currently appear intermixed in one single YEAR field. On reading, the year of the week is completely missing and must be determined with some additional functionality outside the Calendar class (see Workaround).

Thus I request that the year of the date and the year belonging to the WEEK_OF_YEAR be separated into individual fields. Good names would be, for instance, Calendar.YEAR_OF_WEEK and Calendar.YEAR_OF_DAY. Calendar.YEAR should be deprecated for its ambiguous semantics.

This is related to Bug #4267450, which requests an API for reading the year associated with a week in the DateFormat.
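
For readers on Java 7 or later: GregorianCalendar gained a getWeekYear() method there, which returns the year that WEEK_OF_YEAR belongs to. Here is a minimal sketch of reading both years for the example above. It assumes ISO 8601 week rules (weeks start on Monday, week 1 has at least four days in the new year) and sets them explicitly so the result does not depend on the default locale. The class and helper names are made up for illustration.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

final class WeekYearDemo {
    // Returns a calendar positioned on Monday of the given week, using ISO 8601 week rules.
    static GregorianCalendar mondayOfWeek(int weekYear, int week) {
        GregorianCalendar cal = new GregorianCalendar();
        cal.setFirstDayOfWeek(Calendar.MONDAY);
        cal.setMinimalDaysInFirstWeek(4);
        cal.clear(); // drop the current date and time so only the fields below count
        cal.set(Calendar.YEAR, weekYear);
        cal.set(Calendar.WEEK_OF_YEAR, week);
        cal.set(Calendar.DAY_OF_WEEK, Calendar.MONDAY);
        return cal;
    }

    public static void main(String[] args) {
        GregorianCalendar cal = mondayOfWeek(2003, 1);
        System.out.println("year of the date: " + cal.get(Calendar.YEAR)); // 2002
        System.out.println("year of the week: " + cal.getWeekYear());      // 2003
        System.out.println("week of the year: " + cal.get(Calendar.WEEK_OF_YEAR)); // 1
    }
}
```

On older Java versions, where getWeekYear() is not available, the year of the week still has to be derived by hand as described above.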

Update

The issue was filed as Bug #6218127 and turned down. Read their comment. Sun's stubbornness is just unbelievable!
posted on 2004-12-29 16:40 UTC in Code | 0 comments | permalink

.NET vs. Java

Some intriguing arguments for Java were expressed by a Slashdot reader today.
posted on 2004-12-24 10:20 UTC in Code | 0 comments | permalink

Why PHP References are crap

$iterator =& $queue->iterator();
$list = array();
while ($iterator->hasNext()) {
  $obj =& $iterator->getNext();
  $list[] =& $obj;
}
This will not do what you expect. It will merely screw up all the objects in $queue. $list will then contain n identical references to the last object. Yes, PHP guys, you really make my day.
posted on 2004-12-22 10:50 UTC in Code | 0 comments | permalink