Odi's astoundingly incomplete notes
Sorting by domain
Sorting a list of email addresses by domain is as simple as
sort -t @ -k 2
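For comparison, the same ordering can be sketched in Java. This is only an illustration: the names DomainSort, domainOf and sortByDomain are made up here. Like the sort call above, it keys on everything after the first @ and falls back to comparing the whole address on ties. It uses plain String order, so results can differ from sort under a locale with its own collation rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DomainSort {
    // Everything after the first '@' (the whole string if there is no '@').
    static String domainOf(String email) {
        return email.substring(email.indexOf('@') + 1);
    }

    // Returns a new list ordered by domain, then by the whole address.
    static List<String> sortByDomain(List<String> emails) {
        List<String> sorted = new ArrayList<>(emails);
        Comparator<String> byDomain = Comparator.comparing(DomainSort::domainOf);
        sorted.sort(byDomain.thenComparing(Comparator.naturalOrder()));
        return sorted;
    }
}
```

The input list is left untouched; the sorted copy is returned.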
PHP timeouts
Earlier this morning I posted an entry about configuration of the maximum execution time. The reason I have to do this is actually to work around a bad flaw in PHP.
I want to do some large file uploads, up to 100 MB. Over a slow connection this takes maybe half an hour. Normally the script timeout is set to something like 30 seconds, but you can change it inside the script. The catch is that this does not work for me: the POST request itself takes around 30 minutes, so the first line of the script is not executed before the timeout hits. So there is no way for me to prevent the timeout at the script level. It took me ages to figure out the actual problem...
PHP config woes
Today's PHP feature is about configuration. It's nice that you can put special configuration for individual files into the Apache config file or an .htaccess file.
php_value include_path ".:/usr/local/lib/php"
The config options of PHP are divided into two groups: normal ones and admin ones. Unfortunately there is no documentation of which options belong to which group.
So I had to find out painfully that max_execution_time is an admin option. It is therefore not allowed in an .htaccess file and must be set like so in httpd.conf:
<Files script.php>
    php_admin_value max_execution_time 300
</Files>
RBL check in PHP
I am going to use blacklists for all the interactive parts of this website to get rid of comment spammers. If you would like to implement something similar, you can use the sample code below:
function rbl_is_listed($ip, $rbl) {
    // Split the address into its four octets
    $comp = explode('.', $ip);
    if (sizeof($comp) != 4) die("Not a valid IPv4: $ip");
    // Query the reversed address under the blacklist's zone
    $revip = join('.', array_reverse($comp));
    $result = gethostbyname("$revip.$rbl");
    return ($result == '127.0.0.2');
}

$ip = $_SERVER["REMOTE_ADDR"];
$rbl = 'bl.spamcop.net';
echo rbl_is_listed($ip, $rbl) ? 'listed' : 'not listed';
There are many DNS blacklists out there. Choose carefully.
Some use the last number of the answer as a status that explains the reason for the listing. In that case it may be better to check only the first three numbers: 127.0.0.
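A prefix check along those lines could look like the following Java sketch. The class and method names (RblCheck, queryName, isListingAnswer, isListed) are hypothetical, and it rests on two assumptions: any answer in 127.0.0.x counts as a listing, and a name that does not resolve means the address is not listed. Individual blacklists may define their answers differently, so check the list's own documentation.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class RblCheck {
    // Builds the DNS name to query: the IPv4 octets reversed, then the list's zone.
    static String queryName(String ip, String rbl) {
        String[] octets = ip.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not a valid IPv4: " + ip);
        }
        return octets[3] + "." + octets[2] + "." + octets[1] + "." + octets[0] + "." + rbl;
    }

    // Assumption: any answer in 127.0.0.x is a listing, whatever the last number is.
    static boolean isListingAnswer(String answer) {
        return answer.startsWith("127.0.0.");
    }

    // Performs the lookup; a name that does not resolve is treated as "not listed".
    static boolean isListed(String ip, String rbl) {
        try {
            InetAddress answer = InetAddress.getByName(queryName(ip, rbl));
            return isListingAnswer(answer.getHostAddress());
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

Only isListed touches the network; the two helpers are pure string functions and can be checked offline.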
URL encoding in PHP
The list of flaws in PHP is never-ending. Today's favourite is the PHP function urlencode.
The flaw there: They are doing it wrong for non-ASCII characters.
STD 66 section 2.5 defines UTF-8 as the encoding to use for characters outside ASCII. RFC 2616 (HTTP) does not define anything different. But the PHP function uses ISO-8859-1 as the encoding, which is just plain wrong.
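To make the difference concrete, here is a small Java sketch (not PHP) that percent-encodes the same character under both encodings. The class and method names are made up for illustration, and it assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UrlEncodingDemo {
    // Percent-encodes the UTF-8 bytes of the string.
    static String encodeUtf8(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Percent-encodes the ISO-8859-1 bytes of the string.
    static String encodeLatin1(String s) {
        return URLEncoder.encode(s, StandardCharsets.ISO_8859_1);
    }
}
```

For the letter ü (written "\u00FC" in Java source), encodeUtf8 gives %C3%BC while encodeLatin1 gives %FC. A receiver that expects UTF-8 will not decode the second form correctly.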
Ambiguous semantics of Calendar.YEAR in GregorianCalendar
Today I had to file a bug report to Sun: java.util.GregorianCalendar uses the field Calendar.YEAR in an ambiguous way. This problem surfaces when dealing with Calendar.WEEK_OF_YEAR.
Calendar cal = new GregorianCalendar();
cal.set(Calendar.DAY_OF_WEEK, Calendar.MONDAY);
cal.set(Calendar.YEAR, 2003);
cal.set(Calendar.WEEK_OF_YEAR, 1);
Here the YEAR field is interpreted together with the WEEK_OF_YEAR field as the year of the WEEK_OF_YEAR value.
When getting the YEAR field again:
assertEquals(1, cal.get(Calendar.WEEK_OF_YEAR));
assertEquals(2003, cal.get(Calendar.YEAR));
the YEAR field is interpreted as the year of the date represented by the calendar.
This introduces a modality in two ways:
- the meaning of YEAR is different when setting and when getting
- when setting, the meaning of YEAR is different depending on whether WEEK_OF_YEAR is set or not
The two years are not the same, which can be verified for Monday of week 1 of year 2003 which is Dec 30 2002.
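The following sketch reproduces that date. The class and method names are hypothetical. It pins the week rules to ISO 8601 (weeks start on Monday, week 1 has at least four days in the new year) so the result does not depend on the default locale, and it clears the calendar first so the current date cannot leak in. It also calls GregorianCalendar.getWeekYear(), which only exists from Java 7 on, to show the year of the week next to the year of the date.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class WeekYearDemo {
    // Returns {YEAR, month (1-12), DAY_OF_MONTH, week year} for the Monday
    // of the given week, using ISO 8601 week rules.
    static int[] mondayOfWeek(int year, int week) {
        GregorianCalendar cal = new GregorianCalendar();
        cal.setFirstDayOfWeek(Calendar.MONDAY);  // ISO weeks start on Monday
        cal.setMinimalDaysInFirstWeek(4);        // week 1 has at least 4 days of the new year
        cal.clear();                             // drop the current date and time
        cal.set(Calendar.YEAR, year);
        cal.set(Calendar.WEEK_OF_YEAR, week);
        cal.set(Calendar.DAY_OF_WEEK, Calendar.MONDAY);
        return new int[] {
            cal.get(Calendar.YEAR),              // year of the date
            cal.get(Calendar.MONTH) + 1,         // Calendar months are 0-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.getWeekYear()                    // year the week belongs to (Java 7+)
        };
    }
}
```

For mondayOfWeek(2003, 1) this yields 2002, 12, 30 and 2003: the date is Dec 30 2002, get(Calendar.YEAR) reports 2002, and the week year is 2003. An assertion that expects 2003 from get(Calendar.YEAR) would therefore not hold under these settings.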
As you can see, there are clearly two semantically different years associated with a calendar, but they currently appear intermixed in one single YEAR field. When reading, the year of the week is completely missing and must be determined with some additional functionality outside the Calendar class (see Workaround).
Thus I request that the year of the date and the year belonging to the WEEK_OF_YEAR be separated into individual fields. Good names would be, for instance, Calendar.YEAR_OF_WEEK and Calendar.YEAR_OF_DAY. Calendar.YEAR should be deprecated for its ambiguous semantics.
This is related to Bug #4267450, which requests an API for reading the year associated with a week in the DateFormat.
Update
The issue was filed as Bug #6218127 and turned down. Read their comment. Sun's stubbornness is just unbelievable!
.NET vs. Java
Why PHP References are crap
$iterator =& $queue->iterator();
$list = array();
while ($iterator->hasNext()) {
    $obj =& $iterator->getNext();
    $list[] =& $obj;
}
This will not do what you expect. It will merely screw up all the objects in $queue. $list will then contain n identical references to the last object.
Yes PHP guys, you really make my day.