Odi's astoundingly incomplete notes

No URL alias in Tomcat?

Is there seriously no way to create an arbitrary (i.e. cross-context) alias for a URL in Tomcat? I don't want an HTTP redirect, I don't want inefficient reverse-proxying, I don't care about cookie security (as cookies are not used), I don't want HTTPD in front. Just a simple, stupid alias: the alias URL calls the same servlet code as the aliased URL. Also known as URL rewriting.

I want: http://localhost:8080/old-service  -> http://localhost:8080/axis/services/new-service

Neither Filters nor even Valves can do that, because implementations must not "change request properties that have already been used to direct the flow of processing control for this request". Seriously? 2010?

Reason: I have a webservice that has been migrated to a completely different infrastructure, and thus a different context. The clients should not need to be changed. And the clients don't support redirects.
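
For the record, one possible escape hatch is a minimal "bridge" webapp deployed at the old context whose single servlet dispatches cross-context to the new one. This is an untested sketch, not a verified solution: it requires crossContext="true" on the bridge's Context element in Tomcat's configuration (otherwise getContext() returns null), and AliasServlet is a made-up name.

import java.io.IOException;
import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Bridge servlet, mapped to /* in a tiny webapp deployed at /old-service.
public class AliasServlet extends HttpServlet {
    protected void service(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // getContext() only returns a foreign context if crossContext="true"
        ServletContext axis = getServletContext().getContext("/axis");
        if (axis == null) {
            resp.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
            return;
        }
        // dispatch to the real service: no redirect, no proxying
        axis.getRequestDispatcher("/services/new-service").forward(req, resp);
    }
}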

posted on 2010-04-21 08:59 UTC in Code | 1 comments | permalink
Why not run Tomcat in Apache 2.0 and then you can do your aliasing from Apache 2.0... ref aliasing in tomcat, I'm not sure on that (I assume from your rant that it isn't possible).

Best regards

Steve

Change java process name

At least under Linux you can do that painlessly from bash:
exec -a myapp java com.example.MyApp

And if you don't want to replace the current shell, execute it in a subshell by using parentheses:
(exec -a myapp java com.example.MyApp)


posted on 2010-04-14 15:19 UTC in Code | 0 comments | permalink

OOM killer is not for userspace

When you look at Android, it seems that they rely on the Linux Out-Of-Memory (OOM) killer a lot. Also, on the kernel mailing list every now and then someone sends in "improvements" for the OOM killer, so that it selects a more suitable task to kill. To me this looks like a severely flawed concept.

The OOM killer is a last-resort instrument for the Linux kernel when it runs out of memory -- a situation that should ideally never occur. It is a desperate act of the kernel when it has to kill a userspace task in order to get back some memory to satisfy another allocation request. It is worth noting that the allocation request can come from within the kernel or from userspace.

The OOM killer is NOT an instrument for userspace to terminate unused applications. The kernel has insufficient information about the user's perception of the system. So it will always be poor at choosing an appropriate application to kill, no matter how many fancy heuristics you stuff into the OOM killer. Should it kill httpd? Maybe yes on a developer machine where a self-written Apache module went mad -- maybe not such a bright choice on a production webserver. Should it kill the window manager? No problem on a webserver. Not really a good choice on a desktop workstation. Should it kill the task that requested the memory? Might be clever. Might as well be horribly stupid and lead to thrashing. Should it rather kill a minimized editor that has unsaved files open, or the video player that is playing a DVD?

In the end it would be much wiser to let userspace manage this problem on its own. Userspace can monitor memory use. It can draw on a lot of information from the desktop environment, user interaction, the system profile and the hardware, and make a much better decision about whether it's wise to kill an unused background task. Also, userspace can act much earlier than the OOM killer: when memory use reaches a limit it could actively signal applications to save memory or terminate gracefully, kill inactive background stuff, or even interact with the user and ask which application to close. And you can easily customize and exchange the logic.
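
To illustrate, here is a minimal sketch of such a userspace watchdog, assuming Linux's /proc/meminfo. The 10% threshold and the reaction are invented for illustration; a real implementation would notify applications or the user instead of just logging.

import java.io.BufferedReader;
import java.io.FileReader;

public class MemWatchdog {
    public static void main(String[] args) throws Exception {
        while (true) {
            long total = 0, avail = 0;
            BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"));
            try {
                String line;
                while ((line = r.readLine()) != null) {
                    // lines look like "MemTotal:  16384256 kB"
                    String[] f = line.split("\\s+");
                    if (line.startsWith("MemTotal:")) total = Long.parseLong(f[1]);
                    if (line.startsWith("MemFree:") || line.startsWith("Cached:"))
                        avail += Long.parseLong(f[1]);
                }
            } finally {
                r.close();
            }
            if (avail < total / 10) {
                // a real implementation would now ask applications to free
                // memory, save state, or prompt the user -- long before the
                // kernel's OOM killer would have to act
                System.err.println("low memory: " + avail + " of " + total + " kB free");
            }
            Thread.sleep(5000);
        }
    }
}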

KDE already tries to detect unresponsive tasks and kill them. For instance, when you click the close button of a window but the window doesn't close after a while, KDE will prompt you to kill the task.

So why are people still building userspace that relies on the OOM killer, instead of making sure the OOM killer never has to kick in?

posted on 2010-04-06 19:19 UTC in Code | 0 comments | permalink

Slow to connect to Samba? Check your packet filter!

I am currently setting up a simple new Samba server on Gentoo. However, a Windows XP box took forever to connect to the share. The reason for this is interesting: apparently the Windows SMB client first tries to access the remote server via WebDAV (HTTP). But on the Samba box there is no HTTP server. Instead an iptables rule is in place to reject connections to non-open ports:
-A INPUT -p tcp -m tcp --syn -j REJECT --reject-with icmp-port-unreachable
The long timeout is easily reproducible on a Windows console with telnet. Of course you would expect timeouts when using a DROP target, as the client is not informed that the port is not open. So I was trying to be clever and sent an ICMP message to inform the client. Turns out this is wrong: closed TCP ports should answer with a RST packet instead:
-A INPUT -p tcp -m tcp --syn -j REJECT --reject-with tcp-reset
The complete chain of rules (at the end of the rule set) for correctly dropping packets is:
# drop broadcast packets
-A INPUT -m pkttype --pkt-type broadcast -j DROP
# TCP ports that are not open
-A INPUT -p tcp -m tcp --syn -j REJECT --reject-with tcp-reset
# reply with reject to closed UDP ports
-A INPUT -p udp -j REJECT
# drop rest
-A INPUT -j DROP

posted on 2010-03-01 11:34 UTC in Code | 3 comments | permalink
This fixed the super long delay to connect problem I was having (I just set iptables to reject by default instead of drop and now I don't have to wait like I used to). Thanks for the info!
You can disable iptables. If it is still slow to log in, then check /etc/hosts: it should be 127.0.0.1 your_hostname localhost...
and ip (such as 192.168.1.3) your_hostname
I disagree with the former comment: it is a severe mistake to map your own hostname to 127.0.0.1! Old RedHat installs did that for years and it has caused lots of trouble along the road, but recent versions have fixed that. Odi.

fix future file timestamps

When installing a new machine you often notice too late that the clock is wrong. By then you may already have created files with a future timestamp. This becomes a problem as soon as you set the clock correctly: make will complain and may even fail to build correctly. Here is how to fix it:
  1. create a file with current timestamp: touch now
  2. fix all future files: find / -mount -newer now -print0 | xargs -0 touch
  3. we no longer need that file: rm now
Mind the order of the find options!
posted on 2010-02-25 17:10 UTC in Code | 1 comments | permalink
Thanks! Worked like a charm!

Fundamental communication theorem

Every programmer should know this one: any protocol over an unreliable medium (such as a network) either allows losing a message or allows accepting duplicate messages. There is nothing in between: you cannot rule out both loss and duplicates at the same time (see below for an explanation).

Unfortunately this doesn't just apply to individual network packets (TCP already handles that case fine). It also applies to larger messages spanning multiple packets: HTTP, SMTP, messaging protocols such as JMS or those of SOA products (MQSeries, ActiveMQ, etc.), remote database protocols, and so on.

Even a simple HTTP GET request exhibits the problem: as long as the client hasn't read the "200 OK" status code, it can't even know whether the request has reached the server. So in the absence of that status code it would have to retry the request, resulting in a possibly duplicate request on the server.

This simple fact has a direct and heavy impact on transactional behaviour: you will have to embed additional data in your protocol to handle loss, misordering and duplicates. If you don't, your protocol is not transactionally safe: you will lose data or end up with duplicate execution of the same transaction.

How can you secure your protocol?
  1. Transaction tokens: the client has to acquire a transaction token from the server and can use that token only once.
  2. Message sequence numbers: the client sends a unique sequential number with every message. If it has to repeat a message it uses the same sequence number again. The server stores the last used sequence number. If it detects a repeated message, it just replays the last response without doing anything. If it detects an older sequence number it discards the message. If it detects a higher sequence number, server and client are out of sync and must renegotiate sequence numbers. NB: timestamps are usually insufficient as sequence numbers because of their limited precision, and they don't let you detect loss. (A sketch of this approach follows below.)
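
Here is a minimal sketch of variant 2 for a single client; the class and method names are made up for illustration and the "transaction" is a stub:

public class DedupServer {
    private long lastSeq = 0;
    private String lastResponse = null;

    public synchronized String handle(long seq, String request) {
        if (seq == lastSeq) {
            return lastResponse;          // repeated message: replay the old response
        }
        if (seq < lastSeq) {
            return null;                  // older sequence number: discard
        }
        if (seq > lastSeq + 1) {
            throw new IllegalStateException("out of sync: renegotiate sequence numbers");
        }
        lastSeq = seq;
        lastResponse = process(request);  // execute the transaction exactly once
        return lastResponse;
    }

    private String process(String request) {
        return "OK " + request;           // stand-in for the real transaction
    }
}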

Explanation

"Unreliable medium" means that messages may be lost or invalidated (scrambled) on the way. So a protocol may choose to detect the message loss. The loss may be detected sender and/or recipient side: Unfortunately that detection is always unreliable as well and it will detect slightly more incidents than actually happened. So the protocol will detect a message loss when in fact the message was received fine. What does it do if a message is lost? Of course it will have to repeat it. Thus duplicates may occur.
posted on 2010-02-23 15:31 UTC in Code | 0 comments | permalink

UTF-8 vs. UTF8

You may have wondered whether the "correct" name of the character set is "UTF-8" or "UTF8". Both seem to work fine in Java. But what about these names in exchanged data like XML files, HTTP Content-Types, etc.?

IANA has the answer. In short: always use "UTF-8". "UTF8" is just a private alias used by the JDK, but not a standardized name. The same goes for ISO encodings: "ISO-8859-1" is the name defined by IANA, "ISO8859_1" is the alias of the JDK.
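
A quick way to see this in the JDK (the class name here is just for the demo):

import java.nio.charset.Charset;

public class CharsetNames {
    public static void main(String[] args) {
        Charset cs = Charset.forName("UTF8");  // the JDK alias works here...
        System.out.println(cs.name());         // ...but prints "UTF-8", the IANA name
        System.out.println(cs.aliases());      // the alias set includes "UTF8"
    }
}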

posted on 2010-02-23 12:45 UTC in Code | 3 comments | permalink
I honestly and truly didn't know! I thought they were both the same, pure evil :). Many thanks.

KV
Good , thanks!!!
thanks. sure cleared that up for me

grub and md raid1

A little trick is necessary to use grub to boot from a software RAID-1 (md). Certain fakeraid devices don't boot properly if the disks don't contain the exact same data in the MBR.

So here is how to install grub on the grub shell:
# first disk
root (hd0,0)
setup (hd0)

# second disk
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)


posted on 2010-02-09 11:51 UTC in Code | 0 comments | permalink

Don't use gdbm

GDBM is a tiny embedded database, convenient as an internal data store for applications. However, as you can see from this Gentoo bug, the database format is highly dependent on the architecture and even on compiler flags. So it is far from portable: you cannot necessarily exchange its files between platforms, or even between machines of the same platform running different OS versions. It may even break during a simple upgrade, like in this case, when the configure flags change.

Having such an unstable database format is inherently bad architecture. I recommend not using this library in any application. It will cause someone a bad day.

posted on 2010-01-05 10:58 UTC in Code | 0 comments | permalink

Unexpected Date.before() / .after() performance impact

The java.util.Date class has two convenience methods, before() and after(). If your application performs these Date operations intensively, watch out for a performance bottleneck here!

Each call to before/after (and even getTime()) may carry the cost of an internal object allocation. However, not always: if you have called toString() on the instance before, an object allocation will happen; otherwise probably not.

To understand this we need to dive into the source code of Date. If you think about it, Date should be nothing more than a wrapper around a long variable holding a system timestamp. This variable is called fastTime. However, the Date class has some methods, long since deprecated, that deal with a calendar date. They are from a time when the original developers were confused about dates and calendars. To support them, Date still needs an internal calendar instance of type sun.util.calendar.BaseCalendar.Date, which is held in the variable cdate. Today this variable is usually null -- as long as none of the deprecated methods have been used on the instance! Whenever Date needs to convert fastTime to a real calendar date, it instantiates a BaseCalendar.Date in cdate. The problem is that toString() also performs such a conversion. Too bad, because toString() is very handy for logging. So any Date instance whose toString() method has been called once will be slow and a memory waster.

The before/after methods are particularly bad because they will clone the cdate object on each call! The getTime() method is not as bad, but may still allocate a GregorianCalendar object.

My advice: never use Date.toString(), or avoid Date altogether and work with long timestamps directly.
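
A small sketch of the difference; the hypothetical FastDateCompare class just contrasts the two styles:

import java.util.Date;

public class FastDateCompare {
    public static void main(String[] args) {
        Date a = new Date();
        Date b = new Date(a.getTime() + 1000L);

        a.toString();                             // from now on, a carries a cdate

        boolean slow = a.before(b);               // may clone the internal cdate
        boolean fast = a.getTime() < b.getTime(); // plain long comparison instead

        System.out.println(slow + " " + fast);    // true true
    }
}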

The bug has been reported to Sun.


posted on 2009-12-07 10:01 UTC in Code | 0 comments | permalink