Security: Don’t blindly trust $_SERVER variables

Article one: Chris Shiflett: SERVER_NAME Versus HTTP_HOST
Article two: Sean Coates: XSS Woes

I thought both were quite insightful, though Sean’s stands-out more because PHP_SELF is used so often (be sure to check-out his phpinfo(); example at the end of the post). One method, as he explained, is to put the $_SERVER[‘PHP_SELF’] variable into htmlentities() so at least you aren’t potentially echoing dangerous output.

Heed Chris’s advice: treat $_SERVER variable just as you would $_GET and $_POST – all have the potential to be tainted.

Digital Fish Library in the news!

The DFL is in the news again!

Included in the most recent updates: News segment from NBC 7/39 in San Diego, and the press release at the SIO website.

Also, link to our little spot in Nature (one of the most prestigious scientific journals in the world). Nature 440, 396-397 (23 March 2006)


The state of PHP Frameworks

Below is a comment I posted on a friend’s dev blog. Looks like he’s turning RoR because RoR just seems pretty sweet to him (originally a Perl programmer, so he welcomes the improvements Ruby’s brought to the field). Just a bit of background: Doug is my right-hand man on Flood’s new website. He’s pretty new to PHP, but has done a really good job picking it up. I should also mention he’s a FANTASTIC DB guy. If you want to query it, he can probably figure it out.


PHP has several pretty good frameworks, but that shouldn’t be a problem for people in the PHP community. We’re used to having options, and lots of them. After listening to a few talks about the new Zend Framework, I think it’s going to be a pretty big deal fairly soon. What’s most exciting, for me at least, is the fact that the community isn’t being snobbish about the ways things should be done – they aren’t sticking to any one game plan (like copying one language or another). Rather, they’re pulling the best features from many languages and combining them into the framework. So RoR may be great, but it also has shortcomings to it, which the ZF community is making sure they don’t repeat.

RoR has some great things about it, or so I hear. I don’t doubt it one bit, but there is soon to be stiff competition from the PHP end. I also like to console my self, knowing that the RoR camp tends to have a few very vocal cheerleaders, so maybe it’s not as big as it may seem. But for now, Ruby just happens to have the “best” (most developed?) framework on the web. I say good for Ruby. In the end what matters most is that FOSS becomes better by building on the strengths of other projects.

–end post–

Most popular posts

So far it looks like my NVDIA Fedora Core 4 video driver install post is by far the most popular. Before we know it, I’ll probably end up doing a Fedora Core 5 NVIDIA video driver install tutorial (though I can’t imagine it would be much different).

Coming in a close second is the rsync backup tutorial from not so long ago. It has actually been working really well so far. The scripts run faithfully at their predestined times and about 2.5 hours later I have a full backup.

A distant third is the Zend Framework review I wrote. Honestly, there probably aren’t a whole lot of people interested in the topic, and I could have done a better job. There just seemed to be some problems installing it that went unresolved (and I didn’t quite have the time to figure them all out before my mini review). I have some new fodder for discussion, though, now that my good friend Doug is now developing a site in RoR for work.

Please comment on the content of this blog. I’d really like to begin posting more about things people are interested in. Certainly I’ll continue to post on all sorts of things I find myself doing, but it also wouldn’t hurt to expand on some more than others. One thing to come will be deploying a site on a 50-node rackmount server farm for the DFL. We’re not using 50 nodes because we think we’ll have enough web traffic to utilize that much horsepower. Rather, it’s for web-based scientific computing – all controlled through the PHP-based site.

So… let me know what you want to see!

Storing hierarchical data in a database: Part 1

(Click to see part 2a and 2b)
In this first of two (or three) posts, I’m going to present a fairly conventional way of representing hierarchical data in a database. (The original article I read that got me interested on posting these ideas can be found here, hence the similarity in the tree structure).

My DB of choice is MySQL, so you may have to modify your queries slightly for other DB engines.

Before we even begin, we should consider why we’d even want to do something like that – what can you even use it for? Animals. It’s something we can all relate to. If, for some reason, you needed to catalog the relationships of different animals to one another, you’d have to come-up with some sort of tree diagram to make sense of it all. Take a look at this one, for example:

What else can you say about it? Pretty simple. I particularly like fish, so I expanded a little bit there (later we’ll see how we can expand our tree to include different breeds of dogs and cats). Also notice that you can really create many different types of trees. One tree that I’m working on, for example, will demonstrate the taxonomic relationships between different species of fish for the DFL. Often times the employee/supervisor scheme is used for demonstration purposes.

The first method of storing our data is by simply indicating to which parent does each item belongs. Take a look at this table:

*Note: I chose to have parent=0 to mean that’s the top-level of our tree (there is no row id 0).

That wasn’t too painless, now was it? The parent_id column basically points to the node (location in the tree) to which that particular item belongs. For example, cat (id 9) has a parent_id 3, which is the id of “Mammal.” Using this information we can reconstruct the tree above (it probably won’t look quite like that, but the structure of the data is what we’re concerned with).

There’s essentially only one basic query we need to achieve our goals:
$query="SELECT * FROM animal_tree where parent_id=$id";
Easy, huh? but it doesn’t really do anything for us. Say we wanted to select Animals->Fish->Marine->all, how would we do that?

In order to make this a bit more automated, since that’s really what we’re trying to do, we need to make use of a recursive function. What is that? you may ask… Well, it’s essentially a function that, at some point, calls itself from within. Here’s a quick example:

This function keeps calling itself until $number equals 100, at which point it kills the script, telling us we’re all done. If we’re going to create a recursive function for our tree, we’re going to need a few modifications to our code first…

That wasn’t too painless, now was it? If this still seems a little to difficult or abstract, don’t worry. Recursive function seem exactly that when you first learn them.

If you ever wanted to find the path to “Halibut,” you can create a similar function that find the parent_id of “halibut”, run the function to find the parent’s parent, so on and so forth.


Digital Fish Library Press Release

Today is the big day – our press release is now out, and even featured on the SIO (Scripps Institution of Oceanography) website. In just under an hour we’ll begin scanning a juvenile great white shark as part of the press “event.”

If you haven’t read my other posts on the DFL, here’s a brief recap: We’ve been given an NSF grant to digitize most genera of marine fish (some freshwater) in the world using magnetic resonance imaging (MRI). The end goal of this has several application points: 1) Fish anatomy is digitized, so researchers can perform virtual dissections and explore the fish’s organs without physically destroying the real specimens; 2) These data, along with software tools we’re creating, will be used as teaching tools in high school education programs, provided by the Birch Aquarium as Scripps, 3) the availability of data will encourage conservation practices, as destroying real tissues will become less and less a necessity to perform common anatomical research – something especially important when dealing with rare specimens.

The press release is at

More importantly, the DFL website is located at


Zend Framework Quick Review

First off, I’ve been having some problems getting things up and running due to a few errors in the code. Tried some of the fix suggestions by Andi Gutmans here, but to no avail. I also didn’t try too hard to correct these problems. After all, I was more interested in seeing how the code works than I was actually seeing work.

Want to do a Flickr search? Five lines of code!

require_once 'Zend/Services/Flickr.php';
$flickr=new Zend_Service_Flickr('api_key_here');
foreach($photos as $photo){
echo'Thumbnail->uri .'" />
echo $photo->title."

To import an RSS feed:

$feed = Zend_Feed::import('');
foreach($feed->items as $item) {
echo "

echo $item->link() . "


So, by these code examples alone there’s a lot of promise for this framework, especially considering the amount of development overhead involved in coding similar functionality in your own applications. From the code I’ve seen these demos I’ve copied here are fairly representative of how the Framework is used. At the moment its much too early to begin using the Framework in a production environment, but keep an eye on its development, as it will soon be one of THE tools for rapid PHP app development.

One final note: Check-out the webcast I linked to in a previous post for more details. No point in me duplicating the content here. Check back in the future for further reviews as this Framework develops.


Zend Framework review coming

Version 1.0 of the Zend framework is finally out:

I’m going to install it and do what I can over the next couple days and report back to you. Another link, for the interested: Zend PHP Framework Webcast

Newsvine – out of beta

Well folks, Newsvine is finally out of private beta!

It’s a great spin on aggregating news over the net. First, it claims to put wire news up faster than any other site because it skips all the editorial processes that most news websites (think CNN) go through before anything makes a page. Thousands of articles are instantly available from the AP, ESPN, and others.

Second, users have the chance to seed articles. In other words, you can point other readers to news elsewhere on the net that you find particularly interesting or worthwhile.

Third, it allows users to submit their own articles – much like a type of blog. But it’s more than just a blog. Consider it a place to write news articles and editorial pieces for others to read. Gain a following. Earn money through advertising click-throughs.

One thing that really makes this site stand-out is the amount of news on it… and it’s well-organized. Really well organized. Users also have the opportunity to vote for articles, pushing them up the page. In other words, the more popular content gets more exposure.

Check it out! Newsvine

Incremental backups with rsync… UPDATE

One week later the scripts seems to be doing their job just fine. The primary server is around 56GB right now, and each day is taking 150-160MB in updates.