php

PHP Frameworks and Libraries Survey

Over the years, there have been many simple polls about PHP frameworks, libraries, and other PHP periphery.

I've created a more comprehensive survey about PHP frameworks, libraries, databases, evaluation techniques and other assorted tidbits in an attempt to better understand how people pick the libraries they do. My hope is that if enough people complete the survey, a picture will form that illustrates some itches that so many projects have been created to try to scratch.

I intend to share some (perhaps MOST) of the information gathered -- so don't worry that all this is going into a black hole somewhere.

The first version of the Killersoft PHP Frameworks & Libraries Survey is available here: http://j.mp/phpsurvey

Take a few moments to fill it out if you can! It should only take 10 minutes (or less!).

5 Reasons Simple Cloud is a Dark Cloud

When it comes to technology, everyone thinks they want interoperability. Using SQL, JavaScript or map APIs? There's an abstraction layer for that (SQL, JavaScript, map APIs). Writing mobile applications? There's an abstraction layer for that too (PhoneGap, Appcelerator, among others).

Developing "cloud-native" applications? Soon there'll be an abstraction layer for that too. Yesterday Zend announced the Simple Cloud API, a set of PHP classes for "writing scalable and highly available applications that are still portable."

Sounds great, right? Not so much. As Chief Architect of a PHP-based "cloud-native" application that's handling hundreds of millions of requests per month, there's no way I'd consider using Simple Cloud. Here are five reasons why you shouldn't either:

1. Abstractions leak. It's never as easy as it seems, which becomes painfully obvious at the least opportune moment. Let's have a show of hands of how many people have launched a SQL-abstraction based web application, only to find later that the crazy-ass SQL your toolset generated was 100x slower than one you'd have cooked up on your own. Okay then. The Law of Leaky Abstractions drags us down when it's time to figure out what's broken.

2. The lowest common denominator rules. Abstraction layers often talk about unifying access to technologies where the differences are insignificant. Which is funny, since the providers of those technologies introduce differences to be competitive. Do you really want your web applications to be built around the elements of cloud services that are all the same? Your competition will likely be focusing on leveraging cloud advantages to the max, which will inevitably mean taking advantage of features that a lowest common denominator API ignores.

3. The true leaders don't participate. Simple Cloud lists contributors including Microsoft, IBM, Rackspace, Nirvanix and Go Grid, and dubs these companies as "leading the cloud revolution." Note how the three published APIs all include support for an Amazon Web Services solution. Who's really leading here? Amazon is leading the cloud revolution, and has been all along. Everyone else is just playing catch-up.

We've seen a similar situation recently. Remember OpenSocial? Google, MySpace, Yahoo and a few others banded together when Facebook handed their asses to them a couple of years ago. Facebook has never participated, and continues to lead the pack in social platform application development.

Of course all the companies that are not leading want to band together for the sake of interoperability, since interoperability means a greater chance of survival. But as a developer, be careful of getting roped into these company-serving agendas--it will cost you more than it costs them, with little return on your investment.

4. Native adoption isn't hard. Have you looked at the Amazon Web Services APIs, and noticed how many of them actually use the term "Simple" in their full descriptions? SDB is short for SimpleDB, S3 is short for Simple Storage Service. Is an API simplifying these already-simple APIs really necessary? If you're looking for a reduced feature set and inability to leverage each service to the maximum, sure! Otherwise, bite the very tiny bullet and learn about each service that you need. They're not hard. Trust me.

5. Face it: You'll probably never move. Despite portability claims, you'll likely never move from one service to another willy nilly. Consider the last significant web application you built around the database of your choice. Even if you used an abstraction layer, do you really think you could just pick it up tomorrow and move to a competing database engine? Whenever this scenario is truly considered, it's a much more complicated problem than just swapping out the underlying RDBMS. An audit must be done of all code to make sure no one snuck in a workaround query that included vendor-specific features, a new backup system needs to be evaluated (odd that no one has cooked up an SQL backup abstraction layer!), performance testing needs to be done ... all for what? Try doing this in a real company and see how quickly your plan gets shot down. Real companies have real business problems to be solving, and rarely have spare engineering cycles to waste on switching technologies that lie behind abstraction layers. So, just because you could doesn't mean you ever will.

With all that said, don't get me wrong: I love it that there are multiple providers getting into the cloud computing game in a meaningful way. I am actively evaluating each service and looking at ways to spread my cloud dependencies around to reduce my exposure to single-company cloud fail.

However, with each set of services I evaluate, I look for exactly the kinds of differentiating details that abstraction layers strive to hide, and will build out my application to leverage each environment to the max. My company and the services my customers consume deserve the very best utilization of upstream resources possible. Don't yours?

New PHP-focused Yum Repository

PHP 5.3 is now yum-installable.

Ever been frustrated that the latest this-or-that package for PHP is bogged down in Big Distro Packaging politics? I have been. That's why I've put together a deliberately-current-as-possible repository for PHP RPMs.

The repository is currently i386-only, though I'll be adding x86_64 packages within the next week or so. Also, "regular" and "debug" builds are available for all packages, so that users can be more helpful in troubleshooting whatever issues they find.

The current list of packages, all the RPM spec files for building them, and information on how to set up the repository for use on your CentOS5/RHEL5 servers are available at the killersoft-yum Google Code site.

Have fun!

Update: I forgot to mention one of the things I think is interesting about the way I've packaged the PHP modules in the repository. Each supported SAPI -- apache2handler (mod_php), fastcgi and cli -- has its own configuration path and php.ini file:

/etc/php.mod.d/php.ini

/etc/php.cgi.d/php.ini

/etc/php.cli.d/php.ini

Within each directory, a subdirectory for per-extension config is present.

If you don't like this setup, no problem: just symlink them all together and you won't know the difference. However, I think it's convenient to be able to easily specify extensions that you want to use in mod_php that you don't want to use in the PHP CLI, and vice versa. If you need an ultralight PHP CLI for some reason, you can have it without having to hobble mod_php that's running on your website.
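If you want to confirm which files a given SAPI actually picked up, a small script along these lines can help. This is only a sketch: describe_php_config() is a name I made up for illustration, and the paths it reports depend entirely on how your build is configured. It relies on the built-in php_sapi_name(), php_ini_loaded_file() and php_ini_scanned_files() functions.

```php
<?php
// Report which configuration files the current SAPI loaded.
// describe_php_config() is a hypothetical helper, not part of the repository.
function describe_php_config()
{
    $loaded  = php_ini_loaded_file();   // false when no php.ini was loaded
    $scanned = php_ini_scanned_files(); // false when no scan directory is in use

    $extra = array();
    if (is_string($scanned)) {
        // The list is comma-separated, with whitespace around each entry.
        foreach (explode(',', $scanned) as $path) {
            $path = trim($path);
            if ($path !== '') {
                $extra[] = $path;
            }
        }
    }

    return array(
        'sapi'    => php_sapi_name(),
        'ini'     => ($loaded === false) ? null : $loaded,
        'scanned' => $extra,
    );
}

print_r(describe_php_config());
```

Run it once from the command line and once through the web server, then compare the two reports to see the per-SAPI split in action.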

AWS for PHP Developers

To offset the length of the article, I'll be brief: my deep dive into Amazon's PHP libraries for AWS is now available. Enjoy!

AWS Elastic MapReduce Supports PHP

It's not blatantly obvious, or mentioned in detail, or even shown in any of the examples or introductions to the service.

But it's true: Amazon's new Elastic MapReduce service supports mapper and reducer scripts written in PHP. PHP 5.2.6 is installed in the environment created by the MapReduce job flows.
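To give a feel for what such a script might look like, here is a minimal word-count mapper sketch. It assumes the usual streaming convention of reading input lines from standard input and writing tab-separated key/value pairs to standard output. The function names map_line() and run_mapper() are my own, and the code sticks to syntax that PHP 5.2 understands.

```php
<?php
// Turn one line of input into "word<TAB>1" records.
// map_line() and run_mapper() are hypothetical names used for illustration.
function map_line($line)
{
    $records = array();
    $words = preg_split('/[^a-z0-9]+/', strtolower($line), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $records[] = $word . "\t1";
    }
    return $records;
}

// Read lines from $in until EOF and write the mapped records to $out.
function run_mapper($in, $out)
{
    while (($line = fgets($in)) !== false) {
        foreach (map_line($line) as $record) {
            fwrite($out, $record . "\n");
        }
    }
}
```

A script used as the actual mapper would end with a call to run_mapper(STDIN, STDOUT); a matching reducer would then sum the 1s for each word.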

PHP Did Not Cause Facebook Code Leakage

Facebook experienced a technical glitch over the weekend. The nature of the glitch was that the source code for the Facebook homepage was displayed instead of the result of the execution of that source code. Widespread news of the glitch first broke in this TechCrunch article by TechCrunch writer and OmniDrive founder Nik Cubrilovic.

I agree with Cubrilovic that the inadvertent delivery of source code instead of the result of that source code is certainly a horrific situation, with potentially serious ramifications for any company that experiences such a problem on a large scale basis.

That a company like Facebook, currently a hot ticket for searches, articles and blog posts, would experience this kind of problem is noteworthy.

Unfortunately, the updates appended to the article imply that PHP is somehow responsible for this leakage. In the first article update, Cubrilovic states:

It seems that the cause was apache and mod_php sending back un-interpreted source code as opposed to output, due to either a server misconfiguration or high load (this is a known issue).

On the first of Cubrilovic’s suggested causes, server misconfiguration: well, duh.

Of course servers will behave strangely if they are misconfigured. The world of a system administrator is one of details, and when it comes to managing load balanced web servers for an extremely high-traffic destination like Facebook, it’s a world of a large number of details. Miss one of them and things will predictably start breaking in unpredictable ways.

On Cubrilovic’s second allegation: It’s “a known issue” that PHP barfs out source code under high load? I’ve been writing PHP code for some very, very high traffic websites for over 10 years, and this is the first I’ve heard of this.

Surely we in the PHP community would have heard from someone like Rasmus if PHP were prone to puking source at a high load. As an infrastructure architect at Yahoo!, Rasmus has likely seen how PHP behaves under load levels most of us only fantasize about. If PHP coders were building their applications on a platform pre-destined for Twitter-like failures, no doubt we’d have heard about it by now.

Can anyone provide links to articles or posts indicating that PHP will eject application source under a heavy load?

It’s infinitely more likely that Facebook’s problems were caused by a system administrator breaking some web server configuration (possibly not even PHP-specific configuration), or a new installation of a mod_php build that hadn’t been tested properly in a non-production environment.

Cubrilovic’s second amendment to his article links to an article on his own blog, Learning from Facebook: Preventing PHP Leakage.

Given the likelihood of this issue’s cause being server misconfiguration, it is disturbing that Cubrilovic’s first tip for avoiding this kind of problem is to install and correctly configure the powerful and complex Apache module mod_security. After all, if a sysadmin can’t get Apache and PHP configured properly, how likely is it that they’ll be able to get two modules configured properly?

The rest of Cubrilovic’s tips also relate largely to web server configuration, such as making certain files inaccessible from direct requests.

The disappointing part of the FUD that Cubrilovic is spreading is the implication that anything more than decent release practices is necessary to address and avoid the problem Facebook experienced.

I can only imagine why Cubrilovic has invested this weekend in undermining people’s faith in PHP’s reliability under heavy load. What I can tell you, though, is this:

PHP doesn’t cause website problems and inadvertent code leaks. People making mistakes while using PHP and other powerful tools do.

However, that fact isn’t worthy of two articles, so perhaps that’s why Cubrilovic went with the PHP-as-boogeyman-that-must-be-defended-against approach instead.

What’s the take-away from all this? Servers are powerful, and can be complicated. Tread carefully. Don’t roll new configurations of web servers and related modules out to production without first testing them in an identical staging environment.

Know what you’re doing, and do it carefully.
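As one example of a cheap release check, a script like the following could be pointed at a staging server after each configuration change. It is only a sketch under some assumptions: contains_php_source() and check_url() are hypothetical helpers, the check only looks for a raw PHP open tag in the response, and fetching a URL with file_get_contents() requires allow_url_fopen to be enabled.

```php
<?php
// True when a response body contains a raw PHP open tag, which suggests
// the server returned source instead of executing it.
function contains_php_source($body)
{
    return strpos($body, '<?php') !== false;
}

// Fetch a URL and classify the result. Requires allow_url_fopen.
function check_url($url)
{
    $body = @file_get_contents($url);
    if ($body === false) {
        return 'unreachable';
    }
    return contains_php_source($body) ? 'LEAK' : 'ok';
}
```

Calling check_url() on a handful of key pages in staging (the URLs are yours to supply) gives a quick signal before anything reaches production.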

PHP and JSON: Cut #987

JSON Decoding in PHP 5.2.1 is Broken

As of PHP 5.2.1, json_decode() no longer follows the published standards for JSON-encoded texts.

Why not? For no reason other than the convenience of those ignorant of JSON standards.

Prior to PHP 5.2.1, this:

var_dump(json_decode('true'));

resulted in:

NULL

As of PHP 5.2.1, it results in:

bool(true)

Nice and handy, perhaps … but a blatant violation of JSON specifications, since 'true' is not a valid JSON encoded text.

A little history

Back in August, I spent a lot of time with JSON. I was working on adding Prototype/script.aculo.us support to the Solar Framework, and wanted a handy utility for passing options around in JavaScript with JSON.

Rather than roll some new JSON interpreter, I chose to leverage the Services_JSON package along with ext/json to build a package that was compatible with ext/json, but usable for those who did not have that extension installed. With compatibility built in, developers could move application code back and forth between systems without having to worry about whether or not the extension was installed — if it was, the application would benefit from the added performance of a native extension. If it wasn’t, everything should work exactly the same way.

In the course of my research for the Solar_Json package, I learned a lot about JSON and how it is supposed to behave. JSON.org is a spartan but complete resource about the format, and includes the JSON Checker and a comprehensive JSON test suite. There’s also a link to RFC 4627, which details JSON’s structure in a proposal for the formal application/json media type.

While digging through all this JSON goodness, I came to appreciate ext/json’s strict adherence to JSON’s format. The version of ext/json bundled with PHP 5.2.0 (version 1.2.1) was right on the money in its parsing, and by the time I was done, Solar_Json matched it every step of the way. To verify ext/json compatibility, I wrote a series of unit tests (26 in all) confirming that Solar_Json’s “pure PHP” implementation matched ext/json’s output exactly.

It was challenging, but it worked out well. The result was Solar_Json.

JSON Bundled with PHP, Confusion Ensues

To my delight, ext/json was bundled with PHP 5.2.0, and enabled by default. This was great news for PHP developers everywhere who are working with rich applications that need to exchange a lot of data with JavaScript.

All was good, for a while.

(Yesterday, Paul M. Jones re-ran the JSON unit tests I'd written for Solar_Json using PHP 5.2.1 in preparation for a new release of Solar. He mentioned that some of the tests started failing, which sparked this discussion. Thanks, Paul!)

Sure, a couple of people (myself included) didn’t fully understand JSON for a while. I even opened (and quickly closed) a bug about the way I thought certain strings should be decoded by the json_decode() function. Others had the same confusion.

What’s so confusing?

Well, the common thing that people want to do is something like this:


var_dump(json_decode(true));
// or
var_dump(json_decode('true'));

The confusing part about these two snippets is that, under PHP 5.2.0, they both return NULL instead of true. Based on the two bug reports (#38440 and #38680), it’s common for people to expect the output to be a boolean true in these examples.

However, if you understand JSON at all, you’ll know that NULL is a perfectly reasonable result, because true and 'true' are not valid JSON texts.

Standard, Shmandard

Note that I said “if you understand JSON at all”, you’ll realize that NULL is a perfectly reasonable result when attempting to decode an invalid JSON text.

I should amend that: If you understand JSON at all and actually care about standards and compatibility, you’ll realize that NULL is a perfectly reasonable result of parsing an invalid JSON text.

Just like other formats, there’s actually a specification that defines what is valid and what is not when it comes to JSON. No, really.

Section 2 of RFC 4627 states very plainly:

A JSON text is a serialized object or array.

JSON-text = object / array

Translated, that means that a valid JSON-text is either an object or an array. It’s not a string literal, an integer, or a boolean. The list of what a valid JSON-text can be is short. It can be an object. It can be an array. It can be … whoops, that’s it. An object, an array, or it just isn’t JSON.
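To make that grammar concrete, here is a minimal sketch of a wrapper that enforces it. The function name rfc4627_decode() is hypothetical (it is not part of ext/json or Solar_Json), and it only inspects the first non-whitespace character before handing off to json_decode(), so treat it as a gatekeeper rather than a full validator.

```php
<?php
// Decode $text only if it can be a JSON-text as defined above:
// an object or an array. Anything else yields NULL.
// rfc4627_decode() is a hypothetical helper for illustration.
function rfc4627_decode($text)
{
    if (!is_string($text)) {
        return null;
    }

    // Skip the insignificant whitespace the grammar allows.
    $trimmed = ltrim($text, " \t\n\r");
    if ($trimmed === '') {
        return null;
    }

    // A JSON-text must open with an object or an array.
    if ($trimmed[0] !== '{' && $trimmed[0] !== '[') {
        return null;
    }

    // Decode objects as associative arrays for easy comparison.
    return json_decode($text, true);
}

var_dump(rfc4627_decode('true'));    // bare literal: rejected
var_dump(rfc4627_decode('[true]'));  // array containing a literal: accepted
```

Wrapping the literal in an array is also the simplest way to pass a lone boolean around while staying inside the grammar.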

I mean, think about it: JavaScript Object Notation. Not JavaScript Boolean Notation. Not JavaScript Assorted Stuff Notation. Objects. Arrays thrown in because in JavaScript, they’re basically the same thing. Period, the end.

DAMN, that’s inconvenient, you may be thinking. Yep, it is. But, it is what it is. If you don’t like it, submit an RFC to have it changed. That’s the way this crazy thing called the internet works.

Put another way: if you don’t like it, you do not just start making things up. Apparently, enough people unclear on the concept of JSON complained about their lack of understanding that PHP now just does whatever it wants with JSON. Check this out for the details. (And to reiterate, I’m not knocking the people who aren’t clear about JSON. I was one of them too, up until I actually researched how JSON is supposed to behave.)

Imagine if the core team behind every language did that. Hey, if you don’t like the standards, just ignore them! We can explain it away with documentation, right?

Cut #987

The cavalier attitude taken by the PHP internals team on this issue is inexcusable. Yep, cavalier — a colleague who spoke to a member of the PHP internals team about this change confirmed that the break from the JSON spec is deliberate and intentional.

To make matters worse, the version number of ext/json did not change between PHP 5.2.0 and PHP 5.2.1. In both releases, ext/json claims to be at version 1.2.1, despite this significant change.

While some are lobbying to compile the definitive business case for PHP (and I even piped in and agreed that it was necessary), some PHP internals folks are effectively shooting that effort in the foot by disregarding published standards.

I’ve spent the better part of the last two years defending my choice of PHP 5 as my preferred language, first at Feedster, now at Mashery. With all the buzz about other languages these days, the case for PHP is getting harder to make. Incidents like this will not make the case for PHP any easier.

Is this a big flap over a little thing? That’s certainly one way of looking at it. I see this flagrant disregard for published specs as one more cut toward a death by a thousand cuts.

Talented and notable developers are dropping PHP, or seriously considering other languages. If PHP’s next 10 years are to be as poignant as its first, a significant attitude adjustment is required.

Monitor PHP Extension Releases with Y! Pipes

Like many of my fellow geeks, I've found a few moments to play around with Yahoo Pipes over the last couple of days.

The first pipe I've created and published is the PHP Extension Monitor. It's an aggregated feed that pulls in release information on several cool extensions that aren't announced in the PECL feed, such as Suhosin, XCache and DBXML.

Since I use all of these extensions at Mashery, I thought it would be nice to have an aggregated feed to keep track of those releases along with the PECL releases feed.

So, please check it out. Hope it's helpful to someone else out there. Please make a note in the comments if you'd like to see additional cool extensions added to the feed.

My thoughts on the creation process: Pipes is a cool tool, and a very impressive bit of UI scripting. As Aaron noted, the actual generation of the feed is quite slow. I hope they have some kind of caching in place. :)

If the PHP Extension Monitor feed is a popular idea, but Pipes proves to be too slow to handle it, let me know and I'll create a more traditional back-end aggregator for it.

Ohloh Reports May Paint an Inaccurate Picture

Through no fault of the Ohloh tool itself -- it can only report on what it's told, after all -- the reports that Ohloh generates should not be considered The Gospel.

After cruising the site for a few minutes today, oooohing and aaahing like many others have been, I popped open a project I'd never heard of: PHPSurveyor.

Ohloh speculates that PHPSurveyor has 731,822 lines of code, and would cost approximately $11 million to reproduce.

Wow.

Upon closer inspection of the history of the project on Ohloh, it appears that the Ohloh crawler is looking at the root of the PHPSurveyor SVN repository, which happens to contain all their releases AND their current working trunk.

I'm sure that Ohloh is using a somewhat different algorithm, but just to get a ballpark comparison, I ran a recursive wc -l on an 'svn export' of the full repository, excluding image files (.gif/.jpg/.png/.svg) and .txt files:

Lines       Path
=====       ====
1,098,024   / (full repository)
  288,618   /source
  273,544   /source
              minus a dir called "rewrite"
  195,198   /source/phpsurveyor
              minus 3rd party ADODB, phpMailer and PEAR

That's quite a difference. I can't be sure, but in Ohloh-speak I think I just saved someone over $9 million. :)
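The counts above came from wc -l, but for anyone who wants to repeat the exercise from PHP, something like this sketch gives the same kind of number. The function name count_lines() is my own invention, and like wc -l it simply counts newline characters, so it makes no attempt to separate code from comments or blank lines.

```php
<?php
// Count newline characters in every file under $root, skipping files
// whose extension appears in $skipExtensions (lowercase, no dot).
// count_lines() is a hypothetical helper for illustration.
function count_lines($root, $skipExtensions)
{
    $total = 0;
    $files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($root));

    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue; // ignore directories and dot entries
        }
        $ext = strtolower(pathinfo($file->getFilename(), PATHINFO_EXTENSION));
        if (in_array($ext, $skipExtensions)) {
            continue;
        }
        $total += substr_count(file_get_contents($file->getPathname()), "\n");
    }

    return $total;
}
```

Pointing it at an export with array('gif', 'jpg', 'png', 'svg', 'txt') as the skip list mirrors the exclusions I used.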

PHP MimeViewer for Trac

As John Herren pointed out in the comments to yesterday's post, my suggestion for PHP syntax coloring wasn't working for several versions of Trac.

After a little digging, it turned out that my CSS rules were based on the non-standard SilverCity highlighting output, rather than the newer and more common native PHP highlighting.

While reviewing the output from the native PHP highlighter, I noticed that Trac's manipulation of the output from highlight_string() isn't quite right. The way things are now, as of Trac 0.10rc1, an intended multi-line docblock only has the first "/**" highlighted as a comment ... the rest of the docblock is highlighted as default PHP (the code-lang class).
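For reference, here is a quick way to see what the native highlighter emits for a multi-line docblock before Trac touches it. This is just a sketch: the sample code is made up, the exact markup varies between PHP versions, and the colors come from the highlight.* ini settings on your system.

```php
<?php
// A small sample containing a multi-line docblock.
$sample = "<?php\n/**\n * Adds two numbers.\n */\nfunction add(\$a, \$b) { return \$a + \$b; }\n";

// Passing true returns the highlighted HTML instead of printing it.
$html = highlight_string($sample, true);

// The highlighter colors comments using the highlight.comment ini setting.
$commentColor = ini_get('highlight.comment');

echo $html, "\n";
echo "comment color: ", $commentColor, "\n";
```

In the raw output, the docblock should be colored as a comment from its opening marker through its closing one, which is the behavior I'd expect to survive in Trac.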

I've been clamoring for good PHP syntax coloring output in Trac for a long time. Rather than pass the buck and put this on the Trac team, I've released a little PHP-based shell script called php_mimeview.

With php_mimeview installed, simply update the [mimeviewer] config section of trac.ini in your Trac project to point php_path to the full path location of php_mimeview. With that script, plus these rules in your site_css.cs template file:

[source language=":css"]
.code-keyword { color: #007700; }
.code-lang { color: #0000BB; }
.code-comment { color: #FF8000; }
.code-string { color: #DD0000; }
[/source]

... you'll get what you're probably used to seeing in syntax-highlighted PHP code.

Thanks to John Herren for the nudge to investigate this further.