Tag Archives: PHP

Heartbeat logging while consuming Twitter streams using Phirehose

Phirehose is an awesomely useful Twitter Streaming API client library, written in PHP by Fenn Bailey.

Heartbeat logging is something that I originally added for Rainmaker, and I finally got around to contributing those modifications, which you can see here on GitHub.

Why log heartbeats in Phirehose?

  • To gain assurance that Phirehose is still alive, and actually functioning. In our case, missing tweets means lost money and unhappy clients. We needed to monitor this very closely.
  • To enable automatically detecting connection drops and rewinding the count parameter to pick up those tweets, or backfilling them in using the Twitter Search API.
  • To collect usage data for reporting purposes.

Usage

To use this, simply declare a heartbeat(array $data) method in your Phirehose child class.

Solved: “Access Denied” errors when calling signtool.exe from PHP

SIGHntool, why must you give me such grief?

I have spent the last 8 hours trying to figure out why Microsoft’s signtool.exe code signing utility refuses to work when called from PHP’s system() or shell_exec() functions on my WAMP server:

C:\build> "C:\Program Files\InstallMate 7\Tools\signtool.exe" sign /v /f codesignedcert.pfx Setup.exe 2>&1

The following certificate was selected:
    Issued to: <redacted>.
    Issued by: UTN-USERFirst-Object
    Expires:   5/12/2012 6:59:59 PM
    SHA1 hash: <redacted>

Done Adding Additional Store

Attempting to sign: C:\build\Setup.exe

Number of files successfully Signed: 0
Number of warnings: 0
Number of errors: 1

SignTool Error: ISignedCode::Sign returned error: 0x80090010

	Access denied.

SignTool Error: An error occurred while attempting to sign: C:\build\Setup.exe

Note: the 2>&1 at the end of the signtool call is essential if you want to capture error messages which are emitted to STDERR instead of STDOUT. Yes, I lost an hour or two just on that.
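A minimal sketch of capturing both streams from PHP. The helper name run_and_capture() is hypothetical (it is not part of PHP or signtool); it simply appends 2>&1 so that anything written to STDERR ends up in the captured output alongside STDOUT:

```php
<?php
// Hypothetical helper: runs a command and returns its exit code together
// with everything it printed, including messages written to STDERR.
function run_and_capture($command) {
    $lines = array();
    $exitCode = 0;
    // 2>&1 redirects STDERR into STDOUT so exec() can capture both.
    exec($command . ' 2>&1', $lines, $exitCode);
    return array(
        'exit_code' => $exitCode,
        'output'    => implode("\n", $lines),
    );
}

// Example: this child PHP process writes only to STDERR.
$result = run_and_capture(
    escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg("fwrite(STDERR, 'oops');")
);
echo $result['output'], "\n";
```

Checking the exit code as well as the text is worthwhile, since a failing tool may print nothing at all.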

Dead ends

  • Windows 7 apparently sets the read-only attribute on all files, and it isn’t easy to turn that attribute off. But since other file operations worked from PHP, this wasn’t the issue.
  • Prefacing the signtool call with CMD /C didn’t help.
  • Setting full control file permissions on the C:\build folder for Guest, SYSTEM, and any other user account I could think of didn’t help either.
  • Wrapping signtool in a batch file was an exercise in futility.

The maddeningly frustrating thing was that signtool worked great when called from the command line — just not from PHP!

An aha! moment

The issue turned out to be pretty stupid, as they usually are. I merely had to change the account that Apache was running as to that of a normal user, instead of the default local system account.

services.msc - changing the Apache user

Uploading large files: covering all the bases

When uploading a file to a PHP script on an Apache web server, there are several configuration options that, if improperly set, can get in the way. I just encountered yet another one of these, and decided to catalog them here.

Size, Time, and Memory

There are three types of limits that affect file uploads, and the weakest link in the chain is your effective limit.

If your size limit is set to 3 GB, but your time limit does not allow for the time required to upload that much data, you’ll still be unable to upload those large files. Likewise, the ability to upload does no good if you do not have enough memory to process the file that was uploaded.

Assumptions

This post assumes an 8 MB upload limit (8 x 1024 x 1024 = 8388608 bytes). You will want to adjust this number up or down according to your needs.

Oddly, although 8 MB is the default value for PHP’s post_max_size setting, some of the other default settings are much lower (2 MB for upload_max_filesize, or in some cases only around 100 KB).

PHP limits

upload_max_filesize = 8388608
post_max_size = 8388608
max_input_time = 60

Depending on what you’re doing with the uploaded files, you may also need to increase your memory limit:

memory_limit = 64M
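As a rough illustration of the "weakest link" idea, here is a sketch that works out the effective size limit from the two PHP size settings. The function names ini_size_to_bytes() and effective_upload_limit() are hypothetical helpers, and the sketch covers only the size limits, not time or memory:

```php
<?php
// Convert a php.ini shorthand size such as "8M", "512K" or "1G" to bytes.
// Plain numbers (e.g. "8388608") are returned unchanged.
function ini_size_to_bytes($value) {
    $value = trim($value);
    $number = (int) $value;                  // leading digits, e.g. 8 from "8M"
    $unit = strtoupper(substr($value, -1));  // last character, e.g. "M"
    switch ($unit) {
        case 'G':
            $number *= 1024; // fall through
        case 'M':
            $number *= 1024; // fall through
        case 'K':
            $number *= 1024;
    }
    return $number;
}

// The smaller of the two size limits is the one that actually applies.
// A post_max_size of 0 means "no limit", so it is ignored in that case.
function effective_upload_limit($uploadMaxFilesize, $postMaxSize) {
    $upload = ini_size_to_bytes($uploadMaxFilesize);
    $post = ini_size_to_bytes($postMaxSize);
    return ($post > 0) ? min($upload, $post) : $upload;
}

// Report the limit for the current configuration.
echo effective_upload_limit(ini_get('upload_max_filesize'), ini_get('post_max_size')), "\n";
```

Running this on the server in question is a quick way to confirm which setting is actually holding uploads back.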

Apache limits

If LimitRequestBody is set to something non-zero, you may need to increase its value in your Apache httpd.conf file or .htaccess file:

LimitRequestBody 8388608

If you are using mod_fcgid (required to run the latest PHP 5.3 VC9 NTS build for Windows), then you need to set the value of FcgidMaxRequestLen, which defaults to only 131072 bytes (128 KB) in mod_fcgid 2.3.6 and later if it is not set. (Note that some systems may put mod_fcgid settings in a file separate from the main httpd.conf file.)

FcgidMaxRequestLen 8388608
FcgidIOTimeout 60

Happy uploading!

Takeaways from PHPCon, day 2

Get my day one takeaways here.

Day two at PHPCon in Nashville, TN was packed with lots of information that, frankly, I’m still digesting. It was well worth the trip and ended much too quickly!

Download my notes (PDF)

Morning Keynote

In his “brain dump,” Rasmus Lerdorf shared a collection of unrelated but very useful tips and observations about PHP.

  • PHP is the bottleneck; there is no significant difference between nginx/lighttpd/apache
  • Error handling is not optimized in PHP because it should be an infrequent event. Set error_reporting = -1
  • Insufficient realpath_cache_size in PHP 5.2 can cause excessive filesystem STATs
  • Use gearman for out-of-band processing. Threads aren’t needed in PHP.
  • Don’t overload apache (8 core CPU = 25-30 apache clients; never more than 50)
  • Deploy processes should be atomic and robust (can use Capistrano or Rasmus’ own weploy script)
  • Node.js-style event programming is fast and easy (see http://php.net/libevent)

Slides: http://talks.php.net/show/phpcon2011 (ya gotta love the homegrown web-based presentation system!)

The Original Hypertext Preprocessor

Drew McLellan’s presentation was particularly relevant for me in my role as CTO at Company52, and I consider it one of the best presentations at PHPCon. Drew’s goal was not to market Perch (although he did an awesome job of it without even trying), but rather to share his philosophy of what really great client support is all about, and how it has impacted his work.

  • Throwing new features at a problem often doesn’t solve it. Functionality is not enough.
  • Find ways to reduce support requests.
  • Every support request should be unique (no FAQs).
  • Fix areas of confusion rapidly.
  • Support your own software – programmers should see issues firsthand.
  • It’s OK to be opinionated (“WYSIWYG is evil”), but don’t be dictatorial. It’s not our place to tell people how to work.
  • Help customers look good in front of their clients.
  • Accept when users are having problems.
  • Really great developers solve problems. Excuses simply are not helpful.

The WonderProxy story

Paul Reinheimer shared the story of how he built (and self-funded) WonderProxy, born out of a personal need to test applications that use IP-based geolocation.

  • Mistakes – “crouch and hope you don’t get hit”
    • No account de-activation
    • NIH – wrote paypal IPN code instead of re-using own code
    • Mixing Linux distros
    • Server account renewals
    • Afraid to look at profitability numbers
  • Old strategy: blog about problems we encounter – competitors find posts
  • New strategy: blog about problems customers have – bring customers instead of competitors

Is it Handmade Code If You Use Power Tools?

Laura Beth Denker of Etsy shared an overview of their continuous integration processes, and how they have greatly improved both confidence and deployment speed. True to Etsy style, Laura came dressed in a handmade outfit from a Nashville-based Etsy vendor, earning a “too much swag” tweet from one listener!

  • NEED CONFIDENCE in your code
  • Effective testing strategies include functional (human) testing, integration testing (database), and unit testing (foundation)
  • Don’t use random data in unit tests
  • Test each case in control structures
  • Use DBUnit for testing database interaction
  • Tests should run rapidly
  • Group unit tests and target test groups to run
    • caches
    • databases
    • network tests (third-party APIs)
    • sleep
    • time
    • smoke, curl, regex
    • flaky

What happened to Unicode in PHP

Andrei Zmievski’s talk was a frank de-briefing of the failed attempt to bring native Unicode support to PHP6. Although this story is a rather personal one for Andrei, he was honest and incorporated a few surprisingly hilarious bits of humor. The conclusion: native Unicode support will only come to PHP if and when the community wants it — and is willing to put noses to the grindstone. The task is simply too big for his elite band of 10 (including Rasmus himself). Most of the content was historical in nature, but there were a few nice tidbits of information.

  • Complete I18N is more than language stuff:
    • Character set
    • Date/time formats
    • Currency formats
    • Collation (sorting, contractions – thanks to Andrei for finally helping me to understand what a “collation” is)
  • pecl/intl has some useful classes left over from the PHP6 unicode project (Collator, NumberFormatter, MessageFormatter)

Closing Keynote

Terry Chay’s closing keynote wove a common thread through all of the presentations given at PHPCon, along with a heaping helping of humor (seedy at times). My favorite part was the Chayisms: http://phpdoc.info/chayism/

Takeaways from PHPCon, day 1

I’m here at PHPCon, the first PHP community developer conference in Nashville, TN. The first day consisted of two rather lengthy workshops, both of which were very informative.

Download my notes (PDF)

Web Services

This talk was given by Lorna Jane Mitchell, whose totally awesome British accent I could listen to all day. I consider myself no novice on consuming web services, but being a relative newcomer to building web services, I got a real education on how to do it right.

Key takeaways:

  • Use curl; it eliminates points of failure for more accurate testing. Lorna rejects any bug ticket that does not come with a curl test case, which reduced support requests by 50%!
  • Every web service should have a heartbeat method that echoes the variables you pass to it.
  • Every web service should have documentation, (real) examples, and a support mechanism. If you’re not going to do this, don’t bother building a web service, ’cause nobody’s gonna use it.
  • Utilize the HTTP protocol as fully as possible, including HTTP headers (Accept, Content-Type, User-Agent), verbs (GET [read], POST [create], PUT [update], DELETE [delete]), and status codes (HTTP 200, 201, 301, 302, 400, 401, 403, 404, and 500).
  • Give consumers a choice of formats. text/html is useful for debugging purposes.
  • Parse the HTTP Accept header and deliver content in the first format listed that you support.
  • Don’t confuse HTTP 401 with HTTP 403 (“I don’t know who you are, so I’m denying access” vs “I know who you are, and you don’t have permission”).
  • pecl_http is an easier way to access web services than curl.
  • Error handling defines API quality. Provide complete, useful, and consistent error messages in the expected response format.
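
The Accept header tip above could be sketched roughly like this. The helper name choose_format() and the list of supported formats are assumptions for illustration, and this deliberately simple version ignores quality values (;q=...) and wildcards:

```php
<?php
// Hypothetical helper: returns the first media type listed in the Accept
// header that appears in $supported, or null if none match.
// Simplification: quality values (;q=...) and wildcards are ignored.
function choose_format($acceptHeader, array $supported) {
    foreach (explode(',', $acceptHeader) as $entry) {
        // Drop parameters such as ";q=0.8" and normalise case/whitespace.
        $parts = explode(';', $entry);
        $type = strtolower(trim($parts[0]));
        if (in_array($type, $supported, true)) {
            return $type;
        }
    }
    return null;
}

// Assumed set of formats this service can produce.
$supported = array('application/json', 'text/html');
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : 'text/html';
echo choose_format($accept, $supported), "\n";
```

When nothing matches, the service can fall back to a default format or answer with an error status; which is better depends on the API.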

Link bundle: http://bit.ly/bundles/lornajane/2

Frontend Caching

This talk was given by Helgi Þorbjörnsson (I will not even attempt his Icelandic surname). Helgi is a long-time PEAR contributor and experienced front-end developer. Key takeaways:

  • 80% of response time is spent downloading resources.
  • Don’t abuse cookies. Large cookies hurt performance because of slow upload speeds, and because they are sent with every request. When you use cookies, be sure to set an expiration date and limit them to only the domains they are needed on.
  • Browsers have per-domain concurrent download limits. You can spread static assets across 3-4 subdomains as a workaround.
  • Combine files judiciously. Be aware of the trade-off between fewer server requests and larger file size.
  • Load above the fold first.
  • Minify JavaScript and CSS, preferably at build time.
  • Use gzip compression only for text-based content.
  • Save HTTP 404 bandwidth by ensuring that you have a robots.txt file and a favicon.
  • Compress images more (Photoshop doesn’t cut it; better alternatives include pngcrush and jpegtran).
  • Test with slower connections (tread the user’s path).

Three ways to tell if you are running PHP 5.3

A quick-n-dirty way

If all you want to do is see if you are running PHP 5.3+, then just check for the existence of the array_replace() function, which was added in PHP 5.3:

<?php
if(function_exists('array_replace')) {
    // running PHP 5.3+
} else {
    // running something prior to PHP 5.3
}
?>

The “right way”

The version_compare() function can be used in conjunction with the PHP_VERSION constant to compare standardized PHP version strings. It also makes for more readable code:

if(version_compare(PHP_VERSION, '5.3.0') >= 0) {
    echo 'I am at least PHP version 5.3.0, my version: ' . PHP_VERSION . "\n";
}
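
version_compare() also accepts a comparison operator as a third argument, in which case it returns a boolean. A small sketch of that form, wrapped in a hypothetical is_at_least() helper so it can be tried with arbitrary version strings:

```php
<?php
// Hypothetical helper wrapping version_compare()'s three-argument form,
// which returns a boolean for the given comparison operator.
function is_at_least($current, $required) {
    return version_compare($current, $required, '>=');
}

if (is_at_least(PHP_VERSION, '5.3.0')) {
    echo 'I am at least PHP version 5.3.0, my version: ' . PHP_VERSION . "\n";
}
```

Because each segment is compared as a number, 5.10.0 is correctly treated as newer than 5.3.0.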

Version constants

If you need to get down to the nitty gritty specifics of the PHP version you are running, use the PHP_VERSION_ID, PHP_MAJOR_VERSION, PHP_MINOR_VERSION, and PHP_RELEASE_VERSION constants, which were added in PHP 5.2.7. To ensure backward compatibility, the following code snippet from php.net will define these constants if they are undefined in your PHP version:

<?php
// PHP_VERSION_ID is available as of PHP 5.2.7, if our
// version is lower than that, then emulate it
if (!defined('PHP_VERSION_ID')) {
    $version = explode('.', PHP_VERSION);
    define('PHP_VERSION_ID', ($version[0] * 10000 + $version[1] * 100 + $version[2]));
}

// PHP_VERSION_ID is defined as a number, where the higher the number
// is, the newer a PHP version is used. It's defined as used in the above
// expression:
//
// $version_id = $major_version * 10000 + $minor_version * 100 + $release_version;
//
// Now with PHP_VERSION_ID we can check for features this PHP version
// may have, this doesn't require to use version_compare() everytime
// you check if the current PHP version may not support a feature.
//
// For example, we may here define the PHP_VERSION_* constants thats
// not available in versions prior to 5.2.7

if (PHP_VERSION_ID < 50207) {
    define('PHP_MAJOR_VERSION',   $version[0]);
    define('PHP_MINOR_VERSION',   $version[1]);
    define('PHP_RELEASE_VERSION', $version[2]);
    // and so on, ...
}
?>
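
To make the arithmetic concrete, here is a sketch that computes the same major * 10000 + minor * 100 + release number from any version string. The helper name version_string_to_id() is hypothetical; unlike the snippet above, it reads only the leading digits of each segment, so a suffixed version such as "5.3.6-13ubuntu3.2" is handled too:

```php
<?php
// Hypothetical helper: turns "5.3.6" (or "5.3.6-13ubuntu3.2") into 50306.
// Returns null if the string does not start with major.minor.release.
function version_string_to_id($version) {
    if (!preg_match('/^(\d+)\.(\d+)\.(\d+)/', $version, $m)) {
        return null;
    }
    return (int) $m[1] * 10000 + (int) $m[2] * 100 + (int) $m[3];
}

echo version_string_to_id(PHP_VERSION), "\n";
```

For example, "5.2.7" maps to 50207, which is the threshold used in the comparison above.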

Human Name Parsing in PHP

Parsing human names is not exactly easy, but it can be done. Keith Beckman’s nameparse.php is an excellent PHP library for doing this.

Download nameparse.php

nameparse.php can recognize names in “[title]first[middles]last[,][suffix]” and “last,first[middles][,][suffix]” forms, which, when you think about it, cover most if not all well-formed name input formats. nameparse.php handles last names of arbitrary complexity, such as “bin Laden”, “van der Vort”, and “Garcia y Vega”, as well as middle names of arbitrary size and complexity, differentiating between most last names and the first or middle names or initials preceding them.

Examples of names correctly parsed by nameparse.php:

  • Doe, John. A. Kenneth III
  • Velasquez y Garcia, Dr. Juan, Jr.
  • Dr. Juan Q. Xavier de la Vega, Jr.

To use, simply include() or require() nameparse.php and call parse_name($string) on any name. parse_name() returns an associative array of the name segments it found, keyed by “title”, “first”, “middle”, “last”, and “suffix”. Do note that no spelling, capitalization, or punctuation of titles, prefixes, or suffixes is normalized. That is, every token remains as entered: nameparse.php is a semantic parser only. If you want orthographic or other normalization, you’ll have to postprocess the output. However, since the name is now semantically parsed, such postprocessing is (for applications which require it) simple.

print_r(parse_name('Velasquez y Garcia, Dr. Juan Q. Xavier III'));

yields . . .

Array
(
    [title] => Dr.
    [first] => Juan
    [middle] => Q. Xavier
    [suffix] => III
    [last] => Velasquez y Garcia
)
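
As one example of postprocessing, the sketch below reassembles the parsed segments into a conventional display order. The function name format_parsed_name() is hypothetical, and the input array is copied from the output shown above rather than produced by calling parse_name(), so the snippet runs without the library:

```php
<?php
// Hypothetical helper: joins whichever segments are present in the order
// title, first, middle, last, suffix, skipping any that are missing or empty.
function format_parsed_name(array $parts) {
    $order = array('title', 'first', 'middle', 'last', 'suffix');
    $pieces = array();
    foreach ($order as $key) {
        if (isset($parts[$key]) && trim($parts[$key]) !== '') {
            $pieces[] = trim($parts[$key]);
        }
    }
    return implode(' ', $pieces);
}

// The array printed above, written out by hand.
$parsed = array(
    'title'  => 'Dr.',
    'first'  => 'Juan',
    'middle' => 'Q. Xavier',
    'suffix' => 'III',
    'last'   => 'Velasquez y Garcia',
);
echo format_parsed_name($parsed), "\n"; // Dr. Juan Q. Xavier Velasquez y Garcia III
```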

Performing a bitwise NOT on arbitrarily long integers

Here’s the surprisingly simple solution to a fairly challenging problem. I do not understand why PHP’s GMP extension does not include a gmp_not() function.

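function gmp_not($n) {
	
	# convert to binary string (assumes $n is non-negative)
	$bits = gmp_strval($n, 2);
	
	# invert each bit by swapping the '0' and '1' characters
	# (applying ~ to a string flips the bits of each character's byte
	# value, which does not produce '0'/'1' digits)
	$bits = strtr($bits, '01', '10');
	
	# convert back to decimal
	return gmp_strval(gmp_init($bits, 2), 10);
}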
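
A small worked example of the bit flip itself, using only plain string functions so that it runs without the GMP extension. The helper name invert_bit_string() is hypothetical. Note that this digit-by-digit inversion is not the same thing as GMP's gmp_com(), which returns the one's complement of the signed value (-11 for 10):

```php
<?php
// Flip every bit in a string of binary digits by swapping '0' and '1'.
function invert_bit_string($bits) {
    return strtr($bits, '01', '10');
}

$bits = decbin(10);                    // "1010"
$inverted = invert_bit_string($bits);  // "0101"
echo bindec($inverted), "\n";          // 5
```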