Category: Code

Cleaning up a hack

“Hi, you don’t know me, but my friend says you’re really good with web stuff.  Someone told me my web site is trying to send them a virus, can you help?”

Ever gotten one of those emails?  I get one every couple months.  There are some excellent documents on how to deal with it here and here and here.

I recently did some fairly specific cleanup stuff and thought I’d share.

This particular hack placed a large chunk of base64 encoded stuff at the very top of every single php file on the entire site.  It was kind of dumb, since the first time I visited the plugins admin page they all deactivated because the headers weren’t in the right place anymore.  Certainly not subtle.

I installed the Timthumb Vulnerability Scanner (a must anytime you inherit a site) and it found an old copy that was almost certainly the hole.  The plugin let me update it on the spot.

Then I wanted to find out how pervasive this code was. I won’t post the entire string, it was far too long, but it started with

<?php $zend_framework=

I was logged into the server with SSH and I did this on the command line:

grep -r zend_framework *

That produced a ridiculously long list of files, and showed me that they were all hacked.  Next I wanted to find all the PHP files on the server and do a search and replace on them to remove that line.  I’ll post the command here and then dissect it below, because you should be able to use it to do just about any search and replace.

find . -name '*.php' | xargs perl -pi -e 's#\<\?php \$zend_framework=".*b\\57\\x2f"\); \?\>##'

Let’s start with the easy part at the beginning: find . -name '*.php'.  “find” is a Unix command that finds stuff.  The . means “look where I am now, and in any child directories” and the -name '*.php' means “get all the files whose names end with .php”.

Next is the pipe, or “|”.  In Unix that means “take all the output from the previous command and hand it off to the following command”.

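Next is xargs.  xargs takes the list of file names coming from the previous command and hands them to another command as arguments, usually many at a time.  In this case that command is a perl search and replace.  By default xargs splits its input on whitespace, so a file name containing a space will trip it up.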
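Here’s a small sketch of that hand-off, run against a throwaway directory rather than a real site (the directory and file names are made up for illustration).  It uses find’s -print0 and xargs’s -0 options, which the command in this post doesn’t use; they separate file names with NUL bytes so that names containing spaces survive the trip through the pipe.

```shell
# Scratch directory with hypothetical files; nothing here touches a real site.
demo=$(mktemp -d)
mkdir -p "$demo/wp-content/my plugin"
touch "$demo/index.php" "$demo/wp-content/my plugin/main.php" "$demo/readme.txt"

# find prints each matching path followed by a NUL byte (-print0), and
# xargs -0 splits on those NULs and passes the paths to ls as arguments.
# ls -1 stands in for the real command and prints one path per line.
php_count=$(find "$demo" -name '*.php' -print0 | xargs -0 ls -1 | wc -l | tr -d ' ')
echo "PHP files handed to the command: $php_count"

rm -rf "$demo"
```

Swap ls -1 for the perl command and the same pipeline works on directories with spaces in their names.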

Last is the perl search and replace.  I’m going to post a simplified version here:

perl -pi -e 's#search#replace#g' filename.php

You run that on the command line and it looks in filename.php and finds all instances of “search” and replaces them with the word “replace”. I’ve used this thousands of times to search and replace across multiple files in Unix.

In my example above, I was searching for \<\?php \$zend_framework=".*b\\57\\x2f"\); \?\> and replacing it with nothing.  That’s why my replace position looks like ##.

“But”, you say, “you said the bad code was very long!”  Indeed, and here’s where regular expressions come in.  Right in the middle of that string is “.*”, which means “everything between what’s on my left and what’s on my right”.  So the bad code started with '<?php $zend_framework="' and ended with 'b\57\x2f"); ?>'.

Note that in my search section I have lots of backslashes.  They indicate that the following characters are NOT regular expression operators, but rather regular characters I want to search on.

Once I had it all worked out, this command:

find . -name '*.php' | xargs perl -pi -e 's#\<\?php \$zend_framework=".*b\\57\\x2f"\); \?\>##'

went through every PHP file on my site and removed the hack code, and it did it in about 7 seconds.

One last thing: BE CAREFUL.  Test thoroughly.  Make backups.  You could very easily break every php file on your server with this.  Those links at the top of this post?  Do those first.  This is the harsh abrasive cleaner that you only use when it’s really ugly.

Storing complex queries in transients

I’ve recently discovered the joy of transients in WordPress.  You can read about them here: http://codex.wordpress.org/Transients_API.  It’s a method of caching bits of information.  If you have an object cache on your server like memcache, it’ll store it there, otherwise it stores it in the database.

You may wonder why I’d want to store the results of a database query in the database, since I’m still making a call.  The difference is that the original query takes much longer to run than the transient retrieval.

Here’s the code I have for creating the transient:

But what happens if someone makes a change?  We want to clear that transient, so that the next time it’s asked for it’ll recreate itself.

Now we have the results of a complex query stored in a place that takes less time to retrieve, and the cache clears itself when needed.

pre_get_posts instead of query_posts

The other day I had a custom content type of things for sale.  There’s a meta field for marking it Sold.  On my archive page I didn’t want to include Sold items, so I went hunting for the proper query_posts() bit to put in the top of my archive template.

What I found was several posts that said to never ever use query_posts().  The reason is that it doesn’t actually change the query like I thought it did, it simply runs another one.  However successful that may be at getting your content, it’s a second database query you don’t want.

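The answer is to use a hook called pre_get_posts, and just like it sounds, it allows you to do stuff before you get posts.  Strictly speaking it’s an action rather than a filter, but add_filter() works for hooking it, since add_action() is just a wrapper around add_filter().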

So here’s what I did:

function filter_sold_coaches( $query ) {

    // check to see if we're editing the main query on the page.
    if( $query->is_main_query() ){

        // Check to make sure I'm on the Coaches archive page OR the Make page AND I'm not in the admin area
        if ( ($query->is_post_type_archive('coaches') OR $query->is_tax('make')) AND !is_admin() ) {

            // set a meta query to get only Coaches where the ecpt_sold element does not exist.
            $query->set('meta_query',array( array( 'key' => 'ecpt_sold', 'value' => 'on', 'compare' => 'NOT EXISTS')));

            // now I want to order by the year field, so I do that
            $query->set('meta_key', 'ecpt_coachyear');
            $query->set('orderby', 'meta_value_num');
        }

    }
    return $query;

}
// now I apply my function to pre_get_posts and Bob's your uncle
add_filter( 'pre_get_posts', 'filter_sold_coaches' );

The !is_admin() bit is important, because without it this will affect your admin area and you’ll find that you’re missing some posts.

Using the set() method you can set anything that you might set with WP_Query, so if you’re good with that then this should be quite easy for you.

Many thanks to @pippinsplugins for wisdom in putting this together.

Rendering Jetpack Stats

Jetpack is a plugin offered by WordPress.com, and it contains a variety of services, each of which can be activated or deactivated within Jetpack, depending on what you want to use.

One of the features is web site statistics.  You create a WordPress.com account and then associate it with your Jetpack plugin.  Then Jetpack keeps track of statistics for your site on WordPress.com.  You can log in there and view what’s going on, or right in your own site you can view some charts and graphs.

On the sites we build at work we have a Social bar, with the usual collection of social traffic indicators.  You can see it here: http://www.salinehornets.com/.  One of the things we wanted there was an indication of how many times the specific articles had been viewed.

Through a long voyage of discovery about why various plugins that claimed to do this were failing, AND some wonderful support from Andy at Automattic, I figured out a good way.

Jetpack provides a function called stats_get_csv that queries WordPress.com, and here’s how I’m using it:

$args = array(
    'days'=>-1,
    'limit'=>-1,
    'post_id'=>$post_id
);

$result = stats_get_csv('postviews', $args);

The -1 indicates infinity, so I’m asking for all days, unlimited, and I’m giving it the post_id I’m looking for.

Then in the function call I’m telling it I want postviews.  That gets me an array that looks like this:

Array
(
    [0] => Array
        (
            [views] => 2
        )

)

So I then do something like this:

$views = $result[0]['views'];
return number_format_i18n($views);

This is all in a function, hence the return.

The stats_get_csv function caches for 300 seconds, so you won’t necessarily see instant changes as you reload your page.  You probably also should be caching on your own end, so that you’re not even hitting it that often.  We’re doing whole page caching, so that my archive page doesn’t make 10 calls at once on that page.

There’s precious little documentation about this process, which is one of the reasons I did this post.

The plugin we tried using before this wasn’t passing the post_id, it was trying to get stats for ALL posts, and then loop through them and grab the one we wanted.  The problem is that the function doesn’t return ALL posts, it returns the top 30 or so.  It’s much more efficient to simply ask for the one you want.

Creating a new image size

Recently I found the need to make a new image size in WordPress, other than the usual “Thumbnail”, “Medium”, “Large”, and “Full”.

It’s not very difficult: you simply need to call the function add_image_size in your theme’s functions file, a custom plugin, or an mu-plugins file.

It takes 4 parameters, and you’ll normally supply at least the first 3.  The first is the name of your new size.  This will never be front facing, so use something code-friendly.  Mine was homepage-slide.  Then it takes width and height in pixels.  The fourth parameter is whether you want to scale the image proportionally to fit within those dimensions (false, the default), or crop it to exactly your new size (true).

My code looks like this:

if ( function_exists( 'add_image_size' ) ) {
    add_image_size( 'homepage-slide', 640, 427, false ); // scaled to fit, NOT cropped
}

An additional problem that I ran into is that the client had already uploaded about 300 images.  This meant I needed to re-process all of them.

There are several good plugins that do this, but most of them want to do all of your images in one php run, which would time-out for me, since I had so many.

The plugin I used is called AJAX Thumbnail Rebuild.  Rather than one big resize call it uses AJAX to make an individual call for each image.  It went through all 300 of my images with no problem, and then that image size was available to my code.