Author: topher

Cleaning up a hack

“Hi, you don’t know me, but my friend says you’re really good with web stuff.  Someone told me my web site is trying to send them a virus, can you help?”

Ever gotten one of those emails?  I get one every couple months.  There are some excellent documents on how to deal with it here and here and here.

I recently did some fairly specific cleanup stuff and thought I’d share.

This particular hack placed a large chunk of base64-encoded stuff at the very top of every single PHP file on the entire site.  It was kind of dumb, since the first time I visited the plugins admin page they all deactivated because the headers weren’t in the right place anymore.  Certainly not subtle.

I installed the Timthumb Vulnerability Scanner (a must anytime you inherit a site) and it found an old copy that was almost certainly the hole.  The plugin let me update it on the spot.

Then I wanted to find out how pervasive this code was. I won’t post the entire string, it was far too long, but it started with

<?php $zend_framework=

I was logged into the server with SSH and I did this on the command line:

grep -r zend_framework *

That produced a ridiculously long list of files, and showed me that they had all been hacked.  Next I wanted to find all the PHP files on the server and do a search and replace on them to remove that line.  I’ll post the command to do that here and then dissect it below, because you should be able to use it to do just about any search and replace.

find . -name '*.php' | xargs perl -pi -e 's#\<\?php \$zend_framework=".*b\\57\\x2f"\); \?\>##'

Let’s start with the easy part at the beginning: find . -name '*.php'.  “find” is a Unix command that finds stuff.  The . means “look where I am now, and in any child directories” and the -name '*.php' means “get all the files that end with .php”.

Next is the pipe, or “|”.  In Unix that means “take all the output from the previous command and hand it off to the following command”.

Next is xargs.  xargs takes the file names that come out of the previous command and hands them to another command as arguments.  In this case it’s a perl search and replace.

Last is the perl search and replace.  I’m going to post a simplified version here:

perl -pi -e 's#search#replace#g' filename.php

You run that on the command line and it looks in filename.php and finds all instances of “search” and replaces them with the word “replace”. I’ve used this thousands of times to search and replace across multiple files in Unix.
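If you want to try that on something harmless first, here is a small sketch that runs the same command against a scratch file. The temporary directory and the file contents are invented for the example. The flags do the work: -p loops over every line and prints it, -i writes the result back into the file, and -e supplies the code to run. The trailing g replaces every match on a line rather than just the first.

```shell
# Build a scratch file with two matches on the first line.
demo=$(mktemp -d)
printf 'search and search again\nnothing here\n' > "$demo/filename.php"

# Edit the file in place, replacing every "search" with "replace".
perl -pi -e 's#search#replace#g' "$demo/filename.php"

cat "$demo/filename.php"
```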

In my example above, I was searching for \<\?php \$zend_framework=".*b\\57\\x2f"\); \?\> and replacing it with nothing.  That’s why my replace position looks like ##.

“But”, you say, “you said the bad code was very long!”  Indeed, and here’s where regular expressions come in.  Right in the middle of that string is “.*” which means “everything between what’s on my left and what’s on my right”.  So the bad code started with ‘<?php $zend_framework="’ and ended with ‘b\57\x2f"); ?>’.

Note that in my search section I have lots of backslashes.  Each one indicates that the character after it is NOT a regular expression operator, but rather a regular character I want to search on.
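To see the escaping at work without touching a real site, here is a sketch that runs the same pattern against a scratch file. STAND_IN_PAYLOAD is a made-up placeholder for the long encoded string, and the file name and the code around it are assumptions for the demonstration.

```shell
demo=$(mktemp -d)

# The quoted 'EOF' stops the shell from touching the $ and the backslashes.
cat > "$demo/sample.php" <<'EOF'
<?php $zend_framework="STAND_IN_PAYLOADb\57\x2f"); ?><?php echo "hello"; ?>
<?php $other = "untouched"; ?>
EOF

# \? \$ \) match those characters literally, and \\ matches one real backslash.
perl -pi -e 's#\<\?php \$zend_framework=".*b\\57\\x2f"\); \?\>##' "$demo/sample.php"

cat "$demo/sample.php"
```

Only the injected chunk on the first line should disappear. The legitimate code after it and the second line stay as they were.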

Once I had it all worked out, this command:

find . -name '*.php' | xargs perl -pi -e 's#\<\?php \$zend_framework=".*b\\57\\x2f"\); \?\>##'

went through every PHP file on my site and removed the hack code, and it did it in about 7 seconds.

One last thing: BE CAREFUL.  Test thoroughly.  Make backups.  You could very easily break every php file on your server with this.  Those links at the top of this post?  Do those first.  This is the harsh abrasive cleaner that you only use when it’s really ugly.
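One way to build in a safety net, sketched here on scratch files: list the affected files first, then let perl keep a backup of every file it edits by giving -i an extension. The directory, file names and the short "zend_framework junk" marker are made up for the example, and this is only one possible approach, not a substitute for a real backup.

```shell
demo=$(mktemp -d)
printf 'zend_framework junk\nreal code\n' > "$demo/a.php"
printf 'real code only\n' > "$demo/b.php"

# Dry run first: -l only lists the files that contain the marker.
grep -rl zend_framework "$demo"

# -i.bak edits in place but saves each original as <name>.bak first.
find "$demo" -name '*.php' -print0 | xargs -0 perl -pi.bak -e 's#zend_framework junk##'

# Compare a backup with the edited file to see exactly what was removed.
# diff exits non-zero when the files differ, hence the "|| true".
diff "$demo/a.php.bak" "$demo/a.php" || true
```

Once you have checked the result, remember to delete the .bak copies, since they still contain the hack code.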

Storing complex queries in transients

I’ve recently discovered the joy of transients in WordPress.  You can read about them here: http://codex.wordpress.org/Transients_API.  It’s a method of caching bits of information.  If you have an object cache on your server like Memcached, it’ll store it there; otherwise it stores it in the database.

You may wonder why I’d want to store the results of a database query in the database, since I’m still making a call.  The difference is that the original query takes much longer to run than the transient retrieval.

Here’s the code I have for creating the transient:

But what happens if someone makes a change?  We want to clear that transient, so that the next time it’s asked for it’ll recreate itself.

Now we have the results of a complex query stored in a place that takes less time to retrieve, and the stored copy clears itself when needed.

pre_get_posts instead of query_posts

The other day I had a custom content type of things for sale.  There’s a meta field for marking it Sold.  On my archive page I didn’t want to include Sold items, so I went hunting for the proper query_posts() bit to put in the top of my archive template.

What I found was several posts that said to never ever use query_posts().  The reason is that it doesn’t actually change the query like I thought it did; it simply runs another one.  However successful that may be at getting your content, it’s a second database query you don’t want.

The answer is to use a hook called pre_get_posts, and just like it sounds, it allows you to do stuff before you get posts.

So here’s what I did:

function filter_sold_coaches( $query ) {

    // check to see if we're editing the main query on the page.
    if( $query->is_main_query() ){

        // Check to make sure I'm on the Coaches archive page OR the Make page AND I'm not in the admin area
        if ( ( $query->is_post_type_archive( 'coaches' ) || $query->is_tax( 'make' ) ) && ! is_admin() ) {

            // set a meta query to get only Coaches where the ecpt_sold element does not exist.
            $query->set('meta_query',array( array( 'key' => 'ecpt_sold', 'value' => 'on', 'compare' => 'NOT EXISTS')));

            // now I want to order by the year field, so I do that
            $query->set('meta_key', 'ecpt_coachyear');
            $query->set('orderby', 'meta_value_num');
        }

    }
    return $query;

}
// now I apply my function to pre_get_posts and Bob's your uncle
add_filter( 'pre_get_posts', 'filter_sold_coaches' );

The !is_admin() bit is important: if you leave it out, this will affect your admin area too and you’ll find that you’re missing some posts there.

Using the set() method you can set anything that you might set with WP_Query, so if you’re good with that then this should be quite easy for you.

Many thanks to @pippinsplugins for wisdom in putting this together.