Wednesday, September 30, 2009

Proxies PHP and XML OH MY

I like PHP. I like XML, but like anything, it sometimes just flat out pisses me off. Such was the case when a host I had been gleefully fetching an RSS feed from decided they needed to move their website to a host that provides DDoS detection and protection.

The happy little code snippet below, which had been chugging along for almost a year and continued to work on other hosts, suddenly broke...

$rssfeedurl = 'http://www.somdomain.com/somerss.xml';

// "new" always yields an object here, so no failure check is needed
$xmlDoc = new DOMDocument();

if (!@$xmlDoc->load($rssfeedurl)) {
    echo 'Could not fetch or translate '.$rssfeedurl.' into XML';
} else {
    // PARSE AWAY ! ! ! !
    // get elements from "<channel>"
    $channel = $xmlDoc->getElementsByTagName('channel')->item(0);
    $channel_title = @$channel->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
    $channel_link = $channel->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
    $channel_desc = $channel->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
}
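
The `@` in front of `load()` also hides whatever libxml had to say about the failure, which makes a problem like this harder to diagnose. Here is a minimal sketch of one alternative, assuming you are happy to collect the parser messages yourself. `libxml_use_internal_errors()` and `libxml_get_errors()` are standard PHP functions; `load_feed_xml()` and its return shape are names made up for this illustration.

```php
<?php
// Hypothetical helper: parse an XML string.
// Returns array(DOMDocument or null, list of error messages).
function load_feed_xml(string $xml): array
{
    if ($xml === '') {
        // loadXML() refuses an empty string, so report it ourselves.
        return array(null, array('empty response body'));
    }

    // Collect libxml errors instead of emitting warnings (replaces the @ operator).
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode so other code is unaffected.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array($ok ? $doc : null, $messages);
}
```

If the first element comes back `null`, printing the messages usually shows whether you were handed broken XML or, say, an HTML error page from something sitting in front of the server.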


After beating my head against my already battered desk for far too long, I asked the owners of the domain, what up?

They in turn conveyed my question to their new hosting company, who promptly responded with "The HTTP request header is invalid" and hinted that their external proxy couldn't forward such a request to the targeted server.

So I tried using stream_context_create() and libxml_set_streams_context() to create valid HTTP request headers, and it seemed to work...


$opts = array(
    'http' => array(
        'user_agent' => 'xml fetcher 1.0',
    )
);

$context = stream_context_create($opts);
libxml_set_streams_context($context);
if (!@$xmlDoc->load($rssfeedurl)) {
    echo 'Could not fetch or translate '.$rssfeedurl.' into XML';
} else {
    // PARSE AWAY ! ! ! !
}
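
Besides `user_agent`, the `http` stream wrapper accepts a few more options that shape the request line and headers. The sketch below sets several of them. The option names are the documented `http` context options, but the particular values, the `Accept` header and the `build_feed_context()` helper are my assumptions for illustration, not anything the hosting company asked for.

```php
<?php
// Hypothetical helper that builds an HTTP stream context for fetching a feed.
function build_feed_context(string $userAgent, int $timeoutSeconds = 30)
{
    $opts = array(
        'http' => array(
            'method'           => 'GET',
            'protocol_version' => 1.1,   // explicit HTTP/1.1 (PHP before 8.0 defaults to 1.0)
            'user_agent'       => $userAgent,
            'header'           => array(
                'Accept: application/rss+xml, application/xml;q=0.9, */*;q=0.8',
                'Connection: close',     // don't hold the socket open after the response
            ),
            'timeout'          => $timeoutSeconds, // read timeout in seconds
            'follow_location'  => 1,     // follow Location redirects
            'max_redirects'    => 5,
        ),
    );

    return stream_context_create($opts);
}

$context = build_feed_context('xml fetcher 1.0');
libxml_set_streams_context($context); // later DOMDocument::load() calls use these options
```

Whether any of these extra headers would have satisfied the proxy in my case is a guess; the point is only that the context can carry more than a user agent.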


I finally gave up on using it, though, because I was still getting intermittent failures, likely because creating a fully valid HTTP request header is trickier than I thought... So I switched to cURL, and Eureka! CURLOPT_HTTPPROXYTUNNEL to the rescue!

Why didn't I try this before? If you do a Google search on "PHP CURL proxy" you will find tons of info on posting through a local or intermediary proxy. But I could find almost nothing on dealing with the nuances of a remote proxy directly in front of the target server. And besides, $xmlDoc->load($rssfeedurl) is just so damn elegant...

After some trial and error, I came up with this.

$cookie = './.cookiefile';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $rssfeedurl);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0); // CURL_HTTP_VERSION_1_1 likely works as well, but 1.0 seems fine
curl_setopt($ch, CURLOPT_HEADER, 0); // Don't want response headers
curl_setopt($ch, CURLOPT_REFERER, 'http://'.$_SERVER['SERVER_NAME'].$_SERVER['REQUEST_URI']);
curl_setopt($ch, CURLOPT_USERAGENT, 'Dew-Code XML Grabber 1.0');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // to handle redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // assign response to a variable
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie); // .cookiefile is just an empty file, just in case
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // ditto
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // seconds to wait for response
curl_setopt($ch, CURLOPT_AUTOREFERER, 1); // in case of redirect, convey referrer
curl_setopt($ch, CURLOPT_MAXREDIRS, 5); // how many times to allow redirect
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1); // absolute must when a proxy is involved
$rssblob = curl_exec($ch); // make it go!!

if (curl_errno($ch)) {
    echo curl_error($ch); // handy for debugging
}
curl_close($ch); // Save the binviroment! clean up after yourself, error or not

if (!@$xmlDoc->loadXML($rssblob)) {
    echo 'Could not fetch or translate '.$rssfeedurl.' into XML';
} else {
    // PARSE AWAY ! ! ! !
}
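
The cURL code above echoes the error but still hands whatever came back to loadXML(). Here is a minimal sketch of a stricter version, assuming you want to bail out on transport errors and on 4xx/5xx answers, which is what a proxy's error page usually carries. `fetch_feed_body()` and its return array are names I made up for illustration; the cURL options and `curl_getinfo()` with `CURLINFO_HTTP_CODE` are standard.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and report what went wrong, if anything.
// Returns array('body' => string or null, 'status' => int, 'error' => string).
function fetch_feed_body(string $url, string $userAgent = 'Dew-Code XML Grabber 1.0'): array
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true, // hand the body back instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => 10,   // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 30,   // seconds for the whole transfer
    ));

    $body   = curl_exec($ch);
    $error  = curl_error($ch);          // empty string when the transfer succeeded
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope at the end of the function.

    if ($body === false) {
        return array('body' => null, 'status' => $status, 'error' => $error);
    }
    if ($status >= 400) {
        // A proxy or firewall can answer with an HTML error page and a 4xx/5xx status.
        return array('body' => null, 'status' => $status, 'error' => "HTTP status $status");
    }

    return array('body' => $body, 'status' => $status, 'error' => '');
}
```

Call it as `$result = fetch_feed_body($rssfeedurl);` and only pass `$result['body']` to `$xmlDoc->loadXML()` when `$result['error']` is empty.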


I hope you find it useful.

Tuesday, September 15, 2009

Tiny little trickling income streams

I know a lot of website owners that spend their time and money making their site as good as it can be. For them it's not about making a buck, it's about giving back, which is of course a good thing. But there is certainly nothing wrong with supplementing your income so that at the end of the month you are at least covering the costs of hosting your website, and if you come out ahead, well, that's even better.

So here I will cover some of the ways you can get your website earning a bit of income. Depending on how much traffic your site gets, you aren't likely to be retiring early from the money you make, but you should be able to at least cover your monthly web hosting costs.

Google Adsense :
Adsense is probably the most widespread and popular way to monetize a website. After signing up, you can create banner ads as well as text link blocks to spread around your site. You can also customize the colors and even the fonts they use. Which colors work best on your website is a hot topic for debate, so I suggest creating 2 custom palettes that blend well with your site's colors, then setting all your banner ads to cycle through up to four palettes at random: your 2, plus either 2 stock palettes or 2 of your own that contrast with your site's color scheme.

Google will cycle through the available color palettes and eventually start favoring the one that results in the highest click-through rate.

You can also limit banner ads to images only, text only, or both. I suggest both; again, Google will start favoring the format that provides you the best potential earnings.

You can have up to 3 banner ads and up to 3 text link units on each page, but more isn't always better. And of course, where your Adsense code is placed on your web pages can have quite an effect on its performance. Often a square banner ad wrapped by your site's content, like the one shown on this page, will outperform the typical banner in the header and footer. Likewise, text link units often perform well when directly below your site's main content.

Either way, make a habit of doing a monthly tweak: change one thing (placement, format, color, etc.) and let it run for a month before deciding whether you have increased or decreased your site's revenue.


Chitika :

Chitika's Premium Ad units are one of the few website ad services that can be used in conjunction with Google Adsense.

After adding the Premium Ad code to your website, only visitors that were referred to your site through a major search engine will see the ad. So your regular viewers, who get to your site through a bookmark or other link, won't see the Chitika ads.

Like Adsense, placement and color choices can affect your revenue, so yet again, make a monthly tweak, and review the performance.



Text-Link-Ads :

Once your site has been around for a while and has a decent and steady stream of traffic, you can sell static links to other sites at rates you determine. TextLinkAds will estimate what they think a link on your site should sell for, and once you have sold a link placement, they split the income from it 50/50. That may not sound like such a great deal, but when you consider they handle everything, leaving you to continue work on your website, it works out pretty well.



There are of course numerous other ways to monetize a website, and I've only touched on a few here that I have had experience with. I'm curious what visitors might have to say about these or other methods of site monetization, so let the comments fly!