At work, we have been using Varnish, a very good reverse proxy, for many years.
Varnish does a lot for us. One of the most important things is its ability to cache the responses of our rather slow PHP backends. Getting a page in 16 ms is always better than getting it in 300 ms.
Since my visit to the Velocity conference in London, I have been quite obsessed with testing Google ModPagespeed.
The product is not focused on the frontend response time but more on the user experience.
By providing many filters, it can optimize a web page by applying rules such as minifying CSS, optimizing images, deferring JavaScript and many others. Nothing we can’t do ourselves, but it can save dev teams time.
I did my first test of ModPagespeed a few weeks ago. The results seem very good, but one thing bothers me. ModPagespeed optimizes in two stages: on the first call of a page, ModPagespeed gets the page from PHP and returns the response to the client without any optimization. In parallel, it launches a thread to optimize the page for the next client.
As it doesn’t optimize the first call, it forces a Cache-Control “no-cache” on the response.
The problem is that no page carrying a no-cache directive will go into the Varnish cache.
One last thing: ModPagespeed sends different optimizations depending on the User-Agent.
So, on one side we have optimized the HTML of the response, improving the user experience, but on the other side we have put more load on our backend.
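To make the extra backend load concrete, here is a minimal Python sketch. It is a toy model, not Varnish's actual logic: `is_cacheable`, `ToyCache` and the two fake backends are hypothetical names invented for this illustration, and the header values are assumptions. It only shows that a cache which refuses to store no-cache responses ends up calling the backend on every request.

```python
def is_cacheable(headers):
    """Return False when Cache-Control forbids storing the response.

    Simplified on purpose: a real cache such as Varnish applies its own,
    more detailed rules.
    """
    cache_control = headers.get("Cache-Control", "")
    directives = {d.split("=")[0].strip().lower() for d in cache_control.split(",")}
    return not directives & {"no-cache", "no-store", "private"}


class ToyCache:
    """Tiny cache keyed by URL that counts how often the backend is called."""

    def __init__(self, backend):
        self.backend = backend  # callable: url -> (headers, body)
        self.store = {}
        self.backend_hits = 0

    def get(self, url):
        if url in self.store:
            return self.store[url]
        self.backend_hits += 1
        headers, body = self.backend(url)
        if is_cacheable(headers):
            self.store[url] = body
        return body


def plain_backend(url):
    # Hypothetical backend that allows caching for five minutes
    return {"Cache-Control": "max-age=300"}, "<html>plain</html>"


def rewriting_backend(url):
    # Hypothetical stand-in for a backend whose HTML is marked no-cache
    return {"Cache-Control": "max-age=0, no-cache"}, "<html>rewritten</html>"


if __name__ == "__main__":
    for backend in (plain_backend, rewriting_backend):
        cache = ToyCache(backend)
        for _ in range(10):
            cache.get("/")
        print(backend.__name__, "backend calls:", cache.backend_hits)
```

With ten requests for the same URL, the first backend is called once and the second one ten times, which is the load problem described above.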
Thinking about it, I asked myself whether we could put ModPagespeed in front of Varnish. If so, we could have optimized HTML and less load on the backend.
The answer is yes, we can put ModPagespeed in front of Varnish!

We just need to install Apache and configure it with “ProxyPass”.
These are all the modules we need:
LoadModule authz_host_module /usr/lib64/httpd/modules/mod_authz_host.so
LoadModule deflate_module /usr/lib64/httpd/modules/mod_deflate.so
LoadModule log_config_module /usr/lib64/httpd/modules/mod_log_config.so
LoadModule setenvif_module /usr/lib64/httpd/modules/mod_setenvif.so
LoadModule proxy_module /usr/lib64/httpd/modules/mod_proxy.so
LoadModule proxy_http_module /usr/lib64/httpd/modules/mod_proxy_http.so
LoadModule status_module /usr/lib64/httpd/modules/mod_status.so
LoadModule vhost_alias_module /usr/lib64/httpd/modules/mod_vhost_alias.so
All other modules can be safely disabled.
Since ModPagespeed uses threads, remember to start Apache in worker mode by setting:
HTTPD=/usr/sbin/httpd.worker
in
/etc/sysconfig/httpd
If you have more than one vhost, you need at least version 1.1.23.2-2191 of ModPagespeed, which supports multiple vhosts.
A default setup could be:
DocumentRoot "/var/www/htdocs"
<Directory />
    Order allow,deny
    Allow from all
</Directory>

Include conf.d/pagespeed.conf

NameVirtualHost *:80

<VirtualHost *:80>
    #Default VHost : ModPagespeed disabled
    ModPagespeed Off
    CustomLog /var/log/httpd/default_access.log combined
    ErrorLog /var/log/httpd/default_error.log
</VirtualHost>

<VirtualHost *:80>
    #mywebsite.com VHost : ModPagespeed enabled
    ServerName www.mywebsite.com
    ServerAlias s1.mystaticwebsite.com
    ServerAlias s2.mystaticwebsite.com
    ModPagespeed On
    ModPagespeedEnableFilters lazyload_images,collapse_whitespace,combine_javascript,defer_javascript
    ModPagespeedMapOriginDomain http://localhost:8000 http://www.mywebsite.com
    ModPagespeedMapOriginDomain http://localhost:8000 http://mystaticwebsite.com
    ModPagespeedShardDomain mystaticwebsite.com s1.mystaticwebsite.com,s2.mystaticwebsite.com
    CustomLog /var/log/httpd/www.mywebsite.com_access.log combined
    ErrorLog /var/log/httpd/www.mywebsite.com_error.log
</VirtualHost>

ProxyRequests Off
ProxyPreserveHost On
ProxyPass / http://127.0.0.1:8000/
This configuration is fully operational but not yet tested in production.
Has anyone already tested this architecture in production?
But then, why do you still use Varnish? Why not use mod_proxy for Apache instead of installing another piece of software? mod_pagespeed can be configured to access static files directly, instead of going through the web server. Have you tried it? Is Varnish quicker even when it is not used as the frontend web server?
Great, thanks, very instructive. One question though: does this cause any problems with HTTPS?
See you
It seems weird to me, but I’m curious to see the response times…
The main benefit of Varnish is its ability to cache.
In my previous tests, it delivered static files a lot faster than Apache and scaled better as concurrency increased.
So, without testing, I am guessing that with this architecture you will lose these benefits, and your front end will deliver content like a “normal” Apache server with pagespeed delivering static files.
But I may be wrong, so keep me posted if you try (by email or via @petitchevalroux on Twitter).
I will try to answer all your good questions.
First, we are not ready to use this configuration in production. This is just a concept, an idea, and I wrote this post to get feedback from you.
To answer Martin: we have not tried what you suggest. For the moment, it would be very difficult for us to do without Varnish. Varnish is not just a cache, it is a real Swiss Army knife for HTTP, and we do a lot of things with it that Apache can’t do easily.
To answer Romain: HTTPS is not supported by Varnish, so putting Apache in front of Varnish gives us a solution for that.
To answer petitchevalroux: the idea is precisely to keep a cache. When ModPagespeed runs on a backend, Varnish can’t cache the HTML because of the no-cache headers. With this solution, the response of our slow backend is already cached in Varnish. I’m not the only one thinking of this kind of infrastructure; take a look at “http://www.webperformancetoday.com/2012/11/26/were-teaming-up-with-amazon-web-services-to-bring-advanced-feo-to-cloudfront-customers/”. What they are doing is putting an optimization stage between the client and the server.
sorry but that just doesn’t make any sense. all the benefits you get from using varnish are gone by squeezing that apache in front. you’re just one peak of PIs (page impressions) away from having your apache server thrash its disk, swapping like there is no tomorrow. if you can handle your load with apache in the front line then maybe varnish is overkill for you.
OK, I understand.
What do you propose in order to keep a cache with Varnish and still use ModPagespeed?
I want to use both of them, so how do I do that?
i’m not sure i would use that module in the first place. i’m a big fan of beating the knowledge of “webdev best practice” into my developers’ heads. a quick scan of some filters just made me realize that there seems to be some risk involved in almost all of these filters, in the sense that they can’t and they don’t know, so at the end of the day they will screw you over one way or another. seems easier and less trouble to make devs read up on best practice and do it right in the first place, thus having no need for this module.
but since you asked; quick thought without asking my magic 8 ball if it is feasible: if there is some kind of marker that the no-cache header is because the response is the first and mod_pagespeed is about to do its magic i could remove the no-cache, set a ttl of 1 minute (thus serving unoptimized for a while), log the request and have a special daemon pick up on this log, request the very same resource again, this time receiving the optimized version, and finally replace the varnish cache with it and its longer cache time.
but then again, what do i know? seems more fun to whip some developers into shape, making the world a better place while doing so.
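The marker-and-refetch idea above could be sketched roughly as follows in Python. Everything here is hypothetical: the `X-First-Pass` header is an invented marker (nothing in this thread says mod_pagespeed sends one), the `Cache` class is a stand-in for Varnish, the one-hour TTL is an assumed value, and `fetch` is any callable that requests the URL again and returns headers and body.

```python
import collections

FIRST_PASS_MARKER = "X-First-Pass"  # hypothetical marker header, invented for this sketch
SHORT_TTL = 60                      # seconds: serve the unoptimized page for one minute
LONG_TTL = 3600                     # assumed longer TTL for the optimized page


class Cache:
    """Stand-in for the cache: stores (body, ttl) per URL plus a refetch queue."""

    def __init__(self):
        self.entries = {}
        self.refetch_queue = collections.deque()

    def store_response(self, url, headers, body):
        if FIRST_PASS_MARKER in headers:
            # Unoptimized first response: keep it briefly and log it for refetching
            self.entries[url] = (body, SHORT_TTL)
            self.refetch_queue.append(url)
        else:
            self.entries[url] = (body, LONG_TTL)


def run_refetch_pass(cache, fetch):
    """Request every logged URL once more and replace its cache entry.

    Works on a snapshot of the queue, so a URL that is still marked as
    first-pass is simply queued again for the next pass instead of looping.
    """
    pending = list(cache.refetch_queue)
    cache.refetch_queue.clear()
    for url in pending:
        headers, body = fetch(url)
        cache.store_response(url, headers, body)
```

A real daemon would call `run_refetch_pass` periodically; whether a reliable marker exists at all is the open question.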
I’ve sent a suggestion to the pagespeed team, and posted on the forums: Varnish, mod_pagespeed, additional header for reverse proxy caching fully optimized html.
I don’t think apache->varnish is such a bad idea, since apache is only a proxy, without heavy mods to eat up memory. Varnish is probably faster than direct file fetch, because files are cached in memory.
Still, it would be better to cache the fully optimized page.
btw. i think you should also add memcached as a cache backend to your apache+MPS setup.
We use layer 7 balancers to run Varnish to the side of Apache, rather than in front. The balancers know (roughly) whether content can be cached. For example, SSL traffic isn’t cached. If there’s a cookie, that’s not cached either. So we send to Varnish only the stuff we can safely assume Varnish might have, and Varnish in turn has the balancer as its default backend. The balancer sends all non-cacheable traffic on ports 443 and 80 to the app servers (Apache, most of the time).
The problem comes in if you can’t afford a layer 7 balancer (only $80,000!)… We’re still trying to engineer a solution similar to a traffic manager that uses only FLOSS and maybe some scripts. Layer 7 balancers have ASICs and FPGAs specifically programmed to take apart a packet and quickly route it to the proper backend VIP, be it Varnish or otherwise. They can also decrypt (terminate) SSL traffic for proper routing. Doing this in software is very difficult, and so far only a couple of open source packages can even come close to hardware with just basic proxying, never mind scanning the packet and routing based on payload.
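The routing rules described above can be written down as a small Python sketch. This is only the decision logic, not a balancer: `choose_backend` is a hypothetical name, and the `from_varnish` flag is an assumption added here so that a cache miss coming back from Varnish goes to the app servers instead of looping.

```python
def choose_backend(port, headers, from_varnish=False):
    """Return "varnish" for traffic that might be cached, else "app".

    Mirrors the rough rules above: SSL traffic and requests carrying a
    cookie go straight to the app servers.
    """
    if from_varnish:
        # Assumption: Varnish uses the balancer as its backend, so its
        # misses must reach the app servers rather than Varnish again
        return "app"
    if port == 443:
        return "app"
    if "Cookie" in headers:
        return "app"
    return "varnish"
```

For example, `choose_backend(80, {})` picks Varnish, while `choose_backend(80, {"Cookie": "session=abc"})` and `choose_backend(443, {})` both pick the app servers.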
Good luck!
To answer Vincent: I think that if you want to use modpagespeed and Varnish together, you must use the chain below:
Varnish->Apache(modpagespeed)->Varnish->Webserver application
This way you will have the power of both, and it can all be done on one box.
All the best,