Monthly Archives: May 2010

Something weird with varnishncsa


Our blocking srapers system have reported us a IP which browse continuously one of our website.

So I have done a grep on the varnishncsa log, but i don’t see it in realtime.
I’ve done a grep on the all day varnishncsa log, i’ve seen:

91.88.187.67 - - [19/May/2010:00:58:14 +0200] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
91.88.187.67 - - [19/May/2010:00:39:13 +0200] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
91.88.187.67 - - [19/May/2010:01:25:15 +0200] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
91.88.187.67 - - [19/May/2010:04:02:24 +0200] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
91.88.187.67 - - [19/May/2010:04:28:14 +0200] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"

When i do varnishncsa in realtime with a varnishncsa | grep “/0_2″, i can see the backend traffic:

server1 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server2 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server1 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server2 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server1 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server1 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server2 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
server3 - - [00/Jan/1900:00:00:00 +0000] "POST http://www.ouest-france.fr/0_2 HTTP/1.1" (null) - "http://www.ouest-france.fr/0_2?page_ref=/0_4" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)"
...
...

Varnishlog contains:

115 SessionClose c pipe
115 ReqStart     c 91.88.187.67 4436 1089132167
115 RxRequest    c POST
115 RxURL        c /0_2
115 RxProtocol   c HTTP/1.1
115 RxHeader     c x-requested-with: XMLHttpRequest
115 RxHeader     c Accept-Language: fr
115 RxHeader     c Referer: http://www.ouest-france.fr/0_2?page_ref=/0_4
115 RxHeader     c Accept: application/xml, text/xml, */*
115 RxHeader     c Content-Type: application/x-www-form-urlencoded
115 RxHeader     c x-requested-handler: ajax
115 RxHeader     c Accept-Encoding: gzip, deflate
115 RxHeader     c User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; ficeLiveConnector.1.3; OfficeLivePatch.0.0)
115 RxHeader     c Host: www.ouest-france.fr
115 RxHeader     c Content-Length: 13
115 RxHeader     c Connection: Keep-Alive
115 RxHeader     c Cache-Control: no-cache
115 RxHeader     c Cookie: crm_cookieEnabled=1; __utmc=88432901; xtvrn=$61164$; __utma=88432901.1379090940.1266326763.1274179082.1274180897.682; __utmz=88432901.1274169829.676.57.utmcsr=ofmnewsletter|utmccn=une|utmcmd=lettredinformation
115 VCL_call     c recv
115 VCL_return   c pipe
115 VCL_call     c pipe
115 VCL_return   c pipe
209 BackendOpen  b server1 192.120.1.182 35846 192.120.1.68 80
115 Backend      c 209 director_of server1
209 TxRequest    b POST
209 TxURL        b /0_2
209 TxProtocol   b HTTP/1.1
209 TxHeader     b x-requested-with: XMLHttpRequest
209 TxHeader     b Accept-Language: fr
209 TxHeader     b Referer: http://www.ouest-france.fr/0_2?page_ref=/0_4
209 TxHeader     b Accept: application/xml, text/xml, */*
209 TxHeader     b Content-Type: application/x-www-form-urlencoded
209 TxHeader     b x-requested-handler: ajax
209 TxHeader     b Accept-Encoding: gzip, deflate
209 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.4; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)
209 TxHeader     b Host: www.ouest-france.fr
209 TxHeader     b Content-Length: 13
209 TxHeader     b Connection: Keep-Alive
209 TxHeader     b Cache-Control: no-cache
209 TxHeader     b Cookie: crm_cookieEnabled=1; __utmc=88432901; xtvrn=$61164$; __utma=88432901.1379090940.1266326763.1274179082.1274180897.682; __utmz=88432901.1274169829.676.57.utmcsr=ofmnewsletter|utmccn=une|utmcmd=lettredinformation
209 TxHeader     b X-Forwarded-For: 91.88.187.67
209 TxHeader     b X-Varnish: 1089132167
209 TxHeader     b X-Forwarded-For: 91.88.187.67
209 TxHeader     b whitelisted: 0
209 TxHeader     b X-Scraping: 1
209 TxHeader     b X-Hostname: 67.187.88-91.rev.gaoland.net
209 BackendClose b server1
115 ReqEnd       c 1089132167 1274251253.994797468 1274251254.165796518 0.000020742 0.001982212 0.169016838
115 StatSess     c 91.88.187.67 4436 0 1 1 1 0 0 948 0

I’ve searched on google about something like that. What I found is a problem on a box from a french ISP which loop on an url.

But i don’t understand why i can’t see all the client traffic with varnishncsa. Is it due to the “pipe” command ?
I don’t understand why there isn’t http status in the varnishncsa for the few lines i’ve captured.

Production test of blocking scraping system


Today is a big day ! This is the first test of our blocking scraping system on production.
We have put it on two of our three varnish.

For a first test we don’t want to block scrapers but just log them.

So we wrote a light version of the script.

# Copyright (c) 2010 OUESTFRANCE-MULTIMEDIA
# All rights reserved.
#
# Author: Vincent ROBERT <v.robert@of2m.fr>
# Thanks to: Tony FOUCHARD <t.fouchard@of2m.fr>
# Thanks to: Cyrille MAHIEUX <c.mahieux@of2m.fr>
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.

C{
  #define HOSTNAME_MAX_SIZE 200
  #include <string.h>
  #include <libmemcached/memcached.h>
  #include <syslog.h>
  #include <netdb.h>
  #include <sys/types.h>
  #include <sys/socket.h>
}C
sub check_ip {
  set bereq.http.whitelisted = "0";
  if (bereq.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|js|css|swf|xml|flv)(\?(.)*)?$") {
     set bereq.http.whitelisted = "1";
  }
  else {
     if (bereq.http.host == "internalwebSite") {
       set bereq.http.whitelisted = "1";
     }
     elsif ((bereq.http.X-Forwarded-For == "IPaddress1") || (bereq.http.X-Forwarded-For == "IPaddress2")) {
       set bereq.http.whitelisted = "1";
     }
  }

  if (bereq.http.whitelisted=="0") {
    C{

      void ip_to_hostname(char *ipaddr, char *hostname)
      {
        struct addrinfo * result;
        int successfull=0;
        int error;

        error = getaddrinfo(ipaddr, NULL, NULL, &result);
        if (error==0) {
           error = getnameinfo(result->ai_addr, result->ai_addrlen, hostname, HOSTNAME_MAX_SIZE, NULL, 0, 0);
           if (error==0) {
             successfull=1;
           }
        }

        freeaddrinfo(result);

        if (!successfull){
          memcpy(hostname, ipaddr, HOSTNAME_MAX_SIZE*sizeof(char));
          hostname[HOSTNAME_MAX_SIZE-1]=0;
        }
      }

      memcached_server_st *servers = NULL;
      memcached_st *memc;
      memcached_return rc;
      char hostname[HOSTNAME_MAX_SIZE]="";

      char key[500]= "varnish_";
      strcat(key, VRT_GetHdr(sp, HDR_REQ, "\005host:"));
      strcat(key, "_");
      strcat(key, VRT_IP_string(sp, VRT_r_client_ip(sp)));

      memc= memcached_create(NULL);
      servers= memcached_server_list_append(servers, "memcacheServer", 11211, &rc);
      rc= memcached_server_push(memc, servers);

      if (rc == MEMCACHED_SUCCESS) {

        uint64_t intval;
        rc= memcached_increment(memc, key, strlen(key), (uint64_t)1, &intval);

        if (rc != MEMCACHED_SUCCESS)
          rc= memcached_set(memc, key, strlen(key), "1", 1, (time_t)60, (uint32_t)0);
        else
          if (intval>30) {
            ip_to_hostname(VRT_IP_string(sp, VRT_r_client_ip(sp)), hostname);
            VRT_SetHdr(sp, HDR_BEREQ, "\013X-Scraping:", "1", vrt_magic_string_end);
            VRT_SetHdr(sp, HDR_BEREQ, "\013X-Hostname:", hostname, vrt_magic_string_end);
            VRT_SetHdr(sp, HDR_BEREQ, "\013X-HostOrig:", VRT_GetHdr(sp, HDR_REQ, "\005host:"), vrt_magic_string_end);
            if (intval<300)
              rc= memcached_set(memc, key, strlen(key), "500", 3, (time_t)600, (uint32_t)0);
          }
      }
      memcached_free(memc);
      memcached_server_list_free(servers);
    }C
    if (bereq.http.X-Scraping=="1") {
      if (!(bereq.http.X-Hostname ~ "(\.(yahoo\.(net|com))|(exabot\.com)|(googlebot\.com)|(search\.msn\.com))$")) {
        C{
          syslog(LOG_INFO, "Scraping detected from %s %s on %s with %s",VRT_IP_string(sp, VRT_r_client_ip(sp)), VRT_GetHdr(sp, HDR_BEREQ, "\013X-Hostname:"), VRT_GetHdr(sp, HDR_REQ, "\005host:"), VRT_GetHdr(sp, HDR_REQ, "\013User-Agent:"));
        }C
      }
    }
  }
}

sub vcl_miss {
  call check_ip;
  fetch;
}

sub vcl_pipe {
  call check_ip;
  pipe;
}

sub vcl_pass {
  call check_ip;
  pass;
}

The load is about 600 requests/s on each Varnish server.
We put just one memcached server.

The result are:

  • No CPU overhead on Varnish or too small to be viewed
  • No memory leak spotted.
  • Memcached server is almost idle: about 60 requests/s and 80 KBytes used !
  • Turn off the memcached server don’t affect Varnish and user experience
  • Good news, the system show us a lot of scrappers and many spiders
  • Bad news, we have detected one or two proxy as scrappers, but they are not.

Here, a piece of our logs :

Jun  9 10:26:40 ouest-proxy03 varnishd[4015]: Scraping detected from 94.23.240.216 ns209194.ovh.net on www.ouest-france.fr with posh
Jun  9 10:29:28 ouest-proxy03 varnishd[4015]: Scraping detected from 94.23.198.123 ns207043.ovh.net on www.ouest-france.fr with Synthesio Crawler release MonaLisa (contact at synthesio dot fr)
Jun  9 10:29:55 ouest-proxy03 varnishd[4015]: Scraping detected from 69.191.249.201 injr-spdrproxy1.bloomberg.com on www.ouest-france.fr with BLP_bbot/0.1
Jun  9 10:29:59 ouest-proxy03 varnishd[4015]: Scraping detected from 89.122.29.79 adsl89-122-29-79.romtelecom.net on www.ouestfrance-emploi.com with Java/1.6.0_04

Using phpMemCacheAdmin, you can see memcache server is fine:





Here the mk-query-digest profiling of memcache server TCPdump:

 ./mk-query-digest --type memcached --group-by res --limit 10 dump_zym.log

# 69.3s user time, 440ms system time, 83.70M rss, 171.79M vsz
# Current date: Tue May 18 12:17:09 2010
# Files: dump_zym.log
# Overall: 70.68k total, 3 unique, 67.20 QPS, 0.01x concurrency __________
#                    total     min     max     avg     95%  stddev  median
# Exec time             9s    20us    22ms   122us   224us   233us    98us
# Time range        2010-05-18 11:45:40.061323 to 2010-05-18 12:03:11.908228
# bytes             17.63k       0       3    0.26    0.99    0.43       0
#  74% (53k)  Memc_incr
#  25% (18k)  Memc_miss
#  25% (18k)  Memc_set

# Profile
# Rank Query ID Response time    Calls R/Call   Item
# ==== ======== ================ ===== ======== =========
#    1 0x           4.9508 57.3% 34673   0.0001
#    2 0x           2.4583 28.5% 17995   0.0001 NOT_FOUND
#    3 0x           1.2310 14.2% 18015   0.0001 STORED

Here the top of one of the Varnish:

top - 16:38:02 up 57 days,  5:09,  3 users,  load average: 0.36, 0.54, 0.54
Tasks: 104 total,   1 running, 103 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.0%us,  1.1%sy,  0.1%ni, 93.2%id,  0.0%wa,  0.3%hi,  0.3%si,  0.0%st
Mem:  33030932k total, 29536176k used,  3494756k free,   781740k buffers
Swap: 11727408k total,     3792k used, 11723616k free,   727416k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2016 varnish   20   0 35.1g  26g  81m S   38 84.2 122:50.26 varnishd
24769 root      20   0 98.8m  80m  80m S    1  0.3   0:03.26 varnishncsa

Here the varnishstat of one of our Varnish:

0+07:03:59                                                                                                                                                                      ouest-proxy02.sdv.fr
Hitrate ratio:       10      100     1000
Hitrate avg:     0.7814   0.7972   0.8006

     3693144       103.00       145.18 Client connections accepted
    13991870       556.00       550.02 Client requests received
    11372520       469.00       447.05 Cache hits
       14376         1.00         0.57 Cache hits for pass
     2546655        85.00       100.11 Cache misses
     2622997        87.00       103.11 Backend conn. success
          41         0.00         0.00 Backend conn. not attempted
         507         0.00         0.02 Backend conn. failures
          96         0.00         0.00 Backend conn. reuses
         996         0.00         0.04 Backend conn. was closed
        1093         0.00         0.04 Backend conn. recycles
          36         0.00         0.00 Fetch head
     2338109        71.00        91.91 Fetch with Length
      226850        15.00         8.92 Fetch chunked
         932         0.00         0.04 Fetch wanted close
        1540          .            .   N struct sess_mem
         691          .            .   N struct sess
     2151242          .            .   N struct object
      877521          .            .   N struct objecthead
          56          .            .   N struct vbe_conn
         341          .            .   N struct bereq
         144          .            .   N worker threads
        1474         0.00         0.06 N worker threads created
           0         0.00         0.00 N queued work requests
       19144         0.00         0.75 N overflowed work requests
          29          .            .   N backends
         549          .            .   N expired objects
      395693          .            .   N LRU nuked objects
      345902          .            .   N LRU moved objects
          23         0.00         0.00 HTTP header overflows
    10527830       435.00       413.85 Objects sent with write
          12         0.00         0.00 Objects overflowing workspace
     3693079       113.00       145.17 Total Sessions
    13991789       557.00       550.01 Total Requests
       57104         2.00         2.24 Total pipe
       19567         0.00         0.77 Total pass
     2565891        90.00       100.86 Total fetch
  5115095581    210266.25    201072.98 Total header bytes
 71922504943   2823420.83   2827253.62 Total body bytes
      779458        22.00        30.64 Session Closed
         407         0.00         0.02 Session Pipeline
         317         0.00         0.01 Session Read Ahead
    13466511       535.00       529.36 Session Linger
    12094595       508.00       475.44 Session herd
   680468640     26550.16     26749.03 SHM records
    40799330      1602.01      1603.81 SHM writes
       52523         2.00         2.06 SHM flushes due to overflow
       64353         1.00         2.53 SHM MTX contention
         292         0.00         0.01 SHM cycles through buffer
     5476474       264.00       215.28 SMA allocator requests
     4256875          .            .   SMA outstanding allocations
 32212244810          .            .   SMA outstanding bytes
 63522967274          .            .   SMA bytes allocated
 31310722464          .            .   SMA bytes free
        1494         0.00         0.06 SMS allocator requests
      776444          .            .   SMS bytes allocated
      776444          .            .   SMS bytes freed
     2566011        86.00       100.87 Backend requests made
           4         0.00         0.00 N vcl total
           4         0.00         0.00 N vcl available
           1          .            .   N total active purges
           1         0.00         0.00 N new purges added

Next step: deploy on our last server, and really blocking !

A way to obtain in almost realtime a list of slow webpages with LogParser and IIS



Today, I’ve a problem with one of our website.

In google webmaster tools, crawl stats show a big increase of time spent downloading a page. The time have increased from 250ms to 500ms.





So, i’m looking for a solution to obtain which are the slow webpages in order to profile them.


My backend is under IIS 6.0. In the logs, we have append the time-taken.

So, with the great and powerfull Logparser, i’ve done that:

logparser "SELECT time-taken,strcat(strcat(cs-uri-stem,'?'), cs-uri-query) into c:\slowpages.csv from \\servername\logfiles\W3SVC36\ex100510.log where time-taken>500 order by time-taken desc" -i:IISW3C -o:CSV -iCheckPoint:myCheckPoint.lpc

And I’ve obtain a list like that:

time-taken STRCAT(STRCAT(cs-uri-stem, ‘?’), cs-uri-query)
46999 /page1.asp?MAR_encode=agence-immobiliere&VIT_encode=blot-habitat-nantes-buat
46593 /page2.asp?p1=maison&p2=louzy&p3=4431849&IN_LocInsee=79157
46202 /page2.asp?p1=appartement&p2=medreac&p3=5-pieces-4140247&IN_LocInsee=35171
45718 /page2.asp?p1=appartement&p2=nantes&p3=2-pieces-4435946&IN_LocInsee=44109
40593 /page2.asp?p1=maison&p2=noyal-sur-vilaine&p3=4058790&IN_LocInsee=35207
35671 /page2.asp?p1=maison&p2=pornichet&p3=3738904&IN_LocInsee=44132



Oh shit! What is that ? A web page taking about 40 seconds ???!!!!

I need to profile now !!!


Note, i’ve used the “-iCheckPoint:myCheckPoint.lpc” option, so when we restart analysis, it will resume from the last checkpoint. So you can launch the analysis every minute to have an almost realtime analysis.


Note again, if you want to filter on googlebot, you can add the clause: cs(User-Agent) like ‘%GoogleBot%’ .

You can obtain the average time for the googlebot using:

logparser "SELECT avg(time-taken), cs(User-Agent) from \\servername\logfiles\W3SVC36\ex100510.log where cs(User-Agent) like '%GoogleBot%' group by cs(User-Agent)" -i:IISW3C

Varnish 2.1.2 released

From varnish-announce@varnish-cache.org :

Hot on the heels of 2.1.1, Varnish 2.1.2 is now released.


It fixes a regression introduced with Range support which was added in 2.1.1 whereby large objects would get junk appended to them. This would not show up in browser due to the Content-Length header, but confused at least some load balancers leading to very strange errors.


The release can be downloaded from Sourceforge, as usual: http://sourceforge.net/projects/varnish/

HEADS-UP: Bug in Varnish 2.1.1 may append junk to objects

From varnish-announce@varnish-cache.org :



Thanks to Audun Ytterdal, I just fixed a really boneheaded bug I introduced in Varnish 2.1.1


This will warrant a 2.1.2 release, but because that typically takes a week or so, I am sending this heads-up to the varnish-announce list, so those of you who are affected can take evasive action in the meantime.


If you backend sends chunked encoding (typically only CGI processing) and delivers objects larger than 128k, you may hit this bug.


To fix this for good, you can apply the one-line patch attached to the commit message below.


As a workaround, you can increase the “fetch_chunksize” parameter to be bigger than any object your backend may deliver with chunked encoding.  Be aware that this will affect storage usage negatively.


Telling your backend to not use chunked encoding (if possible) is also a water-tight workaround.


My apologies,


Poul-Henning