[Privoxy-devel] 0007-Create-a-fast-CGI-function

Fabian Keil fk at fabiankeil.de
Sun Sep 10 08:52:38 CEST 2023


Lee <ler762 at protonmail.com> wrote on 2023-09-08 at 19:05:16:

> On Thursday, September 7th, 2023 at 12:06 PM, Fabian Keil wrote:
> 
> > I suppose some filled-out templates could be cached as another
> > mechanism to speed things up.
> 
> But then you'll need to check the cache to see if it's out of
> date & how do you do that without accessing the disk?

Some of the templates could be cached until the config file
gets reloaded; mod-title is one example.

Even if we continued to check the config file for each
request, caching mod-title and friends should help a bit
by reducing the number of system calls Privoxy makes.
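
To illustrate the idea, here is a rough standalone sketch of a
cache that stays valid until the config is reloaded. All names
(struct cached_template, cache_lookup(), cache_store() and the
config_generation counter) are made up for this example and do
not exist in Privoxy; the counter merely stands in for "the
config file got reloaded".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry; not an existing Privoxy structure. */
struct cached_template
{
   char *body;                /* Filled-out template or NULL */
   unsigned long generation;  /* Config generation it was built for */
};

/* Return the cached body if it is still valid, otherwise NULL. */
static const char *cache_lookup(const struct cached_template *cache,
                                unsigned long config_generation)
{
   if (cache->body != NULL && cache->generation == config_generation)
   {
      return cache->body;
   }
   return NULL;
}

/* Replace the cached body. Returns 0 on success, -1 if out of memory. */
static int cache_store(struct cached_template *cache,
                       const char *body, unsigned long config_generation)
{
   size_t size = strlen(body) + 1;
   char *copy = malloc(size);

   if (copy == NULL)
   {
      return -1;
   }
   memcpy(copy, body, size);
   free(cache->body);
   cache->body = copy;
   cache->generation = config_generation;
   return 0;
}
```

A lookup that hits needs no file access at all. A miss would
fall back to the usual template loading and then store the
result for the current generation.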

> Maybe you can come up with something that is fast; all
> I could come up with is what I did in cgi_show_url_final_info:
>   char body[] = \
>   "<!DOCTYPE html><html lang=\"en\"><head><title>URL Block Info</title></head>\n"\
>   ... </html>\n";
>   ...
>   /* return template_fill_for_cgi(csp, "show-url-final-info", exports, rsp);   -LR- */
>   rsp->body = strdup_or_die(body);
>   template_fill(&rsp->body, exports);
>   free_map(exports);
>   return 0;

Doing it without template_fill_for_cgi() (which leads to lots
of pcre compilations and pcrs substitutions) should be even faster,
but of course every small step would help.
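
For a page with only a single placeholder, a plain format
string could replace the substitution step entirely. The
following is only a sketch under assumptions:
build_url_info_body() is a made-up helper, the markup is a
stand-in rather than the real show-url-final-info page, and
the caller is assumed to have HTML-encoded the value already.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of Privoxy.
 * Returns a malloc'd page with url_html inserted or NULL on failure.
 * url_html must already be HTML-encoded by the caller.
 */
static char *build_url_info_body(const char *url_html)
{
   /* Placeholder markup, not the real template content. */
   static const char format[] =
      "<!DOCTYPE html><html lang=\"en\"><head>"
      "<title>URL Block Info</title></head>\n"
      "<body><p>%s</p></body></html>\n";
   int length;
   char *body;

   /* The first pass only measures the required size. */
   length = snprintf(NULL, 0, format, url_html);
   if (length < 0)
   {
      return NULL;
   }
   body = malloc((size_t)length + 1);
   if (body == NULL)
   {
      return NULL;
   }
   snprintf(body, (size_t)length + 1, format, url_html);
   return body;
}
```

This trades the flexibility of the template files for speed,
so it probably only makes sense for pages that are requested
very frequently.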

> > > I expect it to be noticeably slower when everything is happening
> > > on the same machine. I originally had this in my awk program for
> > > talking to Privoxy:
> > > 
> > > print "GET http://config.privoxy.org/show-url-final-info?url=" url " HTTP/1.1" |& webserver
> > > print "Host: config.privoxy.org" |& webserver
> > > print "Accept: text/html" |& webserver
> > > print "Connection: Keep-Alive" |& webserver
> > > print "" |& webserver
> > > 
> > > That's slower than
> > > 
> > > printf("GET http://config.privoxy.org/show-url-final-info?url=%s HTTP/1.1\r\n"\
> > > "Host: config.privoxy.org\r\n"\
> > > "Accept: text/html\r\n"\
> > > "Connection: Keep-Alive\r\n\r\n", url ) |& webserver
> > > 
> > > because the multiple print statements tend to cause multiple writes
> > > to Privoxy. The single printf causes a single write to Privoxy and
> > > is clearly faster.
> > 
> > That's what I would expect.
> 
> Not me.  Or at least, not the me back then :)
> I was seeing multiple tcp packets with wireshark and not coming up
> with a reasonable explanation for why.  It took me a while before
> trying to do everything in a single write.

While we are on the topic of TCP packets, you could also
experiment with not letting Privoxy set TCP_NODELAY (see
set_no_delay_flag() in jbsockets.c) on the socket Privoxy
accepts the connection on.

It's not guaranteed to make things faster (after all we set
the TCP_NODELAY flag for a reason) but it could reduce the
number of packets Privoxy sends, which may reduce the number
of context switches your awk program causes.
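
For reference, toggling the flag on a socket looks roughly
like this on a POSIX system. This is a standalone sketch and
not the code from jbsockets.c; set_tcp_nodelay() and
get_tcp_nodelay() are names made up for the example.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (1) or disable (0) TCP_NODELAY. Returns 0 on success. */
static int set_tcp_nodelay(int fd, int enabled)
{
   int flag = enabled ? 1 : 0;

   return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Returns 1 if TCP_NODELAY is set, 0 if not and -1 on error. */
static int get_tcp_nodelay(int fd)
{
   int flag = 0;
   socklen_t length = sizeof(flag);

   if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &length) != 0)
   {
      return -1;
   }
   return flag != 0;
}
```

With the flag cleared, the kernel may coalesce small writes
into fewer packets at the cost of some added latency.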

> > Is your awk program free software and publicly available?
> 
> GPL yes. Available, maybe.  I tried to update my github repository
> and after multiple pushes _finally_ saw something in the contrib directory.
>   https://github.com/ler762/privoxy/tree/lee/contrib

Thanks, I'll take a look.

> Hopefully the "nothing, nothing, nothing, YaY! there it is,
> nothing, nothing" flakiness was because I'd just updated it
> and they took _way_ too long to update all their web servers.

I'm not sure I follow. Are you referring to Microsoft/GitHub,
or who are "they"?

Fabian