[Privoxy-users] external-filter example

Lee ler762 at gmail.com
Thu Jul 23 03:15:18 UTC 2020


On 7/22/20, U.Mutlu <um at mutluit.com> wrote:
> Lee wrote on 07/22/2020 09:29 PM:
>>
>> Have you added
>> debug    64 # debug regular expression filters
>> to the config file to see if privoxy tries to run the filter?
>
> Thx. Now tried with "debug 64" and it shows that it recognizes the said
> external-filter, but IMHO the action for it is not executed:
  <.. snip ..>
correct, a filter being applied looks like
2020-07-22 21:38:06.979 000004a0 Re-Filter: filtering 'Accept-Encoding: gzip, deflate, br' (size 34) with 'no-brotli-accepted' ...
2020-07-22 21:38:06.979 000004a0 Re-Filter: ... produced 1 hits (new size 30).

> There is no other log entry regarding this "myTee" filter.
> Ie. it seems the action for it does not get executed.

Did you add the filter call to your user.action?  for example:
A long time ago flamingtext made you wait for the pretty picture.  I
got annoyed, so I added a filter to remove the wait.  In user.filter:
#
# flamingtext: removes image wait timer
#
# flamingtext.com - removes the timeout obstruction so you don't have
# to wait for your image
FILTER: flamingtext no waiting

s@<META HTTP-EQUIV=Refresh CONTENT="15; URL=@<META HTTP-EQUIV=Refresh CONTENT="1; URL=@ig
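
In Python terms, that substitution does roughly the following. This is only a sketch to show the effect: Privoxy uses its own PCRE-based engine rather than Python's re module, and the function name and the sample tag are made up for illustration.

```python
import re

# Same literal pattern as the filter job above; the "i" flag in "@ig"
# becomes re.IGNORECASE, and "g" (replace all matches) is what re.sub
# does by default.
PATTERN = re.compile(r'<META HTTP-EQUIV=Refresh CONTENT="15; URL=', re.IGNORECASE)
REPLACEMENT = '<META HTTP-EQUIV=Refresh CONTENT="1; URL='


def shorten_refresh(html):
    """Replace the 15 second refresh prefix with a 1 second one."""
    return PATTERN.sub(REPLACEMENT, html)


if __name__ == "__main__":
    # Hypothetical page fragment, not taken from the real site.
    sample = '<meta http-equiv=refresh content="15; URL=/image.png">'
    print(shorten_refresh(sample))
```

Note that the match is case-insensitive but the replacement text is inserted exactly as written in the filter.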

Then activate the filter just for the flamingtext domain in user.action:
## don't wait for flamingtext.com
{ +filter{flamingtext} }
.flamingtext.com/

and now, http://config.privoxy.org/show-url-info?url=flamingtext.com gives me
In file: user.action
{+filter{flamingtext} }
.flamingtext.com/
{+server-header-tagger{content-type} }
/
{-deanimate-gifs }
.flamingtext.com/

>> If the filter isn't being called, maybe you need
>> https://www.privoxy.org/user-manual/filter-file.html
>>    Enabled content filters are applied to any content whose "Content
>> Type" header is recognised as a sign of text-based content, with the
>> exception of text/plain. Use the force-text-mode action to also filter
>> other content.
>
> But doing so seems to be dangerous according to the doc
> https://www.privoxy.org/user-manual/actions-file.html#FORCE-TEXT-MODE
> It says:
> "Warning - Think twice before activating this action. Filtering binary data
> with regular expressions can cause file damage."
>
>
> I just want to get a copy of the whole HTML that privoxy gets from the
> server,
> either the unmodified raw data, or the filtered version after
> privoxy applies the other defined filters.
>
> I think the HTML is always a text file,

oops, no, not at all

> even if it contains binary data
> like picture images etc. (they are somehow text encoded I think using
> base64 encoding or so). Is this assumption wrong?

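it is wrong: images are normally fetched as separate responses with their
own binary Content-Type (image/png, image/jpeg, ...), not base64-encoded
into the HTML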

> If I read the doc right then the external-filter handler has to write
> the received data either verbatim (ie. no filtering), or a filtered
> version of the data, back to stdout, so privoxy can continue doing its job.
>
> A working example for such an external-filter would be helpful.
> Unfortunately I couldn't find any on the web :-(

How about
  http://config.privoxy.org/user-manual/actions-file.html#SERVER-HEADER-TAGGER

Add this to your user.action:
# Tag every request with the content type declared by the server
{+server-header-tagger{content-type}}
/

# apply myTee filter to all text/html content
{ +external-filter{myTee} +force-text-mode}
TAG:^text/html

and the bit you have in user.filter should be something like

EXTERNAL-FILTER: myTee make a copy of received text
/usr/bin/tee myCopiedText

maybe that will work?
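
If tee isn't flexible enough, a small script can do the same job. Here is an untested-against-Privoxy sketch in Python, going by the description above that an external filter gets the content on stdin and has to write it back to stdout. The script name (tee_filter.py), its install location and the copy path are all made up for the example.

```python
#!/usr/bin/env python3
"""Pass stdin through to stdout unchanged and save a copy to a file."""
import sys


def tee_filter(src, dst, copy_path):
    """Read all bytes from src, save them to copy_path, write them to dst.

    Returns the number of bytes passed through.
    """
    data = src.read()
    # Binary mode, so the copy is byte-for-byte what was received.
    with open(copy_path, "wb") as fh:
        fh.write(data)
    dst.write(data)
    dst.flush()
    return len(data)


# Only runs when exactly one argument (the copy path) is given;
# the path itself is an assumption, pick one Privoxy can write to.
if __name__ == "__main__" and len(sys.argv) == 2:
    tee_filter(sys.stdin.buffer, sys.stdout.buffer, sys.argv[1])
```

The user.filter entry would then look something like this (paths hypothetical again):

EXTERNAL-FILTER: myTee make a copy of received text
/usr/local/bin/tee_filter.py /tmp/privoxy-copy.html

An absolute path for the copy avoids depending on whatever directory privoxy happens to be running in.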

Lee

