[Privoxy-users] Help filtering text or syntax in a web page.

Fabian Keil fk at fabiankeil.de
Tue Dec 28 11:53:05 UTC 2021


Jason Parson <jasonlparson at gmail.com> wrote on 2021-12-20 at 23:28:58
(forwarding with his permission):

> Is it possible to filter text in a web page? I want to block any reference
> to places like those shown below. I use uMatrix or uBlock to see if the
> pages have the sites attached to them. Here is an example: I don't want
> adobedtm.com, go-mpulse.net, unpkg.com or googleapis.com to show up. I want
> Privoxy to block the sites completely. Is it possible to filter these
> and other sites like them?

Yes. Privoxy can remove URLs from pages, in which case the client
will not request them.

Letting Privoxy block requests is usually easier, though.
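
As a rough, untested sketch of the blocking approach: a section like the
following could be added to user.action. The host names follow the spelling
in the quoted uBlock list below, and the reason text in braces is only
illustrative.

```
# user.action
# A leading dot matches the domain itself and all of its subdomains.
{ +block{Unwanted third-party host.} }
.adobedtm.com
.go-mpulse.net
.unpkg.com
.googleapis.com
```

Note that blocking a widely used host such as googleapis.com is likely to
break pages that load scripts or fonts from it.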

> I have reviewed the documentation a number of times. Either I am missing it
> or it is not there.

Filters are documented at:
https://www.privoxy.org/user-manual/filter-file.html
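
A minimal sketch of a filter that strips absolute URLs pointing at two of
the hosts mentioned above. The filter name "strip-unwanted-hosts" and the
regular expression are assumptions made for illustration, not something
shipped with Privoxy, and the pattern does not cover protocol-relative
URLs (those starting with "//").

```
# user.filter
FILTER: strip-unwanted-hosts Remove URLs pointing at selected third-party hosts.
s@https?://(?:[^/"'\s<>]*\.)?(?:adobedtm\.com|go-mpulse\.net)[^"'\s<>]*@@ig

# user.action
# Apply the filter to all sites.
{ +filter{strip-unwanted-hosts} }
/
```

This assumes user.filter is enabled through a filterfile directive in the
main config. For pages fetched over HTTPS, content filters only take effect
if HTTPS inspection is enabled.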

> I would also like to find out if it is possible to filter content the same
> way uBlock does. Here is an example of my filters. I hope you understand
> it. I want to be able to filter third-party sites, inline-scripts,
> inline-fonts, etc., and other parameters.
> 
> ||amazonaws*^$important,all
> 
> ||google*^$important,all
> 
> ||googletagmanager^$important,all
> 
> ||google-analytics^$important,all
> 
> ||googletagservices^$important,all
> 
> ||adobedtm^$important,all
> 
> ||launchdarkly^$important,all
> 
> ||newrelic^$important,all
> 
> ||segment^$important,all
> 
> ||amazon-adsystem.com^$important,all
> 
> ||aswpsdkus*^$important,all
> 
> ||chartbeat*^$important,all
> 
> ||criteo.net^$important,all
> 
> ||demdex*^$important,all
> 
> ||doubleclick.net^$important,all
> 
> ||go-mpulse^$important,all
> 
> ||h-cdn*^$important,all
> 
> ||imrworldwide.com^$important,all
> 
> ||lytics*^$important,all
> 
> ||omny*^$important,all
> 
> ||opecloud.com^$important,all
> 
> ||rlcdn.com^$important,all
> 
> ||taplytics*^$important,all
> 
> ||yimg*^$important,all
> 
> ||litix*^$important,all
> 
> ||cachefly*^$important,all
> 
> ||s2.pluralsight.com/analytics/*^$important,all
> 
> ||pluralsight.com/analytics/*^$important,all
> 
> ||demdex.net^$important,all
> 
> ||indexww*^$important,all
> 
> ||ispot*^$important,all
> 
> ||kampyle*^$important,all
> 
> ||scene7*^$important,all
> 
> ||facebook*^$important,all
> 
> ||romote*^$important,all
> 
> ||hibu*^$important,all
> 
> ||audioeye*^$important,all
> 
> ||flipboard*^$important,all
> 
> #||arkoselabs.com*^$important,all
> 
> #||.^$important,all
> 
>  
> 
> *$1p,important,all
> 
> *$3p,important,all
> 
> *$image,important,all
> 
> *$third-party,important,all
> 
> *$frame,important,all
> 
> *$script,important,all
> 
> *$document,important,all
> 
> *$subdocument,important,all
> 
> *$stylesheet,important,all
> 
> *$css,important,all
> 
> *$xhr,important,all
> 
> *$beacon,important,all
> 
> *$font,important,all
> 
> *$inline-font,important,all
> 
> *$inline-script,important,all
> 
> *$ghide,important,all
> 
> *$first-party,important,all
> 
> *$doc,important,all
> 
> *$shide,important,all
> 
> *$strict1p,important,all
> 
> *$popunder,important,all
> 
> *$strict3p,important,all
> 
> *$empty,important,all
> 
> #*$all,important
> 
>  
> 
> *$removeparam
> 
> *$removeparam=/^*
> 
> *$removeparam=/^/
> 
> *$removeparam=/^utm*_/
> 
> *$removeparam=/^utm*_*/
> 
> *$removeparam=/^__*/
> 
> *$removeparam=/sp_
> 
> *$removeparam=/^_*/
> 
> *$removeparam=/^-/
> 
>  
> 
> #@#+js()
> 
>  
> 
> #*/*.js$important,all,redirect=noopjs:100
> 
> #||*/*.js$important,all,redirect-rule=noopjs
> 
>  
> 
> #||google.*/*.js$all,redirect=noopjs:100
> 
> #||googleapis.*/*.*$all,redirect=noopjs:100
> 
>  
> 
> @@||app.pluralsight.com^$stylesheet,domain=app.pluralsight.com
> 
> @@||app.pluralsight.com^$image,domain=app.pluralsight.com
> 
> @@||app.pluralsight.com^$script,domain=app.pluralsight.com
> 
> @@||app.pluralsight.com^$xhr,domain=app.pluralsight.com
> 
> @@||img.pluralsight.com^$image,domain=app.pluralsight.com
> 
> @@||s2.pluralsight.com^$stylesheet,domain=app.pluralsight.com
> 
> @@||s2.pluralsight.com^$image,domain=app.pluralsight.com
> 
> @@||s2.pluralsight.com^$script,domain=app.pluralsight.com
> 
> @@||vid21.pluralsight.com^$xhr,domain=app.pluralsight.com
> 
> @@||vid5.pluralsight.com^$xhr,domain=app.pluralsight.com
> 
> @@||gravatar.com^$image,domain=app.pluralsight.com
> 
>  
> 
> @@||imgix.net^$image,domain=pluralsight.imgix.net
> 
> @@||imgix2.net^$image,domain=pluralsight.imgix.net
> 
>  
> 
> ! 2021-10-29 https://www.pluralsight.com
> 
> @@||www.pluralsight.com.cdn.cloudflare.net^$stylesheet,domain=www.pluralsight.com
> 
> @@||www.pluralsight.com.cdn.cloudflare.net^$image,domain=www.pluralsight.com
> 
>  
> 
> @@||google.com^$stylesheet,domain=google.com
> 
> @@||google.com^$image,domain=google.com
> 
>  
> 
> @@||gstatic.com^$stylesheet,domain=paypal.com
> 
> @@||gstatic.com^$image,domain=paypal.com
> 
> @@||gstatic.com^$script,domain=paypal.com
> 
>  
> 
> @@||gstatic.com^$stylesheet,domain=paypalobjects.com
> 
> @@||gstatic.com^$image,domain=paypalobjects.com
> 
> @@||gstatic.com^$script,domain=paypalobjects.com
> 
>  
> 
> @@||gstatic.com^$stylesheet,domain=recaptcha.net
> 
> @@||gstatic.com^$image,domain=recaptcha.net
> 
> @@||gstatic.com^$script,domain=recaptcha.net

I'm not familiar with this syntax, but maybe others on this list are.

> And my last question: when I go to a site, sometimes it works and other
> times it does not. Do you have or know of a tool that can tell me why a
> page fails as it relates to Privoxy? I want my system security very strict.
> I need to know why a given site does not work.

In general it helps to enable debugging as described at:
<https://www.privoxy.org/user-manual/contact.html>
and check the logs.
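
As a sketch, these are debug lines for the main config file that tend to be
useful when tracking down why a page fails. The selection is an assumption
about what matters in this case; the full list of levels is in the
user manual.

```
debug     1 # Log the destination for each request.
debug    64 # Debug regular expression filters.
debug  1024 # Log the destination for requests Privoxy didn't let through, and the reason why.
debug  4096 # Startup banner and warnings.
debug  8192 # Non-fatal errors.
```

With Privoxy set as the browser's proxy, the page at
http://config.privoxy.org/show-url-info also shows which actions apply to
a given URL.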

Fabian