[Privoxy-devel] add a new cgi function?
Lee
ler762 at gmail.com
Sat Aug 4 11:41:03 UTC 2018
On 8/4/18, Fabian Keil <fk at fabiankeil.de> wrote:
> Lee <ler762 at gmail.com> wrote:
>
>> On 8/1/18, Lee <ler762 at gmail.com> wrote:
>
>> > How do you feel about adding a 'show url info' cgi function more
>> > suitable for use by a program?
>
> Seems reasonable.
>
>> > show-url-info hits the disk way too much.. I had process monitor
>> > running
>> > ( https://docs.microsoft.com/en-us/sysinternals/downloads/procmon )
>> > and each call to show-url-info reads
>> > templates\show-url-info Offset: 0, Length: 4,096, Priority: Normal
>> > templates\show-url-info Offset: 4,096, Length: 4,096
>> > templates\show-url-info Offset: 8,192, Length: 1,998
>> > templates\mod-title
>> > templates\mod-unstable-warning
>> > templates\mod-local-help
>> > templates\mod-support-and-service
>> >
>> > and takes about 0.001 second. Which seems fast enough until you
>> > realize disk i/o is going to take about a minute for every 60,000
>> > calls. And the hosts file I'm working with is
>> > $ grep -v "^#" unified-hosts.txt | wc -l
>> > 63266
>> >
>> >
>> > The other big change was not checking for file system changes in
>> > jcc.c::serve
>> > $ git diff jcc.c
>> > diff --git a/jcc.c b/jcc.c
>> > index d1fca148..df4f1448 100644
>> > --- a/jcc.c
>> > +++ b/jcc.c
>> > @@ -3311,7 +3311,10 @@ static void serve(struct client_state *csp)
>> > }
>> > }
>> >
>> > - if (continue_chatting && any_loaded_file_changed(csp))
>> > +/*
>> > + * don't check for action/filter file changes if processing cgi requests
>> > + */
>> > + if (continue_chatting && !(csp->flags & CSP_FLAG_CRUNCHED) && any_loaded_file_changed(csp))
>> > {
>> > continue_chatting = 0;
>> > config_file_change_detected = 1;
>> >
>> >
>> > I was thinking I'd just make another cgi function that would call
>> > any_loaded_file_changed & that would take care of it.
>>
>> except it looks like any_loaded_file_changed doesn't actually reload
>> anything that changed. Rather than add another function I added this
>> bit to cgisimple.c so that show-status would make sure everything was
>> current before showing the status:
>> diff --git a/cgisimple.c b/cgisimple.c
>> index 1f13d5ce..6e1c9387 100644
>> --- a/cgisimple.c
>> +++ b/cgisimple.c
>> @@ -1084,6 +1084,15 @@ jb_err cgi_show_status(struct client_state *csp,
>> assert(rsp);
>> assert(parameters);
>>
>> + /*
>> + * make sure config files are current
>> + */
>> + if (run_loader(csp))
>> + {
>> + log_error(LOG_LEVEL_FATAL, "a loader failed - must exit");
>> + /* Never get here - LOG_LEVEL_FATAL causes program exit */
>> + }
>> +
>> if ('\0' != *(lookup(parameters, "file")))
>> {
>>
>> are you ok with that change?
>
> Why is this necessary? run_loader() is called from listen_loop()
> already.
My guess[1] is that it's because I have this configured:
default-server-timeout 300
and my program adds a
"Connection: keep-alive"
header to each request. I didn't want disk accesses slowing things down
while I'm checking URLs, so I changed serve() to not call
any_loaded_file_changed() for CGI requests
(see attached diff.txt).
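The effect of the extra flag test can be sketched in isolation. This is a minimal sketch, not Privoxy code: DEMO_FLAG_CRUNCHED, demo_any_loaded_file_changed() and demo_config_change_detected() are made-up stand-ins for CSP_FLAG_CRUNCHED, any_loaded_file_changed() and the condition in serve(). It assumes, as the patch comment does, that CGI requests carry the crunched flag.

```c
#include <assert.h>

/* Hypothetical stand-in for CSP_FLAG_CRUNCHED; the real value is defined by Privoxy. */
#define DEMO_FLAG_CRUNCHED 0x01u

/* Counts how often the (expensive) file change check actually runs. */
static int demo_stat_calls = 0;

/* Stand-in for any_loaded_file_changed(): pretend nothing changed. */
static int demo_any_loaded_file_changed(void)
{
   demo_stat_calls++;
   return 0;
}

/*
 * Returns 1 if the connection should be closed because a config
 * file changed, 0 otherwise. Because && short-circuits, the file
 * check is never reached when the crunched flag is set.
 */
static int demo_config_change_detected(int continue_chatting, unsigned int flags)
{
   return continue_chatting
      && !(flags & DEMO_FLAG_CRUNCHED)
      && demo_any_loaded_file_changed();
}
```

Calling demo_config_change_detected(1, DEMO_FLAG_CRUNCHED) leaves demo_stat_calls untouched, while the same call with flags 0 increments it once. The trade-off is the one discussed in this thread: a kept-alive connection that only carries CGI requests no longer notices config changes on its own.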
We had a "TODO list proposal" thread back in April 2009 where I came up with
# awk script to see if a URL is blocked by Privoxy or not
I ended up making two curl calls to show-url-info before running the
script because one call didn't always reload the config. Adding this
code to show-status guarantees that a single curl call is enough to
refresh the config files.
Background: Windows 10 isn't so much an operating system as an O/S As A
Service. Major updates tend to reset things back to the defaults. My
original block-test.awk was taking almost 20 minutes to run, but after a
Windows update last week that required a reboot, the next run of the
original program took almost 36 minutes. I probably need to exclude
things from anti-virus checking again, but even with that it's still
going to take way too long. Using the gawk |& extension to talk to a
co-process and changing the Privoxy code to not read from disk gets it
down to about 1 minute.
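For scale, the disk i/o estimate quoted earlier in the thread works out as follows. This is only the arithmetic, written as a small C helper. The name demo_disk_seconds is made up for illustration; the 0.001 second per call and the 63266 line count are the figures given in this thread.

```c
#include <assert.h>

/* Estimated seconds spent on template reads for a given number of CGI calls. */
static double demo_disk_seconds(long calls, double seconds_per_call)
{
   assert(calls >= 0);
   return (double)calls * seconds_per_call;
}
```

With those figures, demo_disk_seconds(63266, 0.001) comes to a little over 63 seconds spent on template reads alone for one pass over the hosts file.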
Lee
[1] I have a hard time following the code & it's quite possible I
missed or misunderstood something.
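The prefix handling that cgi_show_url_final_info carries over from cgi_show_url_info (the four cases listed in its comment) can be tried out on its own. The following is a standalone sketch using only the C library. demo_dup() and demo_normalize_url_param() are hypothetical helpers, not Privoxy functions, and the sketch skips the chomp() whitespace trimming that the patch does first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed copy of s, or NULL if out of memory. */
static char *demo_dup(const char *s)
{
   size_t n = strlen(s) + 1;
   char *copy = malloc(n);
   if (copy != NULL) memcpy(copy, s, n);
   return copy;
}

/*
 * Returns a malloc'ed, normalized copy of the url parameter, or
 * NULL if out of memory. An empty result means "no URL given".
 */
static char *demo_normalize_url_param(const char *param)
{
   const char *scheme;
   char *result;

   assert(param != NULL);

   /* Cases 1 and 2: a bare "http://" or "https://" counts as no URL. */
   if (0 == strncmp(param, "http://", 7))
   {
      return demo_dup(param[7] == '\0' ? "" : param);
   }
   if (0 == strncmp(param, "https://", 8))
   {
      return demo_dup(param[8] == '\0' ? "" : param);
   }

   /* Case 4: nothing given, leave it empty for the caller. */
   if (param[0] == '\0')
   {
      return demo_dup("");
   }

   /* Some other scheme before the first slash: keep as-is. */
   scheme = strstr(param, "://");
   if (scheme != NULL && scheme < strchr(param, '/'))
   {
      return demo_dup(param);
   }

   /* Case 3: no prefix before the first slash, assume http:// */
   result = malloc(strlen("http://") + strlen(param) + 1);
   if (result != NULL)
   {
      strcpy(result, "http://");
      strcat(result, param);
   }
   return result;
}
```

For example, "example.com/ads" becomes "http://example.com/ads", a bare "http://" becomes the empty string, and "https://example.com" is returned unchanged. The caller owns the returned string and has to free() it.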
-------------- next part --------------
diff --git a/actions.c b/actions.c
index 6249de9e..d87b0cc9 100644
--- a/actions.c
+++ b/actions.c
@@ -778,6 +778,54 @@ jb_err merge_current_action (struct current_action_spec *dest,
}
+/*********************************************************************
+ *
+ * Function : merge_single_actions
+ * same thing as merge_current_action except
+ * skip processing of multi actions
+ * no, i have no idea what the diff is between single & multi actions
+ *
+ * Description : Merge two actions together.
+ * Similar to "dest += src".
+ * Differences between this and merge_actions():
+ * the single-valued strings are copied with
+ * strdup_or_die(), so "src" may be freed before
+ * "dest", and you'd better free "dest" using
+ * "free_current_action".
+ * Also, there is no mask or remove lists in dest.
+ * (If we're applying it to a URL, we don't need them)
+ *
+ * Parameters :
+ * 1 : dest = Current actions, to modify.
+ * 2 : src = Action to add.
+ *
+ * Returns 0 : no error
+ * !=0 : error, probably JB_ERR_MEMORY.
+ *
+ *********************************************************************/
+jb_err merge_single_actions (struct current_action_spec *dest,
+ const struct action_spec *src)
+{
+ int i;
+ jb_err err = JB_ERR_OK;
+
+ dest->flags &= src->mask;
+ dest->flags |= src->add;
+
+ for (i = 0; i < ACTION_STRING_COUNT; i++)
+ {
+ char * str = src->string[i];
+ if (str)
+ {
+ str = strdup_or_die(str);
+ freez(dest->string[i]);
+ dest->string[i] = str;
+ }
+ }
+ return err;
+}
+
+
/*********************************************************************
*
* Function : update_action_bits_for_tag
diff --git a/actions.h b/actions.h
index af401766..a36c74f9 100644
--- a/actions.h
+++ b/actions.h
@@ -70,6 +70,8 @@ extern void init_current_action (struct current_action_spec *dest);
extern void free_current_action (struct current_action_spec *src);
extern jb_err merge_current_action (struct current_action_spec *dest,
const struct action_spec *src);
+extern jb_err merge_single_actions (struct current_action_spec *dest,
+ const struct action_spec *src);
extern char * current_action_to_html(const struct client_state *csp,
const struct current_action_spec *action);
extern char * actions_to_line_of_text(const struct current_action_spec *action);
diff --git a/cgi.c b/cgi.c
index 22601760..bfcab1c9 100644
--- a/cgi.c
+++ b/cgi.c
@@ -112,6 +112,10 @@ static const struct cgi_dispatcher cgi_dispatchers[] = {
cgi_show_url_info,
"Look up which actions apply to a URL and why",
TRUE },
+ { "show-url-final-info",
+ cgi_show_url_final_info,
+ "Look up the final actions that apply to a URL",
+ TRUE },
#ifdef FEATURE_TOGGLE
{ "toggle",
cgi_toggle,
diff --git a/cgisimple.c b/cgisimple.c
index 1f13d5ce..6e1c9387 100644
--- a/cgisimple.c
+++ b/cgisimple.c
@@ -1084,6 +1084,15 @@ jb_err cgi_show_status(struct client_state *csp,
assert(rsp);
assert(parameters);
+ /*
+ * make sure config files are current
+ */
+ if (run_loader(csp))
+ {
+ log_error(LOG_LEVEL_FATAL, "a loader failed - must exit");
+ /* Never get here - LOG_LEVEL_FATAL causes program exit */
+ }
+
if ('\0' != *(lookup(parameters, "file")))
{
return cgi_show_file(csp, rsp, parameters);
@@ -1691,6 +1700,245 @@ jb_err cgi_show_url_info(struct client_state *csp,
}
+/*********************************************************************
+ *
+ * Function : cgi_show_url_final_info
+ *
+ * Description : CGI function that shows just the "Final results:"
+ * section from cgi_show_url_info.
+ * If all you want to know is if a URL would be blocked
+ * or not, this is the function for you!
+ *
+ * Parameters :
+ * 1 : csp = Current client state (buffers, headers, etc...)
+ * 2 : rsp = http_response data structure for output
+ * 3 : parameters = map of cgi parameters
+ *
+ * CGI Parameters :
+ * url : The url whose actions are to be determined.
+ * If url is unset, the url-given conditional will be
+ * set, so that all but the form can be suppressed in
+ * the template.
+ *
+ * Returns : JB_ERR_OK on success
+ * JB_ERR_MEMORY on out-of-memory error.
+ *
+ *********************************************************************/
+jb_err cgi_show_url_final_info(struct client_state *csp,
+ struct http_response *rsp,
+ const struct map *parameters)
+{
+ char *url_param;
+ struct map *exports;
+
+ assert(csp);
+ assert(rsp);
+ assert(parameters);
+
+ if (NULL == (exports = default_exports(csp, "show-url-final-info")))
+ {
+ return JB_ERR_MEMORY;
+ }
+
+ /*
+ * Get the url= parameter (if present) and remove any leading/trailing spaces.
+ */
+ url_param = strdup_or_die(lookup(parameters, "url"));
+ chomp(url_param);
+
+ /*
+ * Handle prefixes. 4 possibilities:
+ * 1) "http://" or "https://" prefix present and followed by URL - OK
+ * 2) Only the "http://" or "https://" part is present, no URL - change
+ * to empty string so it will be detected later as "no URL".
+ * 3) Parameter specified but doesn't start with "http(s?)://" - add a
+ * "http://" prefix.
+ * 4) Parameter not specified or is empty string - let this fall through
+ * for now, next block of code will handle it.
+ */
+ if (0 == strncmp(url_param, "http://", 7))
+ {
+ if (url_param[7] == '\0')
+ {
+ /*
+ * Empty URL (just prefix).
+ * Make it totally empty so it's caught by the next if ()
+ */
+ url_param[0] = '\0';
+ }
+ }
+ else if (0 == strncmp(url_param, "https://", 8))
+ {
+ if (url_param[8] == '\0')
+ {
+ /*
+ * Empty URL (just prefix).
+ * Make it totally empty so it's caught by the next if ()
+ */
+ url_param[0] = '\0';
+ }
+ }
+ else if ((url_param[0] != '\0')
+ && ((NULL == strstr(url_param, "://")
+ || (strstr(url_param, "://") > strstr(url_param, "/")))))
+ {
+ /*
+ * No prefix or at least no prefix before
+ * the first slash - assume http://
+ */
+ char *url_param_prefixed = strdup_or_die("http://");
+
+ if (JB_ERR_OK != string_join(&url_param_prefixed, url_param))
+ {
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+ url_param = url_param_prefixed;
+ }
+
+ if (url_param[0] == '\0')
+ {
+ /* URL parameter not specified, display query form only. */
+ free(url_param);
+ if (map_block_killer(exports, "url-given")
+ || map(exports, "url", 1, "", 1))
+ {
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+ }
+ else
+ {
+ /* Given a URL, so query it. */
+ jb_err err;
+ char *s;
+ struct file_list *fl;
+ struct url_actions *b;
+ struct http_request url_to_query[1];
+ struct current_action_spec action[1];
+ int i;
+
+ if (map(exports, "url", 1, html_encode(url_param), 0))
+ {
+ free(url_param);
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+
+ init_current_action(action);
+
+ if (map(exports, "default", 1, current_action_to_html(csp, action), 0))
+ {
+ free_current_action(action);
+ free(url_param);
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+
+ memset(url_to_query, '\0', sizeof(url_to_query));
+ err = parse_http_url(url_param, url_to_query, REQUIRE_PROTOCOL);
+ assert((err != JB_ERR_OK) || (url_to_query->ssl == !strncmpic(url_param, "https://", 8)));
+
+ free(url_param);
+
+ if (err == JB_ERR_MEMORY)
+ {
+ free_http_request(url_to_query);
+ free_current_action(action);
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+ else if (err)
+ {
+ /* Invalid URL */
+
+ err = map(exports, "matches", 1, "<b>[Invalid URL specified!]</b>" , 1);
+ if (!err) err = map(exports, "final", 1, lookup(exports, "default"), 1);
+ if (!err) err = map_block_killer(exports, "valid-url");
+
+ free_current_action(action);
+ free_http_request(url_to_query);
+
+ if (err)
+ {
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+
+ return template_fill_for_cgi(csp, "show-url-final-info", exports, rsp);
+ }
+
+ for (i = 0; i < MAX_AF_FILES; i++)
+ {
+ if (NULL == csp->config->actions_file_short[i]
+ || !strcmp(csp->config->actions_file_short[i], "standard.action")) continue;
+
+ b = NULL;
+ if ((fl = csp->actions_list[i]) != NULL)
+ {
+ if ((b = fl->f) != NULL)
+ {
+ b = b->next;
+ }
+ }
+
+ for ( ; b != NULL; b = b->next)
+ {
+ if (url_match(b->url, url_to_query))
+ {
+ /* if (merge_current_action(action, b->action)) -LR- orig */
+ if (merge_single_actions(action, b->action))
+ {
+ free_http_request(url_to_query);
+ free_current_action(action);
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+ }
+ }
+ }
+
+ free_current_action(csp->action);
+ get_url_actions(csp, url_to_query);
+
+ free_http_request(url_to_query);
+
+ s = current_action_to_html(csp, action);
+
+ free_current_action(action);
+
+ if (map(exports, "final", 1, s, 0))
+ {
+ free_map(exports);
+ return JB_ERR_MEMORY;
+ }
+ }
+
+ /* return template_fill_for_cgi(csp, "show-url-final-info", exports, rsp); -LR- */
+ rsp->body = \
+"<!DOCTYPE html><html lang=\"en\"><head><title>URL Block Info</title></head>\n"\
+"<body><table cellpadding=\"20\" cellspacing=\"10\" border=\"0\" width=\"100%\">\n"\
+"<!-- @if-url-given-start -->\n"\
+"<!-- @if-valid-url-start -->\n"\
+"<tr><td><h2>Final results:</h2>\n"\
+"<b>@final@</b>\n"\
+"</td></tr>\n"\
+"<!-- if-valid-url-end@ -->\n"\
+"<!-- if-url-given-end@ -->\n"\
+"<tr><td><h2>Look up the actions for a URL:</h2>\n"\
+"<form method=\"GET\" action=\"@default-cgi@show-url-final-info\">\n"\
+"<p><input type=\"text\" name=\"url\" size=\"80\" value=\"@url@\"><input type=\"submit\" value=\"Go\"></p>\n"\
+"</form>\n"\
+"</td></tr></table>\n"\
+"</body></html>\n";
+
+ template_fill(&rsp->body, exports);
+ free_map(exports);
+ return 0;
+
+}
+
+
/*********************************************************************
*
* Function : cgi_robots_txt
diff --git a/cgisimple.h b/cgisimple.h
index 52642a40..790574f3 100644
--- a/cgisimple.h
+++ b/cgisimple.h
@@ -61,6 +61,9 @@ extern jb_err cgi_show_status (struct client_state *csp,
extern jb_err cgi_show_url_info(struct client_state *csp,
struct http_response *rsp,
const struct map *parameters);
+extern jb_err cgi_show_url_final_info(struct client_state *csp,
+ struct http_response *rsp,
+ const struct map *parameters);
extern jb_err cgi_show_request (struct client_state *csp,
struct http_response *rsp,
const struct map *parameters);
diff --git a/jcc.c b/jcc.c
index d1fca148..df4f1448 100644
--- a/jcc.c
+++ b/jcc.c
@@ -3311,7 +3311,10 @@ static void serve(struct client_state *csp)
}
}
- if (continue_chatting && any_loaded_file_changed(csp))
+/*
+ * don't check for action/filter file changes if processing cgi requests
+ */
+ if (continue_chatting && !(csp->flags & CSP_FLAG_CRUNCHED) && any_loaded_file_changed(csp))
{
continue_chatting = 0;
config_file_change_detected = 1;