[Privoxy-devel] PATCH for pcre2 support
Gagan Sidhu
broly at mac.com
Thu Mar 16 18:18:35 CET 2023
> On Mar 16, 2023, at 9:11 AM, Fabian Keil <fk at fabiankeil.de> wrote:
>
> Gagan Sidhu <broly at mac.com> wrote on 2023-03-16 at 05:41:42:
>
>>> On Mar 16, 2023, at 3:56 AM, Fabian Keil <fk at fabiankeil.de> wrote:
>>>
>>> Gagan Sidhu <broly at mac.com> wrote on 2023-03-13 at 11:28:42:
>>>
>>>> therefore it is not my changes per se that are the problem.
>>>> it is a combination of privoxy’s string preprocessing/postprocessing
>>>> and pcre2 that is the problem.
>>>>
>>>> i will also add that the ‘server’ is up. when you go to
>>>> http://127.0.0.1:8118, you get the exact same output for pcre2 and pcre1.
>>>
>>> Are you saying you are using the patch:
>>> SHA256 (substandard_pcre2.patch) = 142f99e4b685fee8c6592ea47a89cc4ea29622744458e70d3ae1f370abd9df27
>>> and get actual content when requesting http://127.0.0.1:8118/ and http://p.p:8118/?
>>
>> no. what i’m saying is i get this message when visiting 127.0.0.1:8118 from either the pcre1 or pcre2 builds:
>>
>> "Invalid header received from client."
>
> Which client do you use? Can you reproduce the problem with curl?
just firefox. i haven’t tried to do that because i don’t know how ;P
in terms of the patch, by the way, i updated it. there are small changes, so the sum will now be:
e7e00d9ac411127c500d786f13cc22281d65f0575f29edabca4331d5c4afc937
but i doubt your observations will be any different.
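
Since the sum changes with every revision of the patch, one way to confirm that a local copy matches a posted sum is to hash the file directly. A minimal Python sketch follows; the helper names are invented for illustration, and the path of the updated patch is whatever the file is saved as locally.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with a sum copied from a message."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches("substandard_pcre2.patch", "e7e00d9ac411127c500d786f13cc22281d65f0575f29edabca4331d5c4afc937") should return True only for the updated revision. The file name is taken from the earlier message and is an assumption for the updated patch.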
>
>>>> some people were kind enough to share some things that may break in pcre2 (that worked in pcre1):
>>>>
>>>> https://stackoverflow.com/a/73767663
>>>>
>>>> any assistance on fixing this issue would be great.
>>>
>>> Very interesting.
>>>
>>> I noticed another problem with the patch.
>>>
>>> Destination rewriting seems to reproducibly result in a stack overflow:
>>>
>>> fk at t520 ~/git/privoxy $gdb-privoxy
> [...]
>>> 2023-03-16 10:47:16.294 801012700 Redirect: pcrs command "s@^https?://twitter.com/([^?]*)@http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/$1@" changed "https://twitter.com/TCNOco/status/1634620446002774018" to "http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/TCNOco/status/1634620446002774018" (1 hit).
>>>
>>> Thread 2 received signal SIGABRT, Aborted.
>>> Sent by kill() from pid 33334 and user 1001.
>>> [Switching to LWP 101318 of process 33334]
>>> kill () at kill.S:4
>>> 4 kill.S: No such file or directory.
>>> (gdb) where
>>> #0 kill () at kill.S:4
>>> #1 0x000000080089b4e0 in __fail (msg=0x8007a57a4 "stack overflow detected; terminated") at /usr/src/lib/libc/secure/stack_protector.c:130
>>> #2 0x000000080089b450 in __stack_chk_fail () at /usr/src/lib/libc/secure/stack_protector.c:137
>>> #3 0x000000000024abfd in rewrite_url (old_url=0x801c28100 "https://twitter.com/TCNOco/status/1634620446002774018",
>>> pcrs_command=0x801c10000 "s@^https?://twitter.com/([^?]*)@http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/$1@") at filters.c:1038
>>> #4 0x000000000024ad07 in redirect_url (csp=0x8010f1008) at filters.c:1257
>>> #5 0x00000000002583b5 in crunch_response_triggered (csp=0x8010f1008, crunchers=0x218920 <crunchers_all>) at jcc.c:953
>>> #6 0x00000000002569d6 in chat (csp=0x8010f1008) at jcc.c:4482
>>> #7 0x0000000000255736 in serve (csp=0x8010f1008) at jcc.c:5056
>>> #8 0x0000000800745a7a in thread_start (curthread=0x801012700) at /usr/src/lib/libthr/thread/thr_create.c:292
>>> #9 0x0000000000000000 in ?? ()
>>> Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
>>>
>>> The rewrite is enabled with an action like:
>>>
>>> {+redirect{s@^https?://twitter.com/([^?]*)@http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/$1@}}
>>> twitter.com
>>>
>>
>> interesting. if the result is a stack overflow/SIGABRT, should the program not terminate?
>> - i will admit i have not had to address this kind of problem in quite some time, so i am rusty.
>
> The stack overflow above happened while gdb was
> attached to the Privoxy process so the behaviour
> is expected.
gotcha.
>
>>> 2023-03-16 10:47:16.294 801012700 Redirect: pcrs command "s@^https?://twitter.com/([^?]*)@http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/$1@" changed "https://twitter.com/TCNOco/status/1634620446002774018" to "http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/TCNOco/status/1634620446002774018" (1 hit).
>>
>>
>> i would prefer we stick to the regression tests because, from the above, it seems we are replacing the original string with a bigger string, right?
>> - i’m not sure how this would be a problem.
>
> It's conceivable that the issues are unrelated.
fair enough.
i hope to have more time by either tomorrow, or for sure this weekend, to troubleshoot further using gdb, and hopefully see what’s up.
i’d like to step through and print what’s being returned from pcre2_compile when the tests are run, since that seems to be when the issues start.
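
For reference, the redirect from the log above can be replayed outside Privoxy to confirm that the replacement is longer than the input. This is only a sketch: it uses Python's re module as a stand-in, which is neither pcre1 nor pcre2, so it says nothing about where the overflow comes from. The pattern and replacement are split out of the pcrs command by hand, with $1 written as \1.

```python
import re

# Pattern and replacement taken from the pcrs command in the log.
# Python's re is a stand-in here, not PCRE or PCRE2.
PATTERN = r"^https?://twitter.com/([^?]*)"
REPLACEMENT = (
    r"http://vfaomgh4jxphpbdfizkm5gbtjahmei234giqj4facbwhrfjtcldauqad.onion/\1"
)


def rewrite(url):
    """Apply the substitution once and return (new_url, number_of_hits)."""
    return re.subn(PATTERN, REPLACEMENT, url, count=1)


old_url = "https://twitter.com/TCNOco/status/1634620446002774018"
new_url, hits = rewrite(old_url)

# Show the result together with both lengths.
print(new_url)
print("hits:", hits, "old length:", len(old_url), "new length:", len(new_url))
```

The output string is longer than the input, so any fixed-size buffer or length variable on the path between the substitution and rewrite_url would be worth checking when stepping through with gdb.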
>
> Fabian