Add support for HTTP_PROXY and SOCKS_PROXY #458

unclearParadigm · 2025-08-01T09:43:22Z

It has been a while since I initially proposed to add Support for Proxies. The project was still called LibReddit at that time - see original request: libreddit/libreddit#841 .

Further Issues that relate to that

My instance is getting rate-limited frequently, people are already complaining about it, and I don't want to add further VPSes (with new public IPs) just to circumvent the rate-limiting. So I finally decided to patch redlib to support HTTP and SOCKS proxies.

Changes in this MR

Redlib honors the following Environment-Variables:

HTTP_PROXY
HTTPS_PROXY
SOCKS_PROXY

If SOCKS_PROXY is set, it takes precedence over HTTPS_PROXY and HTTP_PROXY. Additionally, the environment variables support authentication using the format scheme://username:password@host:port (e.g. http://proxyUsername:mySuperSecureProxyPassword@localhost:8090).
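The precedence and URL format described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `ProxyConfig`, `parse_proxy_url` and `select_proxy` are hypothetical names, the environment is passed in as a map so the logic is easy to test, and the HTTPS_PROXY-before-HTTP_PROXY order is taken from the fallback behavior described later in this thread. Percent-encoded credentials and bracketed IPv6 hosts are not handled.

```rust
use std::collections::HashMap;

/// Hypothetical parsed form of a proxy URL (not the type used in the PR).
#[derive(Debug, PartialEq)]
struct ProxyConfig {
    scheme: String,
    username: Option<String>,
    password: Option<String>,
    host: String,
    port: u16,
}

/// Parse `scheme://[username[:password]@]host:port`.
/// Returns None if the scheme separator or the port is missing or invalid.
fn parse_proxy_url(url: &str) -> Option<ProxyConfig> {
    let (scheme, rest) = url.split_once("://")?;
    // Ignore a trailing path such as "/" if one is present.
    let authority = rest.split('/').next()?;
    // Credentials, if any, sit before the last '@'.
    let (userinfo, host_port) = match authority.rsplit_once('@') {
        Some((userinfo, host_port)) => (Some(userinfo), host_port),
        None => (None, authority),
    };
    let (username, password) = match userinfo {
        Some(u) => match u.split_once(':') {
            Some((name, pass)) => (Some(name.to_string()), Some(pass.to_string())),
            None => (Some(u.to_string()), None),
        },
        None => (None, None),
    };
    let (host, port) = host_port.rsplit_once(':')?;
    Some(ProxyConfig {
        scheme: scheme.to_string(),
        username,
        password,
        host: host.to_string(),
        port: port.parse().ok()?,
    })
}

/// The first variable that is set wins: SOCKS_PROXY, then HTTPS_PROXY, then HTTP_PROXY.
/// If that variable does not parse, no proxy is returned (no fall-through).
fn select_proxy(env: &HashMap<String, String>) -> Option<ProxyConfig> {
    let value = ["SOCKS_PROXY", "HTTPS_PROXY", "HTTP_PROXY"]
        .iter()
        .find_map(|name| env.get(*name))?;
    parse_proxy_url(value)
}
```

In a real binary the map could be filled from `std::env::vars()`; whether an unparsable value should fall through to the next variable or be reported as an error is a design choice this sketch leaves open.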

For SOCKS support I introduced tokio-socks as dependency. For HTTP Proxy support, I just wrote a simple HTTP CONNECT wrapper. In all cases, consumers of the CLIENT won't notice a difference.

Tests

I have tested HTTP proxying with TinyProxy locally. I have also tested SOCKS proxying with my VPN provider's SOCKS proxy, and Tor proxying through a local tor-socks-proxy container.

Would appreciate if you could review this MR - and maybe merge back. That'd make the world a better place - at least get rid of some rate-limits ^^

Cheers,
ruffy

unclearParadigm · 2025-08-27T21:37:51Z

Hey @sigaloid, hope you're doing well. I don't want to rush things here, but it's been around a month without a response to this PR. Is there anything wrong or problematic with the PR? Can/should I change anything?

uhthomas · 2025-09-06T03:20:22Z

I've tested the SOCKS5 proxy on my own instance and it works great, thank you :)

sigaloid · 2025-09-09T17:33:04Z

This looks great! Thanks for the PR.

sigaloid

LGTM except a few things - what does the encryption look like for something like this? I'd rather not allow unencrypted HTTP proxies at all for privacy reasons - I don't want to send plaintext traffic under any configuration.

sigaloid · 2025-09-09T17:37:41Z

src/proxy.rs

+    if !response.starts_with("HTTP/1.1 200") {
+        return Err(Box::new(ProxyError(format!("Proxy CONNECT failed: {}", response))));
+    }


I'm not sure of the particulars regarding HTTP proxies but I assume this is acceptable as most (all?) proxies support 1.1? Also, are there really no packages that handle the manual TCP transmission 😅 I tried to find one myself last time I tried to tackle this and similarly could not find one. It's also not my area of expertise so 😞

As much as I am a fan of HTTP > 1.1 and all the innovation that HTTP/2 and HTTP/3 bring - a proxy implementation that supports CONNECT only over 2/3 but not over 1.1 would be a very weird one. It'd be equally weird if a webserver implementation only supported 2/3 and deliberately did not support HTTP 1.1.
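For reference, a slightly more lenient variant of the status check under review could look like the sketch below. It is not part of the PR; `check_connect_status` is a hypothetical helper. It relies on two assumptions: that some proxies answer CONNECT with an `HTTP/1.0` status line, and that any 2xx status means the tunnel is established (which is how the HTTP semantics specification describes CONNECT).

```rust
/// Returns Ok(()) when the first line of the proxy response signals an
/// established tunnel: an HTTP/1.x status line with a 2xx status code.
fn check_connect_status(response: &str) -> Result<(), String> {
    let status_line = response.lines().next().unwrap_or("");
    let mut parts = status_line.split_whitespace();
    let version = parts.next().unwrap_or("");
    let code = parts.next().and_then(|c| c.parse::<u16>().ok());
    match code {
        Some(c) if version.starts_with("HTTP/1.") && (200..300).contains(&c) => Ok(()),
        _ => Err(format!("Proxy CONNECT failed: {}", status_line)),
    }
}
```

The stricter `starts_with("HTTP/1.1 200")` check from the diff is a subset of this one, so whatever passes there also passes here.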

I have to admit, I'm not deep enough into Rust and its ecosystem to know all of the libraries flying around. But I did a fair amount of research, and even asked some LLMs (*pilot, *gpt) for existing implementations that tackle HTTP proxying. If you stumble over a lib, I'd be happy to patch it in accordingly. I can very much understand why you don't want that in Redlib ;)

One option is to extract that logic into a library block and make redlib depend on it? Would you feel more comfortable if that logic is not within redlib, but in a separate repo/crate/package/lib/submodule?

uhthomas · 2025-09-09T17:42:55Z

Just chiming in to say I'm using a plaintext SOCKS5 proxy. I run redlib in Kubernetes and have a user space wireguard SOCKS5 proxy as a sidecar container. Forced encryption or authentication would be a huge pain and for no benefit.

sigaloid · 2025-09-09T17:53:19Z

How about a mandatory flag if your proxy is plaintext, like PLAINTEXT_PROXY=1? It's something I want to avoid accidentally enabling, not something I am morally opposed to. If this is common or expected behavior across similar projects, though, I'd be okay to merge it - any citations for others that do it like this?

uhthomas · 2025-09-09T17:59:34Z

I am pretty sure some projects use two different environment variables: HTTP_PROXY and HTTPS_PROXY. That should be pretty clear on whether encryption is intended?

This is how Go handles it: https://cs.opensource.google/go/x/net/+/internal-branch.go1.17-vendor:http/httpproxy/proxy.go;l=92

sigaloid · 2025-09-09T18:07:49Z

That's fair - the hard-coded 80 did make me nervous but I'm happy to leave it as HTTP_PROXY, if that's standard.

uhthomas · 2025-09-09T18:56:45Z

There isn't really a standard for these env vars unfortunately, and their meaning can be a bit weird / deceptive. The example I linked will prefer HTTPS_PROXY for https requests; it does not mean that it will use an HTTPS proxy. Sorry if that's confusing.

Anyway, it's my understanding that using plain HTTP is actually not that bad in this case? The HTTP proxy server will essentially just proxy TCP, and so the actual connection between redlib and reddit is still using HTTPS / TLS. The HTTP proxy cannot inspect or modify the stream without breaking it.

I don't see any explicit handling for HTTPS/TLS though, so I don't think HTTP_PROXY=https://example.com will work, not that it's necessary.

unclearParadigm · 2025-09-11T05:04:20Z

Hey,

happy you @sigaloid got to review the PR - thank you very much for taking the time, and thanks to @uhthomas for testing and for your (very much correct) answers so far. I'd like to add my two cents to the HTTP_PROXY and HTTPS_PROXY discussion and some more details to get a common understanding here - buckle up, this response might take me some hours to formulate - if it took me long to write, it should take you long to read 😄

HTTP CONNECT (tunneling/piping)

The HTTP CONNECT wrapper I have implemented here uses the de-facto way to proxy traffic through HTTP, sometimes referred to as HTTP tunneling/piping. Redlib as the client opens a TCP socket to the proxy (proxy address taken from the HTTP_PROXY or HTTPS_PROXY env vars) and sends an HTTP CONNECT request containing the hostname/FQDN + port of the destination (e.g. api.reddit.com:443). The Mozilla Dev Docs show a simple request example. The proxy extracts the hostname/FQDN + port from the CONNECT request, opens a TCP socket to the specified target, and responds (let's assume the happy path here) with an HTTP 200 status code.

The catch is that the proxy does not close the underlying TCP socket after it has responded. From this point on the proxy acts as a dumb pipe between Redlib and Reddit: whatever Redlib writes to its socket with the proxy is what the proxy writes to its open socket with Reddit. Note that this is a raw socket - the data transferred does not need to be HTTP-formatted content, it's basically a tunnel or pipe for any TCP traffic. In a nutshell, Redlib now has a TCP connection to reddit.com, just not directly, but through the proxy.

Let's continue the journey and talk about TLS. Clients that open a connection on 443/tcp typically assume HTTPS (which is HTTP over TLS) - Redlib is luckily no exception. That requires the client to initiate a TLS handshake. Redlib uses the established TCP socket to the proxy (which pipes byte by byte to Reddit) and starts the TLS handshake by sending a ClientHello. Again the proxy just pipes whatever it receives, so the ClientHello reaches Reddit. The Reddit server responds with a ServerHello and the proxy pipes it back to Redlib.

After playing through the whole TLS handshake, we land at the point where we have a trusted, encrypted connection from Redlib to Reddit using TLS - one that cannot be snooped on even by the proxy. I don't want to get into the TLS details here, but we're talking about the whole "Alice (= Redlib) wants to talk to Bob (= reddit.com) while Eve (= the proxy) is snooping on the whole conversation" situation. For the sake of simplicity (and to prevent me from ending up explaining how electrons work), we can safely assume that TLS prevents Eve, as MITM (man in the middle - yes, Eve identifies as man...), from snooping on what Alice and Bob chat about.
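The CONNECT exchange described above can be sketched in a few lines of synchronous Rust. This is only an illustration under stated assumptions, not the wrapper from this PR (which lives in src/proxy.rs): `build_connect_request` and `open_tunnel` are hypothetical names, the stream is any `Read + Write` (for example a `std::net::TcpStream` already connected to the proxy), and proxy authentication is left out.

```rust
use std::io::{self, Read, Write};

/// Build the CONNECT request for the destination, e.g. api.reddit.com:443.
fn build_connect_request(host: &str, port: u16) -> String {
    format!("CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n")
}

/// Send CONNECT over a stream that is already connected to the proxy and wait
/// for the response head. On success the same stream is a raw pipe to the
/// destination, and the TLS handshake can start on it.
fn open_tunnel<S: Read + Write>(stream: &mut S, host: &str, port: u16) -> io::Result<()> {
    stream.write_all(build_connect_request(host, port).as_bytes())?;
    stream.flush()?;

    // Read byte by byte up to the blank line that ends the response head,
    // so that no byte belonging to the tunnel itself is consumed.
    let mut head = Vec::new();
    let mut byte = [0u8; 1];
    while !head.ends_with(b"\r\n\r\n") {
        if stream.read(&mut byte)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "proxy closed the connection",
            ));
        }
        head.push(byte[0]);
        if head.len() > 8192 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "proxy response head too large",
            ));
        }
    }

    // Happy path: the second token of the status line is "200".
    let head = String::from_utf8_lossy(&head);
    let status_line = head.lines().next().unwrap_or("");
    if status_line.split_whitespace().nth(1) == Some("200") {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("Proxy CONNECT failed: {status_line}"),
        ))
    }
}
```

After `open_tunnel` returns `Ok(())`, the caller would hand the same stream to its TLS library, which sends the ClientHello through the pipe exactly as described above.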

HTTP_PROXY Vs. HTTPS_PROXY

There's a lot of confusion about the HTTPS_PROXY and HTTP_PROXY environment variables. Both env vars point to a proxy (in the format scheme://<optionalUsername>:<optionalPassword>@<fqdn/hostname>:<TCP-port>). The only formal difference for HTTPS_PROXY is that, instead of sending the HTTP CONNECT over a plaintext TCP socket to the proxy, the client first performs a TLS handshake with the proxy and then sends the HTTP CONNECT over the TLS-protected socket. At the end of the day it remains a dumb pipe.

Why HTTPS_PROXY is nonsensical in the case of Redlib?

Well, Redlib is a privacy frontend for Reddit. From what I have seen (and please tell me if that is incorrect), redlib only uses https:// URLs. I have not found a single reference to any http:// URL pointing to reddit. Reddit has a TLS server certificate that is trusted by Redlib. Redlib performs a TLS handshake against Reddit, and considering that TLS does a great job at preventing MITM (especially in this setup), it is nonsensical to additionally encrypt from Redlib to the proxy (just for the sake of sending the one HTTP CONNECT). It's like wrapping a package in 2 layers of bubble wrap when one layer would already be enough.

But why does Redlib then even support HTTPS_PROXY env var?

Short answer: it's complicated and not very well standardized. From what I have seen, most clients implementing the HTTP CONNECT method for proxying don't really care about the scheme prefix - all they want is the FQDN/hostname + port of the proxy (optionally also authentication credentials for the proxy). In almost all cases they attempt an HTTP CONNECT in plaintext first, which almost always works out. Some implementations don't honor HTTP_PROXY at all, as they rely on HTTPS_PROXY being present. Others (like what I have implemented) use HTTPS_PROXY and fall back to HTTP_PROXY if needed. Others support only HTTP_PROXY as env var. Sysadmins have therefore developed the following protective measure to configure their systems for maximum compatibility with all applications:

myproxy="http://yomamasof.at:80"
export HTTP_PROXY="$myproxy"
export HTTPS_PROXY="$myproxy"
# purposefully setting HTTPS_PROXY to the same value as HTTP_PROXY to maximize compatibility.

What about the big corporate proxies that terminate TLS and re-establish TLS?

Now one might argue that not all proxies operate the way I've tried to explain above. And yes, that's a fair argument - especially the corporate world has some very evil implementations in place that allow inspection of each individual HTTP request flowing through them. They do this by "terminating" TLS. Redlib requests reddit.com through the proxy - but instead of showing Redlib the certificate of reddit.com, the proxy shows its own (which obviously is trusted on machines in the corporate environment). The client establishes a TLS-protected socket with the proxy, so the proxy can decrypt all the traffic with ease and just re-encrypts (using another TLS handshake with the target) whatever the client sent. This MR does not support this setup, because a) Redlib would need a way to get a certificate of the proxy that it is allowed to trust, and b) it would not necessarily improve privacy ;)

unclearParadigm · 2025-09-11T05:24:27Z

LGTM except a few things - what does the encryption look like for something like this? I'd rather not allow unencrypted HTTP proxies at all for privacy reasons - I don't want to send plaintext traffic under any configuration.

Please refer to section: Why HTTPS_PROXY is nonsensical in the case of Redlib in this comment

unclearParadigm · 2025-09-11T05:36:20Z

Just chiming in to say I'm using a plaintext SOCKS5 proxy. I run redlib in Kubernetes and have a user space wireguard SOCKS5 proxy as a sidecar container. Forced encryption or authentication would be a huge pain and for no benefit.

I'm in a similar boat - I host Redlib as a Deployment in a Kubernetes cluster on 6 worker nodes with (atm) a total of 6 replicas, where each node has its own public IPv4 egress address and a round-robin load balancer sits in front - still I'm getting rate-limited as if there is no tomorrow. I had a similar intention of spinning up a SOCKS proxy as a sidecar, in my case a Tor proxy to circumvent the rate limits accordingly. I like your socks-wireguard proxy. Great work!

uhthomas · 2025-10-04T12:41:42Z

Is there anything blocking this @sigaloid? It would be great to have this on the main branch so I don't have to build my own image and rebase it to get updates.

unclearParadigm added 3 commits August 1, 2025 11:05

install tokio-socks latest

a168bb2

add proxyConnector handling HTTPS_PROXY and SOCKS_PROXY with (optional) credentials

e0800bd

refactor proxy to use less inline code

3dda94c
