Add support for HTTP_PROXY and SOCKS_PROXY #458
Hey @sigaloid, hope you're doing well. I don't want to rush things here, but it's been around a month without a response to this PR. Is there anything wrong or problematic with the PR? Can/should I change anything?
I've tested the SOCKS5 proxy on my own instance and it works great, thank you :)
This looks great! Thanks for the PR.
LGTM except a few things - what does the encryption look like for something like this? I'd rather not allow unencrypted HTTP proxies at all for privacy reasons - I don't want to send plaintext traffic under any configuration.
if !response.starts_with("HTTP/1.1 200") {
    return Err(Box::new(ProxyError(format!("Proxy CONNECT failed: {}", response))));
}
I'm not sure of the particulars regarding HTTP proxies, but I assume this is acceptable as most (all?) proxies support 1.1? Also, are there really no packages that handle the manual TCP transmission 😅 I tried to find one myself last time I tried to tackle this and similarly could not find one. It's also not my area of expertise so 😞
As much as I am a fan of HTTP > 1.1 and all the innovation that HTTP/2 and HTTP/3 bring - a proxy implementation that supports CONNECT only over 2/3 but not over 1.1 would be a very weird one. It'd be equally weird if a web server implementation only supported 2/3 and deliberately did not support HTTP/1.1.
I have to admit, I'm not very much into Rust and the ecosystem to know all of the libraries flying around. But I did a fair amount of research, even asked some LLMs (*pilot, *gpt) for potentially existing implementations that tackle HTTP Proxying. If you stumble over a lib, I'd be happy to patch it in accordingly. I can very much understand why you don't want that in Redlib ;)
One option is to extract that logic into a library block and make redlib depend on it? Would you feel more comfortable if that logic is not within redlib, but in a separate repo/crate/package/lib/submodule?
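On the HTTP/1.1 question above: a minimal sketch of a more lenient check than a fixed `starts_with("HTTP/1.1 200")`, assuming that some proxies may answer a CONNECT with an `HTTP/1.0` status line and that any 2xx code means the tunnel is up. The function names `parse_status_code` and `connect_succeeded` are hypothetical and not part of this PR.

```rust
/// Extracts the status code from the first line of an HTTP/1.x response.
/// Returns None if the line is missing or malformed.
/// (Hypothetical helper, not taken from the PR.)
fn parse_status_code(response: &str) -> Option<u16> {
    // Only the status line matters, e.g. "HTTP/1.1 200 Connection established".
    let status_line = response.lines().next()?;
    let mut parts = status_line.split_whitespace();

    // Accept both HTTP/1.0 and HTTP/1.1 status lines.
    let version = parts.next()?;
    if !version.starts_with("HTTP/1.") {
        return None;
    }

    // The second token is the numeric status code.
    parts.next()?.parse().ok()
}

/// Treats any 2xx answer to CONNECT as "tunnel established".
fn connect_succeeded(response: &str) -> bool {
    matches!(parse_status_code(response), Some(200..=299))
}
```

With this sketch, `connect_succeeded("HTTP/1.0 200 OK\r\n\r\n")` is accepted, while a `407 Proxy Authentication Required` answer is still rejected.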
Just chiming in to say I'm using a plaintext SOCKS5 proxy. I run redlib in Kubernetes and have a user space wireguard SOCKS5 proxy as a sidecar container. Forced encryption or authentication would be a huge pain and for no benefit.
How about a mandatory flag if your proxy is plaintext, like PLAINTEXT_PROXY=1? It's something I want to avoid accidentally enabling, not something I am morally opposed to. If this is common or expected behavior across similar projects, though, I'd be okay to merge it - any citations for others that do it like this?
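For illustration only, the opt-in flag proposed above could look roughly like the sketch below. It assumes that only an `https://` proxy URL counts as encrypted towards the proxy and that the flag value must be exactly `1`; both are assumptions, and `check_plaintext_opt_in` is a hypothetical name that does not exist in Redlib.

```rust
/// Hypothetical guard for the proposed PLAINTEXT_PROXY flag.
/// Returns Ok(()) if the proxy URL uses https://, or if the operator
/// explicitly opted in to a plaintext proxy; otherwise an error message.
fn check_plaintext_opt_in(proxy_url: &str, plaintext_flag: Option<&str>) -> Result<(), String> {
    // Assumption: only https:// means the hop to the proxy itself is encrypted.
    let encrypted_to_proxy = proxy_url.starts_with("https://");

    if encrypted_to_proxy || plaintext_flag == Some("1") {
        Ok(())
    } else {
        // The URL is deliberately not echoed, since it may contain credentials.
        Err("refusing plaintext proxy; set PLAINTEXT_PROXY=1 to allow it".to_string())
    }
}
```

A caller could pass the flag as `std::env::var("PLAINTEXT_PROXY").ok().as_deref()`.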
I am pretty sure some projects use two different environment variables: HTTP_PROXY and HTTPS_PROXY. That should be pretty clear on whether encryption is intended? This is how Go handles it: https://cs.opensource.google/go/x/net/+/internal-branch.go1.17-vendor:http/httpproxy/proxy.go;l=92
That's fair - the hard-coded
There isn't really a standard for these env vars unfortunately, and their meaning can be a bit weird / deceptive. The example I linked will prefer HTTPS_PROXY for https requests, which does not mean that it will use an HTTPS proxy. Sorry if that's confusing. Anyway, it's my understanding that using plain HTTP is actually not that bad in this case? The HTTP proxy server will essentially just proxy TCP, and so the actual connection between redlib and reddit is still using HTTPS / TLS. The HTTP proxy cannot inspect or modify the stream without breaking it. I don't see any explicit handling for HTTPS/TLS though, so I don't think
Hey, happy you @sigaloid got to review the PR - thank you very much for taking the time, also thanks to @uhthomas for testing and your (very much correct) answers so far. I'd like to add a few cents to the HTTP_PROXY and HTTPS_PROXY discussion and some more details to get a common understanding here - buckle up, that response might take me some hours to formulate - if it took me long to write, it should take you long to read 😄

HTTP CONNECT (tunneling/piping)
The HTTP CONNECT wrapper I have implemented here is using the de-facto way to proxy traffic through HTTP, sometimes even referred to as HTTP tunneling/piping. Redlib as client opens up a TCP socket to the proxy and continues to send an HTTP CONNECT to the proxy (proxy address taken from

HTTP_PROXY vs. HTTPS_PROXY
There's a lot of confusion about the HTTPS_PROXY and HTTP_PROXY environment variables. Both env vars point to a proxy (in format

Why is HTTPS_PROXY nonsensical in the case of Redlib?
Well, Redlib is a privacy frontend for Reddit. From what I have seen (and please tell me if that is incorrect), redlib only uses

But why does Redlib then even support the HTTPS_PROXY env var?
Short answer: it's complicated and not very much standardized. From what I have seen, most clients implementing the HTTP CONNECT method for proxying don't really care about the scheme prefix - all they want is the FQDN/hostname + port of the proxy (optionally also authentication credentials for the proxy). In almost all cases they attempt an HTTP CONNECT in plaintext first, which almost always works out. Some implementations don't honor HTTP_PROXY at all, as they rely on HTTPS_PROXY to be present. Others (like I have implemented) use HTTPS_PROXY and fall back to HTTP_PROXY if needed. Others support only HTTP_PROXY as env var. Sysadmins have also developed the following protective measure to configure their systems for maximum compatibility with all applications.
myproxy="http://yomamasof.at:80"
export HTTP_PROXY="$myproxy"
export HTTPS_PROXY="$myproxy"
# purposefully setting HTTP_PROXY to the same as HTTPS_PROXY to maximize compatibility.

What about the big corporate proxies that terminate TLS and re-establish TLS?
Now one might argue that not all proxies operate the way I've been trying to explain above. And yes, that's a great argument - especially the corporate world has some very evil implementations in place that allow inspection of each individual HTTP request flowing through it. They do this by "terminating" TLS. Redlib requests reddit.com through the proxy - but instead of showing Redlib the certificate of reddit.com, the proxy shows its own (which obviously is trusted on machines in the corporate environment). The client establishes a TLS-protected socket with the proxy - and the proxy can thus decrypt all the traffic with ease, and just re-encrypts (using another TLS handshake) whatever the client sent. This MR does not support this setup, because a) Redlib would need a way to get a certificate of the proxy it is allowed to trust, and b) it would not necessarily improve privacy ;)
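To make the CONNECT tunneling described in this comment more concrete, here is a minimal, standard-library-only sketch of the handshake. It is an illustration under assumptions, not the code from this PR: the names `build_connect_request` and `read_connect_response` are hypothetical, proxy authentication is left out, and the status check mirrors the `HTTP/1.1 200` prefix test from the diff.

```rust
use std::io::{self, Read};

/// Builds the plaintext request that asks the proxy to open a raw TCP
/// tunnel to `target_host:target_port`. (Hypothetical helper.)
fn build_connect_request(target_host: &str, target_port: u16) -> String {
    format!(
        "CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n",
        host = target_host,
        port = target_port
    )
}

/// Reads the proxy's answer up to the blank line that ends the headers
/// and checks that the tunnel was accepted.
fn read_connect_response<R: Read>(reader: &mut R) -> io::Result<()> {
    let mut head = Vec::new();
    let mut byte = [0u8; 1];

    // Read one byte at a time so nothing past the header terminator is
    // consumed: everything after it already belongs to the tunneled stream.
    while !head.ends_with(b"\r\n\r\n") {
        if reader.read(&mut byte)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "proxy closed the connection during CONNECT",
            ));
        }
        head.push(byte[0]);
        // Arbitrary cap (an assumption) to avoid buffering unbounded headers.
        if head.len() > 8192 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "proxy response headers too large",
            ));
        }
    }

    let text = String::from_utf8_lossy(&head);
    if text.starts_with("HTTP/1.1 200") {
        Ok(())
    } else {
        let status_line = text.lines().next().unwrap_or("");
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("Proxy CONNECT failed: {}", status_line),
        ))
    }
}
```

After a successful response the client would start its TLS handshake with the target over the same socket, which is why a plain HTTP proxy only ever sees encrypted bytes for the tunneled connection.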
Please refer to section:
I'm in a similar boat - I host Redlib as a Deployment in a Kubernetes cluster on 6 worker nodes with (atm) a total of 6 replicas, where each node has its own public IPv4 egress address, and a round-robin load balancer in front - still I'm getting rate-limited as if there is no tomorrow. I had a similar intention of spinning up a SOCKS proxy as a sidecar. In my case a Tor proxy to circumvent the rate limits accordingly. I like your socks-wireguard proxy. Great work!
Is there anything blocking this @sigaloid? It would be great to have this on the main branch so I don't have to build my own image and rebase it to get updates.
Hi @sigaloid,
It has been a while since I initially proposed adding support for proxies. The project was still called LibReddit at that time - see the original request: libreddit/libreddit#841.
Further Issues that relate to that
My instance is getting rate-limited frequently, people are already complaining about it, and I don't want to add further VPSs (with new public IPs) just to circumvent the rate-limiting. So I finally decided to patch redlib to support HTTP_PROXY and SOCKS proxies.
Changes in this MR
Redlib honors the following environment variables: SOCKS_PROXY, HTTPS_PROXY, and HTTP_PROXY. If SOCKS_PROXY is set, it'll take precedence over HTTPS_PROXY and HTTP_PROXY. Additionally, the environment variables support authentication using the following format: scheme://username:password@host:port (e.g. http://proxyUsername:mySuperSecureProxyPassword@localhost:8090).
For SOCKS support I introduced tokio-socks as a dependency. For HTTP proxy support, I just wrote a simple HTTP CONNECT wrapper. In all cases, consumers of the CLIENT won't notice a difference.
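The precedence and URL format described above can be sketched as follows. This is a standard-library-only illustration, not the code from this MR: `ProxyConfig`, `parse_proxy_url`, and `select_proxy` are hypothetical names, percent-encoded credentials are not decoded, and empty variables are treated as unset (an assumption).

```rust
/// Parsed form of scheme://username:password@host:port.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
struct ProxyConfig {
    scheme: String,
    username: Option<String>,
    password: Option<String>,
    host: String,
    port: u16,
}

/// Splits a proxy URL into its parts; returns None if the scheme or the
/// port is missing or invalid. Percent-encoding is not handled.
fn parse_proxy_url(url: &str) -> Option<ProxyConfig> {
    let (scheme, rest) = url.split_once("://")?;
    // Ignore any trailing path such as the "/" in "http://host:80/".
    let authority = rest.split('/').next()?;

    // Credentials, if present, are everything before the last '@'.
    let (credentials, host_port) = match authority.rsplit_once('@') {
        Some((creds, hp)) => (Some(creds), hp),
        None => (None, authority),
    };
    let (username, password) = match credentials {
        Some(creds) => match creds.split_once(':') {
            Some((user, pass)) => (Some(user.to_string()), Some(pass.to_string())),
            None => (Some(creds.to_string()), None),
        },
        None => (None, None),
    };

    // The port is whatever follows the last ':' of the host part.
    let (host, port) = host_port.rsplit_once(':')?;
    Some(ProxyConfig {
        scheme: scheme.to_string(),
        username,
        password,
        host: host.to_string(),
        port: port.parse().ok()?,
    })
}

/// Picks the first variable that is set, in the order described above:
/// SOCKS_PROXY, then HTTPS_PROXY, then HTTP_PROXY. The lookup is passed
/// in as a closure so the function does not depend on the process
/// environment directly.
fn select_proxy<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    ["SOCKS_PROXY", "HTTPS_PROXY", "HTTP_PROXY"]
        .iter()
        .find_map(|name| lookup(*name).filter(|value| !value.is_empty()))
}
```

In a real process the lookup would typically be `select_proxy(|name| std::env::var(name).ok())`, with the result handed to `parse_proxy_url`.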
Tests
I have tested HTTP proxying with TinyProxy locally. And I have tested SOCKS proxying with my VPN provider's SOCKS proxy, and Tor proxying through a local tor-socks-proxy container.
Would appreciate if you could review this MR - and maybe merge back. That'd make the world a better place - at least get rid of some rate-limits ^^
Cheers,
ruffy