Fixes #25293 - Puma support for smart proxy #623
Conversation
Rebased, this now adds entries to …
There is nothing like testing the changes in production.

Some findings:

- When installing puma in production by gem, make sure you have `openssl-devel` present. If you want to give it a try with the latest nightlies as well, this might be handy: `yum install gcc rpm-build rpm-devel rpmlint make python bash coreutils diffutils patch rpmdevtools openssl-devel ruby-devel`
- In production, we set `:bind_host` to `::`, which caused puma to fail with `the scheme ssl does not accept registry part: ::`. Changing the `:bind_host` setting to `'*'` fixed the issue; we might need to update the installer to use this value (and check webrick is fine with it).
- `request.env['https']` is set to `https` in puma (which is none of the values we are testing right now) - needs to be added.
- `verify_mode` was not set, so the client certificates were not loaded.
- Once `verify_mode` is set, the client cert is not available under the `SSL_CLIENT_CERT` env variable but under `puma.peercert` instead: we need to change all the occurrences of `SSL_CLIENT_CERT`.
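The last two findings above can be bridged in one place. A minimal sketch of a hypothetical helper (the name `client_certificate` is not from the PR) that reads the client certificate from either env key, assuming WEBrick exposes a PEM string under `SSL_CLIENT_CERT` and Puma exposes a parsed `OpenSSL::X509::Certificate` under `puma.peercert`, as the comment describes:

```ruby
require 'openssl'

# Hypothetical helper: return the client certificate as an
# OpenSSL::X509::Certificate regardless of the web server in use.
def client_certificate(env)
  if env['puma.peercert']
    env['puma.peercert']                       # Puma: already a parsed certificate object
  elsif env['SSL_CLIENT_CERT'] && !env['SSL_CLIENT_CERT'].empty?
    OpenSSL::X509::Certificate.new(env['SSL_CLIENT_CERT'])  # WEBrick: PEM string
  end
end
```

Centralizing this would let the rest of the proxy code stay server-agnostic instead of changing every `SSL_CLIENT_CERT` occurrence twice.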
lib/launcher.rb
Outdated
```ruby
app_details[:SSLArgs] = {
  :ca => @settings.ssl_ca_file,
  :key => @settings.ssl_private_key,
  :cert => @settings.ssl_certificate
```
missing `:verify_mode => 'peer'`
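With the suggestion applied, the hash from the diff above would look like this. A sketch only: the `@settings` receiver is replaced by a hypothetical stand-in so the fragment is self-contained, and the paths are made-up examples:

```ruby
require 'ostruct'

# Hypothetical stand-in for the proxy's @settings object; the paths are examples.
settings = OpenStruct.new(
  :ssl_ca_file     => '/etc/foreman-proxy/ssl_ca.pem',
  :ssl_private_key => '/etc/foreman-proxy/ssl_key.pem',
  :ssl_certificate => '/etc/foreman-proxy/ssl_cert.pem'
)

# Same hash as in the diff, plus the missing option: 'peer' makes the
# server request and verify the client certificate.
ssl_args = {
  :ca          => settings.ssl_ca_file,
  :key         => settings.ssl_private_key,
  :cert        => settings.ssl_certificate,
  :verify_mode => 'peer'
}
```

Without `:verify_mode`, no client certificate is requested at all, which matches the "client certificates were not loaded" finding earlier in the thread.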
Incorporating this change.
We did this because …
This is likely just a parsing issue, fixed by using …
lib/launcher.rb
Outdated
```ruby
query_list = app[:SSLArgs].to_a.map do |x|
  "#{CGI::escape(x[0].to_s)}=#{CGI::escape(x[1])}"
end
host = "ssl://#{address}/?#{query_list.join('&')}"
```
As @iNecas has reported, if `address` contains `:` or `.` characters, this is being misparsed. Using `ssl://[#{address}]/` is probably the right thing to use.
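Unconditional brackets break IPv4 addresses, so one resolution of this back-and-forth is to bracket only unbracketed IPv6 literals. A sketch, assuming the `bind_address` helper name (not from the PR) and reusing the query-string construction from the diff above:

```ruby
require 'cgi'

# Hypothetical helper: bracket bare IPv6 literals so the ssl:// bind URI
# parses; leave IPv4 addresses, hostnames and pre-bracketed values alone.
def bind_address(address)
  address.include?(':') && !address.start_with?('[') ? "[#{address}]" : address
end

ssl_args = { :cert => 'proxy.pem', :verify_mode => 'peer' }
query = ssl_args.map { |k, v| "#{CGI.escape(k.to_s)}=#{CGI.escape(v)}" }.join('&')
host = "ssl://#{bind_address('::')}:8443/?#{query}"
# host is "ssl://[::]:8443/?cert=proxy.pem&verify_mode=peer"
```

This would cover both the installer's `::` value and the `['0.0.0.0', '[::1]']` inputs mentioned below without special-casing callers.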
This is common code for both IPv4 and IPv6, and we feed it correctly (`['0.0.0.0', '[::1]']`), so I guess this will not fix the issue.
Added two changes, not sure what to do with the other items. For the `::` problem we can either fix this in the installer or change the code - if it spots `::` it will convert it to `[::]`, which is a bit lame but it will do. Ping me on IRC for a faster turnaround, I have a huge backlog on GH. Let me know.
Slightly changed the code according to @ekohl's recommendation. To listen on all interfaces use …
@ekohl since we just branched off, let's merge it now as an opt-in?
@adamruzicka could you also have a look? I believe the launcher code might be useful in …
We had a discussion in the platform team and we feel that at this point in the release cycle we don't have time to support this. We're still very busy with EL8 support and Rails 6, as well as the regular release work. Nightly is very red, and next week we're supposed to start stabilization week. We can't handle more instability, and the cipher change would require an installer change as well, to migrate from the current model (hardcoded + blacklist) to this model. While I think that's a very good step, at this point in the cycle it's too busy. Our conclusion was that we want to merge this early in the 2.2 development cycle to give it time to harden. I'd also like to see the follow-up plan, like how we would enable multiple threads and verify thread safety.
Rebased with current Rubocop rules 🤖.
Thanks for the info, I will poke you once we branch off.
2.1-stable exists ;-)
@ekohl can you please bring this to attention at the meeting again? We have passed the suggested timeframe, but it's post-2.2 branching and there's still a lot of time in the 2.3 cycle; it could be a good time to merge this.
I'm still not convinced it's a good idea. I can't speak for others, but I don't have time for this. Perhaps you can start with splitting some things up. For example, right now it changes how SSL ciphers are configured. That might be a good part to extract and merge. That already includes a migration that will need changes in the installer and in itself is complicated enough.
I squashed into a single commit to make my life easier when dealing with those conflicts (keeping co-authorship). I rebased and implemented a util class …
Fixed rubocop issues. I don't think this rule is too valid, so I disabled it:
Can we proceed with merging? I spent so much time on this one, and we have the option to go back to webrick, so we should be safe if issues arise.
Co-authored-by: Ido Kanner <[email protected]>
Changes requested from @ekohl and @ShimShtein - please re-review, guys.
I'd like to add that Webrick has always been a multi-threaded server; this is enabled by default, so I don't expect problems with threads after the switch to Puma.
As I stated again and again: I still don't really see the benefit in using Puma. There is no performance benefit and it now only complicates the code. If I read puma/puma#2260 I'm not sure it's wise to spawn multiple launchers from the same process and things may break in the future. Nothing in this PR or the Redmine issue explains why this is a good idea and I think this is a bad idea. Unless you convince me otherwise that there is a benefit, I don't want to merge this.
For the record, I know I could do better in regard to the description, but this was a follow-up PR. The purpose of this is that Webrick seems to be dying under load: https://bugzilla.redhat.com/show_bug.cgi?id=1614087 Webrick 1.4 or newer does fix the issue, however. The Puma migration started before we actually migrated Smart Proxy to SCL Ruby (which was blocking upgrading Webrick from 1.3). I guess we don't really need it anymore. I was driving this forward since we are changing core to Puma as well and I was hoping for a single stack. You can configure Puma in (multithreaded) single-instance mode too, to follow the Webrick deployment strategy, but you don't want this to continue, so alright.
From my perspective, being a novice on the internals of the smart proxy, I like the idea of landing on the same technologies across our stack where possible. This means we get more out of our investment, more knowledge, and in theory an understanding of what might be happening under the hood in each. For example, we benefit from investing in and supporting systemd notify support in Puma if we use it in two places. All technology shifts come with risks, and I admit I do not fully know the extent this may impact. It sounds like nobody truly does, and we do not have the confidence in our testing to give a clear yes or no. We are starting to push more workloads onto the smart proxy, which does concern me from a workload perspective, and I wonder if Puma is a better forward-looking choice to handle those workloads. Examples are the global registration workflow, container registry support that Katello is looking into, using the smart proxy to support Foreman Proxy with Content moving to port 443, and more reverse-proxying use cases. I apologize if I have asked this before (possibly even in this PR): what is it about the smart proxy that leads us to run it as we do, instead of running it as a more traditional sinatra application + application server?
The original motivation was my own ticket https://projects.theforeman.org/issues/24634 and then Ido (the original author of this PR) took over and created a new RM issue, for whatever reason, which has an empty description. At the time, I had tested that webrick 1.3 suffers from slow connection termination and puma does not. That was the initial goal, which is no longer a must as we migrated to SCL Ruby with Webrick 1.4+. Now, it's still a useful move, not only for the reason I explained of having a single web stack for both Core and Proxy, but also for scaling reasons. Webrick has always been a multi-threaded server with a new-thread-per-request design. This is an old-school design which does not scale under heavy load - too many threads can be created per process and there is no way of controlling this behavior, as it is hardcoded in the codebase (server.rb). Puma, on the other hand, uses a thread-pool technique with wait queues, which is a much better design in terms of reliability and robustness. There is indeed a set limit of requests processed in parallel, and this can be managed by operations; this does not necessarily mean better performance, as you noted, but it is definitely a better admin experience and better reliability. On top of that, we can set up Puma as a multi-process server. There was no plan to do this in the initial phase, as I noted above, but it's good to have this option if we find that a single process is no longer able to handle the load. You can't do this with Webrick at all.
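The thread-pool limits and optional multi-process mode described above are plain Puma configuration. A hedged sketch of what a hypothetical `config/puma.rb` for the proxy could look like (the numbers are illustrative, not from the PR):

```ruby
# Hypothetical config/puma.rb sketch: a bounded thread pool instead of
# WEBrick's unbounded thread-per-request model.
threads 0, 16        # min/max threads in the pool; the max is the parallelism cap
queue_requests true  # park idle keep-alive connections outside the thread pool

# The multi-process option mentioned above, not planned for the initial phase:
# workers 2
# preload_app!
```

The operational point is that `threads` gives admins the knob that WEBrick hardcodes away, and `workers` stays available as a later escape hatch.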
I don't understand this point; in the initial phase we would only use a single-process deployment strategy, as we are not aware of any consequences multiple processes could bring. Then of course we could start experimenting with other strategies once the code is in. But even then, I don't understand what systemd-notify has to do with this. You haven't touched on systemd through the whole review and now you see this as a stopper.
That's why I designed this patch in a way that lets users switch between webrick and puma. Also, the plan was to start with a single-process deployment. We could start small and learn in our scale labs. The idea is that Puma is much more flexible and configurable than Webrick.
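The opt-in switch described above amounts to a small dispatch on a settings value. A sketch under assumptions: the setting name `:http_server_type` and the launcher method names are hypothetical, chosen only to illustrate the webrick-default behavior:

```ruby
# Hypothetical sketch of the opt-in backend switch: pick the server
# implementation from settings, defaulting to the existing WEBrick path.
def server_launcher(settings)
  case (settings[:http_server_type] || 'webrick').to_s
  when 'puma'    then :launch_puma
  when 'webrick' then :launch_webrick
  else raise "Unsupported http_server_type: #{settings[:http_server_type]}"
  end
end
```

Defaulting to webrick keeps existing deployments untouched, which is what makes the "merge early and let it harden" plan safe.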
Nothing. We spawn some background threads in the app, which is a bit unusual, but Webrick has always been a multi-threaded server and we haven't experienced any problems so far. For multiple processes we would need to make sure that those background threads (inotify monitors of the ISC DHCP leases file) work correctly, but I don't see any problem with this and IMHO this would work without a change.
Thank you. That information was very helpful and I certainly didn't know that.
I think my major concern is that this dual-identity process doesn't appear to fit the Puma deployment model. Puma can bind to multiple endpoints (both HTTP and HTTPS). I wonder if we should instead embrace this model. One change would be that we would drop … Then another layer is that a module in its entirety can be disabled: HTTP, HTTPS or both. Perhaps we can hook into the existing sinatra routing layer to look at the HTTP protocol. Maybe it could also be solved in middleware.
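The middleware idea at the end could be sketched roughly like this: one Puma process with both binds, and a Rack middleware that refuses plain-HTTP requests for HTTPS-only modules. The class name and the 403 response are assumptions for illustration, not code from the PR:

```ruby
# Hypothetical Rack middleware: reject requests that arrive over plain HTTP
# for an app that is configured as HTTPS-only, so a single Puma process can
# serve both binds while modules keep their per-protocol restrictions.
class RequireHttps
  def initialize(app)
    @app = app
  end

  def call(env)
    if env['rack.url_scheme'] == 'https'
      @app.call(env)                # HTTPS request: pass through
    else
      [403, { 'Content-Type' => 'text/plain' }, ["HTTPS required\n"]]
    end
  end
end
```

Checking `rack.url_scheme` (rather than a server-specific env key) sidesteps the earlier `request.env['https']` discrepancy between WEBrick and Puma.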
The problem that I see is that you start 2 puma launchers if both HTTP and HTTPS are enabled. However, in that PR it assumes there is only one. That means we would need to clean the environment as well. That tells me that we are probably swimming against the stream and we may see more issues in the future. As I am currently fully occupied with other matters, I don't want to pick up more work. This PR also changes the SSL cipher configuration. That's technically a change that can live outside of this PR, since both webrick and Puma can accept the same format. However, that change also needs an installer change; there's a migration involved. Right now I don't have time for that. If I merged this, I would feel responsible for those changes as well. That's why I don't want to merge this: I don't have time for the resulting work. This doesn't mean others can't do it.
I opened puma/puma#2404 for some guidance from upstream on how to handle it properly. |
quoting puma/puma#2404 (comment)
I think that's the root concern I had and why I was very hesitant in merging this. I'm working on a proposal based on suggestions in that issue to see if that would work. |
I think I know, but I would appreciate someone refreshing me on the architectural reasons behind having a single application running trying to support HTTP- and HTTPS-based services over alternatives such as: …
#774 is my attempt at supporting Puma. If it works well, we should probably rewrite the launcher to start the Puma process with parameters from settings.yml.
The major reason this exists is that some things must happen over HTTP. For example, unattended installs such as kickstarts and preseeds typically don't support HTTPS at all, or it's a pain to set up the trust chain. The Smart Proxy can reverse proxy these calls to the Foreman. In doing so, it also provides the URLs to the Foreman application. This allows Foreman to generate the correct callback URL (which happens after the build is done; by default this is something like http://foreman.example.com/unattended/built but should become http://smart-proxy.example.com:8000/unattended/built).

However, HTTP is insecure. There is no authentication and it's unencrypted. That's why we prefer to run everything over HTTPS when we can. To limit the attack surface, every module can be configured to listen only on HTTPS (the installer does this for us). In practice we only need HTTP for the template and httpboot modules. If you disable those, you don't really need to listen on HTTP at all. That's why the Proxy also allows you to not listen on HTTP at all. Again, to reduce attack surface. There are also companies with policies that disallow HTTP, and this suits them. HTTP is also easy in development.

So effectively the Smart Proxy has 2 virtual hosts where we distinguish on the protocol. That's also the approach I've taken in my PR, based on a suggestion by a Puma developer.
Initial implementation of Puma done by Ido: #613
This works for me; if I need to do some changes I will add additional commits on top of Ido's to keep authorship information clear.