Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lightning-liquidity crate to the workspace #3436

Merged

Conversation

tnull
Copy link
Contributor

@tnull tnull commented Dec 3, 2024

We upstream the lightning-liquidity crate into the rust-lightning
workspace. First, all source files are copied over as per current main of the lightning-liquidity repository (commit c80eb75f5a31bea5c2b73e41c50ca382ec0020f8).

Then, fixup commits adjust the crate to the current state of rust-lightning, and add it to our local CI script.

Once this is merged I'll transfer any relevant open issues from https://github.com/lightningdevkit/lightning-liquidity and archive the repo.

@tnull tnull requested a review from TheBlueMatt December 3, 2024 10:21
@tnull tnull added this to the 0.1 milestone Dec 3, 2024
@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 1cd0e69 to 510ae5d Compare December 3, 2024 12:39
@tnull
Copy link
Contributor Author

tnull commented Dec 3, 2024

AFAICT, remaining CI failures should be unrelated.

@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 510ae5d to d408ac6 Compare December 3, 2024 18:05
@tnull
Copy link
Contributor Author

tnull commented Dec 3, 2024

Rebased on main to re-run CI with fixes.

@arik-so
Copy link
Contributor

arik-so commented Dec 4, 2024

check_commits and linting are still failing

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran out of time to finish really digging into this, but at least have some doc requests and such.

lightning-liquidity/Cargo.toml Outdated Show resolved Hide resolved
lightning-liquidity/Cargo.toml Show resolved Hide resolved
lightning-liquidity/Cargo.toml Outdated Show resolved Hide resolved
lightning-liquidity/src/events.rs Show resolved Hide resolved
lightning-liquidity/src/events.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/manager.rs Show resolved Hide resolved
lightning-liquidity/src/manager.rs Show resolved Hide resolved
lightning-liquidity/src/sync/nostd_sync.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/utils.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Outdated Show resolved Hide resolved
@tnull
Copy link
Contributor Author

tnull commented Dec 5, 2024

check_commits and linting are still failing

Yup, both are expected. check_commits should pass once we squash, and the linting failures are pre-existing, I think.

@tnull
Copy link
Contributor Author

tnull commented Dec 5, 2024

Ran out of time to finish really digging into this, but at least have some doc requests and such.

Sure, no worries! I'm generally happy to address any comments that come up during review, but would lean towards more or less only addressing anything that's needed to move the crate in this PR, and address anything that might be more substantial in follow-ups.

@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch 4 times, most recently from 1608102 to dca99f4 Compare December 5, 2024 13:06
Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 59.92574% with 1403 lines in your changes missing coverage. Please review.

Project coverage is 89.44%. Comparing base (797993c) to head (71212ad).
Report is 25 commits behind head on main.

Files with missing lines Patch % Lines
lightning-liquidity/src/lsps2/service.rs 57.76% 494 Missing and 23 partials ⚠️
lightning-liquidity/src/lsps1/client.rs 0.00% 333 Missing ⚠️
lightning-liquidity/src/lsps0/ser.rs 44.75% 171 Missing and 45 partials ⚠️
lightning-liquidity/src/manager.rs 47.03% 154 Missing and 7 partials ⚠️
lightning-liquidity/src/lsps2/client.rs 61.69% 76 Missing and 1 partial ⚠️
lightning-liquidity/src/lsps0/client.rs 57.89% 32 Missing ⚠️
lightning-liquidity/src/lsps1/msgs.rs 90.50% 11 Missing and 6 partials ⚠️
lightning/src/sync/debug_sync.rs 0.00% 14 Missing ⚠️
lightning-liquidity/src/lsps0/msgs.rs 89.60% 13 Missing ⚠️
lightning-liquidity/src/events.rs 95.03% 5 Missing and 3 partials ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3436      +/-   ##
==========================================
- Coverage   89.69%   89.44%   -0.26%     
==========================================
  Files         130      149      +19     
  Lines      107335   116152    +8817     
  Branches   107335   116152    +8817     
==========================================
+ Hits        96277   103894    +7617     
- Misses       8658     9793    +1135     
- Partials     2400     2465      +65     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch 4 times, most recently from fe85637 to 8b9c6e0 Compare December 9, 2024 15:29
@tnull
Copy link
Contributor Author

tnull commented Dec 9, 2024

Addressed all the pending feedback.

@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 8b9c6e0 to 4d20463 Compare December 9, 2024 16:02
@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 3a318f8 to 71212ad Compare December 10, 2024 09:42
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the new comments we can freely ignore and are more things to address later, but some of them probably are quite important.

lightning-liquidity/README.md Outdated Show resolved Hide resolved
lightning-liquidity/src/lsps2/client.rs Show resolved Hide resolved
lightning-liquidity/src/lsps2/client.rs Show resolved Hide resolved
lightning-liquidity/src/lsps2/client.rs Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Outdated Show resolved Hide resolved
lightning-liquidity/src/lsps2/service.rs Show resolved Hide resolved
@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 71212ad to 6f5875c Compare December 11, 2024 12:51
@tnull
Copy link
Contributor Author

tnull commented Dec 11, 2024

Responded to all pending comments and addressed most of them.

@tnull
Copy link
Contributor Author

tnull commented Dec 11, 2024

@TheBlueMatt Let me know if I can squash fixups.

@tnull
Copy link
Contributor Author

tnull commented Dec 11, 2024

Btw now moved all open issues from the lightning-liqudity repo to here and added a label of the same name.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to squash. I think I'm okay with landing this, AFAIU there's no remaining trivial "anyone can run us out of memory" vulnerabilities, but would like @johncantrell97 to take a glance at the set of changes and confirm that matches his understanding as well.

lightning-liquidity/src/lsps2/service.rs Show resolved Hide resolved
.. to which end we also need to make some additions to `debug_sync` and
`nostd_sync`.
We add a size limit on the event queue, after which we'll just start
dropping events to ensure we could never OOM.

Additionally, we document the requirement that users need to handle
generated events ASAP.
We introduce a new `MAX_PENDING_REQUESTS_PER_PEER` limit for the
number of pending (get_info and buy) requests per peer.
.. which is a prefactor to also start checking the total number of
pending requests in the next commit.
To this end we introduce a new counter keeping track of overall requests
pending and reject inbound requests if they would put us over the
limit.
We clean up any `get_info` request state when peers disconnect.
@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 91c6f62 to 08fd499 Compare December 11, 2024 16:01
@tnull
Copy link
Contributor Author

tnull commented Dec 11, 2024

Feel free to squash.

Squashed fixups without further changes.

@wvanlint
Copy link
Contributor

Exciting for this to be upstreamed! @johncantrell97 is out currently, I went over the changes briefly.

With regard to OOM vulnerabilities, it seems only limits on pending requests have been introduced i.e. requests that need to be handled by the server. Could memory not still increase with a constant throughput of handling requests?

  • Could a server not repeatedly fulfill BuyRequests with invoice_parameters_generated, which removes the pending request but allocates an OutboundJITChannel?
  • Similarly, handling GetInfoRequest inserts a PeerState for any new peer, and the pending request will be removed in opening_fee_params_generated. Empty PeerStates get cleared after disconnection though in the changes.

Basically, we might need to limit the number of PeerStates and OutboundJITChannels per peer, in addition to a total pending requests limit.

I also wonder if silently dropping events from EventQueue::enqueue could result in the flow getting stuck, what would happen if LSPS2ServiceEvent::OpenChannel gets dropped for example? Wouldn't the held HTLCs get stuck as they're waiting for the channel to be opened? Rate-limiting at the message handling level should limit the event queue growth as well.

@tnull
Copy link
Contributor Author

tnull commented Dec 13, 2024

Thanks for taking a look!

Could memory not still increase with a constant throughput of handling requests?

Yes, but to a degree that's unavoidable, at assuming least in the current protocol. Having users means having to keep some kind of state around, means increasing memory and/or storage requirements :)

Could a server not repeatedly fulfill BuyRequests with invoice_parameters_generated, which removes the pending request but allocates an OutboundJITChannel?

Hm, that's a valid point, although I was hesitant to introduce a limit on this in this PR as we'll need to work out a larger strategy how we want to handle/when we want to drop the OutboundJITChannel entries, especially when we'll look into persistence soon. But since we're already pruning any expired OutboundJITChannel that haven't made it further than PendingInitialPayment I now added a commit including them in the per-peer limit.

Similarly, handling GetInfoRequest inserts a PeerState for any new peer, and the pending request will be removed in opening_fee_params_generated. Empty PeerStates get cleared after disconnection though in the changes.

Right, I don't think we can't do better than keeping them around as long as they are needed, and garbage collecting them eventually.

Basically, we might need to limit the number of PeerStates and OutboundJITChannels per peer, in addition to a total pending requests limit.

The number of PeerStates is already limited by LDK's connection limits, see above (the new commit) for the latter.

I also wonder if silently dropping events from EventQueue::enqueue could result in the flow getting stuck, what would happen if LSPS2ServiceEvent::OpenChannel gets dropped for example? Wouldn't the held HTLCs get stuck as they're waiting for the channel to be opened?

Yes, we hopefully never hit this, but timing out channels (which we need to develop a strategy, as mentioned above) or even force-closing individual channels is better than running out of memory and failing to service existing customers, losing funds, etc. FWIW, the limit on the event queue is really a last-resort limit that never should be reached.

Rate-limiting at the message handling level should limit the event queue growth as well.

Yes, LDK already does this at the network level, as well as limiting the number of peers. We could eventually think about adding overly spammy peers to the ignore list (which currently is only used for peer sending us bogus messages), but for the most part we should be good leaning on LDK here (or if we find that not, we probably want to improve it there).

@TheBlueMatt
Copy link
Collaborator

The number of PeerStates is already limited by LDK's connection limits, see above (the new commit) for the latter.

Given those limits are only applied in ChannelManager I wonder if we shouldn't also have a limit on PeerStates here. Could be quite high, even like 100k.

Copy link
Contributor

@wvanlint wvanlint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking another look at the per-peer limit! I think that should cover all the DoS cases as far as I can tell.

I'm not sure about the interaction between ChannelManager's limit in ChannelMessageHandler::peer_connected, and the implementation here through CustomMessageHandler (Does it require the former to be called first?). Will defer to Matt there.

@tnull
Copy link
Contributor Author

tnull commented Dec 16, 2024

Given those limits are only applied in ChannelManager I wonder if we shouldn't also have a limit on PeerStates here. Could be quite high, even like 100k.

Alright, I'm not convinced it's strictly necessary, but probably better safe than sorry here. Now added a commit adding such a 100k limit.

@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from 19ddd84 to c5c0d72 Compare December 16, 2024 15:17
.. we clean up any pending buy requests that hit their `valid_until`
time when the counterparty disconnects.
We're now also pruning any expired `OutboundJITChannels` if we haven't
seen any related HTLCs.
In addition to pruning expired requests on peer disconnection we also
regularly prune for all peers on block connection, and also remove the
entire `PeerState` if it's empty after pruning (i.e., has no pending
requsts or in-flight channels left).
We include any `OutboundJITChannel` that has not made it further than
`PendingInitialPayment` in the per-peer request limit, and will of
course prune it once it expires.
While LDK/`ChannelManager` should already introduce an upper-bound on
the number of peers, here we assert that our `PeerState` map can't
grow unboundedly. To this end, we simply return an `Internal error` and
abort when we would hit the limit of 100000 peers.
@tnull tnull force-pushed the 2024-12-add-lightning-liquidity-crate branch from c5c0d72 to f68c6c5 Compare December 16, 2024 15:19
@tnull
Copy link
Contributor Author

tnull commented Dec 16, 2024

Force-pushed with minor fixup to avoid warning with disabled std feature:

> git diff-tree -U2 19ddd84 HEAD
diff --git a/lightning-liquidity/src/lsps2/utils.rs b/lightning-liquidity/src/lsps2/utils.rs
index 3de2d2eb5..8a085b76c 100644
--- a/lightning-liquidity/src/lsps2/utils.rs
+++ b/lightning-liquidity/src/lsps2/utils.rs
@@ -32,4 +32,5 @@ pub fn is_valid_opening_fee_params(

 /// Determines if the given parameters are expired, or still valid.
+#[cfg_attr(not(feature = "std"), allow(unused_variables))]
 pub fn is_expired_opening_fee_params(fee_params: &OpeningFeeParams) -> bool {
        #[cfg(feature = "std")]

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna go ahead and land this to get it out of the way. A handful of doc nits are still here, and one Real DoS vuln that will need fixing before we can release an RC.

@@ -0,0 +1,50 @@
[package]
name = "lightning-liquidity"
version = "0.1.0-alpha.6"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to bump this to 0.1 with the rest of LDK in the next release or will we just bump to a new alpha?

Copy link
Contributor Author

@tnull tnull Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good question, it would of course be nice to tag this 0.1 to ensure lock-step with lightning.

As far as I understood you correctly above, you consider the absence of service-side persistence a blocker to consider this 0.1? I tend to agree that it's a critical feature for the service-side, but we could still consider to tag this 0.1 w.r.t. to the client side? That is, we could either add doc comments making the current limitations clear and labeling service support as alpha/experimental (or even introduce an cfg(lsps2_service) flag for now?)?

And, given we might have an extended beta phase for 0.1, I hope to have a PR up to add service-side persistence by the time we'd get around to the 0.1 release (IIUC currently targetting mid-/late January?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a super strong opinion, certainly happy to just say its 0.1 "but service-side is in beta". We also really should have fuzzers for this logic.

In terms of 0.1, no, I don't think its mid/late January? We intend to get the beta out this week and then probably release kinda whenever bindings are done. We'll probably have an 0.1.1 with some minor fixes but that's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a super strong opinion, certainly happy to just say its 0.1 "but service-side is in beta".

Will add a respective note to the docs in #3493.

We also really should have fuzzers for this logic.

The idea was to write proptests for these, which we already kinda started.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to write proptests for these, which we already kinda started.

proptests aren't sufficient for the same level of coverage. They're useful to get a different kind of coverage, though.

lightning-liquidity/README.md Show resolved Hide resolved
lightning-liquidity/src/lib.rs Show resolved Hide resolved
lightning-liquidity/src/lsps0/client.rs Show resolved Hide resolved
lightning-liquidity/src/lsps0/msgs.rs Show resolved Hide resolved
lightning-liquidity/src/manager.rs Show resolved Hide resolved
lightning-liquidity/src/manager.rs Show resolved Hide resolved
user_channel_id: jit_channel.user_channel_id,
intercept_scid,
});
self.pending_events.enqueue(event);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true in quite a few places in this crate, but just noting it here - enqueing an event on the queue calls all the way into user code with the Waker notify. Its mostly probably fine cause the Waker is actually tokio and it doesn't immediately poll the future, but it could, and if it does, holding a lock when it does so is pretty gross. Where possible we should drop locks before pushing an event.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as mentioned above we'll want to do some more refactoring like this for the message queues, too. We probably should switch to a RAII-style Notifier that does all the (event/message) queue flushing when dropped for us.

use lightning::ln::channelmanager::InterceptId;
use lightning_types::payment::PaymentHash;

/// Holds payments with the corresponding HTLCs until it is possible to pay the fee.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to somehow handle timeouts of HTLCs in the queue? Presumably I could send one HTLC to the LSP which times out in a few blocks, then wait until it times out, then send another for one more satoshi, causing the LSP to open the channel but there's not actually payment for them. After that, I assume the client can then use the non-intercept SCID to bypass fees. Its maybe not a huge deal cause the client can already effectively bypass fees by just rejecting the initial HTLC forwards?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to somehow handle timeouts of HTLCs in the queue?

We should already pop any HTLCs for which we receive the HTLCHandlingFailed event and then try to forward the next one in line.

Presumably I could send one HTLC to the LSP which times out in a few blocks, then wait until it times out, then send another for one more satoshi, causing the LSP to open the channel but there's not actually payment for them. After that, I assume the client can then use the non-intercept SCID to bypass fees. Its maybe not a huge deal cause the client can already effectively bypass fees by just rejecting the initial HTLC forwards?

Yes, this is a general issue applicable to the LSP-trusts-client flow. It can only mitigated by the LSP delaying broadcasting the funding transaction until they're positive the client claimed all parts required to pay for the channel open.

@@ -0,0 +1,685 @@
#![cfg(test)]
// TODO: remove these flags and unused code once we know what we'll need.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, I'd like to wait until we added LSPS1 service work and possibly finished the LSPS2 integration test flow (note it currently only test up until the invoice generation step, not the channel opening/payment flow). However, the latter might additionally(?) get tested (and easier) in LDK Node integration tests as part of adding service-side support there.

@TheBlueMatt TheBlueMatt merged commit 6ad40f9 into lightningdevkit:main Dec 17, 2024
16 of 19 checks passed
@tnull tnull mentioned this pull request Dec 17, 2024
TheBlueMatt added a commit that referenced this pull request Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants