Skip to content

Optimize RTC exchange performance in global RIB#2881

Open
bor1go wants to merge 1 commit into
osrg:masterfrom
bor1go:rtc-fork
Open

Optimize RTC exchange performance in global RIB#2881
bor1go wants to merge 1 commit into
osrg:masterfrom
bor1go:rtc-fork

Conversation

@bor1go
Copy link
Copy Markdown

@bor1go bor1go commented Feb 18, 2025

  1. Add Key(RouteTarget)->uint64.
  2. Add map[Key(rt)]map[*Path]{} to globalRib VPN tables for fast making path list by RT.
  3. Add map[Key(rt)]int to Adj RTC tables for fast filtering paths by RTs.
  4. Add tests on pp.1-3.
  5. Non-VPN paths must not depend on RTC.

This is about complexity of RTC.

  1. Currently, filtering a single VPN path before sending requires O(n) operations, where n is the number of RTs in the RF_RTC_UC table. Constructing a list of VPNs takes O(nm) operations, where m is the number of VPN paths (filterpath).
  2. When a new RT is received, filtering VPNs to respond (propagateUpdate) requires O(m) operations.

This PR proposes a new approach that potentially reduces the complexity to O(1) for filtering and retrieving VPN paths, assuming the map container is used effectively. However, this approach requires additional memory. Specifically, for each VPN GRT, it necessitates:
n*L*m*sizeof(*Path) , where L is the average number of extRTs in the Path, m - number of paths, n - number of known RTs +
n*sizeof(RouterID) - to store interested on RT peers +
n*sizeof(uint64) - for indexing RTs
Additionally, for each Adj RF_RTC_UC table, the memory requirement is approximately:
n*(sizeof(uint64)+sizeof(int))

@bor1go
Copy link
Copy Markdown
Author

bor1go commented Feb 21, 2025

@fujita could you please take a look at it, if it is possible?
I attempted to optimize the RTC by reducing linear and quadratic complexity for large numbers of paths.

@fujita
Copy link
Copy Markdown
Member

fujita commented Feb 24, 2025

Specifically, for each VPN GRT, it necessitates

What' VPN GRT?

@bor1go
Copy link
Copy Markdown
Author

bor1go commented Feb 24, 2025

What' VPN GRT?

I meant global rib tables for VPN families.

@bor1go bor1go force-pushed the rtc-fork branch 3 times, most recently from 5819e58 to 8127e3a Compare February 24, 2025 15:05
@bor1go
Copy link
Copy Markdown
Author

bor1go commented Mar 6, 2025

@fujita could you take a look at it again, please?
This approach could be helpful with big numbers of routes and route targets.

@bor1go
Copy link
Copy Markdown
Author

bor1go commented Mar 31, 2025

I made some changes: the approach that distinguished between VPN and non-VPN families was wrong.

Now all families are could be use in the RTC scheme. And current PR:

  1. Adds Key(RouteTarget)->uint64.
  2. Adds map[Key(rt)]map[*Path]{} to globalRib tables for fast making path list by RT.
  3. Adds map[Key(rt)]int to Adj RTC tables for fast filtering paths by RTs.
  4. Adds tests on pp.1-3.

@bor1go bor1go force-pushed the rtc-fork branch 2 times, most recently from 1914fde to 177fee0 Compare April 24, 2025 11:38
@bor1go
Copy link
Copy Markdown
Author

bor1go commented May 6, 2025

Hello, @fujita ,
Could you approve a workflow please? I've tested it locally, but it would be better to check it here...

@bor1go
Copy link
Copy Markdown
Author

bor1go commented May 13, 2025

Hello, @fujita ,
May I kindly remind you about this PR?

When a large number of RTs are loaded into GoBGP, the RTC filter becomes dramatically slow. I measured the performance on two GoBGPs running on the same PC. I loaded paths on one server and RTs on another, then established peering.

For example, if we have 10 paths with the same RT:

  • For 1,000 RTs: filtering speed is about 100,000 paths in a few seconds
  • For 5,000 RTs: 100,000 paths in 600 seconds
  • For 10,000 RTs: 100,000 paths in 1,300 seconds
  • For 100,000 RTs: 100,000 paths in approximately 20,000 seconds (I didn’t wait for it to finish)

The performance depends on the average number of paths per RT and the total number of RTs.

I modified the filter process to use maps instead of nested loops. Now it works much faster, with O(1) complexity instead of O(nm).

We are using this change in our environment and have not encountered any issues in this code so far.

@bor1go bor1go closed this Jun 19, 2025
@bor1go bor1go reopened this Jun 19, 2025
@bor1go bor1go force-pushed the rtc-fork branch 10 times, most recently from 1d1c567 to 43e988a Compare July 14, 2025 11:55
@bor1go bor1go force-pushed the rtc-fork branch 2 times, most recently from a0c9a70 to a2abdad Compare October 6, 2025 10:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request optimizes Route Target Constraint (RTC) handling in GoBGP by introducing fast lookup data structures. The changes reduce complexity from O(n*m) to approximately O(1) for RTC-related operations at the cost of additional memory usage.

Changes:

  • Introduces RT hash-based indexing using uint64 keys for fast route target lookups
  • Adds vpnFamilyRTCMap for global VPN tables and routeFamilyRTCMap for adjacency RTC tables
  • Refactors RTC path propagation logic to leverage the new indexing structures
  • Updates destination.Calculate to return withdrawn paths for proper RTC bookkeeping

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/pkg/table/rtc.go New RTC handler interface and implementations for fast RT lookups
internal/pkg/table/rtc_test.go Unit tests for RTC handler low-level operations
internal/pkg/table/table.go Adds RTC handler to Table struct and integrates with path updates
internal/pkg/table/table_test.go Tests for table creation with RTC and RT-based path retrieval
internal/pkg/table/path.go Helper functions for RT-based path filtering and retrieval
internal/pkg/table/destination.go Modifies Calculate to return withdrawn path for RTC unregistration
internal/pkg/table/destination_test.go Updates tests for new Calculate signature
internal/pkg/table/adj.go Integrates RTC handlers into adjacency RIB updates
internal/pkg/table/adj_test.go Tests for adjacency RTC functionality
internal/pkg/table/table_manager.go Manager-level methods for RT-based path retrieval across tables
internal/pkg/table/table_manager_test.go Integration tests for table manager RTC operations
internal/pkg/table/policy.go Updates function call to renamed ExtCommRouteTargetKey
pkg/server/server.go Refactored RTC propagation logic using new fast lookup methods
pkg/server/peer.go Optimized peer interest checking using RTC index
pkg/server/server_test.go End-to-end test for RTC with multiple route targets

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/pkg/table/adj.go Outdated
Comment on lines +76 to +78
if t.rtc != nil {
t.rtc.unregister(path, true)
}
Copy link

Copilot AI Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When unregistering a withdrawn path from the RTC map, the old path should be passed instead of the withdrawal path for consistency. While both paths have the same NLRI and the current implementation works, passing 'old' is more semantically correct since it's the path that was originally registered.

Copilot uses AI. Check for mistakes.
Comment thread pkg/server/server_test.go Outdated
}
}
case <-t1.C:
t.Logf("no paths have been withdrawn")
Copy link

Copilot AI Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop logic is incorrect. When the timer expires at line 2090, the code logs "no paths have been withdrawn" but doesn't set found to true or break out of the loop. This means the test will continue waiting indefinitely after the timeout. The timer case should set found=true to exit the loop, or there should be a break statement.

Suggested change
t.Logf("no paths have been withdrawn")
t.Logf("no paths have been withdrawn")
found = true

Copilot uses AI. Check for mistakes.
Comment thread internal/pkg/table/table.go Outdated
Comment on lines +194 to +195
// rtc is used for RTC rt saving. Every global vpn table has vpnPathsPart,
// every Adj RF_RTC_UC table has rtPathsPart, and other tables have not rtc instance.
Copy link

Copilot AI Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment references type names "vpnPathsPart" and "rtPathsPart" which don't exist in the code. These should be "vpnFamilyRTCMap" and "routeFamilyRTCMap" respectively to match the actual type names.

Copilot uses AI. Check for mistakes.
@bor1go bor1go force-pushed the rtc-fork branch 2 times, most recently from cc3af40 to efea3e0 Compare January 15, 2026 03:26
@fujita fujita requested a review from Copilot January 15, 2026 12:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/pkg/table/adj.go Outdated
if val > 0 {
rtc.rts[rtHash]--
}
if val <= 1 && deleteEmpty {
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decrement logic is incorrect. The condition if val > 0 decrements the counter, but then the condition if val <= 1 && deleteEmpty checks the original value val before the decrement. This means that when val equals 1, it will be decremented to 0, but then the check val <= 1 will be true (since val is still 1), causing the entry to be deleted. However, if val is 2, it will be decremented to 1, and val <= 1 will be false (since val is 2), so the entry won't be deleted even though the counter is now 1.

The correct logic should check the value AFTER decrementing, not before. It should be: if rtc.rts[rtHash] <= 0 && deleteEmpty { delete(rtc.rts, rtHash) }

Suggested change
if val <= 1 && deleteEmpty {
if rtc.rts[rtHash] <= 0 && deleteEmpty {

Copilot uses AI. Check for mistakes.
Comment thread internal/pkg/table/table.go Outdated
Family bgp.Family
destinations Destinations
logger *slog.Logger
// map[rt valsue]->(number of such rtc paths in the table). Only for adj table and RF_RTC_UC family.
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error in the comment: "valsue" should be "value".

Suggested change
// map[rt valsue]->(number of such rtc paths in the table). Only for adj table and RF_RTC_UC family.
// map[rt value]->(number of such rtc paths in the table). Only for adj table and RF_RTC_UC family.

Copilot uses AI. Check for mistakes.
Replace linear route scanning for RTC requests with O(1) lookups by adding
a route target index to global RIB tables. The index maps route targets
to containing paths and requesting peers for instant RTC response generation.

Includes comprehensive test coverage.
@bor1go bor1go changed the title Speed up RTC exchange, exclude non-VPN paths from the RTC scheme Optimize RTC exchange performance in global RIB Jan 19, 2026
@bor1go
Copy link
Copy Markdown
Author

bor1go commented Jan 20, 2026

Dear @fujita,
While I haven't been able to significantly reduce the size of this PR, I've focused on making its purpose and implementation as clear as possible. Now this PR introduces a route target indexing mechanism that improve the performance of RTC operations in global RIB.

Key changes:

  1. Added route target index (rtIndex) to global RIB tables that maps route targets (represented as uint64 keys) to:
    • Paths containing each route target
    • Peers requesting each route target
  2. Eliminates the need for linear scanning through all routes when processing RTC requests, replacing O(n) complexity with O(1) lookups
  3. Added comprehensive tests to validate the indexing functionality

@bor1go bor1go marked this pull request as ready for review January 20, 2026 00:10
@bor1go
Copy link
Copy Markdown
Author

bor1go commented Jan 26, 2026

Hi @fujita, gentle ping on this PR — it’s been a week since my last comment. When you have a moment, could you please take a look? Happy to clarify anything or adjust the approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants