View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000628 | AlmaLinux-10 | General | public | 2026-05-28 13:06 | 2026-05-28 13:06 |
| Reporter | v-sriramsur | Assigned To | |||
| Priority | normal | Severity | minor | Reproducibility | have not tried |
| Status | new | Resolution | open | ||
| Summary | 0000628: [AlmaLinux][Mana][Backport] net: mana: Avoid queue struct allocation failure under memory fragmentation | ||||
| Description | Backport Justification (ADO) 1. Problem Summary: The MANA driver performs large contiguous kmalloc allocations during queue setup, which can fail on systems with highly fragmented memory: - mana_create_txq (tx_qp array): ~2.2 MB @ 64 queues - mana_create_rxq (rxq + flex array): ~2.4 MB per queue @ depth 8192 - mana_pre_alloc_rxbufs: ~4 MB @ 64 queues / depth 8192. Failures occur on driver open and on runtime reconfiguration (channels, ring size, MTU). Mellanox (mlx5) previously hit the same class of issue: large contiguous allocations (~128 KB / ~512 KB) failed on fragmented systems and were fixed by switching to smaller page-sized units. 2. Impact on Customer VMs - Driver load failure → VM loses networking / availability. - Runtime reconfiguration (ethtool, MTU) fails → blocks workload tuning. - Worst on long-running VMs, high-memory workloads (AI/HPC, in-memory DBs), and maximum queue/ring configurations (64 queues, depth 8192). 3. Reproduction Details (If Available) - Trigger: High memory fragmentation + MANA driver load / queue reconfiguration. - Symptom: kmalloc order-N allocation failure during mana_create_txq / mana_create_rxq / mana_pre_alloc_rxbufs; driver open/reconfiguration returns -ENOMEM. - No deterministic repro; observed on fragmented systems and reproducible by forcing high-order allocation pressure. 4. Relationship to Larger Feature: None 5. Patch Criticality - Medium - Customer-visible loss of VM networking on memory-fragmented systems when reloading the driver or reconfiguring queue depth, queue size, etc. 6. Classification (Select All That Apply) - Bug fix – driver fails to load or reconfigure queues when contiguous high-order memory is unavailable due to memory fragmentation. 7. Upstream References - Upstream commits: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=3af0820c878e https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d07efe5a6e64 Request backport of the above patches to the active kernel versions. Thanks. | ||||
| Tags | No tags attached. | ||||
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2026-05-28 13:06 | v-sriramsur | New Issue |