cxl create-region not working for --type=ram #276

Open
hextag opened this issue Nov 24, 2024 · 5 comments

Comments

@hextag

hextag commented Nov 24, 2024

The following four memory devices are shown in the QEMU VM when I run 'cxl list'.

[root@fedora dev]# cxl list
[
  {
    "memdevs":[
      {
        "memdev":"mem3",
        "ram_size":268435456,
        "serial":0,
        "host":"0000:c2:00.0"
      },
      {
        "memdev":"mem2",
        "ram_size":268435456,
        "serial":0,
        "host":"0000:c3:00.0"
      },
      {
        "memdev":"mem1",
        "pmem_size":268435456,
        "serial":0,
        "host":"0000:36:00.0"
      },
      {
        "memdev":"mem0",
        "pmem_size":268435456,
        "serial":0,
        "host":"0000:37:00.0"
      }
    ]
  },
  {
    "regions":[
      {
        "region":"region0",
        "resource":328833433600,
        "size":536870912,
        "type":"pmem",
        "interleave_ways":2,
        "interleave_granularity":8192,
        "decode_state":"commit"
      }
    ]
  }
]

I want to create a region from devices mem2 and mem3 (which are listed as ram in the 'cxl list' output) and create a device node from this region to use in SysRAM mode. However, the create-region command fails:

cxl create-region -m -d decoder0.0 -w 2 -g 8192 mem2 mem3 --type=ram

cxl region: collect_memdevs: no active memdevs found: decoder: decoder0.0 filter: mem2,mem3
cxl region: cmd_create_region: created 0 regions

I tried enabling these devices with 'cxl enable-memdev', but that did not resolve the error above:

cxl enable-memdev mem2
cxl enable-memdev mem3

Earlier I created a region (region0), visible in the 'cxl list' output above, from memory devices mem0 and mem1, both of which are listed as pmem.

I created a namespace for region0 using 'sudo ndctl create-namespace --mode=devdax --force --region=region0'.

'daxctl list' shows the resulting device node, which is in devdax mode:

[root@fedora dev]# daxctl list
[
  {
    "chardev":"dax0.0",
    "size":526385152,
    "target_node":32,
    "align":2097152,
    "mode":"devdax"
  }
]

The dax0.0 node above is created from mem0 and mem1, both of which are listed as pmem in 'cxl list'. I am able to read from and write to this node.

When I try to reconfigure dax0.0 to system-ram mode, I get the error below. That is why I want to create a new region, and a new device node in SysRAM mode, from the other two devices (mem2 and mem3), which are listed as ram in the 'cxl list' output.

[root@fedora dev]# daxctl reconfigure-device -m system-ram dax0.0
dax0.0: error: kernel policy will auto-online memory, aborting
error reconfiguring devices: Device or resource busy
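
For context on the error above, here is a minimal sketch for inspecting the kernel's memory auto-online policy. It assumes the policy is exposed through the standard memory-hotplug sysfs file /sys/devices/system/memory/auto_online_blocks and that this is what daxctl checks before aborting; neither point is confirmed in this thread. The helper names auto_online_policy and will_auto_online are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's memory auto-online policy.
# Assumes the policy lives in the sysfs file below; pass another path
# as $1 to run against a fixture file instead.
auto_online_policy() {
    policy_file="${1:-/sys/devices/system/memory/auto_online_blocks}"
    if [ -r "$policy_file" ]; then
        cat "$policy_file"
    else
        echo "unknown"
    fi
}

# Succeeds when the policy is readable and is anything other than
# "offline", i.e. hot-added memory blocks would be onlined automatically.
will_auto_online() {
    case "$(auto_online_policy "${1:-}")" in
        offline|unknown) return 1 ;;
        *) return 0 ;;
    esac
}

echo "auto-online policy: $(auto_online_policy)"
```

If this prints anything other than offline, that would be consistent with the "kernel policy will auto-online memory" message. The daxctl-reconfigure-device man page documents the --no-online and --force options; how they interact with this check is worth confirming there before relying on them.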

How do I go about SysRAM emulation? Can anyone please help?

@djbw
Member

djbw commented Dec 3, 2024

Are you sure that mem2 and mem3 are reachable under decoder0.0? Depending on the configuration they may be connected under a separate CXL window and host bridge.

@hextag
Author

hextag commented Dec 4, 2024

They are connected to a different bridge. I tried a different root decoder (decoder0.1) but still hit an error.

There are only two root decoders in the system, decoder0.0 and decoder0.1. I verified this using 'cxl list -D' and by inspecting the decoders in /sys/bus/cxl/devices/.

Here is what I see when I try to create the region from decoder0.1:

[root@fedora ~]# cxl create-region -d decoder0.1 -w 2 -g 8192 -m mem1 mem3 --type=ram
cxl region: create_region: region1: failed to set target0 to mem1
cxl region: cmd_create_region: created 0 regions

Using decoder0.0 gives the error below:

[root@fedora ~]# cxl create-region -m mem1 mem3  -d decoder0.0 -w 2 -g 8192 --type=ram
cxl region: collect_memdevs: no active memdevs found: decoder: decoder0.0 filter: mem1,mem3
cxl region: cmd_create_region: created 0 regions

I'm not clear which decoder I'm supposed to use to create a SysRAM region now that I've tried both of the available root decoders.

I rebooted the QEMU image with '--rebuild wipe' to reset the previous state. The new state of the VM is shown below.

The CXL topology is as follows.

Output of 'cxl list':

[root@fedora devices]# cxl list
[
  {
    "memdev":"mem1",
    "ram_size":268435456,
    "serial":0,
    "host":"0000:c3:00.0"
  },
  {
    "memdev":"mem3",
    "ram_size":268435456,
    "serial":0,
    "host":"0000:c2:00.0"
  },
  {
    "memdev":"mem2",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:37:00.0"
  },
  {
    "memdev":"mem0",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:36:00.0"
  }
]
[root@fedora ~]# cd /sys/bus/cxl/devices
[root@fedora devices]# readlink -f mem1
/sys/devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:01.0/0000:c3:00.0/mem1
[root@fedora devices]# readlink -f mem3
/sys/devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:00.0/0000:c2:00.0/mem3
[root@fedora devices]# readlink -f mem0
/sys/devices/pci0000:35/0000:35:00.0/0000:36:00.0/mem0
[root@fedora devices]# readlink -f mem2
/sys/devices/pci0000:35/0000:35:01.0/0000:37:00.0/mem2
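
The readlink output above can be reduced to a memdev-to-host-bridge mapping. A minimal sketch, assuming the resolved sysfs paths always start with /sys/devices/pciDDDD:BB/ as they do here; the helper name host_bridge_of is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: extract the PCI host bridge component
# ("pciDDDD:BB") from a resolved sysfs path such as the readlink -f
# output above. Prints nothing if the path has a different shape.
host_bridge_of() {
    printf '%s\n' "$1" |
        sed -n 's|^/sys/devices/\(pci[0-9a-fA-F]*:[0-9a-fA-F]*\)/.*|\1|p'
}

# Walk the CXL memdevs (if any exist on this machine) and print
# "<memdev> <host bridge>" pairs.
for dev in /sys/bus/cxl/devices/mem*; do
    [ -e "$dev" ] || continue
    printf '%s %s\n' "$(basename "$dev")" \
        "$(host_bridge_of "$(readlink -f "$dev")")"
done
```

Applied to the four paths above, this puts mem1 and mem3 under pci0000:bf, and mem0 and mem2 under pci0000:35.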

PCIe tree:

[root@fedora ~]# lspci -t
-+-[0000:00]-+-00.0
 |           +-01.0
 |           +-02.0
 |           +-03.0
 |           +-04.0
 |           +-1f.0
 |           +-1f.2
 |           \-1f.3
 +-[0000:35]-+-00.0-[36]----00.0
 |           \-01.0-[37]----00.0
 \-[0000:bf]-+-00.0-[c0-c5]--+-00.0-[c1-c5]--+-00.0-[c2]----00.0
             |               |               +-01.0-[c3]----00.0
             |               |               +-02.0-[c4]--
             |               |               \-03.0-[c5]--
             |               \-00.1
             \-01.0-[c6]--

The following decoders are available. I verified this using 'cxl list -D' and by manually running 'cat devtype' to see each decoder's type:

[root@fedora devices]# cxl list -D
[
  {
    "decoder":"decoder0.0",
    "resource":49660559360,
    "size":4294967296,
    "interleave_ways":1,
    "max_available_extent":4294967296,
    "pmem_capable":true,
    "volatile_capable":true,
    "accelmem_capable":true,
    "nr_targets":1
  },
  {
    "decoder":"decoder0.1",
    "resource":53955526656,
    "size":4294967296,
    "interleave_ways":2,
    "interleave_granularity":8192,
    "max_available_extent":4294967296,
    "pmem_capable":true,
    "volatile_capable":true,
    "accelmem_capable":true,
    "nr_targets":2
  }
]

Only decoder0.0 and decoder0.1 appear to be root decoders. The rest appear to be switch or endpoint decoders:

[root@fedora devices]# ls
decoder0.0  decoder2.3  decoder5.0  decoder7.1  mem2            pmu_mem2.0
decoder0.1  decoder3.0  decoder5.1  decoder7.2  mem3            pmu_mem2.1
decoder1.0  decoder3.1  decoder5.2  decoder7.3  nvdimm-bridge0  pmu_mem3.0
decoder1.1  decoder3.2  decoder5.3  endpoint3   pmem0           pmu_mem3.1
decoder1.2  decoder3.3  decoder6.0  endpoint5   pmem2           port1
decoder1.3  decoder4.0  decoder6.1  endpoint6   pmu_mem0.0      port2
decoder2.0  decoder4.1  decoder6.2  endpoint7   pmu_mem0.1      port4
decoder2.1  decoder4.2  decoder6.3  mem0        pmu_mem1.0      root0
decoder2.2  decoder4.3  decoder7.0  mem1        pmu_mem1.1
[root@fedora devices]# cd decoder0.0
[root@fedora decoder0.0]# cat devtype
cxl_decoder_root
[root@fedora decoder0.0]# cd ../decoder0.1/
[root@fedora decoder0.1]# cat devtype
cxl_decoder_root
[root@fedora decoder0.1]# cd ../decoder2.0/
[root@fedora decoder2.0]# cat devtype
cxl_decoder_switch

I tried both of the available root decoders, decoder0.0 and decoder0.1, to create a SysRAM region, but both attempts fail with errors.

Can you please tell me which decoder I need to use, or whether I'm making an obvious mistake in any of the steps?

@djbw
Member

djbw commented Dec 4, 2024

If you run "cxl list -d 0.0 -M" it will list the memory devices that are mapped by decoder0.0.
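
That listing can also be checked from a script. A minimal sketch that reads the JSON from stdin and tests whether a set of memdevs appears in it; it is a plain text match on the "memdev" key rather than a real JSON parser, and the helper names memdev_names and all_mapped are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read cxl list JSON on stdin and print the
# memdev names it contains, one per line. Splits on commas first so
# it also copes with JSON printed on a single line.
memdev_names() {
    tr ',' '\n' | sed -n 's/.*"memdev":"\([^"]*\)".*/\1/p'
}

# Succeeds only if every name given as an argument appears among the
# memdevs in the JSON on stdin.
all_mapped() {
    names="$(memdev_names)"
    for want in "$@"; do
        printf '%s\n' "$names" | grep -qx "$want" || return 1
    done
}

# Example use (requires the cxl tool):
#   cxl list -d 0.0 -M | all_mapped mem1 mem3 && echo "both mapped"
```

A device that is missing from the list for a given root decoder is not reachable through it.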

@hextag
Author

hextag commented Dec 4, 2024

I get this output from 'cxl list -M' for decoder0.0 and decoder0.1 respectively. 'cxl create-region' with mem1 and mem3 under decoder0.1 still results in the same error, though.

[root@fedora ~]# cxl list -d 0.0 -M
[
  {
    "memdev":"mem2",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:37:00.0"
  },
  {
    "memdev":"mem0",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:36:00.0"
  }
]
[root@fedora ~]# cxl list -d 0.1 -M
[
  {
    "memdev":"mem1",
    "ram_size":268435456,
    "serial":0,
    "host":"0000:c3:00.0"
  },
  {
    "memdev":"mem3",
    "ram_size":268435456,
    "serial":0,
    "host":"0000:c2:00.0"
  },
  {
    "memdev":"mem2",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:37:00.0"
  },
  {
    "memdev":"mem0",
    "pmem_size":268435456,
    "serial":0,
    "host":"0000:36:00.0"
  }
]

The output of commands like 'cxl list' is in my previous comment in this thread.

The create-region command fails with the following error:

[root@fedora ~]# cxl create-region -d decoder0.1 -w 2 -g 8192 -m mem1 mem3 --type=ram
cxl region: create_region: region1: failed to set target0 to mem1
cxl region: cmd_create_region: created 0 regions

@djbw
Member

djbw commented Dec 4, 2024

So it looks like you have 2 host bridges: one host bridge has 2 root ports with devices attached, and the other host bridge has 1 root port with devices attached through a single switch.

If you run "cxl list -DT -d root" it will show you which windows target which host bridges, but it is clear that 0.0 only targets one host bridge, while 0.1 targets both host bridges. To create an x2 region with decoder0.1 you need at least one mem device per host bridge. mem1 and mem3 are from the same branch of the interleaved host bridges. The only way to create an interleave with only mem1 and mem3 is if you had another CXL memory window that was x1 targeting the pci0000:bf host bridge.
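
The host-bridge rule described above can be sketched as a quick check. This is a simplification under stated assumptions: it only compares the number of distinct host bridges behind the chosen memdevs with the root decoder's nr_targets value from 'cxl list -D', and it does not check that each device has capacity of the requested type. The function names are hypothetical, and the paths are the readlink results posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical check: count the distinct PCI host bridges behind a set
# of resolved memdev sysfs paths (one path per argument).
count_host_bridges() {
    for path in "$@"; do
        printf '%s\n' "$path" |
            sed -n 's|^/sys/devices/\(pci[0-9a-fA-F]*:[0-9a-fA-F]*\)/.*|\1|p'
    done | sort -u | wc -l | tr -d ' '
}

# Succeeds when the memdevs span as many host bridges as the root
# decoder interleaves across ($1 = nr_targets, remaining args = paths).
covers_root_interleave() {
    nr_targets="$1"
    shift
    [ "$(count_host_bridges "$@")" -eq "$nr_targets" ]
}

# Paths copied from the readlink output earlier in the thread.
mem1=/sys/devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:01.0/0000:c3:00.0/mem1
mem3=/sys/devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:00.0/0000:c2:00.0/mem3

# decoder0.1 reports nr_targets 2.
if covers_root_interleave 2 "$mem1" "$mem3"; then
    echo "mem1 and mem3 span both host bridges"
else
    echo "mem1 and mem3 sit behind a single host bridge"
fi
```

With the topology in this thread the check takes the second branch, since both devices resolve to pci0000:bf.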
