Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update HAL for scatter and gather engines #697

Open
wants to merge 4 commits into
base: staging
Choose a base branch
from

Conversation

hiddemoll
Copy link
Contributor

This PR updates the outdated Hardware Abstraction Layer for the scatter and gather engines. It also adds a simulation to test the updated HALs.

Since we use DataLink in bittide internals, we cannot use address 0. This makes the test a bit more complex to read, mostly because of all the conversions from bytes to 32-bit and 64-bit words you have to make in your head. I think it may be better to wait on #684 before merging this PR.

@hiddemoll hiddemoll force-pushed the hidde/scatter-gather-hal branch from 8fa6bc1 to ec9427d Compare December 10, 2024 08:58
@hiddemoll hiddemoll marked this pull request as ready for review December 10, 2024 08:58
@hiddemoll
Copy link
Contributor Author

@rslawson could you review the Rust code in this PR?
@lmbollen I'm assigning you as reviewer for the rest, since you created issue #663

Copy link
Contributor

@lmbollen lmbollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clash part looks good modulo some minor comments and one question

bittide-instances/tests/Wishbone/ScatterGather.hs Outdated Show resolved Hide resolved
ScatterConfig nBytes addrW ->
GatherConfig nBytes addrW ->
Circuit (Df dom (BitVector 8)) (Df dom (BitVector 8))
dut scatterConfig gatherConfig = circuit $ \uartRx -> do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are also idle driving uartRx in the declaration of uartStream above.

The code would be cleaner if you remove that incoming circuit from dut and drive it similar to the other unconnected circuits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a separate commit doing the same for the other bittide-instances unittests which also had an idle uartRx.

bittide-instances/tests/Wishbone/ScatterGather.hs Outdated Show resolved Hide resolved
bittide-instances/tests/Wishbone/ScatterGather.hs Outdated Show resolved Hide resolved
@hiddemoll hiddemoll force-pushed the hidde/scatter-gather-hal branch 2 times, most recently from d006a45 to 0bc27c9 Compare December 13, 2024 13:32
firmware-support/bittide-sys/src/gather_unit.rs Outdated Show resolved Hide resolved
Comment on lines 24 to 30
/// # Safety
///
/// The source memory size must be smaller or equal to the size of the
/// `GatherUnit` memory.
pub unsafe fn copy_from_slice(&self, src: &[u8], offset: usize) {
assert!(src.len() + offset <= MEM_SIZE * 4);
core::ptr::copy_nonoverlapping(src.as_ptr(), self.memory.add(offset), src.len());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed if Deref and DerefMut are implemented.

Copy link
Collaborator

@rslawson rslawson Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this shouldn't be removed, but we can present a safe interface if you write the necessary asserts. If the assert! that already exists in this function is enough to guarantee that no out-of-bounds writes can occur, then it would be valid here to move the unsafe from the function to a block inside it. Also here, since this is MMIO, we mostly want to be using volatile operations to ensure the compiler doesn't elide them. What the HAL had previously is close to what it seems like you want, except for the ability to copy something smaller than a frame and have an offset. Something like

pub fn copy_from_slice(&self, src: &[u8], offset: usize) {
    assert!(src.len() + offset <= MEM_SIZE * 4);
    let mut off = offset;
    for &byte in src {
        unsafe {
            self.base_addr.add(off).write_volatile(byte);
        }
        off += 1;
    }
}

should probably get the job done. Also, to make Clippy happy, you may need to change # Safety in the doc comment to # Panics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented your change request, but sadly this does make reading/writing roughtly 4 times slower than it was with copy_nonoverlapping. I've increased the padding of the calendar for the test by a factor of 4.
Do you think this change is worth it?
Do you agree with the new function names of the read/write functions for the scatter and gather units?

Comment on lines 24 to 31
/// # Safety
///
/// The destination memory size must be smaller or equal to the size of the
/// `ScatterUnit`.
pub unsafe fn copy_to_slice(&self, dst: &mut [u8], offset: usize) {
let src = self.memory.add(offset);
assert!(dst.len() + offset <= MEM_SIZE * 4);
core::ptr::copy_nonoverlapping(src, dst.as_mut_ptr(), dst.len());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed if Deref and DerefMut are implemented.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore, see comment on copy_from_slice, but replace the write_volatile with a read_volatile and assign that to the appropriate index in dst.

Also removes the FDT infrastructure, which is no longer used.

Fixes: #663
The scatter and gather units now only have a base address and an associated
constant for the offset of the metacycle register.
Reads from and writes to the scatter and gather unit are now done with volatile
operations, similar to the old situation. This does the read and write
operations slower by roughly a factor of 4, which is made up for by the
increased padding of the calendar.
@hiddemoll hiddemoll force-pushed the hidde/scatter-gather-hal branch from 94edee1 to 9ad8608 Compare December 19, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants