exploration of LoRA using composition #167
base: main
Conversation
@ModuleInfo(key: "q_proj") var wq: UnaryLayer | ||
@ModuleInfo(key: "k_proj") var wk: UnaryLayer | ||
@ModuleInfo(key: "v_proj") var wv: UnaryLayer | ||
@ModuleInfo(key: "o_proj") var wo: UnaryLayer |
To use composition we would declare the layers with an appropriate protocol instead of a concrete type.
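For contrast, the concrete declarations this replaces look roughly like this (a sketch of the existing attention code, not copied from the diff):

@ModuleInfo(key: "q_proj") var wq: Linear
@ModuleInfo(key: "k_proj") var wk: Linear
@ModuleInfo(key: "v_proj") var wv: Linear
@ModuleInfo(key: "o_proj") var wo: Linear

Declaring the properties as UnaryLayer means either a plain Linear or a LoRA wrapper can be dropped in without touching the attention code.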
// - make a Quantized protocol that provides the groupSize and bits
// - make the QuantizedLinear shape produce the expanded shape
// - make `items()` open
// - make `updateModule(key:_:)` open
Some ideas that I think I should implement regardless of the outcome of this exploration:
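A minimal sketch of the first bullet, assuming QuantizedLinear already exposes groupSize and bits; the protocol name and the retroactive conformance are hypothetical:

import MLX
import MLXNN

// Hypothetical protocol so callers can read quantization parameters generically.
public protocol Quantized {
    var groupSize: Int { get }
    var bits: Int { get }
}

// Assumed conformance -- QuantizedLinear already stores these properties.
extension QuantizedLinear: Quantized {}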
// - see items() and updateModule()

// TODO: make UnaryLayer extend Module
public protocol UnaryLayer2: Module {
Here just so the types work out -- every UnaryLayer is also a Module.
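The full declaration is presumably just the existing UnaryLayer requirement plus the Module constraint; the body below is assumed, not copied from the diff:

// TODO: make UnaryLayer extend Module
public protocol UnaryLayer2: Module {
    func callAsFunction(_ x: MLXArray) -> MLXArray
}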
// TODO: in LoRALinear this is
// public static func from(linear: Linear, rank: Int = 8) -> LoRA
public convenience init(linear: Linear, rank: Int = 8, scale: Float = 20.0) {
So rather than calling:
qProj = LoRALinear.from(linear: qProj)
you would:
qProj = LoRA(linear: qProj)
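A plausible body for that initializer, following the initialization used by the existing LoRALinear.from(linear:rank:); the designated init(adapts:loraA:loraB:scale:) it calls is an assumption for this sketch:

public convenience init(linear: Linear, rank: Int = 8, scale: Float = 20.0) {
    let outputDimensions = linear.weight.dim(0)
    let inputDimensions = linear.weight.dim(1)

    // same initialization as the existing LoRALinear: small uniform A, zero B
    let bound = 1 / Float(inputDimensions).squareRoot()
    self.init(
        adapts: linear,
        loraA: MLXRandom.uniform(low: -bound, high: bound, [inputDimensions, rank]),
        loraB: MLXArray.zeros([rank, outputDimensions]),
        scale: scale)
}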
// produce a merged view of properties (flatten LoRA into adapts)
override func items() -> ModuleItems {
    var result = adapts.items()
    for (key, value) in super.items() {
        if key == "adapts" { continue }
        result[key] = value
    }
    return result
}
// forward module updates -> adapts
func updateModule(key: String, _ value: Any) throws {
    try adapts.updateModule(key: key, value)
}
This doesn't work as-is because these methods can't be overridden (see TODOs).
The idea is that the LoRA composition would flatten itself into what it adapts -- the Linear and LoRA keys would be merged for the purpose of updates, etc.
As per the notes, noGrad would also need to be overridable (it is a property with storage and cannot be overridden right now).
I think this is necessary for a couple reasons:
- generally weight saving and loading doesn't want to see the adaptor layer (see the key sketch after this list)
- this matches the typical shape of a graph with LoRA (mixed into the linear)
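To illustrate (key names assumed from the existing LoRALinear): with flattening, a checkpoint for an adapted q_proj would contain keys like

q_proj.weight      // from the wrapped Linear
q_proj.lora_a      // from the LoRA adaptor
q_proj.lora_b

rather than q_proj.adapts.weight, which is what the naive nesting would produce and which existing checkpoints would not expect.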
But I think this forwarding is the worst part of it. We could potentially make a subclass of Module that encapsulates this if it becomes a common thing. That would help, but I suspect there would be complications.
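A hypothetical sketch of such a base class, assuming the TODOs land and items() / updateModule(key:_:) become open (the class name and shape are invented for illustration; this does not compile against MLX Swift today):

import MLXNN

// Hypothetical base class: an adaptor Module that flattens its wrapped layer
// into its own key space so checkpoints and updates see the merged view.
open class AdaptorModule: Module {
    // the wrapped layer; subclasses decide how (and whether) to call it
    public let adapts: Module

    public init(adapts: Module) {
        self.adapts = adapts
        super.init()
    }

    // merged view of properties: the wrapped layer's items plus our own,
    // minus the "adapts" nesting
    open override func items() -> ModuleItems {
        var result = adapts.items()
        for (key, value) in super.items() where key != "adapts" {
            result[key] = value
        }
        return result
    }

    // forward structural updates to the wrapped layer
    open override func updateModule(key: String, _ value: Any) throws {
        try adapts.updateModule(key: key, value)
    }
}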
// TODO: this requires knowledge of the innards of the adapted layer so it
// is specific to Linear (and QuantizedLinear).
public func toLinear(deQuantize: Bool = false) -> Linear {
The fuse operation requires knowledge of how to combine the LoRA weights with the target. Type-wise this could easily return a UnaryLayer, but it has to understand the implementation in order to fuse.
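A rough sketch of the fuse for the plain Linear case (ignoring quantization and dtype handling); since y = x W^T + scale * (x A) B, the fused weight is W' = W + scale * (A B)^T:

public func toLinear(deQuantize: Bool = false) -> Linear {
    // only the plain Linear case is sketched here
    guard let linear = adapts as? Linear else {
        fatalError("fuse is only sketched for Linear")
    }
    // W' = W + scale * B^T A^T
    let fused = linear.weight + matmul(scale * loraB.T, loraA.T)
    return Linear(weight: fused, bias: linear.bias)
}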
// TODO: let y = super.callAsFunction(x.asType(scales.dtype)) -- ignoring the asType here
let y = adapts(x)
let z = matmul(matmul(x, self.loraA), self.loraB)
return y + scale * z
The nicest part of it -- since LoRA is an adaptor we can easily express it via composition.
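A small usage sketch (dimensions and rank are arbitrary, and it assumes the convenience init shown earlier):

import MLX
import MLXNN
import MLXRandom

let base = Linear(512, 512)
let adapted = LoRA(linear: base, rank: 8)

let x = MLXRandom.normal([1, 512])
let y = adapted(x)  // = adapts(x) + scale * matmul(matmul(x, loraA), loraB)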
}
/// LoRA layer that can wrap any UnaryLayer
class LoRA: Module, UnaryLayer2 {
For reference, here is the current implementation of LoRA:
- see ml-explore/mlx-swift-examples#167 -- also fixes an issue where quantize() could quantize a quantized layer!
Not to be merged -- just an exploration of composition and LoRA