Accelerate

MLTensor computation took more time than expected.

    func testMLTensor() {
        let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self)
        let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self)
        for _ in 0...50 {
            let t = Date()
            let x = (t1 * t2)
            print("MLTensor", t.timeIntervalSinceNow * 1000, "ms")
        }
    }

    testMLTensor()

The above code took more time than expected, especially in the early iterations.
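A measurement like the one above can be sketched so that each iteration is timed and printed separately, which makes it easy to compare the first iterations against the later ones. This is a minimal sketch, not a diagnosis of the slowdown. It assumes that MLTensor operations are dispatched asynchronously and that the result is only materialized when read back with `shapedArray(of:)`, so the read-back is included in the timed region. The names `milliseconds(from:)` and `timeMLTensorMultiply(iterations:)` are hypothetical helpers introduced for illustration.

```swift
import CoreML
import Foundation

/// Converts a `Duration` to fractional milliseconds (hypothetical helper).
func milliseconds(from duration: Duration) -> Double {
    let c = duration.components
    return Double(c.seconds) * 1000 + Double(c.attoseconds) / 1e15
}

/// Times the broadcasted product of a [2000, 1] and a [1, 3000] tensor.
/// Assumption: reading the result back forces the computation to finish,
/// so the measured time covers the actual work and not only the dispatch.
@available(macOS 15.0, iOS 18.0, tvOS 18.0, watchOS 11.0, visionOS 2.0, *)
func timeMLTensorMultiply(iterations: Int = 50) async {
    // Draw a separate random value for every element.
    let a = (0..<2000).map { _ in Float.random(in: 0.0...10.0) }
    let b = (0..<3000).map { _ in Float.random(in: 0.0...10.0) }
    let t1 = MLTensor(shape: [2000, 1], scalars: a, scalarType: Float.self)
    let t2 = MLTensor(shape: [1, 3000], scalars: b, scalarType: Float.self)

    let clock = ContinuousClock()
    for i in 0..<iterations {
        let start = clock.now
        let product = t1 * t2
        // Materialize the result inside the timed region.
        _ = await product.shapedArray(of: Float.self)
        let elapsed = clock.now - start
        print("MLTensor iteration \(i):", milliseconds(from: elapsed), "ms")
    }
}
```

From an asynchronous context it can be called as `await timeMLTensorMultiply()`. If the first few iterations are much slower than the rest, that points to one-time setup cost and not to the multiplication itself.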

Machine Learning & AI Core ML ML Compute Accelerate

0

594

Aug ’24

Documentation and usage of BNNS.NormalizationLayer

Hello everybody,

I am running into an error with BNNS.NormalizationLayer. It appears to work only with .vector; matrix shapes throw layerApplyFail during training. Inference doesn't throw, but the output stays the same.

How do I correctly use BNNS.NormalizationLayer with matrix shapes?
How do I debug the layerApplyFail exception?

Thanks

    let array: [Float32] = [
        01, 02, 03, 04, 05, 06,
        07, 08, 09, 10, 11, 12,
        13, 14, 15, 16, 17, 18,
    ]

    // let inputShape: BNNS.Shape = .vector(6 * 3) // works
    let inputShape: BNNS.Shape = .matrixColumnMajor(6, 3)

    let input = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)
    let output = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)

    let beta = BNNSNDArrayDescriptor.allocate(repeating: Float32(0), shape: inputShape, batchSize: 1)
    let gamma = BNNSNDArrayDescriptor.allocate(repeating: Float32(1), shape: inputShape, batchSize: 1)

    let activation: BNNS.ActivationFunction = .identity

    let layer = BNNS.NormalizationLayer(type: .layer(normalizationAxis: 0), input: input, output: output, beta: beta, gamma: gamma, epsilon: 1e-12, activation: activation)!

    let layerInput = BNNSNDArrayDescriptor.allocate(initializingFrom: array, shape: inputShape)
    let layerOutput = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)

    // try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .inference) // No throw
    try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .training)

    _ = layerOutput.makeArray(of: Float32.self) // All zeros when .inference

Machine Learning & AI General Accelerate

1

0

853

Jul ’24

Performant alternative to scaling a CIImage / PixelBuffer

Hey, I’m building a camera app where I am applying real-time effects to the viewfinder. One of those effects is a variable blur, so to improve performance I am scaling down the input image using CIFilter.lanczosScaleTransform(). This works fine and runs at 30 FPS, but when running the Metal profiler I can see that the scaling transforms use a lot of GPU time, almost as much as the variable blur. Is there a more efficient way to do this?

The simplified chain is like this:

1. Scale down viewFinder CVPixelBuffer (CIFilter.lanczosScaleTransform)
2. Scale up depthMap CVPixelBuffer to match viewFinder size (CIFilter.lanczosScaleTransform)
3. Create CIImages from both CVPixelBuffers
4. Apply VariableDepthBlur (CIFilter.maskedVariableBlur)
5. Scale up final image to Metal view size (CIFilter.lanczosScaleTransform)
6. Render CIImage to an MTKView using CIRenderDestination

From some research, I wonder if scaling the CVPixelBuffer using the Accelerate framework would be faster? Also, instead of scaling the final image, perhaps I could offload this to the Metal view?

Any pointers greatly appreciated!
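One possible variation on the scaling steps in the chain above is to replace the Lanczos filter with a plain affine transform and linear sampling, which takes fewer samples per output pixel. The sketch below is a swapped-in technique, not the poster's method, and whether it is actually faster in this pipeline has to be checked in the profiler. The helper names `fitScale(from:to:)` and `scaledImage(from:toFit:)` are hypothetical.

```swift
import CoreGraphics
import CoreImage
import CoreVideo

/// Uniform scale factor that makes `source` fit inside `target`
/// while preserving the aspect ratio (hypothetical helper).
func fitScale(from source: CGSize, to target: CGSize) -> CGFloat {
    guard source.width > 0, source.height > 0 else { return 1 }
    return min(target.width / source.width, target.height / source.height)
}

/// Wraps a pixel buffer in a CIImage and scales it with an affine
/// transform and linear sampling instead of CIFilter.lanczosScaleTransform.
/// Assumption: the lower resampling quality is acceptable because the
/// image is blurred afterwards.
func scaledImage(from pixelBuffer: CVPixelBuffer, toFit target: CGSize) -> CIImage {
    let image = CIImage(cvPixelBuffer: pixelBuffer)
    let scale = fitScale(from: image.extent.size, to: target)
    return image
        .samplingLinear()
        .transformed(by: CGAffineTransform(scaleX: scale, y: scale))
}
```

The same helper could serve both the viewfinder downscale and the depth map upscale, since only the target size differs. Linear sampling can alias when downscaling by a large factor, so the output should be compared visually against the Lanczos version.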

Media Technologies Photos & Camera Metal Camera Accelerate Photos and Imaging

2

0

1k

Jul ’24

Scipy problems with OpenBLAS and Accelerate

I'm using an M1 Pro and have successfully installed NumPy with Accelerate, and it really speeds up my programs. I also ran np.test() to check correctness, and every test passed.

However, I can't install SciPy with Accelerate, since the official documentation says the LAPACK in Accelerate is too old. I can't even find a SciPy build that passes scipy.test(). I tried the following:

    conda install numpy 'libblas=*=*accelerate'
    conda install scipy

np.test() has failures, and sp.test() can't even finish.

    conda install numpy 'libblas=*=*openblas'
    conda install scipy

Both np.test() and sp.test() can finish, but with many failures. I believe the bugs are due to Conda.

    pip install --no-binary :all: --no-use-pep517 numpy
    pip install scipy

np.test() has no failures and runs fast; sp.test() uses OpenBLAS and has 3 failures. This is the best version I have found.

So my question is: can we find a reliable version of SciPy on M1? Considering the popularity of SciPy, I don't think that is too much to ask. And a question for Apple: is there really a plan to upgrade the LAPACK in Accelerate?

Developer Tools & Services Xcode Developer Tools Accelerate Mac Apple Silicon

2

0

2.6k

Jan ’25

Post

Replies

Boosts

Views

Activity


Posts under Accelerate tag
