Accelerate

RSS for tag

Make large-scale mathematical computations and image calculations with high-performance, energy-efficient computation using Accelerate.

Posts under Accelerate tag

24 Posts
Sort by:

Post

Replies

Boosts

Views

Activity

MLTensor computation took more time than expected.
func testMLTensor() { let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self) let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self) for _ in 0...50 { let t = Date() let x = (t1 * t2) print("MLTensor", t.timeIntervalSinceNow * 1000, "ms") } } testMLTensor() The above code took more time than expected, especially in the early stage of iteration.
0
0
594
Aug ’24
Documentation and usage of BNNS.NormalizationLayer
Hello everybody, I am running into an error with BNNS.NormalizationLayer. It appears to only work with .vector, and matrix shapes throws layerApplyFail during training. Inference doesn't throw but the output stays the same. How to correctly use BNNS.NormalizationLayer with matrix shapes? How to debug layerApplyFail exception? Thanks let array: [Float32] = [ 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15, 16, 17, 18, ] // let inputShape: BNNS.Shape = .vector(6 * 3) // works let inputShape: BNNS.Shape = .matrixColumnMajor(6, 3) let input = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape) let output = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape) let beta = BNNSNDArrayDescriptor.allocate(repeating: Float32(0), shape: inputShape, batchSize: 1) let gamma = BNNSNDArrayDescriptor.allocate(repeating: Float32(1), shape: inputShape, batchSize: 1) let activation: BNNS.ActivationFunction = .identity let layer = BNNS.NormalizationLayer(type: .layer(normalizationAxis: 0), input: input, output: output, beta: beta, gamma: gamma, epsilon: 1e-12, activation: activation)! let layerInput = BNNSNDArrayDescriptor.allocate(initializingFrom: array, shape: inputShape) let layerOutput = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape) // try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .inference) // No throw try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .training) _ = layerOutput.makeArray(of: Float32.self) // All zeros when .inference
1
0
853
Jul ’24
Performant alternative to scaling a CIImage / PixelBuffer
Hey, I’m building a camera app where I am applying real time effects to the view finder. One of those effects is a variable blur, so to improve performance I am scaling down the input image using CIFilter.lanczosScaleTransform(). This works fine and runs at 30FPS, but when running the metal profiler I can see that the scaling transforms use a lot of GPU time, almost as much as the variable blur. Is there a more efficient way to do this? The simplified chain is like this: Scale down viewFinder CVPixelBuffer (CIFilter.lanczosScaleTransform) Scale up depthMap CVPixelBuffer to match viewFinder size (CIFilter.lanczosScaleTransform) Create CIImages from both CVPixelBuffers Apply VariableDepthBlur (CIFilter.maskedVariableBlur) Scale up final image to metal view size (CIFilter.lanczosScaleTransform) Render CIImage to a MTKView using CIRenderDestination From some research, I wonder if scaling the CVPixelBuffer using the accelerate framework would be faster? Also, Instead of scaling the final image, perhaps I could offload this to the metal view? Any pointers greatly appreciated!
2
0
1k
Jul ’24
Scipy problems with OpenBLAS and Accelerate
I'm using M1pro and have successfully installed Numpy with Accelerate following, and it really speedup my programs. I also ran np.test() to check the correctness and every test passed. However, I can't install Scipy with Accelerate, since the official document said Accelerate has a LAPACK of too old version. I can't even find a scipy that can pass scipy.test(). I tried the codes below: conda install numpy 'libblas=*=*accelerate' conda install scipy np.test() as fails, sp.test() can't even finish conda install numpy 'libblas=*=*openblas' conda install scipy Both np.test() and sp.test() can finish, but with many failures. I believe the bugs are due to Conda. pip install --no-binary :all: --no-use-pep517 numpy pip install scipy np.test() has no failure and went fast, sp.test() uses OpenBLAS and has 3 failures. This is the best version I have found. So my question is: can we find a reliable version of scipy on M1? Considering the popularity of scipy, I think it's not a high-living expectation. And a question for Apple: is there really a plan to upgrade the LAPACK in Accelerate?
2
0
2.6k
Jan ’25