func testMLTensor() {
let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self)
let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self)
for _ in 0...50 {
let t = Date()
let x = (t1 * t2)
print("MLTensor", t.timeIntervalSinceNow * 1000, "ms")
}
}
testMLTensor()
The above code took more time than expected, especially in the early stage of iteration.
Accelerate
RSS for tagMake large-scale mathematical computations and image calculations with high-performance, energy-efficient computation using Accelerate.
Posts under Accelerate tag
24 Posts
Sort by:
Post
Replies
Boosts
Views
Activity
Hello everybody,
I am running into an error with BNNS.NormalizationLayer. It appears to only work with .vector, and matrix shapes throws layerApplyFail during training. Inference doesn't throw but the output stays the same.
How to correctly use BNNS.NormalizationLayer with matrix shapes? How to debug layerApplyFail exception?
Thanks
let array: [Float32] = [
01, 02, 03, 04, 05, 06,
07, 08, 09, 10, 11, 12,
13, 14, 15, 16, 17, 18,
]
// let inputShape: BNNS.Shape = .vector(6 * 3) // works
let inputShape: BNNS.Shape = .matrixColumnMajor(6, 3)
let input = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)
let output = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)
let beta = BNNSNDArrayDescriptor.allocate(repeating: Float32(0), shape: inputShape, batchSize: 1)
let gamma = BNNSNDArrayDescriptor.allocate(repeating: Float32(1), shape: inputShape, batchSize: 1)
let activation: BNNS.ActivationFunction = .identity
let layer = BNNS.NormalizationLayer(type: .layer(normalizationAxis: 0), input: input, output: output, beta: beta, gamma: gamma, epsilon: 1e-12, activation: activation)!
let layerInput = BNNSNDArrayDescriptor.allocate(initializingFrom: array, shape: inputShape)
let layerOutput = BNNSNDArrayDescriptor.allocateUninitialized(scalarType: Float32.self, shape: inputShape)
// try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .inference) // No throw
try layer.apply(batchSize: 1, input: layerInput, output: layerOutput, for: .training)
_ = layerOutput.makeArray(of: Float32.self) // All zeros when .inference
Hey, I’m building a camera app where I am applying real time effects to the view finder. One of those effects is a variable blur, so to improve performance I am scaling down the input image using CIFilter.lanczosScaleTransform(). This works fine and runs at 30FPS, but when running the metal profiler I can see that the scaling transforms use a lot of GPU time, almost as much as the variable blur. Is there a more efficient way to do this?
The simplified chain is like this:
Scale down viewFinder CVPixelBuffer (CIFilter.lanczosScaleTransform)
Scale up depthMap CVPixelBuffer to match viewFinder size (CIFilter.lanczosScaleTransform)
Create CIImages from both CVPixelBuffers
Apply VariableDepthBlur (CIFilter.maskedVariableBlur)
Scale up final image to metal view size (CIFilter.lanczosScaleTransform)
Render CIImage to a MTKView using CIRenderDestination
From some research, I wonder if scaling the CVPixelBuffer using the accelerate framework would be faster? Also, Instead of scaling the final image, perhaps I could offload this to the metal view?
Any pointers greatly appreciated!
Topic:
Media Technologies
SubTopic:
Photos & Camera
Tags:
Metal
Camera
Accelerate
Photos and Imaging
I'm using M1pro and have successfully installed Numpy with Accelerate following, and it really speedup my programs. I also ran np.test() to check the correctness and every test passed.
However, I can't install Scipy with Accelerate, since the official document said Accelerate has a LAPACK of too old version. I can't even find a scipy that can pass scipy.test(). I tried the codes below:
conda install numpy 'libblas=*=*accelerate'
conda install scipy
np.test() as fails, sp.test() can't even finish
conda install numpy 'libblas=*=*openblas'
conda install scipy
Both np.test() and sp.test() can finish, but with many failures. I believe the bugs are due to Conda.
pip install --no-binary :all: --no-use-pep517 numpy
pip install scipy
np.test() has no failure and went fast, sp.test() uses OpenBLAS and has 3 failures. This is the best version I have found.
So my question is: can we find a reliable version of scipy on M1? Considering the popularity of scipy, I think it's not a high-living expectation.
And a question for Apple: is there really a plan to upgrade the LAPACK in Accelerate?
Topic:
Developer Tools & Services
SubTopic:
Xcode
Tags:
Developer Tools
Accelerate
Mac
Apple Silicon