6.2.1 About analyzing code for parallelization

When you have identified the most compute-intensive parts of your application, analyze that code to determine whether it can run in parallel.

Parallelizing code can present the following degrees of difficulty:
Straightforward
Parallelizing the code requires small modifications.
Difficult
Parallelizing the code requires complex modifications. If you are using work-items in place of loop iterations, compute variables based on the value of the global ID rather than using a loop counter.
Difficult and includes dependencies
Parallelizing the code requires complex modifications and the use of techniques to avoid dependencies. You can compute values per frame, perform computations in multiple stages, or pre-compute values to remove dependencies.
Appears to be impossible
If parallelizing the code appears to be impossible, this only means that a particular code implementation cannot be parallelized.
The purpose of code is to perform a function. There might be different algorithms that perform the same function but work in different ways. Some of these might be parallelizable.
Investigate alternative algorithms and data structures to the ones that the code uses. These might make parallelization possible.
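As a minimal sketch of two of the techniques listed above, the following plain C code contrasts a loop that carries a running value between iterations with a per-work-item function that computes the same value directly from its index. The names fill_sequential, fill_work_item, and fill_all_work_items are hypothetical and exist only for illustration. The gid parameter stands in for the value that get_global_id(0) would return inside an OpenCL kernel, and the host-side loop only imitates the launch of n work-items. A real kernel and its host code would look different.

```c
#include <stddef.h>

/* Sequential version: 'value' is carried from one iteration to the next,
 * so iteration i cannot start before iteration i - 1 has finished. */
void fill_sequential(float *out, size_t n, float start, float step)
{
    float value = start;
    for (size_t i = 0; i < n; i++) {
        out[i] = value;
        value += step; /* loop-carried dependency */
    }
}

/* Body of one hypothetical work-item. 'gid' plays the role of the
 * global ID. The value is computed with a formula from gid alone,
 * so no work-item depends on the result of another. */
void fill_work_item(float *out, size_t gid, float start, float step)
{
    out[gid] = start + (float)gid * step;
}

/* Stand-in for launching n work-items. The calls are independent,
 * so they could run in any order or at the same time. */
void fill_all_work_items(float *out, size_t n, float start, float step)
{
    for (size_t gid = 0; gid < n; gid++) {
        fill_work_item(out, gid, start, step);
    }
}
```

Because each work-item writes only out[gid] and reads no other element, the order of execution does not affect the result. Note that repeated addition and a single multiplication can round differently in floating point, so the two versions are only guaranteed to match exactly when the values involved are exactly representable.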
Related concepts
6.3.1 Use the global ID instead of the loop counter
6.3.2 Compute values in a loop with a formula instead of using counters
6.3.3 Compute values per frame
6.3.4 Perform computations with dependencies in multiple-passes
6.3.5 Pre-compute values to remove dependencies
6.4 Using parallel processing with non-parallelizable code
Non-Confidential ARM 100614_0300_00_en
Copyright © 2012, 2013, 2015, 2016 ARM. All rights reserved.