The disclosure provides a graph optimization method and apparatus for neural network computation. The graph optimization method includes the following steps: S1: converting a computation graph; S2: allocating a register; S3: defining a route selector for a redefined variable; S4: solving the route selector for the redefined variable; S5: defining a criterion for inserting the route selector for the redefined variable into a node; S6: analyzing a dominating edge set of the node for the redefined variable; S7: inserting the route selector for the redefined variable; and S8: renaming the redefined variable. The disclosure solves the problem of selecting the correct definition of a redefined variable when a node containing the redefined variable in a compile-time computation graph can be reached along multiple paths of computation flow, reduces memory cost, and promotes the practical application of deep neural network models.
1. A graph compiling and optimization method for neural network computation, comprising the following steps:
S1: converting a computation graph: converting a neural network computation graph into a computation graph in a global single-node defining mode;
S2: allocating a register of a computer memory: allocating the register of the computer memory for a variable at a node of the computation graph;
S3: defining a route selector for a redefined variable, the route selector selecting a correct definition of the redefined variable at the node of the computation graph according to the path along which the execution flow passes in the operation phase of the computation graph;
S4: solving the route selector for the redefined variable, comprising:
S41: inserting a copy node of the correct definition of the redefined variable at a non-key edge of the computation graph: inserting, at the non-key edge of the computation graph, a copy node that assigns the correct definition of the redefined variable to an output variable of the route selector;
S42: decomposing a key edge of the computation graph: adding a blank node at the key edge of the computation graph;
S43: inserting the copy node of the correct definition of the redefined variable at the key edge of the computation graph: inserting, at the position of the predecessor node of the blank node added in the step S42, a copy node that assigns the correct definition of the redefined variable to the output variable of the route selector; and
S44: removing the route selector inserted at the junction of multiple paths of computation flow in the computation graph: once the route selector for the correct definition of the redefined variable has been de-structured by the step S42 and the step S43, a correct definition node of the redefined variable has been inserted into each predecessor node of the junction node of the multiple paths of computation flow, so the route selector can be removed;
S5: defining a criterion for inserting the route selector for the redefined variable into a node;
S6: analyzing a dominating edge set of a node for the redefined variable;
S7: inserting the route selector for the redefined variable; and
S8: renaming the redefined variable.
2. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S1 specifically comprises the following sub-steps:
S11: recording the name of each variable defined at a start node of the neural network computation graph, where all the variables are initially located; and
S12: traversing the neural network computation graph in topological order and, whenever a successor node redefines a variable, generating a new name for the variable, to obtain the computation graph in the global single-node defining mode.
3. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S2 specifically comprises the following sub-steps:
S21: analyzing a life cycle of the variable at the node of the computation graph; and
S22: allocating the register for the variable according to the life cycle, and when the life cycles of variables do not conflict, enabling the variables with non-conflicting life cycles to multiplex the same register.
4. The graph compiling and optimization method for neural network computation according to claim 1, wherein in the step S3, in a computation graph intermediate representation, when the redefined variable faces multiple paths of computation flow, the route selector is inserted at a junction of the multiple paths of computation flow of the redefined variable, and the correct definition of the redefined variable is matched by utilizing the route selector.
5. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S3 of defining the route selector for the redefined variable comprises the following definitions:
defining a condition of inserting the route selector;
defining a dominance attribute of the computation graph; and
defining the rigorous computation graph.
6. The graph compiling and optimization method for neural network computation according to claim 5, wherein defining the condition of inserting the route selector specifically is that: when predecessor nodes of a junction node of multiple paths of computation flow of the computation graph constitute two or more different sub-graphs, each of the sub-graphs comprises a definition node of the redefined variable, and a condition 1 and a condition 2 are simultaneously satisfied, the route selector for the redefined variable is inserted for the definition node of the redefined variable of the computation graph;
the condition 1 is that connecting edges exist between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and the connecting edges have a unique common node, which is the junction node of multiple paths of computation flow of the computation graph; and
the condition 2 is that, in a single execution of the computation graph, the execution flow cannot simultaneously pass through the connecting edges between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and can only select one of the connecting edges.
7. The graph compiling and optimization method for neural network computation according to claim 5, wherein defining the dominance attribute of the computation graph specifically is that: all paths along which the execution flow of the computation graph flows from a root node of the computation graph to a node pass through the junction node of multiple paths of computation flow of the computation graph.
8. The graph compiling and optimization method for neural network computation according to claim 5, wherein defining the rigorous computation graph specifically is that:
for a node without the route selector, a definition of the redefined variable certainly exists in a predecessor node and dominates the node; and
for a node with the route selector, a plurality of definitions of the redefined variable certainly exist, and each corresponding definition of the redefined variable dominates the predecessor node corresponding to the node with the route selector.
9. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S5 specifically comprises the following sub-steps:
S51: defining that a node V1 rigorously dominates a node V2, wherein the node V1 dominates the node V2, the node V1 is a predecessor node of the node V2, and V1≠V2; and
S52: defining a dominating edge of the node V1, the dominating edge comprising a set of all nodes Vi that satisfy the following conditions: the node V1 dominates a predecessor node of the node Vi, and the node V1 does not rigorously dominate the node Vi.
10. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S6 specifically comprises the following sub-steps:
S61: analyzing the dominating edge of the start node comprising the redefined variable, an insertion position of the route selector for the redefined variable at any node being the dominating edge set of the node; and
S62: iterating over successor nodes of the start node until no node requires the route selector for the redefined variable.
11. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S7 specifically comprises: when a node comprises the correct definition of any one redefined variable, inserting one route selector for the redefined variable at each node in the dominating edge set of the any one redefined variable.
12. The graph compiling and optimization method for neural network computation according to claim 1, wherein the step S8 specifically comprises: renaming a variable output by the inserted route selector for the redefined variable.
13. A graph compiling and optimization apparatus for neural network computation, comprising a memory and one or more processors, an executable code being stored in the memory, and the one or more processors being configured to implement the graph compiling and optimization method for neural network computation according to claim 1 when executing the executable code.
14. A non-transitory computer readable storage medium, storing a program, wherein when the program is executed by a processor, the graph compiling and optimization method for neural network computation according to claim 1 is implemented.
The present application claims priority to Chinese Patent Application No. 202210874564.2, filed with the CNIPA on Jul. 25, 2022 and entitled "GRAPH OPTIMIZATION METHOD AND APPARATUS FOR NEURAL NETWORK COMPUTATION", the entire contents of which are incorporated herein by reference.
The disclosure herein relates to the technical field of computers based on specific computational models, and in particular to a graph compiling and optimization method and apparatus for neural network computation.
With the rapid industrial application of artificial intelligence, graph compiling and optimization technologies for deep neural network model computation have increasingly become a research hotspot in both academia and industry.
Therefore, a graph compiling and optimization method and apparatus for neural network computation are proposed.
The disclosure aims to provide a graph compiling and optimization method and apparatus for neural network computation. When a plurality of nodes in a computation graph compiled for neural network computation include a redefined variable, selecting the correct definition of the redefined variable in a multi-path computation flow graph depends on the path along which the execution flow passes in the operation phase of the computation graph. In order to compile and optimize a computation graph including a redefined variable before the computation graph is executed, the disclosure provides a computation graph intermediate representation in a global single-node defining mode and proposes an insertion criterion and an analysis method for a route selector of the correct definition of a redefined variable. The disclosure thus solves, at compile time, the problem of route selection for the correct definition of a redefined variable when a node including the redefined variable can be reached along multiple paths of computation flow. By simplifying the handling of redefined variables at the nodes of the computation graph, the graph compiling and optimization process for neural network computation is simplified and a better optimization result is obtained. For researchers and engineering users developing algorithm models, the graph optimization method and apparatus provided by the disclosure simplify the data flow of the computation graph compiled for neural network computation, reduce the overall memory cost required by tensor variables in the data flow, lower the demand of large models on hardware memory resources, and promote the practical deployment of deep neural network models.
The disclosure adopts the following technical solution:
A graph compiling and optimization method for neural network computation includes the following steps:
Further, the step S1 specifically includes the following sub-steps: S11: recording the name of each variable defined at the start node of the neural network computation graph; and S12: traversing the computation graph in topological order and, whenever a successor node redefines a variable, generating a new name for the variable, to obtain the computation graph in the global single-node defining mode. An illustrative sketch follows.
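The following Python sketch illustrates sub-steps S11 and S12 under assumed, illustrative conventions (nodes as dicts with 'defs' and 'uses' lists; versioned names of the form x_1); it is a minimal reconstruction of the single-node defining conversion, not the disclosure's own implementation. Note that at merge points the visible name depends on the path taken; the later steps S3 to S7 resolve this with route selectors.

    from collections import defaultdict

    def to_global_single_definition(nodes_in_topo_order):
        # Rename variables so that every name has exactly one defining node.
        # S11: names defined at the start node keep their base names.
        # S12: each later redefinition mints a fresh versioned name.
        version = defaultdict(int)   # base name -> number of redefinitions seen
        current = {}                 # base name -> currently visible name
        for node in nodes_in_topo_order:
            # Rewrite uses to the currently visible name of each base variable.
            node["uses"] = [current.get(v, v) for v in node["uses"]]
            renamed = []
            for v in node["defs"]:
                if v in current:                    # redefinition
                    version[v] += 1
                    new_name = f"{v}_{version[v]}"
                else:                               # first (start-node) definition
                    new_name = v
                current[v] = new_name
                renamed.append(new_name)
            node["defs"] = renamed
        return nodes_in_topo_order

    # Example: x is redefined at the second node, and later uses pick up x_1.
    nodes = [{"defs": ["x"], "uses": []},
             {"defs": ["x"], "uses": ["x"]},
             {"defs": ["y"], "uses": ["x"]}]
    to_global_single_definition(nodes)
    print([n["defs"] for n in nodes])   # [['x'], ['x_1'], ['y']]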
Further, the step S2 specifically includes the following sub-steps: S21: analyzing the life cycle of each variable at the nodes of the computation graph; and S22: allocating registers according to the life cycles, variables with non-conflicting life cycles multiplexing the same register, as sketched below.
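A minimal sketch of S21/S22, assuming each variable's life cycle is a contiguous interval over a topological ordering of the graph; variables whose intervals do not overlap multiplex one register, in the spirit of linear-scan allocation. All names and the interval encoding are illustrative assumptions.

    def allocate_registers(live_ranges):
        # live_ranges: variable -> (first_def_index, last_use_index).
        # Variables whose intervals do not overlap may share one register.
        events = sorted(live_ranges.items(), key=lambda kv: kv[1][0])
        free, active, assignment = [], [], {}    # active: (end, register)
        next_reg = 0
        for var, (start, end) in events:
            # Recycle registers whose interval ended before this one starts.
            expired = [(e, r) for (e, r) in active if e < start]
            active = [(e, r) for (e, r) in active if e >= start]
            free.extend(r for (_, r) in expired)
            if free:
                reg = free.pop()
            else:
                reg = next_reg
                next_reg += 1
            assignment[var] = reg
            active.append((end, reg))
        return assignment

    # Example: "a" dies before "c" starts, so they multiplex one register.
    print(allocate_registers({"a": (0, 2), "b": (1, 4), "c": (3, 5)}))
    # {'a': 0, 'b': 1, 'c': 0}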
Further, in the step S3, in the computation graph intermediate representation, when the redefined variable faces multiple paths of computation flow, the route selector is inserted at the junction of the multiple paths of computation flow of the redefined variable, and the correct definition of the redefined variable is matched by utilizing the route selector, as in the sketch below.
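The route selector behaves like a multiplexer over candidate definitions. The sketch below (the class name, node identifiers, and encoding are assumptions for illustration) maps each incoming edge of the junction node to the definition reaching the junction along that edge; since exactly one edge is taken in a single execution, the lookup is unambiguous.

    from dataclasses import dataclass, field

    @dataclass
    class RouteSelector:
        # Multiplexer over definitions of one redefined variable at a junction
        # node: predecessor node id -> definition arriving along that edge.
        variable: str
        candidates: dict = field(default_factory=dict)

        def select(self, taken_pred: str) -> str:
            # The execution flow arrives via exactly one connecting edge,
            # so exactly one candidate definition is chosen.
            return self.candidates[taken_pred]

    # Example: x redefined on two branches that merge; the selector picks the
    # definition matching the branch actually executed.
    phi_x = RouteSelector("x", {"V1": "x_1", "V3": "x_2"})
    print(phi_x.select("V3"))   # x_2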
Further, the step S3 of defining the route selector for the redefined variable includes the following definitions:
Further, defining the condition of inserting the route selector specifically is that: when predecessor nodes of a junction node of multiple paths of computation flow of the computation graph constitute two or more different sub-graphs, each of the sub-graphs includes a definition node of the redefined variable, and a condition 1 and a condition 2 are simultaneously satisfied, the route selector for the redefined variable is inserted for the definition node of the redefined variable of the computation graph;
The condition 1 is that connecting edges exist between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and the connecting edges have a unique common node, which is the junction node of multiple paths of computation flow of the computation graph; and
The condition 2 is that, in a single execution of the computation graph, the execution flow cannot simultaneously pass through the connecting edges between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and can only select one of the connecting edges.
Further, defining the dominance attribute of the computation graph specifically is that: all paths along which the execution flow of the computation graph flows from a root node of the computation graph to a node pass through the junction node of multiple paths of computation flow of the computation graph.
Further, defining the rigorous computation graph specifically is that: for a node without the route selector, a definition of the redefined variable certainly exists in a predecessor node and dominates the node; and for a node with the route selector, a plurality of definitions of the redefined variable certainly exist, and each corresponding definition of the redefined variable dominates the predecessor node corresponding to the node with the route selector.
Further, the step S4 specifically includes the following sub-steps: S41: inserting, at a non-key edge of the computation graph, a copy node that assigns the correct definition of the redefined variable to the output variable of the route selector; S42: decomposing a key edge of the computation graph by adding a blank node at the key edge; S43: inserting the copy node at the key edge of the computation graph, at the position of the predecessor node of the blank node added in the step S42; and S44: removing the route selector inserted at the junction of multiple paths of computation flow once it has been de-structured by the steps S42 and S43, the correct definition node of the redefined variable having been inserted into each corresponding predecessor node. A sketch of this de-structuring follows.
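A sketch of the de-structuring in S41 to S44, assuming key edges have already been decomposed (S42) and reusing the candidate mapping of the route-selector sketch above; the graph encoding and names are illustrative. Each predecessor of the junction node receives a copy statement assigning the definition that reaches the junction along its edge to the selector's output variable, after which the selector itself can be removed (S44).

    def destructure_route_selector(graph, junction, var, candidates):
        # graph: node -> list of successors (key edges already decomposed).
        # candidates: predecessor -> definition of `var` reaching `junction`
        # along that edge (cf. the RouteSelector sketch above).
        # Returns, per predecessor, the copy statement (dst, src) to append.
        preds = [n for n, succs in graph.items() if junction in succs]
        out_var = f"{var}_out"           # variable produced by the selector
        return {p: (out_var, candidates[p]) for p in preds}

    # Example: the copy x_out <- x_1 lands in V1 and x_out <- x_2 in V3;
    # the selector at the junction V5 can then be removed.
    g = {"V1": ["V5"], "V3": ["V5"], "V5": []}
    print(destructure_route_selector(g, "V5", "x", {"V1": "x_1", "V3": "x_2"}))
    # {'V1': ('x_out', 'x_1'), 'V3': ('x_out', 'x_2')}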
Further, the step S5 specifically includes the following sub-steps: S51: defining that a node V1 rigorously dominates a node V2, wherein the node V1 dominates the node V2, the node V1 is a predecessor node of the node V2, and V1≠V2; and S52: defining the dominating edge of the node V1 as the set of all nodes Vi such that the node V1 dominates a predecessor node of the node Vi and the node V1 does not rigorously dominate the node Vi.
Further, the step S6 specifically includes the following sub-steps: S61: analyzing the dominating edge of the start node comprising the redefined variable, the insertion positions of the route selector for the redefined variable being the dominating edge set of the node; and S62: iterating over successor nodes until no node requires the route selector for the redefined variable.
Further, the step S7 specifically includes: when a node includes the correct definition of any one redefined variable, inserting one route selector for the redefined variable at each node in the dominating edge set of the any one redefined variable.
Further, the step S8 specifically includes: renaming the variable output by the inserted route selector for the redefined variable.
The disclosure further provides a graph compiling and optimization apparatus for neural network computation, including a memory and one or more processors, an executable code being stored in the memory, and the one or more processors being configured to implement the graph compiling and optimization method for neural network computation according to any one of the embodiments above when executing the executable code.
The disclosure further provides a computer readable storage medium storing a program. When the program is executed by a processor, the graph compiling and optimization method for neural network computation according to any one of the embodiments above is implemented.
The disclosure has the following beneficial effects. The disclosure solves the problem of selecting the correct definition of a redefined variable included in a plurality of nodes of a computation graph for neural network computation when the redefined variable faces a multi-path computation flow graph; in a conventional method, the computation graph must first be executed, and the correct definition corresponding to the redefined variable is selected according to the path through which the execution flow actually passes. The disclosure provides a compile-time graph optimization method for a computation graph including a redefined variable, provides a computation graph intermediate representation in a global single-node defining mode, and solves the route selection problem for the correct definition of the redefined variable during the compiling period of the computation graph. The structure of the data flow of the computation graph compiled for neural network computation is simplified, the overall memory cost required by tensor variables in the data flow is reduced, and the demand of large models on hardware memory resources is lowered. The data flow method for neural network computation provided by the disclosure improves the computation efficiency of the entire computation graph and saves hardware and time cost.
The description below of at least one exemplary embodiment is merely illustrative and is not intended to limit the disclosure or its application or use. Based on the embodiments of the disclosure, all other embodiments obtained by those of ordinary skill in the art without inventive work shall fall within the scope of protection of the disclosure.
The embodiments of the disclosure are described in detail below with reference to the accompanying drawings.
In the computation graph intermediate representation, when the redefined variable faces multiple paths of computation flow, the route selector is inserted at the junction of the multiple paths of computation flow of the redefined variable, and the correct definition of the redefined variable is matched by utilizing the route selector;
Defining the route selector for the redefined variable includes the following definitions:
A condition of inserting the route selector is defined;
When predecessor nodes of a junction node of multiple paths of computation flow of the computation graph constitute two or more different sub-graphs, each of the sub-graphs includes a definition node of the redefined variable, and a condition 1 and a condition 2 are simultaneously satisfied, the route selector for the redefined variable is inserted for the definition node of the redefined variable of the computation graph;
The condition 1 is that connecting edges exist between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph and the connecting edges have a unique common node which is the junction node of multiple paths of computation flow of the computation graph; and
The condition 2 is that, in a single execution of the computation graph, the execution flow cannot simultaneously pass through the connecting edges between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and can only select one of the connecting edges.
A dominance attribute of the computation graph is defined; and
All paths along which the execution flow of the computation graph flows from a root node of the computation graph to a node pass through the junction node of multiple paths of computation flow of the computation graph.
The rigorous computation graph is defined;
For a node without the route selector, a definition of the redefined variable certainly exists in a predecessor node to dominate the node; and
For a node with the route selector, a plurality of definitions of the redefined variable certainly exist and a corresponding definition of the redefined variable dominates a predecessor node corresponding to the node with the route selector.
With reference to the corresponding drawings, the route selector is described as follows:
The route selector has the semantics of a multiplexer;
In the computation graph intermediate representation, when the redefined variable faces multiple paths of computation flow, the route selector is inserted at the junction of the multiple paths of computation flow of the redefined variable, and the correct definition of the redefined variable is matched by utilizing the route selector; and
For the redefined variable in a node of the computation graph, in the computation graph intermediate representation, the selection of the correct route when the redefined variable faces a multi-path computation flow graph depends on the path along which the execution flow passes in the operation phase of the computation graph. Therefore, in order to optimize the computation graph before it is executed, the route selector needs to be inserted at the junction of the multiple paths of computation flow of the redefined variable, and the correct definition of the redefined variable is matched by utilizing the route selector.
Defining the route selector for the redefined variable includes the following definitions:
A condition of inserting the route selector is defined;
When predecessor nodes of a junction node of multiple paths of computation flow of the computation graph constitute two or more different sub-graphs, each of the sub-graphs includes a definition node of the redefined variable, and a condition 1 and a condition 2 are simultaneously satisfied, the route selector for the redefined variable is inserted for the definition node of the redefined variable of the computation graph;
The condition 1 is that connecting edges exist between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph and the connecting edges have a unique common node which is the junction node of multiple paths of computation flow of the computation graph; and
The condition 2 is that, in a single execution of the computation graph, the execution flow cannot simultaneously pass through the connecting edges between the different sub-graphs and the junction node of multiple paths of computation flow of the computation graph, and can only select one of the connecting edges.
In the example shown in the corresponding drawing, the condition 1 and the condition 2 are satisfied as follows:
Condition 1: a sub-graph 1 and a sub-graph 2 exist, the definition nodes V1 and V3 of the redefined variable are respectively included in the two sub-graphs, and the two connecting edges between the sub-graphs and the junction node of multiple paths of computation flow of the computation graph are a connecting edge E1,3 and a connecting edge E2,3, respectively; and
Condition 2: in a single execution of the computation graph, the execution flow reaching the junction node V5 of multiple paths of computation flow cannot simultaneously pass through the connecting edge E1,3 and the connecting edge E2,3 between the different sub-graphs and the junction node, and can only select one of the connecting edge E1,3 and the connecting edge E2,3.
A dominance attribute of the computation graph is defined; and
All paths along which the execution flow of the computation graph flows from a root node of the computation graph to a node pass through the junction node of multiple paths of computation flow of the computation graph.
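The dominance attribute can be checked directly from this definition: a node d dominates a node n exactly when, with d blocked, n is no longer reachable from the root. A small illustrative Python sketch (the graph encoding is an assumption):

    def dominates(graph, root, d, n):
        # d dominates n iff every path root -> n passes through d.
        # Check by blocking d and testing whether n is still reachable.
        if d in (n, root):
            return True
        seen, stack = {d}, [root]
        while stack:
            u = stack.pop()
            if u == n:
                return False         # reached n without passing through d
            for v in graph.get(u, []):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return True

    g = {"R": ["A", "B"], "A": ["C"], "B": ["C"], "C": []}
    print(dominates(g, "R", "A", "C"))   # False: R -> B -> C avoids A
    print(dominates(g, "R", "R", "C"))   # True: the root dominates every node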
The rigorous computation graph is defined;
For a node without the route selector, a definition of the redefined variable certainly exists in a predecessor node and dominates the node; and
For a node with the route selector, a plurality of definitions of the redefined variable certainly exist, and each corresponding definition of the redefined variable dominates the predecessor node corresponding to the node with the route selector.
With reference to the corresponding drawings, the step S4 of solving the route selector for the redefined variable is described below.
A key edge of the computation graph is defined as a connecting edge whose start node has a plurality of successor nodes and whose tail node, at the same time, has a plurality of predecessor nodes.
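Key edges can be detected and decomposed mechanically; the sketch below inserts a blank pass-through node on every such edge, in the manner of step S42. The graph encoding and the blank-node naming are assumptions for illustration.

    def decompose_key_edges(graph):
        # graph: node -> list of successors; mutated in place.
        # A key edge (u, v): u has several successors and v several predecessors.
        preds = {}
        for u, succs in graph.items():
            for v in succs:
                preds.setdefault(v, []).append(u)
        counter = 0
        for u in list(graph):
            for i, v in enumerate(graph[u]):
                if len(graph[u]) > 1 and len(preds.get(v, ())) > 1:
                    blank = f"blank_{counter}"   # fresh pass-through node
                    counter += 1
                    graph[u][i] = blank          # u -> blank
                    graph[blank] = [v]           # blank -> v
        return graph

    # Example: A -> C is a key edge (A branches, C joins) and gets split.
    g = {"A": ["B", "C"], "B": ["C"], "C": []}
    print(decompose_key_edges(g))
    # {'A': ['B', 'blank_0'], 'B': ['C'], 'C': [], 'blank_0': ['C']}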
With reference to the corresponding drawing, the route selector for the redefined variable is itself a new definition of the redefined variable, so the dominating edge criterion has to be applied iteratively;
The process of analyzing the dominating edge set of a node is as follows.
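The disclosure's own listing is not reproduced here; the following Python sketch reconstructs a standard dominating-edge (dominance frontier) analysis consistent with the definitions S51 and S52: for each junction node n, walk each predecessor's dominator chain up to the immediate dominator of n; every node passed dominates a predecessor of n without rigorously dominating n. The immediate-dominator map idom is assumed to be given (it can be computed by standard dominator-tree algorithms).

    def dominating_edge_sets(graph, idom):
        # graph: node -> list of successors; idom: node -> immediate dominator
        # (the root is its own immediate dominator).
        preds = {}
        for u, succs in graph.items():
            for v in succs:
                preds.setdefault(v, []).append(u)
        df = {n: set() for n in graph}
        for n in graph:
            if len(preds.get(n, ())) >= 2:       # only junction nodes contribute
                for p in preds[n]:
                    runner = p
                    # Every node on the chain dominates a predecessor of n
                    # but does not rigorously dominate n itself (S52).
                    while runner != idom[n]:
                        df[runner].add(n)
                        runner = idom[runner]
        return df

    # Example: C joins two paths, so C lies in the dominating edge sets of A and B.
    g = {"R": ["A", "B"], "A": ["C"], "B": ["C"], "C": []}
    idom = {"R": "R", "A": "R", "B": "R", "C": "R"}
    print(dominating_edge_sets(g, idom))
    # {'R': set(), 'A': {'C'}, 'B': {'C'}, 'C': set()}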
When a node includes the correct definition of any one redefined variable, one route selector for the redefined variable is inserted at each node in the dominating edge set of that node.
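Because an inserted route selector is itself a new definition of the redefined variable, insertion is iterated with a worklist until no node still requires a selector (cf. S61/S62). A sketch building on dominating_edge_sets above; names are illustrative.

    def place_route_selectors(df, def_nodes):
        # df: node -> dominating edge set; def_nodes: nodes defining the
        # redefined variable. Returns the nodes that receive a route selector.
        has_selector = set()
        worklist = list(def_nodes)
        while worklist:
            n = worklist.pop()
            for m in df[n]:
                if m not in has_selector:
                    has_selector.add(m)
                    worklist.append(m)    # the selector is a new definition
        return has_selector

    # Continuing the example above: x is defined at A and B, so a route
    # selector for x is placed at the junction node C.
    print(place_route_selectors({"R": set(), "A": {"C"}, "B": {"C"}, "C": set()},
                                ["A", "B"]))   # {'C'}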
With reference to the corresponding drawing, renaming is carried out on the variable output by each inserted route selector for the redefined variable.
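Renaming the selector outputs (step S8) only requires a fresh-name generator; a tiny illustrative sketch (the _phi suffix is an assumption, not the disclosure's naming):

    from collections import defaultdict

    class FreshNamer:
        # Mint a distinct name for each route-selector output so the graph
        # stays in the global single-node defining mode.
        def __init__(self):
            self.counts = defaultdict(int)

        def fresh(self, var: str) -> str:
            self.counts[var] += 1
            return f"{var}_phi{self.counts[var]}"

    namer = FreshNamer()
    print(namer.fresh("x"), namer.fresh("x"))   # x_phi1 x_phi2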
Corresponding to the above-mentioned embodiment of the graph compiling and optimization method for neural network computation, the disclosure further provides an embodiment of a graph compiling and optimization apparatus for neural network computation.
With reference to the corresponding drawing, the graph compiling and optimization apparatus for neural network computation includes a memory and one or more processors, an executable code being stored in the memory, and the one or more processors being configured to implement the graph compiling and optimization method for neural network computation in the embodiments above when executing the executable code.
The embodiment of the graph compiling and optimization apparatus for neural network computation of the disclosure can be applied to any device with data processing capability, such as a computer. The apparatus embodiment may be implemented by software, by hardware, or by a combination of software and hardware. Taking software implementation as an example, in a logical sense, the apparatus is formed by the processor of the device with data processing capability where the apparatus is located reading a corresponding computer program instruction from a non-volatile memory into a memory and running it. In terms of hardware, the structure of the device with data processing capability where the apparatus is located is shown in the corresponding drawing.
For the implementation process of the functions and effects of the units in the apparatus above, reference is made to the implementation process of the corresponding steps in the method above, which will not be repeated herein.
The apparatus embodiment basically corresponds to the method embodiment, so the related description may refer to the description of the method embodiment. The apparatus embodiment described above is merely exemplary: the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, i.e., they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual demands to fulfill the aim of the solution of the disclosure. Those of ordinary skill in the art can understand and implement the solution without inventive work.
An embodiment of the disclosure further provides a computer readable storage medium storing a program. When the program is executed by a processor, the graph compiling and optimization method for neural network computation in the embodiments above is implemented.
The computer readable storage medium may be an internal storage unit, e.g., a hard disk or a memory, of any device with data processing capability according to any one of the embodiments above. The computer readable storage medium may also be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card equipped on the device. Further, the computer readable storage medium may include both the internal storage unit and the external storage device of the device with data processing capability. The computer readable storage medium is configured to store the computer program and other programs and data required by the device, and may also be configured to temporarily store data that has been or will be output.
The above are only the preferred embodiments of the disclosure and not intended to limit the disclosure. Those skilled in the art can make various modifications and changes to the disclosure. Any modifications, equivalent replacements, improvements and the like within the spirit and principle of the disclosure shall fall within the scope of protection of the disclosure.