METHOD FOR DEBUGGING PROGRAM OF MANYCORE PARALLEL PROCESSOR BASED ON CODE EXECUTION AND APPARATUS USING THE SAME
20230130429 · 2023-04-27
Abstract
Disclosed herein are a method for debugging a program of many core parallel processors based on code execution and an apparatus for the same. The method, performed by debugger software running on a host processor, includes generating a program execution binary including debug execution code and providing the same to multiple parallel processors, acquiring context data corresponding to the state of a target processor immediately before the debug execution code is executed in the target processor, among the multiple parallel processors, and analyzing the context data and thereby performing debugging of a program executed in the processor in which the debug execution code is executed.
Claims
1. A method for debugging a program of many core parallel processors, which is performed by debugger software running on a host processor, comprising: generating a program execution binary including debug execution code and providing the program execution binary to multiple parallel processors; acquiring context data corresponding to a state of a target processor immediately before the debug execution code is executed, the target processor being a processor in which the debug execution code is executed, among the multiple parallel processors; and analyzing the context data, thereby performing debugging of a program executed in the processor in which the debug execution code is executed.
2. The method of claim 1, wherein the debug execution code includes a break instruction for suspending execution of the program and a handler program for generating an interrupt for passing a control flow to the debugger software.
3. The method of claim 2, wherein the target processor suspends execution of the program in compliance with the break instruction and stores the context data in a context memory buffer in main memory based on execution of the handler program.
4. The method of claim 3, wherein the context data is stored at a location assigned to match an identifier of the target processor in the context memory buffer.
5. The method of claim 2, wherein the target processor stores an address value of the break instruction in an internal register, and when the handler program is terminated, the target processor resumes the suspended execution of the program based on the address value stored in the internal register.
6. The method of claim 5, wherein the handler program generates the interrupt and thereby notifies the debugger software of a fact that execution of the program is suspended in the target processor, and the handler program is terminated when the interrupt is cleared by the debugger software.
7. The method of claim 3, wherein, when the interrupt is received, the debugger software acquires the context data from the context memory buffer and analyzes the context data.
8. The method of claim 1, wherein the debugger software generates the program execution binary including the debug execution code by inserting the debug execution code at a breakpoint set by a user for debugging in a general program execution binary generated by compiling a source program.
9. The method of claim 8, further comprising: replacing, by the debugger software, the program execution binary including the debug execution code by deleting the debug execution code inserted at the breakpoint and by again inserting the debug execution code at a new breakpoint requested by the user.
10. The method of claim 5, further comprising: when execution of new code for debugging is requested by a user, generating, by the debugger software, new code including the break instruction at an end of the code and storing, by the debugger software, the new code in a debug code memory buffer in main memory; and storing, by the debugger software, an address value corresponding to a start location of the new code in the internal register.
11. The method of claim 10, wherein the target processor executes the new code based on the address value stored in the internal register when the handler program is terminated.
12. A debugging apparatus, comprising: a host processor including debugger software configured to generate a program execution binary including debug execution code, to provide the program execution binary to multiple parallel processors, to acquire context data corresponding to a state of a target processor immediately before the debug execution code is executed, the target processor being a processor in which the debug execution code is executed, among the multiple parallel processors, to analyze the context data, and to perform debugging of a program executed in the processor, in which the debugging execution code is executed; and main memory shared between the host processor and the multiple parallel processors.
13. The debugging apparatus of claim 12, wherein the debug execution code includes a break instruction for suspending execution of the program and a handler program for generating an interrupt for passing a control flow to the debugger software.
14. The debugging apparatus of claim 13, wherein the target processor suspends execution of the program in compliance with the break instruction and stores the context data in a context memory buffer in the main memory based on execution of the handler program.
15. The debugging apparatus of claim 14, wherein the context data is stored at a location assigned to match an identifier of the target processor in the context memory buffer.
16. The debugging apparatus of claim 13, wherein the target processor stores an address value of the break instruction in an internal register, and when the handler program is terminated, the target processor resumes the suspended execution of the program based on the address value stored in the internal register.
17. The debugging apparatus of claim 16, wherein the handler program generates the interrupt and thereby notifies the debugger software of a fact that execution of the program is suspended in the target processor, and the handler program is terminated when the interrupt is cleared by the debugger software.
18. The debugging apparatus of claim 14, wherein, when the interrupt is received, the debugger software acquires the context data from the context memory buffer and analyzes the context data.
19. The debugging apparatus of claim 12, wherein the debugger software generates the program execution binary including the debug execution code by inserting the debug execution code at a breakpoint set by a user for debugging in a general program execution binary generated by compiling a source program.
20. The debugging apparatus of claim 19, wherein the debugger software replaces the program execution binary including the debug execution code by deleting the debug execution code inserted at the breakpoint and again inserting the debug execution code at a new breakpoint requested by the user.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0030] The above and other objects, features, and advantages of the present disclosure will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0039] The present disclosure will be described in detail below with reference to the accompanying drawings. Repeated descriptions and descriptions of known functions and configurations which have been deemed to unnecessarily obscure the gist of the present disclosure will be omitted below. The embodiments of the present disclosure are intended to fully describe the present disclosure to a person having ordinary knowledge in the art to which the present disclosure pertains. Accordingly, the shapes, sizes, etc. of components in the drawings may be exaggerated in order to make the description clearer.
[0040] In the present specification, each of expressions such as “A or B”, “at least one of A and B”, “at least one of A or B”, “A, B, or C”, “at least one of A, B, and C”, and “at least one of A, B, or C” may include any one of the items listed in the expression or all possible combinations thereof.
[0041] Hereinafter, a preferred embodiment of the present disclosure will be described in detail with reference to the accompanying drawings.
[0042] A function for debugging a program executed in a processor core is valuable to software developers, but it is used only rarely once the processor is in actual use. For a single processor core, only one hardware debug module needs to be implemented, so the implementation area is small. However, for manycore parallel processors that perform parallel processing using thousands of processors, the cost of implementing hardware debug modules (area, wiring, and the like) is a serious problem.
[0043] Accordingly, the present disclosure proposes a method that enables debugging of respective programs executed in thousands of parallel processors without implementing a hardware debug module.
[0044]
[0045] Referring to
[0046] Hereinafter, function blocks configuring respective modules and the roles of the respective function blocks will be described using Table 1 below.
TABLE 1
Function block | Description
Kernel.c | source code of a program executed in multiple parallel processors Core#0 to Core#N
Debugger (SW) | debugging software executed in a host processor and provided for software programmers
HOST Processor | a host processor in which an OS, user applications, and a debugger (SW) are executed
Core#0-Core#N | manycore parallel processors for executing a program corresponding to Kernel.c in parallel in order to accelerate an enormous amount of operation processing
Memory system | main memory shared between the host processor and the multiple parallel processors Core#0 to Core#N
Kernel.exe | a program execution binary generated by compiling the source code ‘kernel.c’
kernel.dbg | a program execution binary generated by inserting break.instr into the execution binary ‘Kernel.exe’ and adding break.handler code thereto
break.instr | one of the instructions executed in the multiple parallel processors Core#0 to Core#N; when this instruction is executed, the program being executed is suspended and the break.handler program is executed
break.handler | a program (included in the binary ‘Kernel.dbg’) executed in response to break.instr in the multiple parallel processors Core#0 to Core#N; after execution of break.handler is finished, a program corresponding to the address stored in debug.PC is executed
debug.context | a memory buffer for storing context data pertaining to the state of a core immediately before break.instr is executed in the core in which break.instr and break.handler are executed; the content of general purpose registers, stacks, and the like of cores that perform debugging is stored therein
debug.code | a memory buffer for storing program code to be executed by a specific core when a debugging process of a debugger (SW) is performed
Entrypoint | the address value of a program to be initially executed by each of the multiple parallel processors Core#0 to Core#N; when a host processor writes an address value to an entrypoint register, the corresponding core starts a program from the entrypoint
debug.PC | when break.instr is executed, the PC value (instruction address) of break.instr is stored in debug.PC, and after execution of the break.handler program is finished, execution of a program is resumed from the address stored in debug.PC
Core.ID | a register in a core for storing an ID value for identifying each of the multiple parallel processors Core#0 to Core#N; the running core can be identified by checking the Core.ID value in the Kernel program
[0047]
[0048] Referring to
[0049] Here, the debugger software inserts debug execution code at a breakpoint set by a user for debugging in the execution binary of a general program, which is generated by compiling a source program, thereby generating a program execution binary including the debug execution code.
[0050] For example, the debugger software 211 run on the host processor 210 by a user, illustrated in
TABLE 2
//Kernel.exe
Instruction.0
Instruction.1
Instruction.2
Instruction.3
Instruction.4
Instruction.5
...
[0051] Subsequently, the debugger software 211 illustrated in
TABLE 3
//Kernel.dbg
Instruction.0
Instruction.1
Instruction.2
Instruction.3
Instruction.4
break.instr
Instruction.5
...
//break.handler
Save Core context to debug.context
Send IRQ to Host
Wait for IRQ cleared
Set PC <- debug.PC
Continue execution
[0052] Here, the debug execution code may include a break instruction for suspending execution of the program and a handler program for generating an interrupt for passing a control flow to the debugger software.
[0053] For example, referring to Table 3, break.instr may correspond to a break instruction, and break.handler may correspond to the code of a handler program.
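The insertion shown in Table 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the binary is modeled as a Python list of instruction names, and the function name `build_kernel_dbg` and the index-based breakpoint are hypothetical.

```python
# Handler body as listed in Table 3, modeled as plain strings (assumption).
BREAK_HANDLER = [
    "Save Core context to debug.context",
    "Send IRQ to Host",
    "Wait for IRQ cleared",
    "Set PC <- debug.PC",
    "Continue execution",
]


def build_kernel_dbg(kernel_exe, breakpoint_index):
    """Return (code, handler): a copy of kernel_exe with break.instr inserted
    immediately before the instruction at breakpoint_index, plus the handler."""
    if not 0 <= breakpoint_index <= len(kernel_exe):
        raise ValueError("breakpoint outside the program")
    code = list(kernel_exe)  # the original Kernel.exe is left untouched
    code.insert(breakpoint_index, "break.instr")
    return code, list(BREAK_HANDLER)


kernel_exe = [f"Instruction.{i}" for i in range(6)]
kernel_dbg, handler = build_kernel_dbg(kernel_exe, 5)
print(kernel_dbg)
```

Copying the list before inserting reflects the fact that both Kernel.exe and Kernel.dbg are kept, since cores that are not being debugged continue to run the unmodified binary.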
[0054] Here, the debugger software according to an embodiment of the present disclosure may load both ‘kernel.exe’, which is the execution binary of a general program that does not include debug execution code, and ‘kernel.dbg’, which is a program execution binary including debug execution code, into main memory, and may then set the entrypoint of each processor separately, distinguishing a processor to perform debugging from a processor that does not perform debugging, among the multiple parallel processors.
[0055] For example, the debugger software 211 illustrated in
[0056] Accordingly, the processor that does not perform debugging executes the binary ‘Kernel.exe’, whereby the program may be executed without interruption. Also, the processor to perform debugging executes the binary ‘Kernel.dbg’, and may suspend execution of the program when it meets break.instr.
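The per-core entrypoint selection described above can be sketched as below. The function name `assign_entrypoints`, the dictionary standing in for the entrypoint registers, and the example addresses are assumptions made for illustration only.

```python
def assign_entrypoints(num_cores, debug_core_ids, exe_addr, dbg_addr):
    """Map each core ID to the address the host would write to that core's
    entrypoint register: Kernel.dbg for debugged cores, Kernel.exe otherwise."""
    debug_core_ids = set(debug_core_ids)
    unknown = debug_core_ids - set(range(num_cores))
    if unknown:
        raise ValueError(f"unknown core IDs: {sorted(unknown)}")
    return {
        core_id: (dbg_addr if core_id in debug_core_ids else exe_addr)
        for core_id in range(num_cores)
    }


# Hypothetical load addresses of the two binaries in main memory.
entrypoints = assign_entrypoints(4, {2}, exe_addr=0x1000, dbg_addr=0x8000)
print(entrypoints)
```

In this model only core 2 would meet break.instr; the other cores run Kernel.exe without interruption.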
[0057] Also, in the method for debugging a program of many core parallel processors based on code execution according to an embodiment of the present disclosure, the debugger software acquires context data corresponding to the state of a target processor immediately before the debug execution code is executed, the target processor being a processor in which the debug execution code is executed, among the multiple parallel processors, at step S320.
[0058] Here, the target processor may suspend execution of the program in compliance with the break instruction and store the context data in the context memory buffer in the main memory based on execution of the handler program.
[0059] Here, the context data may be stored at the location assigned to match the identifier of the target processor in the context memory buffer.
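One simple way to assign a location that matches the processor identifier is a fixed-size slot per core, indexed by Core.ID. The sketch below assumes such a linear layout; the slot size, the base address, and the function name `context_slot_address` are hypothetical and are not specified in the present disclosure.

```python
CONTEXT_SLOT_BYTES = 256  # hypothetical size of one core's context slot


def context_slot_address(buffer_base, core_id, slot_bytes=CONTEXT_SLOT_BYTES):
    """Return the start address of the slot reserved for core_id inside
    the debug.context buffer, assuming contiguous fixed-size slots."""
    if core_id < 0:
        raise ValueError("core_id must be non-negative")
    return buffer_base + core_id * slot_bytes


addr = context_slot_address(0x2000_0000, 3)
print(hex(addr))
```

Because each core writes only to its own slot, several cores can be suspended at the same time without overwriting one another's context data under this layout.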
[0060] For example, referring to
[0061] Here, the handler program generates an interrupt, thereby notifying the debugger software of the fact that execution of the program is suspended in the target processor. Subsequently, the handler program may be terminated when the interrupt is cleared by the debugger software.
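The notify-and-wait handshake between the handler program and the debugger software can be modeled with two events. This is only a sketch: the interrupt is represented by Python `threading.Event` objects, the two threads stand in for a target core and the host processor, and the names `break_handler` and `debugger` are hypothetical.

```python
import threading

irq_raised = threading.Event()   # core -> host: execution is suspended
irq_cleared = threading.Event()  # host -> core: handler may terminate
debug_context = {}               # stand-in for the context memory buffer
log = []


def break_handler(core_id, registers):
    debug_context[core_id] = dict(registers)  # save context first
    irq_raised.set()                          # send IRQ to host
    irq_cleared.wait()                        # wait for IRQ cleared
    log.append("core resumed")


def debugger():
    irq_raised.wait()                         # IRQ received
    log.append(f"debugger read {debug_context[0]}")
    irq_cleared.set()                         # clearing the IRQ ends the handler


core = threading.Thread(target=break_handler, args=(0, {"r0": 7}))
host = threading.Thread(target=debugger)
core.start()
host.start()
core.join()
host.join()
print(log)
```

The context is written before the interrupt is raised, so the debugger never observes a partially saved state in this model.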
[0062] For example, the handler program (break.handler) illustrated in
[0063] Also, in the method for debugging a program of many core parallel processors based on code execution according to an embodiment of the present disclosure, the debugger software analyzes the context data, thereby performing debugging of the program executed in the processor in which the debug execution code is executed at step S330.
[0064] Here, when an interrupt is received, the debugger software acquires the context data from the context memory buffer, thereby performing analysis.
[0065] For example, the debugger software 211 receiving the IRQ may acquire the context data from the context memory buffer (debug.context) in the main memory 230 and analyze it. The user of the host processor 210 may then check the analysis result, including the values of program variables and the like, and proceed with the subsequent debugging process.
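On the host side, acquiring the context data amounts to reading the slot of the interrupting core and decoding it. The sketch below assumes a layout that the present disclosure does not specify: each slot holds debug.PC followed by four 32-bit general purpose registers in little-endian order. The constants and the function name `read_context` are hypothetical.

```python
import struct

NUM_GPRS = 4                      # hypothetical register count
SLOT_FORMAT = f"<I{NUM_GPRS}I"    # debug.PC, then the GPRs, little-endian
SLOT_BYTES = struct.calcsize(SLOT_FORMAT)


def read_context(debug_context, core_id):
    """Decode the slot belonging to core_id from the raw context buffer."""
    fields = struct.unpack_from(SLOT_FORMAT, debug_context, core_id * SLOT_BYTES)
    return {"debug_pc": fields[0], "gprs": list(fields[1:])}


# Build a buffer for two cores; core 1 is modeled as stopped at address 0x14.
buf = bytearray(2 * SLOT_BYTES)
struct.pack_into(SLOT_FORMAT, buf, 1 * SLOT_BYTES, 0x14, 10, 20, 30, 40)
ctx = read_context(buf, 1)
print(ctx)
```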
[0066] Here, the target processor may store the address value of the break instruction in the internal register thereof, and may resume the suspended execution of the program based on the address value stored in the internal register when the handler program is terminated.
[0067] For example, break.handler illustrated in
[0068] Subsequently, the target processor resumes execution of the program using the program execution binary, and may continue execution of instructions until it meets break.instr.
[0069] Here, the debugger software may replace the program execution binary including debug execution code by deleting the debug execution code, which was inserted at the breakpoint, and by again inserting the debug execution code at a new breakpoint in response to a user request.
[0070] For example, when the user of the host processor additionally requests functions such as step, step-in, step-out, breakpoint at function, and the like, the debugger software 211 illustrated in
[0071] Here, when execution of new code for debugging is requested by a user, the debugger software may generate new code including a break instruction at the end of the code, store the same in a debug code memory buffer in the main memory, and store the address value corresponding to the start location of the new code in the internal register.
[0072] Here, when the handler program is terminated, the target processor may execute the new code corresponding to the address value stored in the internal register.
[0073] For example, when a user requests execution of additional new code for debugging, the debugger software 211 illustrated in
TABLE 4
//debug.code
{
... Instructions ...
break.instr
}
[0074] Here, after it stores the new code (debug.code) in the debug code memory buffer of the main memory, the debugger software 211 may set the value of debug.PC of the target processor to the address value of the start location of the new code (debug.code). Subsequently, the handler program (break.handler) may be terminated by clearing the IRQ, and after the handler program (break.handler) is terminated, the instruction at the address indicated by the value of debug.PC is executed, whereby instructions in the new code (debug.code) may be executed in the target processor.
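The sequence described above (append break.instr, store the new code, point debug.PC at it, clear the IRQ) can be sketched as follows. Main memory is modeled as a dictionary from address to instruction with one address unit per instruction, the core state is a small dictionary, and the names `load_debug_code`, `resume`, `dbg.op0`, and `dbg.op1` are hypothetical.

```python
BREAK = "break.instr"


def load_debug_code(memory, debug_code_base, instructions, core):
    """Host side: store the new code terminated by break.instr, set
    debug.PC to its start address, and clear the pending IRQ."""
    new_code = list(instructions) + [BREAK]
    for offset, instr in enumerate(new_code):
        memory[debug_code_base + offset] = instr
    core["debug_pc"] = debug_code_base
    core["irq_pending"] = False  # clearing the IRQ lets break.handler end
    return new_code


def resume(memory, core):
    """Core side: after the handler ends, run from debug.PC until the next
    break.instr, then latch its address and raise the IRQ again."""
    if core["irq_pending"]:
        raise RuntimeError("handler is still waiting for the IRQ to clear")
    executed = []
    pc = core["debug_pc"]
    while memory[pc] != BREAK:
        executed.append(memory[pc])
        pc += 1
    core["debug_pc"] = pc
    core["irq_pending"] = True
    return executed


memory = {}
core = {"debug_pc": 5, "irq_pending": True}
load_debug_code(memory, 0x100, ["dbg.op0", "dbg.op1"], core)
ran = resume(memory, core)
print(ran, hex(core["debug_pc"]))
```

The trailing break.instr is what returns control to the debugger software once the injected code has run.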
[0075] Through the above-described method for debugging a program of many core parallel processors based on code execution, debugging of a program of each of the parallel processors may be performed without a hardware debug module in a large-scale parallel system in which thousands or more processors are used.
[0076]
[0077] Referring to
[0078] Subsequently, the debugger software inserts break.instr at a breakpoint set by the user in the binary ‘Kernel.exe’ illustrated in Table 2 and adds the code of break.handler, as shown in Table 3, thereby generating binary code ‘kernel.dbg’ including debug execution code at step S420.
[0079] Subsequently, the start address of the binary ‘kernel.dbg’ may be stored in the entrypoint register of the processor to perform debugging, among the multiple parallel processors, at step S430.
[0080] Through the above-described process, the processor to perform debugging may execute the binary ‘Kernel.dbg’, and may suspend execution of the program when it meets break.instr.
[0081]
[0082] Referring to
[0083] First, among parallel processors, a target processor, the entrypoint register of which stores the start address of the binary ‘kernel.dbg’ through the process illustrated in
[0084] When it is determined at step S515 that break.instr has not been executed, the program continues to run, and the check for execution of break.instr is repeated while the program is being executed.
[0085] Also, when it is determined at step S515 that break.instr is executed, execution of the program is suspended at step S520, the address value (the PC value) of break.instr is stored in the internal debug.PC register at step S530, and the code of a handler program (break.handler) may be executed at step S540.
[0086] Subsequently, the handler program (break.handler) may store context data, such as the content of general purpose registers, a stack, debug.PC, and the like of the target processor immediately before the break instruction (break.instr) is executed, in a context memory buffer (debug.context) at step S550.
[0087] Here, the handler program generates an interrupt, thereby notifying the debugger software of the fact that execution of the program is suspended in the target processor at step S560.
[0088] Subsequently, the debugger software analyzes the context data, thereby debugging the program executed in the processor in which the debug execution code is executed at step S570.
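The core-side part of this flow (steps S515 to S560) can be modeled with a short loop. In this sketch the program is a list of instruction names, the list index serves as the PC, and the saved context is reduced to debug.PC plus the instructions already executed; these simplifications and the name `run_until_break` are assumptions for illustration.

```python
BREAK = "break.instr"


def run_until_break(code, start_pc, core_id, debug_context):
    """Execute from start_pc. On break.instr, suspend, latch its address
    (debug.PC), save the context under core_id, and return that address,
    which models raising the IRQ. Return None if no break.instr is met."""
    executed = []
    pc = start_pc
    while pc < len(code):
        if code[pc] == BREAK:                  # S515: break.instr met
            debug_pc = pc                      # S520/S530: suspend, store PC
            debug_context[core_id] = {         # S540/S550: handler saves context
                "debug_pc": debug_pc,
                "executed": list(executed),
            }
            return debug_pc                    # S560: notify debugger software
        executed.append(code[pc])
        pc += 1
    return None


kernel_dbg = ["Instruction.0", "Instruction.1", BREAK, "Instruction.2"]
debug_context = {}
stopped_at = run_until_break(kernel_dbg, 0, core_id=3, debug_context=debug_context)
print(stopped_at, debug_context)
```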
[0089]
[0090] Referring to
[0091] Here, break.instr is inserted as the last instruction of the new code (debug.code), whereby the control flow may be returned to the debugger software after execution of the new code (debug.code).
[0092] Subsequently, the debugger software may store the new code (debug.code) in a debug code memory buffer of main memory, and may set the value of debug.PC of a target processor to the start address value of the new code (debug.code) at step S620.
[0093] Subsequently, the handler program (break.handler) may be terminated by clearing the IRQ at step S630, and after the handler program (break.handler) is terminated, the instruction at the address indicated by the value of debug.PC is executed, whereby instructions in the new code (debug.code) may be executed in the target processor at step S640.
[0094]
[0095] Referring to
[0096] Subsequently, the debugger software inserts break.instr at a breakpoint set by a user in the binary ‘Kernel.exe’ illustrated in Table 2 and adds the code of break.handler, as shown in Table 3, thereby generating a binary ‘Kernel.dbg’ including debug execution code at step S704.
[0097] Subsequently, the debugger software may provide the binary ‘Kernel.dbg’ to multiple parallel processors Core#0 to Core#N, and the multiple parallel processors may execute the program by setting the start address of the binary ‘Kernel.dbg’ as the value of the entrypoint register thereof at step S706.
[0098] Subsequently, when break.instr is met during execution of the program, execution of the program is suspended, and a handler program (break.handler) may be executed at step S710.
[0099] Here, the handler program generates an interrupt (IRQ), thereby notifying the debugger software of the fact that execution of the program is suspended.
[0100] Subsequently, the debugger software analyzes context data at step S712, after which the debugger software may modify the binary ‘Kernel.dbg’ in response to a user request at step S714 or generate and execute new code (debug.code) at step S716.
[0101] For example, when the user of the host processor additionally requests functions such as step, step-in, step-out, breakpoint at function, and the like, the debugger software may delete break.instr from the binary ‘Kernel.dbg’. Subsequently, the debugger software may modify the binary ‘Kernel.dbg’ by again inserting break.instr at a breakpoint newly requested by the user. The modified binary ‘Kernel.dbg’ may be loaded into the main memory so as to replace Kernel.dbg stored therein.
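Deleting break.instr and reinserting it at a new breakpoint can be sketched as below. The binary is again modeled as a list of instruction names, and the function name `move_breakpoint` and the index convention (the new index refers to the program with the old break.instr removed) are assumptions.

```python
BREAK = "break.instr"


def move_breakpoint(kernel_dbg, new_index):
    """Return a modified Kernel.dbg: the existing break.instr is deleted and
    a new one is inserted before the instruction at new_index."""
    code = [instr for instr in kernel_dbg if instr != BREAK]
    if not 0 <= new_index <= len(code):
        raise ValueError("breakpoint outside the program")
    code.insert(new_index, BREAK)
    return code


kernel_dbg = ["Instruction.0", "Instruction.1", BREAK, "Instruction.2", "Instruction.3"]
stepped = move_breakpoint(kernel_dbg, 3)
print(stepped)
```

In this model the core was suspended at address 2, where the old break.instr sat. After the replacement, address 2 holds Instruction.2 and break.instr sits at address 3, so resuming from debug.PC executes one instruction and stops again, which approximates a step request.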
[0102] In another example, when a user requests execution of additional new code for debugging, the debugger software may generate new code (debug.code) so as to have a structure such as that illustrated in Table 4 and store the same in a debug code memory buffer of the main memory. Here, break.instr is inserted as the last instruction of the new code (debug.code), whereby the control flow may be returned to the debugger software after execution of the new code (debug.code).
[0103] Subsequently, the IRQ is cleared such that the handler program (break.handler) is terminated, and after the handler program (break.handler) is terminated, the control flow may be returned to the instruction at the address indicated by the value of debug.PC, from which execution continues, at step S718.
[0104]
[0105] Referring to
[0106] Accordingly, an embodiment of the present disclosure may be implemented as a non-transitory computer-readable storage medium in which methods implemented using a computer or instructions executable in a computer are recorded. When the computer-readable instructions are executed by a processor, the computer-readable instructions may perform a method according to at least one aspect of the present disclosure.
[0107] Here, the processor 810 may be a host processor of the present disclosure.
[0108] The processor 810 may include debugger software configured to generate a program execution binary including debug execution code, to provide the program execution binary to multiple parallel processors, to acquire context data corresponding to the state of a target processor immediately before the debug execution code is executed in the target processor, among the multiple parallel processors, and to analyze the context data so as to perform debugging of the program executed in the processor in which the debug execution code is executed.
[0109] Here, the debugger software may generate the program execution binary including debug execution code by inserting the debug execution code at a breakpoint set by a user for debugging in the execution binary of a general program, which is generated by compiling a source program.
[0110] Here, the debug execution code may include a break instruction for suspending execution of the program and a handler program for generating an interrupt for passing a control flow to the debugger software.
[0111] Here, the debugger software according to an embodiment of the present disclosure may load a general program execution binary (kernel.exe) that does not include debug execution code and a program execution binary (kernel.dbg) including debug execution code into main memory, and may set the entrypoint of each of the processors by separating a processor to perform debugging and a processor that does not perform debugging, among the multiple parallel processors.
[0112] Here, the target processor may suspend execution of the program in compliance with the break instruction, and may store the context data in a context memory buffer in the main memory based on execution of the handler program.
[0113] Here, the context data may be stored at the location assigned to match the identifier of the target processor in the context memory buffer.
[0114] Here, the handler program may generate an interrupt and notify the debugger software of the fact that execution of the program is suspended in the target processor, and the handler program may be terminated when the interrupt is cleared by the debugger software.
[0115] Here, upon receiving the interrupt, the debugger software may acquire the context data from the context memory buffer and analyze the same.
[0116] Here, the target processor may store the address value of the break instruction in the internal register thereof, and may resume the suspended execution of the program based on the address value stored in the internal register when the handler program is terminated.
[0117] Here, the debugger software deletes the debug execution code inserted at the breakpoint and again inserts the debug execution code at a new breakpoint requested by a user, thereby replacing the program execution binary including the debug execution code.
[0118] Here, when a user requests execution of new code for debugging, the debugger software may generate new code including a break instruction at the end of the code, store the same in a debug code memory buffer in the main memory, and store the address value corresponding to the start location of the new code in an internal register.
[0119] Here, the target processor may execute the new code using the address value stored in the internal register when the handler program is terminated.
[0120] Using the above-described debugging apparatus, debugging of a program of each of the parallel processors may be performed without a hardware debug module in a large-scale parallel system in which thousands or more processors are used.
[0121] According to the present disclosure, debugging of a program of each of the parallel processors may be performed without a hardware debug module in a large-scale parallel system in which thousands or more processors are used.
[0122] As described above, the method for debugging a program of many core parallel processors based on code execution and the apparatus for the same according to the present disclosure are not limited to the configurations and operations of the above-described embodiments; rather, all or some of the embodiments may be selectively combined and configured, so the embodiments may be modified in various ways.