SYSTEM AND METHODS FOR AGGREGATING FEATURES IN VIDEO FRAMES TO IMPROVE ACCURACY OF AI DETECTION ALGORITHMS
20210406737 · 2021-12-30
Assignee
Inventors
CPC classification
H04N19/115
ELECTRICITY
A61B1/31
HUMAN NECESSITIES
G06V20/41
PHYSICS
G06T19/00
PHYSICS
G06V10/25
PHYSICS
A61B1/0005
HUMAN NECESSITIES
G06T2219/028
PHYSICS
G06V10/62
PHYSICS
International classification
H04N19/115
ELECTRICITY
Abstract
Methods and systems are provided for aggregating features in multiple video frames to enhance tissue abnormality detection algorithms, wherein a first detection algorithm identifies an abnormality and aggregates adjacent video frames to create a more complete image for analysis by an artificial intelligence detection algorithm, the aggregation occurring in real time as the medical procedure is being performed.
Claims
1. A system for identifying tissue abnormalities in video data generated by an optical endoscopy machine, the endoscopy machine outputting real-time images of an interior of an organ as video frames, the system comprising: at least one video monitor operably coupled to the endoscopy machine to display the video frames output by the endoscopy machine; a memory for storing non-volatile programmed instructions; and a processor configured to accept the video frames output by the endoscopy machine and to store the video frames in the memory, the processor further configured to execute the non-volatile programmed instructions to: analyze a first video frame using artificial intelligence to determine if any part of a first tissue abnormality is visible within the first video frame, and if the first video frame is determined to include the first tissue abnormality, analyze adjacent video frames to locate other parts of the first tissue abnormality; generate a reconstructed image of the first tissue abnormality that spans the first video frame and adjacent video frames in which the other parts of the first tissue abnormality are located; analyze, using artificial intelligence, the reconstructed image to classify the first tissue abnormality; and display on the at least one video monitor a bounding box surrounding a portion of the reconstructed image that is visible in a current video frame.
2. The system of claim 1, wherein the programmed instructions, when executed by the processor, generate the reconstructed image of the first tissue abnormality by aggregating at least one of the following in the first video frame and the adjacent video frames: a boundary of the first tissue abnormality, a color of the first tissue abnormality, and a texture of the first tissue abnormality.
3. The system of claim 1, wherein the programmed instructions, when executed by the processor, generate and display on the at least one video monitor a textual description of a type of the first tissue abnormality.
4. The system of claim 1, wherein the programmed instructions, when executed by the processor, provide that if analysis of the adjacent video frames does not locate other parts of the first tissue abnormality, the first video frame is analyzed using artificial intelligence to classify the first tissue abnormality and a bounding box is displayed on the at least one video monitor surrounding the first tissue abnormality.
5. The system of claim 4, wherein the programmed instructions, when executed by the processor, generate and display on the at least one video monitor a textual description of a type of the first tissue abnormality.
6. The system of claim 1, wherein the processor further is configured to execute the programmed instructions to: analyze the reconstructed image to estimate a degree of completeness of the reconstructed image, and display on the at least one video monitor the estimate of the degree of completeness of the reconstructed image.
7. The system of claim 6, wherein the processor further is configured to execute the programmed instructions to: determine a direction of movement of a camera of the endoscopy machine to acquire additional video frames for use in generating the reconstructed image; and display on the at least one video monitor an indicator of the direction of movement.
8. The system of claim 1, wherein the processor further is configured to execute the programmed instructions to: if analysis of the adjacent video frames detects a second tissue abnormality different from the first tissue abnormality, analyze the adjacent video frames to locate other parts of the second tissue abnormality.
9. The system of claim 1, wherein the programmed instructions, when executed by the processor, generate a reconstructed image of the first tissue abnormality by adding adjacent features extracted from the adjacent video frames to features extracted from the first video frame.
10. The system of claim 1, wherein the programmed instructions that implement the artificial intelligence include a machine learning capability.
11. A method of identifying tissue abnormalities in video data generated by an optical endoscopy machine, the endoscopy machine outputting real-time images of an interior of an organ as video frames, the method comprising: acquiring the video frames output by the endoscopy machine; analyzing a first video frame using artificial intelligence to determine if any part of a first tissue abnormality is visible within the first video frame, and if the first video frame is determined to include the first tissue abnormality, analyzing adjacent video frames to locate other parts of the first tissue abnormality; generating a reconstructed image of the first tissue abnormality that spans the first video frame and adjacent video frames in which the other parts of the first tissue abnormality are located; analyzing, using artificial intelligence, the reconstructed image to classify the first tissue abnormality; and displaying on at least one video monitor the real time images from the endoscopy machine and a bounding box surrounding a portion of the reconstructed image that is visible in a current video frame.
12. The method of claim 11, wherein generating the reconstructed image of the first tissue abnormality comprises aggregating at least one of the following in the first video frame and the adjacent video frames: a boundary of the first tissue abnormality, a color of the first tissue abnormality, and a texture of the first tissue abnormality.
13. The method of claim 11, further comprising generating and displaying on the at least one video monitor a textual description of a type of the first tissue abnormality.
14. The method of claim 11, further comprising, if analysis of the adjacent video frames does not locate other parts of the first tissue abnormality: analyzing the first video frame using artificial intelligence to classify the first tissue abnormality; and displaying a bounding box on the at least one video monitor surrounding the first tissue abnormality.
15. The method of claim 14, further comprising generating and displaying on the at least one video monitor a textual description of a type of the first tissue abnormality.
16. The method of claim 11, further comprising: analyzing the reconstructed image to estimate a degree of completeness of the reconstructed image, and displaying on the at least one video monitor the estimate of the degree of completeness of the reconstructed image.
17. The method of claim 16, further comprising: determining a direction of movement of a camera of the endoscopy machine to acquire additional video frames for use in generating the reconstructed image; and displaying on the at least one video monitor an indicator of the direction of movement.
18. The method of claim 11, further comprising, if analysis of the adjacent video frames detects a second tissue abnormality different from the first tissue abnormality, analyzing the adjacent video frames to locate other parts of the second tissue abnormality.
19. The method of claim 11, further comprising generating a reconstructed image of the first tissue abnormality by adding adjacent features extracted from the adjacent video frames to features extracted from the first video frame.
20. The method of claim 11, further comprising implementing the artificial intelligence to include a machine learning capability.
Description
V. BRIEF DESCRIPTION OF THE DRAWINGS
[0031]
[0032]
[0033]
[0034]
[0035]
VI. DETAILED DESCRIPTION OF THE INVENTION
[0036] The present invention is directed to systems and methods for analyzing multiple video frames imaged by an endoscope with an artificial intelligence (“AI”) software module running on a general purpose or purpose-built computer to aggregate information about a potential tissue feature or abnormality, and to indicate to the endoscopist the location and extent of that feature or abnormality on a display viewed by the endoscopist. In accordance with the principles of the present invention, the AI module is programmed to make a preliminary prediction based on initially available information within a video frame, to aggregate additional information for a feature from additional frames, and preferably, to provide guidance to the endoscopist to direct him or her to move the imaging end of the endoscope to gather additional video frames that will enhance the AI module detection prediction.
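For illustration only, the per-frame aggregation loop described above may be sketched in pseudocode form. This is a minimal, hypothetical sketch, not the disclosed implementation; `detect_partial` and the `Feature` record stand in for the AI module's detectors and internal data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    # Illustrative record for one tracked tissue feature or abnormality.
    lesion_id: int
    frames: list = field(default_factory=list)   # indices of contributing frames
    pixels: set = field(default_factory=set)     # aggregated feature data

def aggregate(frames, detect_partial):
    """Scan video frames; when any part of a feature is detected, merge the
    data from adjacent frames showing the same feature into one record."""
    features = {}
    for idx, frame in enumerate(frames):
        hit = detect_partial(frame)              # None, or (lesion_id, pixel_set)
        if hit is None:
            continue
        lesion_id, pixels = hit
        rec = features.setdefault(lesion_id, Feature(lesion_id))
        rec.frames.append(idx)
        rec.pixels |= pixels                     # aggregate additional detail
    return features
```

In this sketch, each call to `detect_partial` corresponds to the preliminary per-frame prediction, and the union of pixel sets corresponds to aggregating information for the feature across frames.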
[0037] Referring to
[0038] Colonoscope 11 acquires real-time video of the interior of the patient's colon and large intestine from a camera disposed at the distal tip of the colonoscope once it is inserted in the patient. Data from colonoscope 11, including real-time video, is processed by computer to generate video output 13. As shown in
[0039] Referring now to
[0040] If at step 25 the lesion in the additional video frames is adjudged to be the same lesion identified in previous frames, features for the lesion are extracted and aggregated by combining information from the previous frame with information from the new frame at step 26. The AI module then reanalyzes the aggregated data for the lesion and updates its detection prediction analysis at step 27. Specifically, at step 26, the software extracts features from the current video frame and compares that data with previously detected features for that same lesion. If the newly extracted data from the current frame adds additional detail, that information is then combined with the data from the prior frame or frames. If the AI module determines that additional images are required, it may issue directions, via the second window, to reposition the colonoscope camera to obtain additional video frames for analysis at step 29. Further details of that process are described below with respect to
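The compare-and-combine operation of step 26 may be illustrated by the following hypothetical sketch, in which feature data for a lesion is represented as a set; the function name and representation are illustrative assumptions, not part of the disclosure.

```python
def update_lesion(prior: set, current: set) -> tuple:
    """Combine features extracted from the current frame with previously
    detected features for the same lesion; report whether the current frame
    added detail (if so, the detection prediction is rerun, per step 27)."""
    new_detail = current - prior
    if not new_detail:
        return prior, False          # nothing new; prior prediction stands
    return prior | new_detail, True  # merged feature set; reanalysis needed
```

The boolean flag models the decision of whether the new frame contributed information warranting an updated prediction.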
[0041] The foregoing process described with respect to
[0042] Still referring to
[0043] In one preferred embodiment, the AI module may use landmarks identified by a machine learning algorithm to provide registration of images between multiple frames. Such anatomical landmarks may include tissue folds, discolored areas of tissue, blood vessels, polyps, ulcers or scars. Such landmarks may be used by the feature extraction algorithms, at step 26, to help determine if the new image(s) provide additional information for analysis, or may be used at step 25 to determine whether a current lesion is the same lesion as in a previous frame or a new lesion, which is assigned a new identifier at step 28.
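One way such landmark-based identity checking could work is sketched below, using the overlap (Jaccard index) of landmark sets observed near a lesion in consecutive frames; the 0.5 threshold and the set-of-labels representation are illustrative assumptions only.

```python
def same_lesion(prev_landmarks: set, cur_landmarks: set,
                threshold: float = 0.5) -> bool:
    """Decide whether a lesion in the current frame is the one seen in a
    previous frame, based on overlap of nearby anatomical landmarks
    (tissue folds, blood vessels, scars, etc.)."""
    if not prev_landmarks and not cur_landmarks:
        return False                 # no landmarks to compare
    overlap = len(prev_landmarks & cur_landmarks)
    union = len(prev_landmarks | cur_landmarks)
    return overlap / union >= threshold
```

A lesion failing this check would be treated as new and assigned a new identifier, as at step 28.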
[0044] Referring now to
[0045] With respect to
[0046] More specifically, in
[0047] Once multiple frames of data are assembled to reconstruct a tissue feature, the reconstructed feature is analyzed by the feature detection algorithms of AI module 48 to generate a prediction and classification for the tissue feature or lesion. If the partial lesion/feature detector of the AI module indicates that additional image frames are required, the process of reconstructing and analyzing the data (now including additional image frames) is repeated, as described with respect to
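The repeat-until-complete behavior described above can be sketched as a simple loop; `detect_fraction`, `classify`, and the completeness threshold are hypothetical stand-ins for the AI module's partial-lesion detector and classifier.

```python
def reconstruct_and_classify(frames, detect_fraction, classify,
                             threshold: float = 0.9):
    """Fold frames into the reconstructed feature until it is judged
    complete enough to classify (or frames are exhausted); returns the
    classification and the final completeness estimate."""
    assembled = set()
    completeness = 0.0
    for frame in frames:
        assembled |= frame                     # add this frame's feature data
        completeness = detect_fraction(assembled)
        if completeness >= threshold:
            break                              # enough data; stop acquiring
    return classify(assembled), completeness
```

When the loop exits below threshold, the system would instead prompt for additional frames, as in the camera-direction guidance described elsewhere in the disclosure.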
[0048] Referring now to
[0049] In the alternative, or in addition, second monitor 55 may include, as an indicator of the completeness of the image acquisition, a progress bar or other visual form of progress report informing the endoscopist about the quality and quantity of data analyzed by the detection and characterization algorithms of the AI module. Second monitor 55 also may include a display including an updated textual classification of an area highlighted in bounding box 52, including a confidence level of that prediction based on the aggregated image data. For example, in
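A text rendering of such a progress report might look like the following sketch; the bar width, percentage formatting, and function name are illustrative choices, not part of the disclosure.

```python
def progress_display(completeness: float, label: str, confidence: float) -> str:
    """Render a progress bar for image-acquisition completeness together
    with the current classification and its confidence level."""
    filled = round(completeness * 10)
    bar = "#" * filled + "-" * (10 - filled)
    return f"[{bar}] {completeness:.0%}  {label} ({confidence:.0%} confidence)"
```

Such a string could be refreshed on second monitor 55 each time the aggregated prediction is updated.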
[0050] Although preferred illustrative embodiments of the present invention are described above, it will be evident to one skilled in the art that various changes and modifications may be made without departing from the invention. It is intended in the appended claims to cover all such changes and modifications that fall within the true spirit and scope of the invention.