Skip to content

In this project I aim to develop an unsupervised sense-and-avoid system for UAVs using sparse and dense optical flow.

Notifications You must be signed in to change notification settings

yudhisteer/Optical-Flow-Obstacle-Avoidance-for-UAV

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

80 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Optical Flow Obstacle Avoidance for UAV

Problem Statement

Designing a sense-and-avoid system for UAVs that can effectively detect and track all obstacles poses a significant challenge. Existing object detection models, while powerful, have limitations in detecting all types of objects. Real-world scenarios consist of unpredictable moving objects such as children playing frisbee, running pets, or flying kites, and the example can go on and on. How can we label all these obstacles?

To address this challenge, a sense-and-avoid system capable of detecting any object in motion would be the perfect solution. By accurately tracking moving objects, the system would enable effective collision avoidance, ensuring the safety of the UAV and the surrounding environment. Additionally, it would provide real-time information for optimal flight path planning.

We aim to develop an unsupervised sense-and-avoid system for UAVs. By leveraging distinctive motion characteristics, our system eliminates the need for extensive labeling of object classes. This unsupervised method offers significant advantages, particularly in complex urban environments where encounters with unknown objects are likely. Our system's adaptability enables it to effectively detect and respond to diverse scenarios especially when encountering objects that were not present in our training data.

Abstract

In our study, we conducted a comprehensive analysis of optical flow techniques and their suitability for obstacle detection and avoidance in UAV applications. Specifically, we investigated the pros and cons of sparse and dense optical flow approaches. Sparse optical flow, known for its computational efficiency and quick processing, was found to be a viable option for real-time obstacle detection. However, it exhibited lower accuracy in capturing fine-grained motion details.

On the other hand, dense optical flow exhibited superior performance in terms of accuracy, capturing detailed motion information in the scene. However, it came at the cost of increased computational complexity and processing time. Despite these trade-offs, we discovered that both sparse and dense optical flow techniques hold great promise in detecting moving objects when combined with "low-cost" image analysis methods.

Below is a video showcasing "The Horse in Motion" from 1878, captured by Eadweard Muybridge. In this groundbreaking study, Muybridge utilized a series of multiple photographs to showcase the progressive motion of a galloping horse. (Video source: Sallie Gardner at a Gallop)

Race.Horse.First.Film.Ever.1878.Eadweard.Muybridge.mp4

Plan of Action

  1. Understanding Optical Flow

  2. Motion Field

  3. Optical Flow Constraint Equation

  4. Sparse Optical Flow

  5. Dense Optical Flow

  6. Obstacle Avoidance


1. Understanding Optical Flow

Optical flow is a method used to analyze the apparent motion of objects in a sequence of images. It is of paramount importance that we put emphasis on the word "apparent" as we are not analyzing the direct physical motion characteristics of an object itself.

The concept of "apparent motion" can be traced back to Gestalt psychology which explores how humans perceive and interpret visual stimuli. It emphasizes that the whole is different from the sum of its parts and focuses on the organization and grouping of visual elements to form meaningful perceptual experiences.

Fig 1. Illustration of several grouping principles. Common fate is a principle of Gestalt psychology that refers to the perceptual grouping of objects based on their shared motion or direction.

Optical flow involves tracking the movement of individual pixels or small regions over time, providing insights into how objects are moving within a scene. While images capture the spatial aspects of a scene, we now add the temporal dimension, enabling the perception of motion and dynamic changes.

When comparing two images, our objective is to determine the corresponding locations of points in the first image within the second image. This mapping of points is referred to as the motion field. By analyzing how brightness patterns in the first image relate to their positions in the second image, we can approximate the motion field. This involves tracking the displacement of brightness patterns to infer the underlying motion of points between the images. Optical flow provides an approximation of the underlying motion field based on the observed shifts in brightness patterns between consecutive frames.

Fig 2. Flow map where the direction of the arrow indicates the motion's direction, while the length of the arrow represents the speed or magnitude of the motion.

Image Source: Optical Flow

Each pixel in the image has an optical flow vector representing the motion of brightness patterns at that point. The vector's length indicates the speed of motion, and its direction indicates the specific direction of motion on the image plane. In an ideal scenario, optical flow perfectly matches the motion field, accurately representing object movement. However, in practice, optical flow and the motion field are not always identical due to occlusions and complex motion patterns, leading to incomplete and less accurate motion information. It's important to note that the motion of points between the images depends on the depth of the points, as this influences their perceived motion.


2. Motion Field

Consider a point in a scene that is moving in a certain direction in three-dimensional space. This movement is projected onto the image plane, resulting in a motion on the image known as the motion field for that point. However, measuring this motion field directly is often not possible. What we can measure instead is the motion of brightness patterns in the image, which is referred to as optical flow. Optical flow provides insights into how the brightness patterns within the image change and move, but it is an approximation of the actual motion field.

Fig 3. Ouchi Pattern (left) and Donguri Wave Illusion (right).

Image Source: Waves

The figure above shows examples of when the motion field is not equal to the optical flow. The images are static but when we move our eyes around the image, the latter seems to be moving. We have an optical flow in our visual system but there is no motion field!

Another example is the simulation below, whereby you can see the cube moving and we can see the optical flow with the green lines however, no optical flow is detected for the sphere which is rotating twice as fast as the cube.

Cube_of.mp4

In the example below, we see the cube and the cylinder moving and we can also see their optical flows. However, notice that the front of the cylinder has higher optical flow vectors since it is closer to the camera. Secondly, we have no optical flow for the top surface of the cylinder although it is rotating at the same angular speed. However, we cannot see it as there is no "apparent motion", no contrast, no patterns, and no texture moving.

Cube_2_of.mp4
-----------------------

3. Optical Flow Constraint Equation

Let's say we have two images of a scene taken in quick succession: a yellow car moving on a road. Now focus our attention on a single point within this window: side-view mirror of the car. We assume that the position of the point is CodeCogsEqn (9). Now, at time CodeCogsEqn (11), that point has moved to a new location: CodeCogsEqn (10) So the displacement of the point CodeCogsEqn (9) can be said to be CodeCogsEqn (21).

Fig 4. Displacement of an object over a short interval of time.

If we divide by CodeCogsEqn (13), then we essentially have the speed of the point in the X and Y directions and that we will call you CodeCogsEqn (14). That is the optical flow corresponding to the point.

In order to solve this problem we need to make some assumptions:

  1. The brightness of an image point remains constant over time. That is, intensity at new location in time equals intensity at old location in time.
  1. The displacement CodeCogsEqn (17) and time step CodeCogsEqn (18) are small. This allow us to come up with a linear approximation. Using Taylor Series expansion:

Note that we ignore the higher order terms as CodeCogsEqn (30) are already very small:

When we subtract the two equations above, we end up with the equation below:

We divide by CodeCogsEqn (27) and take limit as CodeCogsEqn (28):

We now have a linear constraint equation:

where:

  • CodeCogsEqn (29): Optical Flow

  • CodeCogsEqn (31): Spatial gradient

  • CodeCogsEqn (32): Temporal gradient

  • The three gradients are easy to calculate using derivative operators. The optical flow vector CodeCogsEqn (29) is what we are searching for.

  • This equation shows that if we have a flow vector and we apply it to the changes in the image over space, it will be completely balanced out by the changes in the image over time. This makes sense because we're assuming that the brightness of the image won't change.

You might notice that we can't solve the optical flow equation directly for the variables u and v because we only have one equation but two unknowns.

The geometric interpretation of the equation above is such that for any point (x,y) in the image, its optical flow (u,v) lies on the line:

We know (u,v) lies on the line but we do not know exactly where. However, our Optical Flow can be split up into 2 components CodeCogsEqn (42) which is the Normal Flow and CodeCogsEqn (43) which is the Parallel Flow.

We can easily computer the Normal Flow from just the Constraint line using the direction of the normal flow and the magnitude of it:

However, we cannot determine CodeCogsEqn (45), the optical flow component parallel to the constraint line.

Fig 5. Geometric interpretation of Normal and Parallel flow of the optical flow component.

3.1 Aperture Problem

It seems that the inherent limitations of the optical flow problem are not exclusive to the algorithms being developed. They also apply to us humans. An excellent example is the Aperture Problem

It's important to note that we don't observe the complete image of just one object. Our image contains multiple objects, and each local region within the image can have a potentially different flow. Therefore, it becomes necessary for us to focus on a small local patch, which we will refer to as our aperture.

Fig 6. The line moves downward diagonally (left). However, through the aperture, it seems to move upward diagonally (middle). We can only tell the motion locally in the direction perpendicular to the edge.

Image Source: Data Hacker

Therefore, what both you and I perceive is the normal flow, which represents the motion of the line we are observing perpendicular to the line itself. We are unable to directly measure the actual flow. Thus, locally, we can only estimate the normal flow, as shown in this demonstration. This challenge in estimating the optical flow is commonly known as the aperture problem.


4. Sparse Optical Flow

So far, we have seen that the optical flow estimation problem is an under-constraint problem. Sparse optical flow algorithms work by selecting a specific set of pixels, typically consisting of interesting features such as edges and corners, to track their corresponding velocity vectors representing motion. These chosen features are then passed through the optical flow function from one frame to the next, ensuring that the same points are consistently tracked across frames.

We're going to assume that for each pixel, the motion field and hence the optical flow is constant within a small neighborhood W around that pixel neighborhood. By considering a neighborhood of pixels around each pixel, we enhance the accuracy of optical flow estimation. This approach leads us to the Lucas-Kanade method, a technique widely used for estimating optical flow. Other various implementations of sparse optical flow methods exist, including the well-known Horn-Schunck method, the Buxton-Buxton method, and more.

Suppose we have a point (k,l) in our window W:

Fig 7. Applying the Lucas-Kanade method for sparse optical flow.

If our window W is of size nxn, we have CodeCogsEqn (48) points. Hence, in matrix form, we have:

Notce that we now have CodeCogsEqn (55) equations and 2 unknowns! We can solve for u as such:

For this method to work we need, CodeCogsEqn (52) to be invertible and CodeCogsEqn (52) to be well-conditioned.

Below are some examples of when this method will output poor results:

  1. Equations for all pixels in the window are more or less the same.
  2. Prominent gradient in one direction.

Fig 8. Examples where we may have poor results when computing the optical flow.

4.1 Coarse-to-Fine Estimation

Now let's examine a scenario where two images are captured in rapid succession. Due to the proximity of the car to the camera, its motion will be significant, possibly spanning several pixels based on perspective projection. In such a case, we cannot assume that the changes in x and y coordinates will be small. The Taylor series approximation, which relies on linearity, is no longer applicable for the image and brightness variations. Consequently, the simple linear optical flow constraint equation is no longer valid.

Fig 9. Pyramidal Lucas-Kanade (LK) Optical Flow is an algorithm that estimates the movement of sparse feature points between frames.

To address the challenge of an under-constrained optical flow problem, we can compute lower resolution versions of the images by downsampling them. This downsampling process involves averaging pixel values within small windows to create new low-resolution images. As we move to lower resolutions, the magnitude of motion decreases, allowing the optical flow constraint equation to become valid again.

Below are the steps of the Coarse-to-Fine Estimation algorithm using Lucas-Kanade:

Note: In general, moving objects that are closer to the camera will display more apparent motion than distant objects that are moving at the same speed due to motion parallax.

4.2 Lucas-Kanade Implementation

Sparse optical flow algorithms select a subset of feature points, such as corners, in the image and track their motion vectors between frames. The Shi-Tomasi corner detector is commonly used to identify these feature points. Below is an example of why we chose the Shi-Tomasi over the Harris Corner detection algorithm.

Fig 10. Harris v/s Shi-Tomasi Corner Detection.

Image Source: Medium

We first need to initialize the parameters for Shi-Tomasi. We chose to select 100 points with a quality level threshold of 0.3 and the minimum Euclidean distance allowed between detected corners to be 7.

# Shi-Tomasi Parameters
shitomasi_params = dict(maxCorners=100, qualityLevel=0.3, minDistance=7)

In Lucas-Kanade parameters, we have a window of size 15 x 15 which refer to the size of the window or neighborhood used for computing the optical flow and 2 for the maximum pyramid level used for the multi-resolution approach.

# Lucas Kanade Parameters
lk_params = dict(winSize=(15,15), maxLevel=2, criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

We will read the first frame of our video in grayscale and get the best features or corners using the goodFeaturesToTrack function which is an implementation of the Shi-Tomasi corner detector.

# Get first frame
frame_gray_init = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Get features from Shitomasi
edges = cv2.goodFeaturesToTrack(frame_gray_init, mask=None, **shitomasi_params)

Then we need to use these points (edges) detected and feed them into the calcOpticalFlowPyrLK function and track them (new_edges). The function detects and tracks feature points in the subsequent image frame based on their initial positions in the previous frame.

    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_edges, status, error = cv2.calcOpticalFlowPyrLK(frame_gray_init, frame_gray, edges, None, **lk_params)

When we get the coordinates of a corner in the previous and next frame, we can calculate its displacement. Using this information we can create a condition whereby we track only moving features or visualize them in another color as in the example below.

    # Go through each matched feature, and draw it on the second image
    for new, old in zip(good_new, good_old):
        a, b = new.ravel() # Current corner coordinates
        c, d = old.ravel() # Previous corner coordinates

        displacement_x = a - c
        displacement_y = b - d

Using a certain threshold, moving corners are red while static ones are yellow:

Untitled.video.-.Made.with.Clipchamp.1.mp4

5. Dense Optical Flow

In sparse optical flow, we computed the optical flow only for a set of features. In dense optical flow, we will consider all pixels in our frame. The computation will be slower but we will get more accurate results.

The main concept of this method involves approximating the neighboring pixels of each pixel using a polynomial function. Recall that in the Lucas-Kanade method, we previously used a linear approximation that relied on a first-order Taylor's expansion. Now, we aim to improve the accuracy of the approximation by incorporating second-order values.Several methods are available such as:

  • Dense Pyramid Lucas-Kanade
  • Farneback
  • Robust Local Optical Flow (RLOF)

The output of the dense optical flow algorithm can be visualized using the HSV color scheme. By employing the cv2.cartToPolar function, we can convert the displacement coordinates (dx, dy) of each pixel into polar coordinates, representing the magnitude and angle for that pixel. In this visualization scheme, we map the angle to the Hue channel and the magnitude to the Value channel, while keeping the Saturation channel constant. Hence, an object which moves faster will appear to be brighter, and depending on the direction they are moving, it will have different colors based on the color wheel below:

Fig 11. HSV color scheme

Below is an example of the Farneback algorithm:

    # Farneback function
    flow = cv2.calcOpticalFlowFarneback(frame_gray_init, frame_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)

We calculate the magnitude and angle of each vector and map them to the Hue and Value channel of the HSV color scheme.

    # Get magnitude and angle of vector
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1]) #x #y

    # Hue
    hsv[..., 0] = (angle * 180 / (np.pi / 2))

    # Value: Intensity
    hsv[..., 2] = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX)

Below is an example of me running towards the camera. The Dense Lucas-Kanade algorithm outperforms the other two algorithms. While Farneback captures the outline of the shirt, Dense Lucas-Kanade provides a more detailed representation of the entire object. Although RLOF exhibits some noise in the output, it still demonstrates a notable optical flow mapping.

walk_5_of_dense_combined_titles.mp4

This video of me running away from the camera seems to show more or less the same results as the one above.

walk_3_of_dense_combined_titles.mp4

In the presence of an obstacle, the Dense Lucas-Kanade and RLOF algorithms show noticeable noise in scenes with no motion. However, the Farneback algorithm produces significantly less noise, enabling a clear outline of the obstacle even with minimal movement. For videos with small motions, RLOF and Dense Lucas-Kanade may not be suitable, while Farneback performs better.

Obstacle_2_of_dense_combined_titles.4.mp4

In a video where a drone is moving towards and away from a tree, recorded under challenging lighting conditions with numerous shadows, the Dense Lucas-Kanade algorithm appears to perform better. It accurately captures the motion, especially when the drone is approaching the tree at high speed. On the other hand, the Farneback algorithm exhibits a granular output that can be considered redundant. RLOF algorithm demonstrates poor performance under these conditions.

Obstacle-3-of-dense-combined-tit.mp4

Please note that I have not conducted a thorough analysis to benchmark these algorithms. My comments and observations are solely based on a visual examination of the results. To accurately assess their performance, further testing and analysis would be required to quantify their effectiveness.


6. Obstacle Avoidance

Now, we need to utilize the output from the Lucas-Kanade method in order to devise an Obstacle Avoiding Algorithm. We will test our solution on the DJI Tello drone. We will first assume a simple scenario whereby the drone is approaching an obstacle head-front. Based on some criteria, we want our drone to turn either left or right. However, notice that we also have an unwanted object in our background which may perturb our system.

Fig 12. The obstacle the drone will need to avoid plus an unwanted object in the background.

6.1 Obstacle Avoidance with Sparse Optical Flow

We try to implement the algorithm step by step:

  1. In my first try, I tried to display the important features on a mask. We can see the outline of the obstacle but also some features were extracted from the object in the background.

  2. In the second iteration, I divided the mask into two - the vertical line. Here, I wrote a condition on the steering such that if we have more features on the right than on the left then turn left else turn right. However, notice that the object in the background can still bias the system.

  3. On the 3rd try, I created a Region of Interest (ROI) - black rectangle - such that we only want to detect features within our ROI. Now, the object in the background is no longer a problem.

  4. Finally, I modified the code such that only the features in the ROI will be displayed. I am also calculating the sum of vector magnitudes in both halves. Below is the code:

            # Iterate through each trajectory
            for trajectory in trajectories:
                # Get the x and y coordinates of the current and previous points
                x, y = trajectory[-1]
                prev_x, prev_y = trajectory[-2]

                # Check if the current point is within the ROI
                if roi_x <= x <= roi_x + roi_width and roi_y <= y <= roi_y + roi_height:
                    # Calculate the magnitude of the optical flow vector
                    magnitude = np.sqrt((x - prev_x) ** 2 + (y - prev_y) ** 2)

                    # Check if the current point is on the left or right side
                    if x < midpoint:
                        total_magnitude_left += magnitude
                    else:
                        total_magnitude_right += magnitude

Fig 13. Pyramidal Lucas-Kanade (LK) Optical Flow is an algorithm that estimates the movement of sparse feature points between frames.

In summary: the scenario involves a figure divided into two quadrants, each with its own direction for flow vectors. This observation is useful for obstacle avoidance. To extract features, a black rectangle representing a predefined patch within the image is focused on. The purpose of this patch is to detect obstacles directly in front of the drone, disregarding objects outside this line of sight. The vertical black line divides the patch into left and right sections. By calculating the sum of vector magnitudes in both halves, the presence and magnitude of obstacles in each direction can be determined.

Now, we need to test it in real time on our drone. Below are some insights:

  • Sparse Feature Detection: Lucas-Kanade relies on sparse feature detection. It tracks a limited number of specific points or corners in the image. When dealing with complex scenes such as the tree outside in the video, it results in incomplete or inaccurate obstacle detection.

  • Limited Robustness to Illumination Changes: The algorithms can be sensitive to changes in lighting conditions. Illumination variations can affect the accuracy of feature detection and tracking, leading to unreliable results in different lighting conditions.

  • Difficulty in Handling Large Displacements: The Lucas-Kanade method assumes that the motion between frames is small, which limits its effectiveness in scenarios with large displacements. When objects move significantly between frames, the assumption of small motion breaks down, and the accuracy of the method decreases.

obs123-made-with-clipchamp_aqDUUVxJ.1.mp4
  • Lack of Robustness to Textureless Regions: When detecting corners or features in the image, they may struggle in textureless or low-texture regions where distinctive features are sparse or non-existent.

  • Computational Complexity: The Lucas-Kanade method involves iterative calculations to estimate the motion vectors, which can be computationally expensive. In real-time applications, this computational complexity can limit the system's ability to process frames in a timely manner, affecting the overall performance of the obstacle avoidance system.

Overall, while Lucas-Kanade and Shi-Tomasi algorithms provide valuable techniques for optical flow-based obstacle avoidance, their limitations should be considered when applying them to real-world scenarios.

6.2 Obstacle Avoidance with Dense Optical Flow

The outputs that we get from the Dense optical flow algorithm are the magnitude and angle of the vectors which we mapped to the value and hue of the HSV color scheme respectively. Objects that move faster will be brighter hence, have a high-intensity value. Objects that are close to the camera will also appear to move faster due to motion parallax hence, we can use this information too. From the color scheme above, we can see that "green" and "orange-yellow" will be mapped to objects that will be moving toward the camera. We will use this information in order to design our obstacle avoidance algorithm.

Dense optical flow is already computationally expensive hence, I did not want to use CNN or other Deep Learning methods to check for obstacles. I want to rely on image analysis in order to detect the obstacles in each frame. Based on the detection, we will be able to devise the control for the drone in order to avoid the obstacle.

  1. To extract important intensity information and disregard hue, I converted the image to grayscale.

  2. To reduce noise and emphasize important features, I applied a Gaussian blur to the image.

  3. Using Otsu's thresholding method, I separated the image into foreground and background.

  4. By dilating the thresholded image, I expanded the white region to highlight the foreground.

  5. I utilized connected components to label and extract connected regions in the binary image.

  6. After identifying the connected regions, I filtered out small regions based on their area.

  7. Finally, a bounding box was drawn around the regions whose area met the specified threshold.

def plot_image_threshold(gray_image, method, threshold=150):

    # apply gaussian blur
    gray_image = cv2.GaussianBlur(gray_image, (7, 7), 0)

    if method == cv2.threshold:
        if threshold != 0:
            # apply binary thresholding
            T, img_thresh = method(gray_image, threshold, 255, cv2.THRESH_BINARY) # THRESH_BINARY_INV
            print(T)
        else:
            # apply binary thresholding
            T, img_thresh = method(gray_image, threshold, 255, cv2.THRESH_BINARY| cv2.THRESH_OTSU) # THRESH_OTSU # THRESH_BINARY # THRESH_BINARY_INV
            print(T)
    else:
        # apply adaptiveThreshold thresholding
        img_thresh = method(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, threshold, 10) # ADAPTIVE_THRESH_MEAN_C # ADAPTIVE_THRESH_GAUSSIAN_C

    # Apply mask to remove background
    # img_thresh = cv2.bitwise_and(gray_image, gray_image, mask=img_thresh) # original image with no background
    return img_thresh
def plot_image_dilation(img_thresh):
    ## apply a series of dilations
    #img_dilated = cv2.erode(img_thresh.copy(), np.ones((10, 10), np.uint8), iterations=2)
    img_dilated = cv2.dilate(img_thresh.copy(), np.ones((10, 10), np.uint8), iterations=4)

    return img_dilated

Below are the results of the image analysis on one frame:

Fig 14. The process flow of the different image analysis operations performed on each frame.

def plot_image_connected(img_dilated, image):
    # Perform connected component analysis
    output = cv2.connectedComponentsWithStats(img_dilated, 4, cv2.CV_32S)
    (num_labels, labels, stats, centroids) = output

    # Create a copy of the original image to draw rectangles on
    img_with_rectangles = np.copy(img_dilated)
    img_copy = np.copy(image)

    # Iterate over each component (excluding background label 0)
    for label in range(1, num_labels):
        # Get the statistics for the current component
        left = stats[label, cv2.CC_STAT_LEFT]
        top = stats[label, cv2.CC_STAT_TOP]
        width = stats[label, cv2.CC_STAT_WIDTH]
        height = stats[label, cv2.CC_STAT_HEIGHT]
        area = stats[label, cv2.CC_STAT_AREA]
        print("Area: ", area)

        # Area to keep
        keepArea = 100000 < area < 300000

        # If True
        if keepArea:
            print("[INFO] keeping connected component '{}'".format(label))
            # Draw a rectangle around the current component
            cv2.rectangle(img_with_rectangles, (left, top), (left + width, top + height), (255, 255, 255), 2)
            cv2.rectangle(img_copy, (left, top), (left + width, top + height), (255, 255, 255), 2)

    return img_copy

Using the bounding box coordinates from the connected components process, I superimpose it on my original flow map:

Fig 15. Object detection using simple image analysis only.

Here's the image analysis process on the whole video of a moving obstacle:

hsv.-.Made.with.Clipchamp.mp4

In the first few frames, we have no detection but when the intensity of the flow map is more apparent, it does a pretty good job at identifying the object in motion in the frame.

I also tested it on a video of the drone flying towards the tree (static obstacle) using the Dense Lucas-Kanade and Farneback method. We have a lot of instances whereby the ground is being detected and this is because the ground being closer to the camera has a higher speed hence, higher intensity. These are counted as false positives.

Tree_Combined_titles.mp4

But when the drone approaches the tree, we successfully detect the obstacle and can draw a bounding box to it. Although, in a real-case scenario, we might want to detect the obstacle earlier. I believe this method, though computationally expensive, is a good system to track moving obstacles and not static ones.


7. Conclusion

In conclusion, our study has highlighted the advantages and trade-offs of sparse and dense optical flow techniques for obstacle detection and avoidance in UAV applications. Sparse optical flow offers real-time processing and computational efficiency, making it suitable for quick obstacle detection. However, it may lack the level of accuracy required to capture fine-grained motion details. On the other hand, dense optical flow provides superior accuracy in capturing detailed motion information, albeit at the expense of increased computational complexity and processing time.

Further Improvements

  1. We can also incorporate the concept of focus of expansion (FOE) which can improve obstacle avoidance capabilities. FOE refers to the point in the visual field where objects appear to be approaching or receding. By utilizing FOE information, our system can determine the direction of object motion relative to the UAV and adjust its flight path accordingly, actively avoiding potential collisions.
drone-1-foe.mp4
  1. Secondly, leveraging clustering techniques on the dense optical flow map can enhance the system's ability to distinguish and track multiple moving objects simultaneously. By grouping similar motion patterns together, clustering can provide a more comprehensive understanding of the scene dynamics and enable better obstacle detection and avoidance.

Fig 16. K-means clustering using a flow map of dense optical flow.

Looking ahead, we aim to further enhance our system by incorporating deep optical flow using deep learning models such as FlowNet or RAFT. These models have shown remarkable performance in estimating optical flow fields using deep neural networks.


References

[1] NanoNets. (n.d.). Optical Flow Explained: A Comprehensive Guide with Python Implementation. Webpage. https://nanonets.com/blog/optical-flow/

[2] First Principles of Computer Vision. (n.d.). Computer Vision Basics - Optical Flow. YouTube playlist. https://www.youtube.com/watch?v=lnXFcmLB7sM&list=PL2zRqk16wsdoYzrWStffqBAoUY8XdvatV&ab_channel=FirstPrinciplesofComputerVision

[3] NanoNets. (n.d.). Optical Flow Using Deep Learning. Webpage. https://nanonets.com/blog/optical-flow/#optical-flow-using-deep-learning

[4] MathWorks. (n.d.). Obstacle Avoidance with Camera Sensor using Simulink. [File Exchange]. https://www.mathworks.com/matlabcentral/fileexchange/94095-obstacle-avoidance-with-camera-sensor-using-simulink

[5] Medium. (n.d.). What is Optical Flow and Why Does It Matter in Deep Learning? Webpage. https://medium.com/swlh/what-is-optical-flow-and-why-does-it-matter-in-deep-learning

[6] Towards Data Science. (n.d.). A Brief Review of FlowNet. Webpage. https://towardsdatascience.com/a-brief-review-of-flownet-dca6bd574de0

[7] Zain Mehdiblog. (n.d.). Vision Based Obstacle Avoidance. Webpage. https://zainmehdiblog.wordpress.com/vision-based-obstacle-avoidance/

[8] GitHub. (n.d.). tfoptflow: TensorFlow Implementation of Optical Flow. README.md. https://github.com/philferriere/tfoptflow/blob/master/README.md

[9] Data Hacker. (n.d.). Calculating Sparse Optical Flow using Lucas-Kanade Method. Webpage. https://datahacker.rs/calculating-sparse-optical-flow-using-lucas-kanade-method/

[10] LearnOpenCV. (n.d.). Optical Flow using Deep Learning: RAFT. Webpage. https://learnopencv.com/optical-flow-using-deep-learning-raft/

[11] Towards AI. (n.d.). ECCV 2020 Best Paper Award: A New Architecture for Optical Flow. Webpage. https://pub.towardsai.net/eccv-2020-best-paper-award-a-new-architecture-for-optical-flow-3298c8a40dc7

[12] LearnOpenCV. (n.d.). Optical Flow in OpenCV. Webpage. https://learnopencv.com/optical-flow-in-opencv/

[13] PRG AERO. (n.d.). Vision-Based Obstacle Avoidance. Webpage. https://prgaero.github.io/2019/proj/p4b/#rviz

[14] PRG CS UMD. (n.d.). ENAE 788M: Flight Dynamics and Control. [Course]. http://prg.cs.umd.edu/enae788m

[15] Coursera. (n.d.). Robotics Perception: 3D Velocities from Optical Flow. Lecture. https://www.coursera.org/learn/robotics-perception/lecture/DgSNW/3d-velocities-from-optical-flow

[16] Kaggle. (n.d.). Optical Flow Estimation using RAFT. Code. https://www.kaggle.com/code/daigohirooka/optical-flow-estimation-using-raft

[17] PyImageSearch. (2021, May 12). Adaptive Thresholding with OpenCV. Webpage. https://pyimagesearch.com/2021/05/12/adaptive-thresholding-with-opencv-cv2-adaptivethreshold/

[18] PyImageSearch. (2021, April 28). OpenCV Morphological Operations: Blurring, Erosion, Dilation, and More. Webpage. https://pyimagesearch.com/2021/04/28/opencv-morphological-operations/

About

In this project I aim to develop an unsupervised sense-and-avoid system for UAVs using sparse and dense optical flow.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages