Order of photos downloaded #25

Open
JaimyvS opened this issue May 6, 2024 · 1 comment

Comments

JaimyvS commented May 6, 2024

Hello,

Thanks for the great script. It really helps to quickly build datasets for training. But I have a question about the images that are being downloaded.

Are these sorted in a particular order before being downloaded? If I download 50 photos and then download 50 more, will these be the same photos, or are they randomly chosen? I ask because I may later want to add more photos to a category.

Kind regards,

Jaimy van Schelven

pderrenger (Member) commented Jun 9, 2024

@JaimyvS hello Jaimy,

Thank you for your kind words about the script! We're glad to hear that it's been helpful for you in building your datasets. 😊

Regarding the order of the downloaded images: they are fetched according to the criteria set in the script, such as search keywords, sources, and any applied filters. The resulting order can vary with these criteria and with the source's current state, so two runs are not guaranteed to return the same images in the same order.

If you download 50 photos and then download 50 more, there is a possibility of overlap unless the script is designed to track and avoid duplicates. To ensure you get unique images each time, you might want to implement a mechanism to keep track of already downloaded images or use a source that provides a unique set of images each time.
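As one possible way to track downloads across runs, the sketch below keeps a small JSON manifest of source URLs that have already been used. It is an illustration only, not how the script currently behaves: the function name `select_new_urls`, the manifest file, and the idea that you have a list of candidate URLs are all assumptions.

```python
import json
from pathlib import Path


def select_new_urls(candidate_urls, manifest_path, num_images):
    """Return up to num_images URLs not yet in the manifest, then record them.

    Hypothetical helper: candidate_urls is assumed to be whatever list of
    image URLs your search step produces.
    """
    manifest = Path(manifest_path)
    # Load URLs recorded by earlier runs; start empty if no manifest exists yet
    seen = set(json.loads(manifest.read_text())) if manifest.exists() else set()

    new_urls = []
    for url in candidate_urls:
        if len(new_urls) >= num_images:
            break
        if url not in seen:
            seen.add(url)
            new_urls.append(url)

    # Persist the updated record so the next run skips these URLs too
    manifest.write_text(json.dumps(sorted(seen)))
    return new_urls
```

With a manifest like this, a second request for 50 images skips the 50 URLs recorded by the first request, even if the source returns its results in a different order.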

If you need to add more photos to a category later, you could modify the script to check against a list of already downloaded images to avoid duplicates. Here's a simple example of how you might approach this in Python:

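import os

def fetch_image(category):
    # Placeholder: replace with actual image fetching logic.
    # It should return a filename comparable to the entries from os.listdir().
    raise NotImplementedError

def download_images(category, num_images, downloaded_images, max_attempts=None):
    """Collect up to num_images new images, skipping ones already downloaded."""
    new_images = []
    # Cap the attempts so the loop ends even if the source keeps returning duplicates
    max_attempts = max_attempts or num_images * 10
    for _ in range(max_attempts):
        if len(new_images) >= num_images:
            break
        image = fetch_image(category)
        if image not in downloaded_images:
            new_images.append(image)
            downloaded_images.add(image)
    return new_images

# Example usage
downloaded_images = set(os.listdir('path_to_downloaded_images'))
new_images = download_images('category_name', 50, downloaded_images)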

This way, you can maintain a set of already downloaded images and ensure that new downloads are unique.
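Filenames alone can miss duplicates when the same picture is saved under a different name. A minimal sketch of comparing file contents instead is shown below, using SHA-256 digests from Python's standard `hashlib`. The helper names `file_digest`, `index_folder`, and `is_new_image` are hypothetical, and this only catches byte-identical files, not resized or re-encoded copies.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def index_folder(folder):
    """Hash every file already present in the folder."""
    return {file_digest(p) for p in Path(folder).iterdir() if p.is_file()}


def is_new_image(image_bytes, known_digests):
    """Return True and record the digest if these bytes were not seen before."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    if digest in known_digests:
        return False
    known_digests.add(digest)
    return True
```

You would build the index once per category folder with `index_folder`, then call `is_new_image` on each downloaded image's bytes before writing it to disk.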

If you encounter any issues or have further questions, please provide more details or a minimum reproducible code example so we can assist you better. You can find more information on creating a minimum reproducible example here.
