#Web Scraper - Save Image on new tab, close, then resume code


vital current
#

I'm building a web scraper with Python that collects data on baseball cards. I'm an Information Systems Major and haven't really coded much. I'll admit that a lot of my code has been built through ChatGPT.

When running my scraper, I realized that I was saving the thumbnail of the image and not the high quality image. Instead, I want the scraper to click the element, which opens the image in a new tab, and save the image from there. After it's done saving, I want it to close the tab and continue scraping other cards. I am not sure how to approach this.

A database with sqlite3 is being used.

My code: https://paste.pythondiscord.com/COCQ
May need 'images' folder

I believe these lines might be where I should look:
Line 111 - It's saving the image here. I think clicking the element might belong here
Line 127 - That's the element it should click, minus the trailing /img in the XPath

I know it's a lot but any help is appreciated. This is pretty much the last thing I need for my scraper
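For the approach described in the question (click, switch to the new tab, grab the image, close the tab, go back), a minimal sketch could look like the following. It assumes `driver` is a Selenium WebDriver and `link_element` is the clickable `<a>` element; the function name `url_from_new_tab` and the polling timeout are hypothetical choices, not taken from the pasted code. It only uses `window_handles`, `current_window_handle`, `switch_to.window`, `current_url`, and `close`.

```python
import time


def url_from_new_tab(driver, link_element, timeout=10.0):
    """Click a link that opens a new tab, return that tab's URL,
    then close the tab and switch back to the original one."""
    original = driver.current_window_handle
    before = set(driver.window_handles)

    link_element.click()

    # Poll until a window handle appears that was not there before the click.
    deadline = time.monotonic() + timeout
    new_handles = []
    while time.monotonic() < deadline:
        new_handles = [h for h in driver.window_handles if h not in before]
        if new_handles:
            break
        time.sleep(0.1)
    if not new_handles:
        raise TimeoutError("no new tab opened after the click")

    driver.switch_to.window(new_handles[0])
    try:
        # If the tab shows the image directly, its URL is the image URL.
        return driver.current_url
    finally:
        # Always close the extra tab and return to the scraping tab.
        driver.close()
        driver.switch_to.window(original)
```

The returned URL can then be downloaded with whatever the scraper already uses to save images, and scraping continues in the original tab.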

gaunt osprey
#

it looks like they have identical URLs, but instead of ending with 60.jpg it ends with 240.jpg

vital current
#

I think the one I'm going for should be 1600

#

The first card should be this one:

gaunt osprey
#

can you send me a page where it has that resolution? the ones I see are not that big

vital current
gaunt osprey
#

oh try 1600.jpg

vital current
#

How do I do that?

gaunt osprey
#

you mean in general

#

or in the code?

#

so right now you have these images being extracted from what you said I think

vital current
#

In the code. I only targeted the element to save. Sorry if I'm not understanding

gaunt osprey
#

so in the url replace 60.jpg with 1600.jpg

#

you can use .replace("60.jpg", "1600.jpg") on the url in the code

#
main_image_url = driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div/div/div[8]/div[1]/div/a/img').get_attribute('src')
main_image_url = main_image_url.replace("60.jpg", "1600.jpg")
#

you can try that on line 127

vital current
#

Ok

#

Sorry for taking a bit, I accidentally changed something somewhere

#

Is there a way to see my history in VSC?

gaunt osprey
#

not sure

vital current
#

Yeah I keep getting an error now but I don't know what I changed

gaunt osprey
#

oh that's no good

#

wait

#

you have a copy of your file

#

that you uploaded

#

to show everyone

vital current
#

Oh true, thank you!

#

It still gives a small image

#

Oh wait, the image is 240, I'll change that
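Because the thumbnail suffix turned out to be 240 rather than 60, a hard-coded `.replace("60.jpg", ...)` silently does nothing on those URLs, and it would also mangle a URL ending in `160.jpg` into `11600.jpg`. A small sketch that swaps whatever number precedes `.jpg` at the end of the URL is shown below. It assumes the image URLs end in `<size>.jpg` as seen in this thread; the function name `full_size_url` is hypothetical.

```python
import re

# Matches a run of digits followed by ".jpg" at the very end of the URL.
_SIZE_SUFFIX = re.compile(r"\d+\.jpg$")


def full_size_url(url, size=1600):
    """Return `url` with its trailing size number replaced by `size`.

    URLs that do not end in digits + ".jpg" are returned unchanged.
    """
    return _SIZE_SUFFIX.sub(f"{size}.jpg", url)
```

With this, `main_image_url = full_size_url(main_image_url)` works the same whether the scraped `src` ends in 60.jpg or 240.jpg.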

gaunt osprey
#

ok

vital current
#

It worked!

gaunt osprey
#

congrats

vital current
#

Thank you so much!

gaunt osprey
#

np

vital current
#

Really really appreciate it. Thank you!!!