4

I want to retrieve the price of the flight of this webpage using Python 3: https://www.google.es/flights?lite=0#flt=/m/0h3tv./m/04jpl.2018-12-17;c:EUR;e:1;a:FR;sd:1;t:f;tt:o

At first I got an error which after many hours I realized was due to the fact that I wasn't giving the webdriver enough time to load all elements. So to ensure that it had enough time I added a time.sleep like so:

time.sleep(1)

This made it work! However, I've read and was advised to not use this solution and to use WebDriverWait instead. So after many hours and several tutorials im stuck trying to pinpoint the exact CSS class the WebDriverWait should wait for.

The closest I think I've got is:

WebDriverWait(d, 1).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".flt-subhead1.gws-flights-results__price.gws-flights-results__cheapest-price")))

Any ideas on what I'm missing on?

2 Answers 2

6

You could use a css attribute = value selector to target, or if that value is dynamic you can use a css selector combination to positional match.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.google.es/flights?lite=0#flt=/m/0h3tv./m/04jpl.2018-12-17;c:EUR;e:1;a:FR;sd:1;t:f;tt:o")

#element = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR , '[jstcache="9322"]')))
element = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR , '.flt-subhead1.gws-flights-results__price.gws-flights-results__cheapest-price span + jsl')))
print(element.text)
#driver.quit()

No results case:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()
url ="https://www.google.es/flights?lite=0#flt=/m/0h3tv./m/04jpl.2018-12-17;c:EUR;e:1;a:FR;sd:1;t:f;tt:o"  #"https://www.google.es/flights?lite=0#flt=/m/0h3tv./m/04jpl.2018-11-28;c:EUR;e:1;a:FR;sd:1;t:f;tt:o"
driver.get(url)

try:
    status = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR , 'p[role=status')))
    print(status.text)
except TimeoutException as e:
    element = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR , '.flt-subhead1.gws-flights-results__price.gws-flights-results__cheapest-price span + jsl')))
    print(element.text)
#driver.quit()
Sign up to request clarification or add additional context in comments.

9 Comments

Hi QHarr, isn't the jstcache value generated dinamically and as such would change over time?
element = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR , '.flt-subhead1.gws-flights-results__price.gws-flights-results__cheapest-price span + jsl')))
Am getting this error: "selenium.common.exceptions.TimeoutException: Message:" so probably not locating the element
increase the wait to 10? element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR , '.flt-subhead1.gws-flights-results__price.gws-flights-results__cheapest-price span + jsl')))
Yeah my bad. The thing is im running the link through a while loop to get the results for the whole year and I accidentally posted the link for a day next month instead of the link which im using which is for today. Thanks for your help, I accepted your other answer aswell.
|
1

I may be wrong but I think you are trying to get the price of the flight trip.

If my assumption is correct, take a look at my approach. I find the Search Results list, then all the Itinerary inside the Search Results list, loop over and get all the price information. This is the best approach I can come up with and avoiding all the dynamic attributes

from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

wait = 20

driver = Chrome()
driver.get("https://www.google.es/flights?lite=0#flt=/m/0h3tv./m/04jpl.2018-12-17;c:EUR;e:1;a:FR;sd:1;t:f;tt:o")

# Get the Search Result List
search_results= WebDriverWait(driver, wait).until(EC.presence_of_element_located((By.CSS_SELECTOR , 'ol[class="gws-flights-results__result-list"]')))

# loop through all the Itinerary
for result in search_results.find_elements_by_css_selector('div[class*="gws-flights-results__collapsed-itinerary"]'):
    price = result.find_element_by_css_selector('div[class="gws-flights-results__itinerary-price"]')
    print(price.text)

Output €18

2 Comments

Could you please elaborate on what you mean by trying to get the price from the URL?
My bad, I meant to say flight price.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.