Web scraping with Python, BeautifulSoup

Question

I have a problem parsing links with Python. Here is my code:

def get_content(html):
    soup = BeautifulSoup(html, 'lxml')
    items = soup.find_all('div', class_='grid-item___eaXVb')

    for item in items:
        link = item.find('a', class_='gl-product-card__details-link')
        print(link.get('href'))

And I get this error:

Traceback (most recent call last):
  File "parser.py", line 32, in <module>
    parse()
  File "parser.py", line 27, in parse
    get_content(html.text)
  File "parser.py", line 21, in get_content
    print(link.get('href'))
AttributeError: 'NoneType' object has no attribute 'get'

But when I try this:

    for item in items:
        link = item.find('a', class_='gl-product-card__details-link')
        print(type(link))

I get output showing that every link has this type:

<class 'bs4.element.Tag'>
<class 'bs4.element.Tag'>
...
...
...
<class 'bs4.element.Tag'>
<class 'bs4.element.Tag'>

Where did I make a mistake? What's wrong?

What is your expected result? Why do you have to use type(link)?
– Sureshmani Kalirajan, Commented Jun 15, 2020 at 14:07

Andrej Kesely · Accepted Answer · 2020-06-15 14:35:39Z

To get all product titles and links, you can use this example:

import requests
from bs4 import BeautifulSoup


url = 'https://www.adidas.com/us/men-shoes?price=price%3C50.0'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}

soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

for a in soup.select('div[class^="product-container"] a.gl-product-card__media-link'):
    label = a.find_next(class_='gl-label')
    print('{:<50} {}'.format(label.text, 'https://www.adidas.com' + a['href']))

Prints:

Adilette Lite Slides                               https://www.adidas.com/us/adilette-lite-slides/FU8299.html
Adilette Aqua Slides                               https://www.adidas.com/us/adilette-aqua-slides/F35550.html
U_Path Run Shoes                                   https://www.adidas.com/us/u_path-run-shoes/EE4466.html
adiease Shoes                                      https://www.adidas.com/us/adiease-shoes/BY4027.html
Nizza RF Slip-on Shoes                             https://www.adidas.com/us/nizza-rf-slip-on-shoes/EF1410.html
Adilette Slides                                    https://www.adidas.com/us/adilette-slides/280647.html
Goletto VII Turf Shoes                             https://www.adidas.com/us/goletto-vii-turf-shoes/FV8703.html
Adilette Comfort Slides                            https://www.adidas.com/us/adilette-comfort-slides/FW5337.html
Adilette Comfort Slides                            https://www.adidas.com/us/adilette-comfort-slides/FW5353.html
Adizero Spark MD Cleats                            https://www.adidas.com/us/adizero-spark-md-cleats/EF3476.html
CP Traxion Spikeless Shoes                         https://www.adidas.com/us/cp-traxion-spikeless-shoes/EE9206.html
CP Traxion Spikeless Shoes                         https://www.adidas.com/us/cp-traxion-spikeless-shoes/BB7900.html
CP Traxion Spikeless Shoes                         https://www.adidas.com/us/cp-traxion-spikeless-shoes/BD7138.html
CP Traxion Spikeless Shoes                         https://www.adidas.com/us/cp-traxion-spikeless-shoes/F34996.html
Adilette Lite Slides                               https://www.adidas.com/us/adilette-lite-slides/FU8296.html
Afterburner 6 Grail MD Cleats                      https://www.adidas.com/us/afterburner-6-grail-md-cleats/DB3106.html
Lite Racer CLN Shoes                               https://www.adidas.com/us/lite-racer-cln-shoes/EE8138.html

... and so on.
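
As for the AttributeError in the question: it means item.find(...) returned None for at least one item. The type output in the question is elided, so this is an assumption, but a likely cause is that some grid items (a banner, for example) do not contain an anchor with that class. Below is a minimal sketch that skips such items. SAMPLE_HTML and get_links are hypothetical names used only for illustration, and the markup is invented to stand in for the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: the second grid item
# has no product link, so find() returns None for it.
SAMPLE_HTML = """
<div class="grid-item___eaXVb"><a class="gl-product-card__details-link" href="/us/a.html">A</a></div>
<div class="grid-item___eaXVb"><span>banner without a link</span></div>
<div class="grid-item___eaXVb"><a class="gl-product-card__details-link" href="/us/b.html">B</a></div>
"""


def get_links(html):
    """Return the href of every product link, skipping items without one."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    for item in soup.find_all('div', class_='grid-item___eaXVb'):
        link = item.find('a', class_='gl-product-card__details-link')
        if link is None:
            # find() found no matching <a> in this item; calling .get()
            # on None is what raises the AttributeError.
            continue
        links.append(link.get('href'))
    return links


print(get_links(SAMPLE_HTML))
```

If the real page behaves the same way, adding the `if link is None` check to the original loop should be enough to get past the error.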
