Navigating Tags – Web scraping with Beautiful Soup 4 p.2
What's going on, everybody? Welcome to the second Beautiful Soup tutorial in our little Beautiful Soup mini-series. In this tutorial, what we're going to be doing is slightly more specific: getting something really specific from a page. You can sort of think of it like navigation, but it's not totally like the navigation that some other libraries have. So anyway, let's get into it.

I'm going to go ahead and delete everything up to the soup. First, let's say our objective was to get all the URLs, but really we just wanted a specific set of URLs. You could write for url in soup.find_all('a'), and then print them out. We could go that route. But maybe this is a website we actually want to start navigating around, so rather than taking the URLs that are in the body text of that website, we want the ones in the nav bar.

Remember, we can do soup.p, and that gives us the first paragraph tag. In the same way, we can do soup.nav, and that will be the first nav that it comes across. So nav = soup.nav. Just to see what we've got, let's print nav, and sure enough, it's basically all of the nav. Cool, so we'll close that.

Now we want to find the URLs. I shouldn't have deleted that other loop, but anyway: for url in nav.find_all('a'), we find all the a tags, and then we print url.get('href'). This should give us all the links that it finds in the nav bar. So here are all the links.
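As a minimal sketch of the nav bar steps described above: the HTML below is a hypothetical stand-in for the downloaded page (the real markup differs), and the built-in "html.parser" is an assumption, since the parser used in the video is not shown here. The None check is an addition for pages that have no nav tag.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup differs.
SAMPLE_HTML = """
<html><body>
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/about/">About</a></li>
  </ul>
</nav>
<div class="body">
  <p>Some text with <a href="/in-body/">a link in the body</a>.</p>
</div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Every link on the page, including the one in the body text.
all_links = [a.get("href") for a in soup.find_all("a")]

# soup.nav is the first <nav> tag, or None if the page has no <nav>.
nav = soup.nav

# Searching from nav only looks between the opening and closing nav tags.
nav_links = [a.get("href") for a in nav.find_all("a")] if nav is not None else []

print(all_links)
print(nav_links)
```

Because the search starts from nav rather than from soup, the link inside the body text is left out of nav_links.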
It's finding basically double the links because, let me go to the website, we basically have two nav bars on this website. This is one of the nav bars: the mobile, or small-screen, version that's on the side here. (How do I make this go? There we go. I control my own website, anyway.) If we make the window bigger, the nav bar is on top. But the code for that other nav bar is still there in the page, so that's why it's actually being found. Anyway, I'll close that. So now we have all the links from the nav bar.

Again, in many cases you could also do something like this: body = soup.body, and then for paragraph in body.find_all('p'), print paragraph.text. Actually, that worked out better than I thought it was going to. I wasn't expecting that. I thought we had two body tags, and I wasn't expecting it to actually find that. On a lot of websites, the content will be within a body... actually, you know what, I think I removed the body tag. Now that I'm thinking about it, yeah, I removed the second body tag before I went into here.
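The body and paragraph loop mentioned above might look like the following sketch. The sample HTML is made up for illustration, and "html.parser" is again an assumption rather than the parser used in the video.

```python
from bs4 import BeautifulSoup

# Hypothetical page; only the structure matters for this example.
SAMPLE_HTML = """
<html><head><title>Demo</title></head>
<body>
<nav><a href="/">Home</a></nav>
<p>First paragraph.</p>
<div><p>Second paragraph, nested in a div.</p></div>
<span>Not in a paragraph tag.</span>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# soup.body is the first <body> tag in the document.
body = soup.body

# find_all searches all descendants, so the nested paragraph is found too.
paragraphs = [paragraph.text for paragraph in body.find_all("p")]

for text in paragraphs:
    print(text)
```

Text that sits outside of any p tag, such as the span here, does not show up in the output.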
That's what happened. Anyway, that's one option that you have: we were able to get just the text from the body. In most scraping cases, if you're just scraping a website and looking for some basic text data, this should be more than enough to get that body data. If you're scraping for content, that's all you really need to do.

The other thing you could do: the reason I got rid of the body tag is that I added this div with class="body". Sometimes you'll see something like this, because there might be multiple body tags, or there might be a really specific kind of div section that you're hoping to parse. So let's show an example there.

What you might think to do is: for div in soup.find_all('div'), print div.text. We'll run that, and you're getting all of the div tag text, so that's kind of a pain. One thing that Beautiful Soup allows us to do is ask for a div but also specify a class. You use class_, with an underscore, since class is a reserved word in Python. So we pass class_='body' there, keep the same for div loop, and I'm going to run that really quick. Right, so that's good enough; that gives us exactly what we were hoping for.

Remember, the earlier one was for paragraph in body.find_all('p'), so that one is not going to find the stuff that's not within paragraph tags, whereas this one is finding basically all of the text that's between the div tags.

All right, so that's a little bit more about navigation. Just remember that you can basically specify a new Beautiful Soup object. This is the whole soup; then we're saying, here, just that first nav bar, and then we're able to work just between those nav tags.
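The difference between searching every div, filtering by class_, and searching only paragraph tags can be sketched like this. The markup is hypothetical (only the class name body comes from the tutorial), and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with one div of class "body" among other divs.
SAMPLE_HTML = """
<div class="header">Site header</div>
<div class="body">
<p>Paragraph inside the body div.</p>
Loose text that is not inside a paragraph tag.
</div>
<div class="footer">Site footer</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Without a filter, every div on the page comes back.
all_divs = soup.find_all("div")

# class_ (trailing underscore) filters on the HTML class attribute,
# because "class" on its own is a reserved word in Python.
body_divs = soup.find_all("div", class_="body")
body_text = [div.text for div in body_divs]

# Paragraph search only returns text that sits inside <p> tags.
paragraph_text = [p.text for p in soup.find_all("p")]

print(len(all_divs), len(body_divs))
print(body_text)
print(paragraph_text)
```

The div search keeps the loose text that the paragraph search misses, which matches the contrast drawn above.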
Maybe you're not too familiar with working with HTML. When we did nav = soup.nav, we were basically dealing with just this chunk here, from the opening nav tag to the closing nav tag. We've sliced out that chunk of information.

Okay, so that's going to conclude the second Beautiful Soup tutorial. We've got one more tutorial, where I'm going to be talking about tables, since the three most common things I've found myself needing to scrape are first the content, then URLs, then tables. So we're going to talk about tables in the next tutorial; stay tuned for that. If you have questions, comments, concerns, whatever, on this tutorial, feel free to leave them below. Remember, there are text-based tutorials on pythonprogramming.net, as well as the sample code that we wrote. I'll see you in the next tutorial.
Wish we had professor like you :(((((((((((((((((((((((
awesome
anyone knows how to scrape a specific p tag within a div? thanks
2021 and this is STILL the champion of informative Python code teaching. Thanks so much!
did you know that youre a god? 😀 lol… thank you so much for all of your videos, really inspirational
Im new to all of this, how can I exclude text from the paragraphs, i dont want to print or extract all the text with 'p' tag, how can i select the specific part i wanna scrap?
awsome
very helpful. THANK YOU
you are the best, always search " (any topic) sentdex" when I have to learn something
AWESOME LECTURE!!!
Sentdex, your vids can never get old, you're WillSendDex
thank you so much
Best tutorials out there!
That's really helpful… Have a different question that would you like to share python scraping code to find a particular web page opened from multiple web pages in different -2 chrome windows… Thank you in advance for your help in this regards
It's been a struggle for me. But, slowly I am getting the hang of python. At times, it does feel like I am being thrown into a fire though.
hello sir.. great teaching… can we scrap a password protected link after logged in to that link please tell….
How could I exclude URLs that contain a particular string?
JBL sound at 4:45
This kicks ass of so many other resources out there. I banged my head against a wall using Ryan Mitchell's book, then I came here.
How to get only second paragraph!!
Something like siblings will work. ?????
For instance,
<div class="content">
<p>first Content</p>
<p>second content</p>
</div>
you're awesome
hello i am making a discord bot (async and discord.py) how would i go about getting a search result into discord and being able to choose from that?
you have a nice way of talking which makes it enjoyable to follow along.
Have you heard of the frameWork Scrapy?
https://github.com/scrapy/scrapy
Looks pretty powerful
how do I just print the html: ?s=opportunity&mode=form&id=9e36b1a7cd80f0ae91085aa86a1dbbf5&tab=core&_cview=0
<tr id="row_0" class="lst-rw lst-rw-first lst-rw-odd">
<td class="lst-cl lst-cl-first" headers="lh_id">
<a href="?s=opportunity&mode=form&id=9e36b1a7cd80f0ae91085aa86a1dbbf5&tab=core&_cview=0" class="lst-lnk-notice"><div class="solt">Tenable Products Maintenance Renewal </div><div class="soln">RFQ-2019-014 </div><div class="solcc">D — Information technology services, including telecommunications services </div></a>
</td>
<td class="lst-cl " headers="lh_agency_name">
<div class="pagency">United States Senate </div>Office of the Sergeant at Arms<br>Finance Division
</td>
<td class="lst-cl " headers="lh_base_type">
Special Notice
</td>
<td class="lst-cl lst-cl-last lst-cl-first_sort" headers="lh_current_posted_date">
Oct 30, 2018
</td>
</tr>
What would you do if you wanted to scrape and save an image from a website?
how to scrap images from other websites
Who's watching this in 2018?
'invalid syntax invalid syntax' whenever I change the code in my beautifulsoup program at all, and have to restart. The struggle is real.
hey where do I go from here? thanks
https://pythonprogramming.net/navigating-pages-scraping-parsing-beautiful-soup-tutorial/
You're a really good teacher
Can you still use import urllib.request in 2018 bc for some reason I get an URLerror and sum about SSL
why soup.nav i not working for me
What do you do if you want to get data from a tag, but there are multiple of the same exact tag?
for example:
<div id="sortable">
<td align="right">27</td>
<td align="right">30</td>
<td align="right">19</td>
If i just want the second one how would I do that?
If i use soup.find("div", id="sortable") it only comes up with the first one? What would i do if i just want the middle or last one?
Thanks for the tutorial, works brilliantly!
So which other libraries have good navigation?
# Parsing a 'div'
why did u use class_='body'? what is class_?
for div in soup.find_all('div', class_='body'):
    print(div.text)
I got this error! pls help fix it!
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
This tutorial was a great help! Thank you for providing such good examples and context.
I have a question: ('div', class_='…')? why you add the _ after class? I dont see any _ in the page code.
How can we connect to a website using US proxy with urllib.request
At 3:20 minutes how does he comment out multiple lines ?what is the command for it
Hello. You mentioned at the end of this video that when you scrap websites, you scrap first for content, then urls and then tables. Can you please tell me what you do with all that data? I'm trying to come up with the use cases for it. Thanks
So, your tutorials would be so much more productive if you took notes beforehand. Its like we are following your brain going all over the place. Talk about the objectives and outcomes first, then how to get there. Currently we are, 'lets try this…' then lets try this! Its hard to follow. Good luck!
nav function is working on pythonprograming site but when I try scrapping the information
from another website, nav shows an attribute error
'NoneType' object has no attribute 'find_all'
Can anyone explain why that's happening ?
Awesome tutorial..!!!
awesome tuts bro