How to make a web crawler?

Let’s learn how to make a web crawler. The most accurate or helpful answer comes from Stack Overflow.

There are ten answers to this question.

Best solution

Make a Web Crawler/Spider

I'm looking into making a web crawler/spider but I need someone to point me in the right direction to get started. Basically, my spider is going to search for audio files and index them. I'm just wondering if anyone has any ideas for how I should do it. I've heard that doing it in PHP would be extremely slow. I know VB.NET, so could that come in handy? I was thinking about using Google's filetype search to get links to crawl; would that be OK? Thanks, guys.

Answer:

In VB.NET you first need to get the HTML; use the WebClient class, or HttpWebRequest and HttpWebResponse...

Belgin Fish at Stack Overflow
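The question above is about finding audio files among crawled links. As a rough illustration only, here is a minimal sketch in Python (not VB.NET, and not taken from the answer) of filtering a list of URLs down to likely audio files by extension. The extension set and the names `AUDIO_EXTENSIONS`, `is_audio_url` and `audio_links` are assumptions made for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of audio extensions for illustration; extend as needed.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a"}


def is_audio_url(url: str) -> bool:
    """Return True if the URL path ends in a known audio extension."""
    path = urlparse(url).path  # drops the query string and fragment
    return PurePosixPath(path).suffix.lower() in AUDIO_EXTENSIONS


def audio_links(urls):
    """Keep only the URLs that look like audio files, preserving order."""
    return [u for u in urls if is_audio_url(u)]
```

Checking the extension is only a heuristic; a server can serve audio from any URL, so a more careful crawler might also look at the `Content-Type` response header.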

Other solutions

What are some libraries in Python that can help me make a web crawler?

The libraries can be in standard library or 3rd party. The crawler will just scrape some content of the pages.

Answer:

Modules: urllib, and Beautiful Soup for HTML parsing. Framework: Scrapy.

Sachit Adhikari at Quora
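To make the parsing step concrete, here is a minimal sketch of pulling links out of a page. It deliberately uses the standard library's `html.parser` instead of Beautiful Soup so that it runs without third-party packages; the names `LinkCollector` and `extract_links` are hypothetical and chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list:
    """Return every link found in `html`, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

The HTML itself could come from `urllib.request.urlopen(url).read()`. With Beautiful Soup installed, the same extraction is usually written with `soup.find_all("a", href=True)`.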

Answer:

Search engine crawlers can now read your video content through Video Sitemaps. Create a sitemap of video...

community wiki at wiki.answers.com

Once you have a website, how do you make the most of web crawling?

How do I make the most of the web crawler? Are there any words that pull you toward someone searching for your service, and where do you place these words?

Answer:

Create content that would be useful to people you want visiting your site, and use appropriate words...

QI4SRX5LJ2XDDRUUFRSIFVNQNQ at Yahoo! Answers

Help me with a web crawler?

Dear friends, I want to make a web crawler but I do not know how to make it. Can someone help me make a web crawler, or tell me where I can learn how? I know HTML, CSS, JavaScript, PHP and Ajax. If someone wants to contact me, my email is i.arfanhaider...

Answer:

A web crawler needs to do two things: fetch pages, and parse each page for URLs to crawl. If it does...

Arafn Haider at Yahoo! Answers
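The two steps named in the answer, fetching pages and parsing them for URLs, can be sketched as a small breadth-first loop. This is an illustration under stated assumptions, not the answerer's code: the function names, the `max_pages` limit and the crude `href` regular expression are all choices made for this example, and a real crawler would use a proper HTML parser and respect robots.txt.

```python
import re
import urllib.request
from collections import deque
from urllib.parse import urldefrag, urljoin

# Crude href matcher, good enough for a sketch; an HTML parser is more robust.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def fetch_with_urllib(url, timeout=10):
    """Step 1: fetch a page. Returns the HTML text, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def crawl(start_url, fetch=fetch_with_urllib, max_pages=50):
    """Breadth-first crawl; returns the URLs fetched successfully, in order."""
    frontier = deque([start_url])  # URLs waiting to be fetched
    seen = {start_url}             # URLs already queued, to avoid loops
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        # Step 2: parse the page for URLs to crawl next.
        for href in HREF_RE.findall(html):
            link, _fragment = urldefrag(urljoin(url, href))
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited
```

Passing `fetch` as a parameter keeps the loop testable without a network: any function that maps a URL to HTML text (or `None`) will do, for example a dictionary's `get` method over canned pages.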

What can make a web app listen and learn about a specific topic from the web?

There are many topics people may be interested in from the web. For some topics, there are already websites specialized in them, e.g., weather.com for weather and espn.com for sport. Not all topics are as well organized as the above two topics...

Answer:

For a web app to really become an "expert" you would have to invent Artificial General Intelligence...

Phillip Rhodes at Quora

How do I crawl various websites with one crawler?

I want to make a web crawler in Java. Please, can someone help me or give me some tips, and help me write the source code of a web crawler in Java? Thanks in advance.

Answer:

You can try Scrapy + Selenium. I did something similar for GPlay; check the project on GitHub Stravanni...

Giovanni Simonini at Quora

What are all the concepts I need to understand in order to build a web crawler?

I want to make a web crawler just for fun, but I'd like to know what I need to understand before starting. Are there particular data structures or technologies I need to know?

Answer:

There is a really nice course on www.udacity.com (Intro to CS) on building a web crawler in Python...

Anonymous at Quora

If I want to make a Facebook desktop messenger, how should I go about it?

Facebook used to provide a messenger, but now they have closed down the service. Well and good, but if I want to make my own messenger, how should I go about it? Which platform, language, and libraries should I use? I'm a complete novice, who is willing...

Answer:

Why don't you start off by studying the code on their website? That would tell you how the messenger...

Rishav Kundu at Quora

What are some really interesting web crawling projects?

I want to make a small project using a web crawler to get myself used to Python. Any ideas for things I should do? It could be a console or website application. Please share your experience/details if you have ever done this before :)

Answer:

A nice little project that I've done in the past is a simple 'sites similar to X' recommendation engine...

Tom Robert at Quora

