Hey, I need someone's help or info about this…
I'd like to set up a site that pulls info from other sites and shows it on my site.
I'm just wondering how difficult it would be to gather information from selected sites.
E.g., take sites that sell motorbikes as an example. If you search for a motorbike model on my site, it would search several other specific sites for it as well, and show you the results from each of those sites, but listed on my site.
Kind of like an overall site that combines the results from all the other sites.
Does that make sense?
Just wondering whether it's complicated to make a site like that, and what approach to take.
Is it a web crawler? How hard would it be to do, and could anyone give me some details?
It would be much appreciated.
Hard if you get cooperation and XML feeds from the other sites. Very hard if you can't.
You're looking to build a spider of some sort that can scrape a web page's content, sift out the unneeded stuff, and return the results. Again, if there are XML feeds involved, that becomes much simpler because then you don't have to scrape an HTML page. If you do… well… that's a *****, and there's no better way to put it than that. I know it's a ***** because I've done this myself on a few occasions, and it's nasty.
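To give a rough idea of why a feed is the easy case, here is a minimal Python sketch that reads an XML feed and searches it. The feed layout and the element names (listing, model, price, url) are made up for illustration; a real site's feed will look different, and parse_feed and search are just hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the element names below are assumptions, not a real site's format.
SAMPLE_FEED = """<listings>
  <listing>
    <model>Honda CB500F</model>
    <price>4500</price>
    <url>http://example.com/bikes/1</url>
  </listing>
  <listing>
    <model>Yamaha MT-07</model>
    <price>5200</price>
    <url>http://example.com/bikes/2</url>
  </listing>
</listings>"""


def parse_feed(xml_text, source):
    """Turn one site's XML feed into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("listing"):
        results.append({
            "source": source,  # which site the result came from
            "model": (item.findtext("model") or "").strip(),
            "price": (item.findtext("price") or "").strip(),
            "url": (item.findtext("url") or "").strip(),
        })
    return results


def search(listings, query):
    """Case-insensitive substring match on the model name."""
    q = query.lower()
    return [r for r in listings if q in r["model"].lower()]
```

Run parse_feed once per source site, join the lists together, and search over the combined list to get results from every site on one page.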
People make errors in their code.
People don't necessarily code their pages consistently.
Information differs across pages, for example the data fields.
You can easily have 800-1000 lines of code just to spider one site because of the different ways pages are laid out on the site.
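One common way to cope with inconsistent layouts is to keep a list of patterns and try them in order until one matches. A small Python sketch, assuming three invented ways a site might mark up a price (none of these come from a real site):

```python
import re

# Hypothetical markup variants for the same piece of data (a price).
PRICE_PATTERNS = [
    re.compile(r'<span class="price">\s*\$?(\d[\d,]*)', re.I),
    re.compile(r'<td>\s*Price:\s*\$?(\d[\d,]*)', re.I),
    re.compile(r'price\s*=\s*"(\d[\d,]*)"', re.I),
]


def extract_price(html):
    """Return the first price found as an int, or None if no layout matches."""
    for pattern in PRICE_PATTERNS:
        match = pattern.search(html)
        if match:
            # Drop thousands separators before converting.
            return int(match.group(1).replace(",", ""))
    return None
```

Every new layout you run into means another pattern in the list, which is how the line count for a single site grows so quickly.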
You should look into PHP's regex functions and cURL if there isn't an XML feed.
With those you can choose what should be searched for and what info to extract. They are fairly hard to use, so it will take a lot of practice before you get it right!
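The same fetch-then-match idea, sketched in Python with urllib.request standing in for cURL and the re module standing in for PHP's regex functions. The link markup (class="bike") and the User-Agent string are assumptions for illustration only; you would write the pattern against the real page's HTML.

```python
import re
import urllib.request

# Assumed markup: <a class="bike" href="...">Model name</a>
LINK_RE = re.compile(
    r'<a\s+class="bike"\s+href="([^"]+)"\s*>(.*?)</a>', re.I | re.S
)
TAG_RE = re.compile(r"<[^>]+>")  # strips any tags nested inside the link text


def fetch(url, timeout=10):
    """Download a page and return it as text."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "example-aggregator/0.1"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_links(html):
    """Return (label, href) pairs for every link matching the assumed markup."""
    results = []
    for href, label in LINK_RE.findall(html):
        text = " ".join(TAG_RE.sub("", label).split())  # drop tags, tidy spaces
        results.append((text, href))
    return results
```

In use it would be something like extract_links(fetch(some_url)). Regexes like this break as soon as the site changes its markup, which is a big part of why scraping takes so much practice and upkeep.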
Like the Game says… find an XML feed.
Without that, or an API provided by the source site, it's going to be barely possible to do.