Lifelesspeople.com


Would there be any issues with this?

 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Wed Apr 30, 2008 9:30 pm    Post subject: Would there be any issues with this?

I want to fill up my tutorial site's database with tutorials (just the links to them and a title). So I was thinking about creating a PHP script which could sort of "crawl" large tutorial sites and index the URLs and titles from those sites in my own database.

Would there be any copyright issues or anything by doing this?
_________________
Tutorial Management Script - Version 1.4 Released
TutorialToday - Up and running, submit your tutorials!
Linux Tutorials - Coming Soon
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Thu May 01, 2008 4:28 am

Yes and no. It depends on the site, on a tutorial-by-tutorial basis, I guess.
_________________
Quote:

<bart416> I just realized something
<bart416> we celebrate the fact that this piece of rock made one rotation around a glowing ball of plasma that is kept together due to its own gravity well
<njsg> HAPPY NEW YEAR
<Easter> ^^
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Thu May 01, 2008 5:58 am

How so? Would this not be like a search engine crawling pages on the internet and storing them?
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Thu May 01, 2008 6:22 am

Certain sites don't allow direct linking to their content, or only allow certain other sites to link to them. They could ask you to remove the links, though I think the chances of that happening are small.
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Thu May 01, 2008 9:10 pm

Another question on this topic: is it possible to find the URL a site redirects to in PHP?
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Fri May 02, 2008 2:41 am

Yes. Is it an HTTP redirect or a meta redirect?
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Fri May 02, 2008 6:05 am

I believe it is using header("Location: ********"). Here is an example (this one is on my own site, but I want to do the same for external sites):

http://www.tutorialtoday.com/t.....it_in_PHP/

I am looking to find the URL of the tutorial it will go to, which will end up being:

http://www.tutorialtoday.com/read_tutorial/65/
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Fri May 02, 2008 10:16 am

You should use plain sockets for this one, because then you can simply read the HTTP headers.
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Fri May 02, 2008 11:57 am

OK, I've been messing around with fsockopen(), and LLP seems to block it by returning a 400 error. But how do you read the headers after you have opened a connection to a site?
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Fri May 02, 2008 1:21 pm

Send a regular HTTP request, like this one: http://pastebin.com/f685ee30b (on pastebin because of the forum's security mod). Each line should end with a \r\n line break.
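For illustration, a request of that kind might look like the sketch below. The pastebin contents are not reproduced here, so the request line, the headers, the user agent string, and the function names build_head_request() and fetch_raw_headers() are all assumptions, and the type hints assume a newer PHP than was common in 2008. It sends a HEAD request so that only the headers come back. A missing Host header is one common reason for an HTTP/1.1 server to answer 400, though whether that caused the 400 mentioned above is not known.

```php
<?php

// Build a minimal HTTP/1.1 HEAD request. Every line ends with \r\n and an
// empty line terminates the header block.
function build_head_request(string $host, string $path = '/'): string
{
    return "HEAD {$path} HTTP/1.1\r\n"
         . "Host: {$host}\r\n"
         . "User-Agent: tutorial-indexer-sketch\r\n" // hypothetical name
         . "Connection: close\r\n"
         . "\r\n";
}

// Open a plain socket on port 80, send the request, and return the raw
// response headers, or null if the connection could not be opened.
function fetch_raw_headers(string $host, string $path = '/', int $timeout = 5): ?string
{
    $fp = @fsockopen($host, 80, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null;
    }
    fwrite($fp, build_head_request($host, $path));

    $headers = '';
    while (!feof($fp)) {
        $line = fgets($fp, 1024);
        if ($line === false || rtrim($line, "\r\n") === '') {
            break; // an empty line marks the end of the headers
        }
        $headers .= $line;
    }
    fclose($fp);

    return $headers;
}
```

Calling fetch_raw_headers('www.example.com', '/some/page/') would then return the status line and headers as one string, which can be searched for a Location header.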
 
krt
...


Joined: 11 Jan 2005
Posts: 4765
Location: Down Under

Posted: Fri May 02, 2008 6:59 pm

Use cURL with the CURLOPT_FOLLOWLOCATION option turned on. It's much simpler. cURL is also simpler when you have to deal with security, SSL, proxies and whatnot. This abridged list of cURL options covers the other options you may be interested in if you want to deal with any of the above or with other things: POST data, headers, cookies, user agent spoofing, etc.

Example:
Code:
<?php

$url = 'http://example.com';

// Create cURL resource
$ch = curl_init();

// Set options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

// Return output
$output = curl_exec($ch);

// Close cURL resource
curl_close($ch);

// Do what you want with output
echo $output;

?>
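The example above returns the body of the page that the redirect leads to. If only the destination URL is wanted, one possible variation is to ask cURL which URL it ended up on. This is a sketch, not part of the original example: the function name final_url(), the redirect limit and the timeout are assumptions chosen for illustration.

```php
<?php

// Follow redirects and return the URL cURL ended up on, or null if the URL
// is invalid, the cURL extension is missing, or the request fails.
function final_url(string $url): ?string
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false || !function_exists('curl_init')) {
        return null;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // request headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the response
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);         // assumed limit
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout in seconds

    $ok = curl_exec($ch);
    $final = ($ok === false) ? null : curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    return $final;
}
```

The function_exists() check is there because the cURL extension may not be available on every host.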

As for the original question: no, there shouldn't be problems with what you are doing. To be on the safe side, I use and recommend a 10% maximum excerpt rule plus an easy way for content authors to report indexed pages. Most will just appreciate the link and the extra traffic.
 
Scott
tutorialtoday.com


Joined: 24 Mar 2005
Posts: 2650
Location: Mississauga, Ontario

Posted: Fri May 02, 2008 7:30 pm

Thanks for the help, both of you. I had already written the script before I saw your post about cURL. I just used fsockopen(), as SolidRaven suggested, and then used preg_match_all() to check the headers for the Location.
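The script itself was not posted, so the following is only a guess at what the header check could look like. It uses preg_match() rather than preg_match_all(), since only the first Location header is needed, and the function name extract_location() is made up for this sketch.

```php
<?php

// Return the value of the first Location header in a raw header block,
// or null if there is none. Header names are matched case-insensitively.
function extract_location(string $rawHeaders): ?string
{
    // ^ and $ match at line boundaries because of the m modifier, so a
    // header such as Content-Location is not picked up by mistake.
    if (preg_match('/^Location:[ \t]*(.*?)[ \t]*\r?$/mi', $rawHeaders, $m)) {
        return $m[1];
    }

    return null;
}
```

Note that a server may send a relative URL in Location, in which case it still has to be resolved against the URL that was requested.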

I was thinking the same thing, that they wouldn't mind. Personally, I would prefer it if people automatically indexed my tutorials on their sites.
 
LP-SolidRaven
Dictator of the Dump


Joined: 06 Jun 2004
Posts: 7297
Location: The cheese is made out of moon

Posted: Sat May 03, 2008 12:29 am

The only problem with that, krt, is that cURL isn't installed everywhere.
 
ClickFanatic
Est. 2005


Joined: 18 Jan 2005
Posts: 4100
Location: A particular geographic area

Posted: Sat May 03, 2008 6:41 am

cURL is really useful. If it's not installed I would demand it!
_________________
Captain Jell-O Buster from the Future
[img]http://feeds.feedburner.com/sparepencil.1.gif[/img]
 
All times are GMT - 6 Hours
All trademarks and copyrights on this page are owned by their respective owners. Posts and comments are owned by the poster. Everything else © 2001 - 2007 Lifelesspeople.com