A simple PHP Yahoo cached page spider for recovering the web

Post by outcrop » 19 Jun 2009 07:53

Attachments
yahoocachespider.zip
(6.99 KiB) Downloaded 223 times
Last edited by outcrop on 19 Jun 2009 16:03, edited 1 time in total.

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by ocexyz » 19 Jun 2009 10:51

Thank you, outcrop! Great work!

Instructions

Post by outcrop » 19 Jun 2009 12:15

1) Download PHP:
http://www.php.net/get/php-5.2.10-Win32 ... m/a/mirror

2) Unzip it to a directory such as c:\php,
then make a new directory named "cache" inside it.

3) Copy this spider code and save it as yahoospider.php,
and download htmlsql.class.php from http://www.jonasjohn.de/lab/htmlsql.htm

Put both files in the same directory, such as c:\php\spider

4) Run it.
Click Start > Run, enter cmd.exe, change to the PHP directory, and type:
php.exe c:\php\spider\yahoospider.php
Errors or other messages may appear in the console; just ignore them.
The spider saves every file it fetches to the cache directory by name.


5) New query
When the spider has finished, you can edit the query string:
$querystr = "wiki+site:pantheraproject.net";

to run another search, for example:
$querystr = "developers+inurl:wiki+site:pantheraproject.net";

Just change $querystr to any format you would type into the search engine.

Then save the spider and go back to step 4.
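
As a rough sketch of step 5, the query string can also be assembled from separate search terms instead of being typed by hand. The helper name build_querystr is hypothetical and not part of the posted spider; it only reproduces the "+"-joined format of the two examples above.

```php
<?php
// Hypothetical helper (not from the posted spider): joins search terms with
// "+" in the same format as the $querystr examples.
function build_querystr(array $terms)
{
    $encoded = array();
    foreach ($terms as $term) {
        // rawurlencode() escapes unsafe characters; the colon used by
        // operators such as "site:" is restored to match the examples.
        $encoded[] = str_replace('%3A', ':', rawurlencode($term));
    }
    return implode('+', $encoded);
}

$querystr = build_querystr(array('developers', 'inurl:wiki', 'site:pantheraproject.net'));
echo $querystr, "\n";
```

The resulting value can be pasted into the $querystr line of the spider before running it again.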

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by borat1 » 19 Jun 2009 14:04

Thanks, outcrop!
Sadly, I got this error:
Parse error: syntax error, unexpected T_STRING in C:\PhP\spider\yahoospider.php on line 130

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by outcrop » 19 Jun 2009 16:01

Attachments
yahoocachespider.zip
(6.99 KiB) Downloaded 243 times

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by old_death » 19 Jun 2009 16:09

I get the following error:
Fatal error: Call to undefined function curl_init() in C:\php\spider\yahoospider.php on line 142

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by borat1 » 19 Jun 2009 16:44

Almost the same error as OD.
Fatal error: Call to undefined function curl_init() in C:\PhP\spider\yahoocachespider.php on line 157
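
This error usually means the cURL extension is not loaded by the PHP installation in use. Below is a minimal sketch for checking that before running the spider. The function name missing_extensions is hypothetical, while extension_loaded() is a standard PHP function.

```php
<?php
// Hypothetical helper: returns the names from $required that PHP has not loaded.
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// The spider calls curl_init(), so "curl" is the extension to check.
$missing = missing_extensions(array('curl'));
if (count($missing) > 0) {
    echo 'Not loaded: ', implode(', ', $missing), "\n";
    echo "Enable the matching extension= line in php.ini.\n";
} else {
    echo "cURL is available.\n";
}
```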

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by ocexyz » 19 Jun 2009 16:53

Try to ignore it; this could be an effect of whatever is currently hanging on pantproj.net.

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by outcrop » 19 Jun 2009 17:28


Re: A simple PHP Yahoo cached page spider for recovering the web

Post by outcrop » 19 Jun 2009 17:33


Re: A simple PHP Yahoo cached page spider for recovering the web

Post by borat1 » 20 Jun 2009 07:14

Since I have a fixed IP, is there any way to use this with Tor?
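
One possible approach, sketched under assumptions: if a Tor client is running locally and exposes a SOCKS5 proxy (assumed here at 127.0.0.1:9050, Tor's usual default), cURL requests can be routed through it. The helper names are hypothetical; the posted spider would need its own cURL options adjusted in the same way.

```php
<?php
// Hypothetical helper: builds cURL options that send a request through a
// local SOCKS5 proxy. The proxy address is an assumption, not from the spider.
function tor_curl_options($url, $proxy = '127.0.0.1:9050')
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the page instead of printing it
        CURLOPT_PROXY          => $proxy,
        CURLOPT_PROXYTYPE      => CURLPROXY_SOCKS5,
        CURLOPT_TIMEOUT        => 60,    // connections through Tor can be slow
    );
}

// Hypothetical helper: fetches $url through the proxy.
// Returns the page body as a string, or false on failure.
function fetch_via_tor($url)
{
    $ch = curl_init();
    curl_setopt_array($ch, tor_curl_options($url));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

Both helpers require the cURL extension to be loaded. With CURLPROXY_SOCKS5, host names are still resolved locally rather than through the proxy.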

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by old_death » 20 Jun 2009 21:57


;]

Post by aaron_walkhouse » 21 Jun 2009 00:48

Pick one.

Re: A simple PHP Yahoo cached page spider for recovering the web

Post by borat1 » 21 Jun 2009 06:16

My Computer -> properties ->Advanced -> Environment Variables
Edit system variable -> Path -> add c:\php;

create cache dir -> c:\php\cache

copy php.ini-recommended -> php.ini

Edit php.ini :

Line 528, change to :
include_path = ".;c:\php\includes"

Line 542, change to :
extension_dir = "c:\php\ext"

Line 630, change to :
auto_detect_line_endings = ON

Lines 751 and 752, change to :
;SMTP = localhost
;smtp_port = 25

Lines 661 - 705, change to :
extension=php_bz2.dll
extension=php_curl.dll
;extension=php_dba.dll
;extension=php_dbase.dll
;extension=php_exif.dll
extension=php_fdf.dll
extension=php_gd2.dll
extension=php_gettext.dll
extension=php_gmp.dll
;extension=php_ifx.dll
;extension=php_imap.dll
;extension=php_interbase.dll
extension=php_ldap.dll
extension=php_mbstring.dll
;extension=php_mcrypt.dll
;extension=php_mhash.dll
;extension=php_mime_magic.dll
extension=php_ming.dll
extension=php_msql.dll
extension=php_mssql.dll
extension=php_mysql.dll
extension=php_mysqli.dll
;extension=php_oci8.dll
extension=php_openssl.dll
extension=php_pdo.dll
extension=php_pdo_firebird.dll
extension=php_pdo_mssql.dll
extension=php_pdo_mysql.dll
;extension=php_pdo_oci.dll
;extension=php_pdo_oci8.dll
extension=php_pdo_odbc.dll
extension=php_pdo_pgsql.dll
extension=php_pdo_sqlite.dll
extension=php_pgsql.dll
extension=php_pspell.dll
extension=php_shmop.dll
;extension=php_snmp.dll
extension=php_soap.dll
extension=php_sockets.dll
extension=php_sqlite.dll
;extension=php_sybase_ct.dll
extension=php_tidy.dll
extension=php_xmlrpc.dll
extension=php_xsl.dll
extension=php_zip.dll

(Maybe I enabled more extensions than really needed, but it seems to work all right...)
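
For comparison, a smaller php.ini change may be enough. The only extension that the errors earlier in this thread point to is cURL (the undefined curl_init() call), so a minimal sketch, assuming the spider and htmlsql.class.php need nothing else, would enable just that one:

```ini
; Assumes PHP is unpacked to c:\php, as in the steps above.
extension_dir = "c:\php\ext"
extension=php_curl.dll
```

If another "undefined function" error appears, the extension it belongs to can be enabled the same way.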

Please bear in mind :
I am a noob and was a bit drunk when I got home at 6 in the morning after spending all night with
friends, when I had a "bright" idea about why it did not work here on my PC...
So feel free to add comments or, even better, an improved php.ini that people can use as a template !!

Almost forgot :
When blacklisted it seems to take at least 30 minutes before you can resume again.
So enjoy your breaks. :D
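
The 30-minute figure above can be turned into a simple pause before resuming. This is a minimal sketch, assuming the cooldown really is about 30 minutes and that the time of the block was noted by hand. Neither the helper name nor any block detection is part of the posted spider.

```php
<?php
// Hypothetical helper: seconds left to wait after being blocked at $blockedAt.
// $cooldown defaults to 1800 s (30 minutes), the delay observed in this thread.
function seconds_until_resume($blockedAt, $now, $cooldown = 1800)
{
    return max(0, $blockedAt + $cooldown - $now);
}

// Example: blocked 10 minutes ago, so 20 minutes remain.
$now = time();
$wait = seconds_until_resume($now - 600, $now);
echo 'Wait ', $wait, " seconds before restarting the spider.\n";
// sleep($wait); // uncomment to pause the script for that long
```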

Current status :
A hangover and more than 2000 files in my cache and counting...

Re: ;]

Post by old_death » 21 Jun 2009 13:52


Re: A simple PHP Yahoo cached page spider for recovering the web

Post by borat1 » 21 Jun 2009 14:11

"can not open ./cache/Translate+-+Shareaza+Wiki.html"
It looks like it cannot find your cache directory. Where did you put it?
And is it spelled correctly?
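
The "./cache" path in that message is relative to the directory the command is run from, not to the script's location. Below is a minimal sketch, assuming that is the cause, which creates the directory if it is missing. The helper name is hypothetical and the posted spider does not do this itself.

```php
<?php
// Hypothetical helper: makes sure $dir exists, creating it (and any missing
// parent directories) if needed. Returns true on success, false otherwise.
function ensure_cache_dir($dir)
{
    if (is_dir($dir)) {
        return true;
    }
    return mkdir($dir, 0777, true); // third argument: create parents too
}

// "./cache" resolves against the current working directory.
$cacheDir = getcwd() . DIRECTORY_SEPARATOR . 'cache';
if (ensure_cache_dir($cacheDir)) {
    echo 'Cache directory ready: ', $cacheDir, "\n";
} else {
    echo 'Could not create: ', $cacheDir, "\n";
}
```

Running this from the same directory in which php.exe is started for the spider shows where the cache directory is expected.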