Hosting Discussion
 

  Post #1 (permalink)   02-10-2010, 01:41 PM
HD Newbie
 
Join Date: Feb 2010
Posts: 3

Hello,

I want to create a search engine website using the open source software Nutch. My question is: do I need a dedicated server for that, or is a virtual private server enough?

Does anyone have experience with Nutch? What is the best hosting plan for running it?

PS: I want to index about 10 million web pages.

Mouad
 
 
 


  Post #2 (permalink)   02-10-2010, 03:11 PM
HD Wizard
 
Join Date: Mar 2005
Location: Atlanta, GA
Posts: 2,264

How quickly do you want to index those pages? That's going to be your determining factor. The bigger the machine (or cluster of machines), the faster it will get done. I think you'll be dragging along if you opt for a VPS, but it depends on how quickly you need the indexing done and how often you're spidering back.

I've not worked with Nutch myself, but based on some of the reading I've done, it seems to be a memory-intensive program, so be sure to get enough memory. A number of places recommend NOT using a VPS for this due to its resource constraints.
__________________
Emerson Nogueira
http://www.HandsOnWebHosting.com
cPanel Web Hosting, Domain Registration, Managed VPS Servers
 
 
 


  Post #3 (permalink)   02-10-2010, 03:17 PM
HD Community Advisor
 
 
Join Date: Mar 2009
Location: Saint Louis
Posts: 4,945
I think on a $50 budget, you need to leave "quick" out of your vocabulary. Good luck on your project.
__________________
ProlimeHost- Dedicated Server Hosting & KVM SSD VPS
Three Datacenter Locations: Los Angeles, Denver & Singapore
SuperMicro Hardware | Multiple Bandwidth Providers | 24/7 On Site Engineers
 
 
 


  Post #4 (permalink)   02-10-2010, 03:26 PM
HD Newbie
 
Join Date: Feb 2010
Posts: 3

Quote:
Originally Posted by handsonhosting
How quickly do you want to index those pages? That's going to be your determining factor. The bigger the machine (or cluster of machines), the faster it will get done. I think you'll be dragging along if you opt for a VPS, but it depends on how quickly you need the indexing done and how often you're spidering back.

I've not worked with Nutch myself, but based on some of the reading I've done, it seems to be a memory-intensive program, so be sure to get enough memory. A number of places recommend NOT using a VPS for this due to its resource constraints.
Speed does not really matter; I only need to reindex these pages once every 3 months. What matters most to me is disk space: I need at least 50 GB, since I index about 1 million pages.
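
For anyone sizing a similar setup, here is a rough sketch of what the figures in this post imply. It assumes disk usage grows linearly with page count, which is only an approximation and not a measured Nutch figure; the function names are made up for illustration. The 10 million figure comes from the first post in this thread.

```python
# Back-of-envelope disk estimate. Assumes storage scales linearly with
# page count; this is an approximation, not a measurement of Nutch itself.
GB = 1000 ** 3  # decimal gigabyte, in bytes

def bytes_per_page(total_gb: float, pages: int) -> float:
    """Average on-disk bytes per indexed page implied by a total."""
    return total_gb * GB / pages

def disk_needed_gb(pages: int, per_page_bytes: float) -> float:
    """Disk space in GB for a given page count at a fixed per-page cost."""
    return pages * per_page_bytes / GB

# 50 GB for about 1 million pages, as stated in this post.
per_page = bytes_per_page(50, 1_000_000)
print(f"{per_page / 1000:.0f} KB per page")

# The same ratio applied to the 10 million pages from the first post.
print(f"{disk_needed_gb(10_000_000, per_page):.0f} GB for 10 million pages")
```

At that ratio, 1 million pages works out to about 50 KB each, and 10 million pages would need roughly ten times the disk.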
 
 
 


  Post #5 (permalink)   02-10-2010, 05:55 PM
HD Wizard
 
Join Date: Mar 2005
Location: Atlanta, GA
Posts: 2,264

Managed VPS or self-managed? 50GB for $50 with ample bandwidth for all your indexing is going to be a tall order for most places.

You may want to post some detailed specs on what you need (memory, bandwidth, etc.) so that others can give you more guidance.
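
As a starting point for the bandwidth part of those specs, a minimal sketch follows. It uses the recrawl interval mentioned earlier in the thread (every 3 months, taken here as 90 days) and the 10 million page figure from the first post. The 50 KB average download size per page is purely a hypothetical placeholder, as are the function names; substitute a measured value before relying on the result.

```python
# Rough fetch-rate and transfer estimate for one full recrawl cycle.
# ASSUMPTION: avg_page_kb is a hypothetical placeholder, not a measured value.
SECONDS_PER_DAY = 86_400

def required_fetch_rate(pages: int, window_days: int) -> float:
    """Average pages per second needed to fetch every page within the window."""
    return pages / (window_days * SECONDS_PER_DAY)

def transfer_gb(pages: int, avg_page_kb: float) -> float:
    """Total download volume in decimal GB for one crawl cycle."""
    return pages * avg_page_kb * 1000 / 1000 ** 3

pages = 10_000_000   # from the first post
window_days = 90     # "once every 3 months", approximated as 90 days
avg_page_kb = 50     # hypothetical average page size

rate = required_fetch_rate(pages, window_days)
volume = transfer_gb(pages, avg_page_kb)
print(f"{rate:.2f} pages/second sustained")
print(f"{volume:.0f} GB downloaded per crawl cycle")
```

The sustained rate comes out low because the window is long, which fits the point that speed is not the constraint here; the total transfer per cycle is the number to compare against a plan's bandwidth allowance.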
 
 
 


  Post #6 (permalink)   02-11-2010, 03:03 AM
HD Newbie
 
Join Date: Feb 2010
Location: north yorkshire
Posts: 26
MOD NOTE: Self-advertising is not allowed.
 
 
 