cougarguard.com — unofficial BYU Cougars / LDS sports, football, basketball forum and message board  

01-22-2008, 08:43 PM   #1
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

Need ideas about building a computer

What is the most kick-a$$ number-crunching workstation out there that you can buy for, say, under $10k?

I need something that can crunch very large datasets. Like an array of 35 million observations with 60 fields per observation.

Maybe could go to $20k.

01-22-2008, 09:07 PM   #2
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

I see that STATA has multiprocessor support.

http://www.stata.com/statamp/

Up to 32 processors.

SiCortex looks like an interesting product.

http://www.theregister.co.uk/2007/11...apult_cluster/
http://www.sicortex.com/products/me_...pult/datasheet

72 processors in a desktop form factor, for about 15k.

01-24-2008, 12:09 AM   #3
Brian
Senior Member
Join Date: Jan 2006
Location: Oak Ridge, TN
Posts: 1,308

Quote:
Originally Posted by MikeWaters View Post
I see that STATA has multiprocessor support.

http://www.stata.com/statamp/

Up to 32 processors.

SiCortex looks like an interesting product.

http://www.theregister.co.uk/2007/11...apult_cluster/
http://www.sicortex.com/products/me_...pult/datasheet

72 processors in a desktop form factor, for about 15k.

It's disturbing that the scaling graph only goes to 4 processors. Do they advertise numbers beyond 4?
Is this analysis compute or memory intensive? A processor/memory mismatch for the particular task will bite you.
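
A rough way to frame the compute-versus-memory question is to compare how long one pass over the data would take if it were limited by memory bandwidth versus by arithmetic. The sketch below just does that arithmetic. The bandwidth, per-core throughput, and operations-per-value figures are placeholder assumptions, not measurements of any real machine or of STATA, and the 8-byte field size is a guess.

```python
# Back-of-envelope check: is one pass over the data limited by memory or compute?
# All hardware numbers used below are illustrative assumptions, not measurements.

def pass_times(n_values, bytes_per_value, ops_per_value,
               mem_bandwidth_bytes_s, ops_per_core_s, cores):
    """Return (memory_seconds, compute_seconds) for one pass over the data."""
    # Assumption: memory bandwidth is shared, so it does not grow with core count.
    memory_s = n_values * bytes_per_value / mem_bandwidth_bytes_s
    # Assumption: arithmetic splits evenly across cores (best case).
    compute_s = n_values * ops_per_value / (ops_per_core_s * cores)
    return memory_s, compute_s

def bottleneck(memory_s, compute_s):
    """Name whichever of the two estimated times dominates."""
    return "memory" if memory_s >= compute_s else "compute"

n_values = 35_000_000 * 60  # observations x fields, from the first post
for cores in (1, 4, 32):
    # 8 bytes/value, 10 ops/value, 10 GB/s bandwidth, 1e9 ops/s per core: all assumed
    mem_s, cpu_s = pass_times(n_values, 8, 10, 10e9, 1e9, cores)
    print(f"{cores:>2} cores: memory {mem_s:.2f}s, compute {cpu_s:.2f}s "
          f"-> {bottleneck(mem_s, cpu_s)}-bound")
```

With these made-up numbers the job is compute-bound at 1 and 4 cores and memory-bound at 32, which is the kind of mismatch being warned about. Real numbers would have to come from measuring the actual workload.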
__________________
e^(i * pi) + 1 = 0
5 great numbers in one little equation.

01-24-2008, 12:32 AM   #4
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

Quote:
Originally Posted by Brian View Post
It's disturbing that the scaling graph only goes to 4 processors. Do they advertise numbers beyond 4?
Is this analysis compute or memory intensive? A processor/memory mismatch for the particular task will bite you.

STATA is located in College Station. What do you expect?

I'm guessing it is both memory and processor intensive. I'd love to have 16 GB of RAM. That would be nice.

01-24-2008, 12:53 AM   #5
Brian
Senior Member
Join Date: Jan 2006
Location: Oak Ridge, TN
Posts: 1,308

Quote:
Originally Posted by MikeWaters View Post
STATA is located in College Station. What do you expect?

I'm guessing it is both memory and processor intensive. I'd love to have 16 GB of RAM. That would be nice.

This is worth figuring out before you shell out the cash.
If it doesn't scale well, you're wasting money on more processors. Just get a quad core and be done with it.
If the application is memory intensive, more processors will likely not help and can even make it run slower, because they all contend for the same memory. Memory access kills performance, which is why the Cell processor was created. If you have 32 processors constantly going to memory, you will be dead in the water, like rush-hour traffic. However, if the processors fetch data from memory and then compute on it for a long time, more processors could be a big win.
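
The scaling point can be made concrete with Amdahl's law, which bounds the speedup when only part of a job can run in parallel. This is a generic sketch, not anything specific to STATA or SAS, and the parallel fractions in the loop are made-up values chosen only to show the shape of the curve.

```python
# Amdahl's law: upper bound on speedup when a fraction p of the work is parallel.

def amdahl_speedup(p, n):
    """p: parallelizable fraction of the runtime (0..1); n: number of processors."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.50, 0.90, 0.99):  # hypothetical parallel fractions
    row = ", ".join(f"{n} cpus: {amdahl_speedup(p, n):.1f}x" for n in (4, 32, 72))
    print(f"p={p:.2f} -> {row}")
```

With p = 0.90 the bound never reaches 10x no matter how many processors are added. Note that this model ignores memory contention, so it cannot show the case where adding processors makes things slower; it only shows where the money stops buying speed.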

01-24-2008, 12:56 AM   #6
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

What's the best way of figuring that out, short of calling STATA?

Btw, I use SAS and have never used STATA, but as far as I know, there is no version of SAS that takes advantage of multiple processors.

01-24-2008, 01:04 AM   #7
Brian
Senior Member
Join Date: Jan 2006
Location: Oak Ridge, TN
Posts: 1,308

Quote:
Originally Posted by MikeWaters View Post
What's the best way of figuring that out, short of calling STATA?

Btw, I use SAS and have never used STATA, but as far as I know, there is no version of SAS that takes advantage of multiple processors.

Is this open source? If you're paying for it, I'd call them. Or Google it.

It may be that this algorithm is difficult to parallelize. Getting stuff to scale is a *very* hard problem and an active area of research.
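
Short of calling the vendor, one option is to time the same job at a few processor counts and see how the speedup holds up. The sketch below computes speedup, efficiency, and the Karp-Flatt estimate of the serial fraction from such timings. The function name and the timing numbers are invented for illustration; real values would come from running the actual analysis.

```python
def scaling_report(timings):
    """timings: {processor_count: wall_seconds}; must include a 1-processor run."""
    t1 = timings[1]
    rows = []
    for n in sorted(timings):
        speedup = t1 / timings[n]
        efficiency = speedup / n
        # Karp-Flatt metric: serial fraction implied by the measurement
        # (undefined for n = 1, so report None there).
        serial = None if n == 1 else (1 / speedup - 1 / n) / (1 - 1 / n)
        rows.append((n, speedup, efficiency, serial))
    return rows

example = {1: 100.0, 2: 55.0, 4: 32.5}  # hypothetical wall-clock seconds
for n, s, e, f in scaling_report(example):
    serial = "n/a" if f is None else f"{f:.2f}"
    print(f"{n} cpus: speedup {s:.2f}x, efficiency {e:.0%}, serial fraction ~{serial}")
```

If the estimated serial fraction stays flat as processors are added, the limit is the algorithm itself. If it climbs, overhead such as memory contention is probably growing with processor count.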

01-24-2008, 01:12 AM   #8
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

Their site says that you should have 1.5x the memory of your largest dataset.

That's going to be pretty expensive. I think my current problem is that I don't have enough RAM. I think I have 4 GB. I need something like 16 to 32 GB.
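
The RAM estimate can be checked with quick arithmetic. This sketch assumes every field is stored as an 8-byte double, which is only an assumption (real datasets often mix narrower types and strings), and applies the 1.5x rule of thumb quoted above.

```python
OBS = 35_000_000      # observations, from the first post
FIELDS = 60           # fields per observation, from the first post
BYTES_PER_FIELD = 8   # assumption: every field is an 8-byte double
HEADROOM = 1.5        # the 1.5x rule of thumb quoted above

def dataset_bytes(obs, fields, bytes_per_field):
    """Raw in-memory size of a dense obs x fields table."""
    return obs * fields * bytes_per_field

raw = dataset_bytes(OBS, FIELDS, BYTES_PER_FIELD)
needed = raw * HEADROOM
gib = 1024 ** 3
print(f"raw dataset: {raw / gib:.1f} GiB")
print(f"with {HEADROOM}x headroom: {needed / gib:.1f} GiB")
for ram_gib in (4, 16, 32):
    verdict = "enough" if ram_gib * gib >= needed else "too small"
    print(f"{ram_gib} GiB RAM: {verdict}")
```

Under these assumptions 16 GB falls short and 32 GB fits. If the fields could be stored as 4-byte floats instead, the requirement would halve and 16 GB would be enough.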

01-24-2008, 03:16 AM   #9
jay santos
Senior Member
Join Date: Jan 2006
Posts: 6,177

Quote:
Originally Posted by MikeWaters View Post
Their site says that you should have 1.5x the memory of your largest dataset.

That's going to be pretty expensive. I think my current problem is that I don't have enough RAM. I think I have 4 GB. I need something like 16 to 32 GB.

Dumb question, but are you sure the query is being run on your computer and not on a database server? A database that large would surely be on a dedicated server, right? And you're just using a tool that queries the database while the server engine does the work, and your computer just gets the resulting dataset? I work with databases on my hard drive that are not that large, but pretty close. But I build them myself, and I wouldn't think that would be that common.

01-24-2008, 03:21 AM   #10
MikeWaters
Demiurge
Join Date: Aug 2005
Posts: 36,365

Quote:
Originally Posted by jay santos View Post
Dumb question, but are you sure the query is being run on your computer and not on a database server? A database that large would surely be on a dedicated server, right? And you're just using a tool that queries the database while the server engine does the work, and your computer just gets the resulting dataset? I work with databases on my hard drive that are not that large, but pretty close. But I build them myself, and I wouldn't think that would be that common.

ROTFLMAO! Yes, I'm quite sure.