Distributed Computing: Scientific Asset or Money Making Gimmick?

Distributed computing programs offer a chance to put the otherwise wasted idle time of your own PC to work for various types of research.

Distributed computing was born in 1995, thanks to the prevalence of the World Wide Web and the ideas of David Gedye, who proposed linking computers together over the internet to supplement the processing power available to the Search for Extraterrestrial Intelligence (SETI). The result of his work was named SETI@home and began in 1999. Internet users could sign up for the program and run a screen saver that used a percentage of the computer's idle processing power (specified by the user) to decode radio signals from outer space. SETI at Berkeley uses two programs to search for potential signals from other intelligent life forms: SERENDIP and SETI@Home. SERENDIP is a real-time broad search; it does not record data for later analysis, and it discards a great deal of information. SETI@Home is quite the opposite: it scours data thoroughly using Fourier transforms, which decompose a recorded signal into its component frequencies so that narrowband artificial signals stand out from the background noise. To perform this in-depth analysis, massive amounts of computer time are required. Researchers had a choice: they could pay a premium for supercomputer time, or they could enlist individual PCs for the cost of programming and some server operations.
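To make the Fourier-transform idea concrete, here is a minimal sketch of the kind of narrowband search described above. Everything in it (sample rate, tone frequency, noise level) is invented for the demo; the real SETI@Home analysis is far more elaborate.

```python
import numpy as np

# Illustrative sketch only: the kind of narrowband search a Fourier
# transform enables. All parameters here are invented for the demo.
rate = 1024                                   # samples per second
t = np.arange(0, 1.0, 1.0 / rate)             # one second of "recording"
tone = 0.5 * np.sin(2 * np.pi * 200 * t)      # hidden 200 Hz narrowband signal
noise = np.random.default_rng(0).normal(0.0, 1.0, t.size)
sample = tone + noise                         # what the telescope "hears"

spectrum = np.abs(np.fft.rfft(sample))        # magnitude of each frequency bin
freqs = np.fft.rfftfreq(t.size, 1.0 / rate)
peak_hz = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
print(f"strongest narrowband component: {peak_hz:.0f} Hz")
```

Even though the tone is buried in noise far louder than itself in the time domain, the transform concentrates its energy into a single frequency bin, where it is easy to spot.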

This is where the distributed computing aspect of SETI@Home comes in. Radio recordings can be divided into small files and disseminated to any of the numerous computers in the system (at last count, over 180,000 active users). Each user's computer then uses idle processor time to analyze its radio transmission sample and returns the result to Berkeley when the work is complete. The project can send the same file to several computers so that results can be cross-checked, and if a significant signal is flagged by a PC, researchers can then examine that file directly for high-powered signals. Needless to say, alien signals have not been found. But radio source SHGb02+14a did make news when researchers realized that it had some unusual characteristics. Researchers point out that the signal is unlikely to be artificial, but the program is what allows them to sift through the gargantuan data stream to analyze candidates like it.
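The split-analyze-cross-check cycle above can be sketched in a few lines. This is a toy model, not the real SETI@Home protocol: the "analysis" is a stand-in digest, and the function names are invented for illustration.

```python
import hashlib
from collections import Counter

# Toy sketch of redundant work distribution: split a data stream into
# units, hand each unit to several volunteers, and accept a result only
# when a majority of the returned answers agree.

def split_into_units(data: bytes, unit_size: int) -> list:
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]

def analyze(unit: bytes) -> str:
    # Stand-in for the real signal analysis: a deterministic digest.
    return hashlib.sha256(unit).hexdigest()[:8]

def validate(results: list) -> str:
    # Majority vote across redundant copies of the same work unit.
    answer, count = Counter(results).most_common(1)[0]
    return answer if count > len(results) // 2 else None

units = split_into_units(b"radio telescope sample stream", unit_size=8)
verdicts = []
for unit in units:
    # Each unit is "sent" to three volunteers; one returns a bad result.
    returned = [analyze(unit), analyze(unit), "corrupt!"]
    verdicts.append(validate(returned))

assert all(v is not None for v in verdicts)   # 2-of-3 agreement wins
```

Sending each unit to more than one machine is what lets the project trust results coming back from anonymous, untrusted home PCs.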

If aliens aren't your thing, don't worry. Distributed computing has much more to offer the world than a search for distant radio signals, although enthusiasm remains high for SETI@Home, which is currently processing data at a rate of nearly 753 trillion floating-point operations per second (753 TeraFlops). If that just sounded like a foreign language to you, put it this way: on the list of the fastest supercomputers in the world, the SETI@Home network would rank 5th. That represents a huge savings in computer time. Distributed computing is a remarkably efficient system that combines a freely available resource (idle computer time) with the willingness of individuals to volunteer that time to analyze divisible chunks of data. For a large list of available distributed computing programs, see here: http://distributedcomputing.info/projects.html
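A quick back-of-the-envelope check using the article's own figures (the user count and TFLOPS number above, not fresh data) shows why this scales: each volunteer only needs to contribute an ordinary desktop's worth of power.

```python
# Rough average contribution per volunteer machine, using the article's
# quoted figures for SETI@Home; both inputs are approximations.
seti_flops = 753e12          # ~753 TeraFlops aggregate throughput
active_hosts = 180_000       # "at last count, over 180,000 active users"
per_host = seti_flops / active_hosts
print(f"average contribution: {per_host / 1e9:.1f} GFLOPS per host")
```

Roughly 4 GFLOPS per host was well within reach of a consumer PC of the era, which is exactly why idle home machines could stand in for a supercomputer.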

Recently, a project entitled Folding@home has stolen much of SETI@Home's thunder. With over 5 million registered computers and 436,000 active ones, Folding@home now dwarfs the SETI@Home project. It has done so in large part thanks to the incredible processing power of the PlayStation 3 and of Nvidia and ATI graphics processing units (GPUs), which together now contribute approximately 85% of the project's power. Compared with the world's fastest supercomputers, Folding@home's network is more than double the speed of the fastest supercomputer in existence, the red-hot Jaguar, a Cray XT5-HE with six-core 2.6 GHz Opterons located at Oak Ridge National Laboratory in the United States. Folding@home currently calculates at 3,949 trillion floating-point operations per second, or 3.9 PetaFlops (3.9 quadrillion, or 3.9 × 10^15, FLOPS!). Cray's supercomputing monster only calculates at a rate of approximately 1.7 PetaFlops.
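The "more than double" claim checks out arithmetically against the figures quoted above (both are the article's numbers, taken as given):

```python
# Sanity check of the throughput comparison using the article's figures.
folding_flops = 3.949e15     # ~3.9 PetaFlops for Folding@home
jaguar_flops = 1.7e15        # ~1.7 PetaFlops for the Cray XT5 "Jaguar"
ratio = folding_flops / jaguar_flops
print(f"Folding@home runs at about {ratio:.1f}x the speed of Jaguar")
```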

Folding@home's mission is to understand how proteins fold and why folding does not always work correctly. Folding@home is operated by the Pande Group at Stanford University, founded by Dr. Vijay Pande. Protein folding is an immensely rapid process that can pass through thousands of steps in mere microseconds. The Folding@home client, like SETI@Home's, uses PCs to simulate fragments of the folding process and return the results to a central server. One thing that likely generates appeal for the program is that a number of scientific papers have been published from its results, which allows users to see the potential payoff of donating their computer's surplus time.

Just after the year 2000, a number of programs like the GOMEZ Peer application (http://www.gomezpeerzone.com/) began to pop up, advertising themselves as for-pay distributed computing: programs that would pay you for your computer time. Most of these programs have since shut down, and the survivors are not particularly lucrative. Some Gomez users report earning about $30 a month, and the top user, ffhxk, has netted almost $3,500, according to the website's front page. Unfortunately, Gomez's pay structure is crafty, and it can take months to be approved for pay. Gomez can therefore utilize many members' computers while paying only a fraction of them for their use; in that sense, it is structured more as a gimmick.

The typical distributed computing experience, then, consists of a user downloading a small program from the website of the group they wish to donate their computer time to. The program runs in the background and can be adjusted through its settings tabs to fit the user's needs. I have run both Folding@home and SETI@Home on my PC for about five years with no loss of speed or performance as a result. The programs only use surplus power, so if your computer needs all its resources, the distributed program receives none. The websites of most distributed computing programs have statistics pages that let users see how fast their computers are performing and join various teams in the quest to turn in the most results and earn the most points. This has produced some entertaining methods of improving performance, including "farming out" a PC: installing a distributed program, removing all unnecessary software, hooking the machine to an internet connection, and leaving it on in a cooled, ventilated room alongside ten (or maybe even a hundred!) computers suffering the same fate. Some users also overclock their computers, running the processor above its rated clock speed (often at increased voltage) to boost performance. The heat this generates can necessitate cooling as extreme as a liquid nitrogen system.
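The "only use surplus power" behavior described above is typically achieved by running the client at the lowest scheduling priority, so any foreground task preempts it. Here is a minimal sketch of that idea; the function names and the busy-work loop are invented for illustration and bear no relation to the real clients' internals.

```python
import os
import time

def crunch_work_unit(iterations: int) -> int:
    # Stand-in for real number crunching (signal analysis, folding steps).
    total = 0
    for i in range(iterations):
        total += i * i
    return total

def run_as_background_client() -> int:
    # On POSIX systems, raise our niceness to 19, the politest possible
    # priority: the OS will hand us CPU time only when nothing else wants it.
    if hasattr(os, "nice"):
        os.nice(19)
    start = time.time()
    result = crunch_work_unit(100_000)
    print(f"work unit done in {time.time() - start:.3f}s")
    return result

result = run_as_background_client()
```

Because the prioritization is done by the operating system's scheduler rather than by the client itself, the foreground user never has to wait on the donated computation.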

The most notorious case of distributed computing actually made some headline news. David McOwen, a computer technician at DeKalb Technical College in Atlanta, Georgia, installed a distributed computing program on a number of the school's machines. When computers began accessing the internet at times when no one was present on campus, eyebrows were raised. The issue soon reached court, where the massive fines the school system was prepared to levy were reduced to a probation sentence. Because he failed to acquire permission to install the program, and because he violated the school district's standing computer policy (which exists in similar form in most school systems), McOwen was fired, and a precedent was set that distributed computing does not belong on school computers.

Pictured is the Arecibo Radio Observatory in Puerto Rico, where SETI@Home receives the data that it distributes to home PCs for analysis.
Comments (4)

How absurd; the one place where it would make the most sense (in a school) is the one place it isn't permitted.

Since this article was written, distributed computing has grown in its capabilities and its usage in the IT field. The SETI project is of course the most widely known use, but there are large IT organizations that use a distributed platform to ease the processor burden on their servers by letting other computers join in the work. This article had some factual commentary, but the premise of the question has long been answered: scientific asset (it's a gimmick when used by some companies, such as online gaming companies, and by hackers). One point I'd like to make: all of those "bots" and "zombies" used in organized web attacks use the basic principles of distributed computing, and they are very effective, wouldn't you say?

Actually, as I demonstrated in the article, the question of whether distributed computing is a scientific asset or a gimmick is a valid one, and it depends on the implementation. I even gave specific examples of both scenarios; the Gomez Peer fell into the gimmick category. I've also never heard of online gaming companies or hackers using distributed computing, nor does this apply to bots and zombies running malicious code, although they are very effective at what they do. Distributed computing, as defined above, is a central server coordinating a network of personal computers working in concert to create a veritable supercomputer. What you're referring to would more accurately be called phishing and e-mail fraud (http://www.technewsworld.com/story/33171.html?wlc=1281825050, for example). The examples you mentioned have nothing in common with distributed computing, because they don't form that kind of network, nor do they distribute files with the intent of having them analyzed and returned. Thank you for your comments, though! And if you do have information on distributed computing uses in the IT field, I'm sure it would make an excellent Factoid.

Hello, Dustin. Maybe I should take a moment to be more specific; I was trying to be a little generic in my comments, and I see that led to a misunderstanding of my intended thoughts. Allow me to break down more specifically what I was referring to, and to share some information about my background. As a Certified Ethical Hacker (C|EH), I have been involved in distributed computing hacking and cracking attempts (all authorized by the target company for their own benefit) to see what could be accomplished. I will in fact write an updated Factoid on the subject; that was an excellent piece of advice. Some material for you to view on the subject can be found via SANS (http://www.sans.org/reading_room/whitepapers/awareness/distributed-computing-unstoppable-brute-force_1330) and via an article from 1999 on using distributed computing to attempt to crack cryptographic systems (http://portal.acm.org/citation.cfm?id=592257), so it has in fact been in use in the hacking/cracking world for quite some time. An online gaming site that now uses a distributed computing approach, by having you download a small piece of software (similar to downloading the SETI client) for its game to work, is Classic DOS Games (http://www.classicdosgames.com/special/distributed.html). The Seventeen or Bust project also used distributed computing (http://www.seventeenorbust.com/), for reference. In the sense of distributed computing being the target and realm of personal computers, sure, that is a big pool of the computers being referenced. Your definition of making a network, distributing files, and analyzing/returning them is but one of the uses of distributed computing, not the only one. I happen to work for the largest IT consulting firm in the world at the present time (no, it isn't IBM), and we use distributed computing in some situations in-house. Again, please don't take any of my comments as an attack in any manner.
The article you wrote is technically accurate; I was simply highlighting that the question of the article has been answered in the sense that it isn't a gimmick: it is here to stay.