Memcached and Sherpa for Yahoo! News Activity Data Service :
Mixer, the recently announced Yahoo's new data service for news activities, uses Memcached and Sherpa for its data backend. Plus a combination of asynchronous libraries and task execution tools:
The data processing model and the clear separation between read and write data solutions is not only compelling, but essential for maintaining the SLA (max. 250ms/response):
Memcache maintains two types of …
…the moving parts on a cluster of this scale.
Facebook runs a 30PB Hadoop analytic data warehouse and Yahoo! has a 100,000 cores/40,000 machines Hadoop cluster . I'm wondering what are the largest Amazon Elastic MapReduce jobs ever run. Any ideas?
Original title and link: Tanuki: A 30000 Cores AWS Cluster ( NoSQL database ©myNoSQL)
…around. Besides the main Apache Hadoop distribution, there is Cloudera, Yahoo, IBM and even Amazon uses there own distribution for their Elastic Map Reduce Service. All these distributions were born becuase the main Apache distribution is not good enough. Yahoo is now launching a separate company - Hortonworks to fix this problem. It is essentially going to be a Cloudera competitior but before they start providing support to clients, they are going to fix the …
http://www.odbms.org/blog/2011/05/measuring-the-scalability-of-sql-and-nosql-systems/
Adam Silberstein and Raghu Ramakrishnan, Yahoo! Research.
via odbms.org
(thanks Roberto!)
Here's how I literally wasted eight hours of my life. :-)
We signed up for Yahoo! Search Boss last week. The process itself was pretty straight:
Sign into your Yahoo! account at https://developer.apps.yahoo.com/
Click " New Project", fill out the form.
Then click on the project name, activate " Yahoo! Search Boss" by suppling some billing info.
Consumer key rejected?
The above process doesn't even take five minutes, but then I spent eight …
…framework is free and open to all, and allows anyone to build some pretty cools stuff. For example, Yahoo! has built some cool stuff around RabbitMQ and riak_core for its RSS distribution system for its various portals.
Developers are limited only by their imaginations, such as building a distributed mobile gaming or social app where the payment processing or other asynchronous activities are managed with node.js but the distributed nature of the app is all Riak, for …
Yahoo!: 100,000 CPUs in 40,000 computers, all running Hadoop. Also running a 12 terabyte MOLAP cube based on Tableau Software
eBay: has 3 separate analytics environments:
6PB data warehouse for structured data and SQL access
40PB deep analytics ( Teradata)
20PB Hadoop system to support advanced analytic workload on unstructured data
Original title and link: Big Data Is Going Mainstream: Facebook, Yahoo!, eBay, Quantcast, and Many Others ( NoSQL …
…in the world, with over 20 PB of storage. By March 2011, the cluster had grown to 30 PB — that's 3,000 times the size of the Library of Congress!
Until now I thought Yahoo! has the largest Hadoop deployment, at least in terms of number of machines.
Original title and link: Moving an Elephant: Large Scale Hadoop Data Migration at Facebook ( NoSQL database ©myNoSQL)