Message 00463: Re: gil, meet aaron

Hi, Gil! It's great to hear from you. The crawl projects you mention sound fascinating.

I'm on vacation, but that means my schedule is very free. You can try me at 847 877 8895. Or name a time and I'll make sure to be by the phone.

On Dec 30, 2008 7:11 PM, "Gilad Elbaz" <gilelbaz@gmail.com> wrote:

Hi Aaron, (And, thanks for the intro Carl.)

I've heard great things about you from Carl and am aware of some of the cool and impactful things you have done recently and in the past.  I certainly share your stated passions at WatchDog.net for improving data accessibility and would really like to chat with you about such topics.

Very quick backgrounder on me: 
I co-founded Applied Semantics, which developed natural language technologies that formed the basis for our AdSense product. We were acquired by Google in 2003 and I co-ran the LA office until 2007, when I left. Today I'm working on two things. CommonCrawl Foundation is attempting to cache as much of the web as possible and make it easy to process. Current status: we have 1.5B pages on Amazon S3 and wrote some libraries to facilitate writing map-reduce jobs on it. After a year of development, we just started crawling, so we hope to scale up significantly. Structured Commons Inc. is my core focus. It's a for-profit that is still in stealth. But I can say we are very interested in making much more structured data accessible.

It seems we have some common interests and I'd really like to learn about your data crawling and parsing roadmap.  Also, if we can talk soon (by tomorrow), I may be able to make a 2008 grant to help with such work. 

Do you have any time to talk today or tomorrow?

- Gil

On Tue, Dec 23, 2008 at 5:22 PM, Carl Malamud <carl@media.org> wrote:
> Hi Gil -
>
> Great talki...

Gil Elbaz
Structured Commons Inc.
Office: 310-914-2400 x151
Cell: 310-722-2224