The linkcheck code is going much slower than I expected. I haven't gotten much chance to optimize it (although I'm using an O(1) lookup, so I'm not sure how it can be that much faster). I have 3.4GB of linkcheck reports so far, but it's still got a ways to go.
That's one reason I think you and Gil might enjoy talking to each other. He seems to be working at the same scale. He crawls as much as Brewster does. And he is very good at the math to make these things finish computing. Hence his big AdSense sale to Google.
I think $12.5K should certainly be enough to finish what we discussed for phase 1.
I'm going to allocate that money to my salary then if you don't mind ... in theory I'm still in the running for a real job, but I'd like to make sure I have a few months and I'm afraid the pump is running very low ... I did no year-end fund-raising figuring eff, cc, and other legit operations ought to get first shot at a very limited attention span in the giving world.
P.S.
thumper:/pro/bulk.resource.org/irs$ find . -type f | wc -l
39378123
I actually think if we add it all up this year, we're looking at >100m federal pages that were liberated. If these folks would just hand us the keys, we could do 1b in 2009, 10b in 2010, all loc.gov and the rest of .gov by 2012. I was asked to prepare a 3-page memo today, so maybe this might happen.
Carl