"Thank you, Mr. Nuff."
Oct. 16th, 2005 02:15 am
- I'm in Tallahassee, yay :)
- In the afternoon, I hung around with the family, and in the evening, I went climbing at the rock gym with Garrett lomonthang (and my forearms and hands are now sore in a pleasant way -- man, climbing is hard), and then we had pizza (I've finally been to Decent Pizza! And it's quite good!) ... and watched Berry Gordy's The Last Dragon -- possibly the very best kung-fu film ever produced by Motown. Busta Rhymes looks exactly like the "Shogun of Harlem" character, and this was not lost on him; he parodies (well, maybe just "reproduces chunks verbatim, because it was so silly to begin with") this film in his excellent video for "Dangerous". Right-o: many thanks to the G-UNIT for a lovely evening :)
- KOMPRESSOR NOW HAS OKCUPID PROFILE AND RECENTLY ANNOUNCED THE COMING SINGULARITY/ESCHATON: kompressorpower.
- What if, instead of going out and spidering the web independently, search engines were run on data collected from consenting anonymized users? What better spider could there be than the set of users out there? The major issue I'm thinking about is that a page might match your search criteria really well by having the right words in it -- but does it answer the question you were thinking about? What if you could mod up (or down) the usefulness of a given page? Say you're looking for technical information about some computery thing, and all the hits you pull up are archived mailing list posts -- this happens to me all the time -- most of them are pretty useless, and you've got to do a lot of sifting. When you find that one post where that one alert mailinglist member says that exact thing you needed, you should be able to mark it in some way so other people are more likely to find it.
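A minimal sketch of the mod-up/mod-down idea above -- votes keyed on (query, page) pairs, then used to re-rank results. Everything here (the class name, the URLs, the query) is hypothetical, made up purely for illustration:

```python
from collections import defaultdict

class UsefulnessIndex:
    """Toy index: users mod pages up or down for a given query,
    and results for that query are re-ranked by net votes."""

    def __init__(self):
        # (query, url) -> net vote score
        self.votes = defaultdict(int)

    def mod(self, query, url, delta):
        """delta is +1 (useful) or -1 (not useful)."""
        self.votes[(query, url)] += delta

    def rerank(self, query, urls):
        """Order candidate URLs most-useful-first for this query;
        pages nobody has voted on keep a neutral score of 0."""
        return sorted(urls, key=lambda u: self.votes[(query, u)], reverse=True)

idx = UsefulnessIndex()
# That One Really Good Mailinglist Post gets marked up; a dud gets marked down.
idx.mod("gcc linker error", "https://example.org/list-post-17", +1)
idx.mod("gcc linker error", "https://example.org/list-post-02", -1)
print(idx.rerank("gcc linker error",
                 ["https://example.org/list-post-02",
                  "https://example.org/list-post-17"]))  # list-post-17 ranks first
```

Keying votes on the (query, page) pair rather than the page alone sidesteps part of the "useless to you, perfect for me" objection raised further down the thread.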
no subject
Date: 2005-10-16 06:37 am (UTC)

> What if you could mod up (or down) the usefulness of a given page?
and my infosec background sadly answers:
because unscrupulous people will rent a botnet and monkey with the ratings.
of course, if you're interested in applying your AI kung to the infosec domain, let me know and i'll make the appropriate introductions.
no subject
Date: 2005-10-16 06:15 pm (UTC)

Well, you could certainly check for pages getting suddenly and unusually popular -- have some metric for finding outliers. You could do a bit of processing on the pages and see if they're selling something. Maybe if a user finds himself diverted to a page that he thinks got botnet-modded-up, then it could be flagged and somebody could go have a look.
Or better than all of those, I think, would be having the user do an unobtrusive task every time they want to mod something, the way you have to when you're registering for an LJ or buying tickets on Ticketmaster or whatever. "Read this hard-to-read nonsense word" or something like that.
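For what it's worth, the "metric for finding outliers" mentioned above could be as simple as a z-score test on a page's own vote history -- flag a page whose latest day of mod-ups is wildly out of line with its past. A toy sketch (the function name, threshold, and numbers are all made up for illustration):

```python
import statistics

def looks_botted(daily_votes, threshold=3.0):
    """Flag a page whose most recent day of mod-ups is a statistical
    outlier relative to that page's own history (simple z-score test)."""
    history, latest = daily_votes[:-1], daily_votes[-1]
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid div-by-zero on a flat history
    return (latest - mean) / stdev > threshold

print(looks_botted([3, 5, 4, 6, 4, 250]))  # sudden spike -> True
print(looks_botted([3, 5, 4, 6, 4, 5]))    # ordinary day -> False
```

As the next comment points out, this only catches floods: a botnet that paces itself to look like organic traffic slips right under a per-page threshold like this one.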
no subject
Date: 2005-10-16 07:44 pm (UTC)

well, a flood of traffic is easy to recognize, but what if a botnet has been programmed to randomly wait between mod-ups? (say, 1-3 days, about the same time it takes dynamic IPs to roll over...)
> Or better than all of those, I think, would be having the user do an unobtrusive task every time they want to mod something, the way you have to do when you're registering for an LJ or buying tickets on ticketmaster or whatever. "Read this hard to read nonsense word" or something like that.
ah, a "Completely Automated Public Turing test to Tell Computers and Humans Apart"? well, botnets have plenty of CPU power to break captchas algorithmically (http://www.cs.sfu.ca/~mori/research/gimpy/), or, hey, just pipe the captcha into the bot-infected guy's browser and have a real human pass the test.
now, all this mumbo-jumbo is hypothetical and probably wouldn't affect a small site like del.icio.us, but if google started doing it, there would definitely be abuse (http://money.cnn.com/2004/12/02/technology/google_fraud/?cnn=yes).
no subject
Date: 2005-10-17 05:06 am (UTC)

> ... or, hey, just pipe the captcha into the bot-infected guy's browser and have a real human pass the test.
Very, very good point. I'm trying to think up a good replacement for captchas, but that is a really difficult problem.
*considers* Anyway, this is what you and Tim
The deep, deep issue is that I want a way to mark That One Really Good Mailinglist Post, and spidering the web seems archaic and vaguely wrong when people are out there looking at stuff all day anyway. Maybe there's some other way to keep it honest. I'll have to consider...
no subject
Date: 2005-10-17 05:14 am (UTC)

no subject

Date: 2005-10-17 11:49 pm (UTC)

no subject

Date: 2005-10-18 03:54 am (UTC)

no subject

Date: 2005-10-17 05:03 am (UTC)

Gimmie a ring next time you hit up the rock gym or something equally fun. 321.277.3899
no subject
Date: 2005-10-17 05:10 am (UTC)

But I'm heading to Gainesville tomorrow, to visit that Lloyd kid! Maybe I'll be in town again soon... :-\
no subject
Date: 2005-10-17 01:42 pm (UTC)

You should be sneaky and develop a search site with user rankings. For your unobtrusive tasks, use the SAT-style questions you'd mentioned. Then start an open source site bent on defeating your SAT-style questions with AI... then benefit from the AI development that goes into ruining your site. :P

You could even go further and start an open source site bent on improving the SAT-esque question-building AI to beat the SAT-esque question-answering algorithms... what better way to speed AI development than starting a war!?
no subject
Date: 2005-10-17 05:37 am (UTC)

Namely, google. Simple reason: speed. People can't bounce through that many pages that fast. Also, how will these anonymized users find pages if someone doesn't spider them first?
What you're really looking for is a question of having users mod sites based on usefulness and interestingness, etc. Only issue there is that you might mod a site really low because it doesn't answer your question, but it will answer mine perfectly well. I vote for language/semantics understanding and search engines that allow meta-searching.
- Tim
no subject
Date: 2005-10-17 11:52 pm (UTC)

no subject

Date: 2005-10-18 03:55 am (UTC)

no subject

Date: 2005-10-18 06:15 pm (UTC)

no subject

Date: 2005-10-18 10:23 pm (UTC)

EXCEPT ONLY IF I CAN FRIEND YOU BACK!!
no subject
Date: 2005-10-18 11:13 pm (UTC)

no subject

Date: 2005-11-08 03:11 pm (UTC)

I'm probably taking you way too literally, but there's always StumbleUpon.