alexr_rwx | May I present: kompressr

kompressr: make text shorter harnessing the power of acronyms (MTSHTPOA)

Try it out, let me know what you think and if it breaks :) Should be useful for making long papers shorter, by automatically extracting acronyms and using them wherever possible.
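For the curious, here's a minimal sketch of the general idea in plain Python. It is not the actual kompressr code (that runs on NLTK); the names `repeated_phrases` and `compress` are made up for illustration, and it assumes a phrase is just a fixed-length run of words that shows up at least twice.

```python
import re
from collections import Counter


def repeated_phrases(text, n=3, min_count=2):
    """Return n-word phrases (as tuples of lowercase words) seen at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return [phrase for phrase, count in counts.items() if count >= min_count]


def compress(text, phrase):
    """Introduce an acronym at the first mention of phrase, then use it afterwards."""
    acronym = "".join(word[0] for word in phrase).upper()
    pattern = re.compile(
        r"\b" + r"\s+".join(re.escape(word) for word in phrase) + r"\b",
        re.IGNORECASE,
    )
    first = pattern.search(text)
    if first is None:
        return text  # phrase never occurs, nothing to do
    # Keep the first mention spelled out and define the acronym right after it.
    head = text[:first.end()] + " (" + acronym + ")"
    # Every later mention is replaced by the acronym alone.
    tail = pattern.sub(acronym, text[first.end():])
    return head + tail


if __name__ == "__main__":
    sample = ("Natural language processing is fun. "
              "I study natural language processing daily.")
    for phrase in repeated_phrases(sample):
        sample = compress(sample, phrase)
    print(sample)
```

A real version would also have to pick among overlapping candidates and avoid acronyms that collide with each other, which this sketch ignores.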

Public beta!™

(running on App Engine, with NLTK!)

Thanks for playing with it! :)

This isn't my area, but I'd think the problem of identifying important common phrases particular to a piece of work (like SE), while weeding out generally common phrases and things that are a concatenation of the two, might be paper-worthy.

It's a good intuition that that's an important problem, but it's way been done. There's a whole literature on finding sequences that commonly appear together (say, across a set of documents, or characteristic of particular documents)... words that NLP/corpus linguistics people say when they're discussing such a thing include "collocation" (http://en.wikipedia.org/wiki/Collocation) and "TF/IDF" (http://en.wikipedia.org/wiki/Tf-idf), if you're interested.
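To make the TF/IDF idea concrete, here's a rough sketch that scores two-word phrases instead of single words. It uses one common variant of the formula (relative frequency in the document times log of N over document frequency); the function names and the toy documents are just for illustration, not anything from kompressr.

```python
import math
import re
from collections import Counter


def bigrams(text):
    """Split text into lowercase words and return adjacent word pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    return list(zip(words, words[1:]))


def tfidf_phrases(docs):
    """Return one dict per document mapping each bigram to its TF-IDF score."""
    per_doc = [Counter(bigrams(doc)) for doc in docs]
    # Document frequency: in how many documents does each bigram appear?
    df = Counter()
    for counts in per_doc:
        df.update(counts.keys())
    n_docs = len(docs)
    scores = []
    for counts in per_doc:
        total = sum(counts.values()) or 1
        scores.append({
            phrase: (count / total) * math.log(n_docs / df[phrase])
            for phrase, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    docs = [
        "of the software engineering of the software engineering",
        "of the weather",
        "of the road",
    ]
    for phrase, score in sorted(tfidf_phrases(docs)[0].items(), key=lambda kv: -kv[1]):
        print(phrase, round(score, 3))
```

A phrase that shows up in every document (like "of the") gets a score of zero, while one that is frequent in a single document floats to the top, which is roughly the "particular to a piece of work" property being asked about.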
