Harvest full-text documents for text and data mining