Web and Text Mining for the Masses!

I installed the Word & Web Vector plugin for YALE (RapidMiner) this week and have been pleasantly surprised by it. However, as with any YALE plugin or software, it takes a lot of time to figure out how to use it. Despite the steep learning curve, I’ve been able to web mine a few websites and build a preliminary word list.

Now, no structured web data source is safe from the clutches of Neural Market Trends!

The Word & Web Vector Tool is a flexible Java library for statistical language modeling and integration of Web and Webservice based data sources. It supports the creation of word vector representations of text documents in the vector space model that is the point of departure for many text processing applications (e.g. text classification or information retrieval). Furthermore, it offers convenient interactive methods to extract data from structured sources, such as HTML or XML files. Finally, it allows to integrate external data by using Webservice APIs in a mashup-like way (e.g. for geo-mapping). [nemoz.org]
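
As a rough illustration of the word vector idea in that description, here is a minimal sketch in plain Python. It does not use the Word & Web Vector Tool or RapidMiner, and it is not how the plugin works internally. The function names (tokenize, build_vocabulary, term_frequency_vector) and the two sample headlines are made up for the example. Each document becomes a list of word counts, one position per word in a shared vocabulary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Return a sorted list of every distinct token across all documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical sample documents, invented for this sketch.
documents = [
    "Stocks rise as markets rally",
    "Markets fall as stocks slide",
]

vocabulary = build_vocabulary(documents)
vectors = [term_frequency_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for doc, vector in zip(documents, vectors):
    print(vector, doc)
```

Raw counts are the simplest weighting. A real text mining setup would usually add steps such as stop word removal, stemming, or TF-IDF weighting before feeding the vectors to a classifier.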

I’m looking forward to becoming the new Google! :)

About Tom

Blog owner of Neural Market Trends
This entry was posted in Data Mining.

2 Responses to Web and Text Mining for the Masses!

  1. Digital Dude says:

    Hi Tom,

    Care to share a small example of how to read some data, like the US markets, from a quoote.com web page? ;-)

    Cordially,

    Digital Dude

  2. Tom says:

    DD: I would love to show my readers how it’s done, but I’m very short on time these days, so it’ll have to wait.
