WEB MINING


Web mining is the application of data mining techniques to discover patterns from the Web. According to the analysis target, web mining can be divided into three types: Web usage mining, Web content mining and Web structure mining.

Web usage mining is the process of finding out what users are looking for on the Internet. Some users might look only at textual data, whereas others might be interested in multimedia data.
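As a rough illustration of the idea above, the following minimal sketch groups requests by user and reports whether each user mostly asks for textual or multimedia resources. The log records, the user identifiers and the extension-to-category mapping are all hypothetical; real usage data would normally come from server logs and need far more careful handling.

```python
from collections import Counter, defaultdict
import os

# Hypothetical log records: (user_id, requested_path). Illustrative only.
LOG = [
    ("u1", "/articles/intro.html"),
    ("u1", "/articles/faq.txt"),
    ("u2", "/media/demo.mp4"),
    ("u2", "/media/logo.png"),
    ("u2", "/articles/intro.html"),
]

# Assumed mapping from file extension to a coarse content category.
CATEGORY = {
    ".html": "text", ".txt": "text",
    ".mp4": "multimedia", ".png": "multimedia", ".mp3": "multimedia",
}


def usage_profile(log):
    """Count, per user, how many requests fall into each content category."""
    profile = defaultdict(Counter)
    for user, path in log:
        ext = os.path.splitext(path)[1].lower()
        profile[user][CATEGORY.get(ext, "other")] += 1
    return profile


def dominant_interest(log):
    """Return the most frequently requested category for each user."""
    return {user: counts.most_common(1)[0][0]
            for user, counts in usage_profile(log).items()}


print(dominant_interest(LOG))  # {'u1': 'text', 'u2': 'multimedia'}
```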

Web structure mining is the process of using graph theory to analyze the node and connection structure of a web site. According to the type of web structural data, web structure mining can be divided into two kinds:
1. Extracting patterns from hyperlinks in the web: a hyperlink is a structural component that connects a web page to a different location.
2. Mining the document structure: analysis of the tree-like structure of pages to describe HTML or XML tag usage.
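Both kinds of structure mining listed above can be sketched in a few lines. The example below treats a small, hypothetical site as a directed graph and counts incoming hyperlinks per page, then counts opening-tag usage in an HTML fragment with Python's standard `html.parser` module. The page names and markup are invented for illustration, and in-degree is only one of many graph measures that could be used.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "index.html": ["about.html", "contact.html"],
    "about.html": ["index.html", "contact.html"],
    "contact.html": ["index.html"],
}


def in_degree(links):
    """Count incoming hyperlinks per page (a simple graph measure)."""
    counts = Counter({page: 0 for page in links})
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


class TagCounter(HTMLParser):
    """Count how often each opening tag occurs in a document."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1


def tag_usage(html):
    """Return a Counter of opening tags found in the given HTML text."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags


print(in_degree(LINKS))
print(tag_usage("<html><body><p>Hi</p><p><a href='x.html'>x</a></p></body></html>"))
```

In this toy site, index.html and contact.html each receive two incoming links and about.html receives one; the fragment contains two p tags and one each of html, body and a.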


Web mining has many advantages that make the technology attractive to corporations and government agencies. It has enabled e-commerce to do personalized marketing, which eventually results in higher trade volumes. Government agencies use it to classify threats and fight terrorism. The predictive capability of mining applications can benefit society by identifying criminal activities. Companies can establish better customer relationships by giving customers exactly what they need; they can understand customer needs better and react to them faster. Companies can find, attract and retain customers, and they can save on production costs by using the acquired insight into customer requirements. They can increase profitability through target pricing based on the profiles created. They can even identify a customer who might defect to a competitor and try to retain that customer with promotional offers, thus reducing the risk of losing customers.


Web mining itself doesn’t create issues, but the technology might cause concerns when used on data of a personal nature. The most criticized ethical issue involving web mining is the invasion of privacy. Privacy is considered lost when information concerning an individual is obtained, used, or disseminated, especially if this occurs without their knowledge or consent [1]. The obtained data are analyzed and clustered to form profiles; the data are made anonymous before clustering so that there are no personal profiles. Thus these applications de-individualize users by judging them by their mouse clicks. De-individualization can be defined as a tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics and merits.
Another important concern is that companies collecting data for a specific purpose might use the data for a totally different purpose, which violates the user’s interests. The growing trend of selling personal data as a commodity encourages website owners to trade personal data obtained from their sites. This trend has increased the amount of data being captured and traded, increasing the likelihood of one’s privacy being invaded. The companies that buy the data are obliged to make it anonymous, and these companies are considered the authors of any specific release of mining patterns. They are legally responsible for the contents of the release; any inaccuracies in the release can result in serious lawsuits, but there is no law preventing them from trading the data.
Some mining algorithms might use controversial attributes like sex, race, religion, or sexual orientation to categorize individuals. These practices might violate anti-discrimination legislation. The applications make it hard to identify the use of such controversial attributes, and there is no strong rule against using algorithms with such attributes. This process could result in the denial of a service or privilege to an individual based on race, religion or sexual orientation. For now, this situation can be avoided through the high ethical standards maintained by the data mining company. The collected data are made anonymous so that the data and the patterns obtained cannot be traced back to an individual. It might look as if this poses no threat to one’s privacy, but in fact much additional information can be inferred by the application by combining two separate pieces of unscrupulous data from the user...
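To make the inference risk described above concrete, here is a minimal sketch of how two separately harmless data sets might be combined. It assumes, purely for illustration, an anonymized click record set and a second data set that share two attributes (zip and birth_year); every record, field name and person shown is invented. A record is treated as re-identified only when the shared attributes match exactly one entry.

```python
from collections import defaultdict

# Hypothetical anonymized browsing records (no names).
CLICKS = [
    {"zip": "12345", "birth_year": 1980, "visited": "health-forum"},
    {"zip": "67890", "birth_year": 1992, "visited": "sports-news"},
]

# Hypothetical second data set that happens to contain names.
REGISTRY = [
    {"zip": "12345", "birth_year": 1980, "name": "Alice Example"},
    {"zip": "67890", "birth_year": 1992, "name": "Bob Example"},
    {"zip": "67890", "birth_year": 1992, "name": "Carol Example"},
]


def link(records_a, records_b, keys):
    """Join two data sets on shared attributes, keeping only unique matches."""
    index = defaultdict(list)
    for rec in records_b:
        index[tuple(rec[k] for k in keys)].append(rec)

    linked = []
    for rec in records_a:
        matches = index.get(tuple(rec[k] for k in keys), [])
        if len(matches) == 1:  # a unique match ties the record to one person
            linked.append({**rec, **matches[0]})
    return linked


for row in link(CLICKS, REGISTRY, keys=("zip", "birth_year")):
    print(row["name"], "->", row["visited"])
```

Here only the first click record is linked to a name, because the second one matches two registry entries and so stays ambiguous.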

