A web smart space is an intelligent environment with the additional capability of searching for information smartly and efficiently. New developments such as dynamic web content generation have increased the size of web repositories. Among the many requirements of modern software analysis, one is to search for information in a given repository, but extracting useful information is difficult because of the multilingual base of web data collections. Semantic information searching has become a hard problem due to inconsistencies and variations in the characteristics of the data. In this research, a web smart space framework is proposed that introduces front-end processing for a search engine to make the information retrieval process more intelligent and accurate. In conventional search architectures, searching is performed only by pattern matching, and consequently a large number of irrelevant results are generated. The proposed framework addresses this drawback and returns more relevant outcomes. The framework takes text input from the user in the form of a complete question, understands the input, and derives its meaning; the search engine then searches on the basis of the information provided.

INTRODUCTION

A search engine is software that searches for data based on some criteria. Every web search site uses a search engine that it has either developed itself or purchased from a third party. Search engines can differ dramatically in the way they find and index material on the Web, and in the way they search those indexes from the user's query. Strictly speaking, the search engine is the software and the algorithms that perform the search, while the site is the service built around it. For example, Google is a major search site on the Web, but rather than being called the "Google search site," it is commonly known as the "Google search engine."
The following section describes the basic concepts involved in this research.

International Journal of Emerging Sciences ISSN: 2222-4254 1(1) April 2017

1.1. Web Smart Space

Smart spaces [2] need a methodology to deal with complexity. Our approach is to construct smart spaces from autonomous parts, which we call agents, each responsible for its own data, actions, and communication with the others [4]. This gives high intrinsic adaptability and survivability (desirable properties unless we wish to employ a multitude of maintenance engineers for every smart space) and encourages what are called "emergent behaviors" arising from the combined actions and interactions of many agents. A smart space [3] is an environment with numerous elements that can sense, think, act, communicate, and interact with people. This intelligent environment is also robust, self-managing, and scalable. Driven by rapid developments in web searching technology and web mining, such smart spaces [2] can play an important role in performing useful tasks and tackling complex issues on the Web.

1.2. Description of Problem

The problem addressed in this research is primarily the large number of irrelevant outcomes returned for a search query. In conventional techniques, search engines use pattern matching: web contents are searched by matching the words the user supplies. This surface matching generates millions, even billions, of results, many of them unrelated to the user's intent. To make searching more efficient and effective, the major emphasis of current research is on content-searching bodies acting as information agents, mostly at the back end in the form of multi-agent systems. The preprocessing of the user's query before a search engine processes it is equally significant.

RELATED WORK

Gathering information on the Internet is a time-consuming and somewhat tedious experience.
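The agent-based construction described above, in which each autonomous part is responsible for its own data and communicates with its peers, can be sketched minimally as follows. The `Agent` class, the agent names, and the message format are illustrative assumptions, not the paper's implementation:

```python
class Agent:
    """Minimal autonomous agent: owns its own data, acts on messages,
    and communicates with peers (a sketch, not the paper's design)."""

    def __init__(self, name):
        self.name = name
        self.data = {}   # the agent is responsible for its own data
        self.inbox = []  # messages received from other agents

    def send(self, other, message):
        # Communicate by dropping a (sender, message) pair in a peer's inbox.
        other.inbox.append((self.name, message))

    def act(self):
        # React to every received message; here we simply record it,
        # but a smart-space agent would sense, think, and act on it.
        for sender, message in self.inbox:
            self.data[sender] = message
        self.inbox.clear()

crawler, indexer = Agent("crawler"), Agent("indexer")
crawler.send(indexer, "new page: /about")
indexer.act()
# indexer.data is now {'crawler': 'new page: /about'}
```

Emergent behavior in the sense used above arises once many such agents interact; no single agent holds a global view of the space.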
The main method for gathering information on the Internet is the search engine. The literature reviewed indicates that intelligent information agents can improve the process of gathering information on the Internet [1]. In general, when searching for a piece of information in a web page, the blocks most likely to contain the desired information are searched first, then ever finer-grained blocks recursively, until the desired information is found. Instead of recursively searching within a page, hierarchical information searching is a more straightforward process for finding and understanding page information [7]. General search engines crawl and index everything in a page and decide which words or paragraphs are more important by analyzing term frequency and inverse document frequency [12]. In the general design of information agents, human knowledge is first called upon to decide which blocks the agents need to crawl. It is therefore difficult for general search engines and information agents to find the information that matters to users. Agents are commonly used for information retrieval from the Web, but it is hard for them to recognize the significance of information within a page, since they treat the contents of web pages as a linear data stream [9] in which no contextual information is considered. The relations among the different contexts and links in a web page are a significant element that increases accuracy and decreases the cost incurred by search engines when extracting and classifying the informative regions of a page. The structure of information within a page, representing the significance of and relations among its pieces, is called the information hierarchy of the page. Agents collaborate to gather HTML pages from the World Wide Web and process them so that the pages can be retrieved for subsequent user queries.
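The term-frequency/inverse-document-frequency weighting mentioned above [12] can be illustrated with a short, self-contained sketch; the toy corpus, tokenized input format, and function name are hypothetical:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute a TF-IDF score for every term in every document.

    docs: list of documents, each already tokenized into a list of words.
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            # term frequency within the doc, damped by how common
            # the term is across the whole collection
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

docs = [["web", "search", "engine"],
        ["web", "smart", "space"],
        ["search", "agent"]]
weights = tf_idf(docs)
# "web" occurs in 2 of 3 documents, so its IDF is log(3/2); terms
# unique to a single document get the larger weight log(3).
```

This is exactly the signal a general search engine uses to decide which words in a page are important: frequent within the page, rare across the collection.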
Crawling-agent collaboration is required in order to decide which URLs should be retrieved first. Subsequent page treatment consists of first filtering the pages so that the HTML format is transformed into XML, and second indexing them so that information retrieval can be performed online [2]. A search engine operates in the following order:

1. Web crawling
   - Deep crawling: depth-first search (DFS)
   - Fresh crawling: breadth-first search (BFS)
2. Indexing
3. Searching

Web search engines work by storing information about a large number of web pages, which they retrieve from the WWW itself. These pages are retrieved by a web crawler (sometimes also known as a spider), an automated web browser that follows every link it sees [6]; exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages is stored in an index database for use in later queries. Some search engines [13], such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always holds the text that was actually indexed, so it can be very useful when the content of the live page has been updated and the search terms no longer appear in it.

CONCLUSION & FUTURE WORK

In this paper, a Web Smart Space Framework for Information Mining, a base for intelligent search engines, has been presented. The proposed framework is based on Natural Language Processing techniques and has the perceptive ability to understand the user's requirements and then search for and extract refined results from the given web repository.
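The fresh-crawling step above visits pages in breadth-first order. A minimal sketch of BFS crawl order over an in-memory link graph follows; the graph, the URLs, and the `max_pages` limit are stand-ins for real network fetches, and a real crawler would also consult robots.txt before visiting each URL:

```python
from collections import deque

def bfs_crawl(seed, links, max_pages=10):
    """Return the order in which a breadth-first crawler visits pages.

    links: dict mapping URL -> list of outgoing URLs, standing in for
    fetching a page and extracting the links it contains.
    """
    queue = deque([seed])
    visited = []       # pages in crawl (and hence indexing) order
    seen = {seed}
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)          # a real engine would index here
        for link in links.get(url, []):
            if link not in seen:     # follow every link exactly once
                seen.add(link)
                queue.append(link)
    return visited

graph = {"/": ["/a", "/b"], "/a": ["/c"], "/b": ["/c", "/d"]}
order = bfs_crawl("/", graph)
# → ['/', '/a', '/b', '/c', '/d']  (level by level)
```

Swapping the `deque` for a stack (pop from the same end as pushes) would turn this into the depth-first order used by deep crawling.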
A new idea of a web smart space has been introduced, which essentially adds front-end processing to a search engine to make the information retrieval process more intelligent and accurate. In common searching techniques, searching is performed only by pattern matching, and consequently a large number of irrelevant results are generated. The proposed framework addresses this drawback and returns more relevant outcomes. The framework takes text input from the user in the form of a complete question, understands the input, and derives its meaning. This preprocessed information is then used to extract only the required information from the Web. The Web Smart Space Framework for information mining using Natural Language Processing may thus be used to obtain more useful information from a web repository. Future work is to implement the proposed framework at the scale of a search engine such as Google. The designed algorithm understands the user's input and extracts the related information; at present it handles only active-voice sentences. Improving the algorithm in this respect can raise its accuracy and, ultimately, the accuracy of the search results.
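As a concrete illustration of the front-end processing described here, a minimal sketch might reduce a complete question to the content keywords a conventional search engine can consume. The stopword and question-word lists below are assumptions for illustration; the framework's actual NLP pipeline is richer than this:

```python
import re

# Hypothetical lexicons; the paper does not specify the lists its
# front-end processor actually uses.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "for", "to"}
QUESTION_WORDS = {"what", "which", "who", "where", "when",
                  "how", "why", "do", "does"}

def preprocess_question(question):
    """Reduce a full natural-language question to content keywords."""
    tokens = re.findall(r"[a-z']+", question.lower())
    return [t for t in tokens
            if t not in STOPWORDS and t not in QUESTION_WORDS]

print(preprocess_question("What is the capital of France?"))
# → ['capital', 'france']
```

The point of the front end is precisely this reduction: the search engine receives the derived keywords rather than the raw question, so surface pattern matching operates on meaning-bearing terms only.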