The concept of document warehousing for multi-dimensional modeling of text-based business intelligence
|Article code|Publication year|English article|Persian translation|Word count|
|672|2006|18-page PDF|Available to order|Not calculated|
Publisher: Elsevier - Science Direct
Journal : Decision Support Systems, Volume 42, Issue 2, November 2006, Pages 727–744
During the past decade, data warehousing has been widely adopted in the business community. It provides multi-dimensional analyses of accumulated historical business data to support contemporary administrative decision-making. Nevertheless, it is believed that only about 20% of the information can be extracted from data warehouses that hold numeric data only; the other 80% is hidden in non-numeric data or even in documents. Therefore, many researchers now advocate research on document warehousing to capture complete business intelligence. Document warehouses, unlike traditional document management systems, include extensive semantic information about documents, cross-document feature relations, and document grouping or clustering to provide more accurate and more efficient access to text-oriented business intelligence. In this paper, we discuss the basic concept of document warehousing and present its formal definitions. Then, we propose a general system framework and elaborate on some useful applications to illustrate the importance of document warehousing. This work is essential for establishing an infrastructure that helps combine text processing with numeric OLAP processing technologies. The combination of data warehousing and document warehousing will be one of the most important kernels of knowledge management and customer relationship management applications.
Data warehousing and data mining techniques are gaining popularity as organizations realize the benefits of being able to perform multi-dimensional analyses of accumulated historical business data to support contemporary administrative decision-making. This inspires enterprises to search eagerly for useful business intelligence (BI) in both internal and external data. Business intelligence is supposed to provide decision-makers with the tactical and strategic information they need for understanding, managing, and coordinating the operations and processes in organizations. However, much of this effort has only touched the tip of the information iceberg. While the techniques regarding data warehouses, multi-dimensional models, on-line analytical processing (OLAP), and even ad hoc reports have served enterprises well, they do not address the full scope of business intelligence. It is believed that, for the business intelligence of an enterprise, only about 20% of the information can be extracted from formatted data stored in relational databases. The remaining 80% is hidden in unstructured or semi-structured documents. This is because the most prevalent medium for expressing information and knowledge is text. For instance, market survey reports, project status reports, meeting records, customer complaints, e-mails, patent application sheets, and competitors' advertisements are all recorded in documents. Moreover, the volume of documents on the Web, in enterprise repositories, and in public document management systems keeps growing. Therefore, knowledge workers, managers, and executives still have to spend much of their working time reading dozens, if not hundreds, of electronic documents of various types spread over the Internet. There is simply too much text to digest in daily life. The fast-growing and tremendous volume of documents has far exceeded what humans can comprehend without powerful tools.
As a result, when important decisions are made, some relevant documents may be ignored, and some irrelevant documents may be taken into account by intuition. We believe that leaving out information derived from relevant documents, or keeping information guessed intuitively from irrelevant documents, may be detrimental, because a strategy woven from incomplete information can lead to disaster.
English conclusion
While data warehouses and numeric-centric business intelligence technologies have served most enterprises well, they do not address the complete scope of business intelligence. In this paper, we advocate the importance of constructing document warehouses to support text-centric business intelligence, and propose an architecture for document warehousing. When documents are warehoused, users can perform ad hoc on-line analytical processing (OLAP) over text in a document warehouse, just as they can perform OLAP over summarized data in a data warehouse. Document warehousing not only provides very fast document access without performance degradation as the warehouse grows, but also offers a set of versatile applications for content management of enterprise business intelligence. In business, document warehousing can help administrators organize meeting reports, gazettes, or even customer complaint e-mails, where company personnel, products, and time may be regarded as the dimensions, so that documents related to particular employees or products, at a particular time or place, can be retrieved or browsed instantly. In recent years, most data warehouse applications have been applied in Customer Relationship Management (CRM), a promising trend in business affairs. However, creating a data warehouse only supports numeric analyses of customer behavior. To find out why customers bought (or did not buy) some products, we need to establish a document warehouse. With data warehousing, users can clearly see business phenomena in terms of who, what, when, where, and which. Nevertheless, to discover why the phenomena occurred, a document warehouse should be employed.
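The idea of treating personnel, products, and time as dimensions over documents can be illustrated with a minimal Python sketch. It is not the architecture proposed in the paper. The `Document` record, the sample entries, and the helper names `slice_documents` and `roll_up` are all hypothetical, chosen only to show how a slice and a simple aggregate over document dimensions might look.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A warehoused document tagged with three illustrative dimensions."""
    doc_id: str
    employee: str   # personnel dimension
    product: str    # product dimension
    quarter: str    # time dimension, e.g. "2006-Q1"
    text: str


# Made-up sample records, for illustration only.
WAREHOUSE = [
    Document("d1", "alice", "router", "2006-Q1", "Customer complains about overheating."),
    Document("d2", "bob", "router", "2006-Q1", "Meeting notes on firmware schedule."),
    Document("d3", "alice", "switch", "2006-Q2", "Customer asks for a refund."),
    Document("d4", "alice", "router", "2006-Q2", "Follow-up on overheating complaint."),
]

DIMENSIONS = ("employee", "product", "quarter")


def slice_documents(docs, **fixed):
    """Return the documents whose dimension values match every keyword given."""
    unknown = set(fixed) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return [d for d in docs if all(getattr(d, k) == v for k, v in fixed.items())]


def roll_up(docs, dimension):
    """Count documents per value of one dimension (a simple aggregate)."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    return Counter(getattr(d, dimension) for d in docs)
```

For example, `slice_documents(WAREHOUSE, employee="alice", product="router")` retrieves the documents tied to one employee and one product, and `roll_up(WAREHOUSE, "quarter")` counts documents per quarter. A real document warehouse would also need the semantic features and cross-document relations described in the abstract, which this sketch leaves out.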